Re url re-directs



Anonymous
07-12-2005, 12:10 PM
Hi Ray,
I know we looked at this a couple of weeks ago, but I am still having some issues with redirects.

When I spider the following web site:

http://www.itechcomputer.com.au it redirects to http://www.i-tech.com.au. This is only one example; it happens on every site that redirects to a totally different URL. I know I can just delete the URL from my spidering database, but if it happens when I'm 70% through 1000 URLs, it means a lot of extra work to go back.

When it redirects, the spider just hangs there. I have left it for over 10 minutes and nothing happens. Can you advise if there is a solution to this?

I'm using the Professional Edition in CGI mode.

Apart from that, this is a fantastic search engine.

wrensoft
07-12-2005, 09:18 PM
We thought we fixed this a couple of weeks back.

I just did a quick check with the URL you posted and you're right, there is something wrong. We'll investigate in more detail and let you know.

-----
David

Ray
07-13-2005, 01:06 AM
We've looked into the problem and have confirmed that it is a bug in the latest build (4.1.1003). This problem was re-introduced by some recent changes in handling redirections. We will fix this in the next public build (4.1.1004).

Anonymous
07-13-2005, 06:36 AM
Hi Raymond, that sounds great. If you have a beta version available prior to public build 1004, can you let me know? I'm keen to start building up my database.

Cheers

Ray
07-18-2005, 12:27 PM
Version 4.2 beta 1 is available here:
http://www.wrensoft.com/ftp/zoomsearch4_2_beta1.exe

Note that this is an early beta release. It includes the fix for the redirection bug mentioned above, along with some new features such as:

- Improved spelling suggestions
- Synonyms
- Negative searches ("zoom -search" will search for results containing "zoom" but not "search")
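
To illustrate what a negative search means, here is a minimal Python sketch. It is not Zoom's implementation; the function names are hypothetical, and it assumes simple whole-word matching on whitespace-separated text.

```python
def parse_query(query):
    """Split a query into required terms and excluded ('-' prefixed) terms."""
    required, excluded = [], []
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])   # "-search" excludes "search"
        else:
            required.append(token)
    return required, excluded


def matches(text, query):
    """True if text contains every required word and no excluded word."""
    words = set(text.lower().split())
    required, excluded = parse_query(query)
    return (all(term in words for term in required)
            and not any(term in words for term in excluded))
```

With this sketch, the query "zoom -search" would match a page containing "zoom lens review" but not one containing "zoom search engine".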

Note for future readers of this post - the above link will become unavailable when the final version is released (or as newer builds are introduced).

E-mail us if you have any bug reports/questions regarding this beta.

Anonymous
07-19-2005, 09:22 AM
Hi Ray,
I downloaded the 4.2 beta and it is still not handling redirects.

www.antdiv.gov.au
redirects to www.aad.gov.au and locks up.

Anonymous
07-19-2005, 10:58 AM
Just reporting another one, hoping it helps with finding out why it's doing this.

http://www.businessaccess.vic.gov.au redirects to http://www.business.vic.gov.au and stalls the spider.

Ray
07-20-2005, 12:36 AM
We tested the URLs given and could not get them to stall the Indexer in Version 4.2 Beta 1.

Note that "stalling" here means that the Indexer freezes immediately after the redirection, when it should continue processing other start points (or end indexing if this was the last/only start point).

If your problem is that it simply skips over the redirected site you wish to index, then this is only a configuration issue.

To index the redirected URL, you need to make sure to have an appropriate Base URL - otherwise the redirected domain will be considered a link to an external site, which would usually be ignored.

With Verbose Mode enabled, you should see something like:


Downloading file http://www.antdiv.gov.au/ (495 bytes)
URL redirected to: http://www.aad.gov.au/ [thread #1]
Skipping http://www.aad.gov.au/ (External site - does not match base URL)

And it should move on to the next start point (or stop indexing). This is expected behaviour if your start point for this URL has a Base URL of "http://www.antdiv.gov.au/".
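
If you have a long list of start points, it may help to find the ones that redirect to a different domain before you start indexing. The following is a minimal Python sketch that runs outside Zoom and is not part of it. The function names are made up for illustration, and `final_url_of` assumes the site answers a plain HTTP request within the timeout.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen


def host_of(url):
    """Return the lower-cased host name of a URL, or '' if it has none.

    URLs written without a scheme (e.g. "www.antdiv.gov.au") have no
    host as far as urlsplit is concerned, so add "http://" first.
    """
    return (urlsplit(url).hostname or "").lower()


def is_cross_domain_redirect(start_url, final_url):
    """True when a request ended on a different host than it started on."""
    return host_of(start_url) != host_of(final_url)


def final_url_of(start_url, timeout=10):
    """Follow redirects and return the URL actually served (needs network)."""
    with urlopen(start_url, timeout=timeout) as response:
        return response.geturl()


# Example use (requires network access, so it is left commented out):
# for url in ["http://www.antdiv.gov.au/"]:
#     final = final_url_of(url)
#     if is_cross_domain_redirect(url, final):
#         print(url, "redirects to", final)
```

Any start point this reports is one whose Base URL would need to cover the redirect target as well.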

To index the redirected site, you need to change your Base URL so that both domains would be considered part of the same site. You can do this by clicking on "More" -> select the URL -> "Edit" and change the Base URL text box value to:


http://www.antdiv.gov.au/;http://www.aad.gov.au/

The semi-colon character is used to define multiple base URLs. This allows links to either domain to be indexed and treated as "internal links". It should then behave like this:


Downloading file http://www.antdiv.gov.au/ (495 bytes)
URL redirected to: http://www.aad.gov.au/ [thread #1]
Queued URL: http://www.aad.gov.au/
Downloading file http://www.aad.gov.au/ (21433 bytes)
Index Thread got ready buffer for http://www.aad.gov.au/ (Content-type: HTML text)
Scanning http://www.aad.gov.au/
Queued URL: http://www.aad.gov.au/link.asp?transportinformation
Queued URL: http://www.aad.gov.au/link.asp?news
... etc.
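
As a rough illustration of the difference between the two Base URL values, here is a minimal Python sketch. It assumes a simple case-insensitive prefix match against each semicolon-separated entry, which is an assumption for illustration and not Zoom's actual code; the function names are hypothetical.

```python
def split_base_urls(base_url_field):
    """Split a semicolon-separated Base URL value into a list of prefixes."""
    return [part.strip() for part in base_url_field.split(";") if part.strip()]


def is_internal(url, base_url_field):
    """True if url starts with any listed base URL (assumed matching rule)."""
    url = url.lower()
    return any(url.startswith(base.lower())
               for base in split_base_urls(base_url_field))


single = "http://www.antdiv.gov.au/"
both = "http://www.antdiv.gov.au/;http://www.aad.gov.au/"

# With only the original domain, the redirect target counts as external.
print(is_internal("http://www.aad.gov.au/", single))
# With both domains listed, the redirect target and its links are internal.
print(is_internal("http://www.aad.gov.au/link.asp?news", both))
```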

If the above does not solve your problem, or you think that the new beta really is stalling in the same way as the previous version, then let us know and e-mail us your .zcfg file.

webdziner
01-30-2006, 12:08 AM
Hello all!
I am also having trouble with URL redirects.
When is the next release due?
Yes, I could delete the URL, but when there are 1000+ URLs...
It would be great if you could fix this problem.

webdziner
01-30-2006, 01:10 AM
Oops... I have downloaded version 4.2 and it has fixed the problem :)
Thanks