Re url re-directs

  • Re url re-directs

    Hi Ray,
    I know we looked at this a couple of weeks ago, but I am still having some issues with redirects.

    When I spider the following web site:

    http://www.itechcomputer.com.au

    it redirects to http://www.i-tech.com.au. This is only one example; it happens on every site that redirects to a totally different URL. I know I can just delete the URL from my spidering database, but if it happens when I'm 70% of the way through 1000 URLs, it means a lot of extra work to go back.

    When it redirects, the spider just hangs. I have left it for over 10 minutes and nothing happens. Can you advise whether there is a solution to this?

    I'm using the Professional Edition in CGI mode.

    Apart from that, this is a fantastic search engine.

  • #2
    We thought we fixed this a couple of weeks back.

    I just did a quick check with the URL you posted and you're right, there is something wrong. We'll investigate in more detail and let you know.

    -----
    David

    • #3
      We've looked into the problem and have confirmed that it is a bug in the latest build (4.1.1003). This problem was re-introduced by some recent changes in handling redirections. We will fix this in the next public build (4.1.1004).
      --Ray
      Wrensoft Web Software
      Sydney, Australia
      Zoom Search Engine

      • #4
        Re Url re direct

        Hi Raymond, that sounds great. If you have a beta version available prior to the public build 1004, can you let me know? I'm keen to start building up my database.

        Cheers

        • #5
          Version 4.2 beta 1 is available here:
          http://www.wrensoft.com/ftp/zoomsearch4_2_beta1.exe

          Note that this is an early beta release. It includes the fix for the redirection bug mentioned above, along with some new features such as:

          - Improved spelling suggestions
          - Synonyms
          - Negative searches ("zoom -search" will search for results containing "zoom" but not "search")
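For readers unfamiliar with negative search syntax, here is a minimal sketch of how a query such as "zoom -search" could be interpreted. It is not Zoom's implementation; the function names `parse_query` and `matches` are hypothetical, and matching is assumed to be on whole lowercase words split by whitespace.

```python
def parse_query(query: str) -> tuple[list[str], list[str]]:
    """Split a query into required terms and excluded ("-term") terms."""
    required, excluded = [], []
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])  # strip the leading minus sign
        else:
            required.append(token)
    return required, excluded


def matches(text: str, query: str) -> bool:
    """Return True if text has every required term and no excluded term."""
    required, excluded = parse_query(query)
    words = set(text.lower().split())
    has_all_required = all(term in words for term in required)
    has_any_excluded = any(term in words for term in excluded)
    return has_all_required and not has_any_excluded


if __name__ == "__main__":
    for page in ["zoom indexer manual", "zoom search engine"]:
        print(page, "->", matches(page, "zoom -search"))
```

A real search engine would match against an index rather than raw page text, but the required/excluded split is the same idea.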

          Note for future readers of this post - the above link will become unavailable when the final version is released (or as newer builds are introduced).

          E-mail us if you have any bug reports/questions regarding this beta.
          --Ray
          Wrensoft Web Software
          Sydney, Australia
          Zoom Search Engine

          • #6
            re redirect bugs

            Hi Ray,
            I downloaded the 4.2 beta and it is still not handling redirects.

            www.antdiv.gov.au redirects to www.aad.gov.au and locks up.

            • #7
              redirects

              Just reporting another one, hoping it helps with finding out why it's doing this:

              http://www.businessaccess.vic.gov.au redirects to http://www.business.vic.gov.au and stalls the spider.

              • #8
                We tested the URLs given and could not get them to stall the Indexer in Version 4.2 Beta 1.

                Note that "stalling" here refers to the Indexer "freezing" immediately after the redirection, when it should continue processing other start points (or end indexing if this was the last/only start point).

                If your problem is that it simply skips over the redirected site you wish to index, then this is only a configuration issue.

                To index the redirected URL, you need to make sure to have an appropriate Base URL - otherwise the redirected domain will be considered a link to an external site, which would usually be ignored.

                With Verbose Mode enabled, you should see something like:

                Downloading file http://www.antdiv.gov.au/ (495 bytes)
                URL redirected to: http://www.aad.gov.au/ [thread #1]
                Skipping http://www.aad.gov.au/ (External site - does not match base URL)

                And it should move on to the next start point (or stop indexing). This is expected behaviour if your start point for this URL has a Base URL of "http://www.antdiv.gov.au/".

                To index the redirected site, you need to change your Base URL so that both domains would be considered part of the same site. You can do this by clicking on "More" -> select the URL -> "Edit" and change the Base URL text box value to:

                Code:
                http://www.antdiv.gov.au/;http://www.aad.gov.au/

                The semicolon character is used to define multiple base URLs. This allows links to either domain to be indexed and treated as "internal links". It should then behave like this:

                Downloading file http://www.antdiv.gov.au/ (495 bytes)
                URL redirected to: http://www.aad.gov.au/ [thread #1]
                Queued URL: http://www.aad.gov.au/
                Downloading file http://www.aad.gov.au/ (21433 bytes)
                Index Thread got ready buffer for http://www.aad.gov.au/ (Content-type: HTML text)
                Scanning http://www.aad.gov.au/
                Queued URL: http://www.aad.gov.au/link.asp?transportinformation
                Queued URL: http://www.aad.gov.au/link.asp?news
                ... etc.
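
To illustrate why the second base URL changes the outcome, here is a rough sketch of the internal/external decision. It assumes a simple prefix match against each semicolon-separated base URL, which may differ from what the Indexer actually does; the function names are hypothetical, and the returned strings only imitate the Verbose Mode lines quoted above.

```python
def parse_base_urls(setting: str) -> list[str]:
    """Split a semicolon-separated Base URL setting into separate prefixes."""
    return [part.strip() for part in setting.split(";") if part.strip()]


def is_internal(url: str, base_urls: list[str]) -> bool:
    """Assumed rule: a URL is internal if it starts with any base URL."""
    return any(url.startswith(base) for base in base_urls)


def describe_redirect(target: str, setting: str) -> str:
    """Return a log-style line saying whether a redirect target is followed."""
    if is_internal(target, parse_base_urls(setting)):
        return f"Queued URL: {target}"
    return f"Skipping {target} (External site - does not match base URL)"


if __name__ == "__main__":
    target = "http://www.aad.gov.au/"
    # Single base URL: the redirect target counts as an external site.
    print(describe_redirect(target, "http://www.antdiv.gov.au/"))
    # Both domains listed: the redirect target is queued for indexing.
    print(describe_redirect(
        target, "http://www.antdiv.gov.au/;http://www.aad.gov.au/"))
```

With only the original domain as the Base URL the target is skipped; once both domains are listed it is queued, which matches the two log excerpts above.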

                If the above does not solve your problem, or you think that the new beta really is stalling in the same way as the previous version, then let us know and e-mail us your .zcfg file.
                --Ray
                Wrensoft Web Software
                Sydney, Australia
                Zoom Search Engine

                • #9
                  url redirects

                  Hello all!
                  I am also having trouble with url redirects.
                  When is the next release due?
                  Yes I could delete the url, but when there are 1000+ urls....
                  It would be great if you can fix this problem.
                  Colin

                  • #10
                    url redirects

                    oops.. I have downloaded version 4.2 and it has fixed the problem
                    Thanks
                    Colin
