How to Maintain Spider URLs after 301 redirect?


  • How to Maintain Spider URLs after 301 redirect?

    After purchasing and installing Zoom Search Professional 5.1 yesterday, I have run into a slight snag which I need to overcome.

    I entered several URLs into the list of URLs I want Zoom to index. When I run the indexer, almost everything completes normally apart from errors for a couple of URLs, which seems to be expected.

    But because many of the URLs I provide redirect, either via HTTP 301 or by using frames, the search results show the final destination URL instead of the URL I provided.

    Is there any way to keep the original URL I provide instead of the final URL?

    Example:

    I would provide: http://www.mysite.com/yahoo
    (this redirects to http://www.yahoo.com via an HTTP 301)

    The Zoom indexer runs.

    http://www.yahoo.com is placed in the results instead of http://www.mysite.com/yahoo, which is what I would like.

    Any ideas?

    If anyone can help, it would be most appreciated.

    Regards,
    Giac
    Zoom Search Engine Professional 5.1 (Build 1007)
    Newbie

  • #2
    According to the standards, an HTTP 301 means the document has "Moved Permanently". So in general it makes sense to use the final, new URL rather than the old location for the document.

    From the standards, "The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs. Clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server, where possible."

    So Zoom can follow a chain of HTTP redirects to get to the final document, but the final URL is always used in the index. There is no way to force different behaviour from Zoom.
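
    To illustrate the behaviour described above, here is a minimal sketch of how a spider might follow a chain of redirects and keep only the final URL. It is not Zoom's actual code. The `RESPONSES` table, the `resolve_final_url` function and the `max_hops` limit are all hypothetical names, and canned responses stand in for real HTTP requests so the example runs offline.

```python
from urllib.parse import urljoin

# Hypothetical canned responses standing in for real HTTP requests:
# URL -> (status code, Location header or None).
RESPONSES = {
    "http://www.mysite.com/yahoo": (301, "http://www.yahoo.com"),
    "http://www.yahoo.com": (200, None),
}

REDIRECT_CODES = {301, 302, 303, 307, 308}


def resolve_final_url(url, responses, max_hops=10):
    """Follow redirects and return (final_url, list_of_urls_visited)."""
    chain = [url]
    for _ in range(max_hops):
        status, location = responses[url]
        if status not in REDIRECT_CODES or location is None:
            # Not a redirect: this is the document that gets indexed.
            return url, chain
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
        chain.append(url)
    raise RuntimeError("too many redirects")


final_url, chain = resolve_final_url("http://www.mysite.com/yahoo", RESPONSES)
print(final_url)
print(chain)
```

    The original URL only survives in `chain`. An index that stores `final_url` alone, as described above, has no record of the address that was first supplied.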


    • #3
      Changing the URLs after indexing?

      Many thanks for the prompt response.

      I still need some way to send my visitors to an alternative location rather than the final address of the 301 redirect. I understand this may be outside Wrensoft's remit, but any assistance in the matter would be appreciated.

      To work around this, I have tried editing the URLs in Zoom_Pagedata.zdat (the pipe-separated file from the CGI output of indexing) with find and replace. This is difficult to do without affecting the final checksum, which Zoom treats as corruption of the file, and the search then no longer functions.

      As a possible alternative, is there any known way to alter Zoom_Pagedata.zdat while keeping the index in an operational state?

      Thanks again,

      P.S. Looking at previous posts in the forum (http://www.wrensoft.com/forum/archiv...hp/t-1369.html), I see that something similar is expected in a point release of 5.1. Is this expected soon? In the meantime, is there any way to replace URLs in zoom_pagedata.zdat?
      Last edited by moscag; Oct-14-2007, 08:11 PM. Reason: Update
      Giac
      Zoom Search Engine Professional 5.1 (Build 1007)
      Newbie


      • #4
        In V5 of Zoom it is not possible to edit zoom_pagedata.zdat with a text editor; you will just corrupt the index. It is not just a checksum issue: the internal pointers to records in the index all get messed up.
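
        A toy example may help show why a length-changing edit breaks pointers. The layout below is purely hypothetical and is NOT the real .zdat format: it assumes records are concatenated into one blob, with a separate table of byte offsets pointing at each record. The names `build_index` and `read_record` are invented for this sketch.

```python
# Hypothetical layout, NOT Zoom's real .zdat format: records are
# concatenated into one blob and an offset table points at each one.
def build_index(records):
    blob = b""
    offsets = []
    for rec in records:
        offsets.append((len(blob), len(rec)))  # (start, length)
        blob += rec
    return blob, offsets


def read_record(blob, offsets, i):
    start, length = offsets[i]
    return blob[start:start + length]


records = [
    b"http://www.yahoo.com|Yahoo",
    b"http://www.example.com|Example",
]
blob, offsets = build_index(records)

# A text-editor style find and replace changes the blob's length
# but leaves the stored offsets untouched.
edited = blob.replace(b"http://www.yahoo.com", b"http://www.mysite.com/yahoo")

intact = read_record(blob, offsets, 1)
broken = read_record(edited, offsets, 1)
print(intact)
print(broken)
```

        In this sketch the second record reads back correctly from the original blob, but the same offsets applied to the edited blob land partway through the first record. Fixing a checksum alone would not repair that.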

        The "rewrite links" URL search and replace option was implemented in V5.0.1005 (9 Mar 2007). It is on the indexing options tab of the Zoom configuration window.

        This feature allows you to rewrite the indexed URLs (e.g. replace "http://dev.myserver.com/" with "http://www.myserver.com" in all URLs that contain it).
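
        As a rough picture of what a URL search and replace does, here is a small sketch. It assumes plain substring replacement, which may differ from how Zoom matches URLs, and `rewrite_urls` is a hypothetical helper rather than anything in Zoom. The replacement string here keeps a trailing slash so that the rewritten paths stay well formed.

```python
def rewrite_urls(urls, find, replace):
    """Return a new list with `find` replaced by `replace` in every URL."""
    return [url.replace(find, replace) for url in urls]


# Illustrative URLs only; the third one does not match and is left alone.
indexed = [
    "http://dev.myserver.com/index.html",
    "http://dev.myserver.com/docs/faq.html",
    "http://www.example.com/",
]

rewritten = rewrite_urls(
    indexed,
    "http://dev.myserver.com/",
    "http://www.myserver.com/",
)
for url in rewritten:
    print(url)

# The same idea applied to the redirect case from the first post.
restored = rewrite_urls(
    ["http://www.yahoo.com"],
    "http://www.yahoo.com",
    "http://www.mysite.com/yahoo",
)
print(restored)
```

        Whether the second rewrite is practical for many redirecting URLs depends on how many search and replace pairs the option accepts, which is not stated in this thread.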
