Thread: Indexing localhost / Base URL problems

  1. #1
    Join Date
    Jul 2009
    Posts
    11

    Default Indexing localhost / Base URL problems

    Hi,

    I want to index my site on a local server (http://localhost) because it's faster than indexing the remote site, which uses lots of PHP pages. When indexing is complete, the index files would be FTP'd to the remote site as usual, but of course the base URL would need to be changed from localhost to the www URL.

    In Spider mode I've set the spider URL to http://localhost and tried manually changing the base URL to the remote one, e.g. http://www.examplesite.com/

    This doesn't work. According to the log files, the error message is 'External site - does not match base URL'.

    I just have the 1 start point.

    Any ideas?

    Thanks.

  2. #2
    Join Date
    Dec 2004
    Location
    Sydney
    Posts
    4,156

    Default

    You need to use the Rewrite Links option in the "Indexing options" configuration window.

  3. #3
    Join Date
    Jul 2009
    Posts
    11

    Default

    Thanks for the info. I noticed this option, but am I right in thinking that if I enable this then I am unable to reindex only new and changed files?

    Thanks.

  4. #4
    Join Date
    Dec 2004
    Location
    Sydney
    Posts
    4,156

    Default

    Using the Rewrite Links option disables the ability to use incremental indexing on the produced set of index files. This means you will not be able to perform an incremental update, or add/remove pages from the index without re-indexing your site entirely.

    The limitation stems from the fact that after you have modified all the URLs in the index, the indexer can't be sure which URLs were originally indexed. Incremental indexing (obviously) needs to know which URLs have already been indexed in order to do an incremental add.

  5. #5
    Join Date
    Jul 2009
    Posts
    11

    Default

    Ok thanks.

  6. #6
    Join Date
    Jul 2009
    Posts
    11

    Default

    Just a thought, but couldn't the URL be dynamically rewritten in the php search script?

    The site would be indexed on localhost with a base URL of localhost, and the index files uploaded to the remote server. Then, when a user runs a search, the localhost string retrieved from the index is replaced with the correct remote address...

    Thanks for any info.

  7. #7
    Join Date
    Dec 2004
    Location
    Sydney, Australia
    Posts
    3,573

    Default

    Yes, we've considered implementing it that way, but it's inefficient to do a search and replace on every URL, every single time a search is performed. As it is, the replace is only done once at the time of indexing.
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine

  8. #8
    Join Date
    Dec 2006
    Posts
    49

    Default

    How about setting up DNS on the local machine so that it thinks it is the public site? That way, the URLs in the index will be consistent with the live site and incremental indexing will still be possible.

  9. #9
    Join Date
    Jul 2009
    Posts
    11

    Default

    I've got this working. I've tested it using the PHP search script and it seems to work OK, but I've only gone through the script quickly, so I may have missed other places where I need to override the URL.

    I added the following line after line 2676:

    PHP Code:

    $url = str_replace("localhost", "www.examplesite.com", $url);

    The index remains the same; all that happens is that the URL is rewritten on the fly just for the results page.
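
    A plain str_replace() swaps every occurrence of "localhost", including any that happen to appear in a path or query string. A slightly stricter variant could rewrite only the scheme-and-host prefix. The sketch below is a hypothetical helper and not part of the search script: the function name rewrite_local_url and its default host are assumptions, and it also drops any local port number on the assumption that the remote site runs on the default port.

```php
<?php
// Hypothetical helper (not part of the Zoom search script).
// Rewrites only a leading "http://localhost" or "https://localhost",
// so "localhost" appearing later in the URL is left untouched.
function rewrite_local_url(string $url, string $remoteHost = "www.examplesite.com"): string
{
    // ^ anchors the match to the start of the URL.
    // (:\d+)? matches an optional local port, which is discarded (assumption).
    // The lookahead requires the host to end at "/", "?", "#" or end of string,
    // so a name such as "localhost.example" is not rewritten.
    $pattern = '~^(https?://)localhost(:\d+)?(?=[/?#]|$)~i';

    // ${1} keeps the original scheme; the braces avoid ambiguity if the
    // replacement host starts with a digit.
    $result = preg_replace($pattern, '${1}' . $remoteHost, $url);

    // preg_replace() returns null on error; fall back to the original URL.
    return $result ?? $url;
}
```

    In the search script this would be used in place of the str_replace() line, e.g. $url = rewrite_local_url($url);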

