What I want to do is index my site on a local server (http://localhost), because it's faster than indexing the online remote site, which uses lots of PHP pages. When this is complete, the index files would be FTP'd to the remote site as usual, but of course the base URL would need to be changed from localhost to the www URL.
In Spider mode I've set the spider URL to http://localhost and I've tried manually changing the base URL to the remote one, e.g. http://www.examplesite.com/
This doesn't work. Looking at the log files, the error message is 'External site - does not match base URL'.
I have just the one start point.
You need to use the Rewrite Links option, on the "indexing options" configuration window.
Thanks for the info. I noticed this option, but am I right in thinking that if I enable this then I am unable to reindex only new and changed files?
Using the Rewrite Links option disables the ability to use incremental indexing on the produced set of index files. This means you will not be able to perform an incremental update, or add/remove pages from the index without re-indexing your site entirely.
The limitation stems from the fact that after you have modified all the URLs in the index, the indexer can't be sure what URLs were originally indexed. And incremental indexing (obviously) needs to know what URLs have already been indexed in order to do an incremental add.
Just a thought, but couldn't the URL be dynamically rewritten in the php search script?
The site would be indexed on localhost with a base URL of localhost and the index files uploaded to the remote server. Then, when a user makes a search, the localhost string retrieved from the index is replaced with the correct remote address...
Thanks for any info.
Yes, we've considered implementing it that way, but it's inefficient to do a search and replace on every URL, every single time a search is performed. As it is, the replace is only done once at the time of indexing.
Wrensoft Web Software
Zoom Search Engine
How about setting up the localhost/DNS on localhost so that it thinks that it is the global site? That way, the indexes will be consistent and incremental indexing will be possible.
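One way to try this, as a rough sketch: map the site's hostname to the loopback address in the hosts file of the machine that runs the indexer. This assumes the local web server is set up to answer for that hostname; www.examplesite.com is just the example name used earlier in the thread.

```
# hosts file on the indexing machine
# (/etc/hosts on Linux or macOS, C:\Windows\System32\drivers\etc\hosts on Windows)
127.0.0.1    www.examplesite.com
```

While this entry is in place, anything on that machine that requests http://www.examplesite.com/ reaches the local copy, so the entry has to be removed or commented out to reach the live site again.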
I've got this working. I've tested it using the PHP search script and it seems to work OK, but I've only gone through the script quickly, so I may have missed other parts where I need to override the URL.
I added the following line after line 2676.
The index remains the same; all that happens is that the URL is rewritten on the fly, just for the results page.
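The added line itself is not reproduced here. As a minimal sketch of the same idea, and not the poster's actual change, a result URL could be rewritten just before it is displayed. The helper name `rewrite_base_url` and the variables `$localBase` and `$remoteBase` are hypothetical; they are not part of the Zoom search script.

```php
<?php
// Hypothetical helper (not part of the Zoom search script): swap the local
// base URL for the remote one when a result URL is about to be displayed.
function rewrite_base_url(string $url, string $localBase, string $remoteBase): string
{
    // Only rewrite URLs that actually start with the local base
    // (compared case-insensitively); anything else is returned unchanged.
    if (strncasecmp($url, $localBase, strlen($localBase)) === 0) {
        return $remoteBase . substr($url, strlen($localBase));
    }
    return $url;
}

// Assumed values, taken from the example URLs in this thread.
$localBase  = 'http://localhost/';
$remoteBase = 'http://www.examplesite.com/';

echo rewrite_base_url('http://localhost/products/list.php?id=3', $localBase, $remoteBase), "\n";
```

Matching only at the start of the string, rather than doing a plain search and replace, avoids touching a URL that merely contains the local address somewhere in its query string.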