Zoom Search Engine FAQ - Troubleshooting problems

Q. Why is there no search form on my search page?

First of all, make sure you are opening the correct file in your browser. For PHP, ASP, and CGI, the search page to open in your browser (and link to from your other webpages) is "search.php", "search.asp", or "search.cgi" respectively. It is NOT "search_template.html", which is a template file used by the script to determine the layout it should produce. For the Javascript version, the search page is "search.html".

Second, make sure you have not selected "Do not generate" for the Search Form option in the Configuration Window (under the "Search Page" tab). If so, change this back to "Basic form" or "Advanced form" and re-index and upload your files for the change to take effect.

If you have checked the above two things and your search page still does not show a form, make sure your "search_template.html" is not corrupted. It must contain the HTML comment <!--ZOOMSEARCH--> on a separate line. There are also other important HTML lines for the Javascript version which must not be removed.
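
The marker check described above can be scripted. The following is a minimal sketch, not part of Zoom: the function name is hypothetical, and it assumes that leading or trailing whitespace around the comment is acceptable as long as nothing else shares the line.

```python
def has_zoomsearch_marker(template_text: str) -> bool:
    """Return True if <!--ZOOMSEARCH--> appears alone on a line.

    Assumption: surrounding whitespace on that line is allowed.
    """
    return any(line.strip() == "<!--ZOOMSEARCH-->"
               for line in template_text.splitlines())
```

To use it, read "search_template.html" into a string and pass it in. A False result suggests the comment was stripped or merged into another line by an editor.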

Note: Some web authoring applications, such as FrontPage and MS Word (see the question about Microsoft Word further down this page for more issues), have options which "compress" the HTML by removing code they consider irrelevant. Often, these options will remove important code required by Zoom. Make sure you have these options disabled when dealing with Zoom-related files.

GoLive CS is another application that has been found to be a culprit of this behaviour. The offending option in GoLive CS can be turned off under "Export Site Options"->"Strip Options..." and unchecking the "Comments" box.

If you suspect your search template ("search_template.html" or "search.html") has been corrupted, rename the file to keep a backup (e.g. "search_template_old.html"), and re-index your site to the same output directory. The Indexer will generate a new default search template when it finds that the file is missing. You can then compare the two files to see if anything important was removed, or just start from scratch with the new default template, taking care not to remove the required lines of code again.
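
Comparing the backed-up template with the regenerated one can be done with any diff tool. As one possible sketch, Python's standard difflib module can list the differing lines. The function name is hypothetical, and the file labels simply follow the example names used above.

```python
import difflib

def template_diff(old_text: str, new_text: str) -> list[str]:
    """Return unified-diff lines between the old and regenerated templates."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="search_template_old.html",
        tofile="search_template.html",
        lineterm="",
    ))
```

Lines starting with "+" exist only in the regenerated template, so they are candidates for code that was removed from your edited copy.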

Q. Why are some of my pages being skipped in the indexer?

The best way to determine why a page was found but then skipped is to turn on "Skipped" messages on the Indexer's "Log" tab. Now when you index your website, the indexer will display the files it skips, and most importantly, the reason they are skipped.

The main reasons for files being skipped and not indexed are:

  • The file or directory name starts with an underscore (e.g. "_notes") and is therefore considered to be hidden. This is now an option you can disable in the configuration window.
  • In spider mode, a page may be considered to be an external link if the URL of the file is not included by the Base URLs specified.
  • The file name extension does not match those listed in the configuration extensions list.
  • The file attributes indicate that the file is not a normal file, e.g. the file is a hidden file.
  • The file name satisfies a criterion you have entered in the page skip list.
  • Duplicate page detection is enabled in the configuration, and the file has identical content to a previously scanned file.
  • The file size is larger than the configuration limits you have selected.
  • The document you are trying to index is password protected and/or encrypted.
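
To make a few of these rules concrete, here is an illustrative sketch of how such checks could be expressed. It is not Zoom's implementation: the function name, the parameter names, the order of the checks, and the use of substring matching for the skip list are all assumptions made for the example.

```python
import os

def skip_reason(path, size_bytes, allowed_exts, skip_words, max_bytes,
                skip_underscore=True):
    """Return a reason string if the file would be skipped, else None.

    Hypothetical rules modelled on the list above; real behaviour may differ.
    """
    parts = [p for p in path.replace("\\", "/").split("/") if p]
    if skip_underscore and any(p.startswith("_") for p in parts):
        return "name starts with an underscore"
    ext = os.path.splitext(parts[-1])[1].lower()
    if ext not in allowed_exts:
        return "extension not in the extensions list"
    if any(word.lower() in path.lower() for word in skip_words):
        return "matches the page skip list"
    if size_bytes > max_bytes:
        return "larger than the file size limit"
    return None
```

For a real site, the "Skipped" messages in the Indexer log remain the authoritative source for why a particular file was skipped.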

Q. I am indexing my website with the spider mode but it is not finding all of the pages on my site

The spider indexing mode finds webpages to index by following the text links it comes across as it scans through your website, starting from the start URL you specified in the indexer. So if there are parts of your website that cannot be reached by clicking on links from page to page, beginning at the start URL, then the spider will similarly fail to find them.
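
The reachability idea can be illustrated with a small sketch. This is a generic breadth-first walk over a made-up link map, not Zoom's crawler; the page names and the function name are hypothetical.

```python
from collections import deque

def reachable_pages(links: dict[str, list[str]], start: str) -> set[str]:
    """Breadth-first walk over a page -> outgoing-links map, from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

# Hypothetical site: each key is a page, each value lists the pages it links to.
site = {
    "/index.html": ["/about.html", "/products.html"],
    "/products.html": ["/widget.html"],
    "/orphan.html": ["/index.html"],  # no other page links TO this one
}
found = reachable_pages(site, "/index.html")
missed = set(site) - found  # pages a link-following spider would never see
```

Here "/orphan.html" ends up in `missed`: it links to the home page, but nothing links to it, so a crawl starting from "/index.html" never discovers it.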

There are several other common scenarios in which the spider mode will be unable to follow the links (note that most of these also apply to external search engines such as Google, etc):

  • Javascript navigation or DHTML: If your website relies on links produced in a Javascript navigation menu, and there are no hypertext links to these other pages of your website, the spider will not be able to find them. See the Javascript menu question below for more details.
  • Parts of your website are hosted under a different domain or sub-domain: Zoom uses the base URL to determine whether a link it finds is part of your website or a link to someone else's website. You can add multiple base URLs to Zoom in order to allow other domains to be indexed as part of your site - see the FAQ on indexing multiple domains or sub-domains for more information.
  • Form-based navigation: If parts of your site can only be accessed by submitting a form and selecting a combination of different input parameters, then this would make it unfriendly to spider. The spider cannot guess every combination of options possible - in most cases, it is simply impractical. Note however, that if your website requires authentication, Zoom supports both HTTP authentication and cookie-based login forms. See "How to index sites requiring authentication with Zoom" for more information.
  • Shockwave Flash navigation: If your site depends on Flash-based navigation for links to the other pages of your site, then you will need the SWF plugin available for the registered editions. This will allow you to index and spider crawl links from SWF files.
  • You have hit the Zoom limits, such as the maximum number of pages, as set on the limits tab, before all pages were indexed.
  • The pages not being found are just the new pages on your site, and you are using caching in Zoom. The use of caching can result in old copies of pages being indexed and as a result links to new documents will not be found. Caching can be turned off in the "Spider options" configuration window.
  • You are indexing your site with the secure SSL protocol (HTTPS) and the security certificate has expired for the site, or the certificate is not valid for the site.
  • Your link to the internet isn't reliable and the connection was lost during indexing (especially common with wireless internet).
  • The server is overloaded or unstable and not responding to requests for pages.
  • Your firewall is blocking outgoing connections (HTTP on port 80 by default).
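
The base URL point above can be sketched as a simple prefix test. This is only an illustration: the function name is hypothetical, and treating "part of your website" as "starts with one of the base URLs" is an assumption about how the comparison works.

```python
def is_internal(url: str, base_urls: list[str]) -> bool:
    """True if `url` begins with any configured base URL.

    Assumption: a plain prefix comparison; the real rules may be more lenient.
    """
    return any(url.startswith(base) for base in base_urls)
```

With only "http://www.example.com/" as a base URL, a link to "http://shop.example.com/cart.html" counts as external. Adding "http://shop.example.com/" as a second base URL makes it internal.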

A sure-fire way to make sure that everything you want is indexed in spider mode is to have a site map page containing links to every page of your website, and to use that page as the start URL. This also improves your website's usability and accessibility, helping more of your visitors (as well as search engines) get around your website.
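
If you keep a list of your pages somewhere, a site map page can be generated rather than written by hand. The sketch below is one way to do it; the function name and the page layout are hypothetical, and it produces nothing more than plain HTML links.

```python
from html import escape

def build_site_map(pages: list[tuple[str, str]]) -> str:
    """Render (url, title) pairs as a simple HTML page of plain links."""
    items = "\n".join(
        f'  <li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for url, title in pages
    )
    return (
        "<html><head><title>Site Map</title></head><body>\n"
        f"<ul>\n{items}\n</ul>\n"
        "</body></html>"
    )
```

Save the returned string as, for example, "sitemap.html", upload it with the rest of your site, and use its URL as the start URL.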

Q. Why are links in my Javascript menus being skipped?

Links in Javascript menus or DHTML are not followed. Javascript is executed on the client side by the browser. Using Javascript, it is possible to create new links as the Javascript code executes. Some examples are:

  1. A link might be generated only after the user moves the mouse over a particular area of the screen or enters some data.
  2. The Javascript code might create the URL for the link using an algorithm that takes into account other factors such as the date and time, the size of the browser window, security settings in the browser or hundreds of other factors.
  3. A link gets generated by the Javascript code only 10 seconds after the page is downloaded using a timer.

Zoom does not execute JavaScript. Even if it did execute JavaScript it would fail on the above examples. It is not possible to predict or simulate the user behaviour with the mouse or data entry. So using only JavaScript to generate links will result in those links being invisible to search engines.

Note that this is also true for external spiders such as Google, and users with Javascript disabled or incompatible browsers. For these reasons, it is generally recommended to not rely on Javascript links, and to always provide normal HTML links somewhere on your web page for accessibility purposes. A good way to do this would be to use the <noscript> tag. For more information on this issue, refer to w3.org's page for client-side scripting.
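
The difference between a script-generated link and a plain HTML fallback can be seen with any parser that does not execute JavaScript. The sketch below uses Python's standard html.parser as a stand-in for such a spider; it is not Zoom's parser, and the page content and class name are made up for the example.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects href values from <a> tags present in the static HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

# Hypothetical page: one link is written by a script, one sits in <noscript>.
page = """
<script>
  document.write('<a href="/from-script.html">Menu</a>');
</script>
<noscript>
  <a href="/from-noscript.html">Menu</a>
</noscript>
"""
collector = LinkCollector()
collector.feed(page)
```

After feeding the page, `collector.links` holds only "/from-noscript.html". The link inside the script is just text within the script body until a browser runs it, so a non-executing parser never sees it as a link.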

By contrast, links generated dynamically on the server with CGI, PHP or ASP will always be OK. This is because the code has been fully executed before it gets to the client.

Q. I get the error, [No files found to spider] Check that the URL exists and satisfies the settings in the configuration window

No file was found to index on your web site. Normally this means there is something wrong with the web server or the start point URL is wrong, but there can be many other reasons for this error. See the FAQs above regarding page skipping and the spider not finding pages for troubleshooting.

Q. I've indexed the site, uploaded the files, but I only see the search script code when I open the search page in my browser (or get a prompt asking if I want to download it) - what's wrong?

Your web server does not have ASP/PHP support enabled, so the script isn't being executed by the server. Contact your hosting company regarding support for ASP or PHP. Also make sure you are using the version of the script that the server supports. If neither ASP nor PHP is supported, you could consider the JavaScript version.
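
One rough way to confirm this symptom is to look at what the server actually returned for the search page. The sketch below is a heuristic only: the function name is hypothetical, and it assumes an unexecuted script begins with a PHP or ASP opening delimiter.

```python
def looks_unexecuted(body: str) -> bool:
    """True if the response body still starts with a server-side script tag.

    Heuristic assumption: raw PHP starts with "<?php" and raw ASP with "<%".
    """
    return body.lstrip().startswith(("<?php", "<%"))
```

If this returns True for the page your browser received, the server sent the source code instead of running it.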

If you are using the JavaScript version, make sure you have JavaScript enabled on your web browser.

Q. I am seeing the wrong text appear in the search results (or the results look strange and messed up)

This is most commonly caused by a corrupted set of index files. You may have either:

  • Incorrectly uploaded a subset of the required index files, and failed to update all the necessary files. Note that ALL files listed in the "Required files" window at the end of indexing must be uploaded to your web server. If you only upload some of the necessary files, you may end up with a mix of files from different index sessions. This can lead to unexpected behaviour such as what you are seeing.
  • Modified any of the generated files, such as the "settings" file or the search script itself. In doing so, you may have broken search functionality. Re-index, and make sure you revert to the unmodified files and see if the problem persists.
  • Uploaded the files incorrectly using a third party FTP program. Some FTP programs will mistakenly upload the files in Text/ASCII mode. All ZDAT files should be uploaded in BINARY mode. If you use Zoom’s built-in FTP features to upload the files for you, then this will not happen.

You should also make sure you are using the same version (and build) of the search script or CGI as the index files generated. This should be ensured if you followed the above requirements (uploading all the files listed in "Required files", etc.)
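
The following sketch shows why Text/ASCII mode is harmful and how a binary upload looks with Python's standard ftplib module. The simulation function is a simplification of what a text-mode transfer may do (newline translation), and the byte values and function names are hypothetical.

```python
import ftplib

def simulate_ascii_transfer(data: bytes) -> bytes:
    """Mimic the newline translation a text-mode transfer may apply."""
    return data.replace(b"\n", b"\r\n")

def upload_binary(ftp: ftplib.FTP, local_path: str, remote_name: str) -> None:
    """Upload one file in BINARY mode using ftplib's storbinary."""
    with open(local_path, "rb") as handle:
        ftp.storbinary(f"STOR {remote_name}", handle)

# Made-up binary data that happens to contain the byte 0x0A (newline).
index_bytes = bytes([0x00, 0x0A, 0x10, 0x0A])
corrupted = simulate_ascii_transfer(index_bytes)
```

In this example `corrupted` is two bytes longer than `index_bytes`, which is enough to shift every offset in a binary index file. Calling `upload_binary` on an already connected and logged-in `ftplib.FTP` object avoids the translation entirely.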

Q. Why is there no spacing between my search results?

This is usually because you have either: (a) removed the necessary CSS from your search template page, or (b) upgraded from a previous version of Zoom with an old template page which is missing the new CSS.

To fix this, simply add the following two lines of CSS to your template file ("search_template.html", or "search.html" for JavaScript). They will space out the search results and make them more readable again.

.result_block { margin-top: 15px; margin-bottom: 15px; clear: left; }
.result_altblock { margin-top: 15px; margin-bottom: 15px; clear: left; }

For more information on customizing the appearance of your search results with CSS, please see this FAQ.

Q. Why do some of my files show up in the search results with the current date and time?

These files may not have last modified date and time information available. This may be because:

  • You are using spider mode, and the web server failed to provide a last modified date for the file when queried by Zoom.
  • You are indexing a dynamically generated web page (eg: .php, .asp, .aspx, .cfm, etc.). There is no meaningful last modified date for these files normally because the file itself is just the script source code (which may not have changed), but the content served up (from a database for example), may be more recent. The web server would normally return the date and time when the script is executed and the web page is generated.

In both of these cases, you can use a last-modified Meta tag to specify a more meaningful date and time for the file. For more information, see chapter 6.7 of our Users Guide.
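
The fallback behaviour described in this answer can be sketched as follows. This is an illustration of the general idea, not Zoom's code: the function name is hypothetical, and the header lookup assumes the exact key "Last-Modified".

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def page_date(headers: dict[str, str], now: datetime) -> datetime:
    """Use the Last-Modified header when present and parseable, else `now`."""
    value = headers.get("Last-Modified")
    if value:
        try:
            return parsedate_to_datetime(value)
        except (TypeError, ValueError):
            pass  # malformed header: fall through to the current time
    return now

now = datetime(2024, 1, 1, tzinfo=timezone.utc)
static_page = page_date({"Last-Modified": "Wed, 21 Oct 2015 07:28:00 GMT"}, now)
dynamic_page = page_date({}, now)  # no header, as is common for script output
```

Here `static_page` carries the date from the header, while `dynamic_page` falls back to `now`, which is why such pages appear in the results with the time of indexing.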

Q. When I click on a link in the search results, I get a 404 "Page Not Found" or "Page Cannot Be Found" error.

This is most likely due to an incorrect Base URL setting.

If you are indexing in Offline Mode, you need to specify a base URL that corresponds to the location of your web pages when they are hosted. This needs to be correct for the search result links to point to the correct location of the files.

If you are indexing in Spider Mode, the base URL should be automatically determined for you unless you have changed it manually.

For more information on specifying the correct base URL, please refer to the "Base URL" chapter in the Users Guide.
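
To see why the base URL matters in Offline Mode, consider how a local file path could be turned into a result link. The sketch below is an assumption about the general mapping, not Zoom's actual code; the function name, folder, and URLs are hypothetical, and no URL escaping is attempted.

```python
from pathlib import PureWindowsPath

def result_url(base_url: str, start_dir: str, file_path: str) -> str:
    """Map a local file under `start_dir` to its hosted URL under `base_url`."""
    relative = PureWindowsPath(file_path).relative_to(PureWindowsPath(start_dir))
    return base_url.rstrip("/") + "/" + relative.as_posix()
```

With a start folder of C:\mysite and a base URL of "http://www.example.com/", the file C:\mysite\products\index.html maps to "http://www.example.com/products/index.html". If the base URL were entered as "http://www.example.com/old/" instead, every result link would point under "/old/" and return a 404 unless the files are really hosted there.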

Q. What does the "Suspected invalid HTML ..." warning message in the index log mean?

This is a non-critical warning message that indicates Zoom found a problem in the HTML source code of your web page.

If you neither understand HTML nor wish to learn it, then technically you can ignore this warning. Zoom will do its best to index the content on your page. But basic HTML knowledge is really essential for all web developers.

Mistakes in your HTML source can mean one or more of the following issues:

  • The page may have rendering issues in various browsers. It may appear differently from browser to browser and you may have trouble getting it to display consistently.
  • The content on your page may not be indexed entirely if the bad HTML made the content ambiguous. This may not only affect Zoom but also external search engines such as Google.

There are many tools available to help you check your page for HTML errors. The W3C offers an article explaining the need to validate ("W3C: Why Validate?") as well as the most definitive and comprehensive online HTML validator (validator.w3.org). Simpler HTML checking tools can also be a quick way to find your mistake.

Q. I am getting an error "Microsoft JScript compilation ..."

Your IIS server is configured to use JScript for the default ASP language. Zoom's ASP search script however, uses VBScript.

You can change your IIS configuration to use VBScript from the IIS Control Panel (Start -> Settings -> Control Panel -> Administrative Tools -> Internet Information Services). Expand the tree and locate your website. Right click on the website, and select "Properties" -> "Home Directory". Click on "Configuration" -> "Options" and next to "Default ASP language:", replace the "JScript" text here with "VBScript". Click OK to apply. Note that doing this will default all your other ASP scripts to be parsed by the VBScript scripting engine.

An alternative would be to specify the scripting language within "search.asp" itself. This would allow you to have multiple ASP scripts that use different scripting languages within the same website/folder/application. However, this method requires modifying the search script source code (from Zoom, with ASP selected, click on "Templates" -> "Modify search script source code"). Add the following line to the very top of the file:

<%@ LANGUAGE="VBScript" %>

Save your changes and re-index. Note that this second method can cause issues if you are trying to include the search script within another ASP page using server-side includes. This is because the line must appear at the very beginning of the script output. So if this is the case, move this line somewhere appropriate (like the beginning of your include file).

Q. My search page is showing up blank and there is strange code in the source of my template page (eg. "<!--[if gte mso 9]>", "<o:DocumentProperties>", etc.)

This is caused by editing the search_template.html (or search_template_src.html) file in Microsoft Word or Office 2000, which has a known problem with producing corrupt HTML. This bug in Microsoft Word creates web pages that cannot be opened or viewed correctly in certain versions of IE or even FrontPage.

You can check if you have this problem by opening the search_template.html file in IE, where you will see a blank page. If you "View Source", you will see the additional Office-specific markup ("gte mso 9", "DocumentProperties", etc.) which has been added to the file by Microsoft Word.
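
The "View Source" check can also be done with a short script. The sketch below simply searches the template text for the two markers quoted above; the function name is hypothetical, and the marker list is not exhaustive.

```python
# Markers taken from the examples quoted in this FAQ entry.
OFFICE_MARKERS = ("gte mso 9", "<o:DocumentProperties>")

def find_office_markup(template_text: str) -> list[str]:
    """Return the Office-specific markers present in the template text."""
    lowered = template_text.lower()
    return [marker for marker in OFFICE_MARKERS if marker.lower() in lowered]
```

An empty list means neither marker was found. Any other result indicates the template has probably been saved from Word.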

What you should do is either:

  • Remove the offending Office markup in the search_template_src.html file (Microsoft provides a tool to do this). However, we would be wary of any other inappropriate changes it may make to the HTML file.
  • Or more preferably, re-install Zoom, so that it will overwrite the template with the original file. Re-index your site and try again, this time without editing the search template in Word.

Q. I am seeing an error like, "... is not a valid Win32 application", when installing or trying to launch the Indexer (is Windows 2000 supported?)

This means you tried to install or run Zoom on a computer running Windows 2000 or an older operating system (such as Windows XP without SP2 installed).

While Zoom previously supported these operating systems, we eventually had to switch our development environment to the latest version from Microsoft. Unfortunately, Microsoft decided to drop support for Windows 2000 in their development tools (as they also dropped user support for Windows 2000 some time ago), which left us with no alternative.

The last build of Zoom which supports Windows 2000 can be downloaded from here (V6.0 build 1024). Note that there have been subsequent bug fixes since that build, and further improvements will only be available in later builds.

Q. I am having trouble with installing Zoom, is there more information?

Please consult the Users Guide (PDF). We have tried to make this as comprehensive as possible, but if you still have problems, contact us by email.
