Indexing non existent pages

  • Indexing non existent pages

    Zoom indexes my site, but it is also indexing pages that do not exist. The log says it downloaded the missing page and indexed it. How do I get the indexer to skip over missing pages?

    Doug

  • #2
    Zoom doesn't create pages on your site. So if pages are being downloaded and indexed, then the pages must in fact exist somewhere.

    Or maybe you have a few broken links on your site? But in that case there would be no download or indexing message in the log.

    If you continue to have a problem, can you post the relevant section of the log, including the URLs in question?


    • #3
      FIXED: phantom pages being indexed

      I found the problem: any request for a page that does not exist is redirected to a custom 404 page, so that page gets "indexed". I turned off the custom 404 temporarily so the site can be indexed.
      However, this is not a good long-term solution.

      Doug


      • #4
        Actually, this is a sign that your custom 404 pages are not set up correctly.

        A web server needs to return an HTTP status code for each page requested. Zoom does not index pages which return a 404 status code ("Not Found").

        However, it is a common mistake to set up custom 404 pages so that they return a 200 ("OK") HTTP status code, which is the code used for normal, valid pages. This is most likely the case here, which is why Zoom is indexing these pages.

        There is more information on this issue here (and a link to a tool to check your status code), also here and even on Wikipedia (see "Soft 404" and "False 404 errors").
        --Ray
        Wrensoft Web Software
        Sydney, Australia
        Zoom Search Engine
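The status-code check described above can also be scripted. The following is only a minimal sketch using Python's standard library, not anything Zoom itself provides: the function names `fetch_status` and `is_soft_404` are made up for this illustration, and the URL in the usage comment is a placeholder. Note that `urllib` follows redirects, so the code reported is the one returned by the final page.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server finally answers with for `url`."""
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # Redirects have already been followed at this point.
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx answers; the status code is on the exception.
        return err.code


def is_soft_404(url_of_missing_page: str) -> bool:
    """True if a URL that is known NOT to exist is answered with 200 ("OK")."""
    return fetch_status(url_of_missing_page) == 200


# Usage (placeholder URL, pick any path that does not exist on your site):
#   is_soft_404("http://www.example.com/no-such-page-12345.html")
# A correctly configured custom 404 page should make this return False.
```

If the function returns True for a page you know is missing, the custom error page is being served with a 200 status, which would explain why an indexer treats it as a normal page.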


        • #5
          Solved Redux

          Mea culpa. Thanks Ray, you were right. Zoom now indexes the correct pages.

          I used Fiddler and found out the redirect was returning a 200 instead of a 404. I'm working on a new site and didn't check the return code. Also, I'll tell my host their solution isn't proper.

          I ended up using

          ErrorDocument 404 /custom.php

          for an Apache server.

          Doug
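One plausible cause of the redirect seen in Fiddler, as far as I understand Apache's documentation, is an `ErrorDocument` target written as a full URL: Apache then sends the client a redirect, and the original 404 status is lost. A local path such as `/custom.php` keeps the 404. The sketch below is only an illustration of how one might scan a config or .htaccess file for that pattern; the function name is hypothetical, and nothing in this thread confirms this was the host's actual setup.

```python
import re

# Matches directive lines such as: ErrorDocument 404 /custom.php
# Commented-out lines (starting with '#') do not match.
_ERROR_DOC = re.compile(r"^\s*ErrorDocument\s+(\d{3})\s+(\S.*?)\s*$", re.IGNORECASE)


def find_redirecting_error_documents(config_text: str) -> list[tuple[int, str]]:
    """Return (status, target) pairs whose target is a full http(s) URL."""
    hits = []
    for line in config_text.splitlines():
        match = _ERROR_DOC.match(line)
        if not match:
            continue
        status, target = int(match.group(1)), match.group(2)
        # A full URL makes Apache redirect instead of serving the page in place.
        if re.match(r"https?://", target, re.IGNORECASE):
            hits.append((status, target))
    return hits


# The directive from this thread uses a local path, so nothing is flagged:
#   find_redirecting_error_documents("ErrorDocument 404 /custom.php")
```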
