Getting error every time I try indexing website...


  • Getting error every time I try indexing website...

    I'm trying to index a website, and every time I do, I get an error saying there are no files found to spider. I'm not sure why I'm getting this error. Can someone help me? I'm hoping this program works so I can add a search function to my website.

  • #2
    This can happen for many reasons, e.g.:
    You don't have an internet connection.
    You entered an incorrect start point URL.
    Your firewall is blocking internet traffic.
    Your web site is down.

    We would need to see the indexer log to narrow it down.

    See also these FAQs:
    Q. Why are some of my pages being skipped by the indexer?
    Q. Why are links in my Javascript menus being skipped?
    Q. I am indexing with spider mode but it is not finding all the pages on my web site



    • #3
      It's got to be something on my host or something misconfigured in Zoom Search, because I've tried indexing on 3 different computers inside and outside of my firewall. I'm not sure what I'm doing wrong. Any help would be appreciated. Here is the log from a scan I just tried:

      11:55:38 - Maximum file size: 2097152
      11:55:38 - Will scan files with extensions
      11:55:38 - .htm
      11:55:38 - .html
      11:55:38 - .txt
      11:55:38 - .php
      11:55:38 - .asp
      11:55:38 - .cgi
      11:55:38 - .aspx
      11:55:38 - .pl
      11:55:38 - .php3
      11:55:38 - .pdf
      11:55:38 - Spider from: http://www.glendaleheights.org/
      11:55:38 - Web site URL: http://www.glendaleheights.org/
      11:55:38 - Estimated RAM required during index process: 563940 KB
      11:55:39 - Initiating HTTP session (thread #2) ...
      11:55:39 - DL Thread #2, got URL (http://www.glendaleheights.org/) off queue
      11:55:39 - [DOWNLOAD] Downloading file http://www.glendaleheights.org/
      11:55:39 - Initiating HTTP session (thread #1) ...
      11:55:39 - [DOWNLOAD] URL redirected to: http://glendaleheights.org/ [thread #2]
      11:55:39 - [SKIPPED] Skipping http://glendaleheights.org/ (External site - does not match base URL)
      11:55:39 - Check that the URL exists and satisfies the settings in the configuration window.
      11:55:39 - [ERROR] No files found to spider from http://www.glendaleheights.org/
      11:55:39 - Indexing failed
      11:55:39 - Waiting for threads to finish ...
      11:55:39 - Cleaning up memory used for index data... please wait.
      11:55:39 - Finished cleaning up memory.
      11:55:40 - Indexing aborted at Mon Aug 08 11:55:40 2016



      • #4
        Your site is redirecting from the "www." domain to the domain without the "www." prefix.

        Technically, these can be two different websites, so the redirected URL is rejected as not matching your base URL of "http://www.glendaleheights.org/".
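To illustrate why the redirect target counts as an external site, here is a minimal Python sketch. It assumes the check boils down to an exact comparison of scheme and host name; `same_site` is a hypothetical helper written for this example, not Zoom's actual implementation.

```python
from urllib.parse import urlsplit


def same_site(url: str, base_url: str) -> bool:
    """Return True when both URLs share the same scheme and host name."""
    a, b = urlsplit(url), urlsplit(base_url)
    # urlsplit already lower-cases the scheme and hostname for us.
    return (a.scheme, a.hostname) == (b.scheme, b.hostname)


base = "http://www.glendaleheights.org/"
redirected = "http://glendaleheights.org/"

print(same_site(base, base))        # True
print(same_site(redirected, base))  # False: "www." makes it a different host
```

Under that assumption, the only URL the spider was given redirects to a host outside the base URL, so nothing is left in the queue to index.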

        The easiest way to fix this is to change your start URL from "http://www.glendaleheights.org/" to "http://glendaleheights.org/". This will update your base URL accordingly. As long as the rest of your site links consistently to the non-"www." domain, there should be no further issues.

        An alternative solution is to change your base URL setting (click the "More" button and then select "Edit") and specify two acceptable base URLs, separated by a semicolon, in the form "http://www.glendaleheights.org/;http://glendaleheights.org/".
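As a rough sketch of how a semicolon-separated base URL setting could be interpreted, the Python below splits the setting and accepts any URL that starts with one of the listed bases. The helper names and the plain prefix match are assumptions made for illustration (Zoom's real matching rules may differ), and the `about.html` path is a made-up example.

```python
def parse_base_urls(setting: str) -> list[str]:
    """Split a semicolon-separated setting into individual base URLs."""
    return [part.strip() for part in setting.split(";") if part.strip()]


def is_in_scope(url: str, setting: str) -> bool:
    """Return True when the URL starts with any of the configured base URLs."""
    return any(url.startswith(base) for base in parse_base_urls(setting))


setting = "http://www.glendaleheights.org/;http://glendaleheights.org/"

print(parse_base_urls(setting))
print(is_in_scope("http://glendaleheights.org/about.html", setting))      # True
print(is_in_scope("http://www.glendaleheights.org/about.html", setting))  # True
print(is_in_scope("http://example.com/", setting))                        # False
```

With both hosts listed, a redirect between the "www." and non-"www." forms stays in scope either way.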
        --Ray
        Wrensoft Web Software
        Sydney, Australia
        Zoom Search Engine

