URL Does not exist


    I am having a problem indexing a website that previously was fine. I recently switched the site to WordPress, but I cannot see how this would create the problem.

    When I start indexing, I receive the following error message:
    Check that the URL exists and satisfies the settings in the configuration window.

    The site is at http://www.cla.asn.au and has some 1600 posts. The log reports an "Invalid URL or domain name", but the URL is valid!

    Thoughts?

    The log file reports:

    02|09/23/13 12:10:10|Config file loaded: C:\Zoom Search Engine 6.0\cla1.zcfg
    10|09/23/13 12:10:17|Start indexing (spider mode) at Mon Sep 23 12:10:17 2013
    02|09/23/13 12:10:17|Maximum number of words: 500000
    02|09/23/13 12:10:17|Maximum number of files: 20000
    02|09/23/13 12:10:17|Will scan files with extensions
    02|09/23/13 12:10:17| .htm
    02|09/23/13 12:10:17| .html
    02|09/23/13 12:10:17| .txt
    02|09/23/13 12:10:17| .php
    02|09/23/13 12:10:17| .asp
    02|09/23/13 12:10:17| .cgi
    02|09/23/13 12:10:17| .aspx
    02|09/23/13 12:10:17| .pl
    02|09/23/13 12:10:17| .php3
    02|09/23/13 12:10:17| .pdf
    02|09/23/13 12:10:17| .doc
    02|09/23/13 12:10:17| .dot
    02|09/23/13 12:10:17| .xls
    02|09/23/13 12:10:17| .xlt
    02|09/23/13 12:10:17| .ppt
    02|09/23/13 12:10:17| .pot
    02|09/23/13 12:10:17| .pps
    02|09/23/13 12:10:17| .wpd
    02|09/23/13 12:10:17| .djvu
    02|09/23/13 12:10:17| .swf
    02|09/23/13 12:10:17| .mp3
    02|09/23/13 12:10:17| .dwf
    02|09/23/13 12:10:17|Spider from: http://www.cla.asn.au/
    02|09/23/13 12:10:17|Web site URL: http://www.cla.asn.au/
    02|09/23/13 12:10:17|Estimated RAM required during index process: 389280 KB
    02|09/23/13 12:10:17|Initiating HTTP session (thread #1) ...
    14|09/23/13 12:10:17|DL Thread #1, got URL (http://www.cla.asn.au/) off queue
    04|09/23/13 12:10:17|Downloading file http://www.cla.asn.au/
    09|09/23/13 12:10:17|Could not download file: http://www.cla.asn.au/ (Invalid URL or domain name)
    02|09/23/13 12:10:17|Initiating HTTP session (thread #5) ...
    02|09/23/13 12:10:17|Initiating HTTP session (thread # ...
    02|09/23/13 12:10:17|Initiating HTTP session (thread #7) ...
    02|09/23/13 12:10:17|Initiating HTTP session (thread #4) ...
    02|09/23/13 12:10:17|Initiating HTTP session (thread #9) ...
    02|09/23/13 12:10:17|Initiating HTTP session (thread #3) ...
    02|09/23/13 12:10:17|Initiating HTTP session (thread #2) ...
    02|09/23/13 12:10:17|Initiating HTTP session (thread #6) ...
    02|09/23/13 12:10:17|Initiating HTTP session (thread #10) ...
    08|09/23/13 12:10:19|No files found to spider from http://www.cla.asn.au/
    07|09/23/13 12:10:19|Indexing failed
    14|09/23/13 12:10:19|Waiting for threads to finish ...
    02|09/23/13 12:10:19|Cleaning up memory used for index data... please wait.
    02|09/23/13 12:10:19|Finished cleaning up memory.
    10|09/23/13 12:10:42|Start indexing (spider mode) at Mon Sep 23 12:10:42 2013
    02|09/23/13 12:10:42|Maximum number of words: 500000
    02|09/23/13 12:10:42|Maximum number of files: 20000
    02|09/23/13 12:10:42|Will scan files with extensions
    02|09/23/13 12:10:42| .htm
    02|09/23/13 12:10:42| .html
    02|09/23/13 12:10:42| .txt
    02|09/23/13 12:10:42| .php
    02|09/23/13 12:10:42| .asp
    02|09/23/13 12:10:42| .cgi
    02|09/23/13 12:10:42| .aspx
    02|09/23/13 12:10:42| .pl
    02|09/23/13 12:10:42| .php3
    02|09/23/13 12:10:42| .pdf
    02|09/23/13 12:10:42| .doc
    02|09/23/13 12:10:42| .dot
    02|09/23/13 12:10:42| .xls
    02|09/23/13 12:10:42| .xlt
    02|09/23/13 12:10:42| .ppt
    02|09/23/13 12:10:42| .pot
    02|09/23/13 12:10:42| .pps
    02|09/23/13 12:10:42| .wpd
    02|09/23/13 12:10:42| .djvu
    02|09/23/13 12:10:42| .swf
    02|09/23/13 12:10:42| .mp3
    02|09/23/13 12:10:42| .dwf
    02|09/23/13 12:10:42|Spider from: http://www.cla.asn.au/
    02|09/23/13 12:10:42|Web site URL: http://www.cla.asn.au/
    02|09/23/13 12:10:42|Estimated RAM required during index process: 389280 KB
    02|09/23/13 12:10:43|Initiating HTTP session (thread #1) ...
    14|09/23/13 12:10:43|DL Thread #1, got URL (http://www.cla.asn.au/) off queue
    04|09/23/13 12:10:43|Downloading file http://www.cla.asn.au/
    09|09/23/13 12:10:43|Could not download file: http://www.cla.asn.au/ (Invalid URL or domain name)
    02|09/23/13 12:10:43|Initiating HTTP session (thread #2) ...
    02|09/23/13 12:10:43|Initiating HTTP session (thread #6) ...
    02|09/23/13 12:10:43|Initiating HTTP session (thread #3) ...
    02|09/23/13 12:10:43|Initiating HTTP session (thread #4) ...
    02|09/23/13 12:10:43|Initiating HTTP session (thread # ...
    02|09/23/13 12:10:43|Initiating HTTP session (thread

  • #2
    The site seems to index OK from here.

    So maybe:
    • The site was down for a short period, e.g. a bad wireless connection.
    • Your internet connection was down for a short period.
    • You have some security software or a firewall blocking internet access for the Zoom application.


    As an experiment, can you try doing the indexing from a different PC?
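
    The three possibilities above can be narrowed down outside of Zoom with a short script. This is only a rough sketch, not part of Zoom: the function name `diagnose` and its `resolve`/`fetch` parameters are made up for illustration. It checks the URL syntax, then the DNS lookup, then the HTTP request, and reports the first step that fails.

```python
import socket
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def diagnose(url, resolve=socket.gethostbyname, fetch=urllib.request.urlopen):
    """Return a short string describing the first step that fails for url.

    `resolve` and `fetch` are hypothetical hooks (not Zoom settings) so the
    DNS and HTTP steps can be swapped out, e.g. when testing offline.
    """
    parts = urlsplit(url)
    # Step 1: is the URL well formed (scheme plus host name)?
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return "malformed URL"
    # Step 2: does the host name resolve from this PC?
    try:
        resolve(parts.hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return f"DNS lookup failed: {exc}"
    # Step 3: does the server answer the request?
    try:
        with fetch(url, timeout=10) as response:
            return f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:  # server answered with an error code
        return f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:  # blocked, refused, timed out
        return f"connection failed: {exc}"
```

    Calling `diagnose("http://www.cla.asn.au/")` from the indexing PC and from a second PC, and comparing the two results, should show whether the failure is specific to one machine.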



    • #3
      I index six other sites from this PC without issue, so I don't think it can be the firewall or the PC, hence the call for help. The site has been up and functioning, and the issue has recurred over several days rather than as a one-off.

      Thoughts?



      • #4
        Can you e-mail me your configuration file (C:\Zoom Search Engine 6.0\cla1.zcfg)?

        What exact version of Zoom are you using?

        Also, as an experiment, can you try doing the indexing from a different PC?



        • #5
          Thanks for the help. I rebuilt the configuration file from scratch and it is now working.

          Initially I still had problems, as indexing gave wildly differing results (anywhere from 300+ to 900+ files). After reducing the number of threads to 5 and adding a 0.2 sec delay, I am getting the expected 1900+ indexed files. I am using V6 (build 1029).



          • #6
            If the delay fixes the problem, then it sounds like your site might be overloaded and starts failing when put under a bit of load, e.g. returning HTTP 503 (Service Unavailable) error codes, or just failing to serve pages. Even if you have solved the problem for Zoom, real visitors might hit the same problem from time to time and be unable to access pages when the site is busy.
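
            One way to check this independently of Zoom is to request a batch of the site's pages with a chosen number of threads and delay, and tally the responses. The following is a minimal sketch under assumptions: `probe` and `fetch_status` are illustrative names, and the delay here is a pause after each request within a worker thread, which may not match exactly how Zoom applies its own delay setting.

```python
import time
import urllib.error
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def fetch_status(url):
    """Return the HTTP status code for url, or 'error' if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # e.g. 503 when the server reports it is too busy
    except (urllib.error.URLError, OSError):
        return "error"  # refused, timed out, or otherwise not served


def probe(urls, threads=5, delay=0.2, fetch=fetch_status):
    """Fetch urls with at most `threads` workers and tally the outcomes.

    Each worker sleeps `delay` seconds after every request (an assumed
    approximation of a crawler's per-request delay).
    """
    def worker(url):
        result = fetch(url)
        time.sleep(delay)
        return result

    with ThreadPoolExecutor(max_workers=threads) as pool:
        return Counter(pool.map(worker, urls))
```

            Running `probe` over the same list of page URLs once with `threads=10, delay=0` and once with `threads=5, delay=0.2` and comparing the counts of 200, 503 and `'error'` results would indicate whether the server starts failing under the heavier load.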
