PDF Indexing with indirect access to PDF via Download.aspx?aguid=X&node=Y


  • PDF Indexing with indirect access to PDF via Download.aspx?aguid=X&node=Y

    Hi,

    I am trying to index (in spider mode) a set of PDF files that are served through the following URL pattern:



    http://www.mydomain.com/files/DownloadFile.aspx?aguid=X&node=Y

    e.g.

    http://www.mydomain.com/files/DownloadLiterature.aspx?aguid=6119f45a-34b4-4fc5-9598-ae0642fa6aa8&node=4347


    I created a simple text file with a list of URLs and imported it.
    The PDF plugin is installed.

    When I run the indexer, it downloads the first PDF and reports:

    Processing PDF file http://www.mydomain.com/files/DownloadLiterature.aspx?aguid=6119f45a-34b4-4fc5-9598-ae0642fa6aa8&node=4347
    Then, for the remaining links, it says:

    Additional start URL invalid or already scanned: http://www.mydomain.com/files/DownloadLiterature.aspx?aguid=b61ef1e2-abaf-40ef-af87-8e718ef6478b&node=1723
    Then the indexing ends.
    Any ideas what the problem is?

  • #2
    That error message implies that the second URL was already indexed, or considered invalid for some reason.

    The first thing to do would be to turn on Verbose Mode and see if you get more messages (such as a Skip Message) related to this file and why it may have considered it invalid.

    The second thing would be to check whether you have any other start points or pages indexed as part of your configuration. You mentioned that you imported a list of URLs. If any start point before this URL links to many other pages, and crawling those pages happens to reach this URL, then it will already have been scanned by the time it comes up as a start point, and it will be rejected as you see above.
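
    Before sending the configuration, it may also be worth ruling out problems in the imported text file itself. The following is only a rough sketch in Python, not part of Zoom and not a description of how the indexer validates start URLs. The function name `check_url_list` and the file name `urls.txt` are hypothetical. It flags lines that are not absolute http/https URLs and lines that repeat an earlier URL.

```python
import sys
from urllib.parse import urlsplit


def check_url_list(lines):
    """Split raw lines from a URL list into (valid, problems).

    problems holds (line_number, url, reason) tuples, where reason is
    "invalid" or "duplicate". This is an illustrative check only; the
    indexer may apply different rules.
    """
    seen = set()
    valid = []
    problems = []
    for number, raw in enumerate(lines, start=1):
        # Drop a stray byte order mark and surrounding whitespace.
        url = raw.replace("\ufeff", "").strip()
        if not url:
            continue  # ignore blank lines
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append((number, url, "invalid"))
            continue
        # Compare hosts case-insensitively; keep path and query as written.
        key = (parts.scheme, parts.netloc.lower(), parts.path, parts.query)
        if key in seen:
            problems.append((number, url, "duplicate"))
            continue
        seen.add(key)
        valid.append(url)
    return valid, problems


if __name__ == "__main__":
    # Hypothetical file name; pass the real list as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "urls.txt"
    with open(path, encoding="utf-8") as handle:
        ok, bad = check_url_list(handle)
    print(f"{len(ok)} usable URLs")
    for number, url, reason in bad:
        print(f"line {number}: {reason}: {url}")
```

    If this reports nothing, the list itself is probably fine and the cause is more likely in the configuration or in an earlier start point, as described above.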

    It would be helpful if you can provide us with your ZCFG file containing your full indexing configuration. This way we can check if prior start points may be related, etc. You can e-mail these files to us.
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine



    • #3
      It would also be worth confirming that you are using the latest version and build. You will find the latest build available here.
      --Ray
