Index only a part of website and linked documents

  • Index only a part of website and linked documents

    Hello,

    I want to index only part of my website (www.domain.com/news/) and also the linked documents (PDFs, DOCs, …). The problem is that all documents are skipped because they are located under www.domain.com/files/ and the Base URL is set to www.domain.com/news/.

    I tried setting Start URL="www.domain.com/news/" and Base URL="www.domain.com/news/;www.domain.com/files/", but it doesn't work and only one page is indexed (only the root of www.domain.com/news/).

    How can I accomplish this?

    Many thanks in advance for your help

    John

  • #2
    You are doing the right thing. The base URL needs to contain both URLs, separated by a semicolon.

    But you need the full URL, including the protocol (http://).

    So try this for the base URL:
    http://www.domain.com/news/;http://www.domain.com/files/
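For illustration, here is a minimal Python sketch of how a semicolon-separated base URL list could be checked against a link. This is not Zoom's actual code; the function name `matches_base_url` and the plain prefix comparison are assumptions made for the example.

```python
# Hypothetical sketch: not the indexer's real implementation.
def matches_base_url(url: str, base_url_setting: str) -> bool:
    """Return True if url starts with any of the semicolon-separated base URLs."""
    # Split the setting on ';' and ignore empty entries.
    bases = [b.strip() for b in base_url_setting.split(";") if b.strip()]
    # A link is "internal" if it begins with at least one base URL.
    return any(url.startswith(base) for base in bases)


BASE = "http://www.domain.com/news/;http://www.domain.com/files/"

news_ok = matches_base_url("http://www.domain.com/news/article.html", BASE)
files_ok = matches_base_url("http://www.domain.com/files/report.pdf", BASE)
other_ok = matches_base_url("http://www.domain.com/about/", BASE)
# An entry without the protocol cannot match a full link by prefix.
no_protocol_ok = matches_base_url(
    "http://www.domain.com/files/report.pdf", "www.domain.com/files/"
)
```

Under this assumption, links under /news/ and /files/ are accepted, while other paths, and any base entry written without http://, are rejected.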


    • #3
      Hello,

      Thanks for your answer. In spider mode I set "http://www.domain.com/news/" as the Start URL and "http://www.domain.com/news/;http://www.domain.com/files/" as the Base URL, but it doesn't seem to work and only one page is indexed (only the root of www.domain.com/news/). All other pages under http://www.domain.com/news/ are skipped with "External site – does not match base URL".

      Thanks for your help

      John


      • #4
        We've confirmed that this is a bug in the current release.

        The multiple base URL feature (URLs separated by semicolon delimiters) is broken because the URL escaping feature we added inadvertently encodes the ";" character. This will be fixed in the next release, V7 build 1016.
        --Ray
        Wrensoft Web Software
        Sydney, Australia
        Zoom Search Engine
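As a rough illustration of the kind of bug described above, the following Python sketch shows how escaping the whole setting before splitting it can destroy the ";" delimiter. It is only a guess at the mechanism, not the product's code; the helper names and the use of `urllib.parse.quote` are assumptions for the example.

```python
from urllib.parse import quote

SETTING = "http://www.domain.com/news/;http://www.domain.com/files/"


def split_after_escaping(setting: str) -> list[str]:
    """Hypothetical buggy order: escape first, so ';' becomes '%3B'."""
    escaped = quote(setting, safe=":/")
    return [b for b in escaped.split(";") if b]


def escape_after_splitting(setting: str) -> list[str]:
    """Hypothetical fixed order: split on ';' first, then escape each URL."""
    return [quote(b, safe=":/") for b in setting.split(";") if b]


buggy_bases = split_after_escaping(SETTING)
fixed_bases = escape_after_splitting(SETTING)

page = "http://www.domain.com/news/article.html"
buggy_match = any(page.startswith(b) for b in buggy_bases)
fixed_match = any(page.startswith(b) for b in fixed_bases)
```

In the buggy order the list contains a single entry with "%3B" in the middle, so no real page can match it by prefix, which would be consistent with pages being reported as not matching the base URL.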
