Confused about configuration

  • Confused about configuration

    Hi,

    I have a main website whose videos and other data are served from a different site. I would like the content on both sites to be indexed, but I am having trouble configuring this.

    I want to start spidering from this URL: http://webdev.multimediaservices.ca/en/sitemap

    This sitemap links to all my pages, but also to some external sites. I only want to index my own site, http://webdev.multimediaservices.ca, and this site, http://cdn.forces.ca.

    In Start Options/Start spider from this URL > MORE, my spider url is http://webdev.multimediaservices.ca/en/sitemap and my base url is http://webdev.multimediaservices.ca/en/; http://cdn.forces.ca/.

    I've tried selecting "Index page and follow internal links", as well as "Index page and follow internal and external links", but the files on cdn.forces.ca are always skipped. Here is a sample log entry:
    Skipping http://cdn.forces.ca/_VIDEOS2010/wmv/1140_how-to-apply_en.wmv (External site - does not match base URL)

    What am I missing?

    Thanks.

  • #2
    Originally posted by crichard
    In Start Options/Start spider from this URL > MORE, my spider url is http://webdev.multimediaservices.ca/en/sitemap and my base url is http://webdev.multimediaservices.ca/en/; http://cdn.forces.ca/.
    The problem may just be the extra space after the semi-colon (";") character. Check again and make sure your base URL is exactly:
    http://webdev.multimediaservices.ca/...cdn.forces.ca/

    No spaces, no dots at the end.

    This should work with the default "Index page and follow internal links" setting (the first spidering option in the list). Once http://cdn.forces.ca/ is correctly listed in the base URL, links to that site are treated as internal links.
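
    To illustrate why a single stray space can cause the "does not match base URL" message, here is a minimal Python sketch. It is not Zoom's actual code: it assumes, purely for illustration, that the base URL field is split on ";" without trimming whitespace and that each link is compared by literal prefix. The function names `parse_base_urls` and `is_internal` are hypothetical, and the space-free string is simply the value from the original post with the space removed.

```python
def parse_base_urls(raw: str) -> list[str]:
    """Split a semicolon-separated base URL field.

    Assumption for illustration only: entries are NOT trimmed, so a space
    after ';' becomes part of the second entry.
    """
    return [part for part in raw.split(";") if part]


def is_internal(url: str, base_urls: list[str]) -> bool:
    """Treat a link as internal if it starts with any base URL entry."""
    return any(url.startswith(base) for base in base_urls)


# The file from the sample log entry in the first post.
video = "http://cdn.forces.ca/_VIDEOS2010/wmv/1140_how-to-apply_en.wmv"

# Base URL exactly as originally entered (note the space after ';').
with_space = parse_base_urls(
    "http://webdev.multimediaservices.ca/en/; http://cdn.forces.ca/"
)
# The same value with the space removed.
without_space = parse_base_urls(
    "http://webdev.multimediaservices.ca/en/;http://cdn.forces.ca/"
)

print(is_internal(video, with_space))     # False: entry is " http://cdn.forces.ca/"
print(is_internal(video, without_space))  # True
```

    Under these assumptions, the second entry in the original setting begins with a space, so no link on cdn.forces.ca can ever match it by prefix.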
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine



    • #3
      Thank you, that was the problem!
