Index page with links, but not follow the links


  • Index page with links, but not follow the links

    I have a page (links.asp) that contains categories of links. A user can select "Entertainment" and it will re-submit the page with the category as a querystring (links.asp?cat=1 for example).

    I would like Zoom to index the page, but not follow the external links. I have set up a page for Zoom to index. It is hidden from regular users' view and is used only by the search engine. It contains a list of all the headings, each linking to links.asp with the matching query string. I would like the search engine to follow the links on this page and index the resulting pages, but not follow the external links. I have set this up under the "MORE" button on the Spider tab, but it doesn't seem to find the results. For example, if I have a link to "YMCA" on the entertainment page with a link to their site, it won't come up in the search engine. I would like the entertainment page to come up in the search results when searched.

    Thanks,
    Shawn

  • #2
    I am not sure if I understand your situation correctly.

    You don't want the spider to follow external links. But this is the default behaviour of the spider and no extra configuration is required to make this happen.

    But then at the end of your post you say you have an [external] link to YMCA and you want it to appear in the search results? Which seems to contradict the initial requirement?

    It also isn't clear to me what the links ASP script does, nor if the entertainment page is internal or external, and what settings you are using for your additional spider start point(s).

    So maybe you can provide us with:
    1/ The actual URLs, so we can see the pages in question.
    2/ The spider start point(s) you are using and the spider options for each point.
    3/ The URLs of the external pages that you want in the results.
    4/ Maybe an extract of the Zoom indexing log, if you think that will help us understand the situation.

    ------
    David


    • #3
      Hi David,

      Sorry for the confusing post. All my pages are internal and I want them indexed so that people know the content is there, but I do not want to spider the external site.

      I have an ASP Script that is not on any menu systems. Call it - zoomDBSearch.asp for example. This is a start point in Zoom and it simply lists all the categories of external links on the site with the appropriate querystring. Example:
      Code:
      <a href="links.asp?cat=1">Entertainment</a>
      <a href="links.asp?cat=2">Family</a>
      <a href="links.asp?cat=3">Tourism</a>
      The only purpose of this file is to provide Zoom the proper query strings to pull up the complete page of Entertainment/Family/Tourism links. When the spider visits links.asp with the proper querystring then it will get a list with a description like:
      Code:
      <a href="http://www.ymca.com">YMCA</a>
      <a href="http://www.website.com">Website Description</a>
      This is the page whose content I would like Zoom to index, without indexing any of the external sites. That way, if someone typed YMCA into the search engine, it would come up with any pages that contain the text YMCA, and the search results would include (somewhere) a link to the entertainment page (links.asp?cat=1).

      Thanks,
      Shawn


      • #4
        Hi Shawn,

        You should be able to achieve what you're after with the default spider setting - "Index page and follow internal links".

        This default setting follows and indexes all internal URLs (such as "links.asp?cat=2"). It will index the text used for links, such as "YMCA" and "Website Description" in your example. The external links will not be followed, because URLs such as "http://www.ymca.com" are considered external to your site.
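
        The internal/external distinction described above can be illustrated with a short Python sketch. It is not Zoom's actual implementation: the host-matching rule, the function names, and the www.example.com host are assumptions made for illustration. Only the link targets come from this thread.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def split_links(page_url, html):
    """Return (internal, external) lists of absolute URLs found in html.

    Assumed rule: a link is internal when its host matches the host of
    page_url. A real spider may apply a different or stricter test.
    """
    collector = LinkCollector()
    collector.feed(html)
    base_host = urlparse(page_url).netloc.lower()
    internal, external = [], []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolves relative links
        host = urlparse(absolute).netloc.lower()
        (internal if host == base_host else external).append(absolute)
    return internal, external


# Hypothetical site host; the two links are the ones from the thread.
PAGE = "http://www.example.com/zoomDBSearch.asp"
SAMPLE = (
    '<a href="links.asp?cat=1">Entertainment</a>'
    '<a href="http://www.ymca.com">YMCA</a>'
)
internal, external = split_links(PAGE, SAMPLE)
print("internal:", internal)
print("external:", external)
```

        Under this rule the relative link links.asp?cat=1 resolves to the site's own host and would be followed, while the YMCA URL would only contribute its link text to the page that contains it.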

        However, something else we noticed while looking at your website (sent to us via private message) was that you had ZOOMSTOP and ZOOMRESTART blocks specified. But before the actual content of the page, and right after your side navigation menu, you have a ZOOMRESTART tag specified as follows:
        Code:
        <!-- ZOOMRESTART -->
        Note the spaces between the ZOOMRESTART keyword and the HTML comment characters. This will not be recognized. Zoom requires the tag to be <!--ZOOMRESTART--> (no spaces). I think this could well be the actual cause of the problem, because it means that the ZOOMSTOP block was not closed (until further down the page, where it was specified correctly), resulting in the main content of the page being excluded from indexing.
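
        A quick way to catch this kind of mistake is to scan the page source for Zoom tags that carry stray whitespace. The Python sketch below assumes the tags are written as HTML comments in upper case (<!--ZOOMSTOP--> and <!--ZOOMRESTART-->); the function name and the sample page are hypothetical, and it is a helper for checking your own pages, not part of Zoom.

```python
import re

# Matches HTML comments whose body is ZOOMSTOP or ZOOMRESTART, allowing
# optional whitespace so that malformed variants are captured as well.
ZOOM_TAG = re.compile(r"<!--(\s*)(ZOOMSTOP|ZOOMRESTART)(\s*)-->")


def find_malformed_zoom_tags(html):
    """Return (line_number, tag_text) for each Zoom tag with stray whitespace."""
    problems = []
    for match in ZOOM_TAG.finditer(html):
        if match.group(1) or match.group(3):
            # Count newlines before the match to get a 1-based line number.
            line_number = html.count("\n", 0, match.start()) + 1
            problems.append((line_number, match.group(0)))
    return problems


# Hypothetical page: the ZOOMSTOP tag is well formed, the ZOOMRESTART is not.
sample = "\n".join([
    "<!--ZOOMSTOP-->",
    "<div>side navigation menu</div>",
    "<!-- ZOOMRESTART -->",
    "<p>Main content, including the YMCA link text</p>",
])
problems = find_malformed_zoom_tags(sample)
for line_number, tag in problems:
    print(f"line {line_number}: {tag!r} contains spaces")
```

        Running it against each template that uses these tags should flag any restart tag that would leave a ZOOMSTOP block open.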

        Hope that helps, and let us know if this didn't solve your problem.
        --Ray
        Wrensoft Web Software
        Sydney, Australia
        Zoom Search Engine


        • #5
          Yep, that took care of it....

          Now I have to play a bit more with my search results. I had them where I wanted them before, but now everything has changed around a bit.

          Thanks!
          Shawn
