Two Questions...Categories and Excluding text

  • Two Questions...Categories and Excluding text

    We've been using a search engine product that we're not entirely happy with. Zoom has been suggested twice and looks great...especially for the price. I had two questions regarding the product that I couldn't find a direct answer to in the documentation.

    Categories:

    This is a great option and looks like it would allow us to create multiple indexes for each of our separate 'sub sites'.

    However, we use URL rewriting, so, often, the same URL can be written in two different ways. Is there a way to specify two different rules for the same category?

    For instance, we want a category called 'district 10' and for it to include any URL that contains either "district/10" OR "siteID=District10". Possible?

    Excluding Text:

    We love the ability to specifically index meta data. I'm also looking for a way to exclude certain content on each page. For instance, we don't want the navigation bar indexed on each page, so is there a way to mark it so that part of the page isn't indexed? Something like a pair of tags placed around my navigation bar?

  • #2
    I love the easy questions.

    Question 1:
    =======
    From the included Zoom help file:

    A category definition has three fields:

    1. Name: The name of the category must be unique. These category names will be listed in the dropdown box of the search page.

    2. Pattern: The pattern is used to determine what pages belong to the category. It is matched against a page’s full path or URL. Should the pattern text appear anywhere within a page’s URL, the page will be filed under that category. Note that this includes the base URL or domain name of each page, so that you can index multiple domains, and have each defined as a separate category.

    For example, a pattern of “test” will collect the following pages:
    http://www.mysite.com/test.html
    http://www.mysite.com/test/index.html
    http://www.test.com/
    …etc.

    Hints: You can use patterns such as “.pdf” to create categories based on file extensions.

    You can also specify multiple patterns for each category, separated by a semi-colon character. For example, a category named “Downloads” may contain a variety of file formats with a pattern like “.pdf;.doc;.xls;.ppt;.exe”

    3. Description: This is a short description of the category that will only be used internally in the indexer for your own convenience in the future. It is not a required field.
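
To make the matching rule in the help text concrete, here is a minimal Python sketch of it: a page is filed under a category if any of the category's semi-colon-separated patterns appears anywhere in the page's URL. This is an illustration only, not Zoom's own code. The function name `matching_categories` and the sample URLs are hypothetical, and the sketch compares case-sensitively, which is an assumption the help text does not confirm.

```python
def matching_categories(url, categories):
    """Return the names of all categories whose pattern field matches the URL.

    `categories` maps a category name to its pattern field, where several
    patterns may be separated by semi-colons, as the help file describes.
    Assumption: matching is a plain, case-sensitive substring test.
    """
    matched = []
    for name, pattern_field in categories.items():
        # Split the field into individual patterns, ignoring empty entries.
        patterns = [p for p in pattern_field.split(";") if p]
        # A single pattern appearing anywhere in the URL is enough.
        if any(p in url for p in patterns):
            matched.append(name)
    return matched


# Hypothetical category definitions, including the one asked about.
categories = {
    "district 10": "district/10;siteID=District10",
    "Downloads": ".pdf;.doc;.xls;.ppt;.exe",
}
```

So for the question asked, a single category named "district 10" with the pattern field `district/10;siteID=District10` covers both URL forms. Because the pattern only has to appear somewhere in the URL, `district/10` would also match a URL containing `district/100`, so it may be worth choosing a pattern such as `district/10/` if that matters.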

    Question 2:
    =======
    From the PDF user's guide:

    Sometimes there are situations where you would want to stop a section of a page from being indexed. This can be accomplished by enclosing the unwanted section of the HTML document in the following tags: <!--ZOOMSTOP--> and <!--ZOOMRESTART-->. Note that this tag must be used as it is, in upper case, with no space characters within the tags. For example,

    Code:
    <p>This text will be indexed</p>

    <!--ZOOMSTOP-->

    <p>This section is skipped</p>
    <p>and no indexing will occur</p>

    <!--ZOOMRESTART-->

    <p>Indexing starts again here</p>
    This is often used to exclude some text that appears on every page, such as a navigation bar. Note that the hypertext links within a ZOOMSTOP and ZOOMRESTART section would still be followed in Spider indexing mode.
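
As a rough way to preview which parts of a page would remain indexable, the following Python sketch removes everything between a `<!--ZOOMSTOP-->` marker and the next `<!--ZOOMRESTART-->` marker. It is not how Zoom itself is implemented; the function name `indexable_text` is hypothetical, and treating an unmatched ZOOMSTOP as "skip the rest of the page" is an assumption made for this sketch. The markers are matched exactly, in upper case and without spaces, in line with the note in the user's guide.

```python
STOP = "<!--ZOOMSTOP-->"
RESTART = "<!--ZOOMRESTART-->"


def indexable_text(html):
    """Return the page with every ZOOMSTOP...ZOOMRESTART section removed."""
    kept = []
    pos = 0
    while True:
        stop = html.find(STOP, pos)
        if stop == -1:
            # No further skipped sections: keep the remainder of the page.
            kept.append(html[pos:])
            break
        # Keep the text before the stop marker.
        kept.append(html[pos:stop])
        restart = html.find(RESTART, stop + len(STOP))
        if restart == -1:
            # Assumption: an unmatched ZOOMSTOP skips the rest of the page.
            break
        # Resume just after the restart marker.
        pos = restart + len(RESTART)
    return "".join(kept)
```

For a navigation bar, that would mean placing `<!--ZOOMSTOP-->` immediately before the navigation markup and `<!--ZOOMRESTART-->` immediately after it in the shared page template. A marker written in lower case is left alone by this sketch, so the text after it stays in the output.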

    ---
    David



    • #3
      Excellent!

      Apologies for making you quote from the manual. I didn't give it a full detailed read (it appears to be a very thorough manual!).

      That, combined with this great forum, and your fast response has me thinking this is definitely the way to go. I'll be downloading it this week and taking it for a spin!
