How many pages can the Zoom Indexer skip?


  • How many pages can the Zoom Indexer skip?

    Hello,

    I just ordered Zoom Professional and I'm still getting everything configured. I will be indexing a group of online help files, rather than a traditional Web site. The help authoring tool generates a large number of .htm files that are necessary for the help to work properly, but those files do not contain actual content and should not be indexed. When I add all of the file names to the Zoom Indexer skip list, I get the following error message:

    You have entered too many skip pages for the Zoom Indexer to handle. Please try to change the skip pages criteria/patterns to encompass more files (eg: "/forum" rather than "/forum/1.html", "/forum/2.html")

    What is the maximum number of files that I can skip?

    The files I need to skip are in many directories, and moving them to a single directory is not an option because then the help would not function correctly.

    Any suggestions? Thanks!

    Lisa M.

  • #2
    You can skip an unlimited number of files, but you can only enter up to 100 skip page entries (see chapter 6.14 in the Users Guide for other technical limitations: http://www.wrensoft.com/zoom/usersguide.html)

    Note that each skip page entry acts as a pattern that is matched against the full file path and filename. This means that you can skip all files beginning with "nav_", for example, by using a single skip page entry such as "/nav_".

    A skip page entry for "test" would skip all files containing the text "test" anywhere in their filename or path. For example, the following would all be skipped:

    /test/index.html
    mytest.html
    /news/sometesting/archives.htm
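A rough sketch of the matching rule described above, assuming a plain, case-sensitive substring test against the path. This is only an illustration and not Zoom's actual code; `is_skipped` and the `/news/index.html` path are made up for the example.

```python
from typing import Iterable


def is_skipped(path: str, skip_entries: Iterable[str]) -> bool:
    """Return True if any skip entry occurs anywhere in the path (assumed rule)."""
    return any(entry in path for entry in skip_entries)


paths = [
    "/test/index.html",
    "mytest.html",
    "/news/sometesting/archives.htm",
    "/news/index.html",  # hypothetical page that should still be indexed
]

for path in paths:
    status = "skipped" if is_skipped(path, ["test"]) else "indexed"
    print(f"{path}: {status}")
```

With the single entry "test", the first three paths are reported as skipped and the last one as indexed.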

    What are the filenames of the files that need to be skipped? Are they really all completely unique and have no common text in their filename or path that allows them to be skipped with fewer skip page entries? Perhaps you can give us some examples.
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine


    • #3
      Thanks, Ray. I guess I thought I had to use the full filename.

      If I'm understanding this right, when I have files named

      whskin_frmset01.htm
      whskin_frmset010.htm
      whskin_homepage.htm
      whskin_info.htm

      then I can use a skip entry of "/whskin_" and it will skip all of them. Is that correct?

      Does that also work in reverse? If I enter "/_text0" would it skip these pages?

      Maint_PO_Hdr_text0.htm
      Maint_PO_Items_text0.htm
      Creat_POs_Inv_text0.htm

      This is just a small example. I have quite a few files ending in "text" and a number that I need to skip. Their "parent" files (Maint_PO_Hdr.htm, Maint_PO_Items.htm, Creat_POs_Inv.htm) are the pages with indexable content.

      Lisa


      • #4
        I can use a skip entry of "/whskin_"
        and it will skip all of them. Is that correct?
        Correct.

        If I enter "/_text0" would it skip these pages?
        No. You would need to use,
        _text0.htm
        as this is the text that is common to all files that need skipping and hopefully this text doesn't appear in file names to be indexed.
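To see why "/_text0" misses while "_text0.htm" works, here is a small sketch that again assumes a simple substring match on the path. The `/help/` directory and the `matching_files` helper are hypothetical; only the file names come from this thread.

```python
def matching_files(entry, paths):
    """List the paths that contain the skip entry as a substring (assumed rule)."""
    return [p for p in paths if entry in p]


help_files = [
    "/help/Maint_PO_Hdr.htm",        # parent page, should be indexed
    "/help/Maint_PO_Hdr_text0.htm",  # should be skipped
    "/help/Maint_PO_Items.htm",
    "/help/Maint_PO_Items_text0.htm",
    "/help/Creat_POs_Inv.htm",
    "/help/Creat_POs_Inv_text0.htm",
]

# "/_text0" needs a slash directly before "_text0", which never occurs here.
print(matching_files("/_text0", help_files))

# "_text0.htm" is common to exactly the files that need skipping.
print(matching_files("_text0.htm", help_files))
```

The first call returns an empty list, and the second returns only the three "_text0.htm" files, leaving the parent pages to be indexed.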

        ------
        David


        • #5
          Thanks so much for all of your help. I am now able to skip the files without having a giant skip list.

          So far I'm finding Zoom to be very easy to use and configure. My boss was most impressed with the test search that I created. I credit Zoom for making me look good.

          Lisa


          • #6
            Would it then follow that if I have a series of files that have pic in the middle :-
            sp3pic1
            sp3pic2
            sp3pic3
            sp4pic1
            sp4pic2
            sp5pic1
            sigpic3
            I would then use _pic_ to exclude them all? Should it be
            Code:
            "_pic_"
            or just
            Code:
            _pic_
            as an entry in the skip files?


            • #7
              Neither. You would just use
              Code:
              pic
              without the underscores, as there are no underscores in your file names.

              However, using such a short skip string may match more files than you want. For example, it would also force the skipping of files like,
              /pictures/myhouse.jpg
              pickup.gif
              epic.doc

              So a better solution might be to make the skip strings longer and more precise. For example,
              Code:
              pic1
              pic2
              pic3
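A quick way to compare the short entry with the longer ones before committing to a skip list, sketched under the same assumption of a plain substring match. The `skipped` helper is made up for illustration; the file names are the ones mentioned in this thread.

```python
def skipped(paths, skip_entries):
    """Return the paths that contain at least one skip entry (assumed rule)."""
    return [p for p in paths if any(e in p for e in skip_entries)]


picture_files = ["sp3pic1", "sp3pic2", "sp3pic3", "sp4pic1",
                 "sp4pic2", "sp5pic1", "sigpic3"]
other_files = ["/pictures/myhouse.jpg", "pickup.gif", "epic.doc"]

broad = ["pic"]
precise = ["pic1", "pic2", "pic3"]

# The short entry also catches files that were never meant to be skipped.
print("broad over-matches:", skipped(other_files, broad))

# The longer entries still cover every picture file but leave the others alone.
print("precise over-matches:", skipped(other_files, precise))
print("precise covers all pictures:",
      skipped(picture_files, precise) == picture_files)
```

Three entries instead of one is still far below the 100-entry limit, and the extra digit is enough to stop "pictures", "pickup" and "epic" from matching.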
              ---
              David


              • #8
                Thanks, David, that is really useful. I have a large number of files that are identified as picture files with no useful search content, and this will allow me to knock them off the indexing without exceeding 100 skip entries.
