Skip floating point numbers


  • Skip floating point numbers

    Hi,

    I am using V6 in offline mode with JavaScript to index a number of files on my computer. Among them are Excel files and the Zoom indexer indexes all the numbers in them, leading to a large index and slow search. Questions:

    1. Is it possible to skip numbers, especially floating point numbers, during indexing?
    2. Since I haven't found such an option yet, I tried deleting all lines containing numbers from the 'zoom_index.js' file, with the result that the search no longer works. Is there a safe way to remove lines from the index file?

    Thank you!
    Karsten

  • #2
    JavaScript is the slowest script option. Can you switch to PHP, ASP, CGI or .NET? That would solve the problem.

    • #3
      No, unfortunately I don't have any webserver software available (company IT restrictions...) so I have to use Zoom completely offline.

      • #4
        You could also look at the CGI front end.

        There is no way to skip just floating point numbers, but adding the following entries to the word skip list (not the page skip list) should skip all numbers:
        *1
        *2
        *3
        ...
        *9
        *0

        The first line, *1, skips all words that contain at least one 1 character. The second line, *2, skips all words that contain at least one 2 character, and so on.
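
To make the effect of those skip-list entries concrete, here is a minimal Python sketch. It is not Zoom's actual matcher: it simply models the behaviour described in this post, where an entry such as *1 skips any word containing at least one 1 character. The names SKIP_PATTERNS, is_skipped and searchable_words are hypothetical and exist only for this illustration.

```python
import re

# Hypothetical skip-list entries modelled on the post: *0 through *9.
SKIP_PATTERNS = ["*" + d for d in "0123456789"]


def is_skipped(word, patterns=SKIP_PATTERNS):
    """Return True if the word contains the literal part of any pattern.

    Assumption: as described in the post, "*1" is read as "any word that
    contains at least one '1'". This is a model, not Zoom's implementation.
    """
    return any(p.lstrip("*") in word for p in patterns)


def searchable_words(text):
    """Split text into rough tokens and drop the ones the skip list catches."""
    # Keep '.' inside tokens so a float like 3.14 stays one word.
    words = re.findall(r"[\w.]+", text)
    return [w for w in words if not is_skipped(w)]


if __name__ == "__main__":
    print(searchable_words("Total 3.14 units Q4 revenue 1000"))
```

Under this reading, mixed tokens such as Q4 are dropped along with pure numbers, so the approach suits spreadsheets where no digit-bearing term needs to be searchable.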

        Skipped words are words for which we do not store any index data. This means they cannot be searched for, and it saves space in the index files (which makes searching for other words faster).

        They are not entirely omitted from the index, however. We still need to keep track of them for other purposes, most significantly the Context Description, which recreates the paragraph of contextual text you see around the matched word. This is also why the index holds the same word in different upper and lower casing. As a result, skipping words will not reduce your unique word count.

        • #5
          Thank you!
          I chose the CGI option and it works perfectly for me! It is much better than the JavaScript solution, even when there are no numbers in the indexed files. Great.

          Best regards,
          Karsten
