Double search results


  • Double search results

    I have two issues that I haven't seen before.

    1. I am getting what look like duplicate search results for a couple of PDFs. When you look at the left-hand side in Adobe, it shows 4 results and then another 4 similar ones below. Here is a link that shows what I am talking about:

    http://www.digifind-it.com/plainfield/data/pdp/1894/1894-11-26.pdf#search="tweedy"

    2. The second issue involves sub-categories. Using my custom PHP search template and some JavaScript, I made categories with sub-categories. In this example, the newspapers are separated into decades, which then open into their individual years. If I search within the category "1890s", I get a search result from a PDF in the 1894 year. But if I search specifically within the "1894" category, nothing is found, no matter what I search for.
    You can test it here:
    http://www.digifind-it.com/plainfield-search/pdp/pdp-search.php

    Any help will be much appreciated.

    -Thanks

  • #2
    Originally posted by IDI
    1. I am getting similar search results for a couple PDFs. When you look at the left hand side in adobe it shows 4 results and another 4 that are similar below. Here is a link that will show what I am talking about:

    http://www.digifind-it.com/plainfield/data/pdp/1894/1894-11-26.pdf#search="tweedy"
    When you say "left hand side in adobe", I presume you mean opening the PDF file in Adobe Acrobat Reader (aka Adobe Reader), doing a Find (CTRL+F), and having the results bar show up on the left side of the document, listing the occurrences of the word "Tweedy"?

    This doesn't have anything to do with Zoom. It is completely up to the Adobe Acrobat software.

    It would also depend on the version of Acrobat. I just tried this here using Adobe Reader XI. I don't get a results bar on the left anymore, but it reports "1 of 8 matches". After the first 4 results for "Oliver L Tweedy", it scrolls to a seemingly random page elsewhere in the document that has no such matches, and continues forward from there.

    If I close the file in Adobe, reopen it, and repeat the "Find" operation (clicking "Next" for each match), it finds the first 4, then repeats the same 4 on the same page, before saying it has hit the end of the document.

    There does seem to be something odd about this PDF file that Adobe Reader is struggling with. I suspect the text layer from the OCR scan might have been messed up.

    Originally posted by IDI
    2. The 2nd issue I am having involves sub-categories. Using my custom php search template and some javascript I made categories with sub-categories. So for this example I have newspapers that are separated into decades then they open into their individual years. If I search for anything for the category "1890s" I get a search result from a PDF in the 1894 year. BUT, if i specifically search for anything with the "1894" category, nothing is found, no matter what I search.
    You can test it here:
    http://www.digifind-it.com/plainfiel...pdp-search.php
    I tried a number of searches in "1890s" but could not find any result that was also in the "1894" category.

    Note that if a file had been successfully indexed in the "1894" category, it would have the following tags after the link:
    [1890s] [1894]

    Results from most of the other years I've seen have two tags like this, but none of them have [1894].

    Searching for "tweedy", the first result is this:

    1. 1894-11-26.pdf [1890s]
    ... encores encores were were earnestness If united with patience and OLIVER L. L. TWEEDY. TWEEDY. ef st purpose, purpose, willwill ultimately ultimately Saturday.December Hall ...
    Terms matched: 1 - Score: 35 - URL: http://www.digifind-it.com/plainfiel...1894-11-26.pdf
    When you say you are seeing PDF files from 1894, I presume you are referring to the actual document origin and URL, not the category specified.

    This would indicate to me that there is something wrong with your Category configuration for "1894". Check in your Indexer, "Configure"->"Categories" and see what pattern you have specified for the 1894 category. You should have a Match pattern of "/1894/" or similar to catch URLs like this.
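A minimal sketch of the idea behind a match pattern, assuming it is applied as a plain substring test against each file's URL (the Indexer's exact matching rules may differ). The `CATEGORY_PATTERNS` table and the `categories_for` helper are hypothetical names used only for illustration; they are not part of Zoom.

```python
# Hypothetical category table: category name -> match pattern.
# Assumption: a pattern matches when it occurs anywhere in the URL.
CATEGORY_PATTERNS = {
    "1893": "/1893/",
    "1894": "/1894/",
}


def categories_for(url, patterns=CATEGORY_PATTERNS):
    """Return the names of all categories whose pattern occurs in the URL."""
    return [name for name, pattern in patterns.items() if pattern in url]


url = "http://www.digifind-it.com/plainfield/data/pdp/1894/1894-11-26.pdf"
print(categories_for(url))  # prints ['1894']
```

Under that assumption, a file whose URL contains none of the configured patterns ends up with no year tag at all, which is consistent with the missing [1894] tag in the result above.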
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine



    • #3
      Ray,

      Thank you very much for your response. For the first issue, I figured it was a problem between our OCR software and Adobe, but I thought I would ask just in case.

      As for the second issue, this was my error. For the 1890s category I had it correctly configured as "/189/", but for 1894 I accidentally wrote "/1984/".
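For anyone hitting the same thing, here is a rough sanity check that would flag this kind of typo: list every category whose pattern matches none of the indexed URLs. It assumes plain substring matching, which may not be exactly how the Indexer compares patterns, and `unmatched_patterns` is a hypothetical helper, not a Zoom feature.

```python
def unmatched_patterns(patterns, urls):
    """Return the categories whose pattern occurs in none of the given URLs."""
    return {
        name: pattern
        for name, pattern in patterns.items()
        if not any(pattern in url for url in urls)
    }


# The one URL from this thread; a real check would use the full list of indexed files.
urls = ["http://www.digifind-it.com/plainfield/data/pdp/1894/1894-11-26.pdf"]

# The mistyped pattern next to the intended one (labels are illustrative).
patterns = {"1894 (as typed)": "/1984/", "1894 (intended)": "/1894/"}

print(unmatched_patterns(patterns, urls))  # prints {'1894 (as typed)': '/1984/'}
```

A category that comes back from a check like this is either empty on purpose or misconfigured, so it is worth a second look in "Configure"->"Categories".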

      -Thanks again
