category search with multiple categories

  • category search with multiple categories

    Is it possible to use "AND" logic with searches across multiple categories?

    For example I would like to be able to search for all PDF documents within a certain section of my site.

    I have these file type categories: PDF, Excel, Word

    And these structural categories: Section 1, Section 2 etc.

    Right now if I check "PDF" and "Section 1" I get results matching either category but I want to get results from the intersection of these 2 categories. Is there a way to do that?

    Ideally I'd like to combine AND logic and OR logic (but doubt that's possible). For example I'd like the user to be able to look for PDF OR Word documents in "Section 1" (i.e. ("PDF" OR "Word") AND "Section 1").

    The only workaround I can think of is to create discrete categories such as Section1_PDF, Section1_Word, Section1_Excel, Section2_PDF, Section2_Word, Section2_Excel, etc., but this could get tedious because I may end up with more file type categories and a lot of sections.
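To show how quickly that workaround grows, here is a minimal Python sketch that builds the combined category names. The lists and the `combined_categories` helper are purely illustrative and not part of Zoom; the naming scheme just follows the Section1_PDF pattern above.

```python
from itertools import product

# Illustrative lists only; real sites may have many more of each.
file_types = ["PDF", "Word", "Excel"]
sections = ["Section1", "Section2"]


def combined_categories(sections, file_types):
    """Return one category name per (section, file type) pair."""
    return [f"{s}_{t}" for s, t in product(sections, file_types)]


cats = combined_categories(sections, file_types)
print(len(cats))  # number of categories = len(sections) * len(file_types)
print(cats)
```

The count is the product of the two list lengths, so every extra file type adds one new category per section.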

    Any suggestions would be appreciated.
    Thanks,
    KG.

  • #2
    One possibility: instead of having categories for PDF, Excel, and Word, you could search for file types with filename wildcards. For example, use a query of "*.pdf" (without quotes) in category Section 1 with "match all search words" selected.

    You can then also search for "*.pdf *.doc" (again without quotes) with "match any search words" selected, in category Section 1, to achieve the same as your example (i.e. ("PDF" OR "Word") AND "Section 1").

    You will need to enable Filenames for indexing and allow Dots to join words under "Configure"->"Indexing options".
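As a rough model of why this works, the Python sketch below mimics the boolean logic only: the category acts as an AND filter, and the wildcard terms are combined with OR ("match any") or AND ("match all"). It is not how Zoom is implemented; the `docs` list and both functions are made up for illustration.

```python
from fnmatch import fnmatch


def matches(filename, query, match_all):
    """Check a filename against space-separated wildcard terms."""
    patterns = query.lower().split()
    hits = [fnmatch(filename.lower(), p) for p in patterns]
    return all(hits) if match_all else any(hits)


def search(docs, query, category, match_all=False):
    """Category is always an AND filter; the query terms use any/all."""
    return [
        d["file"]
        for d in docs
        if d["category"] == category and matches(d["file"], query, match_all)
    ]


# Hypothetical index entries: one category (the section) per file.
docs = [
    {"file": "a.pdf", "category": "Section 1"},
    {"file": "b.doc", "category": "Section 1"},
    {"file": "c.xls", "category": "Section 1"},
    {"file": "d.pdf", "category": "Section 2"},
]

# ("*.pdf" OR "*.doc") AND "Section 1"
print(search(docs, "*.pdf *.doc", "Section 1"))
```

With "match all" and both terms, nothing would match, since a single filename cannot end in both .pdf and .doc.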
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine


    • #3
      How to allow advanced search options for complex structure

      Thanks Ray, that does seem helpful for the file-type portion and I was hoping it would also help me with the categorization piece, but I can't figure out how to do it.

      I'm actually indexing a network drive with highly structured content. For example, if I have a PDF called "Project Info.pdf", the information telling me which project it refers to comes from higher up in the file structure, as in "/Institutions/Harvard/Project Info.pdf". This information may not be included in the PDF itself.

      In my case, filename search is only really useful if I can search the full URL path (and not just the file name in isolation). Can you think of any tricks for doing this? Clearly the full path is stored somewhere because it shows in my search results, so it's a little frustrating that it's not searchable...

      Any help would be appreciated, thanks,

      Ken


      • #4
        We're adding the use of the folder names in the path as part of the indexed terms in V7.

        But surely that information would also appear within the PDF file content somewhere, be it the title or the document properties? Is it just not ranking high enough? Or are you saying the words "institutions" and "harvard" never occur within "Project Info.pdf" at all?

        If that is the case, you can add them as keywords in the PDF file's document properties, or use .DESC files to specify meta keywords. This might be a tiresome exercise if you have a lot of files. I wonder, though, whether your PDF files are relatively sparse/empty and lacking decent metadata themselves (such as good titles). If so, this approach would address that too and give you better-looking search results.
        --Ray


        • #5
          Glad to hear that's coming in version 7.

          My example was merely indicative of a general problem. We have 1000s of files (docs, ppts, pdfs, xls etc) that were developed on projects years ago without any metadata or classification information (and often without contextual information within the file content). The only form of "metadata" is in fact the folder structure into which these files are placed.

          I guess it wouldn't take too much effort to write a script that would add .DESC files to reflect the folder hierarchy for each file to get it to work with version 6. But right now I'm trying to do everything without touching the data on the drives.
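A minimal sketch of such a script, in Python, might look like the following. Several things here are assumptions to verify against the Zoom help before relying on it: that a .DESC file is named after the data file with ".desc" appended, that it accepts an HTML-style keywords meta tag, and that mirroring the folder layout in a separate output folder is acceptable. The function names and the extension list are hypothetical. It only reads the source tree and writes to `out_dir`, so the data drives stay untouched.

```python
from html import escape
from pathlib import Path


def folder_keywords(root, file_path):
    """Folder names between root and the file, e.g. ['Institutions', 'Harvard']."""
    return list(Path(file_path).relative_to(root).parent.parts)


def write_desc_files(root, out_dir, extensions=(".pdf", ".doc", ".ppt", ".xls")):
    """Write one .desc file per matching data file into a mirrored tree under out_dir.

    ASSUMPTION: the .desc naming and meta-tag format are not confirmed here.
    """
    root, out_dir = Path(root), Path(out_dir)
    written = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        keywords = folder_keywords(root, path)
        if not keywords:
            continue  # file sits directly in root; no folder context to add
        desc_path = out_dir / path.relative_to(root).parent / (path.name + ".desc")
        desc_path.parent.mkdir(parents=True, exist_ok=True)
        content = escape(", ".join(keywords), quote=True)
        desc_path.write_text(
            f'<meta name="keywords" content="{content}">\n', encoding="utf-8"
        )
        written.append(desc_path)
    return written


# Example call (hypothetical paths):
# write_desc_files("/mnt/projects", "/tmp/zoom_desc")
```

For "/Institutions/Harvard/Project Info.pdf" under the root, this would produce "Institutions/Harvard/Project Info.pdf.desc" in the output folder with the keywords "Institutions, Harvard".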

          Thanks for the tip.


          • #6
            If you're using Spider Mode, there's a way to pick up .DESC files stored in a different folder than where the data files are. See "Configure"->"Spider options"->"Use this offline folder for all plugin .desc files" (click on the Help button for an explanation). This would allow you to use .DESC files without needing to add new files to the data folders.
            --Ray
