V5 development progress - Improved Categories


  • V5 development progress - Improved Categories

    The categories feature has been overhauled and significantly improved in the upcoming version 5.0.

    First of all, it will now be possible for a file to belong to multiple categories. This means that a file such as "zoom.pdf" can belong to both the "Zoom" category and the "PDF files" category.

    You can also search multiple categories at once via a checklist on the search form (enable the "Allow searching in multiple categories" option in the Categories tab of the Configuration window). The old category dropdown is still available as well.

    If you want your categories to behave as they did in previous versions (where a file can only belong to a single category exclusively), you can check an option in the Add/Edit Category window that says "Files belonging to this category can not belong to any other category".

    Another new feature is support for the ZOOMCATEGORY meta tag. You can now place this meta tag in your web pages to specify the category that the file should belong to (this overrides the URL/filename pattern matching method). Specify the name of the category like so:
    <meta name="ZOOMCATEGORY" content="News">

    You can also turn off the "default"/"catch-all" category now, and have files that belong to no category. This was a source of confusion for some users in the past, and it is now optional rather than required.

    Finally, you can now use wildcard match patterns for your categories. This means that you can create a pattern such as "/news/updates_*.pdf" which will get all PDF files, inside the news folder, with a filename starting with "updates_". Non-wildcard categories still work as before.
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine

  • #2
    Re: V5 development progress - Improved Categories

    Awesome! I am eagerly awaiting this release!!!!!

    • #3
      I don't understand the meta tag option. I have several thousand PDF files that I want to index. Are you saying I would have to add <meta name="ZOOMCATEGORY" content="News"> to each PDF document? If so, how?

      We currently have the "basic" meta tags in the PDFs populated with information. The "SUBJECT" field is set to I (Instructions), P (Parts), or S (Specifications). Can I somehow pull the category from those already populated fields?

      • #4
        You can't add HTML meta data inside the PDF file itself. You should, however, be able to put the ZOOMCATEGORY meta tag in a .desc file that is associated with the PDF, but this means you need one .desc file per PDF file. The most efficient solution is to encode the category in the URL.

        e.g. different directories.
        /Instructions/doc1.pdf
        /Parts/doc2.pdf
        /Specifications/doc3.pdf
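
If the .desc route is taken, the files could be generated by a script rather than by hand. The following is only a minimal Python sketch. It assumes that a description file sits next to the PDF and is named by appending ".desc" to the PDF's filename, and that it contains the same meta tag shown in the first post; both assumptions should be checked against the Zoom documentation. The names SUBJECT_TO_CATEGORY and write_desc_files are hypothetical, and the I/P/S codes are taken from post #3. How the Subject code is read out of each PDF is left out here.

```python
from html import escape
from pathlib import Path

# Hypothetical mapping from the one-letter Subject codes in post #3
# to Zoom category names.
SUBJECT_TO_CATEGORY = {
    "I": "Instructions",
    "P": "Parts",
    "S": "Specifications",
}


def write_desc_files(pdf_subjects, folder):
    """Write one .desc file per PDF listed in pdf_subjects.

    pdf_subjects maps a PDF filename to its Subject code.
    Assumption: the description file is "<pdf name>.desc" in the same folder.
    Returns the list of paths that were written.
    """
    written = []
    for pdf_name, code in pdf_subjects.items():
        category = SUBJECT_TO_CATEGORY.get(code.strip().upper())
        if category is None:
            continue  # unknown code: write nothing for this file
        desc_path = Path(folder) / (pdf_name + ".desc")
        tag = '<meta name="ZOOMCATEGORY" content="{}">\n'.format(
            escape(category, quote=True)
        )
        desc_path.write_text(tag, encoding="utf-8")
        written.append(desc_path)
    return written
```

Calling write_desc_files({"doc1.pdf": "I"}, "/path/to/pdfs") would create doc1.pdf.desc in that folder. Files with an unrecognised code are skipped rather than guessed at.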

        • #5
          I've also tried setting the categories via the URL, but I'm not having much luck with that either.

          I have three separate folders, each containing their corresponding PDF files.

          /TechDoc/ (this contains Specification Sheets)
          /Volume_4_CD/ (this contains Parts Books)
          /Volume_3_CD/ (this contains Installation manuals)

          I have the match pattern for each set as follows:

          Volume_3_CD (Installation category)
          Volume_4_CD (Parts category)
          TechDoc_CD;S*.pdf;F*.pdf (Specifications category)

          So by specifying the above, I'm saying to only pull files whose folder/filename contains the above text, correct?

          It works fine for the Parts and Installation categories. However, for the TechDoc folder, I only want it to pull PDFs starting with an S or F. Instead, it's pulling everything from all categories when I search the Specifications category. If I take out the S*.pdf and F*.pdf match patterns, leaving only TechDoc_CD, it pulls files only from TechDoc_CD, which would be ideal if I wanted all files from TechDoc (but I only want the ones that begin with S or F).

          I've also tried putting in the entire path of the PDFs (/TechDoc_CD/PDF-File/S*.pdf; /TechDoc_CD/PDF-File/F*.pdf) and it still didn't work.

          What am I doing wrong?
          Last edited by mcrawford575; Sep-25-2006, 03:13 PM. Reason: More clarification.

          • #6
            First thing I should point out is that you said the original folder name is "/TechDoc/" in the above, but you have a pattern for "TechDoc_CD", which does not match. This was probably just a typo, but I thought I'd check this with you just in case.

            Second thing to note is that the category patterns are substring matches, even when used with wildcards, and that they are matched against the entire URL (in spider mode) or path (offline mode) of the file being indexed. This means your patterns "S*.pdf" and "F*.pdf" will match all PDF files with an S or F anywhere in their URL, e.g.
            http://www.site.com/abc/blah.pdf
            http://www.abc.com/files/blah.pdf
            etc.

            This should explain the behaviour you were seeing with those two patterns.
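
To make the substring behaviour concrete, here is a rough Python sketch of this kind of matching. It is not Zoom's actual implementation: the helper names pattern_to_regex and matches are made up for illustration, and case-insensitive matching is an assumption inferred from the example URLs above, where a lowercase "s" satisfies "S*.pdf".

```python
import re


def pattern_to_regex(pattern):
    """Turn a category pattern into an unanchored regular expression.

    "*" matches any run of characters; everything else is literal.
    Case-insensitivity is an assumption, not documented behaviour.
    """
    parts = [re.escape(piece) for piece in pattern.split("*")]
    return re.compile(".*".join(parts), re.IGNORECASE)


def matches(pattern, url):
    # search() rather than fullmatch(): the pattern may match anywhere
    # in the URL, which is what makes this a substring match.
    return pattern_to_regex(pattern).search(url) is not None


urls = [
    "http://www.site.com/abc/blah.pdf",
    "http://www.site.com/TechDoc_CD/PDF-File/S100.pdf",
    "http://www.site.com/TechDoc_CD/PDF-File/A200.pdf",
]

# The short pattern hits every URL, because each one has an "s"
# somewhere before ".pdf" (for instance in "site").
short = [u for u in urls if matches("S*.pdf", u)]

# The full-path pattern only hits the file that starts with "S"
# inside /TechDoc_CD/PDF-File/.
full = [u for u in urls if matches("/TechDoc_CD/PDF-File/S*.pdf", u)]
```

Under these assumptions, short contains all three URLs while full contains only the S100.pdf one, which is why the longer pattern is the one to use.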

            So to really achieve what you were after, you would need to use the longer/full path as the pattern, as you said. And it would seem to me that "/TechDoc_CD/PDF-File/S*.pdf" should work fine presuming all your files are actually in that path as they are indexed. Perhaps you should try just the files beginning with S and see if that works, then try adding the pattern for the files beginning with F, just to see if we can deduce where the problem is.

            If you continue to have problems despite checking the above, e-mail us your ZCFG file and we can take a closer look at your settings. Perhaps include a screenshot of your file/folder structure in Explorer (or if possible, send us a ZIP of some example files to reproduce the behaviour)
            --Ray
            Wrensoft Web Software
            Sydney, Australia
            Zoom Search Engine

            • #7
              I have been testing this and it works great so far.

              Two questions.

              1) Is there a way to nicely format the layout of the check boxes? I have 30 categories and they are sort of spread out across the page in no order.

              2) When multiple categories are selected for a search, is there a way to have the returned results grouped by category?

              For example now if you do a search the results say...
              Search results for: xxxxx in category "yyyyy", "zzzzz"
              500 results found.
              This poses a small problem. It may be 3 pages before you get to any results in cat "zzzzz" as the results don't seem to be intermixed with each other. Probably more likely it is based on the hit level.

              What would be great is if the results could be ordered by the category.

              For example...
              Search results for: xxxxx in category "yyyyy", "zzzzz"
              250 results found in category "yyyyy"
              250 results found in category "xxxxxx"
              Clicking the links would just reorder the page. This is sort of in line with how Veritas and some other companies do their support site searches.

              Thanks!

              • #8
                Originally posted by MikeR
                1) Is there a way to nicely format the layout of the check boxes? I have 30 categories and they are sort of spread out across the page in no order.
                The order of the categories in your search form is defined by the order in which they appear in the Zoom configuration window. There, on the "Categories" tab, you can change the category order by clicking the up/down buttons next to the list.

                You can also change the layout of your checkboxes with CSS. The checkboxes are specified in an unordered list, and the default CSS displays them horizontally. But you could change this so that it is displayed vertically, or even in multiple columns via some clever CSS.

                2) When multiple categories are selected for a search, is there a way to have the returned results grouped by category?

                For example now if you do a search the results say...
                Search results for: xxxxx in category "yyyyy", "zzzzz"
                500 results found.
                This poses a small problem. It may be 3 pages before you get to any results in cat "zzzzz" as the results don't seem to be intermixed with each other. Probably more likely it is based on the hit level.
                The search results are sorted by relevance (terms matched + score) or by date (if the "sort by date" option is selected). The categories do not affect the sort order of your search results.

                What would be great is if the results could be ordered by the category.

                For example...
                Search results for: xxxxx in category "yyyyy", "zzzzz"
                250 results found in category "yyyyy"
                250 results found in category "xxxxxx"
                Clicking the links would just reorder the page. This is sort of in line with how Veritas and some other companies do their support site searches.
                I'm not sure if sorting by categories would be very useful. In the above example, it would mean that clicking on "xxxxx" would lead to 250 results from category "xxxxxx" followed by 250 results from category "yyyyy" (which you would most likely never get to). Wouldn't it make more sense to just narrow down the search to only category "xxxxx" in this case? I can see how a category summary as you have in the above would be useful, to give the user an idea of which category they might want to narrow their search to.
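
As an illustration of the category summary idea only, here is a small Python sketch. It does not describe any existing Zoom feature or API: the function name category_summary and the shape of the results list are hypothetical, and it assumes that the search results and each file's categories are available to whatever code post-processes them.

```python
from collections import Counter


def category_summary(results, selected):
    """Count results per selected category.

    results: list of (url, categories) pairs, already sorted by relevance.
    selected: category names chosen in the search form, in display order.
    A result that belongs to several selected categories is counted
    once in each of them.
    """
    counts = Counter()
    for _url, categories in results:
        for name in categories:
            if name in selected:
                counts[name] += 1
    # Keep the order of the selected categories for display.
    return [(name, counts[name]) for name in selected]
```

The returned pairs could be rendered as "N results found in category ..." lines, each linking to the same search narrowed down to that one category, while the result list itself stays sorted by relevance.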

                Nonetheless, this is all food for thought, and these are things we can consider for a future version (V5.1 maybe?). We had some similar ideas in mind for the upcoming 5.0 release, but could not come up with anything that was universally useful, and we wanted to make sure that multiple categories were being well used and accepted by our users before we pursue more intricate result grouping options.
                --Ray
                Wrensoft Web Software
                Sydney, Australia
                Zoom Search Engine

                • #9
                  I'm not sure if sorting by categories would be very useful. In the above example, it would mean that clicking on "xxxxx" would lead to 250 results from category "xxxxxx" followed by 250 results from category "yyyyy" (which you would most likely never get to). Wouldn't it make more sense to just narrow down the search to only category "xxxxx" in this case? I can see how a category summary as you have in the above would be useful, to give the user an idea of which category they might want to narrow their search to.

                  Nonetheless, this is all food for thought and will be things we can consider for a future version (V5.1 maybe?). We had some similar ideas in mind for the upcoming 5.0 release, but could not come up with anything that was universally useful, and we had wanted to make sure that multiple categories was being well used and accepted by our users, before we pursue more intricate result grouping options.
                  Yes, the users can narrow down the search and they will be happy with that. I guess my thinking was that you put in one search string, choose the categories you want, and then once the results are returned you can refine them by reorganizing the results, like what you do now for "Date".

                  Anyway...just a thought. Thanks for the info on question 1.

                  • #10
                    Originally posted by Ray
                    The order of the categories in your search form is defined by the order in which they appear in the Zoom configuration window. There, on the "Categories" tab, you can change the category order by clicking the up/down buttons next to the list.

                    You can also change the layout of your checkboxes with CSS. The checkboxes are specified in an unordered list, and the default CSS displays them horizontally. But you could change this so that it is displayed vertically, or even in multiple columns via some clever CSS.
                    There seems to be an issue when using columns in CSS. I created CSS entries that put 30 categories into 3 columns of 10 each. If I don't create a CSS class to bring the categories to the top of each new column, the categories in the columns are displayed like a stair step, but the check boxes work fine.

                    However as soon as I add a class to bring them up...

                    For example:
                    li.topcol2 { margin-top: -12em; }
                    li.topcol3 { margin-top: -14em; }

                    Only the check boxes for the categories in the 3rd column work. In the other two clicking the boxes does nothing. There seems to be some strange overlapping happening and it's got me stumped.

                    Meanwhile I put the categories into a table and they look great, but I would really like to figure out this CSS problem.
