V5 development progress - Image indexing and thumbnails


  • V5 development progress - Image indexing and thumbnails

    This post is a short update on one aspect of the development of Zoom V5. It covers the upcoming image indexing features and image thumbnail support.

    Zoom Search Engine v5.0 introduces a new feature that allows users to search for images such as photographs and diagrams. Searching is carried out using metadata associated with the file. Image files such as JPEGs, PNGs and TIFFs can store textual data describing the image, as well as technical metadata detailing the conditions under which the photo was taken, such as the camera make/model, whether the flash was on, the shutter speed and the aperture value. The ImageInfo plugin extracts this metadata so that Zoom can index it according to its configuration.

    Digital cameras save images as specified by the EXIF (Exchangeable Image File Format) standard. The specification uses existing file formats such as JPEG (Joint Photographic Experts Group) or TIFF (Tagged Image File Format) with the addition of specific metadata tags.

    Separately, a multimedia news exchange format called the Information Interchange Model (IIM) was established to carry additional information, such as caption, news category or dateline. The metadata elements of IIM are commonly known as the "IPTC headers" of digital image files. ImageInfo extracts this metadata based on the EXIF and IIM standards. While the image files supported by ImageInfo are JPEGs, PNGs*, TIFFs and GIFs, different levels of meta information will be available depending on the file type and the way the file was created.
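
    To give a feel for where this metadata sits inside a file, here is a minimal Python sketch. It is not how ImageInfo works internally (that is not described here); it only walks the segments at the start of a JPEG and reports whether an EXIF block (APP1) or a Photoshop block (APP13, where IPTC/IIM data is usually stored) is present. The function name is made up for illustration, and the scan is simplified: it assumes every marker before the image data carries a length field.

```python
import struct


def list_jpeg_metadata_segments(data: bytes) -> list[str]:
    """Report which metadata segments appear before the compressed image data."""
    if data[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        raise ValueError("not a JPEG stream")
    found = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # not at a marker; give up rather than guess
        marker = data[pos + 1]
        if marker == 0xDA:  # SOS: compressed image data follows, stop scanning
            break
        # The 2-byte big-endian length counts itself but not the marker bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            found.append("EXIF")
        elif marker == 0xED and payload.startswith(b"Photoshop 3.0\x00"):
            found.append("APP13/Photoshop")
        pos += 2 + length
    return found
```

    Calling it on the bytes of a photo straight from a camera would typically report an EXIF segment, while a file that has been through a "save for web" step may report nothing at all, which is why the available meta information varies from file to file.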

    In addition to indexing metadata, Zoom will index any ALT text associated with an image on an HTML page and any text in the link that points to the image.

    It will also be possible to index only images larger than a certain minimum size (to avoid indexing all the small images, like buttons, found on a typical web site).
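
    As a rough illustration of a minimum-size check, the Python sketch below reads the width and height from the header of a PNG file and compares them to a threshold. The function names are hypothetical, and requiring both dimensions to meet the minimum is an assumption; how Zoom itself applies the limit is not specified here.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> tuple[int, int]:
    """Return (width, height) from the IHDR chunk, which always comes first."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG stream")
    # Width and height are two 4-byte big-endian integers right after "IHDR".
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def is_large_enough(data: bytes, min_width: int, min_height: int) -> bool:
    """True if the image meets both minimum dimensions (assumed rule)."""
    width, height = png_dimensions(data)
    return width >= min_width and height >= min_height
```

    With a 32x32 minimum, a 16x16 button graphic would be skipped and a 640x480 photo would be kept.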

    In V5 of Zoom a new item of meta information will be supported, ZOOMIMAGE. This will allow you to associate an image with a particular page so that it will appear alongside the link in the search results. To do this, you will need to insert a meta tag on your pages like so:
    <meta name="ZOOMIMAGE" content="mydog.jpg">
    You can specify the appearance of the images in your search results by modifying the CSS in your search template file.
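
    If you want to check which image a page declares before indexing it, a small Python sketch along these lines may help. It uses the standard library HTML parser to pull out the content of the first ZOOMIMAGE meta tag. The class and function names are invented for this example, and matching the tag name case-insensitively is an assumption rather than documented Zoom behaviour.

```python
from html.parser import HTMLParser


class ZoomImageFinder(HTMLParser):
    """Remember the content of the first <meta name="ZOOMIMAGE"> tag seen."""

    def __init__(self):
        super().__init__()
        self.image = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.image is not None:
            return
        attr_map = dict(attrs)  # attribute names arrive lower-cased
        if (attr_map.get("name") or "").upper() == "ZOOMIMAGE":
            self.image = attr_map.get("content")


def find_zoomimage(html: str):
    """Return the declared image path, or None if the page has no such tag."""
    finder = ZoomImageFinder()
    finder.feed(html)
    return finder.image
```

    For a page whose head contains the tag shown above, find_zoomimage returns "mydog.jpg".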

    As an alternative to specifying the thumbnail image by metadata, you will be able to create a directory that contains all your thumbnail images. The thumbnail and the full image are associated via their file names.
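
    The file name association can be pictured with a short Python sketch. It assumes the simplest rule, where the thumbnail has the same base name as the original file and lives in one thumbnail directory, optionally with a different extension. The function name and the "thumbs" directory are placeholders, not Zoom settings.

```python
from pathlib import PurePosixPath


def thumbnail_for(document: str, thumbnail_dir: str, thumbnail_ext: str = ".jpg") -> str:
    """Map a file path to the thumbnail that shares its base name."""
    stem = PurePosixPath(document).stem  # "photos/mydog.png" -> "mydog"
    return str(PurePosixPath(thumbnail_dir) / (stem + thumbnail_ext))
```

    Under that assumption, thumbnail_for("photos/mydog.png", "thumbs") gives "thumbs/mydog.jpg".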

    An example of how this looks is below,


    Finally, if you don't have an image prepared for each of your documents, you can instead choose to display a fixed icon for all documents of a particular type, e.g. the MS Word icon for all DOC files.



    I would also like to remind everyone that we offer free upgrades for 6 months after a purchase, so if you purchase V4 now, it will be a free upgrade to V5 when it becomes available.

    ------
    David

  • #2
    That is great... Could you support the "MHT" format if you don't already? Also support for Open Office files, if you don't already have it. Also ZIP, RAR and any other type of document, so that they can at least be summarized in the search results...

    Other things such as image search, video search and other searches would be great..


    • #3
      MHT is not really an image file format. And we haven't had any requests for its support, at least not from anyone that actually uses it. We'll probably look at Open Office some time soon. But as most of these 'open' file formats are XML based, they should already work with the current version of Zoom.

      I am a bit confused about your request for image search, as this was the exact topic of my initial post.

      We plan on providing searching for other binary file types (.ZIP, .EXE, .MOV, etc.) by their file names, but not on their content, in V5.

      ------
      David


      • #4
        Some additional preliminary documentation explaining the usage of the new image handling features can be found here.
        How to index images?
        How to customize image search results layout


        • #5
          You can download a beta version of the image plugin here:
          http://www.wrensoft.com/ftp/imageinfo_beta.zip
          --Ray
          Wrensoft Web Software
          Sydney, Australia
          Zoom Search Engine


          • #6
            One of our users asked us if it would be possible to make the search results appear with only thumbnails and nothing else, in a grid-like fashion. We made a grid layout example to illustrate this and thought we'd post it here for people to see what is possible with some configuring and CSS modifications.

            Note that this is just an example, and is one of many possible layouts that can be created with Zoom and CSS.

            Below is an actual screenshot of Zoom set up to show thumbnails only:



            To achieve this, you will need to turn off all the other elements in the search results (via the Configuration window, under the "Results Layout" tab) so that only the Image is displayed (along with the text link which we can not hide from here - we will do so via CSS).

            And then in your search_template.html file, where you can customize the CSS for your search results, you should have the following changes:

            Code:
            .result_title { font-size: 100%; display: none; }
            This will hide the search result links so that only the images are shown.

            Code:
            .result_block { margin-top: 15px; margin-bottom: 15px; display: inline; }
            .result_altblock { margin-top: 15px; margin-bottom: 15px; display: inline; }
            This will allow the search results to appear next to each other, as opposed to being on separate lines.

            You might also want to push the "Result pages" part to the next line like so:

            Code:
            .result_pages { clear: left; }
            Further changes could of course be made to get it closer to your ideal appearance. We will update the documentation in the final release with more information on the new CSS classes available in V5.
            --Ray


            • #7
              Hello,

              Is it possible to have the image ALT text as the actual clickable text? It is much better to have 'title of book' as opposed to 'short title.jpg'.

              Or, would it be possible to include it in the RSS output?

              Thanks
              AG!


              • #8
                IMG ALT text is currently indexed in V5 so you can search for it and find the image.

                However, there is currently no option to use it as the title for the image link. We currently use either the meta title stored inside the image file (if available and configured to do so), or the filename itself. We may consider adding an option to use the ALT text for title if there is enough demand.
                --Ray


                • #9
                  As of V5 beta 13, you can specify thumbnails for ALL supported file extensions. This means you can even create thumbnails for your PDF documents, PPT slideshows or HTML web pages (using third-party thumbnail generating applications), and have Zoom display them alongside your search results.

                  To enable this in Zoom, double-click on the extension in the "Scan Options" tab of the Configuration window, and click "Configure Images". Here you can select "Display different thumbnails for each file" and specify the thumbnail options as before (including changing the file extension for the thumbnails as required).
                  --Ray


                  • #10
                    In V5 of Zoom a new item of meta information will be supported, ZOOMIMAGE. This will allow you to associate an image with a particular page so that it will appear alongside the link in the search results. To do this, you will need to insert a meta tag on your pages like so:
                    <meta name="ZOOMIMAGE" content="mydog.jpg">
                    You can specify the appearance of the images in your search results by modifying the CSS in your search template file.
                    Hello,

                    I index a books website, and on each page the only image indexed is the one associated with the page, i.e. a screenshot of the book. Is there any way of getting this image next to the search result (for the page text) without adding extra HTML?

                    I ask this because only one image is picked up by Zoom on each page.
                    AG!


                    • #11
                      There are several ways to add an image or icon next to each search result.

                      1) Add ZOOMIMAGE meta data to each page which tells Zoom which image to use with the page.

                      2) Display the same icon for every page of a particular type. e.g. a PDF icon

                      3) Link pages and images using the page file name. For example, you can create a series of image files that have the same names as the pages. So if your file was dog.html, the image file could be dog.jpg.

                      In each instance Zoom needs to be told which image should be used. It never attempts to guess if an image might be appropriate for the page in question. These options can be selected from the "Scan options" tab in the Zoom configuration window.


                      • #12
                        Is there any chance of assigning a thumbnail based on the category?

                        I have many documents that have the PDF, DOC, TXT, etc... in different categories. I think it is a great feature now to be able to show a thumbnail to identify the file type.

                        I suppose I could do it by having a thumbnail for each file in the category, but with over 150,000 files that's a lot of thumbnails to generate.

                        For example every file in the /docs/productguide/ folder could have a particular thumbnail and every file in the /docs/releasenotes/ could have a thumbnail, etc...

                        My categories follow my directory structure, and it would be nice to have one global thumbnail for each directory instead of one per file.

                        Thanks


                        • #13
                          We don't support assigning an image per directory, only per file or per file type.

                          But if you are a web server expert, you could do something tricky, like using URL rewriting to map all HTTP requests for image files in a particular directory to a single image file. We don't have a script to do this, but it should be possible.


                          • #14
                            Originally posted by wrensoft
                            We don't support assigning an image per directory, only per file or per file type.

                            But if you are a web server expert, you could do something tricky, like using URL rewriting to map all HTTP requests for image files in a particular directory to a single image file. We don't have a script to do this, but it should be possible.
                            It's easier to just create individual thumbnails. Here is a very quick command that could probably use refinement... but it works.

                            find productguides | awk -F"/" ' NR>1 { gsub("\\.(pdf|xls|doc|txt|htm|html)$",".jpg",$NF); print "cp /home/images/pgthumb.jpg \"/web/images/" $NF "\"" }' |sh

                            This command finds every file under the productguides folder and, for each one, copies a single JPG file you specify to a thumbnail with the matching file name.

                            That way if you have a special icon you want to use for a category of documents no matter what the extension is, you can do it this way.
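
                            For anyone who would rather not pipe generated commands into sh, here is a rough Python equivalent. It is a sketch, not a tested replacement: the function name is made up, the paths are whatever you pass in, and unlike the one-liner it skips directories and files with other extensions and matches extensions case-insensitively.

```python
import shutil
from pathlib import Path

# Same extensions as the awk pattern in the command above.
DOC_EXTENSIONS = {".pdf", ".xls", ".doc", ".txt", ".htm", ".html"}


def copy_category_thumbnail(doc_root, thumbnail, out_dir):
    """Copy one thumbnail image once per document found under doc_root."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    created = []
    for path in sorted(Path(doc_root).rglob("*")):
        if path.is_file() and path.suffix.lower() in DOC_EXTENSIONS:
            # The copy is named after the document: guide.pdf -> guide.jpg
            target = out / (path.stem + ".jpg")
            shutil.copyfile(thumbnail, target)
            created.append(target)
    return created
```

                            With the paths from the command above, the call would be copy_category_thumbnail("productguides", "/home/images/pgthumb.jpg", "/web/images"). As with the one-liner, two documents with the same base name in different subfolders end up sharing one thumbnail file.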
                            Last edited by MikeR; 10-26-2006, 03:27 AM.


                            • #15
                              images flowing left to right in results

                              I added the code "display: inline;" for both .result_block and .result_altblock but there was no change in results format, still displayed vertically instead of flowing from left to right.

                              Anything I'm missing or any reason why that would not change the way the results are displayed?

                              Thanks,
                              Sam T
