V7 beta release available for testing


  • #76
    Not sure what you're asking for, but Zoom V6 supports image searching, and you can display the results as thumbnails in a grid, like what you see here:
    http://www.wrensoft.com/zoom/screen_images.html

    This is described here:
    http://www.wrensoft.com/forum/showth...=4592#post4592

    More details on image indexing here:
    http://www.wrensoft.com/zoom/support...ge_layout.html

    As for competing with Google and Bing -- you may want to note that Google is running on over 2 million dedicated servers, and Steve Ballmer reported that Microsoft is running on over 1 million dedicated servers. How many dedicated servers do you have available?
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine



    • #77
      I think there are 2 bugs in PDF search of V7 Beta.

      1. The date sort is not working because the date given in the result page is always the same. It is always '16.Nov. 2012' (at least on my indexes).

      2. Sort by relevance is also not correctly working, because it is not correctly sorted according to the score.
      Some higher scores can be found rather low in the list.

      Could you please check that.

      Thx!

      Peter



      • #78
        Unfortunately nobody has replied to my previous post.

        Does nobody else have this problem?

        I have tested it on a different machine and it is the same, also for HTML files (the date is always 16 Nov 2012).
        I even tried to use .desc files, but to no avail.

        Am I doing something wrong?

        Peter



        • #79
          Hi,

          I am finding an issue in this new version with 'skipped' words being highlighted when selecting a search result and using the 'Jump to match highlight' feature.

          I have indexed a site where a likely search term will be 'recall a message'. The problem is that when selecting the result from the results list (which does disregard the noise word 'a'), the word 'a' is still highlighted on the page by the 'Jump to match highlight' feature.

          The specific problem with this is that on many pages the word 'a' appears much higher up the page than the more important words, and it is this word that the user is directed to first, sometimes with no view (until they skip down the page) of the most important words, in this case 'recall' and 'message'.

          Will this be addressed in the next version?

          Many thanks



          • #80
            Peter -- sorry for missing your posts. Somehow the forum did not flag them as new posts to me, and among the spam posts we regularly get, we sometimes miss the genuine new posts.

            Originally posted by Peter View Post
            I think there are 2 bugs in PDF search of V7 Beta.

            1. The date sort is not working because the date given in the result page is always the same. It is always '16.Nov. 2012' (at least on my indexes).
            Given that you say your non-PDF files are also reporting this date, I would speculate the following:
            (a) Are you indexing in Spider Mode, and using dynamically generated pages such as PHP pages which are also serving the PDF files? If so, it is possible that it is reporting the date of the PHP file, rather than the files or content being served.
            (b) Have you mixed files from different sessions? Make sure ALL files listed at the end of indexing (in the "Required Files" window) are uploaded and overwrite the old files.
            (c) If you still have trouble, zip up the search files (and your config file) and e-mail them to us and we can take a look.

            Originally posted by Peter View Post
            2. Sort by relevance is also not correctly working, because it is not correctly sorted according to the score.
            Some higher scores can be found rather low in the list.
            The number of terms matched has priority weighting over the score. So "Terms matched: 2 -- Score: 50" can outrank "Terms matched: 1 -- Score: 55".
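The ordering rule described above can be sketched as a two-part sort key. This is only an illustration of the stated priority (terms matched first, score second); the `Result` class and its field names are made up for the example and are not Zoom's actual code.

```python
from dataclasses import dataclass


@dataclass
class Result:
    # Hypothetical record for one search hit; field names are illustrative only.
    url: str
    terms_matched: int
    score: int


def sort_by_relevance(results):
    # Primary key: number of terms matched. Secondary key: score.
    # Both are sorted in descending order.
    return sorted(results, key=lambda r: (r.terms_matched, r.score), reverse=True)


results = [
    Result("one-term.html", terms_matched=1, score=55),
    Result("two-terms.html", terms_matched=2, score=50),
]
ranked = sort_by_relevance(results)
```

With this key, the page matching two terms is listed above the page matching one term, even though its score is lower.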
            --Ray
            Wrensoft Web Software
            Sydney, Australia
            Zoom Search Engine



            • #81
              Originally posted by Dez View Post
              I have indexed a site where a likely search term will be 'recall a message'. The problem is that when selecting the result from the results list (which does disregard the noise word 'a'), the word 'a' is still highlighted on the page by the 'Jump to match highlight' feature.

              The specific problem with this is that on many pages the word 'a' appears much higher up the page than the more important words, and it is this word that the user is directed to first, sometimes with no view (until they skip down the page) of the most important words, in this case 'recall' and 'message'.

              Will this be addressed in the next version?
              The "jump to highlighting" feature has no awareness of the Skip List (or synonyms for that matter). This is a technical problem -- the "jump to highlighting" script is planted on each of your web pages. It has to be light and compatible with as many pages as possible. For it to do the extra work of accessing the skip words list from the index, a lot of extra things would have to happen for such a seemingly "simple" idea. For example, perhaps the "search.php" script would take a special AJAX query to return the skip list (retrieved from the dictionary), and every time anyone accessed ANY of your web pages, your server would have to handle the load of performing said query and accessing the skip list. Another option would be to generate a static list to be included with the highlight.js file. But then this would have to be maintained regularly.
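To make the "static list" option concrete, here is a rough sketch of the filtering step, written in Python for brevity rather than as part of highlight.js. The `SKIP_WORDS` contents and the `terms_to_highlight` function are hypothetical; a real list would have to be regenerated whenever the skip list in the indexer configuration changes.

```python
# Hypothetical static copy of the skip list, exported by hand from the indexer config.
SKIP_WORDS = frozenset({"a", "an", "the", "of"})


def terms_to_highlight(query, skip_words=SKIP_WORDS):
    # Split the query into lower-case words and drop anything on the skip list.
    terms = query.lower().split()
    kept = [t for t in terms if t not in skip_words]
    # If every word was a skip word, fall back to the original terms
    # so that something is still highlighted.
    return kept or terms


highlight = terms_to_highlight("recall a message")
```

The maintenance cost mentioned above is visible here: the static copy and the real skip list can drift apart without anyone noticing.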

              Short answer: there is no simple, practical solution. We'll keep it in mind for future changes, especially if there's enough demand for it. But a lot of people don't quite realize what it involves in terms of either resources or maintenance, and once they realize that, they are less keen on it.
              --Ray
              Wrensoft Web Software
              Sydney, Australia
              Zoom Search Engine



              • #82
                Hello Ray,

                Thank you for your reply.

                Originally posted by Ray View Post
                Given that you say your non-PDF files are also reporting this date, I would speculate the following:
                (a) Are you indexing in Spider Mode, and using dynamically generated pages such as PHP pages which are also serving the PDF files? If so, it is possible that it is reporting the date of the PHP file, rather than the files or content being served.
                - I am using the JavaScript search (sorry, I should have mentioned it). I tested it with both PDF and static HTML pages.
                Originally posted by Ray View Post
                (b) Have you mixed files from different sessions? Make sure ALL files listed at the end of indexing (in the "Required Files" window) are uploaded and overwrite the old files.
                - The files are all on my local HDD, and they surely have different time stamps.

                Originally posted by Ray View Post
                (c) If you still have trouble, zip up the search files (and your config file) and e-mail them to us and we can take a look.
                - Thank you very much for your support, I will send a sample ZIP-file.

                Originally posted by Ray View Post
                The number of terms matched has priority weighting over the score. So "Terms matched: 2 -- Score: 50" can outrank "Terms matched: 1 -- Score: 55".
                - OK, understood.

                Peter



                • #83
                  Originally posted by Peter View Post
                  The files are all on my local HDD, and they surely have different time stamps.
                  Thank you very much for your support, I will send a sample ZIP-file.
                  Got the file, and yes, it looks like it is a bug in the JavaScript search option with document dates in offline mode. Unfortunately, with Christmas coming up, it will be the first week of January before we release a fix.



                  • #84
                    Originally posted by wrensoft View Post
                    Got the file, and yes, it looks like it is a bug in the JavaScript search option with document dates in offline mode. Unfortunately, with Christmas coming up, it will be the first week of January before we release a fix.
                    OK, then I'll have to wait.
                    BTW, when the results are sorted by relevance, if the number of terms matched and the score are the same, will the results then be sorted by date?

                    Peter



                    • #85
                      if the number of terms matched and the score are the same, will the results then be sorted by date?
                      No. The order is not defined in the case where two documents are scored equally, but in practice I think the document that was indexed first will appear first.
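One way such "indexed first appears first" behaviour can arise is from a stable sort, which leaves equally ranked items in their input order. The sketch below only illustrates that idea with made-up data; whether Zoom actually uses a stable sort is not stated here, and the order remains officially undefined.

```python
# (url, terms_matched, score), listed in the order the documents were indexed.
indexed = [
    ("first.html", 2, 50),
    ("second.html", 2, 50),
    ("third.html", 2, 60),
]

# Python's sorted() is stable, also with reverse=True, so the two documents
# with identical keys keep their original (indexing) order.
ranked = sorted(indexed, key=lambda d: (d[1], d[2]), reverse=True)
order = [url for url, _, _ in ranked]
```

Here `third.html` moves to the top on score, while `first.html` stays ahead of `second.html` purely because it was indexed earlier.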



                      • #86
                        Originally posted by wrensoft View Post
                        No. The order is not defined in the case where two documents are scored equally, but in practice I think the document that was indexed first will appear first.
                        I think it would be a good idea to have a date parameter in the weightings, so that newer documents are ranked higher, wouldn't it?



                        • #87
                          The perfect search engine software.
                          1# A "select all" button for the URLs, so one can set attributes for all URLs (such as "scan single page") or set one as the default. It is crazy to do this by hand when one has 100 URLs to set; there goes your day.

                          2# A new search option to add an image search, so one can search images just as Google, Lycos and Bing allow. This is badly needed and would help one compete with the big boys on the block.

                          3# A confirmation box on "delete URL". Badly needed.

                          4# An option to pick your own limits for PHP or Java when you exceed the limitation.

                          5# A better picture option beside the search results. One should not have to set this up; when a URL is scanned, a default picture for that URL should be shown beside the search results, as many people identify a search result by its picture. Some more support for those who are out to catalog the net.

                          6# A skip button for when your spider is stuck or cataloging way too much of one URL. One click and it moves to the next URL to search. This would be the world's greatest software, and I am sure many of your customers would fully agree with me.

                          7# Categories should be in their own folders (net, apps, news, internet, programs), but I am not sure if it does this now; I have not tried out this option yet.


                          ONE more thing 9# To have a spider added to the output folder, so it keeps spidering and updating what you have already cataloged, and one would not have to keep doing it from the console. This option would be extremely nice and easy to implement with PHP, ASP or Java.
                          Last edited by Brittany; Dec-18-2013, 05:17 PM.



                          • #88
                            Can I delete the main HTML template so that when it uploads it will not change the look I have?



                            • #89
                              Originally posted by Brittany View Post
                              Can I delete the main HTML template so that when it uploads it will not change the look I have?
                              Zoom does not overwrite the "search_template.html" file in the Output Directory if it exists. It only creates a new file in the Output Directory when it does not exist.

                              So if you always make your changes to the file in the Output Directory, your changes will persist.
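The "create only if missing" behaviour described above can be pictured with a small sketch. This is not Zoom's code: the `ensure_template` function and the placeholder default text are invented for illustration, and only the file name "search_template.html" comes from the post.

```python
import tempfile
from pathlib import Path

# Placeholder standing in for the default template contents.
DEFAULT_TEMPLATE = "<html><!-- default search template --></html>"


def ensure_template(output_dir, default_text=DEFAULT_TEMPLATE):
    # Write the default template only when no template exists yet.
    # Returns True if a new file was created, False if one was left alone.
    path = Path(output_dir) / "search_template.html"
    if path.exists():
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(default_text, encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as out_dir:
    created_first = ensure_template(out_dir)       # no template yet: created
    template = Path(out_dir) / "search_template.html"
    template.write_text("<html>my customized look</html>", encoding="utf-8")
    created_second = ensure_template(out_dir)      # template exists: untouched
    kept_text = template.read_text(encoding="utf-8")
```

The second call leaves the customized file as it is, which is why edits made directly in the Output Directory survive re-indexing.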

                              But if you manually copy the contents of the Output Directory to another folder (where you have your customized template), then it will keep resetting your file to the default.

                              Alternatively, if you are modifying the file on your web server directly (i.e. the master copy of your template file is actually on the server), then you can disable the FTP upload of the file (under "Configure"->"FTP"->"Do not upload search template")
                              --Ray
                              Wrensoft Web Software
                              Sydney, Australia
                              Zoom Search Engine



                              • #90
                                Originally posted by Brittany View Post
                                1# A "select all" button for the URLs, so one can set attributes for all URLs (such as "scan single page") or set one as the default. It is crazy to do this by hand when one has 100 URLs to set; there goes your day.
                                We would advise managing the list using an external text file in this case. The additional start points window allows you to Import or Export to a CSV (comma separated values) text file. This, in turn, can be edited with programs like Microsoft Excel. Please see section 7.3 in the Users Guide for more information.
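As an alternative to Excel, a bulk change to an exported CSV can be scripted. The sketch below is generic: the two-column layout (URL, then a flag) is invented for the example and is NOT the documented export format, so check the columns of a real exported file and section 7.3 of the Users Guide before adapting it.

```python
import csv
import io


def set_column_for_all_rows(csv_text, column_index, value):
    # Set one column to the same value on every row that has that column.
    rows = list(csv.reader(io.StringIO(csv_text)))
    for row in rows:
        if len(row) > column_index:
            row[column_index] = value
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Hypothetical export: URL in column 0, some per-URL option flag in column 1.
exported = "http://example.com/a/,0\nhttp://example.com/b/,0\n"
updated = set_column_for_all_rows(exported, column_index=1, value="1")
```

The same function works for 2 rows or 100, which is the point of managing the list outside the dialog.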

                                Originally posted by Brittany View Post
                                2# A new search option to add an image search, so one can search images just as Google, Lycos and Bing allow. This is badly needed and would help one compete with the big boys on the block.
                                Image indexing and searching is described here:
                                http://www.wrensoft.com/zoom/support...ins_image.html

                                Originally posted by Brittany View Post
                                3# A confirmation box on "delete URL". Badly needed.
                                Are you referring to the option under "Index"->"Manage existing index"->"View/delete pages from existing index"? Here, you do not delete a URL until you click "Add to delete list" and then click "Proceed", and you are still then given a warning before proceeding.

                                If you are referring to the "Advanced spider URL options" window, then you can simply NOT save your configuration and re-load your prior configuration file to restore/undo your change.

                                Originally posted by Brittany View Post
                                4# An option to pick your own limits for PHP or Java when you exceed the limitation.
                                As explained in our personal emails: first of all, there is no Java option, and JavaScript is very different from Java.

                                Second, the limits in PHP and JavaScript are explained in detail throughout the software and our website (please see here). It is not OUR limitation; it is a limitation of the platform you are using. These platforms simply are not designed to deal with that much data. It is like using a golf cart to compete in the Indy 500. Or using a toaster to cook a steak.

                                As explained in the URL above, use the CGI version for searching over 65,500 pages. This is a technical necessity, despite being harder to configure. Just as it is a technical necessity to assemble and learn to drive a race car to compete in the Indy 500. For lack of a better analogy.

                                Originally posted by Brittany View Post
                                5# A better picture option beside the search results. One should not have to set this up; when a URL is scanned, a default picture for that URL should be shown beside the search results, as many people identify a search result by its picture. Some more support for those who are out to catalog the net.
                                I think you mean we should generate our own thumbnails, as opposed to requiring there to be thumbnail files already. This is something we did consider, but it complicates setup (you would have to manage these thumbnail files on your website). Ultimately, the current design of the image search is really for people indexing and searching their own sites. Not for indexing and searching other people's sites.

                                We'll consider adding a feature like this for those indexing external sites, if there is the demand. However, I do think it would mean a lot of maintenance work on your part (i.e. our users) to manage this (it would take a lot of disk space on your server, for example), which many users may not be prepared for.

                                Originally posted by Brittany View Post
                                6# A skip button for when your spider is stuck or cataloging way too much of one URL. One click and it moves to the next URL to search. This would be the world's greatest software, and I am sure many of your customers would fully agree with me.
                                Interesting idea, but it means you would have to sit there and watch the indexing every time. And it would be impossible to be consistent for each index. Instead, couldn't you use the "Limit files for this start point" option? (Click "Edit" for the start point). Then you can consistently only index a certain number of pages for that point.

                                Or even set a global "Limit files per start point" setting ("Configure"->"Limits") so all start points / web sites have the same limit.
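The effect of a per-start-point limit combined with a global one can be sketched as follows. The function and data are made up for illustration, and the rule that a per-start-point value overrides the global one is an assumption, not something confirmed above.

```python
def apply_limits(discovered, global_limit, per_start_point=None):
    # Keep at most `limit` pages for each start point. A limit set for a
    # specific start point is assumed to take precedence over the global one.
    overrides = per_start_point or {}
    return {
        start: pages[: overrides.get(start, global_limit)]
        for start, pages in discovered.items()
    }


# Hypothetical crawl results: ten pages found under each start point.
discovered = {
    "http://example.com/": [f"http://example.com/page{i}.html" for i in range(10)],
    "http://example.org/": [f"http://example.org/page{i}.html" for i in range(10)],
}
limited = apply_limits(
    discovered, global_limit=5, per_start_point={"http://example.org/": 2}
)
```

Unlike a manual skip button, the same limits apply on every run, so each index is built consistently.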

                                Originally posted by Brittany View Post
                                7# Categories should be in their own folders (net, apps, news, internet, programs), but I am not sure if it does this now; I have not tried out this option yet.
                                Not sure what you mean here. Categories can be specified by the folder that the page is in (matched against the URL pattern, you can have "/net/" and all files belonging in a "net" subfolder will be categorized as such).
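As a rough picture of folder-based categories, the sketch below assigns a category whenever its pattern appears in the URL. Plain substring matching and the category names are assumptions made for the example; Zoom's actual pattern rules may differ.

```python
# Hypothetical category name -> URL pattern mapping.
CATEGORY_PATTERNS = {"net": "/net/", "apps": "/apps/", "news": "/news/"}


def categories_for(url, patterns=CATEGORY_PATTERNS):
    # A page belongs to every category whose pattern occurs in its URL.
    return [name for name, pattern in patterns.items() if pattern in url]


net_page = categories_for("http://example.com/net/page.html")
other_page = categories_for("http://example.com/other/page.html")
```

A page under the "net" subfolder is categorized as "net", while a page outside all listed folders gets no category.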

                                Originally posted by Brittany View Post
                                ONE more thing 9# To have a spider added to the output folder, so it keeps spidering and updating what you have already cataloged, and one would not have to keep doing it from the console. This option would be extremely nice and easy to implement with PHP, ASP or Java.
                                You can schedule automatic indexing using the configuration file you've created (so it will re-spider the same sites). See section 2.15 in the Users Guide and also this support page.
                                --Ray
                                Wrensoft Web Software
                                Sydney, Australia
                                Zoom Search Engine
