A few questions...


  • A few questions...

    First, I love your product. It works great and is very fast (I am using the V4.0 Pro CGI version on IIS 5.0, Win2K). I have a few questions though.
    1) Can Zoom sort the results by filename?
    I have Zoom loaded on my company's Intranet site. We have files named after part numbers, e.g. 618_0404.htm, 703_1234.htm or 813_01432.htm. There may be one or many references to 703_1234.htm in the other two files, so when I search for 703_1234 it returns links to 618_0404.htm and 813_01432.htm as well as 703_1234.htm. The query sorts the results by how many times 703_1234 actually appears inside each htm file, so my results may appear with 813_01432.htm as the first link, 618_0404.htm next and 703_1234.htm last. I would love to sort by filename numerically. Is this possible?
    2) When would I want to use Spider vs. Offline? My web site is an Intranet and all files exist on the same server. Would I ever use Spider on my website? If yes why? I use the Offline method right now.
    3) Why do I have to re-index my whole site just to add one file to the index? It seems it would be fairly easy to check each file's time stamp, index only the files modified after the previous index run (with that time stored for the next run to read), and append the information to the existing index files. Am I wrong? Can I index only new files? This would be a great feature to add to the Professional version. I have many tens of thousands of files to index and publish about 50 or so files to the site daily. I would like to add each file to the index as it is published. However, the indexing process pegs the CPU at 100% and slows the site to a crawl while it runs (well over an hour with 60,000+ files). I am leaning towards indexing only during off-peak hours, but then a file published during peak hours, which is when all of the files are published, will not appear in search results until the next day (after the index has run).
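
The time stamp check described in question 3 can be sketched as follows. This is only an illustration in Python, not part of Zoom: the function name `files_modified_since` and the idea of passing the previous run's time as a plain number are assumptions. It only finds the candidate files; it does not touch or update any index.

```python
from pathlib import Path


def files_modified_since(root, last_index_time):
    """Return .htm/.html files under root modified after last_index_time.

    last_index_time is a POSIX timestamp (seconds), assumed to have been
    saved by the previous indexing run. Hypothetical helper, not a Zoom API.
    """
    changed = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        if path.suffix.lower() not in {".htm", ".html"}:
            continue
        # Compare the file's modification time against the stored time.
        if path.stat().st_mtime > last_index_time:
            changed.append(path)
    return sorted(changed)
```

Even with such a list in hand, the index itself would still need a way to accept partial updates, which is a separate problem.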

    Thank you,
    Great Product!!
    Keep up the good work!!

  • #2
    1.) Zoom does not currently have a feature for sorting by filenames. However, your point is valid, especially in cases where the other files (e.g. b.html, c.html) contain more than one reference to "a.html", and "a.html" contains no mention of its own filename. In this case, b.html and c.html would both come up before a.html.

    One workaround to this would be to add the filename as a meta keyword in each file. For example, in a.html, you can have:

    Code:
    <meta name="zoomwords" content="a.html, a.html, a.html">
    This would make the page more relevant for its own filename.
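
For a site with many existing pages, the tag could be added by a script rather than by hand. The following is a minimal sketch in Python, not a Zoom tool: the function name `add_filename_keyword` and the `repeats` parameter are made up for illustration, and it assumes each page has an opening head tag and that the meta tag uses the standard HTML `content` attribute.

```python
import re


def add_filename_keyword(html, filename, repeats=3):
    """Insert a zoomwords meta tag naming the file just after <head>.

    Hypothetical helper. Returns the text unchanged if a zoomwords tag
    is already present or if no opening head tag is found.
    """
    if re.search(r'<meta\s+name="zoomwords"', html, flags=re.IGNORECASE):
        return html  # already tagged, do not add a second copy
    words = ", ".join([filename] * repeats)
    tag = f'<meta name="zoomwords" content="{words}">'
    # Match <head> or <head ...> in any case, but not <header>.
    return re.sub(
        r"(<head(?:\s[^>]*)?>)",
        lambda m: m.group(1) + "\n" + tag,
        html,
        count=1,
        flags=re.IGNORECASE,
    )
```

Running this over a copy of the site first would be prudent, since it rewrites page source.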

    A better solution, which we've added to our to-do list (for version 4.1 or 5.0), is for us to add a boosting option for "filename" (similar to titles, descriptions and keywords in the "Indexing options" tab of the config window). This would then allow you to give a filename match more priority than a match in the content of the page.

    2.) You would need to use Spider mode if you have dynamically generated web pages (e.g. PHP, ASP, CFM, etc.) which must be processed on the server side before viewing. If you index these files with offline mode, they would be indexed as text (that is, the script source code would be indexed). Otherwise, offline mode is a faster method for indexing local files.

    3.) The current version of Zoom must recreate the entire index. It would be nice to be able to index only updated files, but the indexing data is currently compressed and optimized in such a way that it is very difficult to update or modify. Note that you typically do not want to just 'append' information, because every page can potentially change. For example, when you add a page, the sitemap or navigation menu may gain an additional link, and that navigation menu may appear on every page. We do hope to come up with a solution for this, but it would be a fairly big task which will have to wait for a future major release.
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine



    • #3
      Thanks Ray...

      Ray,

      Thanks for the reply. A few points though. As far as the sort, I have the following in each page...

      <HEAD>
      <TITLE>MTD Parts Online Browser for 703_3164</TITLE>
      </HEAD>

      I do not include Meta descriptions, Meta keywords or Meta author in the index. I only index Title of page, Page content and Filename. I do need a brief description of the page in the query return. I have Page title set at +5 Boost, description at -1 Deboost and Keywords at -5 Deboost. What this does is make 703_3164 the first return but the rest of the links returned are still in no apparent order.

      It is somewhat impractical for me to add lines to my web pages as there are 60,000+ pages in existence already. In addition to this I do not think adding ZOOM tags will enable me to order the files numerically. Do I have access to the code (C,C++) that created the CGI? I believe that is the only way I will be able to truly affect the order the results are returned in.


      • #4
        Re: Thanks Ray...

        Originally posted by jsprenz
        What this does is make 703_3164 the first return but the rest of the links returned are still in no apparent order.
        The rest of the links are sorted depending on how many times "703_3164" appears on the page (or your other search terms). If they are only mentioned once on each page, then yes, there would be no apparent order for these results (which would be considered to have equal relevance).
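
To illustrate the point about equal relevance, here is a small Python sketch of ranking by match count with a numeric filename tie-break. It is not Zoom's actual ranking code; the names `numeric_key` and `rank` and the (filename, match count) input shape are assumptions made for the example.

```python
import re


def numeric_key(filename):
    """Turn '703_1234.htm' into (703, 1234) so parts compare as numbers."""
    return tuple(int(n) for n in re.findall(r"\d+", filename))


def rank(hits):
    """Sort (filename, match_count) pairs.

    Higher match counts come first; pages with the same count are
    ordered by the numeric parts of their filenames.
    """
    return sorted(hits, key=lambda h: (-h[1], numeric_key(h[0])))


hits = [("813_01432.htm", 1), ("703_3164.htm", 4), ("618_0404.htm", 1)]
for name, count in rank(hits):
    print(name, count)
```

Without the second sort key, the two pages with one match each could legitimately appear in either order.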

        Originally posted by jsprenz
        Do I have access to the code (C,C++) that created the CGI? I believe that is the only way I will be able to truly affect the order the results are returned in.
        We do not provide the C/C++ source code to the CGI due to maintenance and support issues. However, we can offer custom development if you specifically require the feature to sort by filename. Please e-mail us if you are interested in this and we can discuss your requirements in more detail. Our contact information is at: http://www.wrensoft.com/contactus.html
        --Ray
