
Output index in html help code


  • Output index in html help code

    Hello,

    First, I want to say that the Zoom Search program works very well and is a great piece of work, thanks!

    I searched this forum a bit but didn't find anything on my question/suggestion, and I hope it hasn't been answered before. If it has, a link without comment is fine.

    I'm writing small programs for Outlook with help files in the Windows .chm format. It would be very nice if Zoom Search could output the index in this .chm format.

    At the moment I do this job with an Excel macro, and it works reasonably well, but it is too much manual work. I think this option would not be very difficult to include, and I want to ask whether you have thought about this functionality.

    Thanks in advance
    Peter
    Last edited by Peter Marchert; Jan-24-2007, 12:46 PM.

  • #2
    I'm not exactly sure what you are asking for. Outputting the index data (which is all in a proprietary internal format for use with Zoom's search scripts) in the .CHM format does not make much sense on its own.

    I presume what you are actually asking for is the ability to use Zoom to update the internal search function in your CHM (aka Windows HTML Help) files. This is not possible, and does not seem like a simple task at all. CHM files are a binary format created by HTML Help authoring applications, and have their own internal format for providing the search functionality, AFAIK.

    I am vaguely aware of the ability to add scripting to CHM files. I think it uses VBScript. Zoom currently provides an ASP script option, which also uses VBScript. Although the script is in the same language, I am sure that there are significant differences that would require a more intensive development port. Some functionality may be absent altogether when running in the CHM environment as opposed to ASP, and this could prove to be a very difficult endeavour. I am also unsure if the scripting functionality allows us to override the normal search function built into CHM files.

    Also, I am not sure what problems you are having with the built-in search functionality that is created by Microsoft's utility for creating CHM files.

    Given that there has been no demand for us to replace the existing search functionality in CHM files, and that we are unfamiliar with scripting in CHM files, this seems unlikely to be something we pursue.

    Let me know if I have misinterpreted your question.
    Last edited by Ray; Jan-25-2007, 12:15 AM.
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine



    • #3
      Thank you for your reply, Ray.

      The functionality I suggested does not mean a complete search function for CHM files, only the creation of an index.

      The index file has the extension .hhk and contains simple HTML tags.

      Here is my current code to create the index file:
      Code:
      Sub CreateIndexFile()
      
          Dim strHTMLRoot                 As String
          Dim strIndexFile                As String
          
          Dim strWord                     As String
          Dim strTitle                    As String
          Dim strHTMLFile                 As String
          Dim strTarget                   As String
          
          Dim lngBackSlash                As Long
          Dim lngRow                      As Long
          Dim lngRowWord                  As Long
          
          Dim bytFreeFile                 As Byte
          
          strHTMLRoot = Sheets("Files").Range("C2").Value
          
          lngBackSlash = InStrRev(strHTMLRoot, "\")
          
          strHTMLRoot = Left(strHTMLRoot, lngBackSlash - 1) & "\"
          
          strIndexFile = strHTMLRoot & "Index.hhk"
          
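          ' Kill raises a run-time error if the file does not exist yet, so check first
          If Dir(strIndexFile) <> "" Then Kill strIndexFile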
          
          bytFreeFile = FreeFile
          
          Open strIndexFile For Output As #bytFreeFile
          
              '-------------------------------------------------------------------------------------
              ' Header
              '-------------------------------------------------------------------------------------
              Print #bytFreeFile, "<!DOCTYPE HTML PUBLIC ""-//IETF//DTD HTML//EN"">"
              Print #bytFreeFile, "<HTML>"
              Print #bytFreeFile, "  <HEAD>"
              Print #bytFreeFile, "    <meta name=""GENERATOR"" content=""Microsoft&reg; HTML Help Workshop 4.1"">"
              Print #bytFreeFile, "    <!-- Sitemap 1.0 -->"
              Print #bytFreeFile, "  </HEAD>"
              Print #bytFreeFile, "  <BODY>"
              Print #bytFreeFile, "    <UL>"
                      
              For lngRow = 2 To ActiveSheet.UsedRange.Rows.Count
                  
                  '---------------------------------------------------------------------------------
                  ' Reading data
                  '---------------------------------------------------------------------------------
                  strWord = Trim(Cells(lngRow, 1).Value)
                  If strWord = "" Then Exit For
                  strTitle = Trim(Cells(lngRow, 2).Value)
                  strHTMLFile = Trim(Replace(Cells(lngRow, 3).Value, strHTMLRoot, ""))
                  strTarget = Trim(Cells(lngRow, 4).Value)
                  If strTarget <> "" Then strHTMLFile = strHTMLFile & "#" & strTarget
                  lngRowWord = lngRow
                  
                  '---------------------------------------------------------------------------------
                  ' Printing data
                  '---------------------------------------------------------------------------------
                  Print #bytFreeFile, "      <LI><OBJECT type=""text/sitemap"">"
                  Print #bytFreeFile, "          <param name=""Name"" value=""" & strWord & """>"
                  
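                  Do
                      
                      Print #bytFreeFile, "          <param name=""Name"" value=""" & strTitle & """>"
                      Print #bytFreeFile, "          <param name=""Local"" value=""" & strHTMLFile & """>"
                      
                      lngRowWord = lngRowWord + 1
                      
                      ' Read the next row; it is only printed if it repeats the same keyword
                      strTitle = Trim(Cells(lngRowWord, 2).Value)
                      strHTMLFile = Trim(Replace(Cells(lngRowWord, 3).Value, strHTMLRoot, ""))
                      strTarget = Trim(Cells(lngRowWord, 4).Value)
                      If strTarget <> "" Then strHTMLFile = strHTMLFile & "#" & strTarget
                      
                  Loop While Trim(Cells(lngRowWord, 1).Value) = strWord
                  
                  ' Skip the rows already written for this keyword so they are not emitted again
                  lngRow = lngRowWord - 1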
                  
                  Print #bytFreeFile, "          </OBJECT>"
                  
              Next
              
              '-------------------------------------------------------------------------------------
              ' Footer
              '-------------------------------------------------------------------------------------
              Print #bytFreeFile, "    </UL>"
              Print #bytFreeFile, "  </BODY>"
              Print #bytFreeFile, "</HTML>"
          
          Close #bytFreeFile
      
      End Sub
      After creation I get something like this for Index.hhk:

      Code:
      <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
      <HTML>
        <HEAD>
          <meta name="GENERATOR" content="Microsoft&reg; HTML Help Workshop 4.1">
          <!-- Sitemap 1.0 -->
        </HEAD>
        <BODY>
          <UL>
            <LI><OBJECT type="text/sitemap">
                <param name="Name" value="/loadinf">
                <param name="Name" value="H&#228;ufig gestellte Fragen">
                <param name="Local" value="htm\faqs.html#question2">
                </OBJECT>
            <LI><OBJECT type="text/sitemap">
                <param name="Name" value="/saveinf">
                <param name="Name" value="H&#228;ufig gestellte Fragen">
                <param name="Local" value="htm\faqs.html#question2">
                </OBJECT>
            <LI><OBJECT type="text/sitemap">
                <param name="Name" value="/silent">
                <param name="Name" value="H&#228;ufig gestellte Fragen">
                <param name="Local" value="htm\faqs.html#question2">
                </OBJECT>
          </UL>
        </BODY>
      </HTML>
      I think Zoom Search should be able to produce that output too.
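
For readers who want to try the same idea outside Excel, here is a minimal Python sketch of the .hhk generation described above. It is not part of Zoom and not the macro itself; the function name `build_hhk` and the `(keyword, title, local)` row layout are assumptions chosen to mirror spreadsheet columns 1 to 3. It assumes rows are already sorted by keyword and contain plain text (not pre-encoded entities such as `&#228;`).

```python
from html import escape
from itertools import groupby

HEADER = """<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML>
  <HEAD>
    <!-- Sitemap 1.0 -->
  </HEAD>
  <BODY>
    <UL>
"""

FOOTER = """    </UL>
  </BODY>
</HTML>
"""


def build_hhk(rows):
    """Return .hhk text for rows of (keyword, title, local).

    Consecutive rows with the same keyword are merged into one
    sitemap entry with several title/target pairs.
    """
    parts = [HEADER]
    for keyword, group in groupby(rows, key=lambda row: row[0]):
        parts.append('      <LI><OBJECT type="text/sitemap">\n')
        parts.append(f'          <param name="Name" value="{escape(keyword)}">\n')
        for _, title, local in group:
            parts.append(f'          <param name="Name" value="{escape(title)}">\n')
            parts.append(f'          <param name="Local" value="{escape(local)}">\n')
        parts.append('          </OBJECT>\n')
    parts.append(FOOTER)
    return "".join(parts)


if __name__ == "__main__":
    sample = [
        ("/loadinf", "FAQ", "htm\\faqs.html#question2"),
        ("/loadinf", "Setup", "htm\\setup.html"),
        ("/silent", "FAQ", "htm\\faqs.html#question2"),
    ]
    print(build_hhk(sample))
```

Grouping consecutive rows plays the role of the inner `Do ... Loop While` in the macro: each keyword gets one `<OBJECT>` entry, however many pages it points to.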

      Peter
      Last edited by Peter Marchert; Jan-25-2007, 06:52 AM.



      • #4
        The "Index" tab that you see in your HTML Help window is quite different in nature to the index files we generate for a Search function. Like the index on the back of a book, this sort of end-user index should provide only keywords and nouns - they really need to be manually picked out from the content of the pages in question - which is why you are doing so manually at this point.

        The Zoom Indexer scans and indexes all text content on pages, so that the search script can perform full-text searching. This means all words are indexed, nouns, verbs, adjectives - things that would just clutter up your end-user index, you'll have entries for "that", "output", "should", "be", "possible", "think"... it's just useless in that scenario.
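
To illustrate the point, here is a toy full-text indexer in Python. It is only a sketch of the general technique (every word maps to the pages it occurs on), not Zoom's actual index format or algorithm; `full_text_index` and the sample page are made up for the example.

```python
import re
from collections import defaultdict


def full_text_index(pages):
    """Map every word to the set of page names it occurs on."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in re.findall(r"[a-z']+", text.lower()):
            index[word].add(name)
    return index


pages = {
    "faq.html": "That output should be possible by Zoom search too, I think.",
}
index = full_text_index(pages)

# Every distinct word becomes an entry, including filler words
# such as "that", "should" and "be".
print(sorted(index))
```

Used as an "Index" tab, each of those entries would become a keyword the reader has to scroll past.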

        So the index data that Zoom generates is not what you want for the "Index" of your CHM files.
        --Ray
        Wrensoft Web Software
        Sydney, Australia
        Zoom Search Engine



        • #5
          Originally posted by Ray
          So the index data that Zoom generates is not what you want for the "Index" of your CHM files.
          That is nearly right. A good index will also provide words that do not appear in the files.

          But I think it is not wrong to provide all the words from the files too. Then a user who does not use the full search function will quickly find what he is searching for in the index.

          In combination with the recommended links (which include words that are not found in the files), you could create a good and complete index.

          Peter



          • #6
            Originally posted by Ray
            This means all words are indexed, nouns, verbs, adjectives - things that would just clutter up your end-user index, you'll have entries for "that", "output", "should", "be", "possible", "think"... it's just useless in that scenario.
            For these words you need an exclusion list. For German words I have this nearly completed, and my Excel macro has a "learning" function for it.

            OK, this would be some more work for you, and most people using Zoom do not need this.

            Peter



            • #7
              Originally posted by Peter Marchert
              But I think it is not wrong to provide all the words from the files too. Then a user who does not use the full search function will quickly find what he is searching for in the index.
              But this would essentially be replicating the "Search" tab functionality which is built into the Windows HTML Help / CHM file viewer, not the purpose of the "Index" - which is to provide only the relevant, and associated keywords - again, like the back of a book.

              Originally posted by Peter Marchert
              For these words you need an exclusion list. For German words I have this nearly completed, and my Excel macro has a "learning" function for it.
              I think you may be underestimating the differences between a full-text index and a back-of-book-style index. For example, on one page, the use of the word "shop" may not be relevant at all to a topic, it might be part of an expression in a blog entry: "... i'm going out to shop around this weekend ..." - but if you add the word "shop" to the exclusion list, you'll also be omitting it from a page where the actual word "shop" is relevant: "Welcome to my Online Shop page!".
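
A small Python sketch of that "shop" example follows. The page names and the helper `index_keywords` are hypothetical; the point is only that an exclusion list works on words globally and cannot tell a relevant use from an incidental one.

```python
import re


def index_keywords(pages, exclude):
    """Return the sorted keywords per page, minus the excluded words."""
    result = {}
    for name, text in pages.items():
        words = set(re.findall(r"[a-z']+", text.lower()))
        result[name] = sorted(words - exclude)
    return result


pages = {
    "blog.html": "I'm going out to shop around this weekend",
    "shop.html": "Welcome to my Online Shop page!",
}

# Without an exclusion list, "shop" is indexed for both pages,
# although it only matters on shop.html.
unfiltered = index_keywords(pages, exclude=set())

# Excluding "shop" removes the noise from blog.html, but it also
# removes the one keyword that makes shop.html findable.
filtered = index_keywords(pages, exclude={"shop"})

print(unfiltered)
print(filtered)
```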

              I do not believe you can create an exclusion list to make this more useful. And as mentioned before, if the "Index" tab contained all words from the content, your users would not only have a very large list of words to dig through, but each keyword selected will return hundreds of pages, and the user has little means of digging through the results, because essentially, they would only be able to search by one keyword.

              I can imagine that, for a very small list of keywords, where the words are very unique (e.g. source code class names or function names) and you only want to search for those keywords (assuming you ignore ALL other normal words in the English dictionary, which is really far too much to add to an exclusion list), it can be useful. But this is really not the designed use of Zoom and, as you realize, of very little use to the majority of our users.

              I should point out that there are many HTML Help file authoring applications out there, which aid in the process of creating CHM files. Although, even they usually require you to enter the "Index" keywords manually, because again, in almost all cases, manual selection would create a far more effective and useful end-of-book-style index.
              --Ray
              Wrensoft Web Software
              Sydney, Australia
              Zoom Search Engine



              • #8
                Hello Ray,

                thanks again for your detailed answer.

                I think you are right. It was just an idea and now I think it would be better to create an index manually.

                Peter
