
Symbols change to strange characters

  • Symbols change to strange characters

    I have some help files that use the em dash symbol in the FrameMaker source file. Since I updated Zoom Search from v6.0 to v7.0, these symbols are being converted to strange characters. I have since updated to v7.1 in the hope that this would fix it, but I am still getting the same results. Any ideas?

  • #2
    There are quite a few things that can come into play regarding how these special characters get rendered. These include:

    1) How is the em dash character specified in the (HTML?) source file? Is it specified as an HTML entity, e.g. "&mdash;", or as a UTF-8 character?
    2) What encoding or charset is specified by the source file? This would be an HTML meta charset tag in the source file.
    3) What encoding is the source file saved in? This would depend on the editor used.
    4) What encoding have you specified in Zoom (under "Configure"->"Languages")? Did you select "Use Unicode (UTF-8)" if your pages are encoded in UTF-8?
    5) What charset have you specified on the Zoom search page? This would be specified by the HTML meta charset tag in the "search_template.html" file.
    6) What encoding is specified by your web server, when it's serving the search results page?

    All of the above can affect the outcome. So you need to look into each one.
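    As a rough illustration of why a mismatch between any two of these settings produces "strange characters", here is a minimal Python sketch. It is not specific to Zoom; the helper name `mojibake` is made up for this example, and it assumes the two encodings involved are UTF-8 and Windows-1252, which is the most common mix-up.

```python
# The em dash is Unicode code point U+2014.
EM_DASH = "\u2014"


def mojibake(text: str, saved_as: str, read_as: str) -> str:
    """Encode text with one charset, then decode the bytes with another.

    Hypothetical helper: it simulates a file saved in `saved_as` but
    displayed by software that assumes `read_as`.
    """
    return text.encode(saved_as).decode(read_as, errors="replace")


# The same character is stored as different bytes in each encoding.
utf8_bytes = EM_DASH.encode("utf-8")      # three bytes: E2 80 94
cp1252_bytes = EM_DASH.encode("cp1252")   # one byte: 97

# UTF-8 bytes shown as Windows-1252: one dash becomes three characters.
as_cp1252 = mojibake(EM_DASH, "utf-8", "cp1252")

# Windows-1252 byte shown as UTF-8: the byte is invalid on its own,
# so it is replaced with U+FFFD (the replacement character).
as_utf8 = mojibake(EM_DASH, "cp1252", "utf-8")

print(utf8_bytes, cp1252_bytes)
print(repr(as_cp1252), repr(as_utf8))
```

    If the strange characters you see are a three-character sequence starting with "â", the first case is the likely one; a single box or question-mark glyph points to the second.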

    If the search page is accessible online, give us a link to the page and we can take a look and at least rule out a few things.

    In my testing, a properly encoded em dash character is usually stripped out in most HTML situations. A number of users made the case for this, because OCR software can use the character to join words broken by formatting/layout. So we'll really need a closer look at the situation to determine what's happening -- for example, can you provide us with the source file in question and your .zcfg indexing configuration?


    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine



    • #3
      Hi! I was having the same issue. I updated Zoom Search from v6.0 to v7.0, and the symbols now appear as strange characters. I think my files are encoded in something other than UTF-8, but I'm not sure of that.



      • #4
        Hi Beatrix,

        The questions I gave above also apply in your case. Tell us what the characters are, and how they are specified in the HTML source code.

        Or e-mail us a copy of the file in question, along with your .zcfg indexing configuration and we can take a look.
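        If you want to check the source file yourself first, something like the following Python sketch may help. It is only a heuristic and not part of Zoom: the function name `describe_em_dashes` and the sample inputs are invented for illustration, and it only looks for the em dash as an entity, as UTF-8 bytes, or as the Windows-1252 byte 0x97.

```python
import re


def describe_em_dashes(raw: bytes) -> dict:
    """Report the declared charset and how em dashes appear in raw HTML bytes."""
    # Find charset=... near the top of the file (covers both meta tag styles).
    match = re.search(rb'charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', raw[:4096], re.IGNORECASE)
    declared = match.group(1).decode("ascii").lower() if match else None

    # Check whether the bytes form valid UTF-8 at all.
    try:
        raw.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False

    return {
        "declared_charset": declared,
        "valid_utf8": valid_utf8,
        # Named, decimal, and hexadecimal entity forms of U+2014.
        "entity_refs": len(re.findall(rb"&mdash;|&#8212;|&#[xX]2014;", raw)),
        # Raw UTF-8 encoding of U+2014.
        "utf8_em_dashes": raw.count(b"\xe2\x80\x94"),
        # Byte 0x97 is an em dash in Windows-1252. It can also occur inside
        # valid UTF-8 sequences, so only count it when the file is not UTF-8.
        "cp1252_em_dashes": 0 if valid_utf8 else raw.count(b"\x97"),
    }


# Made-up sample: declares windows-1252, uses one entity and one raw 0x97 byte.
sample_cp1252 = b'<meta charset="windows-1252"><p>a &mdash; b \x97 c</p>'
# Made-up sample: declares utf-8 and contains one UTF-8 encoded em dash.
sample_utf8 = (
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    "<p>a \u2014 b</p>"
).encode("utf-8")

report_cp1252 = describe_em_dashes(sample_cp1252)
report_utf8 = describe_em_dashes(sample_utf8)
print(report_cp1252)
print(report_utf8)
```

        A declared charset that disagrees with the byte counts (for example, "utf-8" declared but the file is not valid UTF-8) is a strong hint about where the mismatch is.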
        --Ray

