Characters appear incorrectly in PDF names

  • Characters appear incorrectly in PDF names

    Hello

    I have a problem when indexing our company's intranet. It contains a lot of documents (.pdf, .doc, etc.), and after indexing, some characters don't display correctly on the results page. I don't know why.
    I created a PDF file named "Fichier Test pour Zoom Search Engine.pdf", and in the results the name is displayed as "Fichier Test pour Zoom Search EngineEngine.pdf". This happens with some files.
    The web pages are encoded in "windows-1252", but I'm using UTF-8 encoding as recommended in the known issues on this page: https://www.wrensoft.com/zoom/support/languages.html

    The website is on Windows Server 2000 with IIS 5.0, so it's not possible to use "codePage".

    Do you know something I can try to fix that?


    Thanks a lot


  • #2
    Hard to tell what the problem is without seeing it. Is the page online?

    Some characters displaying incorrectly is likely a codepage / character set issue. You can have codepages specified for the website within IIS configuration, as well as on the page itself (or include pages).
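
As an illustration of what a character set mismatch looks like, here is a minimal Python sketch. It is not part of Zoom or IIS; the helper name `misread` is made up for this example. It encodes a character with one charset and decodes the resulting bytes with another, which is roughly what happens when a page is written in UTF-8 but served or read as windows-1252 (or the reverse).

```python
def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one charset, then decode the bytes with another.

    Bytes that are invalid in the target charset become U+FFFD.
    """
    return text.encode(written_as).decode(read_as, errors="replace")


# UTF-8 bytes read as windows-1252: one character turns into two.
print(misread("é", "utf-8", "cp1252"))  # Ã©

# windows-1252 bytes read as UTF-8: the lone byte is invalid,
# so it is replaced with the U+FFFD replacement character.
print(misread("é", "cp1252", "utf-8"))
```

If the garbled text on the results page resembles either pattern, that points to the page charset and the index encoding disagreeing rather than to a damaged index.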

    Support for Windows Server 2000 and IIS 5.0 was officially discontinued by Microsoft in 2010 (8 years ago), so I don't guarantee any of the following.

    I would strongly suggest updating to actively supported software if you want support.

    In any case, my understanding is that you can specify the @CODEPAGE directive in IIS 5.0. Only Response.CodePage is unavailable in IIS 5.0 (it was introduced in IIS 5.1).

    More information at MSDN here.

    Also see our Support page on setting locale (ASP)

    Your second problem, a search result title such as "Fichier Test pour Zoom Search EngineEngine.pdf", looks like one of the following:
    a) You have corrupted your set of index files from not uploading all the required files at the end of indexing (and thus mixing files from different index sessions)
    b) You are using an old version / build of Zoom. Click Help->About in the Indexer to confirm. Latest version here
    c) You have modified the search script (search.asp) in some way. Please let us know if this is the case.
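
For possibility (a), one rough way to spot index files left over from different indexing sessions is to compare their modification times. The Python sketch below is only a heuristic and makes assumptions: the function name and the default file patterns are made up for this example, and some upload tools reset timestamps, in which case the result means nothing.

```python
from pathlib import Path


def mtime_spread_seconds(index_dir, patterns=("*.zdat", "settings.asp")):
    """Return the gap in seconds between the oldest and newest matching file.

    Assumption: files written by a single indexing session have
    modification times that are close together.
    """
    times = [
        path.stat().st_mtime
        for pattern in patterns
        for path in Path(index_dir).glob(pattern)
    ]
    if not times:
        raise FileNotFoundError(f"no index files found in {index_dir}")
    return max(times) - min(times)
```

Calling `mtime_spread_seconds` on the folder that holds the index files and getting a gap of hours or days would suggest that not all files were replaced after the last indexing run.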
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine



    • #3
      Hi, thanks for the answer

      No, the page is not online; it's on an intranet.
      Yes, we need to change the software, but we can't change it for the moment.

      I don't know why, but my message changed; maybe the character is not supported by the forum. I have attached a screenshot of the results page.

      I checked the file "zoom_pagedata.zdat" and the name is already corrupted in that file, but only when I index in spider mode. I tried indexing the same PDF file in "Offline mode" and the result is good.
      I have attached the "zoom_pagedata.zdat" file too.

      I'm using the latest version of Zoom, and yes, I modified the search script. I removed the lines where "Response.codePage" is used (lines 451, 452, 456, 466, 467, 471).

      Is it possible that this is due to our server? I haven't tried on another one yet.








      • #4
        Open the PDF file in Acrobat Reader and check the Document Properties and the actual Title that is stored there. Is there a title specified, that matches the filename?

        It is possible your PDF file was created with the title (that matches the filename) but with an unusual character as part of it for some reason.
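
If the title can be copied out of the Document Properties dialog, a short Python sketch like the one below can reveal characters that are invisible or easy to miss. The function name and the sample title are hypothetical; the sample contains a non-breaking space purely for illustration and is not taken from the actual PDF.

```python
import unicodedata


def find_unusual_chars(title: str):
    """List characters outside printable ASCII as (index, code point, name)."""
    hits = []
    for index, ch in enumerate(title):
        if not (" " <= ch <= "~"):
            name = unicodedata.name(ch, "UNNAMED")
            hits.append((index, f"U+{ord(ch):04X}", name))
    return hits


# Hypothetical title with a non-breaking space between the two words.
sample = "Search\u00a0Engine"
for hit in find_unusual_chars(sample):
    print(hit)  # (6, 'U+00A0', 'NO-BREAK SPACE')
```

Accented letters in a French title will also be listed, which is expected; the entries worth a closer look are control characters, unusual spaces, or anything reported as UNNAMED.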

        None of this can be confirmed on our end without seeing the PDF file, the search page, etc.

        We can't support any script that has been modified in any way, let alone one that has been hacked to work with IIS 5.0, which is no longer supported. If you have removed any lines, you need to make sure the surrounding lines of code have not been broken as a side effect.

        If you are a registered user, we can take a courtesy look if you email us with your Invoice number, and include a ZIP file of your index files (search.asp, settings.asp, search_template.html and all .zdat files) AND also include the PDF file in question you are using. I can then confirm if the problem is obvious.
        --Ray
