Thread: problems with arabic diacritic marks

  1. #11
    Join Date
    Dec 2004
    Location
    Sydney, Australia
    Posts
    3,585

    We've confirmed that this is a bug in the current release [V6.0.1028]. It has been fixed in the V7 Alpha release.

    If you wish to apply the fix manually by editing the PHP script, then search for this line in "search.php":

    Code:
    $query = preg_replace("/[\s\(\)\^\[\]\|\{\}\%\£\!]+|[\-._',:&\/\\\](\s|$)/u", " ", $query);
    And replace it with this (the only change is that "\£" is removed from the first character class):

    Code:
    $query = preg_replace("/[\s\(\)\^\[\]\|\{\}\%\!]+|[\-._',:&\/\\\](\s|$)/u", " ", $query);
    Be very careful when editing the PHP script; we would not advise doing this if you are uncomfortable with PHP scripting.

    The "search.php" file in the output folder will be rewritten when you re-index. You can modify the source copy under "C:\ProgramData\Wrensoft\Zoom Search Engine Indexer\scripts\PHP or ASP\" instead, but modified scripts are difficult for us to support, as functionality may be broken by incorrect modifications. If you are uncomfortable with editing, use V7 Alpha.
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine

  2. #12
    Join Date
    Apr 2012
    Posts
    15

    Hi, thanks so much. I will try it and I hope everything goes right.
    Thank you.

  3. #13
    Join Date
    Apr 2012
    Posts
    15

    Hi guys, I have one more question: how can I get the search engine to match all of the following letters when I search for any one of them?
    "أ" alif with hamza above
    "إ" alif with hamza below
    "ا" plain alif
    "آ" alif with madda above

    The CHM search engine is able to find
    ( أ , ا, إ )
    in a single search. Is there any way I can do that with Zoom Search?
    Thanks in advance.

  4. #14
    Join Date
    Dec 2004
    Location
    Sydney
    Posts
    4,157

    While I haven't looked at this in detail, I would have assumed these characters would just work like any other character if you are using UTF-8 as the character set.

    Is there something special about these characters compared to all other Arabic characters?

  5. #15
    Join Date
    Apr 2012
    Posts
    15

    Quote: Originally posted by wrensoft
    While I haven't looked at this in detail, I would have assumed these characters would just work like any other character if you are using UTF-8 as the character set.

    Is there something special about these characters compared to all other Arabic characters?
    I use UTF-8 as my character encoding, and the characters I mentioned in the post above appear at the start of Arabic words, for example:
    أمي
    Even when I enable "Strip diacritics", the search still can't find similar words like
    امي
    إمي
    Sometimes when typing in Arabic we don't put the hamza on the alif (as in أ or إ); we simply type " ا " without the quotes. It would be nice to be able to find words that contain any one of these 3 characters, as in this example:

    search:
    ان
    results:
    ان + إن + أن

  6. #16
    Join Date
    Dec 2004
    Location
    Sydney
    Posts
    4,157

    OK I see, you are asking for the 3 types of alif character to be treated as the same character. So when you search for one of them, it matches the other 2 versions of the character. Correct?

    Like we do for French accents, é and e for example.
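
    A minimal sketch of the folding described here, not Zoom's actual implementation. It assumes the variants are the Unicode code points U+0623 (أ), U+0625 (إ) and U+0622 (آ), and that the query and the indexed word are both folded to plain alif (U+0627) before they are compared. The names foldAlif and matchesFolded are hypothetical.

```javascript
// Assumption: these three code points are the alif variants named in the thread.
// U+0622 آ (madda above), U+0623 أ (hamza above), U+0625 إ (hamza below)
const ALIF_VARIANTS = /[\u0622\u0623\u0625]/g;

// Replace every alif variant with plain alif, U+0627 (ا).
function foldAlif(text) {
  return text.replace(ALIF_VARIANTS, "\u0627");
}

// Hypothetical helper: compare a query and a word after folding both.
function matchesFolded(query, word) {
  return foldAlif(query) === foldAlif(word);
}

// The example from the thread: searching for ان should also find إن and أن.
const words = ["\u0627\u0646", "\u0625\u0646", "\u0623\u0646"];
const hits = words.filter((w) => matchesFolded("\u0627\u0646", w));
console.log(hits.length); // 3
```

    Folding both sides means it does not matter which of the forms the user types.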

  7. #17
    Join Date
    Apr 2012
    Posts
    15

    Quote: Originally posted by wrensoft
    OK I see, you are asking for the 3 types of alif character to be treated as the same character. So when you search for one of them, it matches the other 2 versions of the character. Correct?

    Like we do for French accents, é and e for example.
    Exactly. There is also a fourth type of alif character: آ (alif with madda).
    Last edited by mrbasserby; 11-30-2012 at 09:11 PM.

  8. #18
    Join Date
    Apr 2012
    Posts
    15

    Hi, I wonder if it's possible to highlight words that carry diacritics in the highlight.js file, like this example:

    http://jsfiddle.net/FUg85/15/

  9. #19
    Join Date
    Dec 2004
    Location
    Sydney, Australia
    Posts
    3,585

    We can probably add something like that into V7. However, we're not familiar with Arabic lettering so I'm not entirely sure how universal the above suggestion is. Did you write that bit of code yourself, or is it from someone else? Are you aware that it simply strips the following 5 characters:







    From the two strings being compared? Is that enough to fix all issues with diacritic marks in Arabic or are there other marks that are not addressed by this approach?
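
    For reference, the strip-then-compare approach can be sketched as below. The five characters listed in this post did not survive, so the sketch assumes the Unicode range U+064B to U+0652 (the three tanween forms, fatha, damma, kasra, shadda and sukun) as a stand-in. It is not the code from the linked fiddle, and the names stripMarks and equalIgnoringMarks are hypothetical.

```javascript
// Assumption: ignore the eight marks U+064B..U+0652 (tanween forms, fatha,
// damma, kasra, shadda, sukun). This range is a stand-in for the original
// five characters, which are not recoverable from the post.
const ARABIC_MARKS = /[\u064B-\u0652]/g;

// Remove the marks, leaving only the base letters.
function stripMarks(text) {
  return text.replace(ARABIC_MARKS, "");
}

// Hypothetical helper: compare two strings with the marks removed from both.
function equalIgnoringMarks(a, b) {
  return stripMarks(a) === stripMarks(b);
}

const vocalized = "\u0643\u064E\u062A\u064E\u0628\u064E"; // k-t-b with a fatha on each letter
const plain = "\u0643\u062A\u0628"; // the same word without marks
console.log(equalIgnoringMarks(vocalized, plain)); // true
```

    Whether this range is enough depends on which marks actually occur in the indexed text.
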
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine

  10. #20
    Join Date
    Apr 2012
    Posts
    15

    I found the script example above on the net, but it could serve as an example for the JavaScript that highlights words with diacritics in the highlight script file. These are the standard Arabic characters:



    أ alif with above hamza
    ب baa
    ت taa
    ث close to thaa
    ج jaa
    ح haa or 7aa
    خ khaa
    د daa
    ذ thaa
    ر raa
    ز zaa
    س saa
    ش shaa
    ص close to saa
    ض close to daa
    ط close to taa
    ظ close to thaa
    ع ayin or close to aaa
    غ close to khaa
    ف faa
    ق close to kaa
    ك kaa
    ل laa
    م maa
    ن naa
    هـ haa
    و waa
    ي yaa



    And these are the diacritics used with them. I attach each one to ( ـ ) as a placeholder for the Arabic letter:


    ( ـُ )
    damma

    ( ـَ )
    fatha

    ( ـِ )
    kasra

    ( ـٌ )
    tanween damma, or double damma

    ( ـً )
    tanween fatha, or double fatha

    ( ـٍ )
    tanween kasra, or double kasra

    ( ـْ )
    sukoon

    ( ـّ )
    shadda

    ( ـَّ ) = -ّ + -َ
    shadda with fatha

    ( ـُّ ) = -ّ + -ُ
    shadda with damma

    ( ـِّ ) = -ّ + -ِ
    shadda with kasra

    ( ـٌّ ) = -ّ + -ٌ
    shadda with tanween damma

    ( ـٍّ ) = -ّ + -ٍ
    shadda with tanween kasra
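
    One way the marks listed above could be handled when highlighting is to allow any run of them after each letter of the search word, so that a word typed without marks still matches (and wraps) the marked form in the page. This is a sketch only. It assumes the marks are U+064B to U+0652 plus the ( ـ ) character U+0640, and the function names and the "highlight" class are hypothetical, not taken from Zoom's highlight.js.

```javascript
// Assumption: optional marks are U+064B..U+0652 plus U+0640 ( ـ ).
const OPTIONAL_MARKS = "[\\u064B-\\u0652\\u0640]*";
const MARKS_GLOBAL = /[\u064B-\u0652\u0640]/g;

// Escape regex metacharacters so the query is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a pattern that matches the query letters with any marks after each one.
// Returns null for a query that contains no letters at all.
function buildHighlightPattern(query) {
  const bare = query.replace(MARKS_GLOBAL, "");
  if (bare.length === 0) return null;
  const body = Array.from(bare)
    .map((ch) => escapeRegExp(ch) + OPTIONAL_MARKS)
    .join("");
  return new RegExp(body, "g");
}

// Wrap every match, marks included, in a span (class name is hypothetical).
function highlight(text, query) {
  const pattern = buildHighlightPattern(query);
  if (pattern === null) return text;
  return text.replace(pattern, (m) => '<span class="highlight">' + m + "</span>");
}

const marked = "\u0643\u064E\u062A\u064E\u0628\u064E"; // k-t-b with a fatha on each letter
console.log(highlight(marked, "\u0643\u062A\u0628"));
```

    Unlike stripping the marks from the page text, this keeps the original text intact, so the highlighted span still shows the diacritics.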


    إ = ا + ء
    standalone character
    hamza under alif

    أ = ا + ء
    standalone character
    hamza above alif

    آ = ا + ~
    standalone character
    madda above alif

    لأ = ل + أ
    standalone character
    lam with hamza above alif

    لإ = ل + إ
    standalone character
    lam with hamza under alif

    لآ = ل + آ
    standalone character
    lam with madda above alif

    ( ؤ ) = و + ء
    standalone character
    hamza above waw

    ئ = ى + ء
    standalone character
    hamza above short alif

    ( ى ) short alif, a standalone character

    ( ء ) plain hamza, treated as a standalone character
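
    The letter-plus-hamza breakdowns above roughly correspond to Unicode canonical decomposition, which JavaScript exposes through String.prototype.normalize. The sketch below decomposes the text, drops the combining madda and hamza marks (U+0653 to U+0655), and recomposes. Two caveats: Unicode decomposes ئ to ي (U+064A) plus a combining hamza rather than to ى, and the standalone hamza ء (U+0621) is a letter in its own right and is left untouched. The names foldHamzaForms and codePoints are hypothetical.

```javascript
// Combining marks produced by decomposition: U+0653 madda above,
// U+0654 hamza above, U+0655 hamza below.
const COMBINING_HAMZA_MADDA = /[\u0653-\u0655]/g;

// Decompose, remove the hamza/madda marks, then recompose what is left.
// Note: normalization may also reorder other combining marks in the text.
function foldHamzaForms(text) {
  return text.normalize("NFD").replace(COMBINING_HAMZA_MADDA, "").normalize("NFC");
}

// Hypothetical helper: list the code points of a string for inspection.
function codePoints(s) {
  return Array.from(s, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  ).join(" ");
}

console.log(codePoints("\u0623".normalize("NFD"))); // U+0627 U+0654
console.log(codePoints("\u0624".normalize("NFD"))); // U+0648 U+0654
console.log(codePoints(foldHamzaForms("\u0625\u0646"))); // U+0627 U+0646
```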

    You could use Notepad to see these more clearly, and to understand how the characters sound you could use an Arabic text-to-speech program like this one:

    https://acapela-box.com/AcaBox/index.php

    If you need more information about how to type these on a keyboard, I'm glad to help. For images and more details, check the Wikipedia page:
    http://en.wikipedia.org/wiki/Arabic_diacritics
    Last edited by mrbasserby; 12-03-2012 at 03:23 PM.
