-
Stemming now added to V6!
We can now confirm that V6 will feature STEMMING.
This is a much-requested feature. When it is enabled, search results will match similar words and words that are derived from each other (e.g. plurals). For example, searching for the word "fish" will return pages containing variants such as "fish", "fishes", "fishing", etc.
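To make the idea concrete, here is a minimal sketch in Python of how stem-based matching works in general. It is not Zoom's actual algorithm or code: the suffix list and the function names `naive_stem` and `page_matches` are made up for illustration, and real stemmers use far more rules than this.

```python
import re

# Hypothetical, deliberately tiny suffix list (an assumption for this sketch).
SUFFIXES = ("ing", "es", "s")

def naive_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least 3 characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def page_matches(query: str, page_text: str) -> bool:
    """Return True if any word on the page shares a stem with the query."""
    target = naive_stem(query)
    tokens = re.findall(r"[A-Za-z]+", page_text)
    return any(naive_stem(token) == target for token in tokens)

print(page_matches("fish", "He went fishing yesterday"))
print(page_matches("fish", "A dish of rice"))
```

With this toy rule set, "fish", "fishes" and "fishing" all reduce to the same stem, so a search for any one of them finds pages containing the others.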
Adding this feature required some significant changes to the index file format and the way we index and search words, but we are glad to see that the end results seem to be worth the effort.
Stemming will not be available for JavaScript. The PHP and ASP scripts will only support English stemming, while the CGI version features improved stemming and supports stemming in 16 languages.
The feature will be enabled by default in V6. However, you may want to turn it off if, for example, it is absolutely critical that your website differentiates between "booking", "booker", "book", etc.
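The trade-off can be sketched with a toy index in Python. This is only an illustration under stated assumptions, not how Zoom builds its index: `toy_stem`, `build_index`, the `use_stemming` flag and the page names are all hypothetical.

```python
from collections import defaultdict

def toy_stem(word: str) -> str:
    # Hypothetical rule set for illustration only; keeps at least 4 characters.
    for suffix in ("ing", "er", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word

def build_index(pages: dict, use_stemming: bool = True) -> dict:
    """Map each index key (stem or exact word) to the set of pages containing it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for token in text.lower().split():
            key = toy_stem(token) if use_stemming else token
            index[key].add(name)
    return index

# Made-up pages, one word each, to keep the example small.
pages = {"hotel.html": "booking", "agent.html": "booker", "library.html": "book"}

stemmed = build_index(pages, use_stemming=True)
exact = build_index(pages, use_stemming=False)

print(sorted(stemmed["book"]))  # all three pages share the key "book"
print(sorted(exact["book"]))    # only the page with the exact word
```

With stemming on, the three words collapse into a single key, so a search for "book" returns every page. With it off, each word keeps its own key and the pages stay distinct.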
More information on V6 here.
-
Stemming and single-case languages
I notice that stemming is disabled when "support for single-case languages (ie asian)" is enabled.
Is this intentional? I can't use both?
Thanks
-
The stemming algorithm is very language dependent. It doesn't make sense for most Asian languages, where there are no linguistic concepts such as plurals or verb tenses.
-
I would like to know which 16 languages stemming works for. I couldn't find the list in the article. I think it's a great feature.
-
They are listed in the languages window in the Zoom configuration. (You need to select the CGI script option first, however.)
-
Thank you
-
Hi, please tell us, does V6 support the Russian language?
-
Russian is supported with a few minor exceptions. See:
http://www.wrensoft.com/zoom/support/languages.html
For Russian stemming you need to use the CGI option.
-
Does that mean that the stemming function does not work for Chinese as well?
-
There is no stemming functionality for Chinese. Linguistically, I don't see how that would work either. There are no plural or singular forms of words, nor is there a present or past tense, in Chinese and most Asian languages that we are aware of.