I have been working with the weighting options, trying to understand how the scores are calculated. One search that we use as a benchmark is for "history of us", which is the title of a document in our collection. It shows up around 12th in the Zoom search results. We also have setups using Google search and Bing, and in each of them it comes up as number 1, as I think it should. I am not sure what I need to change in the weighting options to get better results.
1. I do know that "of" and "us" are removed because of the skip word list.
2. Search results for: history of us
The following word(s) are in the skip word list and have been omitted from your search: "of", "us"
762 results found containing all search terms.
77 pages of results.
Press Release: State Legislatures Make History
Terms matched: 1 - Score: 330 - 06 Nov 2008 - 53k - URL: http://www.ncsl.org/default.aspx?tabid=17288
.... 12 results later......
SL Magazine: This History of Us
Terms matched: 1 - Score: 156 - 11 Jun 1999 - 107k - URL: http://www.ncsl.org/default.aspx?tabid=17667
3. Why a score of 330 vs. 156 for these two? Both documents seem to contain "history" 7 times.
The reasons would be:
1.) Because the words "of" and "us" are skipped, you are essentially just searching for a single word: "history". To change this behaviour, remove these words from your skip list or change the setting for "Skip words less than x characters" to 1.
2.) You are not searching for the phrase but just individual words. If you enclose the phrase in quotation marks "history of us", the page you expect would be the first result.
3.) The default weightings give preference to smaller pages over larger pages (on the assumption that in large pages, such as PDF documents, a word may appear frequently without being any more significant). Your result #1 is a much shorter page (53k) than result #12 (107k). To alter this behaviour, change the setting under "Configure"->"Weightings"->"Content density" from "Standard adjustment" (or "Strong adjustment") to "No adjustment".
4.) It is also possible that this is affected by the "Word position" adjustment (since on the shorter page, the occurrences of "history" are much closer together than on the longer page), so change this setting as well to see how it affects your results.
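To make points 1 and 3 more concrete, here is a small Python sketch. It is only a toy illustration and not Zoom's actual implementation: the function names `effective_terms` and `toy_density_score` are made up for this post, and the "matches per kilobyte" formula is an assumption chosen purely to show why a shorter page can outrank a longer one with the same number of matches.

```python
# Toy illustration only: this is NOT Zoom's real scoring code.
# All names and the density formula below are hypothetical.

# The two words reported as skipped in the search output above.
SKIP_WORDS = {"of", "us"}


def effective_terms(query, skip_words=SKIP_WORDS):
    """Return the query terms that remain after skip-word filtering."""
    terms = query.lower().split()
    return [t for t in terms if t not in skip_words]


def toy_density_score(occurrences, page_size_kb):
    """Hypothetical density-adjusted score: matches per kilobyte of page."""
    return occurrences / page_size_kb


# With the skip list in place, only one word is actually searched for.
print(effective_terms("history of us"))                    # ['history']

# With an empty skip list, all three words are kept.
print(effective_terms("history of us", skip_words=set()))  # ['history', 'of', 'us']

# Same number of matches (7), but the shorter page is "denser".
short_page = toy_density_score(7, 53)    # result #1, 53k
long_page = toy_density_score(7, 107)    # result #12, 107k
print(short_page > long_page)            # True
```

Under this assumed model, setting "Content density" to "No adjustment" would correspond to scoring on the match count alone, in which case the two pages would tie on this factor.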
More information can be found in this FAQ:
Q. How do I make some pages appear higher up in my search results? How does Zoom's page score system work?