File priorities and creation dates



AndrewD
03-14-2009, 05:08 PM
We index a large collection of PDF files using .desc files, which contain carefully managed metadata. These files date from 1970 (actually earlier, but Zoom uses system routines that are not happy with the world before 1.1.1970) to today.

In order to give priority a) to recent documents and b) to key reports whatever their date, our default search settings are to present results 'by date' and to mark the key reports as 'recommended links'.

I would prefer to use 'by relevance' as the default order for presenting results, provided there was a good way to let Zoom know that newer documents are normally (but not always) more relevant than older ones.

As far as I can tell, Zoom does not use the 'Last modified date' meta record when determining relevance, so I'm adding 'ZOOMPAGEBOOST' meta records to the .desc files, with the boost value set to -1 to -5 according to how old the document is, or incremented by 2 if the document is also marked as a 'recommended link'.
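To make the scheme above concrete, here is a minimal sketch of how such a boost value could be computed. The age thresholds, the function names, and the exact form of the meta line are assumptions made for illustration; only the -1 to -5 range and the +2 for recommended links come from the description above.

```python
from datetime import date

# Hypothetical age bands: (upper age limit in years, base boost).
# The thresholds are illustrative only; tune them to the collection.
AGE_BANDS = [(2, -1), (5, -2), (10, -3), (20, -4)]
OLDEST_BOOST = -5        # anything older than the last band
RECOMMENDED_BONUS = 2    # added for 'recommended link' documents


def page_boost(doc_date, recommended, today):
    """Return a boost from -5 to +1 based on age and recommended status."""
    age_years = (today - doc_date).days / 365.25
    base = OLDEST_BOOST
    for max_age, boost in AGE_BANDS:
        if age_years < max_age:
            base = boost
            break
    if recommended:
        base += RECOMMENDED_BONUS
    return base


def boost_meta_line(boost):
    """Format the boost as a meta record (assumed syntax; check your .desc files)."""
    return '<meta name="ZOOMPAGEBOOST" content="%d">' % boost


today = date(2009, 3, 14)
print(boost_meta_line(page_boost(date(2008, 6, 1), False, today)))
print(boost_meta_line(page_boost(date(1975, 1, 1), True, today)))
```

Keeping the bands in one table makes it easy to regenerate every .desc file after changing a threshold, which suits the trial-and-error tuning.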

But this is all trial and error. I'm not sure how much effect different page boost values will have, or whether there's a better way (now, or possibly in a future version) to achieve the desired results. Any suggestions or comments?

wrensoft
03-14-2009, 11:47 PM
newer documents are normally (but not always) more relevant...

This is a hard requirement for a computer to deal with, as working out the correct ranking really requires some degree of intelligence and understanding of the document on the part of the algorithm.

Otherwise, your post is pretty much correct. The date is only used when sorting by date, and is not used when sorting by relevance.

This is something we might look at for a future release (letting dates influence sorting by relevance to some degree).

AndrewD
03-15-2009, 02:20 PM
This is a hard requirement for a computer to deal with

Indeed :)


This is something we might look at for a future release (letting dates influence sorting by relevance to some degree).

Thanks

A minor bug report (I think, at least in the CGI interface). The 'drill down' URLs that are provided after a search to 'refine your search by category' correctly preserve all the search parameters except the sort order. So, clicking on one of these links always gives you 'sorted by relevance'. Can this be fixed, please?
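For illustration, this is roughly the behaviour I would expect from a drill-down link: copy every parameter of the current search and change only the category. This is a sketch, not how Zoom builds its links; the parameter names `zoom_query`, `zoom_sort` and `zoom_cat` and the example URL are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def drill_down_url(current_url, category):
    """Build a category link that keeps all other search parameters."""
    parts = urlsplit(current_url)
    # Keep everything (including the sort order) except an old category filter.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "zoom_cat"  # assumed name of the category parameter
    ]
    params.append(("zoom_cat", category))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical search URL with a non-default sort order.
url = "http://example.com/cgi-bin/search.cgi?zoom_query=report&zoom_sort=1"
print(drill_down_url(url, "3"))
```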

Ray
03-17-2009, 12:00 AM
A minor bug report (I think, at least in the CGI interface). The 'drill down' URLs that are provided after a search to 'refine your search by category' correctly preserve all the search parameters except the sort order. So, clicking on one of these links always gives you 'sorted by relevance'. Can this be fixed, please?

We have confirmed that this is a bug in the CGI version. It will be fixed in the next build (V6.0.1012).