How to get the total number of pages indexed


quebecostarica
11-08-2008, 06:26 PM
I have nearly 600 PDF files all indexed with Zoom.

Here is the output of the indexing:


20:46:16 - INDEX SUMMARY
20:46:16 - Files indexed: 589
20:46:16 - Files skipped: 66
20:46:16 - Files filtered: 0
20:46:16 - Files downloaded: 0
20:46:16 - Unique words found: 109976
20:46:16 - Total words found: 2880916
20:46:16 - Avg. unique words per page: 186.72
20:46:16 - Avg. words per page: 4891

I would like to know how many pages in total (across the 589 documents) have been indexed.

Is there a way to get that info?

Thanks

Roger Pilon
ponics.org

wrensoft
11-08-2008, 09:31 PM
The information about pagination is lost once the text is extracted from a PDF. So there is no way to know if the PDF files had 589 pages or 5890 pages in total from looking at the Zoom index files.

You could, however, make a rough guess based on the number of words. If you assume an average of 250 words per page, then 2,880,916 words would be about 11,524 pages of text in your PDF files.
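
To see how much that guess moves with the assumed density, here is a small Python sketch of the same arithmetic. The function name `estimate_pages` and the list of words-per-page figures are only illustrative assumptions, not anything Zoom provides; the word total is the "Total words found" value from the index summary above. The result is rounded to the nearest page.

```python
def estimate_pages(total_words: int, words_per_page: int = 250) -> int:
    """Rough page estimate: total words divided by an assumed words-per-page figure."""
    if words_per_page <= 0:
        raise ValueError("words_per_page must be positive")
    return round(total_words / words_per_page)


# "Total words found" from the index summary in the first post.
total_words = 2_880_916

# Assumed densities (hypothetical): sparse, typical, and dense pages.
for wpp in (200, 250, 300, 500):
    print(f"{wpp} words/page -> about {estimate_pages(total_words, wpp):,} pages")
```

Note that the "Avg. words per page" line in the summary (4891) is simply the total words divided by the 589 files indexed, so it appears that "page" there means one indexed file rather than a page inside a PDF.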