Zoom Search Engine FAQ - Plugins
Q. Does Zoom (with
plugins) index ALL the words inside the PDF and DOC documents?
Yes. Zoom converts these files to plain text and indexes
all words found in the entire PDF or DOC document. However, images,
diagrams, graphs, etc. will not be indexed.
Q. How do I specify my
own titles and descriptions for PDF and DOC files?
As many external binary documents do not contain useful
title and description information, Zoom allows you to specify custom
meta information for any plugin-supported file.
This option can
be enabled for each individual plugin. On the "Scan Options" tab of the Configuration window, double-click a plugin-supported file extension and check the option to "Use description (.desc) files". Once it is enabled,
the indexer will look for files ending with the ".desc"
extension for this file type.
For example, if you have a file called "mydocument.doc",
you can create a text file called "mydocument.doc.desc"
in the same directory with the following contents:
<title>This is my document custom title</title>
<meta name="description" content="This is my document's custom description">
Zoom will then index the words found within "mydocument.doc",
but use the title and description information found in "mydocument.doc.desc"
- so that you will see your custom title and description in your
search results.
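If you have many documents, the .desc files can be generated by a script rather than typed by hand. The following is a minimal Python sketch of that idea, not part of Zoom itself. The function name `write_desc_file` is hypothetical, and the UTF-8 encoding is an assumption (use whichever encoding your Zoom configuration expects). Because it is not stated here how Zoom treats special characters inside a .desc file, the sketch refuses input that would break the tags instead of guessing at an escaping scheme.

```python
from pathlib import Path


def write_desc_file(doc_path, title, description):
    """Write '<doc_path>.desc' next to the document and return its path.

    Hypothetical helper: it only reproduces the two-line layout shown in
    the example above (a <title> tag and a description <meta> tag).
    """
    doc = Path(doc_path)
    # mydocument.doc -> mydocument.doc.desc (same directory)
    desc_path = doc.with_name(doc.name + ".desc")

    # Refuse characters that would terminate the tag or attribute early.
    if "<" in title or ">" in title:
        raise ValueError("title must not contain '<' or '>'")
    if '"' in description:
        raise ValueError("description must not contain double quotes")

    lines = [
        f"<title>{title}</title>",
        f'<meta name="description" content="{description}">',
    ]
    # Assumption: UTF-8 is acceptable to the indexer.
    desc_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return desc_path
```

For example, `write_desc_file("mydocument.doc", "This is my document custom title", "This is my document's custom description")` creates "mydocument.doc.desc" with the same two lines shown above.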
In Spider Mode, you will have to upload these .desc
files to your web site, alongside the files you are indexing. If
the Indexer has trouble finding the .desc files
on your web server (and you are sure you have uploaded them), read
the following question.
Q. Why are my .desc
files not being found by the Indexer?
If the Zoom Indexer is unable to find your .desc files
in Spider Mode, check with your host that your web server allows
files with the ".desc" extension to be served. Although
they are simply text files, some web servers have extra security
restrictions in place which refuse access to any file with an unknown
extension.
On Windows web servers running IIS, this setting can
be found in the IIS Control Panel, under website properties. Follow
these instructions:
- Select the site to configure in IIS, right click and select
"Properties"
- Under HTTP Headers Tab, select "File Types" under
the MIME Map section and select "New Type"
- Type ".desc" as the associated extension and "text/html"
as the content type, and select "OK".
More details are available on the Microsoft TechNet site.
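To confirm that your web server actually serves a .desc file, you can request it yourself before re-indexing. The Python sketch below is one way to do that with the standard library. The function names `desc_url_for` and `desc_is_served` are hypothetical, and the sketch assumes the .desc file sits at the document's URL with ".desc" appended, as in the "mydocument.doc.desc" example. It is a diagnostic aid only and does not reproduce how the Zoom Indexer itself makes requests.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def desc_url_for(doc_url):
    """Return the URL where the .desc file for doc_url is assumed to live."""
    parts = urlsplit(doc_url)
    # Append ".desc" to the path only; drop any query string or fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path + ".desc", "", ""))


def desc_is_served(doc_url, timeout=10):
    """Return True if the server answers the .desc URL with HTTP 200.

    Connection problems (DNS failure, refused connection) are not caught
    here and will raise urllib.error.URLError.
    """
    request = urllib.request.Request(desc_url_for(doc_url), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # e.g. 403 or 404 when the extension is blocked or has no MIME mapping
        return False
```

If `desc_is_served` returns False for a file you know you have uploaded, the server is probably refusing the ".desc" extension, and the MIME type steps above are the place to look.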
Q. Why are some of my
PDF files failing to index with a "PDF plugin error"?
There are some limitations with indexing PDF files.
If you find that the plugin is failing to index some of your PDF
files, it may be because of one of the following:
- The file is not a valid PDF document. Try opening the file in
Acrobat Reader to confirm.
- The file may have Acrobat Security settings enabled, which prevents
content from being extracted or copied. You can confirm this by
opening the PDF file in Adobe Acrobat, and clicking on File
-> Document Security -> Display Settings. If so, you
can either specify the password that was used to encrypt these files (by double clicking the PDF extension on the "Scan Options" tab of the Configuration window) to allow Zoom to index them in their decrypted form, or you will have to remove the protection on these files via the "Document Security"
window in Adobe Acrobat. This setting can also be found in
the Security tab of Adobe Distiller Preferences.
- The file may not contain any textual content. For example, it
may have been created by scanning a physical document, which would
only store the document as an image. For more information, see
the following FAQ.
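The first two causes above can be screened for in bulk with a short script. The following Python sketch is a rough heuristic, not the check the PDF plugin performs: it looks for the "%PDF-" header that PDF files start with, and for an "/Encrypt" entry, which encrypted PDF files carry in their trailer. The function name `diagnose_pdf` is hypothetical. A literal "/Encrypt" elsewhere in a file could cause a false positive, and the third cause (no textual content) is not detected here.

```python
def diagnose_pdf(path):
    """Return a list of likely reasons a PDF could fail to index.

    Heuristic only: an empty list means neither check fired, not that
    the file is guaranteed to index.
    """
    with open(path, "rb") as f:
        data = f.read()

    problems = []
    # A PDF file carries a "%PDF-" header at (or very near) its start.
    if b"%PDF-" not in data[:1024]:
        problems.append("not a valid PDF (missing %PDF- header)")
        return problems

    # Encrypted PDFs reference an encryption dictionary via /Encrypt.
    if b"/Encrypt" in data:
        problems.append("security settings enabled (/Encrypt found)")
    return problems
```

Files flagged as having security settings enabled are the ones to check in Adobe Acrobat as described above, or to index by supplying the password on the "Scan Options" tab.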
Q. Why can't I find words
from my scanned PDF files? (PDFs created from scanning in physical
documents)
When you scan a physical (paper) document with
a scanner, the page is captured as an image. PDF files created this
way contain images rather than actual text. Effectively, this is
similar to taking a photo of your document as opposed to typing
it up. If you open your PDF file in Adobe Acrobat Reader
and click on the Text Selection tool, you will notice that you
cannot select or copy the text, for the same reason.
However, if you create PDF files from Word, or use OCR software
to create your PDF file, the content is stored as proper text, and
Zoom can index it without problems.
Adobe provides the Paper
Capture online service to convert PDF image files to searchable
PDF documents. There is also a Paper
Capture Plug-In which you can install for Adobe Acrobat to do
the same thing. The more advanced Acrobat
Capture software allows you to convert large volumes of PDF
files at once.
Q. Why are some of my
DOC files failing to index?
Q. I get the error message "Error
processing DOC file or unable to write to Zoom folder"
Check that the DOC file is a valid Word document. Note that the
fact that the file loads in Microsoft Word is not indication enough
that it is actually a Word document. You must also click on
"File" -> "Properties" -> "General",
and look for the document type listed. It must say "Microsoft
Word Document".
A common problem is that some users may have RTF files which have
been mistakenly renamed to a ".doc" file extension at
some point. While Microsoft Office appears to load such a file successfully
regardless of the filename, it actually detects the
format internally and opens it as an RTF file, without telling the
user. If you open the file in Word, follow the above instructions,
and see "Rich Text Format Document" listed as the document
type instead of "Microsoft Word Document", then this is
the case.
We would recommend renaming these files back to their rightful
".rtf" extension. You can then install and enable the
RTF plugin to index these files appropriately.
Alternatively, you can save these files in a proper DOC file format,
by loading them up in Word, and selecting "File" ->
"Save as" -> and under "Save as type:", select
"Word document (*.doc)". They will then be indexed by
the DOC plugin successfully.
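If you have many ".doc" files to check, opening each one in Word is tedious. The Python sketch below looks at the first bytes of a file instead: RTF files begin with the characters "{\rtf", while binary Word documents are stored in the OLE2 compound file container, which begins with a fixed 8-byte signature. The function name `sniff_word_format` is hypothetical. Note that the OLE2 signature is shared by other Office formats (such as XLS), so a result of "ole2" only shows that the file is not RTF. It does not prove the file is a Word document.

```python
# Signature of the OLE2 compound file container used by binary .doc files.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# RTF files start with this literal text.
RTF_MAGIC = b"{\\rtf"


def sniff_word_format(path):
    """Return 'rtf', 'ole2', or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(RTF_MAGIC):
        return "rtf"      # renamed RTF file: use the ".rtf" extension
    if head == OLE2_MAGIC:
        return "ole2"     # container used by binary Word documents
    return "unknown"
```

Any ".doc" file for which this returns "rtf" is a candidate for renaming or re-saving as described above.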
Q. Why are some of my XLS files failing
to index?
There are some limitations with indexing XLS files.
If you find that the plugin is failing to index some of your XLS
files, it may be because of one of the following:
- The file is not a valid XLS document. Try opening the file in
Microsoft Excel to confirm.
- The XLS file may have been created in an obsolete Excel file format
that is not supported (e.g. prior to Excel 95). You can check your
file by opening it in Excel, clicking on "File" ->
"Save as" and looking at the file type selected in the
"Save file type as:" drop-down box. If this is the case,
you can convert the file to a newer format by selecting a different
file type and clicking "Save". You will then be able
to index the XLS file successfully.
- The XLS file may contain password protected worksheets or workbooks.
The XLS plugin currently does not support any XLS files with password
protection.
Q. What can Zoom index from my AutoCAD DWF files?
Using the DWF plugin, Zoom can extract and index all meta information within a DWF file, in addition to all the properties, layers, model attributes such as part numbers, description, comments, mass/weight, and anything that is specified as a property.
However, vector-based text created within the Canvas Pane cannot be searched. This is because the "text" here is not actually constructed as text data: it is stored within the DWF file as vector shapes. This is also why, as far as we know, AutoCAD itself does not offer a "Search Text" function for this type of content. It would essentially require OCR (Optical Character Recognition) processing to identify the text and store it separately. Unfortunately, some DWF files are largely made up of vector-based content and lack actual text properties or content. These DWF files are difficult to search, and are akin to a PDF file containing nothing but a scanned image of a paper document. In such cases, you can create custom .desc files with a meta description and keywords to supply the additional information needed to make your DWF files more searchable.
Q. I am using Vista, and when Zoom indexes a plugin supported file, a security warning appears: "The publisher could not be verified. Are you sure you want to run this software?"
Windows Vista expects all executables downloaded from the Internet to be signed so that the publisher can be verified. Unfortunately, some of our plugins are developed by third parties who do not sign their executables, and we are unable to sign them on their behalf to avoid this security warning. In addition, Vista still prompts you (with a different security message) when it runs an application that is signed and verified. As such, this security warning is normal, and you should clear the checkbox labelled "Always ask before opening this file" so that you are not prompted again.
All software downloaded from www.wrensoft.com is guaranteed to be free of spyware, viruses, malware, or adware.