Primarily MHT intranet



FD929
03-20-2008, 09:46 PM
Hello support, I have a couple problems. Let me explain the environment a bit.

I'm building an intranet based primarily on MHT files. Not because I particularly like to make my life more difficult than it already is, but because MHT is actually the most logical solution for this conversion: the files also need to remain easy to modify occasionally by 'less than savvy' employees, so Excel is the way to go. Anyway, I have a client that had about 200 xls workflow files that they wanted converted into a functional 'site'. Understandable, because how can 200 Excel files in a server directory be 'functional'? Certainly not easy to navigate. Again, I digress. The engine indexes these files beautifully. However, since the mht files have no <title>, leaving the title option on gives me 'no title' for every document. Not only that, but the files are named by their title (e.g. New Productivity Tracking Workflow.mht) with lots of spaces, so turning off the title option in the configuration gives me %20 in the links. The search page ends up looking like this:
1. NEW%20WORKFLOWS/Red%20Team%20-%20Business%20Support/Compliance%20Workflows/Compliance%20Tools/quality%20control%20disclosures.doc
... interview: Complete 1003 and disclosures and send within three days of interview for signature and ... . Applicants received via the internet or email: Complete 1003 and disclosures and send within ...
Terms matched: 2 - Score: 240 - 13 Feb 2008 - URL: NEW%20WORKFLOWS/Red%20Team%20-%20Business%20Support/Compliance%20Workflows/Compliance%20Tools/quality%20control%20disclosures.doc
2. NEW%20WORKFLOWS/Red%20Team%20-%20Business%20Support/Human%20Resources/LF%20NEW%20HIRE%20DOCUMENTS/new%20hire%20packet.pdf
... Employees should expect that communications that they send and receive by the corporation's private e-mail system ... This is not the intention of this email. However, this email is to make ...
Terms matched: 2 - Score: 150 - 4 Jan 2007 - URL: NEW%20WORKFLOWS/Red%20Team%20-%20Business%20Support/Human%20Resources/LF%20NEW%20HIRE%20DOCUMENTS/new%20hire%20packet.pdf
3. NEW%20WORKFLOWS/Green%20Team%20-%20Business%20Operations/Loan%20Scenario%20and%20Lead%20Retention%20Workflows/Manage%20Good%20Faith%20Estimates.mht
... Faith Estimate- Broker. Click the Send button. Click the option do not protect ... task bar, drag over= the Email option, click on Forms. A form ...
Terms matched: 2 - Score: 48 - 18 Feb 2008 - URL: NEW%20WORKFLOWS/Green%20Team%20-%20Business%20Operations/Loan%20Scenario%20and%20Lead%20Retention%20Workflows/Manage%20Good%20Faith%20Estimates.mht

This just doesn't work. I need a way to use the filename as the document title and also hide the URL character codes (the %20s). Is this at all possible?

Manually adding a proper <title> to the mht files won't work either, because once a file is edited again in Excel, the title gets removed.

Any ideas?
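
For the second half of the question (readable names instead of %20), a minimal sketch of the idea follows. It assumes the result links can be post-processed outside the search engine; it is not a Zoom feature, and the function name `title_from_url` is hypothetical. It simply percent-decodes the URL and uses the file name without its extension as a display title.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote


def title_from_url(url: str) -> str:
    """Turn a percent-encoded relative URL into a human-readable title.

    Hypothetical helper: decodes %20 and friends, then keeps only the
    file name without its extension.
    """
    decoded = unquote(url)              # "%20" becomes " "
    return PurePosixPath(decoded).stem  # drop directories and ".mht"


example = (
    "NEW%20WORKFLOWS/Green%20Team%20-%20Business%20Operations/"
    "Manage%20Good%20Faith%20Estimates.mht"
)
print(title_from_url(example))  # Manage Good Faith Estimates
```

The link target itself should stay percent-encoded; only the text shown to the user needs decoding.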

wrensoft
03-20-2008, 10:18 PM
MHT files are just HTML files with the images & CSS embedded inside the file.

They are only supported by Internet Explorer. So the 1 in 4 people using other browsers can't use an MHT based web site.

In addition, MHT files are not easy to edit. So using MHT because it is "easily modified" doesn't make sense.

Nor does MHT do anything to improve the navigation of a web site.

In short, it seems like a very strange solution for building a web site, one that will surely make your life more difficult than it needs to be.

I would suggest:
1) Just putting the Excel files on your web site. Or, if you want them to be read only, converting them to PDF files.
2) Adding a directory listing script to list out the available files. Or making a nice home page with links to all your documents.
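
As a rough illustration of option 2, here is a minimal sketch of a directory listing script. It is an assumption-laden example, not something shipped with Zoom: the function name `build_listing` and the list of extensions are hypothetical. It walks a folder, percent-encodes each relative path for the link, and shows the file name without its extension as the link text.

```python
import html
import os
from urllib.parse import quote


def build_listing(root, extensions=(".xls", ".pdf", ".doc", ".mht")):
    """Return an HTML <ul> linking to every matching file under root."""
    items = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit sub-folders in a stable order
        for name in sorted(filenames):
            if not name.lower().endswith(extensions):
                continue
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            # Encode spaces etc. in the link, but keep "/" as separator.
            href = quote(rel.replace(os.sep, "/"))
            # Show the readable file name (no extension) to the user.
            label = html.escape(os.path.splitext(name)[0])
            items.append(f'<li><a href="{href}">{label}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Calling `build_listing("NEW WORKFLOWS")` and writing the result into a page in the parent folder would give a basic, regenerable index of the documents.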

But if you really insist on using MHT files, you can set the page title from within Excel.
http://www.wrensoft.com/images/forumimages/Excel_MHT_Title.png

FD929
03-21-2008, 03:12 AM
But since it's a company intranet, the clients will all be IE, so we have no issue there. Besides, Firefox displays mht just as well as IE. As I said, it's easiest for them to continue to be able to edit the files in Excel, so mht is the way to go for this particular situation. I'm aware how much of a pain it is, but I'm pressed for time and it is only temporary, since eventually I'll be incorporating this all into a database. That will take a lot longer, however.

Anyway, I see you are using Vista. I am not, and I don't have the options you show. I'm using Excel 2007. How would I go about editing the title? Looks like the summary tab on the document properties? Will that show correctly in the search results?

wrensoft
03-21-2008, 09:28 AM
Firefox does not support MHT files. It just brings up a window asking if you want to view the file in IE.

Yes, you can set the meta title of an Excel file from the properties window in Excel. The screen shot above was from Excel 2007 (the version you have).

FD929
03-21-2008, 06:01 PM
Must be the IE plugin ... looks like it is rendered in IE, never noticed that.
http://i183.photobucket.com/albums/x308/FD929/scuseme.png

Thank you for the help, I'll give it a try Monday.

wrensoft
03-21-2008, 10:06 PM
There is a plug-in for Firefox that adds support, but most people don't have it (and it wouldn't be too surprising if the plug-in uses IE in the background). MHT files certainly don't load in our basic copy of Firefox.

FD929
03-21-2008, 11:46 PM
Yes yes, it's the IE plugin. Look at the tab at the bottom of the Google tab, it's the IE icon. It's still OK though, since it's for a work intranet and all users are running IE. Thanks again for your help, I'll give it a go Monday.

FD929
03-27-2008, 07:24 PM
The title option didn't work. Any other ideas?

MergeThis
03-27-2008, 07:46 PM
Ray, would the Excel plug-in work on the MHT files? Your help only lists the XLS extension.

If he can use the plug-in, then he could create DESC files to configure his titles. Hmmmm???


Leon

FD929
03-27-2008, 08:06 PM
Also, the description is showing HTML code. Sorry to be such a pain, but considering your engine is the only one I've found that can index mht, consider me your guinea pig. ;)

wrensoft
03-27-2008, 08:28 PM
An MHT file is mostly HTML. So the Excel plug-in will not work after you have converted the XLS file to MHT.

In the testing I did, Zoom V5.1 correctly picked up the title from the MHT file (which was made from Excel 2007).

Actually, we don't fully support MHT as yet. It is the embedded images in MHT files that Zoom V5 will have a problem with. In V6, MHT support will be better.

But V6 will not solve this problem, as it already works as far as I can see in V5. Zoom found the title "Excel Test Document" as specified in the screen shot above. Admittedly this was a simple example file and only tested on Vista with Excel 2007.

FD929
03-27-2008, 08:32 PM
Still showing up as 'no title' for me after editing the title in the summary tab of the document properties and reindexing the directory. :confused:

wrensoft
03-27-2008, 10:25 PM
Did you look inside your MHT file with a text editor to see if there is a <title> tag? There should be. Do you have the option of indexing titles turned on in Zoom (on the indexing options tab of the Zoom configuration window)?
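
With a couple of hundred files, checking for a <title> tag by hand gets tedious. The sketch below is one possible way to automate the check, under the assumption that the MHT file is a standard MIME multipart message whose HTML parts can be read with Python's `email` package. The function name `mht_title` and the sample document are made up for illustration.

```python
import email
import re
from email import policy

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def mht_title(raw: bytes):
    """Return the first <title> found in any HTML part, or None."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() != "text/html":
            continue
        # get_content() undoes the transfer encoding (e.g. quoted-printable)
        # and decodes the bytes using the part's declared charset.
        match = TITLE_RE.search(part.get_content())
        if match:
            return match.group(1).strip()
    return None


# Tiny hand-made MHT document, used only to exercise the function.
sample = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/related; boundary="b1"\r\n'
    b"\r\n"
    b"--b1\r\n"
    b'Content-Type: text/html; charset="us-ascii"\r\n'
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"<html><head><title>Excel Test Docu=\r\n"
    b"ment</title></head><body></body></html>\r\n"
    b"--b1--\r\n"
)
print(mht_title(sample))  # Excel Test Document
```

To check a real file, read it in binary mode and pass the bytes to `mht_title`; a result of `None` means no HTML part contained a title tag.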

FD929
04-01-2008, 11:23 PM
Yes of course. The only thing that shows 'title' in the code is:
<v:imagedata src=3D"GeneralTaxAdviceWorkflow_files/image001.png" o:title=
=3D""/>

http://i183.photobucket.com/albums/x308/FD929/Noname1.png

http://i183.photobucket.com/albums/x308/FD929/Noname.png
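
The odd-looking `=3D` and the line ending in a bare `=` in the fragment above are typical of quoted-printable encoding, which MHT files commonly use for their HTML parts: `=3D` is an encoded `=` sign and a trailing `=` is a soft line break. A small sketch, using Python's standard `quopri` module on that same fragment, shows what the markup looks like once decoded:

```python
import quopri

# The fragment from the MHT file, including the soft line break ("=\n").
raw = (
    b'<v:imagedata src=3D"GeneralTaxAdviceWorkflow_files/image001.png" o:title=\n'
    b'=3D""/>'
)

decoded = quopri.decodestring(raw).decode("ascii")
print(decoded)
# <v:imagedata src="GeneralTaxAdviceWorkflow_files/image001.png" o:title=""/>
```

Decoded, this is an `o:title` attribute on an image element with an empty value, not a page `<title>` tag.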

FD929
04-01-2008, 11:45 PM
Looks like the only thing working is manually entering the <title>. Well this just got a bit more time consuming. :(

Ray
04-02-2008, 06:23 AM
The "Properties"->"Summary" tab in Windows does not specify the title for the MHT file. This is merely summary information that Windows keeps in its file system. It is fairly useless, and Microsoft removed it in Vista from what I can tell.

If that bit of code was the only thing resembling a title, then your file above does not have one (that was not the title tag).

You can specify the title for MHT files in Excel 2007 in XP. I have just confirmed this on an XP machine. Click on 'File'->'Save As'. Select "MHT" for "Save as type:". The page title line will appear below it and a "Change Title" button will be available. Click on the "Change Title" button and specify your title there. Click "OK" and then "Save".

http://www.wrensoft.com/temp/excel_mht_title.jpg

FD929
04-04-2008, 09:07 PM
Awesome, completely missed that. Thanks Ray, they've all been updated (in Dreamweaver, but at least now I know).

Another funky MHT issue: it would be good to be able to set [= ] (minus the brackets) as a word joiner under Indexing options - Indexing word rules. For example, the indexed text currently comes out like this: Po= sition Title: CEO Budget: N /A Prepared: April 2008 Re= ports to: ...

Anything I can do about HTML showing up in the description?
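
Both symptoms (words split by "= " and markup in the description) are consistent with the raw quoted-printable HTML being read as plain text. If preprocessing the files outside the indexer is an option, a sketch along these lines could produce clean text. It is only an illustration: `TextExtractor`, `plain_text`, the default charset, and the sample cell markup are all assumptions, not part of Zoom.

```python
import quopri
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring tags and <script>/<style> content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def plain_text(qp_html: bytes, charset: str = "windows-1252") -> str:
    """Decode quoted-printable HTML, then strip the markup."""
    decoded = quopri.decodestring(qp_html).decode(charset, errors="replace")
    parser = TextExtractor()
    parser.feed(decoded)
    parser.close()
    return " ".join(parser.chunks)


# Made-up cells imitating the soft line breaks seen in the indexed text.
sample = b"<td class=3Dxl24>Po=\nsition Title: CEO</td><td>Re=\nports to:</td>"
print(plain_text(sample))  # Position Title: CEO Reports to:
```

Decoding first rejoins the split words, which is why the tag stripping is applied only afterwards.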

Ray
04-07-2008, 12:14 AM
Version 6 of Zoom will feature an MHT plug-in which allows you to index MHT files properly. The current version (V5) does not officially recognize MHT files; it just treats them as an unrecognized plain text extension, so searching them is less than perfect.

FD929
04-07-2008, 04:50 PM
Fair enough. Thanks for all your help (all of you)! I'm content with the setup at this point and am looking forward to the next release!