
Need to edit a VERY large zoom_pagetext.zdat file


  • Need to edit a VERY large zoom_pagetext.zdat file

    Anyone up for a challenge?

    I need to edit the zoom_pagetext.zdat file that was created when I indexed a very large site (almost 500,000 htm files).

    Here is why

    The site that I indexed is set up with about 34,000 folders, each containing 12 to 13 htm files. One of the htm files is default.htm, which is actually a frame page that displays the other htm files. When I ran Zoom to index the site, it did not index the default.htm page in each folder because there is no text in it.

    I want my users to always see the default.htm page rather than being able to select the other pages from the search engine.

    Normally I would just do a text replace with Notepad or WordPad, but even with 4 GB of RAM, I can't get either of them to open the file.

    Any ideas?

    I’m playing with some VB scripts but I’ve been running into memory issues with that too.

    Thanks and Merry Christmas!

  • #2
    For efficiency reasons (larger and faster searches) we changed the file layout in V5.

    One unfortunate consequence is that it is no longer possible to edit the zoom_pagedata.zdat file with a text editor. (I assume this is the file you are talking about, and not really the zoom_pagetext.zdat file?)

    But if you don't change the number of bytes in the URL, then it is still possible to do the search and replace. Changing the number of bytes will cause a corrupt index and a crash.
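
The same-length constraint described above can be sketched in code. This is a minimal Node.js sketch, not a PassMark tool: `replaceSameLength` is a hypothetical helper that overwrites matches in place, one chunk at a time, so the file never has to fit in memory, and it refuses any replacement whose byte length differs from the search string. It assumes the URLs are stored as plain single-byte text inside the file, which this thread does not confirm, so it should only be tried on a backup copy.

```javascript
const fs = require('fs');

// Hypothetical helper (not part of Zoom): overwrite every occurrence of `from`
// with `to` in place. The file size never changes because both strings must
// have the same byte length.
function replaceSameLength(path, from, to, chunkSize = 1 << 20) {
  const src = Buffer.from(from, 'latin1');
  const dst = Buffer.from(to, 'latin1');
  if (src.length === 0 || src.length !== dst.length) {
    throw new Error('search and replacement must have the same non-zero byte length');
  }
  const buf = Buffer.alloc(Math.max(chunkSize, src.length * 2));
  const fd = fs.openSync(path, 'r+');
  let pos = 0;       // file offset of the current chunk
  let resumeAt = 0;  // file offset just past the last replaced match
  let count = 0;
  try {
    for (;;) {
      const n = fs.readSync(fd, buf, 0, buf.length, pos);
      if (n < src.length) break; // too few bytes left to hold a match
      const view = buf.subarray(0, n);
      // Never search inside bytes that were already rewritten.
      let i = view.indexOf(src, Math.max(0, resumeAt - pos));
      while (i !== -1) {
        fs.writeSync(fd, dst, 0, dst.length, pos + i);
        count++;
        resumeAt = pos + i + src.length;
        i = view.indexOf(src, i + src.length);
      }
      // Step back by (length - 1) bytes so a match that straddles two
      // chunks is still seen in the next read.
      pos += n - (src.length - 1);
    }
  } finally {
    fs.closeSync(fd);
  }
  return count;
}
```

For example, `replaceSameLength('zoom_pagedata.zdat', 'http://dev.myserver.com/', 'http://www.myserver.com/')` would be accepted because both strings are 24 bytes long, while replacing `123.htm` with `default.htm` would be rejected because the lengths differ.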

    We know the lack of search and replace is a problem for a small number of users, and we will be putting out a point release (V5.1) that will have something like the search and replace option built into Zoom. That will fix the problem and remove the extra external step you previously needed.

    As we are coming into Christmas, we will not be able to fix this issue in the next couple of weeks.

    But maybe you are looking at the overall problem the wrong way. Altering the Zoom index will not prevent people from getting to the files. They can still type in the URL and see the files without a frame. A better solution might be to redirect people back to the top frame using JavaScript if they try to access a page they shouldn't.

    But to answer your actual question, UltraEdit V11 is a good text editor that will handle large files.



    • #3
      So was this fixed?

      I have to reindex the site, and I have multiple listings in the zoom_pagetext.zdat file (123.htm, asdf.htm, whatever.htm) that all need to say default.htm.

      Thanks



      • #4
        We added a search and replace feature in release V5.0.1005, which has been available for about a year now.

        It is called the "Rewrite links" feature, and it allows you to rewrite the indexed URLs (e.g. replace "http://dev.myserver.com/" with "http://www.myserver.com" in all URLs that contain it). This solved the problem of editing links for the vast majority of our customers.

        In your case I still think the better solution is to add server side re-directs (via a '.htaccess' file) or Javascript re-directs to force people into your frame set. There are many ways people can accidentally get outside of a frameset without using Zoom.



        • #5
          JavaScript redirect

          Do you know of a good redirect script that I could use?

          Thanks for your help



          • #6
            javascript

            Wouldn't I have to put the redirect script on every other htm file? I have a little over 500,000 files that I would have to edit.



            • #7
              Sorry, I keep coming up with more questions.

              Is there a way to enter multiple rewrite links? Maybe in the zcfg file?



              • #8
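                <script type="text/javascript">
                // If the top-level window contains no frames, this page was loaded
                // on its own, so send the browser to the frameset page instead.
                if (top.frames.length <= 0) {
                    top.location = "http://www.your-frameset-page.com";
                }
                </script>
                Placing the above code on every framed page solves the problem of pages being loaded without their framesets.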

                But it is such a common problem that there are dozens of solutions available if you search Google, e.g.
                http://www.netmechanic.com/news/vol5/javascript_no7.htm
                http://www.digitalroom.net/index2.html
                http://www.pageresource.com/jscript/jredir.htm

                With a bit of care you should be able to use the search and replace function in a good text editor to insert this kind of script into every page. Or, if you have built your site carefully, then maybe only one edit is required on an include file.
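
For a site with hundreds of thousands of pages, the same insertion can also be scripted instead of done in an editor. The following is a rough Node.js sketch under several assumptions that are not confirmed in this thread: every folder's frameset page is named default.htm (as described in the first post), every content page ends in .htm and contains a `</head>` tag, and a relative redirect to default.htm is what is wanted. `SNIPPET`, `injectSnippet`, and `patchTree` are hypothetical names made up for this example.

```javascript
const fs = require('fs');
const path = require('path');

// Assumption: the frameset page in each folder is called default.htm, so a
// relative URL sends the visitor to the frameset of the same folder.
const SNIPPET =
  '<script type="text/javascript">\n' +
  'if (top.frames.length <= 0) { top.location = "default.htm"; }\n' +
  '</script>\n';

// Insert the snippet just before </head>. The input is returned unchanged if
// it has no </head> tag or already contains the snippet.
function injectSnippet(html, snippet) {
  if (html.includes(snippet)) return html;
  const at = html.search(/<\/head>/i);
  if (at === -1) return html;
  return html.slice(0, at) + snippet + html.slice(at);
}

// Walk `dir` recursively and patch every .htm file except default.htm.
// Returns the number of files that were modified.
function patchTree(dir) {
  let changed = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      changed += patchTree(full);
    } else if (/\.htm$/i.test(entry.name) && entry.name.toLowerCase() !== 'default.htm') {
      // latin1 maps each byte to one character and back, so bytes outside
      // the inserted snippet are written out exactly as they were read.
      const before = fs.readFileSync(full, 'latin1');
      const after = injectSnippet(before, SNIPPET);
      if (after !== before) {
        fs.writeFileSync(full, after, 'latin1');
        changed++;
      }
    }
  }
  return changed;
}
```

Calling `patchTree` on a copy of the site root first is the safer way to try this. Running it a second time changes nothing, because pages that already contain the snippet are skipped.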

