.jar files indexed three times


  • .jar files indexed three times

    Hi,

    JavaScript, 6.0 build 1003

    I added the .jar extension to the Scan Extensions list, set its file type to Binary, and configured it to use a .desc file.

    It's working great except now the jar file is being scanned three times, so that a search on "test.jar" returns 3 identical results. Is there a way to fix this?

    This isn't a huge problem (after all, the file is being scanned and correctly displays in the search result) but I would like it to only return one result.

    Thanks!

  • #2
    I think it is unlikely that the same file was indexed 3 times. More likely is that you have the same file duplicated at 3 different URLs.

    What is the URL to your search function so that we can see the problem?
    Are you indexing in Spider mode or Offline mode?



    • #3
      Well, there is only one .jar file in my entire documentation. I just performed an explorer search within my document files to confirm this.

      Also, since this post, I have added .bat to my Scan Extensions list (also Binary type). There are 3 .bat files in total. Curiously, the search returned 9 results!

      Could it be indexing my files three times? Very odd......

      I'm running in JavaScript mode, offline.



      • #4
        Here are the URLs displayed in the search results for the 3 results:

        1. URL: ../../files/folderA/folderB/test.jar

        2. URL: ../../folderA/folderB/test.jar

        3. URL: ../../../Doc/files/folderA/folderB/test.jar


        The correct (and only) location is #1:
        ../../files/folderA/folderB/test.jar
        ../../files/folderA/folderB/test.jar.desc

        Also, all three search results display the same .desc information.
        Last edited by JG867; 01-22-2009, 09:10 PM. Reason: typo



        • #5
          If you look in the Zoom log, does it show this file being indexed more than once?
          Can you zip your JavaScript search files and e-mail them to us?



          • #6
            Please include your ZCFG configuration file when you send us your files. I suspect you have an unusual start folder (or additional start folders) and base URLs.

            I just tried to replicate your scenario but could not reproduce the problem.
            --Ray
            Wrensoft Web Software
            Sydney, Australia
            Zoom Search Engine



            • #7
              Thank you.

              Yes, I will send you what I can, but it may take me a few days.

              Also, FYI: the .jar and the .bat files are the only files that show up multiple times, and the only difference in my config file between these two file types and the others listed is that .jar and .bat are Binary. I can try to see if this issue occurs for me using V5, if that will help.

              I viewed the log file and could NOT see where the file was indexed three times, but the file totals listed at the end of the log show .jar = 3 and .bat = 9. One thing I noticed is that it first indexes the .jar, then indexes the .desc, but right beneath that is another line for the .desc, marked "(Blocked by extensions list)". See below:


              00|01/22/09 16:24:13|Indexing filename only for <SNIP>\files\FolderA\FolderB\test.jar
              06|01/22/09 16:24:13|Using .desc file found for <SNIP>\files\FolderA\FolderB\test.jar
              01|01/22/09 16:24:13|Skipping <SNIP>\files\FolderA\FolderB\test.jar.desc (Blocked by extensions list)



              • #8
                That's normal. The .desc file is correctly skipped because it is not an extension in the Scan Extensions list.

                You should look for any other occurrences of "test.jar" in the Log window (or save the log to a file and search it). The path that you are snipping and not showing us might be important. Also, to clarify: when you had paths like "../../" in your previous post, was that the actual path, or were you just replacing part of it with dots?
                --Ray
                Wrensoft Web Software
                Sydney, Australia
                Zoom Search Engine
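
Searching a saved log for repeated entries can be scripted. The following is a minimal sketch, not part of Zoom: it assumes the saved log uses the pipe-delimited layout seen in the three lines quoted in post #7 (code, timestamp, message) and that indexed files appear in messages beginning with "Indexing" and ending in " for <path>". The function name `count_indexed` is made up for this illustration, and other message formats in a real log are not covered.

```python
from collections import Counter


def count_indexed(log_lines):
    """Count how often each path appears in an 'Indexing ... for <path>' line.

    Assumes each line looks like 'code|timestamp|message', as in the
    excerpt quoted in the thread. Lines in any other shape are ignored.
    """
    counts = Counter()
    for line in log_lines:
        parts = line.rstrip("\n").split("|", 2)
        if len(parts) != 3:
            continue
        message = parts[2]
        if message.startswith("Indexing") and " for " in message:
            path = message.split(" for ", 1)[1].strip()
            # Windows paths are case-insensitive, so compare in lower case.
            counts[path.lower()] += 1
    return counts


# The three log lines quoted in post #7 (path prefix snipped by the poster).
sample = [
    r"00|01/22/09 16:24:13|Indexing filename only for <SNIP>\files\FolderA\FolderB\test.jar",
    r"06|01/22/09 16:24:13|Using .desc file found for <SNIP>\files\FolderA\FolderB\test.jar",
    r"01|01/22/09 16:24:13|Skipping <SNIP>\files\FolderA\FolderB\test.jar.desc (Blocked by extensions list)",
]

counts = count_indexed(sample)
duplicates = {path: n for path, n in counts.items() if n > 1}
```

To check a full log, pass the lines of the saved file instead of `sample`. Any path left in `duplicates` was reported as indexed more than once. Because the quoted excerpt has the path prefix snipped, entries reached through different start folders would only be distinguishable in the unsnipped log.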



                • #9
                  OK, so it's a good sign that the "(Blocked by extensions list)" line is normal. Yes, the ../../ paths are what actually display, and this is correct. In my previous post, I just replaced the real folder names with FolderA and FolderB (FolderA is the same folder name in all three results, and likewise FolderB).

                  I have been using V5 for about 7 months with no problem, but I hadn't gotten deep enough into Zoom to configure the extra extensions. But again, this error only happens on the .bat and .jar files. Should I change the file type to something other than Binary? I really only need the filename indexed...

                  If you have any other suggestions on what to test, I'd be happy to do that. It might take me a few days before I can prep the files to send them to you....

                  thanks again for your help



                  • #10
                    Not sure why it should take you time to "prep" them, just zip and send. We only want the search files (the generated "zoom_*.js" index files, search.js, settings.js, and the ZCFG file). If you plan to modify the files before you send them over, then you increase the chance that your modifications will break/change the behaviour and defeat the point of letting us see your problem.

                    If you have been modifying your files after Zoom generates them, then the problem may well be caused by your changes. Please let us know in advance if this is what you are doing. Same with the search script.

                    Either way, we'll wait for your files before we look at this again. Otherwise you're just having us guess and take pot shots, which could go on forever.
                    --Ray
                    Wrensoft Web Software
                    Sydney, Australia
                    Zoom Search Engine



                    • #11
                      Yes, I understand and agree with you. I appreciate the patience and time you've given to my issue. I know it's difficult to diagnose a problem without actually seeing it or having a good (and complete) description of it.

                      I'm sorry, I didn't realize you only needed the search files, I thought you wanted the entire document........(in which case I would need to remove proprietary content). I will send them tomorrow at the very latest.

                      I did not modify the files after Zoom generates them.

                      Thanks so much!
                      Last edited by JG867; 01-23-2009, 05:47 AM.



                      • #12
                        I just sent the zip file.

                        As noted in my e-mail, I performed further testing and when the binary scan extension is changed to rich text format, the files are correctly scanned once. I sent the before files (with errors) and the after files (no errors).

                        Thanks again!



                        • #13
                          Just to update -

                          My error was caused by having redundant start directories.....just one was all I needed. Problem fixed!

                          Thanks for your help! I really appreciate how much this forum is monitored.....it's helped me countless times!
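
For readers hitting the same symptom, here is a rough model of how redundant start directories can turn one file into several search results. It is only a sketch and not Zoom's actual indexing logic: the folder names, base URLs, and the `enumerate_entries` helper are all hypothetical, chosen so that the output resembles the three URLs listed in post #4. The real configuration from this thread was never posted.

```python
from pathlib import PurePosixPath


def enumerate_entries(start_points, files):
    """Model: every start point lists each file under its folder and
    labels it with that start point's own base URL."""
    entries = []
    for folder, base_url in start_points:
        folder = PurePosixPath(folder)
        for name in files:
            path = PurePosixPath(name)
            if folder in path.parents:
                rel = path.relative_to(folder)
                entries.append((base_url + rel.as_posix(), path.as_posix()))
    return entries


# Hypothetical configuration: three start points that all cover the same folder.
start_points = [
    ("/Doc/files", "../../files/"),
    ("/Doc/files", "../../"),
    ("/Doc", "../../../Doc/"),
]
files = ["/Doc/files/folderA/folderB/test.jar"]

entries = enumerate_entries(start_points, files)

# Group result URLs by the file on disk to expose the duplicates.
by_file = {}
for url, path in entries:
    by_file.setdefault(path, []).append(url)
```

In this model the single `test.jar` ends up in `by_file` with three different URLs, one per start point. Dropping the redundant start points leaves one entry, which matches the fix reported above.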
