Is it possible to index this Java file manager?

  • Is it possible to index this Java file manager?

    I am testing different file management scripts. I liked this one, so I installed it on my server, but when I tried to index one simple PDF file, it did not work!

    Maybe I am doing something wrong, or maybe it can't be done because it is using some Java script.

    Could you just look at:

    http://www.internet-marketing-library.org/pdfs/

    Zoom indexed the links on the left side but not the file inside the Explorer/Adsense system (PDF).

    Is there a way, any trick I could use, to have Zoom index this PDF file (it is just a test)? A workaround will be fine with me!

    Or maybe I have to "scrap" (again!) this file management script and use a more traditional script, this time one not using any Java!

    I am using the PHP output!

    Roger

  • #2
    First of all, to avoid further confusion for yourself and our other readers, I should clarify the terminology: JavaScript is NOT Java, and the two terms should not be used interchangeably because they are two totally different languages and platforms. What you have on your website is a JavaScript component that provides a file browsing user interface on your web page. Because this script sends asynchronous requests to the server at different points of execution, this type of scripting is commonly referred to as "AJAX".

    JavaScript-driven navigation is generally not spider friendly. So if you are using Spider Mode to index your files, it is likely that Zoom will not be able to locate the files to be indexed through your JavaScript file browser. See this FAQ for more details (and solutions):
    Q. Why are links in my Javascript menus being skipped?

    So alternatives would be:

    a) Index the files in Offline Mode. This method does not depend on the use of a spider and crawling/finding links. You will simply need a copy of the files on your hard disk. See our Users Guide chapters on Offline Mode for more information.

    b) Create hypertext links to the files. You can either do this as part of the <noscript> section of the site (as explained in the FAQ above), or create an alternative sitemap page which simply contains text links to all the files. This has the extra benefit of allowing visitors whose browsers have JavaScript disabled (or do not support it) to access your site, and it also allows external search engines like Google and Yahoo! to index the files on your site, thereby improving your presence on the web.
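
As a rough illustration of option (b), here is a minimal Python sketch that builds a plain sitemap page with one text link per PDF. It is not part of Zoom; the function name `build_sitemap`, the local folder, and the `example.com` base URL are all placeholders for illustration, and it assumes the files on disk mirror the folder layout on the web server.

```python
import html
from pathlib import Path
from urllib.parse import quote


def build_sitemap(root: Path, base_url: str, pattern: str = "*.pdf") -> str:
    """Return an HTML page with one plain text link per matching file under root.

    Hypothetical helper: assumes the folder layout under `root` is the same
    as the layout served under `base_url`.
    """
    items = []
    for path in sorted(root.rglob(pattern)):
        rel = path.relative_to(root).as_posix()
        # Percent-encode spaces and other unsafe characters in the path.
        href = base_url.rstrip("/") + "/" + quote(rel)
        items.append(
            f'<li><a href="{html.escape(href)}">{html.escape(rel)}</a></li>'
        )
    body = "\n".join(items)
    return f"<html><body>\n<ul>\n{body}\n</ul>\n</body></html>\n"
```

For example, `Path("sitemap.html").write_text(build_sitemap(Path("pdfs"), "http://example.com/pdfs/"))` would write the page, which a spider can then follow without running any script. Note that the `*.pdf` pattern is case sensitive on most non-Windows systems.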
    Last edited by Ray; Jan-22-2007, 11:46 PM.
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine



    • #3
      Clear as crystal!

      Thank you!

      Roger



      • #4
        I changed scripts again, but it still does not work!

        I am testing a forum script. You can see it at:
        http://roger.mywowbb.com/

        One forum, one topic, one post with a PDF attachment. The goal is just to index the PDF attachment.

        I have the PDF extension and the PHP output.

        The forum is private. So I logged in with Explorer, left Explorer open on the forum, then in Configure, under identification, I chose cookies with Explorer.

        It can't index the PDF files. Lastly, in Configure, I selected to index just "PDF", not .php.

        The forum script has the Apache rewrite module on (friendly URLs), maybe this is the problem?

        I sent you a login by PM to test the forum, if possible!

        Thanks for your outstanding support!

        Roger



        • #5
          You should take a look at the following FAQs regarding indexing forums and other complex scripts:
          Q. How should I index my site if it features a message board, forum, or calendar and other similarly complex scripts?

          And regarding indexing sites with user authentication/login:
          Q. How do I index protected parts of my website requiring user authentication?

          I had a look at indexing your site above, and noticed a few issues that you will need to cater for in your configuration. Most of this is documented in the above FAQs, so consult them for more details.

          1.) You were correct in using Internet Explorer to log in to the forum and allowing Zoom to use IE's cookies so that it can access the forum as a logged-in user. However, the forum contains a "Logout" link, and the spider will follow this link from the first forum page. This erases the cookie and logs the spider out, so it indexes the rest of the forum as a Guest user. You need to prevent this by adding the "Logout" link to the Skip pages list (on the "Skip options" tab). From looking at your site, this would be "login.php?out=1".

          2.) Similarly, the forum software contains a large number of links which are not relevant for indexing purposes. This would include the links/buttons for "New Topic", "Reply", "Quote", "Print", "Watch Topic", etc. See the first FAQ linked above for more information. You will need to add these links to the skip pages list as well so that it does not index a large number of these pages. Most importantly, features like the "Calendar" would generate an infinite number of URLs and you need to skip that entirely. Again, see the above FAQ.

          As a start, here's a skip list of URLs that I noticed need to be skipped. I'm sure there are more if you look into it further.
          Code:
          login.php?out=1
          new_topic.php
          edit_topic.php
          reply.php
          &print=1
          &quote=1
          &sort_by=
          my_account.php
          view_user.php
          calendar.php
          help.php
          search.php
          pm.php
          recent.html
          selected.html
          popular.html
          Using the above list, it did manage to index the PDF file attachment in my testing and I stopped it from further indexing. Further fine tuning should produce pretty good results.
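
To make the effect of such a list concrete, here is a small Python sketch of how a spider might apply it. It assumes each entry is treated as a case-insensitive substring of the URL; that matching rule is an assumption for illustration, not a description of Zoom's internals, and `should_skip` and the `view_topic.php` URLs in the usage note are hypothetical.

```python
# A few entries taken from the skip list above.
SKIP_LIST = [
    "login.php?out=1",
    "new_topic.php",
    "reply.php",
    "&print=1",
    "&quote=1",
    "calendar.php",
]


def should_skip(url: str, skip_list=SKIP_LIST) -> bool:
    """Return True if any skip entry occurs anywhere in the URL.

    Assumption: plain substring matching, ignoring case.
    """
    url_lower = url.lower()
    return any(entry.lower() in url_lower for entry in skip_list)
```

Under that assumption, `should_skip("http://roger.mywowbb.com/login.php?out=1")` is true, so the spider never requests the logout link, while a URL such as `http://roger.mywowbb.com/view_topic.php?id=1` is still crawled. Short entries like `&print=1` match on any page, which is why one entry can remove a whole class of duplicate URLs.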
          --Ray
          Wrensoft Web Software
          Sydney, Australia
          Zoom Search Engine



          • #6
            I reread the FAQ and I had missed...

            this very important info:

            Important: If you are using one of the above methods to allow the spider to login to your cookie or session-based authenticated site, you need to make sure that the spider does not follow a link to the "logout" page, subsequently logging itself out of your website. You can prevent this by simply specifying the logout page in the "Skip pages and folder list" (in the Configuration window, under the "Skip options" tab), eg. "logout.asp" or "&logout=1", etc.

            My bad!

            Thanks for taking the time to answer my post even though my homework was far from perfect!

            Excellent support, even for customers who do not take the time to read the instructions carefully!

            Don't hesitate to buy from this company!

            Roger



            • #7
              Let's Start Again... On the Right Foot This Time!

              I scrapped everything, all the scripts I was testing, and went for... make a guess: vBulletin!

              If it is good for Zoom, it is good for me!

              Remember, my goal is just one: to index .PDF files and .PDF files only! I reread what you mentioned in the last post.

              So here are the skip files:

              /forums/private.php
              /forums/usercp.php
              /forums/faq.php
              /forums/memberlist.php
              /forums/calendar.php
              /forums/search.php
              /forums/forumsdisplay.php?do=markread
              /forums/login.php
              /forums/modcp/
              /forums/member.php
              /forums/showthread.php?goto=newpost
              /forums/newthread.php
              &daysprune=-1&order=
              /forums/showthread.php?p=
              /forums/showthread.php?mode=hybrid
              /forums/showpost.php
              /forums/editpost.php
              /forums/newreply.php
              /forums/online.php
              /forums/profile.php
              /forums/report.php
              /forums/postings.php
              /forums/misc.php
              /forums/subscription.php
              /forums/poll.php
              /forums/sendmessage.php
              /forums/printthread.php
              &goto=nextnewest
              &goto=nextoldest
              /forums/infraction.php
              /forums/archive/
              /forums/viewtopic.php
              /forums/showgroups.php
              /forums/cron.php
              /forums/admincp/

              Then I was really careful to add this one:

              http://www.internet-marketing-librar....php?do=logout
              This is the logout link specifically for vBulletin. I tested it with Explorer and it worked.

              I opened Explorer, logged in, and right away tried to index the forum. No PDF indexed!

              I have a few .php files that are still indexed but I will eliminate them later on. No big deal!

              So I am pissed off at... myself! What am I missing again?

              I will create an account for you as an admin to... show me the light! See PM!

              I am feeling more like a 10-watt bulb right now than a 100-watt one!

              Thanks again for the more than excellent support!

              Roger Pilon, Editor
              Internet Marketing Library



              • #8
                I am curious as to why you are changing scripts again... I thought I had given usable solutions to your previous scripts in the above posts, in particular, the one just before your latest change (mywowbb) should have worked okay?

                Please tell us if you have problems with the other solutions, else we might be wasting time circling around the real issue - and you might be unnecessarily re-installing different scripts.

                Originally posted by quebecostarica
                Remember, my goal is just one: to index .PDF files and .PDF files only!
                Is this actually your only goal? It was assumed that you would want to index the forum threads and postings as well, as this was never explicitly stated before. Indexing ONLY the PDFs hosted on a forum (and none of the forum threads and content) calls for a different setup. Let us know if this is what you actually mean and I'll go into more detail.

                Originally posted by quebecostarica
                I opened Explorer, logged in, and right away tried to index the forum. No PDF indexed!
                Did you check the "Remember me" option when you logged into vBulletin? This is required for it to actually use cookies for logging in. Otherwise it is a session-only login, which will not work when accessed from a different program or even another instance of IE (eg. if you close IE, and open IE again and go back to your site, you will find you are logged out).
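
The difference between the two kinds of login can be seen in the cookie headers themselves. Below is a minimal Python sketch using the standard `http.cookies` module: a cookie with no `Expires` or `Max-Age` attribute lives only for the browser session, while one that carries either attribute is stored and can be reused by another program. The function name `is_persistent` and the cookie names in the usage note are made up for illustration and are not taken from this forum.

```python
from http.cookies import SimpleCookie


def is_persistent(set_cookie_value: str) -> bool:
    """Return True if the cookie carries an Expires or Max-Age attribute.

    `set_cookie_value` is the text after "Set-Cookie:" in a response header.
    A cookie with neither attribute is a session cookie.
    """
    cookie = SimpleCookie()
    cookie.load(set_cookie_value)
    return any(
        bool(morsel["expires"] or morsel["max-age"])
        for morsel in cookie.values()
    )
```

With the hypothetical values `"bbuserid=42; Max-Age=31536000; Path=/"` and `"bbsessionhash=abc123; Path=/; HttpOnly"`, the first is reported as persistent and the second as session-only, which is the kind of login that will not carry over to the spider.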

                I tested indexing your forum after logging in with "Remember me" checked, and it successfully found both PDF files. One of them was indexed, the other was not due to it being protected by Acrobat Security settings. You can enter the encryption password to index this file, more info here:
                Q. Why are some of my PDF files failing to index with a "PDF plugin error"?
                --Ray
                Wrensoft Web Software
                Sydney, Australia
                Zoom Search Engine
