Session based Authentication


  • Session based Authentication

    Hi,

    I have been trying to get authentication in Zoom to work so that the password-protected areas of our site get indexed. We are using session-based authentication, and I have entered the relevant information (login page, username and password) into the configuration of the Zoom indexer. When I look at the log after indexing, it seems that Zoom can log in properly, but when it tries to call up some of the password-protected pages, the session does not seem to be working.

    Zoom finds all the pages, but it cannot call up the page sections that should appear once it is logged in (some sections of the pages appear only to logged-in users). The PHP pages check for the session variables before loading.

    Rgds
    ph

  • #2
    What version of Zoom are you using?
    How does your site authentication work (in a technical sense)? I assume it is a PHP script setting a cookie?
    What exactly do you mean by "but when it tries to call up..."? Are you talking about the indexing process downloading pages for indexing? Or are you talking about the search process displaying result pages, or something else?

    I assume you have already read this FAQ on authentication?



    • #3
      Hi there,

      Thanks for replying quickly.
      I am using Zoom version 6.0, build 1019, professional edition.

      Let me try to explain better.

      - Technically we initiate a session when the user logs in via the login page.
      - No cookie is set explicitly. We just access the $_SESSION variable in PHP when necessary.
      - Some of our pages have content that is only visible to certain user groups. I.e. the page displays different content depending on the user's role.
      - The username and pw I entered in Zoom are for a user that should be able to access everything on the site, i.e. Zoom should be able to "see" the pages with the content.
      - In the log I see that this is not the case, because Zoom follows links (back to the login page) that should not appear to a logged-in user.
      - Evidently, when I do a search for password protected content, I don't get the expected results.

      - Yes, I have looked at the FAQ.
      - I have tried alternate method 1) (Logging in on MSIE w/ the same user before indexing) . The result was the same.
      - Method 2 won't work because we do not accept GET in the login.
      - Method 3: We cannot by-pass the login process ... i.e. we need the login process to determine the user's role on the system.
      - Have not tried Method 4 as the site will not work offline.

      Hope this helps.
      ph



      • #4
        It is difficult to say without seeing the actual site in question or knowing exactly how the authentication was implemented.

        Sessions can be implemented with or without cookies. Is there a session ID that is passed via the URL? If not, then it is likely using cookies behind the scenes despite your PHP code not directly accessing the cookies. More information here.
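
        To illustrate the cookie case, here is a minimal Python sketch of what any client (a browser or a spider) has to do for a cookie-based session to persist: take the Set-Cookie header from the login response and send the name/value pair back in a Cookie header on every later request. PHPSESSID is PHP's default session cookie name; the header value and the function name below are made up for illustration and do not describe how Zoom itself handles cookies.

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_values):
    """Build the Cookie request header a client must send back.

    set_cookie_values: list of raw Set-Cookie header values from a response.
    """
    jar = SimpleCookie()
    for raw in set_cookie_values:
        jar.load(raw)
    # Only name=value pairs go back to the server; attributes such as
    # path or HttpOnly stay on the client side.
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


# Hypothetical header, similar to what PHP's session_start() emits.
login_response_headers = ["PHPSESSID=abc123; path=/; HttpOnly"]
print(cookie_header_from_set_cookie(login_response_headers))
```

        If the session ID is not returned like this on the following requests, PHP starts a fresh, empty session each time, and the pages behave as if nobody had logged in.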

        There are situations where automatic login is not possible because, for example, the authentication method requires more parameters than just the username and password to be submitted. This can happen if you have any anti-bot mechanisms in place. Another possibility is that the site checks the User-Agent string and rejects submissions from unrecognized clients.
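
        One rough way to check for extra required parameters is to list every input field on the login form and see what is there besides the username and password. The sketch below does this in Python with the standard html.parser module; the form markup and the field names (username, password, csrf_token) are invented for the example and will differ on a real site.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collect (name, type, value) for every named <input> element."""

    def __init__(self):
        super().__init__()
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            a = dict(attrs)
            if a.get("name"):
                self.inputs.append(
                    (a["name"], a.get("type", "text"), a.get("value") or "")
                )


def extra_login_fields(html, known=("username", "password")):
    """Return the form inputs that are not the known credential fields."""
    parser = InputCollector()
    parser.feed(html)
    return [(n, t, v) for n, t, v in parser.inputs
            if n not in known and t != "submit"]


# Hypothetical login form with one hidden anti-bot token.
sample_form = """
<form method="post" action="login.php">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="hidden" name="csrf_token" value="9f2c">
  <input type="submit" value="Log in">
</form>
"""
print(extra_login_fields(sample_form))
```

        Any field this reports, such as a hidden token that changes on every page load, is something an automated login would also have to submit.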

        You can e-mail us details (e.g. the URL of the site, and perhaps a test user account for attempting to log in) and we can take a closer look. Ultimately, though, we would be limited to what we can see from over here, and we won't have the precise details of how your site is implemented or what it checks for during authentication. It may be more effective to consult the developer of this part of the system or, if you are using a third-party off-the-shelf CMS product, to contact their support about automating login for spiders and/or bots.

        Regarding the workaround methods, method #4 requires a good understanding of the existing implementation. If, as you say, it requires a specific user role to be assigned, then theoretically, code can be added to assign the necessary role (that would suit indexing) when it recognizes the spider's user-agent and/or IP address.
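
        As a rough illustration of that idea, the logic could look like the Python sketch below (on a PHP site the same check would go wherever the session role is read). The user-agent token, the IP address and the role name are all placeholders, not values taken from Zoom or from your site. This version requires both the user-agent and the IP address to match, since a user-agent string alone is easy to fake.

```python
import ipaddress

# Placeholder values: substitute the indexer's real user-agent token and the
# address of the machine that runs the indexer.
INDEXER_UA_TOKEN = "ZoomSpider"
INDEXER_NETWORKS = [ipaddress.ip_network("192.0.2.10/32")]


def is_indexer(user_agent, remote_addr):
    """True only if both the user-agent and the source address match."""
    if INDEXER_UA_TOKEN.lower() not in (user_agent or "").lower():
        return False
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False
    return any(addr in net for net in INDEXER_NETWORKS)


def role_for_request(session, user_agent, remote_addr):
    """Return the role to render the page with, or None if not logged in."""
    if session.get("role"):
        return session["role"]  # normal logged-in user
    if is_indexer(user_agent, remote_addr):
        return "indexer"        # hypothetical role that can see all content
    return None
```

        A request from the indexing machine would then be served the full content without going through the login form, while all other visitors still need a normal session.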
        --Ray
        Wrensoft Web Software
        Sydney, Australia
        Zoom Search Engine

