No Index without Enabling robots.txt support

  • No Index without Enabling robots.txt support

    Hi,

    We use the robots 'noindex' tag on certain pages to tell the big search engines not to index that page - but we still want our internal Zoom search engine to include those pages.

    However, there are pages that we want the Zoom Search Engine to skip completely - effectively the same as the robots 'noindex' tag.

    Is there a tag, like the 'ZOOMSTOP' tag, that will do this? Something like 'ZOOMNOINDEX'?

    Or is there a way we can emulate this behavior by using Filtering?

    We can't add the URLs to the Skip List, as only certain pages under the same dynamic URL need to be excluded from the index.

    I hope that makes sense.

    Any guidance you can give will be appreciated.

    Thanks,
    BH-Tech

  • #2
    Just a quick UPDATE

    I missed a section in the Manual where it says you can essentially use filtering to do what I'm after.

    But instead of filtering for -<meta name="robots" content="noindex">

    I will filter for -ZOOMNOINDEX

    My new question is:

    Will the Zoom Search Engine recognise the filter word if it is commented out?

    <!-- -ZOOMNOINDEX -->

    Thanks,
    Taylor

    • #3
      HTML comments will be ignored by the Content Filter feature.

      But you could use a unique meta tag instead, e.g.

      -<meta name="ZOOMROBOTS" content="noindex">

      Also note that for Zoom to index pages that you've instructed other search engines to noindex, you must uncheck "robots.txt support" under "Configure"->"Spider options".

      Originally posted by bhtech:
      We can't add the URLs to the Skip List, as only certain pages under the same dynamic URL need to be excluded from the index.

      Not sure if I follow this. If it has the same dynamic URL, then wouldn't it return as the same page?

      If you mean that it's the same script (e.g. "mydynamicpage.php") but with different HTTP GET parameters, e.g.

      http://mysite.com/mydynamicpage.php?article_id=123
      http://mysite.com/mydynamicpage.php?index&sort=1&news=2

      Then you can still use the Skip List and specify those parameters. It is not limited to just the filename. It applies to the entire URL.
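
      To illustrate the idea, here is a minimal Python sketch of matching skip entries against the whole URL, query string included. It assumes entries are compared as plain case-insensitive substrings; `should_skip` and the sample entries are hypothetical names for illustration, not Zoom's actual code.

```python
def should_skip(url: str, skip_list: list[str]) -> bool:
    """Return True if any skip entry occurs anywhere in the URL (assumed substring match)."""
    lowered = url.lower()
    return any(entry.lower() in lowered for entry in skip_list)


# Hypothetical skip entries that target query parameters, not just the filename.
skip_list = ["article_id=123", "sort=1"]

urls = [
    "http://mysite.com/mydynamicpage.php?article_id=123",
    "http://mysite.com/mydynamicpage.php?index&sort=1&news=2",
    "http://mysite.com/mydynamicpage.php?article_id=456",
]

# The first two URLs contain a skip entry; the third does not.
skipped = [u for u in urls if should_skip(u, skip_list)]
```

      Under this assumption an entry such as "article_id=123" would also match "article_id=1234", so entries need to be specific enough to avoid catching pages you want indexed.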
      --Ray
      Wrensoft Web Software
      Sydney, Australia
      Zoom Search Engine

      • #4
        Hi Ray,

        Perfect - exactly what I was after.

        I will add the custom meta tag and filter that way.

        I can confirm that we have the "Enable robots.txt support" checkbox unchecked.

        Apologies for my poor explanation of the dynamic URL - I did mean the same script with different parameters.

        Example:
        http://www.domain.com.au/forums/thread.php?10-This-Is-A-Thread
        http://www.domain.com.au/forums/thread.php?20-Another-Thread-For-You

        The logic in our script is: if the thread ID (10 and 20 in the examples) is lower than 15, then mark the page as noindex.
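
        A rough sketch of that logic, written in Python purely for illustration (the real script is thread.php). It assumes the thread ID is the leading number of the query string and emits the custom ZOOMROBOTS meta tag suggested in #3; the function names and the constant are hypothetical.

```python
import re

NOINDEX_BELOW = 15  # threshold from the post: thread IDs lower than 15 are excluded

# Custom tag from post #3; the Content Filter entry would be this string prefixed with "-".
ZOOM_NOINDEX_TAG = '<meta name="ZOOMROBOTS" content="noindex">'


def thread_id_from_query(query: str) -> int:
    """Extract the leading numeric ID from a query string like '10-This-Is-A-Thread'."""
    match = re.match(r"(\d+)", query)
    if match is None:
        raise ValueError(f"no thread id in query: {query!r}")
    return int(match.group(1))


def head_meta_tags(query: str) -> list[str]:
    """Return the extra meta tags to emit in <head> for this thread page."""
    if thread_id_from_query(query) < NOINDEX_BELOW:
        return [ZOOM_NOINDEX_TAG]
    return []
```

        With the two example URLs, the query "10-This-Is-A-Thread" would get the tag and "20-Another-Thread-For-You" would not.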

        I'm not sure if I would be able to achieve my goal through the skip-list, however I'm happy to do it through the Filtering option.

        Let me know if there is a way though.

        Thanks,
        Taylor

        • #5
          Hi Ray,

          We have been successfully using the settings above to filter out the required pages.

          However I am now curious to know if there is a way to do a No Index BUT Follow links rule?

          Similar to setting <meta name="robots" content="noindex,follow" /> for search engine bots.

          Is that something I can achieve?

          Cheers,
          Taylor

          • #6
            You don't need to use "Follow".
            Specifying "index" or "follow" values in the robots meta tag will have no effect, as this is the default behaviour for all pages scanned.

            • #7
              Hi,

              I mean that the links on the page will be followed, but the page itself won't be in the index.

              Is that possible?

              Cheers,
              Taylor

              • #8
                Yes, that is what you get with the noindex flag.
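
                As a conceptual illustration only, the toy crawler below shows what "noindex but follow" means: a page flagged noindex is left out of the index, yet its links are still queued. This is a sketch over a made-up in-memory site, not Zoom's implementation; `PageScanner`, `crawl` and the sample pages are all hypothetical.

```python
from html.parser import HTMLParser


class PageScanner(HTMLParser):
    """Collect outgoing links and detect a robots 'noindex' meta tag."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = (attrs.get("content") or "").lower()
            self.noindex = self.noindex or "noindex" in content


def crawl(site, start):
    """Walk the toy site from start; return the set of URLs that get indexed."""
    indexed, seen, queue = set(), {start}, [start]
    while queue:
        url = queue.pop(0)
        scanner = PageScanner()
        scanner.feed(site[url])
        if not scanner.noindex:
            indexed.add(url)
        # Links are queued whether or not the page itself was indexed.
        for link in scanner.links:
            if link in site and link not in seen:
                seen.add(link)
                queue.append(link)
    return indexed


# Made-up site: a listing page marked noindex that links to two articles.
site = {
    "/list": '<meta name="robots" content="noindex"><a href="/a">A</a><a href="/b">B</a>',
    "/a": "<p>Article A</p>",
    "/b": "<p>Article B</p>",
}
indexed = crawl(site, "/list")
```

                Here "/list" is never added to the index, but "/a" and "/b" are still discovered through it and indexed.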
