

How much memory does it take to index a large site?


  • How much memory does it take to index a large site?

    I've tried Zoom and like it, but I'd like to ask a question before buying.

    I have a site with 35,000 pages that grows by about 10,000 pages a year. Each page is small in terms of content (up to 500 words). I'd like to know how much RAM is required to index such a site.

    Also, can you give an indication of the size of the index files generated for the 22,000,000-word Wikipedia search? That would be handy.

    Thanks

  • #2
    For 35,000 pages we would recommend a machine with at least 300MB of RAM installed for indexing. 512MB would be better.

    The Wikipedia search can be found here,
    http://www.wrensoft.com/cgi-bin/wikipedia/search.cgi
    It searches 21,014 files and 22,528,847 words.

    The index files used for this total 134MB, which is not too bad when you consider that the source data was probably around 1GB+ in size.
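    As a rough illustration of what those Wikipedia numbers imply for the original poster's site, here is a back-of-envelope estimate. The bytes-per-word figure is derived purely from the 134MB / 22,528,847-word numbers above; it is a rule of thumb, not a documented property of Zoom's index format.

```python
# Derive a rough bytes-per-indexed-word figure from the Wikipedia
# example above (134 MB of index files for 22,528,847 words).
WIKIPEDIA_WORDS = 22_528_847
WIKIPEDIA_INDEX_MB = 134

bytes_per_word = WIKIPEDIA_INDEX_MB * 1024 * 1024 / WIKIPEDIA_WORDS

def estimate_index_mb(pages: int, words_per_page: int) -> float:
    """Very rough index-size estimate (MB) for a site of the given size."""
    return pages * words_per_page * bytes_per_word / (1024 * 1024)

# The original poster's site: 35,000 pages at up to 500 words each.
print(round(bytes_per_word, 1))               # 6.2
print(round(estimate_index_mb(35_000, 500)))  # 104
```

    So a 35,000-page site at 500 words per page would come out to roughly 100MB of index files under this assumption.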

    -----
    David



    • #3
      What is required in a larger scenario, e.g. if someone wanted to make a search engine that indexed many different message board sites, some of which contain over 50,000 posts?

      thanks



      • #4
        Message boards typically have a lot of pages that you don't want indexed (e.g. member list pages, profile pages, etc.), so you should filter which pages are indexed. See this previous post on the topic,
        http://www.wrensoft.com/forum/viewtopic.php?t=165

        If you do a good job filtering, 50,000 posts might in fact be only 10,000 HTML pages (if the message board displays 5 posts per page). So in this case you might not need a particularly powerful machine (e.g. 256MB of RAM).
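        To illustrate the filtering idea, here is a small sketch. The skip patterns below are hypothetical phpBB-style page names, and in practice Zoom's own skip-list configuration does the filtering; this just shows the principle of excluding non-content URLs before indexing.

```python
import re

# Hypothetical skip patterns for a phpBB-style message board.
# These page names are illustrative assumptions, not a definitive list.
SKIP_PATTERNS = [
    r"memberlist\.php",
    r"profile\.php",
    r"login\.php",
    r"posting\.php",
    r"search\.php",
]

def should_index(url: str) -> bool:
    """Return True if the URL looks like actual content worth indexing."""
    return not any(re.search(p, url) for p in SKIP_PATTERNS)

urls = [
    "http://example.com/forum/viewtopic.php?t=165",
    "http://example.com/forum/memberlist.php",
    "http://example.com/forum/profile.php?mode=viewprofile&u=2",
]
print([u for u in urls if should_index(u)])
# Only the viewtopic.php page survives the filter.
```

        Filtering this way keeps the index down to topic pages only, which is what makes the 50,000-posts-in-10,000-pages arithmetic above work out.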

        -----
        David

