I have a couple sincere questions.
I recently found a very nice web-site search engine. It seems to be used on hundreds of sites, including the following: http://www.ifbsearch.com/index.htm
The search engine is called mnoGoSearch. (No, I do not represent them and I am not advertising for them.) Are you familiar with them?
I sincerely want to be able to do something similar to the ifbsearch.com site. The main advantage of the mnoGoSearch program is that it uses (among others) MySQL database. This allows for great flexibility and speed with a huge database.
I have already invested close to $100 in Wrensoft's ZOOM search engine. A Windows version of mnoGoSearch costs $1,000, which doesn't seem feasible (or even reasonable, to me).
Can you offer suggestions?? I would really like to know what you think.
By the way, if you are wondering how I figured out that the ifbsearch.com was using mnoGoSearch, it wasn't too hard. If you are persistent enough, you can always find out what someone used to create what they did.
Using MySQL is not an advantage. In fact, it is a huge disadvantage. MySQL is slow in general.
It isn't hard to find cases where it almost fails completely.
For example, go to their advanced search page.
http://ifbsearch.com/advanced.htm
Enter,
Search for text: s (Just this single letter)
Search for drop down: substring
Click on Search.
Then be prepared to go and make a cup of coffee.... I measured search times of between 1 and 2 minutes.
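A possible explanation, sketched in Python. This is an assumption about how word-indexed search engines generally behave, not a description of mnoGoSearch or ifbsearch internals, and the toy documents and function names are made up for illustration. A whole-word search is a single lookup in the index, while a substring search has to test every distinct word in the index.

```python
# Hypothetical toy corpus; not taken from any real site.
docs = {
    1: "children celebrations at the church",
    2: "bible study notes",
    3: "sunday school lessons for children",
}

def build_index(docs):
    """Map each whole word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(word, set()).add(doc_id)
    return index

def word_search(index, word):
    """Whole-word search: one dictionary lookup."""
    return index.get(word, set())

def substring_search(index, fragment):
    """Substring search: every distinct word in the index must be tested."""
    hits = set()
    for word, doc_ids in index.items():
        if fragment in word:
            hits |= doc_ids
    return hits

index = build_index(docs)
print(sorted(word_search(index, "children")))
print(sorted(substring_search(index, "s")))
```

With a single letter like "s" the substring search matches a large share of all words, so nearly every document ends up in the result set.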
And not only is the speed slow, it isn't accurate either.
Look at these results,
http://ifbsearch.com/cgi-bin/search.cgi?wf=2143&q=Children+Celebrations
Then try an exact phrase search,
http://ifbsearch.com/cgi-bin/search.cgi?wf=2143&q=%22Children+Celebrations%22
It just doesn't work at all !
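To make the distinction concrete, here is a minimal Python sketch of the difference between an all-words match and an exact phrase match. The two sample pages and the function names are hypothetical, and this says nothing about how ifbsearch or mnoGoSearch actually implement either mode.

```python
def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()

def matches_all_words(text, query):
    """True if every query word occurs somewhere in the text, in any order."""
    words = set(tokenize(text))
    return all(q in words for q in tokenize(query))

def matches_exact_phrase(text, query):
    """True only if the query words occur consecutively and in order."""
    words = tokenize(text)
    phrase = tokenize(query)
    n = len(phrase)
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))

# Hypothetical page texts.
page_a = "Children Celebrations for the whole family"
page_b = "Celebrations where children are welcome"

for page in (page_a, page_b):
    print(matches_all_words(page, "Children Celebrations"),
          matches_exact_phrase(page, "Children Celebrations"))
```

A page like page_b should show up for the plain search but not for the quoted one. A phrase search that returns pages without the phrase is behaving like the first function, not the second.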
Further, in addition to the $1000 software cost, you should note that they say,
http://ifbsearch.com/whatsnew.htm
it took 5 months for them to implement it.
Why would you want a solution like this??
------
David
Thank you SO much for doing all of the work to reply. But, what if I want to accomplish a project of that size? And how would I have ONE database that could be searched by visitors? And how would I maintain that database on a server instead of a computer (and adding new search results to the existing database? rather than a new database being created each time?) Further, their set-up seems to provide "cache" pages. How would that be done in an automated way?
Quote: Originally Posted by Wrensoft
I would offer this search as an alternate exact phrase search:
http://ifbsearch.com/cgi-bin/search....2Bible+time%22
This site's search is based on what sites they have actually crawled. It is technically possible that they haven't crawled a site that has "Children Celebrations" in the site.
Again, I am not advertising any sites. I am wanting to do something at least as good and hopefully much better.
Your example of a working exact phrase search just proves my point that it doesn't work. I had a look at a few of the results.
Result 57, was this page,
http://www.wilderness-cry.net/bible_study/kjvissue/understandable/chap7.html
And the exact phrase "bible time" doesn't appear anywhere on this page.
Quote: It is technically possible that they haven't crawled a site that has "Children Celebrations" in the site.
Have another look at the example I gave in my previous post. It proves this is not the case. You can see the text appears in the results of the 1st search.
As I said before, it just doesn't work.
-------
David
Quote: how would I have ONE database that could be searched by visitors?
In the case of Zoom, you just add new start points to make a single index with multiple web sites.
Quote: And how would I maintain that database on a server instead of a computer
A server is a computer. So this question doesn't make much sense. But the Zoom indexer will run on any Windows machine, including Windows server.
Quote: and adding new search results to the existing database? rather than a new database being created each time?)
If you are indexing external sites then you have no control over when they are updated and thus need to spider them regularly. Which, by the way, is also a flaw in the ifbsearch search: their index is obviously out of date.
Zoom doesn't do incremental indexing. See this post for a discussion on the topic.
http://www.wrensoft.com/forum/viewtopic.php?t=299
Quote: Further, their set-up seems to provide "cache" pages. How would that be done in an automated way?
Yes. I agree with this. They provide cached pages. Zoom doesn't provide cached pages because the storage requirements are enormous.
------
David
Quote: how would I have ONE database that could be searched by visitors?
Quote: In the case of Zoom, you just add new start points to make a single index with multiple web sites.
Do you have documentation on how to do this?
Quote: And how would I maintain that database on a server instead of a computer
Quote: A server is a computer. So this question doesn't make much sense. But the Zoom indexer will run on any Windows machine, including Windows server.
Did you really not know what I meant or were you just being sarcastic? Honestly, you must have known what I meant and it is irritating that you didn't take the time to answer the question. The point is, if I am trying to maintain, in an ongoing fashion, a web-site (in the fashion of Google, let's say, but not anywhere of the same magnitude and more of the size of ifbsearch), I would not want to be maintaining these ZOOM databases on my local machines - I would want to maintain it (in a working fashion) on a server.
Quote: and adding new search results to the existing database? rather than a new database being created each time?)
Quote: If you are indexing external sites then you have no control over when they are updated and thus need to spider them regularly. Which by the way, is also a flaw in the ifbsearch search, their index is obviously out of date.
In my estimation, Google is always out of date too, so what is the point of your remark of them being out of date? Obviously I would have to keep crawling the sites. But, how to maintain such a project in an organized fashion is the challenge.
Quote: Zoom doesn't do incremental indexing. See this post for a discussion on the topic.
http://www.wrensoft.com/forum/viewtopic.php?t=299
I did take the time to read this thread. By the way, is it possible to create a program (maybe something for cgi-bin?) that will run on a server instead of on a local computer? Something that could crawl specific sites (that the user inputs) on a regular basis and re-index them?
Quote: Further, their set-up seems to provide "cache" pages. How would that be done in an automated way?
Quote: Yes. I agree with this. They provided cached pages. Zoom doesn't provide cached pages because the storage requirements are enormous.
If you had something designed to run strictly on a server, the server could store the cached pages instead of the person's local computer. By the way, how do you imagine the ifbsearch site obtains the pages and keeps them cached in the way that they are doing - and in a way which is so organized and maintained?
Quote: Do you have documentation on how to do this?
See this FAQ question,
Q. How do I index multiple domains or sub-domains as one site (in spider mode)?
Quote: it is irritating that you didn't take the time to answer the question
The question was, "..how would I maintain that database on a server instead of a computer?"
The answer given was, "the Zoom indexer will run on any Windows machine, including Windows server".
So you can install Zoom on a server and schedule it to run on the server, then add / remove URLs as required using the Zoom user interface (on the server).
Quote: In my estimation, Google is always out of date too
Correct. But you were asking about adding new results to an existing database. I was just pointing out that you need to re-index on a regular basis (in case you were thinking of just appending new sites, which can't be done with Zoom in any case).
And because you need to re-index from time to time, you should also consider the time required to re-index. I think you'll find Zoom is much faster.
Quote: is it possible to create a program (maybe something for cgi-bin?) that will run on a server instead of on a local computer?
See my previous answer, the Zoom indexer will run on any Windows server (of course you also need permission to install software on the server).
At the moment the only way to add URLs to the list of URLs to be indexed is via the user interface. Anything else would require some additional custom development.
Quote: ...something designed to run strictly on a server
It does run on a server. See above.
Quote: the server could store the cached pages instead of the person's local computer
The location of the storage doesn't alter the amount of storage required. It would still be enormous. There isn't much advantage to cached pages if your index is up to date.
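As a rough back-of-the-envelope sketch in Python: cached-page storage grows with the number of sites, times pages per site, times average page size, wherever the files are kept. All three input figures below are placeholder assumptions, not measurements of ifbsearch or any real site, so substitute your own.

```python
def cache_size_gb(sites, pages_per_site, avg_page_kb):
    """Estimate the storage needed to keep one cached copy of every page."""
    total_kb = sites * pages_per_site * avg_page_kb
    return total_kb / (1024 * 1024)  # KB -> GB

# Placeholder inputs (assumptions): 1000 sites, 500 pages each, 30 KB per page.
estimate = cache_size_gb(sites=1000, pages_per_site=500, avg_page_kb=30)
print(f"{estimate:.1f} GB for HTML alone")
```

The estimate covers page text only. Images and other linked files would add to it, and every re-index has to refresh the cached copies as well.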
But don't underestimate the task of building a large search engine. It can be a significant job when you have 1000s of web sites and GBs of data to deal with. I also don't want to give you the impression that Zoom is the perfect solution for everyone.
Zoom is not going to give you an identical experience to ifbsearch, there are pluses and minuses. You have to consider what is important for your project.
----
David
Quote: Originally Posted by Wrensoft
While I'm not a DB expert, nor a search engine expert, I think that quote deserves highlighting. Zoom search was insanely easy to set up and get running. That alone puts this product ahead of the bunch, IMHO.
Quote: Originally Posted by Wrensoft
A web interface to the Zoom config tool would be pretty slick. ;o)