First Glance

Bot News
MySpiders

By Brian Proffitt


There is some debate as to how accurate search engines can be, given how much is added to the Internet every day. The sheer volume of data means that some new pages are going to be missed until a search engine's bots get around to finding them.

MySpiders, a project by the AARG students at the University of Iowa, is an experimental prototype that lets you watch firsthand, in real time, how its search is doing.

Under the direction of Filippo Menczer, the students have devised a Java applet that runs smoothly within Internet Explorer 5.0+ and Netscape Navigator 4.07+. Once it is installed and running, users can enter a search phrase and watch ten spiders crawl out onto the Internet in real time to find the terms.

Installing MySpiders is just slightly more cumbersome than the usual Java applet, as the creators are clearly concerned with implementing proper security protocols for your machine. To run MySpiders, you will likely have to install version 1.3 of the Java 2 runtime environment, followed by a security certificate from Menczer, and finally the applet itself.

None of this is complicated; just don't click on the link and expect the applet to pop up the way other Java applets do. Of course, the extra steps are needed only the first time you load the applet.

The interface of MySpiders is straightforward and easy to grasp. The spiders launch themselves on every search, turning into "corpses" when they run into a search dead-end or when the search reaches the cumulative page limit you specify.
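
For readers who like to see the idea spelled out, here is a minimal Python sketch of the spider life cycle described above. It is not the MySpiders code, which I have not seen. The toy link graph TOY_WEB, the function name crawl, and the rule that each spider simply follows its first unvisited link are all assumptions made for illustration, and nothing here touches the network.

```python
# Hypothetical toy "web": page name -> (page text, outgoing links).
# This stands in for real pages; it is not MySpiders data.
TOY_WEB = {
    "a": ("search bots and spiders", ["b", "c"]),
    "b": ("java applet security", []),
    "c": ("spiders crawl the web", ["d"]),
    "d": ("unrelated page", []),
}


def crawl(starts, terms, web, page_limit):
    """Run one spider per start page until every spider is a corpse.

    Returns (hits, statuses): the pages containing any search term,
    and the final status of each spider.
    """
    visited = set(starts)   # pages already claimed by some spider
    hits = []               # pages whose text contains a search term
    fetched = 0             # cumulative pages fetched by all spiders
    spiders = [{"page": s, "status": "alive"} for s in starts]

    while any(sp["status"] == "alive" for sp in spiders):
        for sp in spiders:
            if sp["status"] != "alive":
                continue
            # Assumption: the page limit is shared by all spiders.
            if fetched >= page_limit:
                sp["status"] = "corpse (page limit)"
                continue
            text, links = web[sp["page"]]
            fetched += 1
            if any(term in text.split() for term in terms):
                hits.append(sp["page"])
            # Assumption: follow the first link no spider has claimed yet.
            unclaimed = [link for link in links if link not in visited]
            if unclaimed:
                visited.add(unclaimed[0])
                sp["page"] = unclaimed[0]
            else:
                sp["status"] = "corpse (dead end)"
    return hits, [sp["status"] for sp in spiders]


if __name__ == "__main__":
    print(crawl(["a", "b"], ["spiders"], TOY_WEB, page_limit=10))
```

In this sketch, lowering page_limit makes spiders die early with the "page limit" status, while a generous limit leaves them to die at dead-ends.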

You can also click on each individual spider as it's searching to see a more detailed report of the spider's progress.

Results are reported as active URL links and nothing else, but that's what you should expect in a prototype. Results are also not comprehensive, since spiders drop out of the search when they reach a dead-end, and "found" URLs are not stored centrally anywhere.

None of this bothered me as much as the lack of exact phrase search capability. Multi-word searches have the spiders picking up too many false leads.
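
To show why a missing exact phrase search produces false leads, here is a small Python sketch comparing two ways of matching a multi-word query. I do not know how MySpiders actually matches terms; the any-word rule below is an assumption, and both function names are hypothetical.

```python
def matches_any_word(query, text):
    """True if at least one query word appears in the text (assumed rule)."""
    words = text.lower().split()
    return any(word in words for word in query.lower().split())


def matches_exact_phrase(query, text):
    """True only if the query words appear together, in order."""
    phrase = query.lower().split()
    words = text.lower().split()
    if not phrase:
        return False
    # Slide a window the length of the phrase across the text.
    return any(
        words[i:i + len(phrase)] == phrase
        for i in range(len(words) - len(phrase) + 1)
    )


if __name__ == "__main__":
    page = "the engine of the car"
    print(matches_any_word("search engine", page))      # a false lead
    print(matches_exact_phrase("search engine", page))  # correctly rejected
```

A page about car engines matches the any-word rule for the query "search engine" but fails the exact phrase test, which is the kind of false lead I kept running into.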

This technology is clearly experimental, but I was still impressed with its speed and its multithreaded capabilities. This is a definite building block for some sharp future tools, built by some sharp students.