

MIND

Flux
flux@microsoft.com
Douglas Boling
Is Anyone Out There?
Ever since Intel introduced the i386, personal computers have had more power than they need to perform basic tasks. Face it, word processor and spreadsheet features peaked years ago. Improvements since then have been made mainly to the user interface to make programs much more friendly. While those programs ran well on a 386, today we have Pentium III systems with a hundred times the processing speed. Now, someone has come up with a way to use that idle CPU power for something more useful than a fancy screen saver. Well, some may not find it useful, but the idea is pretty cool.
      That someone is SETI@Home (http://www.seti.org/), a group of researchers based at the University of California at Berkeley. SETI stands for Search for Extraterrestrial Intelligence; yes, those folks looking for E.T. What is brilliant about SETI@Home is that it takes the idle CPU cycles otherwise spent drawing cute screen saver pictures and uses them to parse huge amounts of radio telescope data. Actually, the SETI@Home software draws cute pictures too, but instead of drawing fish, it draws graphs of fast Fourier transforms.
      Whether you consider the SETI research a boon to mankind or a colossal waste of time, the approach of using the excess CPU bandwidth of hundreds of thousands of interconnected PCs to process data is very snazzy. The project isn't just about looking for E.T.; it's about pushing the bounds of extremely distributed computing.
      Downloading and installing the 250KB client processing program is as painless as installing any other Windows®-based program. There are also versions for Mac, and text versions for Unix and other OSs. When installed, the program asks if you'd like to use it as a screen saver or run it in the background. I decided to try it in the background, and on my 300MHz mobile Pentium II the program doesn't cause any noticeable sluggishness in the system. The program does its thing, though; while it runs, the CPU meter on the taskbar is pegged at 100 percent. But since the program doesn't access the disk, or, with my 128MB of RAM, cause any swapping of data to the disk, the program is frankly unnoticeable running in the background.
      The data the program is processing comes from the Arecibo radio telescope in Puerto Rico. The data is gathered at the site and shipped to Berkeley on tape, where it is broken down into small chunks of unprocessed data and placed on a server. The client programs download individual chunks, process them, and return the results to Berkeley. If a suspicious signal is detected in the chunk, the original data is flagged for analysis.
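      The round trip described above (split the recording into chunks, let each client analyze one chunk, send back a result, flag anything suspicious) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SETI@Home code: the function names, the chunk size, the threshold, and the "peak-to-average power" detection rule are all made up for the example, and the input is synthetic noise with one artificial tone planted in it.

```python
import cmath
import math
import random

CHUNK_SIZE = 64     # samples per work unit (illustrative value, not the real size)
THRESHOLD = 15.0    # peak-to-average power ratio treated as "suspicious" (assumed)

def split_into_chunks(samples, size=CHUNK_SIZE):
    """Server side: break the recording into fixed-size work units."""
    return [samples[i:i + size] for i in range(0, len(samples), size)]

def power_spectrum(chunk):
    """Naive discrete Fourier transform; a real client would use an FFT."""
    n = len(chunk)
    spectrum = []
    for k in range(1, n // 2):  # skip the DC bin
        total = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(chunk))
        spectrum.append(abs(total) ** 2)
    return spectrum

def process_chunk(chunk_id, chunk):
    """Client side: analyze one work unit and build the result to send back."""
    spectrum = power_spectrum(chunk)
    ratio = max(spectrum) / (sum(spectrum) / len(spectrum))
    return {"chunk_id": chunk_id, "peak_ratio": ratio,
            "suspicious": ratio > THRESHOLD}

def make_test_recording(n_chunks=4, tone_chunk=2, seed=1):
    """Synthetic stand-in for telescope data: noise plus one planted tone."""
    rng = random.Random(seed)
    samples = [rng.gauss(0.0, 1.0) for _ in range(n_chunks * CHUNK_SIZE)]
    start = tone_chunk * CHUNK_SIZE
    for t in range(CHUNK_SIZE):
        samples[start + t] += 3.0 * math.sin(2 * math.pi * 8 * t / CHUNK_SIZE)
    return samples

def run_pipeline(samples):
    """Process every chunk and return the ids the server should re-examine."""
    chunks = split_into_chunks(samples)
    results = [process_chunk(i, c) for i, c in enumerate(chunks)]
    return [r["chunk_id"] for r in results if r["suspicious"]]

if __name__ == "__main__":
    print("chunks flagged for analysis:", run_pipeline(make_test_recording()))
```

      In this toy version everything runs in one process; in the real project the chunks travel over the Internet and each `process_chunk` call happens on a different volunteer's PC.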
      The group has also factored in people's competitive nature to encourage wider participation. As chunks of processed data are returned to Berkeley, credit is given to the owner of the machine or the group of machines. A list of the top 100 contributors is updated continuously. And, of course, your box might be the one to find that first real signal. Strangely, the thought was enough to make me steal a glance every once in a while at the Fourier patterns on the display. Not that I'd notice E.T.'s ring if it were there on the graph. If E.T. does answer, and your machine detects the signal, the folks at SETI@Home promise to mention you as a codiscoverer.
      But the real news isn't the SETI research; it's the way the program takes advantage of the system without imposing any noticeable burden on the user. The program is clearly well designed, and apart from a few incidents where bugs caused anomalies in the scoring process, the project seems to be progressing nicely. Now, how can this technique be applied to other problems? Mainframe and supercomputer CPU time is expensive since the systems have to be bought to handle the peak loads of their users. This technique of widely distributed, coordinated processing could be used to carry some of the processing load for tasks deemed unworthy of dedicated supercomputer time.
      One big issue with this technique is security. You wouldn't want to send confidential data to be processed outside the company. You'd also have to be careful of malicious programmers spoofing results. The SETI folks seem to have dealt with both sides. First, each data chunk is so small compared to the original data set that it would be hard to reconstruct the whole from the pieces. Of course, the SETI input is essentially noise, so it can't be reproduced without all the original data. On the return side, any spoofing of the results is detected by independently verifying the suspicious results using the original data.
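      The return-side check can be sketched the same way. The idea is simply that the server never trusts a client's claim: it keeps the original chunk, reruns the analysis itself, and accepts a "suspicious" report only if its own result agrees. The Python below is a minimal sketch under assumptions, not the project's actual verification code; `analyze`, `verify_report`, the threshold, and the report layout are hypothetical, and the scoring function is a deliberately trivial stand-in for the real signal analysis.

```python
def analyze(chunk):
    """Stand-in analysis: largest sample magnitude divided by the average."""
    magnitudes = [abs(x) for x in chunk]
    return max(magnitudes) / (sum(magnitudes) / len(magnitudes))

def verify_report(original_chunks, report, threshold=3.0, tolerance=1e-6):
    """Server side: accept a 'suspicious' claim only if recomputation agrees."""
    chunk = original_chunks.get(report["chunk_id"])
    if chunk is None:
        return False  # the report names a work unit the server never issued
    score = analyze(chunk)  # recompute from the server's own copy of the data
    return score > threshold and abs(score - report["score"]) <= tolerance

# The server's own copies of the chunks it handed out (toy values).
originals = {
    0: [1.0, -1.0, 1.0, -1.0],   # nothing interesting here
    1: [0.5, -0.5, 9.0, 0.5],    # contains one strong spike
}

honest = {"chunk_id": 1, "score": analyze(originals[1])}
spoofed = {"chunk_id": 0, "score": 50.0}  # fabricated claim about a dull chunk

if __name__ == "__main__":
    print("honest report accepted: ", verify_report(originals, honest))
    print("spoofed report accepted:", verify_report(originals, spoofed))
```

      Rechecking only the flagged results keeps the server's workload small, since interesting chunks should be rare compared to the ones that contain nothing but noise.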
      While SETI@Home is using its first-kid-on-the-block status to get a fair amount of publicity and the resulting large number of contributing client machines, others trying this technique may have to compete to gain access to your CPU time. Inducements like gifts or even money may be needed to gain access. Perhaps this type of program could be used to finance "free" PCs. It would be better than the current practice of displaying ads on the desktop to recoup the hardware cost. However the inducements will be made, I expect others will attempt the same technique. Soon you may be asked the equivalent of, "Hey buddy, can you spare a MIP?"

From the September 1999 issue of Microsoft Internet Developer.