Crudely measuring the state of grid computing
Grid computing is a subject that has been talked about for years. The benefits are huge: increased application scalability and reduced application run time. For some companies the ability to run applications on a grid and reap those benefits is worth millions of dollars a year. I’m not kidding. So what kinds of companies are interested in grid computing? What technologies are they using? What is the general acceptance of grid computing? How many grid computing solutions are home-built vs. bought?
To get a crude estimate of the answers to those questions, I search job boards. I find that job descriptions contain an incredible amount of information. They’re ads from a company to a prospective employee, and since each company wants the best applicants it can get, it tends to pack a lot of valuable information into each posting. That makes job posts a great way to gather free information.
Let’s do some searches:
C# => 33,945 hits
"grid computing" => 398 hits
"grid computing" windows => 64 hits
"windows compute cluster" => 5 hits
Obviously there are a lot of folks doing C# development and significantly fewer doing grid computing on Windows. I often drill down on the search results to get a better feel for the development environment around the grid computing posts. Run the ‘"grid computing" Windows’ search and drill down on some of the listings. Pretty interesting reading.
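To put those counts in proportion, here is a minimal Python sketch that turns the hit counts above into rough percentages. The numbers are copied from the searches; the helper name `ratio_pct` is made up for illustration. Keep in mind that these are ratios of hit counts, not true subsets (a grid computing post need not mention C#), so treat the output as just as crude as the searches themselves.

```python
# Hit counts copied from the job board searches listed above.
hits = {
    "C#": 33945,
    '"grid computing"': 398,
    '"grid computing" windows': 64,
    '"windows compute cluster"': 5,
}


def ratio_pct(part, whole):
    """Return part/whole as a percentage, rounded to two decimals.

    Hypothetical helper for this post; it compares raw hit counts only.
    """
    return round(100.0 * part / whole, 2)


# Grid computing posts relative to C# posts (not a subset relationship).
grid_vs_csharp = ratio_pct(hits['"grid computing"'], hits["C#"])

# Grid computing posts that also mention Windows.
windows_vs_grid = ratio_pct(hits['"grid computing" windows'], hits['"grid computing"'])

# Posts naming Windows Compute Cluster, relative to all grid computing posts.
wcc_vs_grid = ratio_pct(hits['"windows compute cluster"'], hits['"grid computing"'])

print(f"grid computing vs. C#: {grid_vs_csharp}%")
print(f"grid computing on Windows vs. grid computing: {windows_vs_grid}%")
print(f"Windows Compute Cluster vs. grid computing: {wcc_vs_grid}%")
```

Rerunning this with fresh counts later is an easy way to see whether the ratios move over time.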
Are all of those C# folks potential grid customers? No. Grid computing is simply a tool and will only help those applications that have the right kind of performance and/or scalability problems. Multithreading an application isn’t always the answer, and neither is grid-enabling it. However, as with multithreading, developers will find more uses for it than they initially expected.
Traditionally grid computing has been hard. Developers had to break their applications up into executable tasks (mytask.exe), worry about initializing each instance of mytask.exe, move the files (often requiring preinstallation on each node), worry about how to get the results back, build their own task recovery mechanisms, etc. That is what made grid computing hard. The Digipede Network has taken away a lot of that complexity by focusing on Windows, providing a GUI to define jobs (no more Perl scripts), taking care of all the plumbing (error handling, guaranteed completion of tasks, moving files, etc.), and most importantly providing an API that allows developers to distribute .NET objects from within their applications. Once people start realizing that grid computing isn’t hard anymore, we’ll start seeing those numbers go up.
I’ll try to collect these numbers from time to time. We’ll see if there are any changes as Microsoft and Digipede continue to push HPC on Windows.