A Day in the Life: Out and About: October 8th, Silicon Valley Code Camp

This Sunday I’m giving two talks at the Silicon Valley Code Camp in Los Altos Hills, CA. Foothill College is hosting the event and from the pictures it looks like they have a nice campus. Registration is still open and the event is free, so if you're in the San Francisco Bay Area...come on down.

Dan is also presenting at this event and you can catch his session titled ".NET Development and Excel Services" at 9:15 on Sunday morning. Dan has been working with Excel Services for several months now and did an MSDN webcast on the subject this morning. You will soon be able to watch that webcast here. Dan also put up a post for more information on Excel Services here.

Sunday at 1:15 PM Concurrent Software Development

** this is the blurb from the session wiki **

My talk on concurrent software development was designed for a two-hour session. Because the code camp format is one hour, I’m dropping most of the high-level discussion about why concurrency matters and about concurrency design. Instead you will find some of my thoughts here, as well as links to references where you can learn more. I will also continue to blog on concurrency, so I encourage you to visit from time to time. (http://krgreenlee.blogspot.com)

Why Concurrency Matters

Concurrent software development is not new. If you’ve written code that uses threads then you’ve written code that, in theory, is concurrent. However, with computers that have only one CPU, true concurrency does not happen. The opportunity for real concurrency is increasing. Recently the hardware vendors have started making multi-core machines. A dual-core laptop is now on the market and affordably priced. This puts multi-core machines into the hands of the average user. Previously, the expense of multi-processor boxes had relegated those machines to the server room. No more.

The reason for the shift to multi-core machines is that the faster chips are generating too much heat. Until a cooling solution is found we are not going to see any faster chips. To get around this problem the hardware developers have started adding cores (usually slower ones) and increasing the on-chip cache size.

In the past when hardware developers came out with a faster chip, software automatically got faster. This was great for both developers and users. But with multi-core machines this is not the case. For the software to run faster, the software companies are going to have to modify their software to take advantage of the available cores. Software that is already concurrent, but never tested on multi-processor machines, may fail.

For a long time software developers could count on the hardware getting faster, so for many products an emphasis on performance didn’t exist. I’ve even had conversations with engineers who told me to stop worrying about the efficiency of an algorithm because of that very fact. Well folks, responsibility for performance improvement has landed squarely back in our laps. We, the software development community, have work to do.

"The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software" – Herb Sutter

Concurrency Design Considerations

You've decided that you need to add concurrency to your applications. The questions then become "Where?" and "How?" Finding those areas in your applications that would benefit from concurrency is called decomposition. There are two kinds, data decomposition and functional decomposition, and sometimes they overlap. Identifying opportunities for performance improvement is an ongoing process, and the same is true of concurrency opportunities. As your understanding of concurrency improves, so will your ability to identify those opportunities.

Let's say that you have an application that processes very large EDI files. These files come in at the end of each month and it's a madhouse trying to get the files processed quickly. Right now the application is fed a file and runs for six hours. How can you use concurrency to speed that up? Programmatically, what is happening is an ETL (extract, transform, and load) process. Because you know that there are transaction boundaries within the EDI file, you can easily break the large EDI file up into smaller EDI files. This is where your opportunity to add concurrency occurs. By breaking up the large file into much smaller ones, the application can process each of the smaller files concurrently, significantly reducing the processing time.
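Here is a minimal sketch of that idea in Python. It is not the code from the session. It assumes X12-style input in which segments end with "~" and each transaction runs from an ST segment to an SE segment, and process_transaction is a hypothetical stand-in for the real transform-and-load work. In CPython, threads will not speed up CPU-bound work; ProcessPoolExecutor offers the same interface for that case.

```python
from concurrent.futures import ThreadPoolExecutor


def split_transactions(edi_text, terminator="~"):
    """Group segments into transactions (assumed to run from ST to SE)."""
    transactions, current = [], []
    for segment in edi_text.split(terminator):
        segment = segment.strip()
        if not segment:
            continue
        if segment.startswith("ST*"):
            current = [segment]          # a new transaction begins
        elif current:
            current.append(segment)
            if segment.startswith("SE*"):
                transactions.append(current)  # transaction is complete
                current = []
    return transactions


def process_transaction(segments):
    """Hypothetical placeholder for transform-and-load; returns segment count."""
    return len(segments)


def process_file(edi_text, workers=4):
    """Split the file on transaction boundaries and process the pieces concurrently."""
    transactions = split_transactions(edi_text)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input transactions
        return list(pool.map(process_transaction, transactions))
```

Envelope segments outside an ST/SE pair are skipped by this sketch; a real splitter would need to carry them into each smaller file.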

Or let's say you have an application that does risk analysis. You are running a lot of Monte Carlo simulations to evaluate the risk potential of an investment. The application takes eight hours to execute. Because each Monte Carlo trial is driven by random number generation and does not depend on any other trial, the algorithm is very parallelizable. With concurrency you can significantly reduce the execution time or, if the execution time is fine, you could change the algorithm to run more simulations in an attempt to get a more accurate result.
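As a rough illustration, the sketch below splits the trials across workers, each with its own seeded random number generator. The return model (a normal distribution with a 5% mean and 20% standard deviation) and the function names are assumptions made up for the example, not a real risk model. As written it uses threads; for CPU-bound work in CPython you would likely swap in ProcessPoolExecutor.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_batch(seed, trials, mean=0.05, stdev=0.2):
    """Run one independent batch of trials and count simulated losses."""
    rng = random.Random(seed)  # private generator, so batches share no state
    return sum(1 for _ in range(trials) if rng.gauss(mean, stdev) < 0)


def estimate_loss_probability(total_trials, workers=4):
    """Estimate the chance of a negative return by splitting trials across workers."""
    per_worker = total_trials // workers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(simulate_batch, seed, per_worker)
                   for seed in range(workers)]
        losses = sum(f.result() for f in futures)
    return losses / (per_worker * workers)
```

Because no batch depends on another, the only coordination needed is adding up the counts at the end.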

Some other areas to look at during decomposition are places where the application interfaces with a slow device or a human being. A slow device may be the hard drive, and you may want one execution path that reads/writes to the hard drive while another execution path processes the data. I think one of the most common places to find parallel execution paths is between the UI (foreground thread) and a background thread. Giving the user the impression that an application is responsive is an important contributing factor to how the average user perceives the value of a product.
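The slow-device case can be sketched with two threads and a queue: one execution path reads while the other processes. The in-memory list standing in for the hard drive and the upper-casing standing in for real processing are both assumptions for the sake of a small example.

```python
import queue
import threading

_DONE = None  # sentinel telling the processor there is no more data


def reader(lines, out_queue):
    """Stand-in for the slow device: hands each line to the processor."""
    for line in lines:
        out_queue.put(line)
    out_queue.put(_DONE)


def processor(in_queue, results):
    """Processes items as they arrive instead of waiting for the whole read."""
    while True:
        item = in_queue.get()
        if item is _DONE:
            break
        results.append(item.upper())


def run_pipeline(lines):
    q = queue.Queue(maxsize=8)  # bounded, so the reader cannot run far ahead
    results = []
    threads = [
        threading.Thread(target=reader, args=(lines, q)),
        threading.Thread(target=processor, args=(q, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The same shape applies to a UI: the foreground thread stays free to respond while a background thread does the work and passes results back through a queue.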

Another area to look for concurrency opportunities is in loops. Long running loops, where loop iterations do not have dependencies on past iterations, are great candidates.
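A small sketch of the difference, using made-up functions: score looks only at its own record, so the loop over records can be handed to a pool, while running_total needs the previous iteration's result and cannot be split up the same way.

```python
from concurrent.futures import ThreadPoolExecutor


def score(record):
    """Hypothetical per-record work that depends only on its own input."""
    return sum(ord(ch) for ch in record) % 97


def score_all_sequential(records):
    return [score(r) for r in records]


def score_all_concurrent(records, workers=4):
    """Same loop, but each iteration may run on a different worker."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(score, records))


def running_total(values):
    """Counter-example: every iteration depends on the one before it."""
    totals, total = [], 0
    for v in values:
        total += v
        totals.append(total)
    return totals
```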

The other two concurrency design considerations are synchronization and communication. These two areas are tool dependent, meaning that the language and tools you use to write your application directly affect your synchronization and communication options.
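To make synchronization concrete, here is one option as it looks in Python: a lock guarding a shared counter. The Counter class is invented for this example, and other languages and toolkits offer different primitives. Without the lock, the read-modify-write in increment could interleave between threads and lose updates.

```python
import threading


class Counter:
    """A shared counter whose updates are protected by a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:  # only one thread at a time may update the value
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def count_concurrently(threads=4, increments=10000):
    """Have several threads increment the same counter and return the total."""
    counter = Counter()

    def work():
        for _ in range(increments):
            counter.increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter.value
```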

The above is a quick summary of the first half of my two-hour session. The rest is code. However, if you want to read some good articles on concurrency and threading, I suggest the following:

"Using concurrency for scalability" – Joe Duffy
"What Every Dev Must Know About Multithreaded Apps" – Vance Morrison

Sunday 10:45 AM VS2005: Debugging Tips and Tricks

I want this debugging session to be an opportunity for sharing information. I'm planning on looking at the different debug tools. I’ll share some tips and tricks I've learned over the years and I hope that others in attendance will share theirs.

See you on Sunday.

Labels: Out and About

10/04/2006
