Two years in the software performance world
This past summer marked my second year (plus one summer internship) of working at Microsoft on the Visual Studio Team System profiler. As a fresh-out-of-college hire I had no experience working with large applications, or with application performance in particular. For me, as for many other college students I would assume, performance only meant that the application didn’t lag visibly when I presented it to a professor. If an application did have performance issues, my education gave me some clues about where to look, but improving performance was mostly a trial-and-error process of tweaking the functions with high CPU time to try to get them running faster. Working at Microsoft has exposed me to so many brilliant performance engineers that sometimes I still feel like I’m in that same college stage of not really knowing what I’m dealing with when it comes to performance. Yet when I look back over my two-plus years here, I realize that I’ve learned quite a bit about performance engineering. So I figured that I would take this chance to share some of the most important things that I’ve learned about performance in my time here at Microsoft. Note that these are general guidelines as opposed to specific “X is better than Y” recommendations.
1. Improving application performance should be a planned process as opposed to something that you do at the last minute in response to a major crisis
One of the trickiest things about performance is defining exactly what “good” or “acceptable” performance means for an application. To really engineer application performance you need to have some goals for how fast the various operations in your program need to finish. Setting these goals can be very hard, especially if you are working in a relatively new area. Still, try to get some numbers down about how you expect or need your program to perform. If you are having trouble coming up with these numbers, try looking at competitors’ products or at products that perform tasks similar to yours. If you are working on a utility to convert DVD video to Zune-sized video and a similar program already exists, you had better make darn sure that your program takes about the same amount of time or less. Also, remember that these numbers can change as you flesh out your design and get some initial performance numbers from prototypes and early builds. The important thing is that early in your development you are thinking about performance issues and about how your application’s performance will affect the experience of your end user.
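As a rough illustration of writing goals down as numbers, here is a minimal Python sketch that times one operation and checks it against a budget. The operation name, the 0.5 second budget and the stand-in workload are all made up for the example; real goals would come from your own product and your competitors.

```python
import time

# Hypothetical budgets in seconds; the names and numbers are illustrative only.
BUDGETS = {"load_initial_data": 0.5}


def measure(operation, func, *args):
    """Run func once and return (elapsed_seconds, within_budget)."""
    start = time.perf_counter()
    func(*args)
    elapsed = time.perf_counter() - start
    return elapsed, elapsed <= BUDGETS[operation]


def load_initial_data(n):
    # Stand-in for real work such as reading rows from a file.
    return [i * 2 for i in range(n)]


elapsed, ok = measure("load_initial_data", load_initial_data, 10_000)
print(f"load_initial_data took {elapsed:.4f}s, within budget: {ok}")
```

A check like this can run against prototypes and early builds, so that a missed goal shows up long before it turns into a last-minute crisis.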
2. Don’t go crazy trying to optimize low level performance with the profiler before you understand the high level operation of your program and have examined the basic algorithms that you are using
The profiler is a very powerful tool and it can be indispensable for optimizing the performance of your applications. But you have to know the best times to actually unleash it. If you spend a ton of time early in your development cycle using the profiler to optimize some order n squared algorithm, you might not be making good use of your optimizing time. Perhaps you should take a step back and examine whether you really need that order n squared algorithm, or whether you could replace it with something that is order n log n. Remember that the profiler is not a magic performance wand. You need to understand your program, your algorithms and what you are trying to accomplish to use it properly.
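To make the order n squared versus order n log n point concrete, here is a small Python sketch using duplicate detection as the task. The task and the function names are chosen only for illustration; the point is that changing the algorithm usually beats micro-tuning the slow one.

```python
def has_duplicates_quadratic(items):
    """Compare every pair of items: order n squared comparisons."""
    n = len(items)
    for i in range(n):
        for j in range(i + 1, n):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_sorted(items):
    """Sort first (order n log n), then compare neighbours in one pass."""
    ordered = sorted(items)
    return any(a == b for a, b in zip(ordered, ordered[1:]))


sample = [7, 3, 9, 3, 1]
print(has_duplicates_quadratic(sample), has_duplicates_sorted(sample))
```

Both functions give the same answer, but no amount of profiler-guided tweaking of the inner loop in the first one will make it scale like the second.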
3. Know your framework, domain and language
A lot of the time in school you will hear your professors tell you (often when students complain about having to work with a specific language) that if you learn good programming skills you should be able to pick up different languages and be effective in them very quickly. There may be a lot of truth to the idea that a good programmer can work well in different languages, but when it comes to performance optimizations you really do need to know the language, domain and framework that you are currently dealing with. For example, when working with a .NET DataTable you need to call BeginLoadData before you start loading your initial data rows, so that things like table constraints are turned off until all of the initial data is loaded and you call EndLoadData. But if you don’t know this, DataTable will happily allow you to add rows one at a time and your performance will be terrible. And if you use the profiler, you can see that it is taking a long time to load your initial data, but you may not have any clue how to fix it. Really knowing a framework takes a lot of time spent getting familiar with its syntax and all of its little nooks and crannies. Performance optimizations can be really hard until you’ve put in enough time to at least be proficient with the framework, domain and language you are working with; general programming and performance knowledge won’t be able to bail you out here.
4. Be willing to ask experts for help in specific areas
In the previous section I talked about really needing to know the specific domain and framework that you are working with before you can make the best use of profiler performance data. But how is it possible to keep up with all the different programming domains and frameworks, even internally at a company like Microsoft? The simple answer is that you really can’t. Sometimes you just need to isolate a performance issue and then take the results to someone who knows that domain to help you analyze your results. Say that, as in the example above, you are adding rows to a DataTable without calling BeginLoadData and your startup time is scary slow. At this point, if you don’t know much about working with DataTables, ditch the profiler and seek out someone with expertise in this area. Note that this doesn’t mean you need to know someone personally to help you out. It can be as simple as searching MSDN or live.com for “initialize .NET Datatable” or something of that type. The more important point is to not go crazy staring at the profile trying to squeeze blood out of a stone. Go ask someone who knows better than you, or start doing some research of your own online. Just make sure that you’ve done the footwork first and can narrow down the problem for whoever you ask. Being able to use the profiler can help you ask questions like “My application startup time is about five seconds longer than I would like, as I seem to be taking too long to load my initial data into my DataTable. Can you help me out with this?” as opposed to the perpetual horror “my app is way too slow d00d!!111!!1fix teh .NET framework l4zy M$ devs!!!!1111!!!”