On Writing Safer APIs

Our team owns lots of smallish applications; each has a DB layer, an object model (OM)/API layer, a logic layer and a UI layer. Over time, different people end up working on the same apps, and we all switch between them.

This is great for cross-training, for support, for finding missed bugs, for preventing boredom-induced suicides, etc. However, the price you pay is losing the intense expert focus that someone gets from living and breathing a single app for years. That can lead to a few more bugs.

I've noticed that at the core of many of these bugs is the OM/API layer, and I wanted to summarize some issues I've seen and some suggested fixes. I'd also love to hear any other pointers you guys have.


Problem 1: Database call abuse.

We take great pride in squeezing the most out of our database servers. One of the primary ways to do this is to hand-code your DB access via sprocs and have them return multiple result sets where possible. For example, if I know that the main build status page needs to show builds, their comments and their last 15 status events, then I'll make sure that one sproc call returns all of those for all the builds: optimize for fewer DB round trips. BTW, you can do this with LINQ-to-SQL.

However, this generally conflicts with writing a "neat" OM made up of simple, concise operations, and it is often the reason people resist using ORMs. You need to learn how to balance the two.

Invariably, you'll eventually realize that some page relies on a call, which relies on a deep OM call, which makes a sproc call. And that parent page makes the call 300 times, once for each item... Ouch.
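To make the cost concrete, here is a minimal sketch of the per-item pattern next to a batched call. FakeBuildDb and both of its methods are hypothetical stand-ins for the sproc layer (nothing here talks to a real database); all the class does is count round trips.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the sproc layer; it only counts round trips.
public class FakeBuildDb
{
    public int RoundTrips { get; private set; }

    // One "sproc call" per build: harmless once, painful inside a loop.
    public List<string> GetCommentsForBuild(int buildId)
    {
        RoundTrips++;
        return new List<string> { $"comment for build {buildId}" };
    }

    // One "sproc call" for all requested builds: a single round trip.
    public Dictionary<int, List<string>> GetCommentsForBuilds(IEnumerable<int> buildIds)
    {
        RoundTrips++;
        return buildIds.ToDictionary(
            id => id,
            id => new List<string> { $"comment for build {id}" });
    }
}

public static class Program
{
    public static void Main()
    {
        var buildIds = Enumerable.Range(1, 300).ToList();

        // What the page does by accident: one call per item.
        var perItem = new FakeBuildDb();
        foreach (var id in buildIds)
        {
            perItem.GetCommentsForBuild(id);
        }
        Console.WriteLine($"Per-item: {perItem.RoundTrips} round trips");

        // What the page could do instead: ask for everything at once.
        var batched = new FakeBuildDb();
        var all = batched.GetCommentsForBuilds(buildIds);
        Console.WriteLine($"Batched: {batched.RoundTrips} round trip for {all.Count} builds");
    }
}
```

The batched method is less "neat" as an OM call, which is exactly the trade-off described above.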

Some solutions:

ABP: Always Be Profiling. This simple and brilliant idea comes courtesy of my teammates Jason and Karthik. Basically, as you're fixing bugs on an app, have the SQL Profiler running on your 2nd monitor and watch out for crazy spikes in traffic coming from your app.

Caching: When possible/applicable, cache the results of your DB calls at the lowest level that makes sense. You can cache for the whole lifetime of the app, for the session, for the context, or using a custom caching policy with the caching object. The point is: ask yourself whether you really need to read that stuff from the DB.

Commenting: One of my favourite VS features is that the "/// <summary>" comments show up as IntelliSense hints. This means that as your API consumer (your team-mate!) is using the method call, they get free documentation. So, if the method being called doesn't cache, call it out. It'll cost you 20 seconds:

    /// <summary>
    /// Get the comments for a build.
    /// NO CACHING - this always makes a DB call.
    /// </summary>
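The caching and commenting suggestions work well together. The sketch below is one possible shape, not a prescription: BuildComments, its two methods and the loadFromDb delegate are all hypothetical names, and the cache is a plain dictionary that lives as long as the object does (no expiry, not thread-safe).

```csharp
using System;
using System.Collections.Generic;

public class BuildComments
{
    // Hypothetical stand-in for the real sproc call.
    private readonly Func<int, IReadOnlyList<string>> _loadFromDb;

    private readonly Dictionary<int, IReadOnlyList<string>> _cache =
        new Dictionary<int, IReadOnlyList<string>>();

    public BuildComments(Func<int, IReadOnlyList<string>> loadFromDb)
    {
        _loadFromDb = loadFromDb;
    }

    /// <summary>
    /// Get the comments for a build.
    /// NO CACHING - this always makes a DB call.
    /// </summary>
    public IReadOnlyList<string> GetCommentsUncached(int buildId)
    {
        return _loadFromDb(buildId);
    }

    /// <summary>
    /// Get the comments for a build.
    /// CACHED for the lifetime of this object; not thread-safe.
    /// </summary>
    public IReadOnlyList<string> GetComments(int buildId)
    {
        if (!_cache.TryGetValue(buildId, out var comments))
        {
            comments = _loadFromDb(buildId);
            _cache[buildId] = comments;
        }
        return comments;
    }
}

public static class Program
{
    public static void Main()
    {
        int dbCalls = 0;
        var api = new BuildComments(id =>
        {
            dbCalls++; // count every simulated DB hit
            return new[] { $"comment for build {id}" };
        });

        api.GetComments(42);
        api.GetComments(42);
        Console.WriteLine($"After two cached reads: {dbCalls} DB call(s)");

        api.GetCommentsUncached(42);
        api.GetCommentsUncached(42);
        Console.WriteLine($"After two uncached reads: {dbCalls} DB call(s)");
    }
}
```

Either way, the summary comment tells the caller which behaviour they are getting before they put the call inside a loop.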


Problem 2: Units of measurement

This might seem obvious/stupid, but I bet it happens to the best of us. Take a look at this call:

    int BuildSize = build.Size;

"Size", eh? Bytes? Megabytes? Gigabytes?

Some solutions:

Naming: If you suffix your properties/methods with the units they return, this is unlikely to happen. SizeInBytes, SizeInGB, etc. Really cheap.

Typing: Rather than Size being of type int, have it be of type "Gigabyte", which you can create yourself. The type system will then ensure you don't make silly mistakes. This can be a little more costly if you do it pervasively, but it's quite powerful and allows for centralizing your unit conversions (are there 1000 MB in a GB, or 1024?).

Commenting: Again with the summary comments. Check it out:

    /// <summary>
    /// Get the size of the build, in BYTES.
    /// </summary>

So cheap.
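For the typing suggestion, here is a rough sketch of what a hand-rolled unit type could look like. Rather than a "Gigabyte" type, it uses a single hypothetical ByteSize struct that stores bytes and converts on demand, and it assumes binary units (1024 MB in a GB); swap the constants if your app counts in thousands.

```csharp
using System;

// Hypothetical unit type; assumes binary units (1 GB = 1024 * 1024 * 1024 bytes).
public readonly struct ByteSize : IEquatable<ByteSize>
{
    // The only place in the app that knows the conversion factors.
    private const long BytesPerMegabyte = 1024L * 1024L;
    private const long BytesPerGigabyte = 1024L * BytesPerMegabyte;

    public long Bytes { get; }

    private ByteSize(long bytes) { Bytes = bytes; }

    // Callers must name the unit when they create a value...
    public static ByteSize FromBytes(long bytes) => new ByteSize(bytes);
    public static ByteSize FromMegabytes(double mb) => new ByteSize((long)(mb * BytesPerMegabyte));
    public static ByteSize FromGigabytes(double gb) => new ByteSize((long)(gb * BytesPerGigabyte));

    // ...and again when they read it back out.
    public double Megabytes => (double)Bytes / BytesPerMegabyte;
    public double Gigabytes => (double)Bytes / BytesPerGigabyte;

    public static ByteSize operator +(ByteSize a, ByteSize b) => new ByteSize(a.Bytes + b.Bytes);

    public bool Equals(ByteSize other) => Bytes == other.Bytes;
    public override bool Equals(object obj) => obj is ByteSize other && Equals(other);
    public override int GetHashCode() => Bytes.GetHashCode();
    public override string ToString() => $"{Bytes} bytes";
}

public static class Program
{
    public static void Main()
    {
        ByteSize size = ByteSize.FromGigabytes(2);
        ByteSize total = size + ByteSize.FromMegabytes(512);

        Console.WriteLine(total.Bytes); // 2684354560
        Console.WriteLine(total.Gigabytes);

        // int wrong = total; // does not compile: there is no implicit conversion to int
    }
}
```

The compiler now rejects the ambiguous "int BuildSize" assignment outright, and the 1000-versus-1024 decision lives in one place.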


Problem 3: Discoverability

I'm new to this application and I want to find the size of the build. My instinct is to look for build.Size (sorry, SizeInGB! :)), but I don't see it. So I go off and write it, only to find out later that there already was a build.GetSize().

Some suggestions:

Properties vs. Methods: Some people deliberately use a GetSize() method rather than a Size property if determining the size requires some "work" (DB call, expensive calc), to signal to the caller not to take the call too lightly. Personally, I think it's confusing. If something is a logical property, then it should be declared as a property; methods are for "doing stuff". Sometimes that's a gray area, though. Whatever you choose, ensure that you're consistent across your app(s).

Smaller classes: Often easier said than done, but a class with 30-40 methods and 20 properties is going to have discoverability problems, not to mention other issues. Break it up into smaller classes or helper classes where possible. Read this.

Design the API: Often, the API just "happens" as a result of writing the application; don't pretend you haven't done it! I think that designing the API up front, with a view to having it used by other people, makes for a much better, more well-rounded API that has more discoverable avenues for success and needs less documentation.

Pretty files: This goes in the "Commenting" bucket. Use regions to separate your class file into logical parts. Put all the properties together in one place; put all related methods close to each other. Ensure that methods are named similarly when they do similar things.
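As a small illustration of the suggestions above, here is one way a class file might be laid out. Everything in it is hypothetical: the Build class, the SizeInBytes property and the computeSizeInBytes delegate standing in for the expensive lookup. It takes the "logical properties are properties" side of the argument and uses Lazy<T> so the expensive work happens at most once.

```csharp
using System;

public class Build
{
    #region Fields

    private readonly Lazy<long> _sizeInBytes;

    #endregion

    #region Constructors

    // computeSizeInBytes is a hypothetical stand-in for the expensive lookup.
    public Build(string name, Func<long> computeSizeInBytes)
    {
        Name = name;
        _sizeInBytes = new Lazy<long>(computeSizeInBytes);
    }

    #endregion

    #region Properties

    /// <summary>Display name of the build.</summary>
    public string Name { get; }

    /// <summary>
    /// Size of the build, in BYTES.
    /// Computed on first access, then cached for the lifetime of this object.
    /// </summary>
    public long SizeInBytes => _sizeInBytes.Value;

    #endregion
}

public static class Program
{
    public static void Main()
    {
        int lookups = 0;
        var build = new Build("nightly", () =>
        {
            lookups++; // count how often the expensive work runs
            return 1024L;
        });

        Console.WriteLine(build.SizeInBytes);
        Console.WriteLine(build.SizeInBytes);
        Console.WriteLine($"Expensive lookups: {lookups}");
    }
}
```

A newcomer typing "build." sees SizeInBytes in IntelliSense along with its units and cost, which covers most of what this section is asking for.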


If there are suggestions for other solutions (or other problems), please contribute; I've got lots to learn.