Threat Modeling the Bold Button is Boring

I've been reading Larry Osterman's blog lately – he's a smart guy, and one of the very first people at Microsoft I ever met (virtually anyway – it was years before we met in person). Larry came to my defense when Seattle Lab tried to tell us in October 1997 that Windows buffer overruns weren't exploitable (yes, they really did this) – my short advisory from that episode is archived online. Larry's been drinking too much threat modeling kool-aid, and so has Michael Howard, who was quoted as saying:

"If we had our hands tied behind our backs (we don't) and could do only one thing to improve software security... we would do threat modeling every day of the week."

Ahem. I really have to disagree. If I could only ditch one of the SDL components for most of the Office client apps, it would be threat modeling. Yes, I'm a heretic, but I didn't get here by thinking inside the box, or even accepting that there IS a box. When I came to Office, I'd been drinking the TM kool-aid too, and then got educated that large client apps are really different than operating systems (duh).

So here's a real world example of an utterly useless threat model – if you have a document parked on a share, and you have it open for edit (which is the default), then everyone else who tries to edit the document gets a warning that they can't edit it. The feature being threat modeled was to put a bit in the document that told the app not to even try to open it for write, so now nobody gets an annoying dialog. If you really wanted to write the document, you could go and unset the bit, or do a save-as. There are zip security consequences to this feature. We came up with 2 threats. First, someone could sneak up on your document and set the bit while you weren't looking, causing user astonishment, which is always bad. Second, and even worse, though we felt it was out of scope, someone could sneak up and do attrib +r on the document. Wow, what a ghastly DoS attack (not). The only real thing we could sort out was that we should make sure that it didn't mess up IRM somehow, and even that isn't a real security bug. Some PM spent a couple of hours making this document, and we had 6 people in a room for an hour to review it. So as a rough SWAG, let's say that our real cost to Microsoft is about $200/hr per person including overhead and so on (I have NO idea whether this is the right number – I'm making this up, but it is probably in the ballpark), which means Microsoft spent 8 person-hours, or $1600, to find there were no threats. That's ridiculous.
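For anyone who wants to rerun that back-of-the-envelope math with their own numbers, here's a minimal sketch. The function and parameter names are made up for illustration, and the $200/hr rate is the same rough guess as above, not a real figure.

```python
# Rough cost of producing and reviewing one threat model.
# The hourly rate is the guessed figure from the post, not a real number.
HOURLY_COST = 200  # dollars per person-hour, including overhead


def review_cost(author_hours, reviewers, meeting_hours, hourly_cost=HOURLY_COST):
    """Return the total dollar cost of writing and reviewing a threat model."""
    person_hours = author_hours + reviewers * meeting_hours
    return person_hours * hourly_cost


# One PM for a couple of hours, then 6 people in a room for an hour.
print(review_cost(author_hours=2, reviewers=6, meeting_hours=1))  # 1600
```

Multiply that by every small feature in a large client app and the "TM-tax" adds up fast.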

Larry and Michael are suffering from the belief that everything is like Windows. In Windows, it would be really rare to see a correct threat model that didn't have a trust boundary. In Office, I've seen lots of threat models where there were NO trust boundaries, and you're LINKED with the external processes. I've even seen threat models that just didn't have any way to accept user input. A threat model where ALL the threats are along the lines of "the developer needs to not screw up" is a complete waste of time. Time wasted on security is time you could have spent doing something productive to secure your app, and so it is really making things less secure, not more.

If I have to spend 20 minutes filling out all the little boxes concerning the various bits of a threat model, doing a DFD, all to find that "Hello World" has no threats (much less any vulns), that's a waste. What we're doing about this is trying to make the threat modeling process more productive and less about filling out annoying boxes with the same useless information. I have a concept I'll call "Threat Modeling Lite" – if you fill out just the background info (briefly), and do a quick DFD, you can figure out whether something has trust boundaries and is interesting enough to bother to finish.
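To make the "Threat Modeling Lite" triage concrete, here's a small sketch of the check I have in mind. This is purely illustrative: the `Flow` type, the zone labels, and `needs_full_threat_model` are hypothetical names, not part of any real threat modeling tool, and real triage also involves judgment about whether something is "interesting enough".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One arrow on a quick DFD (hypothetical representation)."""
    source: str
    dest: str
    source_zone: str  # trust zone the data comes from
    dest_zone: str    # trust zone the data goes to


def crosses_trust_boundary(flow):
    """A flow crosses a trust boundary when its two ends sit in different zones."""
    return flow.source_zone != flow.dest_zone


def needs_full_threat_model(flows):
    """Only bother finishing the threat model if some flow crosses a boundary."""
    return any(crosses_trust_boundary(f) for f in flows)


# "Hello World": everything stays inside one process.
hello_world = [Flow("main", "stdout", "process", "process")]

# A parser reading a file that anyone on the share could have written.
doc_parser = [Flow("file on share", "parser", "untrusted", "process")]

print(needs_full_threat_model(hello_world))  # False
print(needs_full_threat_model(doc_parser))   # True
```

If the quick check comes back empty, you stop after the brief background and DFD instead of filling out the rest of the boxes.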

Second thing to do is to try and do TM's on a system and sub-system basis, NOT a per feature basis. If you do TM's at too low a level, you do too many and pay too much "TM-tax", and you lose threats that result from interactions. If you go to too high a level, you lose details.

So here's another way to look at it – let's say that somehow you knew the entire universe of vulns for an app, and it was 90% implementation, and 10% design flaws. The design flaws are probably harder and more expensive to fix, but let's say that threat modeling was 90% effective at removing them – we're left with 91% of the original vulns, and have a lot of problems left to fix. You're probably not noticeably more secure by any real measure. Now let's consider one where we have 50% design and 50% implementation – remove 90% of the design bugs, and we're at 55% of the original set of vulns – that's worth doing. If you can look at something and get an idea which class your app fits into, you can make better decisions about where to spend the amount of time you have available for security.
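The arithmetic above is easy to play with. Here's a minimal sketch, assuming (as in the example) that threat modeling only removes design flaws and leaves implementation bugs untouched; `remaining_vulns` is just an illustrative name.

```python
def remaining_vulns(design_share, tm_effectiveness=0.9):
    """Fraction of the original vulns left after threat modeling.

    design_share: fraction of all vulns that are design flaws (0.0 to 1.0).
    tm_effectiveness: fraction of design flaws threat modeling removes.
    Assumes threat modeling removes no implementation bugs at all.
    """
    implementation_share = 1.0 - design_share
    remaining_design = design_share * (1.0 - tm_effectiveness)
    return implementation_share + remaining_design


for share in (0.10, 0.50):
    left = remaining_vulns(share)
    print(f"{share:.0%} design flaws -> {left:.0%} of vulns remain")
```

With a 10% design share that leaves 91% of the vulns; with a 50% share it leaves 55%. The payoff from threat modeling scales with how much of your problem is design in the first place.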

If I had only one thing I could do, it would be to improve code quality – run OACR (prefix++), and fix fuzz bugs. Let's also face it – while occasionally we do see a really interesting bug come into MSRC with design implications, most of the time we see yet another implementation error. When the design problems do come in, they're horrific to try to fix, which is why it's REALLY important to deal with them early in the process, but worrying about design when your biggest problem is implementation errors isn't smart. Most of the attackers out there aren't good enough to even find real design flaws, and frankly, if you have a great design that's running someone's shell code, you've got a disaster. From the attacker's point of view, they want to find as many viable attacks as possible and get paid, so why should they waste time finding design flaws (which are hard)? Maybe they're able to find the design flaws, but they're smart enough to know how they get paid.

I'm going to keep coming back to this point – there is NO easy solution. A different language isn't an easy solution. TM's aren't an easy solution. The SDL isn't an easy solution. What _is_ the solution is to recognize that making secure apps is just about making a quality app, just like performance and reliability. It's going to take hard work and attention to detail. You can have all the checklists and tools you want, but the real key is having people working on it who care about the quality of their product. A good post on this, which I agree with, makes this very true observation:

"The mandate helped, the process enabled, but it was the sheer determination of the program managers, developers and testers to address customer demand for better security that caused many of our products to improve so dramatically."

More later – maybe much later – next week, I'll be here, maybe with my friend Mary – here we are on the way to completing 160 miles in 3 straight days. We're about 1/2 way through day 3 at this point. Maybe we can go further this time – some horses and riders do 260 miles in 5 days, which is pretty hard.