Security Briefs

A Follow-on Conversation About Threat Modeling

Michael Howard

The May 2009 issue of MSDN Magazine includes an article titled "A Conversation About Threat Modeling" that presents a conversation between Paige, a young security neophyte, and Michael, a somewhat jaded security guy. This month I'll take up the conversation where it left off.

Scene I

A small office kitchen, next to the coffeepot.

Paige: Last time we met, you took a good look at my threat model, but you said you'd cover some cryptographic and secure design issues at a later date. Well, welcome to that later date.

Michael: Can I please get a coffee first?

Not waiting for a response, Michael pours himself a huge coffee.

Paige: Er, sure.

Michael: Remind me again what your app is.

Paige: It's a product that allows users to store data on our servers. There's a small piece of client code that pushes the bits to a server set aside for that user. This code can upload files from the user to our back end via the Web server, and the files are stored in the file system along with file metadata stored in SQL Server for rapid lookup. We might store billions of files eventually. The two major environments are domain-joined computers and Internet-joined computers.

Michael: Oh, that's right, I remember now. So much code, so little time. OK, let's go back to your threat model to see which part we're concerned about. Do you have the DFD -- the data flow diagram?

The two walk over to Paige's desk. She logs on with her smart card and loads the SDL Threat Modeling Tool.

Paige: Here it is.

Michael looks over the diagram.

Michael: This is the Level-1 diagram, right? One level more detailed than the context diagram?

Paige: Yup. We also have a Level-2 diagram, but I don't think we need to go that deep just yet.

Michael: You're right, this is perfect. If more precision is needed as we go through this, we can look at the Level-2 diagram.

Paige: By the way, we don't call them DFDs anymore.

Michael: Er, OK! What are they called, then?

Paige: Application diagrams.

Michael: Whatever floats your boat, I s'pose. We'll be using crayons next. OK, back to the diagram. So the user makes a request of the client application to upload or download files to or from the server, and the server persists that data in the file system at the back end, along with some metadata about the files that is held in SQL Server?

Paige: That's one use; in fact, it's probably the main scenario. Of course, the admin needs to set up, configure and monitor the application; that's what the admin tool does.

Michael: Let's focus on that core scenario, then.

Scene II

Michael is staring intently at the application diagram.

Michael: Let's start by looking at each element in the core scenario, and we'll spell out each of the STRIDE threats.

Paige: STRIDE? Remind me again.

Michael: Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege.

Michael starts writing quickly on a piece of paper.

Michael: Take a look at this list:

External Entity: Spoofing, Repudiation

Process: Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege

Data Store: Tampering, Repudiation (when the store holds audit or log data), Information Disclosure, Denial of Service

Data Flow: Tampering, Information Disclosure, Denial of Service

Paige: Aren't all of these in the Threat Modeling Tool?

Michael: Yes, but I want to show you how the tool arrives at the list. By the way, you don't need to use the threat modeling tool to be SDL-compliant, so long as the threat model is complete and accurate. Basically, each element is subject to a specific set of threats. I think you can work it out from the list.

Paige: Yeah, I get it, but aren't you missing something? All those data flows between the various application elements?

Michael: Yup, but I did that on purpose, because I really don't want to focus on those yet -- we discussed them in detail last time.

Paige: We did?

Michael: Yes. Look at the Threat Model.

Paige looks over the SDL Threat Modeling Tool.

Paige: Oh, I see, that's where we discussed using SSL/TLS to fix the tampering and information disclosure threats to the data flow between the client and server processes, right?

Michael: Good! So here's a question for you. The data that moves from the user to the server and then onto the server's file system -- is that data sensitive?

Paige: You asked that last time. Yes, it might be.

Michael: Uh-oh.

Paige: What? The data is encrypted using SSL/TLS, so we're fine, right?

Michael: Not at all. SSL/TLS mitigates the information disclosure threat to the data as it flows between the two processes -- the 2.0 Client process and the 3.0 Server process. But after the data leaves that secured tunnel, the data is in the clear, and you're writing data in the clear to the file system.

Paige: Yeah, but what's the risk?

Michael: You tell me!

Paige: I don't understand.

Michael sighs.

Michael: Let's say a client of your application is an employee of a publicly traded company. Let's be more specific: the employee is the CFO of a publicly traded company and he uses your application to store a spreadsheet that shows fiscal data for the current quarter, data that is not public and will not be public until the end of the quarter, when the company announces its earnings. Let's say a hacker breaks into your system, gets that data, and uses it to sell or buy stock in the company. That might be insider trading.

Paige: Uh-oh.

Michael: Uh-oh, indeed. This is serious. The CFO does not have appropriate controls on this sensitive data, which might be a violation of Sarbanes-Oxley (SOX) regulations.

Paige: You said "might" a lot.

Michael: You bet I did. Do I look like a lawyer to you? So back to my original question. Is this situation something you care about?

Paige: Well, not really. I think our terms state that you shouldn't use our service for ultra-sensitive data. But I'll play along. Let's assume I say, "Yes, we care about this scenario." Now what?

Michael: My first bit of advice would be to consult your attorneys to make sure you're not putting the company at risk with this scenario. But let's assume they say you can go for it, but you need to be sure you protect the data at the back end.

Paige: What you're trying to say is that the data held in data store 5.0, the file system at the server, is subject to information disclosure and we need to mitigate that threat. Am I right?

Michael: You're spot on. So how do you fix it?

Paige: ACLs.

Michael: Why access control lists?

Paige: We can limit access to only valid users of the data.

Michael: So how does the server application read and write the data?

Paige: Oh, let's assume the process runs as a unique identity. We'll call it FooID. We could apply an ACL to the files that allows access by FooID as well as by the valid users.
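Paige sketches the idea in C# using System.Security.AccessControl. It's only a rough sketch: the path, the FooID account name and the sample user are all hypothetical.

using System.IO;
using System.Security.AccessControl;

class AclSketch
{
    // Grants read/write on a stored file to the service identity
    // and to one valid user -- Paige's (soon to be rejected) plan.
    static void GrantAccess(string path)
    {
        FileSecurity security = File.GetAccessControl(path);
        security.AddAccessRule(new FileSystemAccessRule(
            @"SERVER\FooID",                 // the server process identity
            FileSystemRights.Read | FileSystemRights.Write,
            AccessControlType.Allow));
        security.AddAccessRule(new FileSystemAccessRule(
            @"DOMAIN\SomeValidUser",         // a hypothetical valid user
            FileSystemRights.Read | FileSystemRights.Write,
            AccessControlType.Allow));
        File.SetAccessControl(path, security);
    }
}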

Michael: It won't work; it's not secure.

Paige: Why not?

Michael: If I'm an attacker, I can compromise the server process running as FooID and then run my malicious code on that server. Voila, my code is running as FooID, and I own the data!

Paige looks dejected.

Paige: Humph.

Michael: You have to use encryption.

Paige: Of course! The server will just read and write encrypted blobs, and if an attacker compromises the server, he still can't get at the data unless he can break the encryption.

Paige perks up.

Michael: Now the fun really starts. We touched on some of the crypto issues last time, especially as they relate to keys.

Paige: What do you mean?

Michael: OK, how are you going to encrypt the data?

Paige: The user types in a password in the client-side application, and the client-side application encrypts the data with the password and sends the encrypted blob across the wire to the server. The server writes metadata to SQL Server and then writes the encrypted blob to the server file system.

Michael: What's in the metadata?

Paige: Owner's identity, the file size, the filename, the time it was written to the file system and last read time. That kind of stuff. I know this information is held in the file system, too, but it's way quicker to do a lookup in something designed to store this kind of data: a SQL Server database.

Michael: Good, I'm glad it doesn't store data held within the file!

Paige: Why?

Michael: For a couple of reasons. First, it would mean your server application has access to the data in the clear, and getting at that data requires your server process to decrypt it.

Paige: So?

In a loud but not-quite-shouting voice, Michael responds.

Michael: Because it means your server application needs to know a decryption key, which means you get into all sorts of truly horrible key management games. If at all possible, you want to stay out of that business! There are ways you can do this cleanly by having multiple keys, but I don't want to explain that right now. If ever! If you really want to understand this, read up on how Microsoft encrypts files using the Encrypting File System, or EFS.

Paige: Could we use EFS?

Michael: Possibly. It depends on your clients. What platforms do you support at the client?

Paige: We'll ship at the end of the year on Windows and then a couple of months later on Linux.

Michael: No Solaris?

Paige: What's Solaris?

Michael sniggers and ignores Paige's reply.

Michael: You can't use EFS because it requires Windows accounts. So you have to encrypt the data using different technology. It'd be great if you could use EFS, or even Data Protection API, known as DPAPI, because both use the same underlying crypto technology and can seamlessly encrypt and decrypt data by using keys derived from the user's password. Oh well. Let's see what else we can do.
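For reference, here is a minimal sketch of what the DPAPI route looks like on Windows, using the real ProtectedData class; the helper and parameter names are made up. It's moot here, though, since the Linux client rules it out.

using System.Security.Cryptography;

class DpapiSketch
{
    // DPAPI derives the protection key from the user's logon credentials,
    // so the application has no key of its own to manage.
    static byte[] ProtectForCurrentUser(byte[] fileBytes)
    {
        return ProtectedData.Protect(
            fileBytes, null, DataProtectionScope.CurrentUser);
    }
}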

Paige: Can we use an encryption library?

Michael: Of course we can. In fact, that's a much better idea than something I heard the other day.

Paige: What?

Michael: Someone asked me if it would be OK to create his own crypto algorithm.

Paige: You said no, right?

Michael: Of course I said no. What would you expect me to say? I also made it pretty clear that it's a complete violation of SDL policy, and he should not even contemplate the possibility of using any homegrown crypto.

Paige: So what should we do?

Michael: Because your client code is C#, you could use the .NET System.Security.Cryptography namespace. It's available in Mono, which means you could call it from Linux. I haven't tried it, but you could do an experiment. You'd also need to chat with the lawyers to make sure there are no licensing issues.

Paige: What licensing issues?

Michael: It's third-party code. Who knows what the license says.

Paige: OK, so we encrypt the data with the user's password, send the blob ...

Michael: No. Non. Nyet. Nada. Nope. You do not use the user's password as an encryption key; you derive the encryption key from the password. Passwords are way too easy to guess.

Paige: How does one "derive" a key?

Michael smiles.

Michael: With a key-derivation function.

Paige: Mr. Smarty. Could you be a little more precise?

Michael: Sure. You pass the password into a function such as Rfc2898DeriveBytes in .NET, along with a salt and an iteration count. A salt is just a unique random value that makes precomputed attacks, such as rainbow tables, much harder. The iteration count is usually in the tens if not hundreds of thousands. The function takes the password and munges it thousands of times with the salt. This is often referred to as "stretching the password." At the end of the operation, which usually takes less than a second, you get some bytes, and you can use those bytes as keys. Key derivation not only makes it hard to guess the key, it also makes it hard to mount high-speed password-guessing attacks because the attacker has to go through the iteration count, too. So if an attacker could normally test 1,000,000 passwords per second, with an iteration count of 100,000, he's reduced to 10 per second! Cool, huh?
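A minimal sketch of that derivation in C# might look like this. The 16-byte salt, the 32-byte key lengths and the iteration count are illustrative choices, not requirements.

using System.Security.Cryptography;

class KeyDerivationSketch
{
    // Stretches a password into two independent keys (PBKDF2, RFC 2898).
    // The salt must be stored with the encrypted blob so the same keys
    // can be re-derived at decryption time.
    static void DeriveKeys(string password,
                           out byte[] salt,
                           out byte[] encryptionKey,
                           out byte[] macKey)
    {
        salt = new byte[16];                 // unique per file
        using (var rng = RandomNumberGenerator.Create())
            rng.GetBytes(salt);

        const int iterations = 100000;       // the "stretching" factor

        var kdf = new Rfc2898DeriveBytes(password, salt, iterations);
        encryptionKey = kdf.GetBytes(32);    // 256-bit key for AES
        macKey = kdf.GetBytes(32);           // second, independent key
    }
}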

Paige: Very cool. So we use that key to encrypt the data, using, say, Advanced Encryption Standard?

Michael: Yes. Of course, all this does is encrypt the data; it doesn't provide any form of integrity check. But that's pretty easy to add: derive a second key, use it to create a message authentication code over the blob, and store that code along with the metadata. Just don't use the same key for encryption and integrity checking.
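Continuing the sketch above, an encrypt-then-MAC helper might look like the following. Protect is a made-up name, and the tag-then-IV-then-ciphertext blob layout is just one workable choice.

using System;
using System.Security.Cryptography;

static class CryptoSketch
{
    // Encrypt-then-MAC: AES-CBC encrypts the plaintext, then an HMAC
    // computed with a *different* key covers the IV and the ciphertext.
    public static byte[] Protect(byte[] plaintext,
                                 byte[] encryptionKey,
                                 byte[] macKey)
    {
        using (var aes = Aes.Create())
        {
            aes.Key = encryptionKey;   // derived key, never the password itself
            aes.GenerateIV();          // fresh IV per blob

            byte[] ciphertext;
            using (var encryptor = aes.CreateEncryptor())
                ciphertext = encryptor.TransformFinalBlock(
                    plaintext, 0, plaintext.Length);

            byte[] ivAndCiphertext = new byte[aes.IV.Length + ciphertext.Length];
            Buffer.BlockCopy(aes.IV, 0, ivAndCiphertext, 0, aes.IV.Length);
            Buffer.BlockCopy(ciphertext, 0, ivAndCiphertext,
                             aes.IV.Length, ciphertext.Length);

            using (var hmac = new HMACSHA256(macKey))
            {
                byte[] tag = hmac.ComputeHash(ivAndCiphertext);

                // One possible blob layout: tag || IV || ciphertext.
                byte[] blob = new byte[tag.Length + ivAndCiphertext.Length];
                Buffer.BlockCopy(tag, 0, blob, 0, tag.Length);
                Buffer.BlockCopy(ivAndCiphertext, 0, blob,
                                 tag.Length, ivAndCiphertext.Length);
                return blob;
            }
        }
    }
}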

Adam, another security guy, walks by muttering.

Adam: You security wizards always want to go depth-first into crypto and the like. But the attackers go for the weak link.

Michael: Adam's right, security people tend to dig deep quickly. And I'm guilty as charged, but I want to get this out of the way.

Paige: Er, OK. Anything else?

Michael: Well, there's also the sticky problem of users forgetting their passwords. I would give users the opportunity to back up their password to a USB stick or something, and make them aware that we don't have the password and that if they forget it, there's no way we can bring the data back from the dead!

Paige: Are we done?

Michael: For the moment, yes. It's a big and important section of the threat model, and I hope this gives you an idea of some of the trade-offs you need to make when building secure applications.

Paige: I do now. Thanks.

Michael Howard is a senior security program manager at Microsoft who focuses on secure process improvement and best practices. He is the coauthor of five security books, including "Writing Secure Code for Windows Vista," "The Security Development Lifecycle," "Writing Secure Code" and "19 Deadly Sins of Software Security."