Data Parser

A data parser is a software construct that receives input data from a file, network, IPC, or other data flow and makes execution decisions based on it.

Parsers are very common. They sit at the core of software such as:

  • Media editors and players

  • Compilers

  • Protocol endpoints

Parsers are the primary target of fuzz testing. Understanding where and how parsing is performed in code is an essential step in threat modeling an application to determine whether fuzzing is a worthwhile investment.

Parsing Behavior

Data parsing typically involves two sequential steps: allocating and populating data structures, and then executing code based on the data they hold.

  • Primary – allocation and population of data structures

  • Secondary – execution (logic, API calls, etc.) based on the data in those structures
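The two steps can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record layout (a 1-byte type followed by a 2-byte big-endian length and a payload), the Record structure, and the function names populate, execute, and parse do not come from any particular product.

```python
import struct
from dataclasses import dataclass

# Hypothetical wire layout: 1-byte record type, 2-byte big-endian payload length.
HEADER = struct.Struct(">BH")


@dataclass
class Record:
    kind: int
    payload: bytes


def populate(raw: bytes) -> Record:
    """Primary step: validate the input and populate a data structure."""
    if len(raw) < HEADER.size:
        raise ValueError("input shorter than header")
    kind, length = HEADER.unpack_from(raw, 0)
    body = raw[HEADER.size:]
    if length != len(body):
        raise ValueError("declared length does not match payload size")
    return Record(kind=kind, payload=body)


def execute(record: Record) -> str:
    """Secondary step: choose a code path based on the populated data."""
    if record.kind == 1:
        return "text:" + record.payload.decode("utf-8", errors="replace")
    if record.kind == 2:
        return "sum:" + str(sum(record.payload))
    raise ValueError(f"unknown record kind {record.kind}")


def parse(raw: bytes) -> str:
    return execute(populate(raw))
```

Both steps branch on attacker-controllable bytes: populate trusts (and here checks) the declared length, and execute selects behavior from the type field. Those branches are what a fuzzer would try to exercise.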

What Parsing Is Not

Not all components that receive and transmit data parse it. A component can receive data into pre-allocated structures and then transmit it without making any execution decisions based on the data content.

For example, consider a website that receives data from its users via form elements and hands it off to the data layer to be persisted. Or consider a message queue that receives messages of a fixed size, stores them, and then transmits them based on a receiver's availability.

In each of these cases, the transport plumbing is typically not vulnerable to attack through malformed content, because the plumbing is designed to be content-agnostic and never branches on the data it carries.
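The message queue case can be sketched as follows. This is a hypothetical illustration, not a real queue implementation: the class name FixedSizeQueue, the 64-byte MESSAGE_SIZE, and the receiver_ready flag are all assumptions. The point is that no branch depends on the bytes inside a message.

```python
from collections import deque
from typing import Optional

MESSAGE_SIZE = 64  # hypothetical fixed message size, in bytes


class FixedSizeQueue:
    """Stores and forwards opaque fixed-size messages without inspecting them."""

    def __init__(self) -> None:
        self._messages = deque()

    def receive(self, message: bytes) -> None:
        # The only check is on size, never on content.
        if len(message) != MESSAGE_SIZE:
            raise ValueError("message must be exactly MESSAGE_SIZE bytes")
        self._messages.append(bytes(message))

    def transmit(self, receiver_ready: bool) -> Optional[bytes]:
        # Delivery depends on receiver availability, not on the message bytes.
        if receiver_ready and self._messages:
            return self._messages.popleft()
        return None
```

Because every message of the right size follows the same code path, varying the content gives a fuzzer nothing new to reach in this component; any parsing risk lies with whatever eventually consumes the messages.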