Home Research Teaching Misc

A Software Bug Taxonomy

1. Resource Management Bugs

a) Stack buffer overflow/underun, Heap buffer overflow/underrun

Description: Buffer handle used to access memory outside the region allocated.

Example 1:
  //stack buffer overun for sizes greater than 42
  void stack_buf( void* src, int size ) {
    char buffer[42];
    memcpy( buffer, src, size );

Example 2:
  //heap buffer overun for sizes greater than 42
  void heap_buf( void* src, int size ) {
    char buffPtr* = new char[42];
    if( buffPtr )
      memcpy( buffer, src, size );

b) Dereferencing NULL pointer

  public class nullDeref {
    public static void main(String argv []) {
    MyObject o = null;
    BufferedReader keyb = new BufferedReader;
    String inp = keyb.readLine();
    if( inp != "quit" ){
      o.construct( inp );

In the above example, if inp == "quit", toString() is called on a null object.
c) Dereferencing pointer to uninitialised memory
d) Failing to allocate sufficient memory for objects
e) Dereferencing pointer to freed memory

f) Memory leak

Code loses the ability to free dynamically allocated memory.

  void foo(){
    char* s = new char[42];
    char* t = new char[42];
    cin >> s;
    t = s; //loses handle to memory originally pointed to by t
    delete[] s;
delete[] t;

g) Double free

Attempt to deallocated already deallocated memory..

  void foo(){
    char* s = new char[42];
    char* t;
    cin >> s;
    t = s;
    delete[] s;
    delete[] t; //double free since both s and t point to same memory location

h) Returning pointer to local stack variable
i) Returning pointer to freed memory

j) Stack smashing

Special case of buffer overrun that may crash a program or transfer control to malicious code (or unintended sections) by overwriting the on-stack return address of the function.

k) Unititialised read: Uses memory that has not been initialised

l) Using handle to closed file or socket(eg. double close)
m) File handle/socket leak
2. Data Bugs

a) Serialisation

Data of type X stored in object of type Y resulting in possible data corruption.

b) Using unchecked stored data: Data from secondary storage not checked for format or corruption

c) Unchecked return values (read(), write() etc)

Not checking for failure on return from some functions and possibly incorrectly using the error value

3. Concurrent Bugs
are bugs that are due to multiprocess/multithreading computations

a) Atomicity/Data race

Sequence of access operations (including at least one write access) in a execution streams should be executed atomically, but may be interrupted or interleaved with operations from other threads. We use this class mainly for bugs that result from not using synchronisation mechanisms to ensure atomicity.

b) Inconsistent synchronisation

Shared variable is sometimes protected by synchronisation mechansim, but sometimes not. We use this class for bugs that occur when synchronisation is thought about and used, but not uniformly implemented.

c) Deadlock

d) Orphaned thread

Threads wait on dead threads

e) Livelock

Threads fail to make progress but perpetually change state due to shared resources held by other threads

4. Arithmetic errors
a) Divide by 0
b) Floating point exceptions

c) Range wraps

Operations may trigger range overflow

5. Semantic Bugs
are constructs that contradict the 'intent' of the programmer.

a) Data leak

Sensitive information is made accessible to untrusted components.

Example: Java String internals are not zeroed out after a function exits. This means user passwords stored in strings remain in memory.

Example: Plain text password sent over unsecure network.

b) Unsanitised output

Safe output properties do not hold on all possible paths.

  class user{
    //random user info field
  user foo;
  .//random activity
  write( foo, activity_log );

In the example, cached_passwd should not be written to the activity log. Another example is ensuring that output is properly formatted.

Another example is failing to erase the contents of files/scratch areas used for temporary storage.

c) Data taint

Trusted code segment uses data constructed from from untrusted input (network, user, etc ) as input for trusted operations.

Example 1: SQL injections
  public void authenticate(HttpServletRequest request){
    String username = request.getParameter("user");
    java.sql.Statement stmt = con.createStatement();
    String query = "select * from users where username = ’" + username + "’and password = ’" + pwd + "’";
    stmt.execute(query); ... // process the result of SELECT

In the example above the "query" executed is costructed partly from user input. This input may inadvertently or maliciously contain SQL commands

d) Unchecked input: input used as arguments without first being checked for validity

Example 1: Pointer passed in from user spaced dereferenced without being checked.
  /* from sys/kern/disk.c*/
    int sys_disk_request( u_int sn, struct Xn_name *xn_user, struct buf* reqbp, u_int k )
    if( reqbp->bflags & B_SCSICMD )
      return sys_disk_scsicmd( sn, k, reqbp );

e) Mismatched construct sequences: Sequences of operations (function calls, etc) fail to obey particular restrictions.

Example 1: new and delete must be properly paired along all possible executions paths.
Example 2: all accesses to variable foo must be preceded by aquire( foo_lock) and followed by release( foo_lock )
Example 3: interrupt disables must be followed by enables

f) Violations of "Never / always do X" rules

Example: Floating point ops in the kernel (violates "no floating point in kernel" rule)

g) Violations of "In X context do/do not do Y"

Example: Functions that can sleep are called when interrupts are disabled.

6. Peformance Bugs

h) Violations of "Do X instead of Y" rules

Example: Copying used instead of memory mapped IO.

i) Locking bottlenecks

Example: granularity too coarse
9. Others
a) Looping Errors

Off by one, increment errors etc.
b) Conditionals

Short circuiting conditionals
c) Unhandled exceptions

Notes on bugs wrt systems

1. Bugs of type 2a, 2c - 2k can be avoided by using type-safe, garbage collected languages.
2. Some errors are may be mild/no effect on some systems but catastrophic on others. For example, divide-by-zero is fatal on the Intel x86 family of processors, but gives a zero result on powerPC processors. (source)
3. Since semantic bugs are defined as contradictions of programmer intent, detection and classification requires specific knowledge of individual systems.


Andy Chou, Junfeng Yang, Benjamin Chelf, Seth Hallem, and Dawson Engler, An Empirical Study of Operating System Errors
Benjamin Livshits, Thomas Zimmermann, DynaMine: Finding Common Error Patterns by Mining Software Revision Histories
Chadd C. Williams, Jeffrey K Hollingsworth, Bug Driven Bug Finding
David Hovemeyer and William Pugh, Finding Bugs is Easy
Dawson Engler, Benjamin Chelf, Andy Chou, Seth Hallem, Checking System Rules Using System Specific Compiler Extensions
Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin Chelf, Bugs as Deviant Behavior: A General Approach to Inferring Errors in Systems Code
Eric C Allen, Various articles on bug patterns in Java
James Newsome, Dawn Song, Dynamic Taint Analysis for Automatic Detection, Analysis, and Signature Generation of Exploits on Commodity Software
Junfeng Yang, Paul Twohey, Dawson Engler, and Madanlal Musuvathi, Using Model Checking to Find Serious File System Errors
Ken Ashcroft, Dawson Engler, Using Programmer Written Compiler Extensions to catch security holes
Mark Sullivan1, Ram Chillarege, Software Defects and their Impact on System Availability - A Study of Field Failures in Operating Systems
Michael Martin, Benjamin Livshits, Monica S. Lam, Finding Application Errors and Security Flaws Using PQL: a Program Query Language
Nick Rutar, Christian B. Almazan, Jeffrey S. Foster, A Comparison of Bug Finding Tools for Java
Shan Lu, Zhenmin Li, Feng Qin, Lin Tan, Pin Zhou and Yuanyuan Zhou, BugBench: Benchmarks for Evaluating Bug Detection Tools
Sudheendra Hangal, Monica S. Lam, Tracking Down Software Bugs Using Automatic Anomaly Detection
Vipindeep V, Pankaj Jalote, List of Common Bugs and Programming Practices to avoid them
William R. Bush, Jonathan D. Pincus and David J. Sielaff, A static analyzer for finding dynamic programming errors<
Yichen Xie and Dawson Engler, Using Redundancies to Find Errors