Week 5: Automated Reasoning for Code Security

Last week we looked at how security fits into the various activities of the development process. Up to this point we have generally looked at particular software flaws and bugs and how to avoid them through good design and implementation practices.

This week we will look at the testing and validation phases of software development, and in particular how automated reasoning tools can assist developers and testers in finding important security bugs. We will focus on two technologies: static analysis and symbolic execution.

Learning Objectives

After the completion of this week's material, you will:

Know what static analysis (SA) and symbolic execution (SE) are, how they compare, and why they are hard
Understand the basics of each approach
Understand how to improve the precision and scalability of each approach
Be aware of several commercial and open-source SA and SE tools, and how they compare

Video Lectures

This week we cover two code analysis technologies: static analysis, and symbolic execution. The latter is often used as the core of a penetration testing technology called whitebox fuzz testing; we will discuss other fuzz testing techniques next week.

Static Analysis

Static Analysis: Introduction part 1 (5:10)
Static Analysis: Introduction part 2 (8:12)
Flow Analysis (8:44)
Flow Analysis: Adding Sensitivity (8:57)
Context Sensitive Analysis (8:53)
Flow Analysis: Scaling it up to a Complete Language and Problem Set (11:40)
Static Analysis: Challenges and Variations (8:01)

Symbolic Execution

Introducing Symbolic Execution (10:52)
Symbolic Execution: A Little History (3:05)
Basic Symbolic Execution (14:17)
Symbolic Execution as Search, and the Rise of Solvers (12:45)
Symbolic Execution Systems (8:26)

Break out: Interview with Andy Chou

In August 2014, Mike had the pleasure of interviewing Andy Chou. In 2003, Andy co-founded Coverity, a company that developed static analysis (and other) tools for finding software defects. Coverity's products started from Andy's Ph.D. research at Stanford, and grew as the company grew; the organization was at 300 employees when it was
acquired by SynopSys in 2014. In this interview we discussed principles and practice of static analysis, and the path towards making a practical tool that is now in wide use. Like last week, the interview is optional from an assessment perspective -- there will no quiz questions on it. We hope you find it interesting!

Mike Hicks interviews Andy Chou (32:31). Highlights, indexed by time:

start - Intro to static analysis, and different sorts of analysis
5:52 - Static analysis's place in secure development practice; tradeoffs
9:33 - Coverity Scan open source analysis: what it is, and its impact
11:07 - Static analysis challenges with particular languages
13:11 - Dynamic analysis: a complement to static analysis
17:35 - The process of choosing which analyses go into your tools
21:16 - How the market for static analysis has matured over the last 15 years
25:08 - More challenges for use and development of static analysis; looking ahead
29:19 - Free service Code Spotter; eating your own dogfood; closing thoughts

Supplemental Links

Here is some additional reading material, if you are interested in learning more.

Scaling Static Analyses at Facebook (August 2019) - An article explaining how Facebook uses various kinds of static analysis heavily in their development process.
Lessons from Building Static Analysis Tools at Google (April 2018) - How Google is using static analysis for software development
A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World (February 2010) - A classic article by the Coverity team on their experience writing and deploying a commercial static analysis tool
What is noninteference, and how do we enforce it? - A blog entry Mike wrote about using static analysis to find information leaks; relates to the lecture material on implicit flows and tainting analysis
All You Ever Wanted to Know About Dynamic Taint Analysis and Forward Symbolic Execution (but might have been afraid to ask) - mathematical presentation of these analyses and their variations

Here are some links to static analysis and symbolic execution tools you might find interesting.

Static analyzers

Clang Analyzer - A static analyzer built on the LLVM framework for C and C++ programs
Infer - A static analyzer for a variety of languages, including C, C++, Objective C, Java, and more.
Joern - A static analyzer for C and C++ programs
ErrorProne - A static analyzer for Java (the moral successor to FindBugs)

Symbolic executors

KLEE - A symbolic executor for LLVM bitcode (primarily supporting C and C++)
Java Symbolic Pathfinder - A symbolic executor for Java bytecode
Triton - A symbolic execution library and framework for x86 executables
angr - A combination symbolic executor and static analyzer for x86 executables

SMT solvers

These solvers are used by various of the tools mentioned above.

Z3 - developed at Microsoft Research
CVC5 - developed at Stanford and the University of Iowa
Yices - developed at SRI

Quiz

The quiz for this week covers all of the material for this week.

Project

For this week's project you will get your feet wet using a symbolic execution tool called KLEE, and comparing its performance to a black box fuzzing tool called radamsa (which we will learn more about next week). When you finish it, take the on-line assessment, due in three weeks.