Week 2: Defenses against low-level attacks

We continue our discussion of low-level software security by understanding ways to defend against memory-based attacks like buffer overflows and format string attacks, introduced last week.

Defenses fall into two categories: (1) automatic, and (2) manual (based on disciplined programming styles). We will also look at a sophisticated attack, called return-oriented programming, that aims to overcome some of the automatic defenses, as well as an experimental defense against it. In the end, the surest defense against low-level attacks is to program in a memory-safe (or, better yet, a type-safe) programming language, in situations where that's possible.

Learning Objectives

After the completion of this week's material, you will be able to:

Video Lectures

Required Readings

The following two blog posts cover the topics of memory safety and type safety in somewhat greater depth.

Quiz

The quiz for this week covers all of this week's material.

Supplemental readings and links

The following readings are optional: Check them out if you are interested in learning more about material we've covered in lecture (many were explicitly linked in the lecture slides).

Attacks and modern defenses, generally

Return-oriented Programming (ROP)

Control-flow integrity (CFI)

Secure coding

These are a few references linked in the lecture slides. We will cover secure coding and design in more depth during week 4.

Project

There is no new project this week. Don't forget to complete Project 1 on exploiting buffer overflows. Take the project quiz when you have completed that project.

Notes on Course Content

Writing in October 2020, I can report that a number of things have changed since 2015. Here are some of them.

Enforcing Memory Safety

In the lecture, I wrote "coming soon, Intel MPX!" Since then, Intel MPX has come ... and gone. As summarized on the Wikipedia MPX page, "In practice, there have been too many flaws discovered in the design for it to be useful, and support has been deprecated or removed from most compilers and operating systems." As a specific example, the gcc compiler used to provide a compiler extension that could take advantage of the MPX instructions, but that support has now been deprecated. A key problem is that the extra hardware doesn't actually improve performance over software-only implementations, and in some cases performance could be worse!

The CHERI (Capability Hardware Enhanced RISC Instructions) project is a more promising alternative. It defines a set of hardware extensions that provide capabilities, which can be used for enforcing memory safety. Initial designs targeted just spatial safety, but later work targeted temporal safety as well. Ongoing effort has focused on developing a C compiler that targets CHERI.

In late 2015 Microsoft began developing Checked C, an extension to C that aims to ensure spatial safety (and, to a degree, type safety); it has since been forked and is run by an independent foundation. Checked C is implemented as an extension to the open-source Clang/LLVM compiler; the compiler inserts run-time checks when needed to enforce safety. Recent work (e.g., a partial refactoring of the FreeBSD operating system to use Checked C) shows such checks to be relatively inexpensive, adding overhead of around a few percent. Of course, the compiler is still under development, so things may change. I think very highly of this effort, so I am working with the Checked C team both to design the language and to develop tools to automatically migrate legacy C code to Checked C.
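
To give a flavor of the language, here is a small sketch of Checked C's annotations, based on the published Checked C specification (the function is my own illustrative example, and since the language is still evolving, details may differ; it compiles only with the Checked C compiler):

```c
#include <stddef.h>

// _Array_ptr<int> is a checked pointer type whose bounds are given by the
// count(n) expression. Each access a[i] is either proved in bounds at
// compile time or guarded by a compiler-inserted run-time check that
// halts the program on failure.
int sum(_Array_ptr<int> a : count(n), size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];   // checked access
    return total;
}
```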

Another mature effort to enforce partial memory safety is Address Sanitizer (ASAN); however, the checks it inserts make ASAN-ized code much slower (e.g., a 2x slowdown), so it would not normally be used in production.
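
For example, here is a minimal (hypothetical) program with a heap buffer overflow of the kind ASAN flags at run time:

```c
#include <stdlib.h>

int main(void) {
    int *a = malloc(8 * sizeof(int));
    a[8] = 42;   // writes one element past the end of the allocation
    free(a);
    return 0;
}
```

Compiled with clang -fsanitize=address -g and run, the instrumented binary aborts at the bad write and prints a heap-buffer-overflow report pinpointing the offending line.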

Enforcing Type Safety

In the lecture, I said that modern languages are emerging that aim to ensure type safety while also providing good performance. I mentioned Go, Rust, and Swift, in particular. Since 2015, all three of these languages have become better developed and more popular. Indeed, I would say that Rust has emerged as a strong contender to be the "go to" safe systems programming language. Rust's notion of type safety implies not only memory safety, but also freedom from data races. Data races are defects in concurrent programs that are difficult to find and debug and that can have security implications; Rust's type system ensures their absence. Developers love these benefits, among others: according to Stack Overflow's annual poll of developers, Rust has been the year's "most loved" language for five years in a row!
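
To make the data-race point concrete, here is a small C program (my own illustrative sketch) containing a classic race: two threads increment a shared counter without synchronization, so updates can be lost and the final value is unpredictable. The direct Rust translation, sharing a mutable integer across threads without a Mutex or an atomic, is rejected at compile time.

```c
#include <pthread.h>
#include <stdio.h>

static long counter = 0;

static void *work(void *arg) {
    for (int i = 0; i < 1000000; i++)
        counter++;   // unsynchronized read-modify-write: a data race
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, work, NULL);
    pthread_create(&t2, NULL, work, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    // Rarely prints 2000000: increments from the two threads get lost.
    printf("counter = %ld\n", counter);
    return 0;
}
```

(Compile with gcc -pthread; Clang's ThreadSanitizer, enabled with -fsanitize=thread, will also flag the race dynamically.)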

Avoiding Exploitation

In the lecture, I talked about several defenses that aim to make memory safety bugs harder to exploit, e.g., address space layout randomization (ASLR) and stack canaries. These defenses are still relevant today, but have evolved while facing new threats.

With the Clang/LLVM compiler, stack canaries are enabled with the -fstack-protector flag. Clang/LLVM also provides another defense, called shadow stacks, which has the same goal as stack canaries but is more dependable, while still having low overhead. The basic idea is to move critical metadata, notably return addresses, to a separate stack so that it will not be overwritten by a stack-based buffer overflow. The Clang/LLVM compiler's -fsanitize=safe-stack flag enables shadow stacks (called "safe stacks"); the gcc compiler supports them as well.
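
As a concrete (hypothetical) target, consider the classic stack-smashing victim below; both of the defenses just described aim to keep its saved return address from being abused.

```c
#include <string.h>

void copy(const char *src) {
    char buf[16];
    strcpy(buf, src);   // no bounds check: a long src overwrites the saved
                        // return address that sits above buf on the stack
}

int main(int argc, char **argv) {
    if (argc > 1)
        copy(argv[1]);
    return 0;
}
```

Compiled with clang -fstack-protector, an overflowing input corrupts the canary and the program aborts with a "stack smashing detected" error before the corrupted return address can be used; compiled with clang -fsanitize=safe-stack, the return address lives on a separate stack that an overflow of buf cannot reach.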

ASLR is widely deployed, e.g., on Linux, macOS, iOS, and Windows (though on Windows it is turned off by default). ASLR is effective when the base addresses of randomized process memory segments are hard to guess. A common part of an attack against an ASLR-protected system is to exploit a bug that leaks a base address. Software and hardware side channels, e.g., due to caches, are a vector for such leaks. Indeed, as mentioned last week, the Spectre and Meltdown bugs have shown that hardware-based side channels are a significant threat. Moreover, recent work has shown that information disclosure is not always needed to defeat ASLR. As such, it remains to be seen whether ASLR will remain an effective defense in the future.
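
You can observe ASLR directly with a small program (a sketch of my own) that prints an address from each memory segment; run it twice and, with ASLR on, the printed values differ between runs.

```c
#include <stdio.h>
#include <stdlib.h>

int global;                        // data segment

int main(void) {
    int local;                     // stack
    void *heap = malloc(1);        // heap
    printf("code:  %p\n", (void *)main);
    printf("data:  %p\n", (void *)&global);
    printf("stack: %p\n", (void *)&local);
    printf("heap:  %p\n", heap);
    free(heap);
    return 0;
}
```

(Build it as a position-independent executable, e.g., gcc -pie -fPIE, so that the code and data segments are randomized too; many Linux distributions do this by default.)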

Control Flow Integrity (CFI)

In 2015, CFI was being seriously explored as a viable defense. Full CFI, as originally proposed, added too much performance overhead, so researchers were exploring less expensive alternatives that offer weaker protection. I updated the links in the supplemental material, above, to discuss CFI a bit more.

The Clang/LLVM compiler supports CFI in various forms; Trail of Bits has written a nice tutorial about it. One variant is forward-edge only CFI (FO-CFI), which does not protect returns via addresses on the stack, but does protect calls via function pointers and virtual method tables. (Shadow stacks could be enabled, at low cost, to provide protection for return addresses.) For FO-CFI, "a performance overhead of less than 1% has been measured by running the Dromaeo benchmark suite against an instrumented version of the Chromium web browser." Mathias Payer also did a detailed study of Clang/LLVM CFI, and found per-dispatch overheads at around 20% (but forward-edge dispatches are relatively infrequent operations in the code). The Chrome team planned to include Clang's CFI protections in the browser. (Note that clang is used to build Chrome for Windows, too.)
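
To see what forward-edge CFI checks, consider this (hypothetical) program, which makes an indirect call through a function pointer cast to an incompatible type, exactly the kind of control-flow transfer Clang's cfi-icall scheme rejects:

```c
#include <stdio.h>

static int add_one(int x) { return x + 1; }

int main(void) {
    // Calling through an incompatible function-pointer type is undefined
    // behavior in C; under Clang CFI the program aborts at the call site
    // because the target's dynamic type does not match the pointer's type.
    void (*fp)(void) = (void (*)(void))add_one;
    fp();
    return 0;
}
```

Building it requires link-time optimization and hidden visibility, e.g., clang -flto -fvisibility=hidden -fsanitize=cfi; the Trail of Bits tutorial mentioned above walks through these options.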

The flip side of the lowered overhead is lowered protection. How low is too low? Muntean et al. developed an empirical framework for evaluating the different levels of protection offered by CFI; it aims to measure the attack surface that remains in a particular application after CFI protection is applied. For example, it determines the number of allowed targets per call site under each CFI policy (fewer is better). These attack surfaces can then be compared to rank the defense each policy offers against its overhead.