Software Security On-line Course

This course, originally developed in late 2014 and early 2015 for Coursera as part of the University of Maryland's four-course specialization on cybersecurity, explores the foundations of software security. The course is old enough now that I don't feel comfortable charging money for it, so it is hosted here for free. See below for more on the current state of the course content and an invitation for contributions. All of the videos are collected on a YouTube channel, Software Security.

The course considers important software vulnerabilities and attacks that exploit them, such as buffer overflows, SQL injection, and session hijacking. The course also considers defenses that prevent or mitigate these attacks, including advanced testing and program analysis techniques. We take a "build security in" mentality, considering techniques at each phase of the development cycle that can be used to strengthen the security of software systems.

A video overview of the course is presented in the following lectures. The first two present an overview of cybersecurity generally, and software security in particular. The last is an overview of the course content, and a description of the learner's expected background.

Introducing computer security (6:17)
What is software security? (7:48)
Tour of the course and expected background (11:37)

Topics

The content of the course is as follows, with one topic per week linked to the relevant material:

Low-level, memory-based attacks, including stack smashing, format string attacks, stale memory access attacks, and return-oriented programming (ROP)
Defenses against memory-based attacks, including stack canaries, non-executable data (aka W^X or DEP), address space layout randomization (ASLR), memory-safety enforcement (e.g., SoftBound), and control-flow integrity (CFI)
Web security, covering attacks like SQL injection, cross-site scripting (XSS), cross-site request forgery (CSRF), and session hijacking, and defenses that have in common the idea of input validation
Secure design, covering ideas like threat modeling and security design principles, including organizing ideas like favor simplicity, trust with reluctance, and defend in depth; we present real-world examples of good and bad designs
Automated reasoning for code security, presenting foundations and tradeoffs and using static taint analysis and whitebox fuzz testing as detailed examples
Penetration testing and fuzz testing, presenting an overview of goals, techniques, and tools of the trade
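
As a small illustration of the web security topic above, here is a minimal sketch of SQL injection and one common defense. It is not taken from the course materials: the table layout, the function names (make_db, lookup_vulnerable, lookup_parameterized), and the choice of Python's built-in sqlite3 module with an in-memory database are all assumptions made for the example. The defense shown is a parameterized query, which keeps untrusted input out of the SQL text entirely; it stands in for the broader family of input-handling defenses.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_vulnerable(conn, name):
    # UNSAFE: the untrusted string is spliced directly into the SQL text,
    # so quote characters in `name` can change the structure of the query.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_parameterized(conn, name):
    # Safer: the `?` placeholder makes sqlite3 treat `name` purely as data.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


if __name__ == "__main__":
    conn = make_db()
    attack = "nobody' OR '1'='1"
    # The injected OR clause is always true, so every row matches.
    print(sorted(lookup_vulnerable(conn, attack)))
    # The same string, passed as a parameter, matches no user name.
    print(lookup_parameterized(conn, attack))
```

With the attack string, the vulnerable lookup returns every user's secret, while the parameterized lookup returns nothing because no user has that literal name.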

We have put together a glossary of terms that you might find useful while going through the course.

Projects and Assessment

There is a quiz for each week's materials. Each quiz is stored in a Microsoft Word .docx file, which includes both the questions and the answers. (I didn't find time to split them apart.)

The course also has three projects.

Buffer overflow attacks: The lab walks you through how a buffer overflow occurs, and how it can be exploited.
Web application security: The lab asks you to find and exploit common vulnerabilities in web applications, like SQL injection and cross-site scripting.
Static analysis for finding security bugs: The lab will give you some experience using tools that aim to find security flaws automatically.

To do the projects, you will need to install the VirtualBox virtual machine management software (which is free) on your own computer. The projects are distributed as VM images running Ubuntu Linux; you will install these images and run them to do the projects. Detailed instructions for doing so are on the relevant project pages.

Expected Background

Successful learners in this course typically have completed sophomore/junior-level undergraduate work in a technical field, have some familiarity with programming, ideally in C/C++ and one other "managed" programming language (like Python or Java), and have prior exposure to algorithms. Students who are not familiar with these languages but know others can improve their skills through online tutorials.

To test whether you have sufficient background to take the course, we recommend that students take the qualifying quiz. Students who score less than 70% on the quiz may have difficulty with the technical concepts assumed by the presentations. (Note that the quiz does not count toward your final grade.) Much of the quiz is about C programming; most of the code in the course will be presented in C. For the first two weeks, this is critical: the low-level errors and defenses against them are germane to C and low-level programming. Students comfortable in other languages may be able to learn the necessary C concepts on their own. One book I recommend is K.A. Reek's Pointers on C. Several previous students of this course recommended the free book Learn C the Hard Way.

Note about course content, and call for corrections

This course was originally put together in late 2014 and early 2015. Fortunately (for learning, not for security!), much of the course's core content is still relevant as of early 2024: the examples and some details are now nearly 10 years old, but the same sorts of vulnerabilities, attacks, defenses, methodologies, and technologies still apply.

Some readings, such as this one, have some added discussion and links to newer versions, or generations, of tools and technologies, and/or writeups about them. The first such updates were made in 2020, and some of these links may be dead or out of date. I welcome corrections and contributions! If you come across a dead or wrong link, or would like to suggest some commentary to add, please either e-mail me (mwh@cs.umd.edu) or, better yet, make the suggested change and submit a pull request on GitHub, where these course materials are hosted.

Errata

There are some small mistakes in the video lectures for the course, identified below.

Week 1

Other memory exploits lecture: Slide 5 (titled Heap overflow variants) in the recorded lecture implies that all C++ objects have vtables, but in fact only those with virtual methods do. This point has been corrected in the (downloadable) PDF version of the slides.

Week 2

The Week 2 intro slide implies that ROP bypasses stack cookies and ASLR. The discussion was meant as a transition to thinking about more sophisticated attacks, but it ended up implying that ROP is more powerful than it is. ROP is a way to overcome defenses against jumping to particular libc functions, by stringing together code fragments rather than reusing whole functions. It does not bypass stack cookies or ASLR.
There is a typo in the secure coding lecture, "Design vs Implementation": the word "consonant" should be "consistent". This is fixed in the PDF slides.

Week 5

Taintedness as a lattice. In the flow analysis lecture, at time 5:00, Mike misspeaks and says that tainted is less than untainted, which is wrong; it's the other way around (untainted is less than tainted). Note that he corrects himself immediately afterwards, and the slides are correct.
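
To make the corrected ordering concrete, here is a minimal sketch of a two-point taint lattice in which untainted is less than tainted. It is an illustration only, not code from the lecture: the names Taint, leq, join, and flow_allowed are hypothetical, and the flow rule shown (data may flow to a position only if its label is less than or equal to the label that position accepts) is the usual convention for this kind of analysis, assumed here for the example.

```python
from enum import IntEnum


class Taint(IntEnum):
    # The numeric values encode the lattice order: UNTAINTED < TAINTED.
    UNTAINTED = 0
    TAINTED = 1


def leq(a, b):
    """Lattice ordering: True when label a is less than or equal to b."""
    return a <= b


def join(a, b):
    """Least upper bound: mixing in any tainted value gives tainted."""
    return max(a, b)


def flow_allowed(source, sink):
    """Data labelled `source` may flow where `sink` is accepted
    only if source <= sink in the lattice."""
    return leq(source, sink)


if __name__ == "__main__":
    # Untainted data may be used where tainted data is tolerated...
    print(flow_allowed(Taint.UNTAINTED, Taint.TAINTED))
    # ...but tainted data must not reach a position requiring untainted data.
    print(flow_allowed(Taint.TAINTED, Taint.UNTAINTED))
```

Under this ordering, an untainted value can always be treated as tainted (a safe over-approximation), but never the reverse, which is why getting the direction right matters.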