C static analysis tools often use intermediate representations (IRs) that organize program data in a simple, well-structured manner. However, the C parsers that create IRs are slow, and because they are difficult to write, only a few implementations exist, which limits the languages in which a C static analysis can be written. To solve these problems, we investigate two language-independent, on-disk representations of C IRs: one using XML, and the other using an Internet standard binary encoding called XDR. We benchmark the parsing speeds of both options and find that parsing the XML is about a factor of two slower than parsing C, while parsing the XDR is over six times faster. Furthermore, we show that the XML files are far too large, at 19 times the size of the C source code, while the XDR files are only 2.2 times the C size. We also demonstrate the portability of our XDR system by presenting a C source code querying tool written in Ruby. Our solution and the insights we gained from building it will be useful to analysis authors and other clients of C IRs. We have made our software freely available for download at http://www.cs.umd.edu/projects/PL/scil/.
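To make the XDR option more concrete, the following is a minimal sketch of how a single IR record could be written and read back using XDR's basic rules: big-endian 4-byte integers, and strings stored as a 4-byte length followed by the bytes, zero-padded to a multiple of four. The record layout (a variable name, a numeric type tag, and a line number) and all function names are hypothetical and chosen only for illustration; they are not the schema or API of the software described above.

```python
import struct


def xdr_uint(n):
    """Encode an unsigned integer as 4 big-endian bytes."""
    return struct.pack(">I", n)


def xdr_string(s):
    """Encode a string as a 4-byte length, the bytes, then zero padding to a 4-byte boundary."""
    data = s.encode("ascii")
    pad = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def encode_var_decl(name, type_tag, line):
    """Serialize a hypothetical variable-declaration record (not the paper's actual layout)."""
    return xdr_string(name) + xdr_uint(type_tag) + xdr_uint(line)


def decode_var_decl(buf):
    """Read back the hypothetical record produced by encode_var_decl."""
    (length,) = struct.unpack_from(">I", buf, 0)
    name = buf[4:4 + length].decode("ascii")
    # Skip the string bytes and their padding to reach the two integers.
    offset = 4 + length + (4 - length % 4) % 4
    type_tag, line = struct.unpack_from(">II", buf, offset)
    return name, type_tag, line


if __name__ == "__main__":
    record = encode_var_decl("count", 1, 42)
    print(len(record), decode_var_decl(record))
```

Because every field has a fixed size or an explicit length prefix, a reader can decode such a record with simple offset arithmetic in any language that can unpack big-endian integers, without tokenizing text as an XML parser must.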
@article{meister10cir,
  author = {Jeffrey A. Meister and Jeffrey S. Foster and Michael Hicks},
  title = {Serializing {C} intermediate representations for efficient and portable parsing},
  journal = {Software, Practice, and Experience},
  month = feb,
  volume = 40,
  number = 3,
  pages = {225--238},
  http = {http://www.cs.umd.edu/projects/PL/scil},
  year = 2010
}