Falcon 0.1.0
Today marks the 0.1.0 release of Falcon. This is Falcon’s first release!
Falcon is a Binary Analysis Platform written in Rust, and released under the Apache License 2.0. It provides the building blocks for implementing analyses such as Symbolic Execution and Abstract Interpretation over binary programs.
Documentation can be found here.
What’s in this release
An expression-based IL
Falcon uses yet-another-IL, called Falcon IL, for analysis. Falcon IL is a cross between RREIL and Binary Ninja’s LLIL. The standard hierarchy of structures in Falcon IL are:
- Program - Contains many Functions.
- Function - Contains a ControlFlowGraph.
- ControlFlowGraph - Contains Blocks (and Edges, not listed here).
- Block - A linear sequence of Instructions.
- Instruction - Provides location to an Operation.
- Operation - Implements semantics of a program, modifies state. The six operations are
assign
,store
,load
,brc
,phi
, andraise
. - Expression / Scalar / Array / Constant - Building blocks of an Expression.
Elements of the Falcon IL can be grouped into two broad categories: Control-Flow and Semantics. This is a confusing, and perhaps contradictory statement, but best illustrated by comparing Function
with ControlFlowGraph
, or Instruction
with Operation
.
A Function
provides a location for a ControlFlowGraph
in a Program
. When we are searching for new locations to analyze, we search for a Function
, not a ControlFlowGraph
. However, when we analyze the semantics of a program, we analyze the ControlFlowGraph
, not the Function
.
Likewise, an Instruction
provides a location, or position, within a linear sequence in a Block
. The semantics of an Instruction
are implemented in an Operation
. A Symbolic Execution Engine, for example, can be broken into two parts: One that handles Control-Flow, and one that applies semantics over a symbolic state.
I will write more about Falcon IL in the coming weeks, but I have spent time documenting its internals with Rust’s inline documentation.
X86 Translation
Falcon uses capstone for instruction decoding, and lifts X86 programs to Falcon IL. The list of supported instructions can be found here.
Of note:
- Falcon IL does not yet support values >64 bits, though this is coming, and we therefor are not lifting SSE/AVX instructions.
- Falcon IL has no means of dealing with floating-point instructions. Floating-point instructions are translated to a Raise instruction. This allows you to translate 32-bit libraries which have floating-point instructions, but you can safely ignore them unless encountered during analysis.
A ControlFlowGraph
in Falcon has optional conditional edges. Direct conditional branches, such as je
, do not emit Operation::Brc
(A conditional-branch instruction), but instead place conditions over edges between successor blocks.
A Symbolic Execution Engine
Falcon comes with the building blocks for symbolic execution. An example which ties these blocks together can be found in the test for simple-0, which symbolically executes a contrived x86 Linux application, solves for the inputs, and verifies the results. This takes about a second, so I have made end-to-end symbolic execution a built-in test that takes place on every push to github.
Falcon’s symbolic execution underpinnings are currently using z3 for smt solving. However, Falcon produces smtlib2-formatted input instead of using the z3 API directly, and I have attempted to ensure the solver can be swapped out as needed.
Loading and linking of Elf Binaries
Falcon’s loader currently supports the Elf file format. Falcon will discover all functions for which symbols exist, and allows the user to specify additional function entries as needed. Falcon currently has no means to resolve indirect branches statically, so jump tables and other indirect branches will pose a hurdle for Falcon in this release.
Additionally, in order to symbolically execute our simple-0 example, Falcon can resolve dependencies and relocations between Elfs. This allows us to provide a ld-linux.so and libc.so from an Ubuntu 16.04 i386 environment, and leave Falcon to link these together for static symbolic execution.
The beginnings of platform modelling
Falcon provides a means for modelling platforms. Currently we provide a model only for LinuxX86, with one system call, to work around contrived examples. See the linux_x86
module for an example of what this looks like.
Unsupported: Data-Flow Analysis and Abstract Interpretation
Falcon provides a fixed-point engine for Data-Flow analysis. Implemented with this engine are analyses and transformations, including reaching definitions, UseDef/DefUse chains, dead code elimination, single static assignment, and a intra-procedural value set analysis.
This code requires refactoring, testing, and is currently unsupported. However, here is an example of the main
function from the simple-0
example lifted, with dead code elimination and SSA applied:
Using Falcon
The see the requirements for using Falcon, please refer to its Dockerfile. Dependencies are required for bindgen to link against capstone, as well as a requirement for the z3 solver. The Dockerfile will get you up and running in Ubuntu Xenial, though I do the majority of my development in OSX.
Falcon is just a library, with no command-line interface. You can run the tests with cargo test
, or begin playing around with the API.