Introducing Finch

2018/09/09

Note: This blog post is based on Falcon 0.4.2 and Finch revision 5ad7196f30.

Finch is a Symbolic Executor built on top of Falcon. With Finch, you can lift programs into Falcon IL, and then symbolically execute those programs to solve for certain properties.

It’s important to note that Finch is not a turn-key solution for Symbolic Execution. While I have used Finch successfully for some varying applications, this always requires both reverse-engineering of the target binary, and writing Finch-specific Rust code.

Additionally, you’ll notice lots of other features in the Finch codebase. As of September 2018, you should ignore most of these. Yes, there’s a MIPS Linux userland environment in there, and a command-line interface. For now, I encourage you to use Finch as a library until the other functionality is fleshed out and stable.

Falcon/Finch Concepts

To understand and use Finch, you first need to understand some Falcon basics. We will start by running through those.

Falcon IL crash course

Falcon IL models both the semantics of individual assembly instructions and the control-flow of programs. When using Falcon, we normally lift entire functions at once1. Falcon IL works at the, “Program,” level, and it is no surprise that the top-level container in Falcon IL is the il::Program. A Falcon Program contains many functions.

A function in Falcon IL, il::Function, provides, “Position,” to an il::ControlFlowGraph, which holds the semantics. In addition to holding a control-flow graph, a function holds the address at which the function was lifted, a name for the function, and, when added to a program, an index for the function which uniquely identifies it within the program.

An il::ControlFlowGraph implements a Graph over il::Edge and il::Block. An il::Edge is a directed edge with an optional condition, which, when present, guards the edge. An il::Block is a sequential list of il::Instruction.

With il::Instruction, we once again visit the concept of position and semantics. Where a function provides position to a control-flow graph, an instruction provides position to an il::Operation, which holds the semantics of the instruction. Additionally, instructions have an index, which gives their position inside a Block, an optional address, and a comment for annotating Falcon IL with the results of analyses.
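
To make the hierarchy concrete, here is a minimal sketch of walking a lifted il::Program (in a variable named program) all the way down to its operations. The accessor names here (functions(), control_flow_graph(), blocks(), instructions(), operation()) are my reading of the Falcon API at this revision; check the il docs for the exact signatures.

    // Walk every function, block, and instruction in a lifted program,
    // printing the operation held by each instruction.
    for function in program.functions() {
        println!("function {} @ 0x{:x}", function.name(), function.address());
        for block in function.control_flow_graph().blocks() {
            for instruction in block.instructions() {
                // The operation holds the semantics of this instruction.
                println!("  {}", instruction.operation());
            }
        }
    }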

There are 6 types of il::Operation, covering assignment, memory stores, memory loads, branches, intrinsics, and no-ops.

Falcon is an expression-based IL, which means it operates primarily over expressions. Falcon expressions include constants, scalars, and the usual arithmetic, bitwise, comparison, extension, and truncation operations.

You can find more information on Falcon expressions in il::Expression.

The terminals of expressions are either il::Scalar or il::Constant. You can think of a scalar as simply a variable. With the coming release of further codebases, its naming as, “Scalar,” instead of, “Variable,” will make more sense.
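
As a small illustration of how expressions compose (using the same constructors that appear later in this post; I am assuming il::Expression::add follows the same pattern as il::Expression::cmpeq):

    // Build the 64-bit expression (rax + 8).
    let sum = il::Expression::add(
        il::expr_scalar("rax", 64),
        il::expr_const(8, 64)
    )?;

    // Compare it against a constant. Comparisons evaluate to 1-bit values.
    let cond = il::Expression::cmpeq(sum, il::expr_const(0x100, 64))?;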

The last part of Falcon IL is its set of location types: il::RefProgramLocation, il::RefFunctionLocation, il::ProgramLocation, and il::FunctionLocation. These constructs provide a convenient way to pinpoint a location in a Falcon program. As their names suggest to anyone familiar with Rust’s lifetimes, the RefProgramLocation and RefFunctionLocation variants hold references to functions, edges, blocks, and instructions, allowing you to reference Falcon IL directly. The ProgramLocation and FunctionLocation variants track locations in a Falcon program by index. They are faster in certain circumstances, and make certain operations possible under Rust’s borrow-checker. There are convenience functions to move between the two.

Falcon’s IL may seem complicated, but it only has 6 operation types and 22 expression types. As you continue to work with Falcon IL, it becomes increasingly intuitive. There is also a wealth of helper functions at different levels of the IL. For example, you can call il::Instruction::scalars_read(&self) -> Option<Vec<&il::Scalar>> to get all scalars read by an il::Instruction, without having to dig down to the il::Operation.
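
For instance, given an il::Instruction in a variable named instruction, something like the following collects the scalars it reads (the name() accessor on il::Scalar is my assumption here):

    // Print the name of every scalar this instruction reads.
    if let Some(scalars) = instruction.scalars_read() {
        for scalar in scalars {
            println!("{} reads {}", instruction, scalar.name());
        }
    }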

Falcon, Finch, and Executors

Falcon comes with a concrete executor, which allows you to emulate execution of Falcon IL. Besides allowing us to execute Falcon IL programs over concrete inputs, this executor serves as a working codebase to fork and build other projects on top of. Finch’s symbolic executor is based on Falcon’s concrete executor, so we’ll start by understanding Falcon’s concrete executor.

Falcon’s Executor

Falcon’s executor has two main components, executor::Driver and executor::State.

A State in Falcon’s executor contains the state of memory and scalars at a single point in program execution. A state can also take an il::Operation, and apply the semantics of that operation to itself. The result of this operation is an executor::Successor. A Successor tells us whether we should fall through to the next instruction, branch, or whether an intrinsic was encountered. Falcon’s concrete executor will return an ErrorKind::UnhandledIntrinsic error if an intrinsic is encountered during concrete execution.

A Driver, “Drives,” a State over a Falcon IL Program. A driver tracks the current Program being executed, and the current location in that program. When a Driver, “Steps,” it feeds the current Operation to its State, applying the semantics of the current operation to its state, and then moves forward in the program based on the Successor returned by the State.

When a Driver encounters an address which does not exist in its Program, it attempts to lift a Function at that address. If successful, execution continues at that address. If not, an error is returned.

Finch’s Symbolic Executor

Finch’s symbolic executor builds on Falcon’s executor.

First, where Falcon’s executor works on values of type il::Constant, Finch’s symbolic executor works on values of type, il::Expression. This is reflected in the values held in State for scalars, as well as the memory model. Finch’s State also holds a set of path-constraints, which are of type il::Expression.

Whereas Falcon returns a single State each time the semantics of an Operation are evaluated over a State, and a single Driver each time a Driver is stepped forward, Finch returns zero-to-many States and Drivers from these operations. This allows Finch to, “Fork,” symbolic state.

Finch’s State is tied in with bindings to the Z3 Theorem Prover via falcon-z3. This allows Finch’s State to evaluate symbolic expressions. When solving expressions with falcon-z3, Finch enforces its path-constraints as assertions which must hold true, and then evaluates expressions against its state. If this sounds confusing, don’t worry too much about it for now.
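
As a quick preview (these are the same State methods used in the example later in this post), adding a path constraint and asking for a satisfying value look roughly like this, given a Finch State in a variable named state:

    // Require the symbolic input byte "i-0" to equal 0x41 on this path.
    state.add_path_constraint(
        &il::Expression::cmpeq(
            il::expr_scalar("i-0", 8),
            il::expr_const(0x41, 8)
        )?
    )?;

    // Ask Z3 for a value of "i-1" which satisfies every path constraint.
    let value = state.eval_and_concretize(&il::expr_scalar("i-1", 8))?;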

As you work with Finch, you’ll also notice Finch’s Driver and State take a Platform. A Platform models a runtime environment, such as Linux. This works by handling Intrinsic operations emitted by Falcon’s lifters. Platforms are still under development, and we will be using the Dummy Platform in our examples. The Dummy platform does not handle any Intrinsic operations.

There is more under the hood with Finch’s symbolic executor. For example, Finch makes use of hash-consing to keep expressions small in memory. Finch also allows for state-merging, where two symbolic states are combined into one when certain conditions are met. However, if you understand the basics of Driver and State, you’re good to begin working with Finch.

An Example Program

We are going to cover all of the concepts and steps necessary to bootstrap an application for symbolic execution with Finch. I do not repeat these steps every time I wish to symbolically execute some code. Instead, I take a working codebase, rip out the parts I need, and put together a target-specific solution.

At the conclusion of this example, you will understand the basics of working with Finch. You can then do the same, rip apart examples and repurpose them to fit your needs.

Yes, this example challenge can be solved quite easily without symbolic execution, but that’s not the point of this tutorial :).

This tutorial is available at finch-examples/example-0.

Getting the environment set up correctly.

Finch requires Falcon and z3, as well as falcon-z3, the Falcon bindings for z3. This means, in addition to Rust, you will need a working build environment, clang for bindgen, the correct version of z3 installed, and the capstone development libraries. If you are familiar with Docker, a Dockerfile is available which sets up this environment for you. If you don’t want to use Docker, I would still recommend using the Dockerfile as a reference for how to set up and configure the dependencies.

The environment may take a while to set up. If you want to follow along with this example, I recommend you begin building the environment while you continue to read.

Last, Finch does not play well with macOS. I strongly recommend you run Finch in a Linux environment.

The example program

In this example, we will imagine we are given a binary which deobfuscates an input and then executes that input as shellcode. For the purposes of this example, we’ll start with the C code which implements this simple deobfuscation.

#include <inttypes.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(int argc, char * argv[]) {
    // Read in one byte all shellcode will be XORed with
    uint8_t xor_byte;
    if (read(0, &xor_byte, 1) != 1) { return -1; }

    // Create some RWX memory
    uint8_t * executable_memory = mmap(0,
                                       4096,
                                       PROT_READ | PROT_WRITE | PROT_EXEC,
                                       MAP_ANONYMOUS | MAP_PRIVATE,
                                       -1,
                                       0);

    // Read in shellcode
    ssize_t bytes_read = read(0, executable_memory, 4096);

    // Deobfuscate shellcode
    ssize_t i;
    for (i = 0; i < bytes_read; i++) {
        executable_memory[i] ^= xor_byte;
        if (i + 1 < bytes_read)
            executable_memory[i] ^= executable_memory[i + 1];
        if (i > 0)
            executable_memory[i] ^= executable_memory[i - 1];
    }

    // Run shellcode
    ((void (*) ()) executable_memory)();

    // Return cleanly if shellcode doesn't crash/exit
    return 0;
}

Once this program is compiled, we need to open it in a disassembler (I recommend Binary Ninja), and determine where we want our symbolic execution to take place. If we open up our main function in Binary Ninja, we will see the following:

Main function in Binary Ninja

We will start immediately following the second call to read (highlighted in blue), and stop immediately before our shellcode is executed (highlighted in red). These addresses are 0x64d and 0x693, respectively.

In order, these are the steps we need to accomplish:

  1. Load our target binary into Finch, and initialize state appropriately.
  2. Symbolically execute up until our shellcode is called.
  3. Assert the memory which holds our shellcode to be equal to the shellcode we wish to execute.
  4. Have Finch solve for the inputs which result in our desired shellcode.

Load our target binary into Finch

The easiest way to load a Program for Finch, and find our initial il::ProgramLocation, is using Falcon’s loaders.

    let elf = falcon::loader::Elf::from_file(path)?;

Now we need to initialize state. This can be an involved process. We’ll start by initializing memory.

    let mut memory =
        Memory::new_with_backing(elf.architecture().endian(),
                                 falcon::RC::new(elf.memory()?));

Falcon IL does not explicitly encode the endianness of operations, so we need to inform our memory model with which endianness it should operate. Additionally, we can pass an optional memory backing. While the backing is not required for our example, it is a required step if you need Finch to lift new functions at runtime.

I won’t be going into detail with Falcon’s memory models, but suffice it to say Finch and Falcon memory-models can take a memory backing from the loader. Memory backings allow for fast and efficient forking (.clone() in Rust) of memory models. Normally we will supply this memory backing to our more comprehensive memory model, and this will allow a Driver to lift instructions, and load values from memory, as provided by Falcon’s loaders.

You will also notice the use of falcon::RC here. When Falcon is compiled with the option thread_safe, this is of type Arc, and when not compiled with thread_safe, this is Rc. Rc is faster than Arc, but Arc is required to use Falcon in multi-threaded applications. Always use falcon::RC in these cases, and your code will be thread-safe when it needs to be. Thanks to Rust’s compile-time checks, this is hard to mess up, and if our programs compile, we gain the thread-safety guarantees of Rust.
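
Under the hood, falcon::RC is just a type alias selected by the thread_safe feature, roughly like this (a paraphrase, not a verbatim copy of the Falcon source):

    // With the "thread_safe" feature enabled:
    pub type RC<T> = std::sync::Arc<T>;

    // Without it:
    // pub type RC<T> = std::rc::Rc<T>;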

    const STACK_BASE: u64 = 0xb000_0000;
    for i in 0..4096 {
        memory.store(STACK_BASE + i, &il::expr_const(0, 8))?;
    }

Though not always required, it’s generally a good idea to initialize a memory area where our stack will live.

    const STACK_POINTER: u64 = 0xb000_0800;
    memory.store(STACK_POINTER + 0x7, &il::expr_scalar("i-0", 8))?;

We know our program will read the first byte of our input in a separate call to read(), and store it at [rsp+0x7]. We’ll pick an initial address for our stack pointer, and store our first byte of input at that address.

    const SHELLCODE_BASE: u64 = 0xa000_0000;
    for i in 0..4096 {
        // Remember, our first byte of input is on the stack, so we'll name
        // these variables starting with "i-1".
        memory.store(SHELLCODE_BASE + i,
                     &il::expr_scalar(format!("i-{}", i + 1), 8))?;
    }

Now we can start writing expressions into the memory where our shellcode will be placed. These expressions are 8-bit scalars named i-1, i-2, and so on, one for each subsequent byte of input (our first input byte, i-0, already lives on the stack).

    let mut state = State::new(memory, None);

With our memory model complete, we can create and configure our State.

    state.set_scalar("rsp", &il::expr_const(STACK_POINTER, 64))?;
    state.set_scalar("rbx", &il::expr_const(SHELLCODE_BASE, 64))?;
    const SHELLCODE_LEN: usize = 256;
    state.set_scalar("rax", &il::expr_const(SHELLCODE_LEN as u64, 64))?;

From our reverse-engineering, we know the values of the relevant registers at our start address: rsp holds our stack pointer, rbx holds the pointer to the executable memory our obfuscated shellcode was read into, and rax holds the number of bytes read, i.e. our shellcode length.

The last step before we create our Driver is to create an il::Program, and find the location where we wish to begin execution.

    let program = elf.program_recursive()?;

    const START_ADDRESS: u64 = 0x64d;
    let program_location: il::ProgramLocation =
        il::RefProgramLocation::from_address(&program, START_ADDRESS)
            .ok_or("Unable to find START_ADDRESS in program")?
            .into();

We ask the loader to lift an il::Program from the executable. There are two operations here, program() and program_recursive(). The program() variant will use Elf or PE symbol information, and the binary entry point, to find locations to begin lifting functions. The program_recursive() variant takes this information, lifts new functions from branch operations with constant targets, and continues this process until a fixed point is reached.

Our Driver also requires an il::ProgramLocation. We use the convenience function il::RefProgramLocation::from_address to find the first location in the program with our desired address, and turn the resulting il::RefProgramLocation into an il::ProgramLocation.

It’s important to note that we need to start our Driver with a valid il::ProgramLocation. If Falcon had failed to discover the function and instruction we wanted to begin with, we would need to lift the function manually and specify the location ourselves, as sketched below.
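
That fallback would look roughly like the following sketch. I am assuming the loader exposes a function(address) helper and that il::Program has an add_function method; check the loader and il docs for the exact names at this revision.

    // Hypothetical fallback: lift a function starting at our start address
    // ourselves, add it to the (mutable) program, and then build the
    // location from the address as before.
    let function = elf.function(START_ADDRESS)?;
    program.add_function(function);

    let program_location: il::ProgramLocation =
        il::RefProgramLocation::from_address(&program, START_ADDRESS)
            .ok_or("Unable to find START_ADDRESS in program")?
            .into();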

Now we can create our Driver.

    let driver = Driver::new(
        program.clone(),
        program_location,
        state,
        falcon::RC::new(elf.architecture().box_clone())
    );

Here we see falcon::RC again. We’re passing the architecture from the binary to our Driver. This allows the Driver to automatically lift new code from memory as needed.

It’s time to, “Drive,” our Driver over our il::Program. I normally do this with functions such as the following:

fn drive_to(
    mut drivers: Vec<Driver<Dummy>>,
    address: u64,
    steps: usize,
    kill_addresses: &HashSet<u64>
) -> Result<Vec<Driver<Dummy>>> {

    // A Vec to hold our finished drivers, which have reached the target
    // address.
    let mut collect_drivers = Vec::new();

    for _i in 0..steps {
        // If we are out of drivers to step, we're done.
        if drivers.is_empty() {
            break;
        }

        // A temporary vec to hold drivers for this step
        let mut step_drivers = Vec::new();

        // Loop through the drivers we need to step
        for driver in drivers {
            // As you debug your symbolic executors, this is a great place to
            // watch what's happening, and see the last instruction which
            // executed before something failed.
            // if let Some(instruction) = driver.instruction() {
            //     println!("{}", instruction);
            // }

            // Add all of the drivers which resulted from this step to our
            // temporary vec of drivers.
            step_drivers.append(&mut driver.step()?);
        }

        // Zero out our drivers.
        drivers = Vec::new();

        // Loop through all of our drivers for this step.
        for driver in step_drivers {

            // If we have an address for this driver (this driver is currently
            // sitting on an instruction),
            if let Some(driver_address) = driver.address() {
                // Don't add this driver to anything if it's in our kill set.
                if kill_addresses.contains(&driver_address) {
                    continue;
                }
                // If this driver is on our target address, add it to our
                // result drivers.
                if driver_address == address {
                    collect_drivers.push(driver);
                    continue;
                }
            }
            // Add this driver back to our drivers we're going to process in
            // the next step.
            drivers.push(driver);
        }
    }

    // Return the drivers which reached the target address.
    Ok(collect_drivers)
}

Feel free to copy-paste that function into all of your symbolic executor applications.

We need to step our Driver to our stop address. Based on our reverse-engineering, we don’t have any conditional branches dependent on symbolic values, so we should only have one Driver at the target address when we are done.

    // And now we drive to our stop address.
    const STOP_ADDRESS: u64 = 0x693;
    let drivers = drive_to(vec![driver], STOP_ADDRESS, 100000, &HashSet::new())?;

    // We should only have one driver
    if drivers.len() != 1 {
        bail!("Expected 1 driver, but found {} drivers", drivers.len());
    }

    let mut driver = drivers[0].clone();

    // Let's get the state from this driver.
    let state = driver.state_mut();

We grab the State of the program at our stop address.

We now have symbolic expressions in memory where our shellcode is stored. We need to fetch these expressions from memory, one at a time, and assert them equal to what we want our shellcode to be.

To help understand, we can print out a couple expressions starting at SHELLCODE_BASE.

    println!("{}", state.memory().load(SHELLCODE_BASE, 8)?.unwrap());
    println!("{}", state.memory().load(SHELLCODE_BASE + 1, 8)?.unwrap());
    println!("{}", state.memory().load(SHELLCODE_BASE + 2, 8)?.unwrap());

This gives us:

((i-1:8 ^ i-0:8) ^ i-2:8)
(((i-2:8 ^ i-0:8) ^ i-3:8) ^ ((i-1:8 ^ i-0:8) ^ i-2:8))
(((i-3:8 ^ i-0:8) ^ i-4:8) ^ (((i-2:8 ^ i-0:8) ^ i-3:8) ^ ((i-1:8 ^ i-0:8) ^ i-2:8)))

We don’t know what valid values for scalars like i-1:8 and i-2:8 are, but our symbolic executor has built up expressions which define their values in relation to one another.

If we set ((i-1:8 ^ i-0:8) ^ i-2:8) equal to the first byte of our shellcode, (((i-2:8 ^ i-0:8) ^ i-3:8) ^ ((i-1:8 ^ i-0:8) ^ i-2:8)) equal to the next byte of our shellcode, and so on, Finch will solve for satisfying assignments to these expressions. Finch will solve for i-0:8, i-1:8, i-2:8, etc. The result of those solves will be the correct values to feed in as our shellcode. We can feed those values in, and have our shellcode correctly deobfuscate and execute. Cool!

Assuming we have our shellcode loaded into a variable named shellcode_bytes of type Vec<u8>, we add the following code:

    for i in 0..shellcode_bytes.len() {
        let address = SHELLCODE_BASE + i as u64;
        let byte_expr =
            state.memory()
                .load(address, 8)?
                .ok_or(format!("No shellcode at address 0x{:x}?", address))?;
        state.add_path_constraint(
            &il::Expression::cmpeq(
                byte_expr,
                il::expr_const(shellcode_bytes[i] as u64, 8)
            )?
        )?;
    }

For each byte in our shellcode, we load the expression from memory where our final shellcode will be, and we tell Finch to add a constraint to the State. This constraint says, “The expression you found in memory must be equal to the value I need my shellcode to be at that address.”

Now all that’s left is to tell the state to solve for each byte of shellcode.

    let mut bytes: Vec<u8> = Vec::new();
    for i in 0..(shellcode_bytes.len() + 2) {
        let input_byte = state.eval_and_concretize(
            &il::expr_scalar(format!("i-{}", i), 8))?;
        let input_byte =
            input_byte.ok_or(format!("No satisfiable value for i={}", i))?;
        bytes.push(input_byte.value_u64().unwrap() as u8);
        println!("{}: {:02x}", i, input_byte.value_u64().unwrap());
    }

We use the function .eval_and_concretize() to get each byte. If Finch determines that i-0:8 should be 0, .eval_and_concretize() adds an additional constraint to the state such that i-0:8 == 0. This information is then used when solving for subsequent values.
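
In miniature, the behavior looks like this (using only the State methods already shown above):

    // Solve for one satisfying value of "i-0" under the current constraints.
    let expr = il::expr_scalar("i-0", 8);
    if let Some(value) = state.eval_and_concretize(&expr)? {
        // The state now also carries the constraint i-0 == value, so later
        // solves (for "i-1", "i-2", ...) must agree with this choice.
        println!("i-0 = {:02x}", value.value_u64().unwrap());
    }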

Last, we’ll use a bit of Rust to print out the result:

    println!("{}",
        bytes.into_iter()
            .map(|b| format!("{:02X}", b))
            .collect::<Vec<String>>()
            .join(""));

Let’s run our program and check the result.

root@f147bd879083:/code/finch-examples/example-0/example# time cargo run --release -- ../c-source/example-0 ../shellcode
    Finished release [optimized] target(s) in 0.66s
     Running `target/release/example-0 ../c-source/example-0 ../shellcode`
((i-1:8 ^ i-0:8) ^ i-2:8)
(((i-2:8 ^ i-0:8) ^ i-3:8) ^ ((i-1:8 ^ i-0:8) ^ i-2:8))
(((i-3:8 ^ i-0:8) ^ i-4:8) ^ (((i-2:8 ^ i-0:8) ^ i-3:8) ^ ((i-1:8 ^ i-0:8) ^ i-2:8)))
0: 00
1: ff
2: a9
... omitted for brevity
62: de
63: f5
64: ff
00FFA9A8AAB77413B772FADBFFFFFFB772CF95FEA5B77401CD3FCC36B700360D51082E003695FEA77405B7742EF0FA36A0A13CB79A939390DFA8908D939BDEF5FF

real    0m2.658s
user    0m1.930s
sys 0m0.180s
root@f147bd879083:/code/finch-examples/example-0/example# 

Awesome!

But does it pwn?

We can throw together a quick script with pwntools, the lingua franca of CTF pwn challenges. This script invokes our Finch program to solve for the sequence of input bytes which the binary will deobfuscate into our shellcode. It then ships those bytes off to the binary, and goes interactive.

import base64
from pwn import *
import subprocess
import sys

# Get path to target binary and shellcode
binary_path = sys.argv[1]
shellcode_path = sys.argv[2]

log.info("Running symbolic executor")

# Run our symbolic executor over the binary to deobfuscate the shellcode
cmd = "example/target/release/example-0 {} {}".format(binary_path, shellcode_path)
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)

# Read and parse the shellcode output
(stdout, stderr) = proc.communicate()

log.info("Symbolic executor done")

solution = stdout.strip().split('\n')[-1]

shellcode = base64.b16decode(solution)

log.info("Saving shellcode to /tmp/shellcode.deobfuscated")

fh = open("/tmp/shellcode.deobfuscated", "wb")
fh.write(shellcode)
fh.close()

# Run the process, and feed our shellcode into it
s = process(binary_path)
s.send(shellcode)
s.interactive()

Fortunately, I have some shellcode I’ve compiled with Binary Ninja to give us a shell. Let’s run our script and check out the result.

root@f147bd879083:/code/finch-examples/example-0# python solve.py c-source/example-0 shellcode-bin-sh
[*] Running symbolic executor
[*] Symbolic executor done
[*] Saving shellcode to /tmp/shellcode.deobfuscated
[+] Starting local process 'c-source/example-0': pid 8898
[*] Switching to interactive mode
$ ls
c-source  example  shellcode  shellcode-bin-sh    solve.py
$ whoami
root
$ exit
[*] Got EOF while reading in interactive
$ 
[*] Process 'c-source/example-0' stopped with exit code 0 (pid 8898)
[*] Got EOF while sending in interactive
root@f147bd879083:/code/finch-examples/example-0#

A special thanks

A lot of people have helped me over the past many, many years in both getting excited about, and understanding the concepts behind, symbolic execution. In no particular order, I’d like to thank @richinseattle, @thedavidbrumley, gimpy, and Thanassis.


  1. For some definition of Function. Falcon lifters cannot resolve jump tables, for example. A more accurate definition would be, “Falcon lifts code up to indirect branches,” which includes return instructions. This is normally equivalent to lifting a, “Function,” when no special control-flow mechanisms (such as jump tables) are used. [return]