This repository contains the source code for FuzzERR, a tool that instruments a shared library for fault injection. FuzzERR takes two inputs: the shared library built using wllvm (so that it includes the whole-program LLVM bitcode for the library) and the errblocks.json file generated for this library using DetectERR. It produces the instrumented shared object file for the library. This tool is part of our research paper accepted at AsiaCCS-2024, titled "Fuzzing API Error Handling Behaviors using Coverage Guided Fault Injection".
This repository contains the LLVM opt pass that performs this instrumentation. It also includes the source code for ErrLib, which makes it easier to control the exact fault injection point.
The following instructions are for Ubuntu 22.04.
Packages from apt
sudo apt install -y python3.10 multilog bear patchelf
llvm
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh 13 all
Cargo packages
# install rust (and cargo) using rustup
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# reload your shell so that cargo bin path is added to your PATH, and then:
# note: the fd tool is published on crates.io under the name fd-find
cargo install fd-find sd
Others
- Install libbacktrace as per the instructions in its repository.
Python Packages (pip)
pip install colorlog wllvm
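After installing the prerequisites, it can help to confirm that the tools are actually reachable from your shell. The following is a minimal sketch, not part of FuzzERR itself. The tool list is taken from the installation steps above; the `-13` suffix on the LLVM binaries is an assumption about how the apt.llvm.org packages name them, so adjust the list to match your system.

```python
import shutil

# Tool names come from the installation steps above. The "-13" suffixes are
# an assumption about the binary names installed by `llvm.sh 13 all`.
REQUIRED_TOOLS = [
    "multilog", "bear", "patchelf",          # apt
    "fd", "sd",                              # cargo
    "wllvm", "extract-bc",                   # pip (wllvm)
    "clang-13", "opt-13", "llvm-link-13",    # LLVM 13
]


def missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that `which` cannot find on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

The `which` parameter exists only so that the lookup can be replaced in tests; in normal use the default `shutil.which` is sufficient.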
The following are the steps involved in fuzzing a program with a given library.
1. Get the potential fault injection points for the library (the `errblocks.json` file).
   - To obtain this, run DetectERR on the library source code.
2. Get the JSON compilation database (`compile_commands.json`) for the library.
3. Build the library using `wllvm`, with debug info.
4. Build the FuzzERR LLVM Instrumentation Pass in the `InstrumentationPasses` directory (only required once).
5. Get the LLVM bitcode for the library.
   - Extract the whole-library bitcode from the shared library generated in step 3, using `extract-bc` from the `wllvm` project.
6. Instrument the library bitcode.
   - Run the instrumentation pass (built in step 4) on the bitcode extracted in step 5.
7. Link the instrumented bitcode with ErrLib to generate the final instrumented library.
   - First build ErrLib (from the `ErrLib` directory in this repository).
   - Then link ErrLib with the instrumented bitcode using `llvm-link`.
8. Build the program to fuzz (with debug info).
9. Use `patchelf` to modify the RPath in the program binary so that it uses the instrumented library created above.
10. Fuzz the program (with the modified RPath from step 9) using the modified `afl-fuzz` from FuzzERR_AFLPlusPlus.
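To make the order of steps 5, 6, 7 and 9 concrete, the sketch below builds (but does not run) one command per step. It is an illustration under stated assumptions, not the commands used by the scripts in this repository. The `extract-bc`, `llvm-link` and `patchelf --set-rpath` invocations use options those tools provide. The arguments passed to `opt` (plugin path and pass name), the `-13` binary suffixes, the intermediate file names, and the final `clang -shared` step are all hypothetical, and the way `errblocks.json` is handed to the pass is not shown because it is not described here.

```python
import shlex
from pathlib import PurePosixPath


def pipeline_commands(lib_so, pass_args, errlib_bc, program, out_dir):
    """Return the argv lists for steps 5, 6, 7 and 9 without running them."""
    lib_so = PurePosixPath(lib_so)
    out_dir = PurePosixPath(out_dir)
    # Intermediate file names are arbitrary choices made for this sketch.
    whole_bc = str(out_dir / (lib_so.name + ".bc"))
    instr_bc = str(out_dir / (lib_so.name + ".instr.bc"))
    linked_bc = str(out_dir / (lib_so.name + ".linked.bc"))
    instr_so = str(out_dir / lib_so.name)
    return [
        # Step 5: extract the whole-library bitcode (extract-bc ships with wllvm).
        ["extract-bc", "-o", whole_bc, str(lib_so)],
        # Step 6: run the instrumentation pass. ASSUMPTION: the caller supplies
        # the plugin-loading and pass-selection flags via `pass_args`.
        ["opt-13", *pass_args, whole_bc, "-o", instr_bc],
        # Step 7: link the instrumented bitcode with the ErrLib bitcode.
        ["llvm-link-13", instr_bc, str(errlib_bc), "-o", linked_bc],
        # ASSUMPTION: the linked bitcode is turned into a shared object with clang.
        ["clang-13", "-shared", "-fPIC", linked_bc, "-o", instr_so],
        # Step 9: point the program's RPath at the instrumented library directory.
        ["patchelf", "--set-rpath", str(out_dir), str(program)],
    ]


if __name__ == "__main__":
    commands = pipeline_commands(
        lib_so="/work/libjpeg_wllvm/libjpeg.so",
        # Hypothetical plugin path and pass name, for illustration only.
        pass_args=["-load-pass-plugin=/work/pass/plugin.so", "-passes=hypothetical-pass"],
        errlib_bc="/work/ErrLib/errlib.bc",
        program="/work/jpegoptim/jpegoptim",
        out_dir="/work/libjpeg_instr",
    )
    for argv in commands:
        print(shlex.join(argv))
```

Returning argv lists instead of executing them keeps the sketch safe to run anywhere and makes it easy to compare the printed commands with what the helper scripts in `scripts/` actually do.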
The example below uses jpegoptim as the program to be fuzzed and libjpeg as the library which would be instrumented for fault injection.
- (Step 1). Download and extract the source code for libjpeg-turbo. Use `detecterr` to generate the `errblocks.json` file. Refer to the DetectERR repo for details about its usage.
- (Steps 2, 3). Refer to the `scripts/create_wllvm_libjpeg.sh` script to compile libjpeg with wllvm. NOTE: Update the `BASE_DIR`, `LIB_SRC_DIR` and `LIB_INSTR_DIR` paths as per your setup.
  - `BASE_DIR`: the path to this repo.
  - `LIB_SRC_DIR`: the path to the source code for libjpeg.
  - `LIB_INSTR_DIR`: the path where the libjpeg library, compiled using wllvm, should be moved to.
  - This script generates the compilation database as well (step 2), using the built-in support in cmake. For libraries that use another build system, `bear` can be used to generate the compilation database.
- (Steps 4, 5, 6, 7). Refer to the `scripts/inst_libjpeg.sh` script, which generates the instrumented `libjpeg.so` shared library from the `libjpeg.so` file generated in the step above. Internally it uses the helper script `scripts/inst_so.sh`, which actually takes care of steps 4, 5, 6 and 7. `BASE_DIR`, `LIB_SRC_DIR` and `LIB_INSTR_DIR` are the same as in the step above.
- (Steps 8, 9). Download and extract the source code for jpegoptim. Build it using the `afl-clang-fast`/`afl-clang-fast++` binaries from FuzzERR_AFLPlusPlus. Finally, using `patchelf`, add an RPath to the generated jpegoptim binary so that it uses the instrumented library generated above (step 9).
  - The `scripts/inst_libjpeg.sh` script also contains the code for the steps described above (refer ????).
- (Step 10). The `scripts/fuzzerr/fuzz_bin.py` script functions as a harness to fuzz the given binary. It uses a JSON config that contains the parameters for the fuzzing campaign. The JSON config for jpegoptim is provided at fuzz_jpegoptim.json. To run the process:

      # single process
      ./scripts/fuzzerr/fuzz_bin.py experiments/libjpeg/fuzz_jpegoptim.json
      # OR to use multiple processes in parallel
      ./scripts/fuzzerr/fuzz_bin.py experiments/libjpeg/fuzz_jpegoptim.json -p
The parameters of the fuzzing config file are explained below:
- `SETUP_CMDS`: list of commands to run before the fuzzing campaign (for example, to set up a particular folder structure).
- `LIB_INSTR_DIR`, `LIB_SRC_DIR`: same as explained above.
- `BIN_SRC_DIR`: path to the directory containing the sources for the program being fuzzed.
- `BIN_TO_FUZZ`: path to the final binary to fuzz (jpegoptim).
- `BIN_INPUT_DIR`: path to the directory containing the initial seed inputs for fuzzing.
- `BIN_ARGS_LIST`: list of invocations for the program being fuzzed. The following special variables are available for use in these commands:
  a. `{{input}}`: the input file path
  b. `{{output}}`: the output file path
  c. `{{outputd}}`: the path to a temporary output directory where the program will put its output files

  For example, the line `-d {{outputd}} -o {{input}}` in `BIN_ARGS_LIST` means that jpegoptim will be run using the command `jpegoptim -d /path/to/tmp_output_dir -o /path/to/one/seed_input`, and the fuzzer (afl++) will then inject faults in libjpeg while being guided by feedback.
- `DEBUG`: whether to print the output (for debugging purposes). NOTE: this produces a lot of output and is really useful only when running without the `-p` flag, i.e. in a single process.
- `TOTAL_FUZZ_TIME_MINS`: total time, in minutes, for which this program should be fuzzed.
- `FUZZ_TIME_PER_INPUT_ARG_MINS`: time, in minutes, for which one particular input should be used for one particular invocation of the command arguments listed in `BIN_ARGS_LIST` (explained above).
- `AFL_TIMEOUT_MSECS`: time delay, in milliseconds, which afl++ should treat as a timeout (useful for slow binaries).
- `FUZZERR_TIMEOUT_IN_SEC`: timeout to be used internally by FuzzERR while doing crash minimization/filtering.
- `LOG_DESTINATION`: "MULTILOG"/"STDOUT", i.e. whether to save output to log files (MULTILOG) or just print on screen (STDOUT).
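The placeholder substitution in `BIN_ARGS_LIST` can be illustrated with a short sketch. This is not the code from `fuzz_bin.py`; it only shows one plausible way the `{{input}}`, `{{output}}` and `{{outputd}}` variables could be expanded into an argument vector. Only the key names in `EXAMPLE_CONFIG` come from the parameter list above. All paths, numbers and value types (for example, whether `DEBUG` is a boolean) are assumptions, and `expand_args` is a hypothetical helper name.

```python
import shlex

# Key names are from the parameter list above; every value is a placeholder.
EXAMPLE_CONFIG = {
    "SETUP_CMDS": [],
    "LIB_INSTR_DIR": "/path/to/libjpeg_instr",
    "LIB_SRC_DIR": "/path/to/libjpeg-turbo",
    "BIN_SRC_DIR": "/path/to/jpegoptim",
    "BIN_TO_FUZZ": "/path/to/jpegoptim/jpegoptim",
    "BIN_INPUT_DIR": "/path/to/seeds",
    "BIN_ARGS_LIST": ["-d {{outputd}} -o {{input}}"],
    "DEBUG": False,
    "TOTAL_FUZZ_TIME_MINS": 60,
    "FUZZ_TIME_PER_INPUT_ARG_MINS": 5,
    "AFL_TIMEOUT_MSECS": 1000,
    "FUZZERR_TIMEOUT_IN_SEC": 10,
    "LOG_DESTINATION": "STDOUT",
}


def expand_args(template, input_path, output_path, output_dir):
    """Expand the special variables in one BIN_ARGS_LIST entry into argv tokens."""
    values = {
        "{{input}}": input_path,
        "{{output}}": output_path,
        "{{outputd}}": output_dir,
    }
    argv = []
    # Split first, then substitute, so that paths containing spaces stay
    # inside a single argument.
    for token in shlex.split(template):
        for placeholder, value in values.items():
            token = token.replace(placeholder, value)
        argv.append(token)
    return argv


if __name__ == "__main__":
    for template in EXAMPLE_CONFIG["BIN_ARGS_LIST"]:
        argv = expand_args(
            template,
            input_path="/path/to/one/seed_input",
            output_path="/path/to/tmp_output_file",
            output_dir="/path/to/tmp_output_dir",
        )
        print(shlex.join([EXAMPLE_CONFIG["BIN_TO_FUZZ"], *argv]))
```

With the values above, the expanded arguments match the `-d /path/to/tmp_output_dir -o /path/to/one/seed_input` invocation described for jpegoptim.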