Focus on Fuzzing: Fuzzing Engines and Services

By: Kostya Serebryany, Google & Souheil Moghnie, NortonLifeLock with Adith Sudhakar, VMware; Rohit Shambhuni, Autodesk; and Uday Bhaskar, Autodesk

SAFECode’s Fuzzing team is back to continue our discussion on fuzzing practices. If you are just joining us, be sure to take a look at the first three posts in our Focus on Fuzzing series – Getting Started, Types of Fuzzing, and A Closer Look at Coverage-Guided Fuzzing.

There are many types of software that facilitate fuzzing; the terminology is not uniform, and the boundaries between these types are blurred. In this post, we will discuss some of the most popular fuzzing engines and services.

Fuzzing Engines

AFL

American Fuzzy Lop (AFL) is a widely adopted coverage-guided fuzzing engine. It is supported on multiple operating systems and emulated CPUs, and can be used to fuzz user-mode applications as well as kernel-mode drivers.

You can run AFL with the following 5 simple steps:

Compile your software with afl-gcc (or with afl-g++ for C++ code)
Create a sample input file for your software; AFL uses it as the seed from which it generates more fuzzed files
Create an input directory and copy the sample file into it
Create an output directory where the fuzzed files will be generated
Run afl-fuzz as follows: % afl-fuzz -i IN -o OUT ./a.out […any stdin params…]

Even though we’re using files for input in the example above, you can run AFL with minor changes if your input is a network stream (see details here). Also, you do not need special hardware to run most fuzzers nowadays, including AFL.

Finally, in some cases, your software may expect input that follows a specific syntax, where pure brute force wouldn’t make sense. AFL handles such situations by accepting a fuzzing dictionary of syntax tokens, supplied with the -x option.

LibFuzzer

LibFuzzer (tutorial), part of the LLVM toolchain, is a coverage-guided, in-process engine for fuzzing libraries and APIs. It relies on SanitizerCoverage (also part of LLVM) to guide corpus expansion and mutations. It supports user-supplied mutators, which allow it to perform structure-aware fuzzing of complex input types. One such custom mutator, libprotobuf-mutator, allows fuzzing APIs that consume protocol buffers. LibFuzzer is tightly integrated with the Sanitizers (ASan, MSan, UBSan), but can also be used separately. In addition to supporting C and C++, variants of libFuzzer are available for the Rust, Swift, and Go programming languages.
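As a rough illustration of what a user-supplied mutator can look like, the sketch below defines libFuzzer's optional LLVMFuzzerCustomMutator hook, which libFuzzer calls when the fuzz target provides it. The input layout (a 4-byte magic followed by a payload) and the mutation strategy are assumptions invented for this example; libprotobuf-mutator is far more sophisticated.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>

// Hypothetical input layout for this sketch: 4-byte magic, then a payload.
static const uint8_t kMagic[4] = {'F', 'U', 'Z', 'Z'};

// Optional libFuzzer hook. Mutates Data in place (the buffer can hold up to
// MaxSize bytes) and returns the new size, which must not exceed MaxSize.
extern "C" size_t LLVMFuzzerCustomMutator(uint8_t *Data, size_t Size,
                                          size_t MaxSize, unsigned int Seed) {
  const size_t kHeaderSize = sizeof(kMagic);
  if (MaxSize <= kHeaderSize) return Size;  // no room for any payload

  if (Size <= kHeaderSize) {  // too short: start from a one-byte payload
    Data[kHeaderSize] = 0;
    Size = kHeaderSize + 1;
  }
  std::memcpy(Data, kMagic, kHeaderSize);  // keep the header always valid

  std::minstd_rand rng(Seed);  // deterministic for a given Seed
  // Overwrite one randomly chosen payload byte.
  const size_t pos = kHeaderSize + rng() % (Size - kHeaderSize);
  Data[pos] = static_cast<uint8_t>(rng());
  // About half of the time, grow the payload by one byte if space allows.
  if (Size < MaxSize && rng() % 2 == 0) {
    Data[Size] = static_cast<uint8_t>(rng());
    ++Size;
  }
  return Size;
}
```

Because the header is rewritten on every call, mutated inputs always get past the magic check and the fuzzer spends its time on the payload. A fuller mutator would typically also call LLVMFuzzerMutate, which libFuzzer provides, to reuse the built-in mutations on the payload.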

If you have an API you want to fuzz, with libFuzzer it’s trivial!

// MyApi.cpp
#include <cstddef>
#include <cstdint>
//…
bool FuzzMe(const uint8_t *Data, size_t DataSize) {
  return DataSize >= 3 &&
         Data[0] == 'F' &&
         Data[1] == 'U' &&
         Data[2] == 'Z' &&
         Data[3] == 'Z'; // The bug is here. Can you spot it?
}

// MyFuzzTarget.cpp
#include <cstddef>
#include <cstdint>
//…
bool FuzzMe(const uint8_t *Data, size_t DataSize); // defined in MyApi.cpp
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  FuzzMe(Data, Size);
  return 0;
}

% clang -g -fsanitize=address,fuzzer MyApi.cpp MyFuzzTarget.cpp -o my-fuzzer
% ./my-fuzzer
…
==2335==ERROR: AddressSanitizer: heap-buffer-overflow …
READ of size 1 at 0x602000155c13 thread T0
    #0 0x4ee636 in FuzzMe(unsigned char const*, unsigned long) …
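The report points at a one-byte read past the end of the input: the size check admits 3-byte inputs, yet the function goes on to read Data[3]. One possible fix is sketched below, assuming the intent is to match the four-byte prefix "FUZZ"; the name FuzzMeFixed is ours and is used only to keep it apart from the buggy original.

```cpp
#include <cstddef>
#include <cstdint>

// Same check as FuzzMe, but the size guard covers every index that is read.
bool FuzzMeFixed(const uint8_t *Data, size_t DataSize) {
  return DataSize >= 4 &&  // was ">= 3", which let Data[3] read out of bounds
         Data[0] == 'F' &&
         Data[1] == 'U' &&
         Data[2] == 'Z' &&
         Data[3] == 'Z';
}
```

Thanks to short-circuit evaluation, none of the Data[i] reads happen unless the size check has passed, so a 3-byte input such as "FUZ" is now rejected before any byte is touched.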

Honggfuzz

Honggfuzz is a security-oriented, feedback-driven, evolutionary software fuzzer available for a range of OS/CPU platforms: Linux, *BSD, Android, Windows/WSL/Cygwin, and MacOS X. It can run multiple instances of the fuzzed process while maintaining a shared input corpus and shared coverage feedback state across all of them. Although this fuzzer makes use of the software-based code coverage feedback data provided by the SanitizerCoverage project, it’s also able to utilize hardware code tracking features available in modern CPUs (Intel Processor Trace, Intel BTS, and PMU counters) for black-box software fuzzing. When such specialized hardware functionality is not available, a software-emulator mode based on QEMU can be used instead.

Syzkaller

syzkaller is a coverage-guided fuzzer for OS kernels, supporting Linux, FreeBSD, Fuchsia, and some others. It allows the user to describe the OS system call APIs using a specialized declarative language (syzlang). Based on these descriptions, syzkaller performs structure-aware generation and mutation of system call sequences. Where available, syzkaller relies on compiler instrumentation to generate coverage information (e.g., KCOV on Linux), but it can fall back to a simpler feedback mechanism for closed-source kernels. syzkaller comes with its own cluster management system and web UI, syzbot.

Services and Infrastructure

ClusterFuzz

ClusterFuzz is an open-source fuzzing infrastructure that enables organizations to run their fuzzers in a scalable and automated manner. ClusterFuzz supports coverage-guided fuzzing using the tools listed above (libFuzzer, AFL, Honggfuzz). Its features include bug filing (Monorail and Jira), test case minimization, deduplication of crashes, and a web UI for viewing status and crashes. ClusterFuzz currently relies on the Google Cloud Platform (GCP), but organizations that wish to use other infrastructure may do so with some development effort.
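To give a feel for what deduplication of crashes involves, here is a minimal sketch that groups crashes by their type and their top few stack frames. It is not ClusterFuzz's actual implementation: the Crash struct, the CrashKey and Deduplicate functions, and the default depth of three frames are all assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical crash record; the field names are assumptions for this sketch.
struct Crash {
  std::string type;                 // e.g. "heap-buffer-overflow"
  std::vector<std::string> frames;  // symbolized stack, innermost frame first
};

// Builds a grouping key from the crash type and the top `depth` frames.
std::string CrashKey(const Crash &c, std::size_t depth = 3) {
  std::string key = c.type;
  const std::size_t n = std::min(depth, c.frames.size());
  for (std::size_t i = 0; i < n; ++i) {
    key += '\n';  // separator between components of the key
    key += c.frames[i];
  }
  return key;
}

// Returns the first crash seen for each distinct key, in arrival order.
std::vector<Crash> Deduplicate(const std::vector<Crash> &crashes) {
  std::set<std::string> seen;
  std::vector<Crash> unique;
  for (const Crash &c : crashes) {
    if (seen.insert(CrashKey(c)).second) unique.push_back(c);
  }
  return unique;
}
```

Keying on only the top frames means that two inputs reaching the same faulty function through slightly different call paths are reported once, at the cost of occasionally merging distinct bugs that crash in the same place.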

OSS-Fuzz

Maintainers of open source projects who want to fuzz their code can take advantage of the OSS-Fuzz fuzzing service. The project maintainer is required to provide the fuzz targets along with build instructions in a format prescribed by the infrastructure. Once the project is accepted, the infrastructure starts running the fuzzer(s) and reports issues as they are found. As soon as the developer fixes a bug, the fix is automatically verified and the issue is closed. ClusterFuzz serves as the backend infrastructure of OSS-Fuzz.

Microsoft Security Risk Detection (MSRD)

MSRD is a cloud-based file-fuzzing service that runs on Azure. The fuzzing process with MSRD can be described at a high level as follows:

It’s a web application that a user can log in to in order to set up fuzzing jobs.
Once logged in, the user can provision a VM on Azure; log in to the VM; install the application or copy the executable; provide the corpus of seed files; and configure the fuzzing job using a job wizard (where one would answer a few questions related to the fuzzing job and the executable).
Once a job is configured and MSRD’s job validation mechanism succeeds, the fuzzing job starts, and MSRD minimizes the seed files, scales the fuzzing job, and de-duplicates the results during the fuzzing process.
Whenever a crash is found, the user can view the relevant information about the crash in the MSRD portal and download the file(s) that caused it, which can then be used to debug and/or reproduce the issue.

According to Microsoft, MSRD uses the proprietary SAGE as the primary fuzzer for Windows applications, and AFL and a few others for Linux applications. MSRD currently supports both Windows (Server 2016 and 2019) and Linux (Red Hat 7.2) platforms, although support for the latter is in preview mode. It also supports webhooks and has a well-documented REST API, which can be used to send the results of a fuzzing job to Jira, Slack, etc., if needed. The REST API can also be used to trigger new fuzzing jobs directly from a build pipeline.

Next Up

Our team has discussed why fuzzing is important and what types of fuzzing you can use, and has provided a more detailed look at coverage-guided fuzzing. We have also provided a quick overview of the most widely used fuzzing tools and services. But there is a lot more to consider to implement fuzzing successfully. We’re interested to hear what you’d like to see next from us.

What would you like us to talk about next?

Let us know by taking our anonymous two-question survey at https://www.surveymonkey.com/r/K28T7BS