Fuzzing in Cybersecurity: Understanding the Two Main Types
Written By: John Peng
Two roads diverged in a wood, and I —
I took the one less traveled by,
And that has made all the difference.
The Road Not Taken: Finding More Bugs with Fuzzing Testing
Traditionally, fuzzing software has mostly been approached as a closed box engineering problem, where a series of unexpected/invalid inputs are thrown at a target software that are outside the realm of expected inputs. Fuzzing payloads are meant to guide the program away from the happy path intended by the developer, who tends to devote more time ensuring the correctness of the expected use case and covering their logic with unit tests and code reviews.
Instead, fuzzing is better applied to the logic handling the edge cases, and it is here on the road not taken where many bugs are found. There are two main types of fuzzing, and their applicability is determined by how much information we can extract from the target software regarding its execution state.
Closed Box Fuzzing
In the case where little to no information is available to us, then we call this closed box fuzzing.
This first type of fuzzing lends itself naturally to web applications, which are often treated as closed boxes from the perspective of the tester. In a typical pentest, this could involve fuzzing API endpoints for hidden URL paths and error responses, the WAF for payload bypasses, etc.
The anomalies can be observed in how a target responds, when a HTTP 404 suddenly turns into a 200 because a URL-encoded semicolon was parsed as a input delimiter, or when an additional delay suggests a different database query is being executed in the backend.
The classic example of a web application fuzzer is Burp Suite Intruder (https://portswigger.net/burp), which in its standard (and probably most used mode) simply takes an input and replaces it with a list of payloads.
For more complex use cases, Burp defines extra operations that can work on the inputs to generate more payloads, such as substituting characters in an input for another, encoding it in different output formats, or generating different permutations of the set of input characters.
Many of the other closed box fuzzers available, such as wfuzz and OWASP ZAP, operate in the same way as Burp. All mutations are applied directly to the input without relying on any feedback from the response. Although some information regarding the result of processing the input can be gleaned from the output (ie. the response status code, headers, response body, etc.), it is not enough to establish a clear metric that we can use for the generation of subsequent input cases.
Open Box Fuzzing: Evolution Is the Secret for the Next Step
In open box fuzzing, a target application is run under the influence of a fuzzer, which will iteratively mutate inputs and “feed” them into the target binary. This part is the same as closed box fuzzing. What is different is the addition of a feedback mechanism that establishes a fitness criteria for determining which inputs are saved for further mutations.
The inspiration for this comes from the theory of evolution, according to which only the fittest specimens survive and pass on their genes. The feedback mechanism in the real world is the physical environment, and the fitness criteria indicates how well our mutations allow us to adapt to survival in this environment. As organisms with beneficial mutations (high fitness) allow them to thrive in the environment, these organisms also have the most opportunity to pass on their genes to subsequent generations.
In open box fuzzers, the main feedback mechanism is code coverage: how much of the total program states have been explored by the current input (the distinction between program states and lines of code is intentional, as we elaborate further in the section on feedback mechanisms).
The fitness criteria is discovering new, unexplored program states. If the current input fulfills this criteria, then it is saved to a queue for further mutations. Performing these steps iteratively, the mutated inputs will progressively reach deeper and deeper into the program, since they are based on the “genes” of the previous inputs that gained additional code coverage. Running at hundreds of executions per second, open box fuzzing becomes an extremely efficient way of automatically exploring code coverage.
American Fuzzy Lop Implementation
So far, we have been talking about fuzzers generically, but I would like to take the time now to discuss features and implementations specific for American Fuzzy Lop (AFL), the granddaddy of all fuzzers written by Michael Zalewski. The way AFL is implemented should be fairly instructive in understanding how fuzzers operate in general.
1. Feedback Mechanism
As discussed previously, the main feedback mechanism is (runtime) code coverage. To accomplish this, AFL needs to compile the original source code with its own customized version that adds additional instrumentation code at each decision branch (if statements, switch statements and function calls/returns).
The inserted instrumentation will set the (curr_loc, prev_loc) value in a 2D array, where prev_loc and curr_loc are respectively the indices that mark the previous branch location and the current branch location. Setting (curr_loc, prev_loc) then means that at some point during the current run, execution proceeded from branch prev_loc to branch curr_loc. To illustrate this point further, compare branch coverage to block coverage, which only tracks whether or not a new branch was discovered (a code block is a group of instructions that gets executed in sequence each time before reaching a branch instruction). Suppose we have two execution traces:
A → B → C → D
A → C → B → D
Under block coverage, both traces would be the same since the same blocks were hit in the same run. Using branch coverage, the C→B and B→C transition would be identified as two separate traces. Branch coverage was an innovation pioneered by Zalewski in the original AFL.
In order to continuously run the target programs with different inputs, AFL has to restart and execute the same program multiple times. On UNIX systems, this is achieved through the fork() system call, which copies a current program and executes another copy of it. Fork() is a time consuming operation since it requires an in-memory duplication of the application binary and current execution state, and linking of libraries.
An optimization performed by AFL to speed up this step is to skip the library initialization/linking routine used by the child process. Other alternative execution modes include manually modifying the source code to include a AFL_INIT() macro in the code, which defers forking the binary until the macro line has been executed. Yet another method (and the preferred one for the best performance) is AFL persistent mode, which requires the user to manually wrap the target code block(s) within a special AFL for loop macro. This then allows AFL to automatically patch the input into wrapped source code without incurring the cost of forking a child process for each execution.
Several mutation operations can be chosen from and performed at random or deterministically by AFL, including but not limited to, increments and decrements random bytes/bits, overwriting random random offsets in the input with magic values (ie. null characters, zero to 9 in uint_8, uint_32, uint_64), and splicing, which takes two inputs at offsets and concatenates them together. Despite the relative simplicity of the mutations, when combined with fast execution speeds (varies widely with the target software but anywhere in the neighborhood of 200 execs/sec would be considered fast), AFL is able to synthesize valid JPEGs from completely random input data.
The mutations applied on each of the fuzzed inputs are relatively simple since they are designed for generic data. For a more targeted approach, a dictionary can be provided that contains keywords that will be clobbered together by AFL. For a even more targeted approach, custom mutators could be written specifically for the task (ie. delineating specific offsets in the file header to be off-limits to mutations).
Given its added complexity, is instrumented fuzzing always better than closed box fuzzing in finding bugs and vulnerabilities? The answer is probably yes, in almost all cases, if both options are available to the tester, feedback driven fuzzing will always win out over blind closed box fuzzing.
The effectiveness of fuzzing has led to hundreds of thousands of disclosed CVEs, and has wide-scale adoption in big software companies such as Google and Microsoft. However, in practical cases, closed box fuzzing can still be the correct approach, especially in cases like web application pentesting where it is infeasible to instrument the target application. As in all things, context should determine the tools for the job.
And despite the powerful instrumented approach offered by AFL out of the box, a lot of non-trivial setup is still required on a case by case basis depending on the target software, such as writing a fuzzing harness that minimizes the initialization time and provides a custom bridge for AFL to feed inputs into the target binary. Also, performance tuning AFL can also be quite non-intuitive, as the number of discovered edges hits a saturation point, requiring a rethink of the fuzzing entry point in order to continue exploring deeper program states. Lastly, vanilla AFL has long been outgrown by new additions made with the community sourced AFLplusplus project, which have added tons of new features (RedQueen and Cmplog), macros and compilation options that makes configuring AFL a bewildering task for users new to the world of fuzzing.
Be on the lookout for more fuzzing-related content from us in the future!
John is a self taught cybersecurity professional who the challenges and puzzles posed by our matters related to cyber. Specializing in source code audits and cloud security audits, John strives to make the digital age as safe and secure for everyone as much as possible. Embodying the software developer’s mindset of working hard at being lazy, he enjoys using code to automate much of the tedious tasks in order to focus on the truly fascinating ones. He tries hard to cultivate the virtues of a cyber warrior. It is a calling that requires as much mental fortitude as well as depth of technical knowledge, all the while nursing a insatiable curiosity.