awesome-directed-fuzzing

Directed fuzzing is an active research topic. This repository provides a curated list of research papers on directed greybox fuzzing (see also the lists for directed whitebox fuzzing and miscellaneous work).

[CCS'17] Directed Greybox Fuzzing

[paper] [project] [slides] [talk]

Existing Greybox Fuzzers (GF) cannot be effectively directed, for instance, towards problematic changes or patches, towards critical system calls or dangerous locations, or towards functions in the stacktrace of a reported vulnerability that we wish to reproduce. In this paper, we introduce Directed Greybox Fuzzing (DGF) which generates inputs with the objective of reaching a given set of target program locations efficiently. We develop and evaluate a simulated annealing-based power schedule that gradually assigns more energy to seeds that are closer to the target locations while reducing energy for seeds that are further away. Experiments with our implementation AFLGo demonstrate that DGF outperforms both directed symbolic-execution-based whitebox fuzzing and undirected greybox fuzzing. We show applications of DGF to patch testing and crash reproduction, and discuss the integration of AFLGo into Google’s continuous fuzzing platform OSS-Fuzz. Due to its directedness, AFLGo could find 39 bugs in several well-fuzzed, security-critical projects like LibXML2. 17 CVEs were assigned.
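
The annealing-based power schedule described in this abstract can be illustrated with a minimal sketch. It assumes a seed distance already normalized to [0, 1] and an exponential cooling schedule; the function names and the constants (20, 10, 5) are illustrative assumptions, not details stated in the abstract.

```python
def annealing_temperature(elapsed_s: float, time_to_exploit_s: float) -> float:
    """Exponential cooling: 1.0 at the start, 0.05 once elapsed == time_to_exploit."""
    return 20.0 ** (-elapsed_s / time_to_exploit_s)


def energy_factor(norm_distance: float, elapsed_s: float, time_to_exploit_s: float) -> float:
    """Multiplier applied to a seed's baseline energy.

    norm_distance is assumed to lie in [0, 1], where 0 means "at the target".
    While the temperature is high, every seed gets a factor near 1 (exploration).
    As it cools, close seeds are boosted and distant seeds are throttled.
    """
    temperature = annealing_temperature(elapsed_s, time_to_exploit_s)
    # Blend a distance-based score with a neutral 0.5, weighted by the temperature.
    p = (1.0 - norm_distance) * (1.0 - temperature) + 0.5 * temperature
    # Map p in [0, 1] to a factor in [1/32, 32].
    return 2.0 ** (10.0 * p - 5.0)
```

At time zero every seed receives a factor of 1 regardless of distance; late in the campaign a seed at distance 0 approaches a factor of 32, while a seed at distance 1 approaches 1/32.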

[paper]

Existing directed grey-box fuzzers are effective compared with coverage-based fuzzers. However, they fail to achieve a balance between effectiveness and efficiency, and it is difficult to cover complex paths due to random mutation. To mitigate the issue, we propose a novel approach, sequence directed hybrid fuzzing (SDHF), which leverages a sequence-directed strategy and concolic execution technique to enhance the effectiveness of fuzzing. Given a set of target statement sequences of a program, SDHF aims to generate inputs that can reach the statements in each sequence in order and trigger potential bugs in the program. We implement the proposed approach in a tool called Berry and evaluate its capability on crash reproduction, true positive verification, and vulnerability detection. Experimental results demonstrate that Berry outperforms four state-of-the-art fuzzers, including directed fuzzers BugRedux, AFLGo and Lolly, and undirected hybrid fuzzer QSYM. Moreover, Berry found 7 new vulnerabilities in real-world programs such as UPX and GNU Libextractor, and 3 new CVEs were assigned.

[paper] [project]

Directed fuzzing focuses on automatically testing specific parts of the code by taking advantage of additional information such as (partial) bug stack trace, patches or risky operations. Key applications include bug reproduction, patch testing and static analysis report verification. Although directed fuzzing has received a lot of attention recently, hard-to-detect vulnerabilities such as Use-After-Free (UAF) are still not well addressed, especially at the binary level. We propose UAFuzz, the first (binary-level) directed greybox fuzzer dedicated to UAF bugs. The technique features a fuzzing engine tailored to UAF specifics, a lightweight code instrumentation and an efficient bug triage step. Experimental evaluation for bug reproduction on real cases demonstrates that UAFuzz significantly outperforms state-of-the-art directed fuzzers in terms of fault detection rate, time to exposure and bug triaging. UAFuzz has also been proven effective in patch testing, leading to the discovery of 30 new bugs (7 CVEs) in programs such as Perl, GPAC and GNU Patch. Finally, we provide to the community a large fuzzing benchmark dedicated to UAF, built on both real code and real bugs.

[arxiv'20] TOFU: Target-Oriented FUzzer

[paper]

Program fuzzing (providing randomly constructed inputs to a computer program) has proved to be a powerful way to uncover bugs, find security vulnerabilities, and generate test inputs that increase code coverage. In many applications, however, one is interested in a target-oriented approach: one wants to find an input that causes the program to reach a specific target point in the program. We have created TOFU (for Target-Oriented FUzzer) to address the directed fuzzing problem. TOFU’s search is biased according to a distance metric that scores each input according to how close the input’s execution trace gets to the target locations. TOFU is also input-structure aware (i.e., the search makes use of a specification of a superset of the program’s allowed inputs). Our experiments on xmllint show that TOFU is 28% faster than AFLGo, while reaching 45% more targets. Moreover, both distance-guided search and exploitation of knowledge of the input structure contribute significantly to TOFU’s performance.
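
A trace-based distance score of the kind this abstract describes can be sketched as follows. This is an illustration under stated assumptions, not TOFU's actual metric: the control-flow graph is a plain adjacency dictionary, distance is the number of edges to the nearest target, and the names `distances_to_targets` and `trace_score` are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List


def distances_to_targets(cfg: Dict[str, List[str]], targets: Iterable[str]) -> Dict[str, int]:
    """Breadth-first search over reversed edges: edge count from each block to its nearest target."""
    reverse: Dict[str, List[str]] = {}
    for src, dsts in cfg.items():
        for dst in dsts:
            reverse.setdefault(dst, []).append(src)

    dist = {t: 0 for t in targets}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        for pred in reverse.get(node, []):
            if pred not in dist:  # first visit is the shortest path in BFS
                dist[pred] = dist[node] + 1
                queue.append(pred)
    return dist  # blocks that cannot reach any target are absent


def trace_score(trace: Iterable[str], dist: Dict[str, int]) -> float:
    """Score an input by the closest point its execution trace gets to a target."""
    reachable = [dist[block] for block in trace if block in dist]
    return min(reachable) if reachable else float("inf")


# Toy graph (assumed for illustration): entry -> a -> c -> target, entry -> b -> exit
cfg = {"entry": ["a", "b"], "a": ["c"], "b": ["exit"], "c": ["target"], "target": ["exit"]}
dist = distances_to_targets(cfg, ["target"])
```

With this toy graph, a trace through `entry`, `a`, `c` scores 1, while a trace through `entry`, `b`, `exit` scores 3 because only `entry` can still reach the target. A directed fuzzer would prefer the first input.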

[arxiv'20] SoK: The Progress, Challenges, and Perspectives of Directed Greybox Fuzzing

[paper]

[ASIACCS'22] TargetFuzz: Using DARTs to Guide Directed Greybox Fuzzers

[paper]

Software development is a continuous and incremental process. Developers continuously improve their software in small batches rather than in one large batch. The high frequency of small batches makes it essential to use effective testing methods that detect bugs under limited testing time. To this end, researchers propose directed greybox fuzzing (DGF) which aims to generate test cases towards stressing certain target sites. Different from the coverage-based greybox fuzzing (CGF) which aims to maximize code coverage in the whole program, the goal of DGF is to cover potentially buggy code regions (e.g., a recently modified program region). While prior works improve several aspects of DGF (such as power scheduling, input prioritization, and target selection), little attention has been given to improving the seed selection process. Existing DGF tools use seed corpora mainly tailored for CGF (i.e., a set of seeds that cover different regions of the program). We observe that using CGF-based corpora limits the bug-finding capability of a directed greybox fuzzer. To mitigate this shortcoming, we propose TargetFuzz, a mechanism that provides a DGF tool with a target-oriented seed corpus. We refer to this corpus as DART corpus, which contains only 'close' seeds to the targets. This way, DART corpus guides DGF to the targets, thereby exposing bugs even under limited fuzzing time. Evaluations on 34 real bugs show that AFLGo (a state-of-the-art directed greybox fuzzer), when equipped with DART corpus, finds 10 additional bugs and achieves 4.03× speedup, on average, in the time-to-exposure compared to a generic CGF-based corpus.
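
The idea of keeping only 'close' seeds can be illustrated with a small filter. The abstract does not say how TargetFuzz defines closeness, so the per-seed distance scores, the `max_distance` threshold, and the function name `select_close_seeds` below are all assumptions made for the sake of the sketch.

```python
from typing import Dict, List


def select_close_seeds(seed_distances: Dict[str, float], max_distance: float) -> List[str]:
    """Keep seeds whose distance to the targets is within the threshold, closest first.

    seed_distances maps a seed file name to a precomputed distance score
    (lower means closer to the targets). How that score is obtained is out of scope here.
    """
    close = [(d, name) for name, d in seed_distances.items() if d <= max_distance]
    return [name for _, name in sorted(close)]


# Hypothetical corpus with precomputed distances.
corpus = {"a.xml": 0.0, "b.xml": 5.0, "c.xml": 2.0, "d.xml": float("inf")}
target_oriented = select_close_seeds(corpus, max_distance=2.0)
```

Here `target_oriented` holds `a.xml` and `c.xml`; the distant and unreachable seeds are dropped instead of being kept for coverage diversity, which is the contrast with a CGF-style corpus that the abstract draws.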

[S&P'22] Exploit the Last Straw That Breaks Android Systems

[paper] [project]

The Android system services usually play a critical role in running multiple important tasks, and delivering seamless user experiences, e.g., conveniently storing user data. In this paper, we conduct the first systematic security study on the data storing process in Android system services, and consequently discover a novel class of design flaws (named Straw), which can lead to serious DoS (Denial-of-Service) attacks, e.g., permanently crashing the whole victim Android device. Then we propose a novel directed fuzzing based approach, called StrawFuzzer, to automatically vet all system services against the straw vulnerabilities. StrawFuzzer balances the tradeoff between path exploration and vulnerability exploitation. By applying StrawFuzzer on three Android systems with the latest security updates, we identified 35 unique straw vulnerabilities affecting 474 interfaces across 77 system services and successfully generated corresponding exploits, which can be used to conduct various permanent/temporary DoS attacks. We have reported our findings with suggestions for repairing the vulnerabilities to corresponding vendors. Up to now, Google has rated our vulnerability as high severity.

[ICSE'22] Linear-time Temporal Logic guided Greybox Fuzzing

[paper] [project] [talk]

Software model checking as well as runtime verification are verification techniques which are widely used for checking temporal properties of software systems. Even though they are property verification techniques, their common usage in practice is in "bug finding", that is, finding violations of temporal properties. Motivated by this observation and leveraging the recent progress in fuzzing, we build a greybox fuzzing framework to find violations of Linear-time Temporal Logic (LTL) properties.

Our framework takes as input a sequential program written in C/C++, and an LTL property. It finds violations, or counterexample traces, of the LTL property in stateful software systems; however, it does not achieve verification. Our work substantially extends directed greybox fuzzing to witness arbitrarily complex event orderings. We note that existing directed greybox fuzzing approaches are limited to witnessing reaching a location or witnessing simple event orderings like use-after-free. At the same time, compared to model checkers, our approach finds the counterexamples faster, thereby finding more counterexamples within a given time budget.

Our LTL-Fuzzer tool, built on top of the AFL fuzzer, is shown to be effective in detecting bugs in well-known protocol implementations, such as OpenSSL and Telnet. We use LTL-Fuzzer to reproduce known vulnerabilities (CVEs), to find 15 zero-day bugs by checking properties extracted from RFCs (for which 12 CVEs have been assigned), and to find violations of both safety as well as liveness properties in real-world protocol implementations. Our work represents a practical advance over software model checkers — while simultaneously representing a conceptual advance over existing greybox fuzzers. Our work thus provides a starting point for understanding the unexplored synergies among software model checking, runtime verification and greybox fuzzing.

[thesis] Directing greybox fuzzing to discover bugs in hardware and software - Sadullah Canakci

[paper]

Computer systems are deeply integrated into our daily routines such as online shopping, checking emails, and posting photos on social media platforms. Unfortunately, with the wide range of functionalities and sensitive information stored in computer systems, they have become fruitful targets for attackers. Cybersecurity ventures estimate that the cost of cyber attacks will reach $10.5 trillion USD annually by 2025. Moreover, data breaches have resulted in the leakage of millions of people’s social security numbers, social media account passwords, and healthcare information. With the increasing complexity and connectivity of computer systems, the intensity and volume of cyber attacks will continue to increase. Attackers will continuously look for bugs in the systems and ways to exploit them for gaining unauthorized access or leaking sensitive information. Minimizing bugs in systems is essential to remediate security weaknesses. To this end, researchers proposed a myriad of methods to discover bugs. In the software domain, one prominent method is fuzzing, the process of repeatedly running a program under test with “random” inputs to trigger bugs. Among different variants of fuzzing, greybox fuzzing (GF) has especially seen widespread adoption thanks to its practicality and bug-finding capability. In GF, the fuzzer collects feedback from the program (e.g., code coverage) during its execution and guides the input generation based on the feedback. Due to its success in finding bugs in the software domain, GF has gained traction in the hardware domain as well. Several works adapted GF to the hardware domain by addressing the differences between hardware and software. These works demonstrated that GF can be leveraged to discover bugs in hardware designs such as processors. In this thesis, we propose three different fuzzing mechanisms, one for software and two for hardware, to expose bugs in the multiple layers of systems. 
Each mechanism focuses on different aspects of GF to assist the fuzzing procedure for triggering bugs in hardware and software. The first mechanism, TargetFuzz, focuses on producing an effective seed corpus when fuzzing software. The seed corpus consists of a set of inputs serving as starting points to the fuzzer. We demonstrate that carefully selecting seeds to steer GF towards potentially buggy code regions increases the bug-finding capability of GF. Compared to prior works, TargetFuzz discovered 10 additional bugs and achieved 4.03× speedup, on average, in the total elapsed time for finding bugs. The second mechanism, DirectFuzz, adapts a specific variant of GF for software fuzzing, namely directed greybox fuzzing (DGF), to the hardware domain. The main use case of DGF in software is patch testing where the goal is to steer fuzzing towards recently modified code regions. Similar to software, hardware design is an incremental and continuous process. Therefore, it is important to prioritize testing of a new component in a hardware design rather than previously well-tested components. DirectFuzz takes several differences between hardware and software (such as clock sensitivity, concurrent execution of multiple code fragments, hardware-specific coverage) into account to successfully adapt DGF to the hardware domain. DirectFuzz relies on coverage feedback applicable to a wide range of hardware designs and requires limited design knowledge. While this increases its ease of adoption to many different hardware designs, its effectiveness (i.e., bug-finding success) becomes limited in certain hardware designs such as processors. Overall, compared to a state-of-the-art hardware fuzzer, DirectFuzz covers specified target sites (e.g., modified hardware regions) 2.23× faster. Our third mechanism named ProcessorFuzz relies on novel coverage feedback tailored for processors to increase the effectiveness of fuzzing in processors. 
Specifically, ProcessorFuzz monitors value changes in control and status registers which form the backbone of a processor. ProcessorFuzz addresses several drawbacks of existing works in processor fuzzing. Specifically, existing works can introduce significant instrumentation overhead, result in misleading guidance, and lack support for widely-used hardware languages. ProcessorFuzz revealed 8 new bugs in widely-used open source processors and identified bugs 1.23× faster than a prior work.

[paper]

As agile software development and extreme programming have become increasingly popular, continuous integration (CI) has become a widely used collaborative work method. However, it is common to make changes frequently to a project during CI. If existing testing methods are applied to CI directly, it will be difficult to make testing resources focus on changes generated by CI, which results in insufficient testing for changes. To solve this problem, we propose a fuzz testing method for CI. First, differential analysis is performed to determine the change points generated during CI, change points are added to the taint source set, and static analysis is conducted to calculate the distances between each basic block and the taint sources. Then, the project under test is instrumented according to the distances. During fuzz testing, testing resources are allocated based on seed coverage to test the change points effectively. Using the proposed methods, we implement CIDFuzz as a prototype tool, and experiments are conducted on four open-source projects that use CI. Experimental results show that, compared with AFL and AFLGo, CIDFuzz can reduce the time costs of covering change points by up to 39.59% and 41.64%, respectively. Also, CIDFuzz can reduce the time costs of reproducing vulnerabilities by up to 34.78% and 25.55%.

[EuroS&P'23] Hunting for Truth: Analyzing Explanation Methods in Learning-based Vulnerability Discovery

[paper]

Recent research has developed a series of methods for finding vulnerabilities in software using machine learning. While the proposed methods provide a remarkable performance in controlled experiments, their practical application is hampered by their black-box nature: A security practitioner cannot tell how these methods arrive at a decision and what code structures contribute to a reported security flaw. Explanation methods for machine learning may overcome this problem and guide the practitioner to relevant code. However, there exist a variety of competing explanation methods, each highlighting different code regions when given the same finding. So far, this inconsistency has made it impossible to select a suitable explanation method for practical use.

In this paper, we address this problem and develop a method for analyzing and comparing explanations for learning-based vulnerability discovery. Given a predicted vulnerability, our approach uses directed fuzzing to create local ground-truth around code regions marked as relevant by an explanation method. This local ground-truth enables us to assess the veracity of the explanation. As a result, we can qualitatively compare different explanation methods and determine the most accurate one for a particular learning setup. In an empirical evaluation with different discovery and explanation methods, we demonstrate the utility of this approach and its capabilities in making learning-based vulnerability discovery more transparent.

[ISSTA'23] 1dFuzz: Reproduce 1-day Vulnerabilities with Directed Differential Fuzzing

[paper]

1-day vulnerabilities are common in practice and have posed severe threats to end users, as adversaries could learn from released patches to find them and exploit them. Reproducing 1-day vulnerabilities is also crucial for defenders, e.g., to block attack traffic against 1-day vulnerabilities. A core question that affects the effectiveness of recognizing and triggering 1-day vulnerabilities is what is the unique feature of a security patch. After conducting a large-scale empirical study, we point out that a common and unique feature of patches is the trailing call sequence (TCS) and present a novel directed differential fuzzing solution 1dFuzz to efficiently reproduce 1-day vulnerabilities in this paper. Based on the TCS feature, we present a locator 1dLoc able to find candidate patch locations via static analysis, a novel TCS-based distance metric for directed fuzzing, and a novel sanitizer 1dSan able to catch PoCs for 1-day vulnerabilities during fuzzing. We have systematically evaluated 1dFuzz on a set of real-world software vulnerabilities in 11 different settings. Results show that 1dFuzz significantly outperforms state-of-the-art (SOTA) baselines and could find up to 2.26x more 1-day vulnerabilities with a 43% shorter time.

[Usenix'23] FishFuzz: Catch Deeper Bugs by Throwing Larger Nets

[paper] [project] [artifact]

Fuzzers effectively explore programs to discover bugs. Greybox fuzzers mutate seed inputs and observe their execution. Whenever a seed reaches new behavior (e.g., new code or higher execution frequency), it is stored for further mutation. Greybox fuzzers directly measure exploration and, by repeating execution of the same targets with large amounts of mutated seeds, passively exploit any lingering bugs. Directed greybox fuzzers (DGFs) narrow the search to a few code locations but so far generalize distance to all targets into a single score and do not prioritize targets dynamically.

FISHFUZZ introduces an input prioritization strategy that builds on three concepts: (i) a novel multi-distance metric whose precision is independent of the number of targets, (ii) a dynamic target ranking to automatically discard exhausted targets, and (iii) a smart queue culling algorithm, based on hyperparameters, that alternates between exploration and exploitation. FISHFUZZ enables fuzzers to seamlessly scale among thousands of targets and prioritize seeds toward interesting locations, thus achieving more comprehensive program testing. To demonstrate generality, we implement FISHFUZZ over two well-established greybox fuzzers (AFL and AFL++). We evaluate FISHFUZZ by leveraging all sanitizer labels as targets. In comparison to modern DGFs and state-of-the-art coverage guided fuzzers, FISHFUZZ reaches higher coverage compared to the direct competitors, finds up to 2.8x more bugs compared with the baseline and reproduces 68.3% existing bugs faster. FISHFUZZ also discovers 56 new bugs (38 CVEs) in 47 programs.

[arxiv'23] FGo: A Directed Grey-box Fuzzer with Probabilistic Exponential cut-the-loss Strategies

[paper] [project]

[paper]

Fuzzing is a widely adopted technique in the software industry to enhance security and software quality. However, most existing fuzzers are specifically designed for monolithic software architectures and face significant limitations when it comes to serving distributed Microservices applications (Apps). These limitations primarily revolve around issues of inconsistency, communication, and applicability which arise due to the differences in monolithic and distributed software architecture. This paper presents a novel fuzzing framework, called MicroFuzz, specifically designed for Microservices. Mocking-Assisted Seed Execution, Distributed Tracing, Seed Refresh and Pipeline Parallelism approaches are adopted to address the environmental complexities and dynamics of Microservices and improve the efficiency of fuzzing. MicroFuzz has been successfully implemented and deployed in AntGroup, a prominent FinTech company. Its performance has been evaluated in three distinct industrial scenarios: normalized fuzzing, iteration testing, and taint verification. Throughout five months of operation, MicroFuzz has diligently analyzed a substantial codebase, consisting of 261 Apps with over 74.6 million lines of code (LOC). The framework’s effectiveness is evident in its detection of 5,718 potential quality or security risks, with 1,764 of them confirmed and fixed as actual security threats by software specialists. Moreover, MicroFuzz significantly increased line coverage by 12.24% and detected new paths by 38.42% in the iteration testing.

[S&P'24] Everything is Good for Something: Counterexample-Guided Directed Fuzzing via Likely Invariant Inference

[paper]

Directed fuzzing demonstrates the potential to reproduce bug reports, verify patches, and debug vulnerabilities. State-of-the-art directed fuzzers prioritize inputs that are more likely to trigger the target vulnerability or filter irrelevant inputs unrelated to the targets. Despite these efforts, existing approaches struggle to reproduce specific vulnerabilities as most generated inputs are irrelevant. For instance, in the Magma benchmark, more than 94% of generated inputs miss the target vulnerability. We call this challenge the indirect input generation problem. We propose to increase the yield of inputs that reach the target location by restraining input generation. Our key insight is to infer likely invariants from both reachable and unreachable executed inputs to constrain the search space of the subsequent input generation and produce more reachable inputs. Moreover, we propose two selection strategies to minimize the fraction of unnecessary inputs for efficient invariant inference and deprioritize imprecise invariants for effective input generation. Halo, our prototype implementation, outperforms state-of-the-art directed fuzzers with a 15.3x speedup in reproducing target vulnerabilities by generating 6.2x more reachable inputs. During our evaluation, we also detected ten previously unknown bugs involving seven incomplete fixes in the latest versions of well-fuzzed targets.

[S&P'24] LABRADOR: Response Guided Directed Fuzzing for Black-box IoT Devices

Fuzzing is a popular solution to finding vulnerabilities in software including IoT firmware. However, due to the challenges of emulating or rehosting firmware, some IoT devices (e.g., enterprise-level devices) can only be fuzzed in a black-box manner, which makes fuzzers blind and inefficient due to missing feedback (e.g., code coverage or distance). In this paper, we present a novel response guided directed fuzzing solution LABRADOR, able to test black-box IoT devices efficiently. First, we leverage the network response to infer the execution trace of firmware and deduce the code coverage of testing. Second, we leverage the test case (i.e., request) and its response to estimate the distance to the target sensitive code (i.e., sink). Lastly, we further leverage the distance to guide test case mutation, which efficiently drives directed fuzzing toward candidate vulnerable code. We have implemented a prototype of LABRADOR and evaluated it on 14 different enterprise-level IoT devices. Results showed that LABRADOR significantly outperforms state-of-the-art (SOTA) solutions. It finds 44X more vulnerabilities than SNIPUZZ, BOOFUZZ and FIRM-AFL and 8.57X more vulnerabilities than SaTC. In total, it discovered 79 unknown vulnerabilities, of which 61 were assigned CVEs.

[Usenix'24] SDFUZZ: Target States Driven Directed Fuzzing

[paper]

Directed fuzzers often unnecessarily explore program code and paths that cannot trigger the target vulnerabilities. We observe that the major application scenarios of directed fuzzing provide detailed vulnerability descriptions, from which highly valuable program states (i.e., target states) can be derived, e.g., call traces when a vulnerability gets triggered. By driving to expose such target states, directed fuzzers can exclude massive unnecessary exploration. Inspired by the observation, we present SDFUZZ, an efficient directed fuzzing tool driven by target states. SDFUZZ first automatically extracts target states from vulnerability reports and static analysis results. SDFUZZ employs a selective instrumentation technique to reduce the fuzzing scope to the code required for reaching target states. SDFUZZ then terminates the execution of a test case early once it determines that the remaining execution cannot reach the target states. It further uses a new target state feedback and refines the prior imprecise distance metric into a two-dimensional feedback mechanism to proactively drive the exploration towards the target states. We thoroughly evaluated SDFUZZ on known vulnerabilities and compared it to related works. The results show that SDFUZZ could improve vulnerability exposure capability with more vulnerabilities triggered and less time used, outperforming the state-of-the-art solutions. SDFUZZ could significantly improve the fuzzing throughput. Our application of SDFUZZ to automatically validate the static analysis results successfully discovered four new vulnerabilities in well-tested applications. Three of them have been acknowledged by developers.

[NDSS'24] DeepGo: Predictive Directed Greybox Fuzzing

[paper] [project]

Directed Greybox Fuzzing (DGF) is an effective approach designed to strengthen testing vulnerable code areas via predefined target sites. The state-of-the-art DGF techniques redefine and optimize the fitness metric to reach the target sites precisely and quickly. However, optimizations for fitness metrics are mainly based on heuristic algorithms, which usually rely on historical execution information and lack foresight on paths that have not been exercised yet. Thus, those hard-to-execute paths with complex constraints would hinder DGF from reaching the targets, making DGF less efficient.

In this paper, we propose DeepGo, a predictive directed greybox fuzzer that can combine historical and predicted information to steer DGF to reach the target site via an optimal path. We first propose the path transition model, which models DGF as a process of reaching the target site through specific path transition sequences. The new seed generated by mutation would cause the path transition, and the path corresponding to the high-reward path transition sequence indicates a high likelihood of reaching the target site through it. Then, to predict the path transitions and the corresponding rewards, we use deep neural networks to construct a Virtual Ensemble Environment (VEE), which gradually imitates the path transition model and predicts the rewards of path transitions that have not been taken yet. To determine the optimal path, we develop a Reinforcement Learning for Fuzzing (RLF) model to generate the transition sequences with the highest sequence rewards. The RLF model can combine historical and predicted path transitions to generate the optimal path transition sequences, along with the policy to guide the mutation strategy of fuzzing. Finally, to exercise the high-reward path transition sequence, we propose the concept of an action group, which comprehensively optimizes the critical steps of fuzzing to realize the optimal path to reach the target efficiently. We evaluated DeepGo on 2 benchmarks consisting of 25 programs with a total of 100 target sites. The experimental results show that DeepGo achieves 3.23×, 1.72×, 1.81×, and 4.83× speedup compared to AFLGo, BEACON, WindRanger, and ParmeSan, respectively in reaching target sites, and 2.61×, 3.32×, 2.43× and 2.53× speedup in exposing known vulnerabilities.

[FSE'24] Evaluating Directed Fuzzers: Are We Heading in the Right Direction?

[paper] [project] [artifact] [slides]

Directed fuzzing recently has gained significant attention due to its ability to reconstruct proof-of-concept (PoC) test cases for target code such as buggy lines or functions. Surprisingly, however, there has been no in-depth study on the way to properly evaluate directed fuzzers despite much progress in the field. In this paper, we present the first systematic study on the evaluation of directed fuzzers. In particular, we analyze common pitfalls in evaluating directed fuzzers with extensive experiments on five state-of-the-art tools, which amount to 30 CPU-years of computational effort, in order to confirm that different choices made at each step of the evaluation process can significantly impact the results. For example, we find that a small change in the crash triage logic can substantially affect the measured performance of a directed fuzzer, while the majority of the papers we studied do not fully disclose their crash triage scripts. We argue that disclosing the whole evaluation process is essential for reproducing research and facilitating future work in the field of directed fuzzing. In addition, our study reveals that several common evaluation practices in the current directed fuzzing literature can mislead the overall assessments. Thus, we identify such mistakes in previous papers and propose guidelines for evaluating directed fuzzers.

[paper]

Directed Greybox Fuzzing has proven effective in vulnerability detection areas such as bug reproduction and patch testing. However, existing directed fuzzers are often difficult to customize, lack modularity and have limited binary support. This constrains their usability on complex software or when the source code is unavailable; a challenge encountered when fuzzing embedded systems. This article addresses these limitations by introducing the Directed Fuzzing Toolkit (DRIFT) as a platform for directed fuzzing within the modular framework LibAFL. DRIFT modularizes techniques from the state-of-the-art directed fuzzer AFLGo and adapts them for binary applications thereby augmenting LibAFL’s highly customizable fuzzers with directed fuzzing capabilities. Additionally, by leveraging Ghidra’s analysis, DRIFT achieves architecture agnostic static analysis, opening doors for DGF to tackle previously challenging scenarios. Our evaluation of DRIFT shows a 90% correlation in static analysis metrics over binary compared to its source-code counterpart. Fuzzing performance was also notable despite operating over emulation. In benchmarks, DRIFT’s performance exceeds the original fuzzer with up to doubled bug discovery rates and 9–40x faster exploitation times of target bugs. These results are attributed to the toolkit’s modular design and its integration with LibAFL. Additionally, DRIFT includes a profiling platform for DGF metrics and is incorporated with the Magma benchmark. Together, these features position DRIFT as a practical advancement in directed fuzzing within LibAFL.

[Fuzzing'24] Effective Fuzzing within CI/CD Pipelines (Registered Report)

[paper] [artifact]

Click to see the abstract!

Deploying fuzzing within CI/CD pipelines can help ensure safe and secure code evolution. Directed greybox fuzzing techniques such as AFLGo are a good match for the CI/CD context. These techniques prioritise inputs based on estimated distances to the changed code. Unfortunately, computing these distances is often expensive, making the techniques impractical for short CI/CD runs. In this paper, we propose an AFLGo-based technique called PaZZER, which optimises the distance calculation by dropping the expensive control-flow graph component and computing the callgraph component in an incremental fashion. Preliminary results are promising, showing that PaZZER can make CI/CD testing feasible for large applications: e.g., for Objdump the distance computation time is decreased from 34 min to just 2.5 min, with a further 2.3 min saved when an incremental algorithm is used. The significant time reduction in distance computation allows PaZZER to use most of the time on actual fuzzing, making it practical for short CI/CD runs of around 10 minutes. Our planned full evaluation will involve real-world commits from a diverse set of nine applications of different sizes. This will include coverage experiments and an ablation study to investigate the impact of PaZZER’s design decisions, and a bug-finding case study comparing it against AFLGo and Google’s CIFuzz. We will assess the benefits and effectiveness of our approach in terms of patch coverage, patch proximity, distance computation time, and time-to-exposure for bugs.
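The abstract above describes keeping only the callgraph component of the distance. As a rough illustration of what a callgraph-only distance can look like, here is a minimal sketch, not PaZZER's or AFLGo's actual code. It assumes the call graph is a plain dictionary and combines hop counts to the reachable targets with a harmonic mean; the exact weighting and the names `call_graph`, `targets`, `shortest_hops`, and `function_distance` are hypothetical and chosen only for this example.

```python
from collections import deque


def shortest_hops(call_graph, source):
    """Breadth-first search: hop count from `source` to each reachable function."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        fn = queue.popleft()
        for callee in call_graph.get(fn, ()):
            if callee not in dist:
                dist[callee] = dist[fn] + 1
                queue.append(callee)
    return dist


def function_distance(call_graph, fn, targets):
    """Harmonic mean of hop counts from `fn` to the reachable targets.

    Returns None if no target is reachable and 0.0 if `fn` is itself a target.
    This weighting is an assumption made for the sketch.
    """
    hops = shortest_hops(call_graph, fn)
    reachable = [hops[t] for t in targets if t in hops]
    if not reachable:
        return None
    if 0 in reachable:
        return 0.0
    return len(reachable) / sum(1.0 / h for h in reachable)


# Hypothetical call graph: caller -> list of callees.
call_graph = {
    "main": ["parse", "log"],
    "parse": ["decode"],
    "decode": ["patched_fn"],
    "log": [],
}
targets = {"patched_fn"}  # functions touched by the commit under test

for name in call_graph:
    print(name, function_distance(call_graph, name, targets))
```

A seed whose execution covers functions with smaller distances would be treated as closer to the changed code. Functions that cannot reach any target, such as `log` here, get no distance at all.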

[Fuzzing'24] Directed or Undirected: Investigating Fuzzing Strategies in a CI/CD Setup (Registered Report)

[paper]

Click to see the abstract!

Fuzzing best practices suggest that fuzzing should be run for at least 24 hours, if not longer. This recommendation makes it hard to integrate fuzzing into CI/CD contexts, to rapidly check a commit for bugs. Existing studies on CI/CD fuzzing simulated a CI/CD environment by running undirected fuzzers on Magma benchmark programs, which have multiple bugs injected into a single version of the program. Directed fuzzers, such as AFLGo, aim to generate inputs that reach specific target locations in the program being fuzzed. Thus, they should be more effective at fuzzing in a CI/CD environment. In this study, we propose to evaluate both directed and undirected fuzzers in a simulated CI/CD environment. Like prior work, we will use Magma as a source of benchmarks, and run fuzzers for 10 minutes. Unlike prior work, we will start the fuzzing process from a saturated corpus, rather than Magma’s default corpus. Also unlike prior work, we will run the fuzzers on versions of Magma programs with a single bug injected. To deal with the threat that Magma patches give directed fuzzers access to too precise information as to the bug location, we will also conduct experiments where we add additional lines of target code, to evaluate the sensitivity of directed fuzzers. Our registered report gives preliminary results on a small subset of benchmarks.

[arxiv'24] An Empirical Study on the Distance Metric in Guiding Directed Grey-box Fuzzing

[paper]

Click to see the abstract!

Directed grey-box fuzzing (DGF) aims to discover vulnerabilities in specific code areas efficiently. The distance metric, which is used to measure the quality of seeds in DGF, is a crucial factor affecting fuzzing performance. Despite distance metrics being widely applied in existing DGF frameworks, it remains unclear how different distance metrics guide the fuzzing process and affect the fuzzing result in practice. In this paper, we conduct the first empirical study to explore how different distance metrics perform in guiding DGFs. Specifically, we systematically discuss different distance metrics in terms of calculation method and granularity. Then, we implement different distance metrics based on AFLGo. On this basis, we conduct comprehensive experiments to evaluate the performance of these distance metrics on the benchmarks widely used in existing DGF-related work. The experimental results demonstrate the following insights. First, the difference among different distance metrics with varying methods of calculation and granularities is not significant. Second, the distance metrics may not be effective in describing the difficulty of triggering the target vulnerability. In addition, by scrutinizing the quality of testcases, our research highlights the inherent limitation of existing mutation strategies in generating high-quality testcases, calling for designing effective mutation strategies for directed fuzzing. We open-source the implementation code and experiment dataset to facilitate future research in DGF.

[arxiv'24] TransferFuzz: Fuzzing with Historical Trace for Verifying Propagated Vulnerability Code

[paper] [project]

Click to see the abstract!

Code reuse in software development frequently facilitates the spread of vulnerabilities, making the scope of affected software in CVE reports imprecise. Traditional methods primarily focus on identifying reused vulnerability code within target software, yet they cannot verify if these vulnerabilities can be triggered in new software contexts. This limitation often results in false positives. In this paper, we introduce TransferFuzz, a novel vulnerability verification framework, to verify whether vulnerabilities propagated through code reuse can be triggered in new software. Innovatively, we collected runtime information during the execution or fuzzing of the basic binary (the vulnerable binary detailed in CVE reports). This process allowed us to extract historical traces, which proved instrumental in guiding the fuzzing process for the target binary (the new binary that reused the vulnerable function). TransferFuzz introduces a unique Key Bytes Guided Mutation strategy and a Nested Simulated Annealing algorithm, which transfer these historical traces to implement trace-guided fuzzing on the target binary, facilitating the accurate and efficient verification of the propagated vulnerability. Our evaluation, conducted on widely recognized datasets, shows that TransferFuzz can quickly validate vulnerabilities previously unverifiable with existing techniques. Its verification speed is 2.5 to 26.2 times faster than existing methods. Moreover, TransferFuzz has proven its effectiveness by expanding the impacted software scope for 15 vulnerabilities listed in CVE reports, increasing the number of affected binaries from 15 to 53. The datasets and source code used in this article are available at https://github.com/Siyuan-Li201/TransferFuzz.
