---
layout: default
title: FAQ
has_children: false
nav_order: 7
permalink: /faq/
---

- TOC
{:toc}
Evaluating fuzz testing tools properly and rigorously is difficult, and typically requires time and resources that most researchers do not have. A study on Evaluating Fuzz Testing analyzed 32 fuzzing research papers and found that "no paper adheres to a sufficiently high standard of evidence to justify general claims of effectiveness". This is a problem because it leads to unreproducible results.
We created FuzzBench so that all researchers and developers can evaluate their tools according to best practices and guidelines, with minimal effort and for free.
We are planning to extend the system to measure bugs as well.
The most challenging part of fuzzing is generating inputs that exercise different parts of the program. How effective a fuzzer is at this program state discovery is best measured with a coverage metric, which is why we started with coverage. Measuring this with bugs would be difficult, because bugs are typically sparse in a program. Coverage, on the other hand, is a good proxy for bugs as well: a fuzzer cannot find a bug in code it does not cover.
You don't have to. We made FuzzBench fully open source so that anybody can reproduce the experiments. Also, we'd like FuzzBench to be a community-driven platform. Contributions and suggestions to make the platform better are welcome.
We are running the free FuzzBench service on Google Cloud, and the current implementation has some Google Cloud-specific bits in it. You can use the code to run FuzzBench yourself on Google Cloud. Our docs explain how to do this [here]({{ site.baseurl }}/running-your-own-experiment/running-an-experiment/).
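As a rough illustration, launching your own experiment looks something like the sketch below. The experiment name and config file are placeholders, and the exact script and flags are defined in the linked guide, so treat that guide as authoritative.

```bash
# Illustrative sketch only -- see the running-an-experiment guide for the
# authoritative, up-to-date invocation. Names below are placeholders.
PYTHONPATH=. python3 experiment/run_experiment.py \
    --experiment-config experiment-config.yaml \
    --experiment-name my-coverage-experiment \
    --fuzzers afl libfuzzer \
    --benchmarks zlib_zlib_uncompress_fuzzer libpng-1.2.56
```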
We are also working on making it easier to run FuzzBench in other environments (a local cluster, other cloud providers, Kubernetes, etc.). Community contributions that make it easier to run on different platforms are more than welcome.
Yes! For our initial launch, we picked only a few fuzzers (e.g. AFL, libFuzzer) to get things started. We welcome all researchers to add their tools to the FuzzBench platform for automated, continuous, and free evaluation. Please follow the instructions provided [here]({{ site.baseurl }}/getting-started/adding-a-new-fuzzer/).
Sure. However, please make sure you have configured it properly. It's easy to get configuration details wrong in ways that affect the results. If you can, please reach out to the authors to confirm that your configuration looks good to them.
## I'd like to get my fuzzer evaluated, but I don't want the results and/or code to be public yet. Can I use the FuzzBench service?
Probably yes. We run private experiments for this purpose. Please reach out to us at [email protected]. If we agree to benchmark your fuzzer, please follow the [adding a new fuzzer]({{ site.baseurl }}/getting-started/adding-a-new-fuzzer/) guide to integrate your fuzzer with FuzzBench.
You can ignore the sections on [Requesting an experiment]({{ site.baseurl }}/getting-started/adding-a-new-fuzzer/#requesting-an-experiment) and
[Submitting your integration]({{ site.baseurl }}/getting-started/adding-a-new-fuzzer/#submitting-your-integration).
Please test that your fuzzer works with our benchmarks; we don't have CI to
verify this for private experiments.
Ideally, you should test all benchmarks using `make -j test-run-$FUZZER-all`.
This takes too long on most machines, so you should at least test a few of them,
for example: `make test-run-$FUZZER-zlib_zlib_uncompress_fuzzer test-run-$FUZZER-libpng-1.2.56`.
You should also run `make presubmit` to validate the fuzzer's name and
integration code.

When your fuzzer is ready, send us a patch file that applies cleanly to
FuzzBench with `git apply <patch_file>`.
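One way to produce such a patch file is sketched below. This assumes your integration is committed on a local branch and that the upstream default branch is `master`; adjust it to your checkout.

```bash
# Sketch only: commit your fuzzers/<your_fuzzer>/ changes on a local branch,
# then diff them against the upstream default branch (assumed to be 'master').
git diff origin/master HEAD > your_fuzzer.patch

# Check that the patch applies cleanly to a fresh FuzzBench checkout.
git apply --check your_fuzzer.patch
```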
We have chosen a large and diverse set of real-world benchmarks precisely to avoid technique over-fitting. These are some of the most widely used open source projects that process a wide variety of input formats. We believe that the evaluation results will generalize due to the size and diversity of our benchmarks. However, we are always open to suggestions from the community to improve it.
We have picked a large and diverse set of real-world benchmarks. This includes projects that are widely used and hence have a critical impact on infrastructure and user security.
Many of these benchmarks come from open source projects that are integrated into our community fuzzing service, OSS-Fuzz.
We welcome recommendations for adding a new benchmark to the FuzzBench platform. A new benchmark should satisfy these criteria:
- Should be a commonly used OSS project.
- Should have a non-trivial codebase (e.g. not a CRC32 implementation).
Please follow the instructions [here]({{ site.baseurl }}/developing-fuzzbench/adding-a-new-benchmark/) to add a new benchmark.
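As a rough sketch of what that involves (the benchmark name below is hypothetical, and the linked guide describes the exact files a benchmark directory must contain):

```bash
# Hypothetical new benchmark directory; the adding-a-new-benchmark guide
# describes the required contents (build script, Dockerfile, metadata).
ls benchmarks/myproject_fuzz_target

# Smoke-test the new benchmark against an existing fuzzer.
make test-run-afl-myproject_fuzz_target
```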
Please file an issue on GitHub or send a pull request fixing the problem.
You can use the following BibTeX entry:

{% raw %}

    @inproceedings{FuzzBench,
      author = {Metzman, Jonathan and Szekeres, L\'{a}szl\'{o} and Maurice Romain Simon, Laurent and Trevelin Sprabery, Read and Arya, Abhishek},
      title = {{FuzzBench: An Open Fuzzer Benchmarking Platform and Service}},
      year = {2021},
      isbn = {9781450385626},
      publisher = {Association for Computing Machinery},
      address = {New York, NY, USA},
      url = {https://doi.org/10.1145/3468264.3473932},
      doi = {10.1145/3468264.3473932},
      booktitle = {Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
      pages = {1393--1403},
      numpages = {11},
      series = {ESEC/FSE 2021}
    }

{% endraw %}