---
layout: default
title: Corpora
parent: Advanced topics
nav_order: 3
permalink: /advanced-topics/corpora/
---
{: .no_toc}
If you want to access the corpora that we are using for your fuzz targets (synthesized by the fuzzing engines), follow these steps.
- TOC
{:toc}
To get access to a project's corpora, you must be listed as the
primary contact or as an auto cc in the project's `project.yaml`
file, as described
in the [New Project Guide]({{ site.baseurl }}/getting-started/new-project-guide/#projectyaml).
If you aren't listed there, most of the links below won't work.
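
As a rough way to confirm that your account appears in a local copy of `project.yaml`, a sketch like the following may help. The helper name `is_listed` is hypothetical, and the check is deliberately loose: it only looks for the address anywhere in the file, not specifically in the primary contact or auto cc entries.

```bash
# Hypothetical helper: succeed if the given email address appears
# anywhere in the given project.yaml (case-insensitive, literal match).
# Usage: is_listed <email> <path/to/project.yaml>
is_listed() {
  if [ "$#" -ne 2 ]; then
    echo "usage: is_listed <email> <path/to/project.yaml>" >&2
    return 2
  fi
  # -F: treat the email as a fixed string, -i: ignore case, -q: no output
  grep -F -i -q -- "$1" "$2"
}
```

For example, `is_listed you@example.com project.yaml && echo listed` prints `listed` only when the address is found in the file.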
The corpora for fuzz targets are stored on Google Cloud
Storage. To access them, you need to
install the `gsutil`
tool, which is part of
the Google Cloud SDK. Follow the instructions on the installation page to
log in with the Google account listed in your project's `project.yaml`
file.
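
Before continuing, it can be worth checking that `gsutil` is actually on your `PATH`. The snippet below is a minimal sketch; `require_tool` is a hypothetical helper name, not part of the Google Cloud SDK.

```bash
# Hypothetical helper: succeed only if the named command is on PATH.
# Usage: require_tool <command>
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "error: '$1' not found on PATH; install the Google Cloud SDK first" >&2
  return 1
}

# Typical use (commented out so that sourcing this file has no side effects):
#   require_tool gsutil && gsutil version
```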
The fuzzer statistics page for your project on [ClusterFuzz]({{ site.baseurl }}/further-reading/clusterfuzz) contains a link to the Google Cloud console for your corpus under the `corpus_size` column. Click the link to browse and download individual test inputs in the corpus.
If you want to download the entire corpus, click the link in the `corpus_size` column, then copy the Buckets path at the top of the page:
Copy the corpus to a directory on your machine by running the following command:
```bash
$ gsutil -m cp -r gs://<bucket_path> <local_directory>
```
For example, for the expat project this would be:
```bash
$ gsutil -m cp -r \
    gs://expat-corpus.clusterfuzz-external.appspot.com/libFuzzer/expat_parse_fuzzer \
    <local_directory>
```
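
If you download corpora regularly, the `gsutil` command shown above can be wrapped in a small shell function. This is only a sketch: `download_corpus` and the `DRY_RUN` variable are hypothetical names introduced here, and the function assumes `gsutil` is installed and that you are already logged in.

```bash
# Hypothetical wrapper around the gsutil copy command.
# Usage: download_corpus <bucket_path> <local_directory>
# Set DRY_RUN=1 to print the command instead of running it.
download_corpus() {
  if [ "$#" -ne 2 ]; then
    echo "usage: download_corpus <bucket_path> <local_directory>" >&2
    return 2
  fi
  bucket_path=${1#gs://}   # accept the path with or without the gs:// prefix
  local_dir=$2

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "gsutil -m cp -r gs://$bucket_path $local_dir"
    return 0
  fi

  # Create the destination first so the copy has somewhere to land.
  mkdir -p "$local_dir" || return 1
  gsutil -m cp -r "gs://$bucket_path" "$local_dir"
}
```

Running it with `DRY_RUN=1` first lets you confirm the exact command before any data is transferred.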
We keep daily zipped backups of your corpora. These can be accessed from the
`corpus_backup` column of the fuzzer statistics page. Downloading these can
be significantly faster than running `gsutil -m cp -r`
on the corpus bucket.
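
Once a backup has been downloaded, it needs to be extracted before it can be used as a corpus directory. The following is a minimal sketch, assuming the backup is a standard zip archive and that the `unzip` tool is installed; `unpack_backup` is a hypothetical helper name.

```bash
# Hypothetical helper: extract a downloaded corpus backup into a directory.
# Usage: unpack_backup <backup.zip> <local_directory>
unpack_backup() {
  if [ "$#" -ne 2 ]; then
    echo "usage: unpack_backup <backup.zip> <local_directory>" >&2
    return 2
  fi
  if [ ! -f "$1" ]; then
    echo "error: backup file '$1' not found" >&2
    return 1
  fi
  mkdir -p "$2" || return 1
  # -q: quiet, -o: overwrite existing files without prompting
  unzip -q -o "$1" -d "$2"
}
```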