
# OEMCrypto Fuzzing - Build clusterfuzz and run fuzzing

## Objective

*   Fuzz the OEMCrypto public APIs on Linux, using a locally built copy of the
    open source clusterfuzz framework, in order to find security
    vulnerabilities.

    [Clusterfuzz][1]

*   Partners who implement OEMCrypto can follow these instructions to build
    clusterfuzz (the fuzzing framework) and run fuzzing using the fuzzer
    scripts provided by the Widevine team at Google.

## Glossary

*   Fuzzing - A testing methodology in which random, unexpected, or otherwise
    interesting inputs are fed to APIs in an attempt to crash them, thereby
    exposing security vulnerabilities in the code.

*   Fuzzing engines - [libfuzzer][4], afl, and honggfuzz are fuzzing engines.
    An engine collects coverage information from the API under test and uses
    it to generate more interesting inputs to pass to the fuzzer.

*   Seed corpus - A set of initial inputs that a fuzzer can accept and pass to
    the API. Generating interesting inputs starting from an empty file is
    inefficient, so the fuzzing engine mutates the seed corpus to generate
    more inputs for the fuzzer.

*   Clusterfuzz - ClusterFuzz is a scalable fuzzing infrastructure that finds
    security and stability issues in software. Google uses ClusterFuzz to fuzz
    all Google products. Clusterfuzz provides the tools to upload fuzz
    binaries, run them with the fuzzing engines, find crashes, and organize
    the results. The clusterfuzz framework is open source; the source code can
    be downloaded and the framework built locally or on Google Cloud.

*   Fuzzing output - Fuzzing passes random inputs to an API to check that the
    API is crash resistant; it does not test functionality. Fuzz scripts run
    continuously until they find a crash in the API under test.
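As a rough illustration of the seed corpus entry above, the sketch below mutates seed inputs and feeds them to a target until one raises. It is a toy under stated assumptions: `mutate`, `fuzz`, and `parse_header` are hypothetical names, not part of OEMCrypto or clusterfuzz, and the loop has no coverage feedback, which real engines such as libfuzzer use to decide which inputs to keep.

```python
import random
from typing import Callable, Optional, Sequence


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one random byte-level mutation: bit flip, insert, or delete."""
    buf = bytearray(data)
    # An empty input can only grow.
    op = rng.choice(("flip", "insert", "delete")) if buf else "insert"
    if op == "flip":
        pos = rng.randrange(len(buf))
        buf[pos] ^= 1 << rng.randrange(8)
    elif op == "insert":
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    else:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def fuzz(target: Callable[[bytes], None], seeds: Sequence[bytes],
         iterations: int, rng: random.Random) -> Optional[bytes]:
    """Return the first mutated seed that makes target raise, else None.

    seeds must be non-empty; each iteration mutates one randomly chosen seed.
    """
    for _ in range(iterations):
        candidate = mutate(rng.choice(seeds), rng)
        try:
            target(candidate)
        except Exception:
            return candidate
    return None


def parse_header(data: bytes) -> None:
    """Hypothetical buggy API: fails on a valid prefix with extra bytes."""
    if data.startswith(b"OEM") and len(data) > 4:
        raise ValueError("unexpected trailing bytes")
```

With the seed `b"OEMC"`, a call such as `fuzz(parse_header, [b"OEMC"], 5000, random.Random(1))` needs only a single well-placed insertion to hit the bug. Starting from an empty seed, this toy would never get there, because it does not keep mutated inputs between iterations.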

## Building fuzz scripts

This section outlines the steps to build fuzz binaries that can be run
continuously using clusterfuzz.

> **Note:** All the directories mentioned below are relative to the cdm
> repository root directory.

1.  Fuzz scripts for OEMCrypto APIs are provided by the Widevine team at Google
    and are located under the `oemcrypto/test/fuzz_tests` directory.

> **Note:** The prerequisites for the following step are listed [here][10]. You
> also need to install ninja.

2.  Build a static library of your OEMCrypto implementation.
    *   Compile and link your OEMCrypto implementation source with the
        `-fsanitize=address,fuzzer` flag, as per these [instructions][9], when
        building the static library.

    *   Run the `./oemcrypto/test/fuzz_tests/build_partner_oemcrypto_fuzztests
        <oemcrypto_static_library_path>` script from the cdm repository root
        directory.

    *   This generates the fuzz binaries under the `out/Default` directory.


> **Note:** Alternatively, you can use your own build system, for which you
> will need to define your own build files with the OEMCrypto fuzz source files
> included. You can find the fuzz source files in
> `oemcrypto/test/fuzz_tests/partner_oemcrypto_fuzztests.gyp` and
> `oemcrypto/test/fuzz_tests/partner_oemcrypto_fuzztests.gypi`.

3.  The seed corpus for each fuzz script can be found under the
    `oemcrypto/test/fuzz_tests/corpus` directory. Some fuzzers are simple and do
    not have a seed corpus associated with them.

4.  Create a zip file `oemcrypto_fuzzers_yyyymmddhhmmss.zip` containing the fuzz
    binaries and their respective seed corpus zip files. The structure of a
    sample zip file with fuzzer binaries and seed corpora looks like the
    following:

    ```
    *   fuzzerA
    *   fuzzerA_seed_corpus.zip
    *   fuzzerB
    *   fuzzerB_seed_corpus.zip
    *   fuzzerC (fuzzerC doesn't have seed corpus associated with it)
    ```
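The packaging in step 4 can be scripted. The following is a minimal sketch, not a tool shipped with the cdm repository: `archive_name`, `zip_corpus`, and `package_fuzzers` are hypothetical helpers, and the caller supplies the mapping from each fuzz binary to its seed corpus directory (or `None`), since the layout of the corpus directory is not assumed here.

```python
import io
import zipfile
from datetime import datetime
from pathlib import Path
from typing import Mapping, Optional


def archive_name(now: Optional[datetime] = None) -> str:
    """Return oemcrypto_fuzzers_yyyymmddhhmmss.zip for the given time."""
    now = now or datetime.now()
    return f"oemcrypto_fuzzers_{now:%Y%m%d%H%M%S}.zip"


def zip_corpus(corpus_dir: Path) -> bytes:
    """Zip every file under corpus_dir into an in-memory archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(corpus_dir.rglob("*")):
            if path.is_file():
                zf.write(path, path.relative_to(corpus_dir).as_posix())
    return buf.getvalue()


def package_fuzzers(fuzzers: Mapping[Path, Optional[Path]], out_dir: Path,
                    now: Optional[datetime] = None) -> Path:
    """Write the upload archive: each binary plus <name>_seed_corpus.zip."""
    out_path = out_dir / archive_name(now)
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for binary, corpus_dir in fuzzers.items():
            # Store the binary at the top level of the archive.
            zf.write(binary, binary.name)
            # Fuzzers without a seed corpus are passed as None and skipped.
            if corpus_dir is not None:
                zf.writestr(f"{binary.name}_seed_corpus.zip",
                            zip_corpus(corpus_dir))
    return out_path
```

For example, `package_fuzzers({Path("out/Default/fuzzerA"): Path("corpus_for_A"), Path("out/Default/fuzzerC"): None}, Path("."))` would produce an archive with the same shape as the sample above; the fuzzer names and `corpus_for_A` are placeholders.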

## Building clusterfuzz

*   An OEMCrypto implementation can be fuzzed by building the open source
    clusterfuzz code and using it to run the fuzzers. Use a Linux VM to build
    clusterfuzz.

> **Note:** You may see errors about missing Python modules; if so, install
> those modules. If you have multiple versions of Python on the VM, use
> `python<version> -m pipenv shell` when you reach [this][3] step.

*   Follow these [instructions][2] to download the clusterfuzz repository and
    either build it locally or set up a continuous fuzzing infrastructure on
    Google Cloud.

## Running fuzzers on local clusterfuzz instance

*   If you prefer to run fuzzing on a local machine instead of a production
    setup on Google Cloud, follow these [instructions][5] to add a job to the
    local clusterfuzz instance.

> **Note:** The job name should include the fuzzing engine and the sanitizer.
> For example, a libfuzzer job using asan should have `libfuzzer_asan` in its
> name.

*   Create a job, e.g. `libfuzzer_asan_oemcrypto`, and upload the previously
    created `oemcrypto_fuzzers_yyyymmddhhmmss.zip` as a custom build. Each
    future upload must have a file name that sorts after the current one;
    following the timestamp naming scheme above ensures that zip file names
    are always in ascending order.

*   Once the job is added and the clusterfuzz bot is running, fuzzing should
    start. Results can be monitored as described [here][6].

*   On a local clusterfuzz instance, only one fuzzer is fuzzed at a time.
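The ordering requirement on upload names can be checked before uploading. This is a small sketch with hypothetical helper names (`archive_timestamp`, `is_newer`); it relies only on the fact that fixed-width `yyyymmddhhmmss` strings sort in the same order as the times they encode.

```python
import re

# Matches the naming scheme used in this document.
_ARCHIVE_NAME = re.compile(r"^oemcrypto_fuzzers_(\d{14})\.zip$")


def archive_timestamp(name: str) -> str:
    """Extract the 14-digit yyyymmddhhmmss stamp from an archive name."""
    match = _ARCHIVE_NAME.match(name)
    if match is None:
        raise ValueError(f"not an oemcrypto_fuzzers archive name: {name!r}")
    return match.group(1)


def is_newer(candidate: str, current: str) -> bool:
    """True if candidate's name sorts strictly after current's name."""
    return archive_timestamp(candidate) > archive_timestamp(current)
```

A name that does not follow the scheme raises `ValueError` rather than being compared, so a mistyped file name is caught before it is uploaded.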

> **Note:** Fuzzing is time consuming. Finding issues, and having clusterfuzz
> run its regression tasks and confirm fixes, takes time. Fuzzing needs to run
> for at least a couple of weeks to achieve good coverage.

## Finding fuzz crashes

Once clusterfuzz finds an issue, it logs crash information such as the build,
the test case, and the stack trace for the crash.

*   The test cases tab shows the fuzz crash and the test case that caused it.
    Run `./fuzz_binary <test_case>` to debug the crash locally.
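When several test cases need to be replayed, the `./fuzz_binary <test_case>` command can be wrapped in a script. The sketch below is one possible wrapper; `reproduce` and `ReproResult` are hypothetical names, and it assumes that the fuzz binary exits with a non-zero status and writes its sanitizer report to stderr when the test case reproduces the crash, which is the usual behaviour of a libfuzzer binary built with asan.

```python
import subprocess
from dataclasses import dataclass
from pathlib import Path
from typing import Union


@dataclass
class ReproResult:
    returncode: int   # exit status of the fuzz binary
    crashed: bool     # True when the exit status is non-zero
    report: str       # captured stderr, where sanitizer reports are expected


def reproduce(fuzz_binary: Union[str, Path], test_case: Union[str, Path],
              timeout_s: float = 60.0) -> ReproResult:
    """Run the fuzz binary once on a single test case and capture the result.

    Raises subprocess.TimeoutExpired if the run exceeds timeout_s seconds.
    """
    proc = subprocess.run(
        [str(fuzz_binary), str(test_case)],
        capture_output=True,
        text=True,
        errors="replace",  # crash output may contain non-UTF-8 bytes
        timeout=timeout_s,
    )
    return ReproResult(proc.returncode, proc.returncode != 0, proc.stderr)
```

For example, `reproduce("./fuzz_binary", "path/to/test_case").report` returns the captured stderr of the run, which can be saved next to the test case while debugging.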

More information about the different types of logs is given below:

*   [Bot logs][7] show information related to fuzzing, such as the number of
    crashes that a particular fuzzer finds, the number of new crashes, and the
    number of known crashes.

*   [Local GCS][8] in your clusterfuzz checkout folder stores the fuzz binaries
    that are being fuzzed, the seed corpus, etc.

*   `local_gcs/test-fuzz-logs-bucket` stores information related to fuzz
    crashes, if any were found by the fuzzing engine. Crash information is
    categorized by fuzzer and by day. The test case that caused each crash is
    stored there as well.

*   `/path/to/my-bot/clusterfuzz/log.txt` contains log output from the fuzzer
    script and the OEMCrypto implementation.

## Fixing issues

*   Once you have debugged the crash using the test case, apply a fix to the
    implementation and create a new `oemcrypto_fuzzers_yyyymmddhhmmss.zip` with
    the latest fuzz binaries.

*   Upload the new zip file to the fuzz job that was created earlier.
    Clusterfuzz will recognize the fix and mark the crash as fixed in the test
    cases tab once the regression finishes. You do not need to mark crashes as
    fixed yourself.

[1]: https://google.github.io/clusterfuzz/
[2]: https://google.github.io/clusterfuzz/getting-started/
[3]: https://google.github.io/clusterfuzz/getting-started/prerequisites/#loading-pipenv
[4]: https://llvm.org/docs/LibFuzzer.html
[5]: https://google.github.io/clusterfuzz/setting-up-fuzzing/libfuzzer-and-afl/
[6]: https://google.github.io/clusterfuzz/setting-up-fuzzing/libfuzzer-and-afl/#checking-results
[7]: https://google.github.io/clusterfuzz/getting-started/local-instance/#viewing-logs
[8]: https://google.github.io/clusterfuzz/getting-started/local-instance/#local-google-cloud-storage
[9]: https://google.github.io/clusterfuzz/setting-up-fuzzing/libfuzzer-and-afl/#libfuzzer
[10]: https://google.github.io/clusterfuzz/setting-up-fuzzing/libfuzzer-and-afl/#prerequisites