threatcode · pull · May 12, 2026
diff --git a/infra/experimental/agent-skills/README.md b/infra/experimental/agent-skills/README.md
@@ -1,9 +1,145 @@
 # OSS-Fuzz agent skills
 
-Skills that can be easily used with agents e.g. gemini CLI.
+Skills and tooling that let an agent CLI (Gemini CLI or Claude Code) write,
+build, and extend OSS-Fuzz fuzzing integrations. The folder ships:
 
+- Six skills that give an agent OSS-Fuzz-specific knowledge.
+- `infra/experimental/agent-skills/helper.py`, a wrapper that launches agent
+  sessions over one or more OSS-Fuzz projects in parallel.
+- `copy_to_global.sh`, an installer that places the skills where your
+  agent CLI can find them.
 
-# Threat model for running
-This is experimental code and has an open threat model. By design, the agents execute untrusted code and are running in "dangerous"/"yolo" modes. As such, when running this tool you should assume you will be running untrusted code on your machine. You should only run this in a trusted environment and on a trusted network. In practice, this means you must run this in a heavily sandboxed environment, and from a security perspective if you run this tool you will run untrusted code in your environment.
+Before running anything in this folder, review the [threat model](#threat-model).
+This tooling is experimental and runs agents in unrestricted modes.
 
-This code does not run in OSS-Fuzz production services and is not part of the tooling that runs our continuous fuzzing of open source projects.
+## Contents
+
+| Item | Purpose |
+|---|---|
+| `fuzzing-memory-unsafe-expert/` | Skill: fuzz C/C++ projects |
+| `fuzzing-go-expert/` | Skill: fuzz Go projects |
+| `fuzzing-rust-expert/` | Skill: fuzz Rust projects (cargo-fuzz) |
+| `fuzzing-jvm-expert/` | Skill: fuzz JVM projects (Java/Kotlin/Scala) with Jazzer |
+| `fuzzing-python-expert/` | Skill: fuzz Python projects with Atheris |
+| `oss-fuzz-engineer/` | Skill: OSS-Fuzz infra workflows (integrate, fix, extend) |
+| `infra/experimental/agent-skills/helper.py` | Driver that launches agent sessions per OSS-Fuzz project |
+| `copy_to_global.sh` | Installs the skills into `~/.gemini/skills` or `~/.claude/skills` |
+
+See each skill's `SKILL.md` for the detailed guidance the agent receives.
+
+## Prerequisites
+
+- A supported agent CLI installed and on `PATH`:
+  [Gemini CLI](https://github.com/google-gemini/gemini-cli) or
+  [Claude Code](https://claude.com/claude-code).
+- Docker (required by OSS-Fuzz's own `infra/helper.py`, which the agent calls).
+- Python 3.
+- A local checkout of OSS-Fuzz. `infra/experimental/agent-skills/helper.py`
+  resolves the repo root relative to its own location, so run the copy that
+  lives in this checkout.
+
+## Quick start
+
+```bash
+# 1. Install the skills into your agent CLI (from infra/experimental/agent-skills/).
+./copy_to_global.sh gemini        # or: ./copy_to_global.sh claude
+
+# 2. Confirm the agent CLI is reachable.
+gemini --version                  # or: claude --version
+
+# 3. From the repo root, run a task across one or more OSS-Fuzz projects.
+python infra/experimental/agent-skills/helper.py fix-builds open62541 json-c htslib
+```
+
+`copy_to_global.sh` **overwrites** any existing skill of the same name in the
+target directory.
+
+## How the skills are used
+
+There are two ways the skills get invoked:
+
+1. **Interactively, in your agent CLI.** After you run `copy_to_global.sh`,
+   the skills are available to your agent, which selects one automatically
+   when a task matches its description. For example, asking the agent to
+   write a Python harness surfaces `fuzzing-python-expert`. You can use the
+   skills this way without touching `infra/experimental/agent-skills/helper.py`.
+
+2. **Driven by `infra/experimental/agent-skills/helper.py`.** The helper
+   builds task-specific prompts that reference these skills and launches
+   non-interactive agent sessions, one per OSS-Fuzz project, in parallel.
+   Use this when you want to run the same task across many projects.
+
+The agent makes local changes and writes a per-project report. It does
+**not** commit or push — review the diff and reports before you do anything
+with the output.
+
+## `infra/experimental/agent-skills/helper.py` commands
+
+Run `python infra/experimental/agent-skills/helper.py <command> --help` for
+full flag listings. A summary of the available subcommands:
+
+| Command | Purpose | Example |
+|---|---|---|
+| `expand-oss-fuzz-projects` | Add new harnesses / improve coverage on existing projects | `python infra/experimental/agent-skills/helper.py expand-oss-fuzz-projects open62541 json-c` |
+| `fix-builds` | Diagnose and fix broken project builds | `python infra/experimental/agent-skills/helper.py fix-builds htslib` |
+| `run-task` | Run an arbitrary `--task` string per project | `python infra/experimental/agent-skills/helper.py run-task --task "Add a harness for the XML attribute parser" open62541` |
+| `add-chronos-support` | Add Chronos support to a project | `python infra/experimental/agent-skills/helper.py add-chronos-support json-c` |
+| `integrate-project` | Onboard a new project from a Git URL | `python infra/experimental/agent-skills/helper.py integrate-project https://github.com/org/repo` |
+| `clean` | Remove local artifacts from previous agent runs | `python infra/experimental/agent-skills/helper.py clean open62541` |
+| `show-prompt` | Print the prompt that would be sent, without launching the agent | `python infra/experimental/agent-skills/helper.py show-prompt fix-builds htslib` |
+
+### Useful behaviors and flags
+
+- **Parallelism.** Sessions run concurrently, two at a time by default
+  (`DEFAULT_MAX_PARALLEL = 2`). Raise the limit with the helper's parallelism
+  flag (listed under `--help`) if your machine can handle more Docker builds.
+- **Agent auto-detection.** `infra/experimental/agent-skills/helper.py`
+  locates the agent CLI on `PATH` automatically — you do not need to tell
+  it whether you are using Gemini CLI or Claude Code.
+- **Dry runs.** `show-prompt` prints the exact prompt that would be sent.
+  Use it first when trying a new command or task description.
+- **Reports and logs.** Each session writes a per-project report locally;
+  review these before acting on the agent's changes.
+
+## Typical workflows
+
+**Triage a batch of broken projects**
+
+```bash
+python infra/experimental/agent-skills/helper.py show-prompt fix-builds proj1 proj2
+python infra/experimental/agent-skills/helper.py fix-builds proj1 proj2
+# Review the diff and per-project reports, then commit manually.
+```
+
+**Onboard a new project end-to-end**
+
+```bash
+python infra/experimental/agent-skills/helper.py integrate-project https://github.com/org/repo
+# The agent uses oss-fuzz-engineer plus the appropriate fuzzing-*-expert
+# skill for the project's language.
+```
+
+**Expand coverage on a project you already maintain**
+
+```bash
+python infra/experimental/agent-skills/helper.py expand-oss-fuzz-projects myproj
+```
+
+**Run a custom investigation across several projects**
+
+```bash
+python infra/experimental/agent-skills/helper.py run-task \
+    --task "Investigate why the XML parser harness has low branch coverage \
+            and add targeted harnesses for the attribute-parsing paths." \
+    open62541 json-c
+```
+
+## Threat model
+
+This is experimental code with a deliberately permissive threat model:
+
+- Agents run in "dangerous"/"yolo" modes and will execute untrusted code.
+- Running this tooling means running untrusted code in your environment.
+- Only run it in a heavily sandboxed environment and on a trusted network.
+- This code does **not** run in OSS-Fuzz production services and is not part
+  of the tooling that runs our continuous fuzzing of open source projects.
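The README in the hunk above says the helper launches one non-interactive agent session per project and caps concurrency at `DEFAULT_MAX_PARALLEL = 2`. A minimal sketch of that bounded fan-out pattern follows. Only the constant's name and value come from the README; `run_sessions`, `fake_launch`, and the returned report strings are hypothetical stand-ins, not the helper's actual code.

```python
from concurrent.futures import ThreadPoolExecutor

# Value quoted in the README; every other name in this sketch is hypothetical.
DEFAULT_MAX_PARALLEL = 2


def run_sessions(projects, launch, max_parallel=DEFAULT_MAX_PARALLEL):
    """Call launch(project) for each project, at most max_parallel at a time.

    Returns {project: result}. An exception raised by one session is stored
    as that project's result instead of aborting the remaining sessions.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = {project: pool.submit(launch, project) for project in projects}
        for project, future in futures.items():
            try:
                results[project] = future.result()
            except Exception as exc:  # record the failure, keep the batch going
                results[project] = exc
    return results


def fake_launch(project):
    # Stand-in for starting one non-interactive agent session.
    return f"report for {project}"


if __name__ == "__main__":
    print(run_sessions(["open62541", "json-c", "htslib"], fake_launch))
```

Keeping failures per project matches the review-before-commit workflow the README describes: one broken session should not hide the reports from the others.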
diff --git a/projects/openjph/Dockerfile b/projects/openjph/Dockerfile
@@ -22,7 +22,7 @@ RUN apt-get update && apt-get install -y cmake libtiff-dev zip
 # clone the library
 RUN git clone https://github.com/aous72/OpenJPH.git
 
-# clone the seed corpus
+# import the ojph_expand_fuzz_target seed corpus
 RUN git clone --depth 1 https://github.com/aous72/jp2k_test_codestreams.git
 
 # import the build script

diff --git a/projects/openjph/build.sh b/projects/openjph/build.sh
@@ -24,7 +24,9 @@ make -j$(nproc)
 cp fuzzing/ojph_expand_fuzz_target $OUT
 cp fuzzing/ojph_compress_fuzz_target $OUT
 
-# Build the seed corpus
+# Build the seed corpora
 cd $SRC
 rm -f $OUT/ojph_expand_fuzz_target_seed_corpus.zip
 zip -j $OUT/ojph_expand_fuzz_target_seed_corpus.zip jp2k_test_codestreams/openjph/*.j2c
+rm -f $OUT/ojph_compress_fuzz_target_seed_corpus.zip
+zip -j $OUT/ojph_compress_fuzz_target_seed_corpus.zip $SRC/OpenJPH/fuzzing/seed_corpus/ojph_compress_fuzz_target/w128_h128_b2_79_b3_09.bin
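The build.sh hunk above follows the OSS-Fuzz convention of shipping seeds as `<fuzz_target>_seed_corpus.zip` next to the target in `$OUT`, with `zip -j` discarding directory paths. A rough Python equivalent is sketched below for illustration. The function name `write_seed_corpus` and its parameters are assumptions and are not part of this PR.

```python
import zipfile
from pathlib import Path


def write_seed_corpus(out_dir, target, seed_files):
    """Create <out_dir>/<target>_seed_corpus.zip with seeds stored by basename."""
    archive = Path(out_dir) / f"{target}_seed_corpus.zip"
    # Mode "w" always starts a fresh archive. build.sh gets the same effect
    # with `rm -f`, because `zip` would otherwise update an existing archive.
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for seed in seed_files:
            seed = Path(seed)
            # Store only the file name, like `zip -j` (junk paths).
            zf.write(seed, arcname=seed.name)
    return archive
```

Because paths are dropped, two seeds with the same file name in different directories would collide, so seed names need to be unique per target.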
diff --git a/projects/vlc/Dockerfile b/projects/vlc/Dockerfile
@@ -22,4 +22,4 @@ RUN pip3 install meson
 RUN git clone --depth 1 https://code.videolan.org/videolan/vlc.git vlc
 RUN git clone --depth 1 https://code.videolan.org/VideoLAN.org/vlc-fuzz-corpus.git vlc/fuzz-corpus
 WORKDIR vlc
-COPY build.sh fuzzing-modules.patch generate_ts_seeds.py generate_ps_seeds.py $SRC/
+COPY build.sh fuzzing-modules.patch generate_seeds.py $SRC/
diff --git a/projects/vlc/build.sh b/projects/vlc/build.sh
@@ -136,74 +136,13 @@ make V=1 -j$(nproc)
 
 cp ./test/vlc-demux-dec-libfuzzer $OUT/
 
-# Add MPEG-I/II video ES fuzzer target (mpgv.c) which lacks a dedicated corpus.
-# The mpgv module is linked via fuzzing-modules.patch and registered in the PLUGINS
-# list, but without a seed corpus directory no vlc-demux-dec-libfuzzer-mpgv binary
-# is produced. This directly exercises modules/demux/mpeg/mpgv.c and the MPEG video
-# packetizer (modules/packetizer/mpegvideo.c).
-mkdir -p fuzz-corpus/seeds/mpgv
-python3 -c "
-# Minimal MPEG-1 video elementary stream seed.
-# The sequence_header_code (0x000001B3) passes CheckMPEGStartCode in mpgv.c:
-#   0xB3 is not in {0xB0, 0xB1, 0xB6} and 0xB3 <= 0xB9, so VLC_SUCCESS is returned.
-# The demuxer opens without force and the Demux loop feeds data to the mpegvideo
-# packetizer, exercising parsing logic for MPEG-I/II video bitstreams.
-#
-# Sequence header structure (ISO/IEC 11172-2 / ISO/IEC 13818-2):
-#   start code (4B) | width(12b)/height(12b) | aspect(4b)/framerate(4b) |
-#   bitrate(18b)/marker(1b)/vbv_size(10b)/constrained(1b)/load_flags(2b)
-seed = bytes([
-    # Sequence header: 352x240, 1:1 aspect, 29.97fps, VBR, vbv=0
-    0x00, 0x00, 0x01, 0xB3,  # sequence_header_code
-    0x16, 0x00, 0xF0,        # width=352(12b)|height=240(12b): 0001 0110 0000 | 0000 1111 0000
-    0x15,                    # aspect=1(4b)|framerate=5(4b) = 0001 0101
-    0xFF, 0xFF, 0xE0, 0x00,  # bitrate(18b)=0x3FFFF(VBR) marker=1 vbv(10b)=0 flags=0
-    # Group of Pictures header: closed GOP, 00:00:00:00
-    0x00, 0x00, 0x01, 0xB8,  # group_start_code
-    0x00, 0x00, 0x01,        # time_code(25b)=0 closed_gop=0 broken_link=0
-    # Picture header: temporal_ref=0, I-frame, no extra vbv_delay
-    0x00, 0x00, 0x01, 0x00,  # picture_start_code
-    0x00, 0x10, 0xFF, 0xFF,  # temporal_ref(10b)=0 picture_type(3b)=0x1(I) vbv_delay(16b)=0xFFFF
-    # Slice: slice_vertical_position=1, quantiser_scale=1
-    0x00, 0x00, 0x01, 0x01,  # slice_start_code (row 1)
-    0x22, 0x00, 0x00,        # quantiser_scale=1, intra_slice=0, slice_data
-])
-open('fuzz-corpus/seeds/mpgv/minimal.mpgv', 'wb').write(seed)
-print('Created mpgv seed: {} bytes'.format(len(seed)))
-"
-
-# MPEG video start-code dictionary for the mpgv fuzzer.
-# These tokens help libFuzzer reach specific parsing branches in mpgv.c,
-# mpegvideo packetizer, and the MPEG-4 IOD parser (mpeg4_iod.c via TS).
-cat > fuzz-corpus/dictionaries/mpgv.dict << 'DICT_EOF'
-# MPEG-1/2 video start codes (ISO/IEC 11172-2 / ISO/IEC 13818-2)
-# libFuzzer dictionary format: one token per line, inline comments not allowed.
-"\x00\x00\x01\xB3"
-"\x00\x00\x01\xB7"
-"\x00\x00\x01\xB8"
-"\x00\x00\x01\x00"
-"\x00\x00\x01\xB5"
-"\x00\x00\x01\xB2"
-"\x00\x00\x01\x01"
-"\x00\x00\x01\xAF"
-"\x00\x00\x01"
-DICT_EOF
-
-# Replace the existing TS seeds (which are all null-packets only and do not
-# exercise any PAT/PMT/PES parsing) with proper structured TS streams.
-# generate_ts_seeds.py builds 12 minimal TS files that each contain a valid
-# PAT + PMT + at least one PES packet, directly exercising ts_psi.c, ts_pes.c,
-# ts_pid.c, ts_streams.c, ts_decoders.c, ts_si.c, ts_scte.c in
-# modules/demux/mpeg/.
-python3 $SRC/generate_ts_seeds.py fuzz-corpus/seeds/ts
-
-# Replace the upstream dvd_subtitle.vob in seeds/ps. The shipped seed has a
-# malformed SPU header (i_spu_size=8192 vs ~40 bytes of payload), so the
-# spudec packetizer holds the block waiting for more data forever and
-# modules/codec/spudec/parse.c (the actual control-sequence + RLE parser)
-# never runs. generate_ps_seeds.py emits a complete DVD subtitle SPU that
-# flows through ParsePacket -> ParseControlSeq -> ParseRLE.
-python3 $SRC/generate_ps_seeds.py fuzz-corpus/seeds/ps
+# Generate structured seeds and libFuzzer dictionaries for the demux/codec
+# fuzz targets that either had no dedicated corpus or whose upstream seeds
+# fail to exercise the target code. See generate_seeds.py for the per-target
+# rationale. The script writes:
+#   seeds/{ts,ps,heif,rawdv,vc1,cdg,mus,mpgv}/*, a CEA-708 SEI seed added
+#   to the upstream seeds/h264/ corpus, and the matching dictionaries.
+python3 $SRC/generate_seeds.py fuzz-corpus
 
 # Prepare for removing sdp.dict without breaking the build
 rm fuzz-corpus/dictionaries/sdp.dict || true
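The inline mpgv seed removed in the build.sh hunk above spells out its MPEG-1 sequence header byte by byte. As a sketch of how a seed generator could derive those bytes from the field widths listed in the removed comment (12b width, 12b height, 4b aspect, 4b frame rate, 18b bitrate, 1b marker, 10b vbv size, 1b constrained, 2b load flags), the function below packs them with `struct`. The name `mpeg_sequence_header` and its parameters are hypothetical; the contents of `generate_seeds.py` are not shown in this diff and may differ.

```python
import struct

SEQUENCE_HEADER_CODE = b"\x00\x00\x01\xb3"


def mpeg_sequence_header(width, height, aspect_code, frame_rate_code,
                         bit_rate=0x3FFFF, vbv_buffer_size=0):
    """Pack an MPEG-1/2 video sequence header (no quantiser matrices)."""
    if not (0 < width < 1 << 12 and 0 < height < 1 << 12):
        raise ValueError("width and height must fit in 12 bits")
    if not (0 <= aspect_code < 16 and 0 <= frame_rate_code < 16):
        raise ValueError("aspect and frame-rate codes must fit in 4 bits")
    if not (0 <= bit_rate < 1 << 18 and 0 <= vbv_buffer_size < 1 << 10):
        raise ValueError("bit_rate is 18 bits, vbv_buffer_size is 10 bits")
    # Word 1: width(12) | height(12) | aspect(4) | frame_rate(4)
    word1 = (width << 20) | (height << 8) | (aspect_code << 4) | frame_rate_code
    # Word 2: bit_rate(18) | marker(1)=1 | vbv(10) | constrained(1)=0 | load flags(2)=0
    word2 = (bit_rate << 14) | (1 << 13) | (vbv_buffer_size << 3)
    return SEQUENCE_HEADER_CODE + struct.pack(">II", word1, word2)


if __name__ == "__main__":
    # Same field values as the removed inline seed: 352x240, aspect 1, rate code 5.
    print(mpeg_sequence_header(352, 240, 1, 5).hex())
```

With the removed seed's values this yields `000001b31600f015ffffe000`, the same 12 bytes as the hand-written header. In the MPEG frame_rate_code table, code 5 is 30 fps and code 4 is 29.97 fps, so a generated seed intended to be 29.97 fps would pass 4.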