EROSOLAR
In-scope only: U.S.-government tasking, U.S. defense-prime engagements under contract, and authorized bug-bounty programs. Scope and EAR/ITAR posture: /about. Procurement: /defense.

How it works

This page walks through the pipeline an Erosolar agent runs end-to-end, from scope intake to a delivered write-up. Tooling matches what credible binary-VR teams use today: Tavily for sourcing, Ghidra with the GhidraMCP server for variant analysis, AFL++ for fuzzing, GDB + pwndbg for triage, pwntools + ROPgadget for PoC.

Pipeline summary

| Step | Tool | Input | Output | Human checkpoint |
| 0 | | Target identifier, authorization basis, out-of-scope list | Engagement-scoped task | ✓ operator confirms |
| 1 | Tavily | CVE / target keywords | Patch diff URL, advisories | |
| 2 | git, wget, APK pull | Vulnerable + patched versions | Two binary trees | |
| 3 | Ghidra + GhidraMCP | Two binaries | Changed-function list, variant hypotheses | |
| 4 | AFL++ | Harness + seed corpus | Crash inputs | |
| 5 | GDB + pwndbg | Crash input | Bug class, root cause | ✓ operator triages |
| 6 | pwntools, ROPgadget | Bug class, primitive | Reproducible PoC | |
| 7 | | PoC + write-up | Encrypted artefact set | ✓ operator releases |

Step 0: Scope intake & authorization (phase.intake)

The operator provides a target identifier (CVE, package, firmware image, or bug-bounty program), an authorization basis (private research, USG-contract task order, defense-prime engagement under contract, or written bug-bounty scope), and an out-of-scope list. The agent does not fire against any target without these inputs. Legal regime and EAR/ITAR posture are at /about.
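The intake rule above can be sketched as a small validation step. This is only an illustration: the source does not publish an intake schema, so `ScopeIntake`, `validate_intake`, and the authorization-basis labels below are hypothetical names chosen to mirror the prose.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical labels mirroring the four authorization bases named in the text.
AUTHORIZATION_BASES = {
    "private-research",
    "usg-task-order",
    "defense-prime-contract",
    "bug-bounty-scope",
}


@dataclass(frozen=True)
class ScopeIntake:
    """Illustrative intake record; field names are assumptions."""
    target_id: str
    authorization_basis: str
    out_of_scope: Optional[Tuple[str, ...]] = None  # None means "not provided"
    operator_confirmed: bool = False


def validate_intake(intake: ScopeIntake) -> list:
    """Return a list of problems; an empty list means the task may start."""
    problems = []
    if not intake.target_id.strip():
        problems.append("missing target identifier")
    if intake.authorization_basis not in AUTHORIZATION_BASES:
        problems.append("unknown authorization basis")
    if intake.out_of_scope is None:
        problems.append("out-of-scope list not provided")
    elif intake.target_id in intake.out_of_scope:
        problems.append("target is on the out-of-scope list")
    if not intake.operator_confirmed:
        problems.append("operator has not confirmed the intake")
    return problems
```

An empty out-of-scope list is treated as a valid answer here, while a missing one is not, so the operator has to state the exclusions explicitly even when there are none.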

Two activation paths, both opt-in and explicit at process launch:

# Coordinated-disclosure / public-research (HackerOne, PSIRT, advisory):
EROSOLAR_PROFILE=variant-research erosolar

# Procurement-delivery (USG contract / defense prime / bug-bounty program):
EROSOLAR_PROFILE=engagement-delivery erosolar

Default erosolar launches the coding profile with the offsec tool surface excluded, per the capability separation rule on /about. Profile rulebooks: agents/variant-research.rules.json, agents/engagement-delivery.rules.json.
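A minimal sketch of how the opt-in could be resolved at launch, assuming the profile is read once from the environment. The profile names and rulebook paths come from the text above; the function name, the return shape, and the `"coding"` label for the default are assumptions made for illustration.

```python
import os

# Profile names and rulebook paths are taken from the page;
# the lookup structure itself is an illustrative assumption.
RULEBOOKS = {
    "variant-research": "agents/variant-research.rules.json",
    "engagement-delivery": "agents/engagement-delivery.rules.json",
}


def resolve_profile(env=None):
    """Return (profile_name, rulebook_path, offsec_tools_enabled)."""
    env = os.environ if env is None else env
    name = env.get("EROSOLAR_PROFILE", "").strip()
    if not name:
        # No explicit opt-in: coding profile, offsec tool surface excluded.
        return ("coding", None, False)
    if name not in RULEBOOKS:
        # Fail closed on typos instead of guessing a profile.
        raise ValueError(f"unknown EROSOLAR_PROFILE: {name!r}")
    return (name, RULEBOOKS[name], True)
```

Rejecting unknown values, instead of falling back to a default, keeps a mistyped profile name from silently enabling or disabling the tool surface.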

Step 1: Target discovery with Tavily (phase.recon)

tavily search "CVE-2026-XXXX Android kernel patch diff"

The agent picks a recent critical CVE affecting an in-scope, high-value target. Tavily returns blog posts, NVD entries, vendor advisories, and ideally a link to the upstream Git commit or patched build.

Step 2: Binary acquisition (phase.acquire)

The CLI uses terminal commands to download the vulnerable and patched versions of the software. For open-source components, it clones the repo, checks out the vulnerable and the patched commit, and compiles each. For closed-source targets, it pulls firmware from a device or APKs from a vendor channel.
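For the open-source case, one way to get two source trees from a single clone is `git worktree`. The sketch below only builds the command lines and does not run them; the function name, directory layout, and labels are assumptions, and the build step is left out because it depends on the project.

```python
def checkout_commands(repo_url, vulnerable_commit, patched_commit, workdir="work"):
    """Build git commands that materialize two source trees from one clone.

    Hypothetical helper: the directory names are illustrative only.
    """
    clone_dir = f"{workdir}/src"
    cmds = [["git", "clone", repo_url, clone_dir]]
    for label, commit in (("vulnerable", vulnerable_commit),
                          ("patched", patched_commit)):
        # `git -C` runs inside the clone, so "../<label>" lands in workdir/<label>.
        cmds.append(["git", "-C", clone_dir, "worktree", "add",
                     "--detach", f"../{label}", commit])
    return cmds
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`. Keeping both trees side by side makes it easy to build them with identical flags, which matters for the diff in the next step.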

Step 3: Variant analysis with GhidraMCP (phase.bindiff + phase.variant)

The agent calls Ghidra's Version Tracking to diff the two binaries automatically and receives a list of changed functions. Using Ghidra's decompiler, it examines each changed function to understand the fix. It then uses Ghidra's search tools to look for the same code pattern in related software or in older versions available on the operator's system. All of this happens through MCP calls.

This is the same starting shape as Google Project Zero's Naptime and its successor Big Sleep: a known fix becomes a hypothesis generator for unfixed siblings.
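To make the "changed-function list" concrete, here is a toy stand-in that compares two builds by hashing function bytes. It is not how Ghidra's Version Tracking works (that uses its own correlators) and it does not call GhidraMCP; the input maps of function name to raw bytes are assumed to come from whatever export step is available.

```python
import hashlib


def changed_functions(old, new):
    """Compare two {function_name: function_bytes} maps from the two builds.

    Illustrative only: exact-hash comparison flags any byte change,
    including compiler noise that a real differ would filter out.
    """
    def digest(blob):
        return hashlib.sha256(blob).hexdigest()

    report = {"modified": [], "added": [], "removed": []}
    for name in sorted(old.keys() | new.keys()):
        if name not in new:
            report["removed"].append(name)
        elif name not in old:
            report["added"].append(name)
        elif digest(old[name]) != digest(new[name]):
            report["modified"].append(name)
    return report
```

The `modified` bucket is the part that feeds decompiler review, since those are the functions the patch actually touched.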

Step 4: Fuzzing campaign (phase.fuzz)

If no exact variant is found, the agent identifies the vulnerable input type (a specific file format, network packet, or syscall). It writes a small fuzzing harness in C/Python, seeds it with a valid input mutated to stress the fixed code path, and launches AFL++ via terminal:

afl-fuzz -i seeds/ -o findings/ -- ./harness @@

The CLI monitors the fuzzer output. When a crash appears, the crash file flows to Step 5.
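Monitoring can be as simple as polling the output directory. The sketch below assumes the default AFL++ layout, where each fuzzer instance writes crashing inputs to `<out>/<instance>/crashes/` with file names starting with `id:`; the function name and the `seen` set are illustrative.

```python
from pathlib import Path


def new_crashes(out_dir, seen):
    """Return crash files under an AFL++ output dir that are not yet in `seen`.

    Assumes the default naming, where crash inputs start with "id:";
    the README.txt AFL++ drops into the directory is skipped.
    """
    found = []
    for crash_dir in sorted(Path(out_dir).glob("*/crashes")):
        for entry in sorted(crash_dir.iterdir()):
            if (entry.is_file() and entry.name.startswith("id:")
                    and entry not in seen):
                seen.add(entry)
                found.append(entry)
    return found
```

Calling `new_crashes("findings", seen)` on a timer, with one `seen` set kept for the whole campaign, yields each crash file exactly once.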

Step 5: Crash triage & root cause analysis (phase.triage)

The agent runs the crashing input under GDB with pwndbg, captures the register state and backtrace, then cross-references with the Ghidra decompilation of the crashing function (via MCP). It diagnoses the bug class: heap overflow, use-after-free, integer overflow, type confusion, etc.

# The harness takes the input path as its first argument, matching the @@ placeholder in Step 4.
gdb -batch \
  -ex "run crash" \
  -ex "checksec" \
  -ex "bt full" \
  -ex "info registers" \
  ./harness

Step 6: PoC & exploit building (phase.poc)

Using pwntools templates and the bug details, the agent scripts an initial proof of concept. For use-after-free it crafts a heap spray; for stack overflows it uses ROPgadget to find gadgets in the binary. The LLM guides the process via the CLI.

Step 7: Documentation for delivery (phase.disclose / phase.deliver)

Once a reliable exploit is built, the agent compiles a detailed write-up — technical description, affected versions, exploitation technique, PoC code — and stores it in a local, encrypted archive. The artefact set is delivered to the engagement's authorized recipient: a USG sponsor, a U.S. defense prime under contract, or a published bug-bounty program. See /defense for procurement and /about for scope.

Validation: what makes a finding ship-ready

A finding leaves the pipeline only after three checks: (1) the crash reproduces from the saved input under a fresh process, (2) the same input runs cleanly on the patched build (negative test), and (3) the operator reviews and signs off on the package before any external delivery. Crashes that don't reproduce, OOMs without a controlled-corruption primitive, and PoCs that won't run twice in a row do not graduate to Step 7. This is the false-positive bar credible peers ship against: Big Sleep's "build the actual exploit, FP rate goes to zero" rule, applied at every promotion boundary.
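The three checks can be sketched as a single gate. This is a simplified illustration under stated assumptions: both harnesses take the input path as their last argument, a "crash" means the process was killed by a signal (a negative return code on POSIX), and the function names are hypothetical. A sanitizer build configured to exit with a plain error code would need a different crash test.

```python
import subprocess


def run_once(cmd, timeout=10):
    """Run cmd in a fresh process; return its exit status, or None on timeout."""
    try:
        proc = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    return proc.returncode


def killed_by_signal(status):
    # On POSIX, subprocess reports death by signal N as return code -N.
    return status is not None and status < 0


def ship_ready(vulnerable_cmd, patched_cmd, input_path,
               runs=2, operator_signed_off=False):
    """Apply the three checks: reproduces, patched build clean, operator sign-off."""
    reproduces = all(killed_by_signal(run_once(vulnerable_cmd + [input_path]))
                     for _ in range(runs))
    patched_clean = run_once(patched_cmd + [input_path]) == 0
    return reproduces and patched_clean and operator_signed_off
```

With `runs=2` the gate also encodes the "won't run twice in a row" rule: a crash that fires once and not again fails the first check.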

Where humans stay in the loop

Three named checkpoints. Step 0 — the operator confirms the target identifier, the authorization basis, and the out-of-scope list before the agent fires. Step 5 — the operator reviews the bug-class diagnosis before exploit work begins; non-reproducing crashes and uncontrolled corruption are dropped here. Step 7 — the operator reviews the artefact set before any external delivery; nothing leaves the pipeline autonomously. End-to-end autonomy without human curation is not a claim this page makes; comparable systems (Big Sleep, XBOW) operate the same way.
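One way to picture the checkpoints is a pipeline object that refuses to complete a gated step until an operator has approved it. The class below is a hypothetical sketch, not the Erosolar implementation; only the step numbers 0, 5, and 7 come from the text.

```python
CHECKPOINT_STEPS = {0, 5, 7}  # steps that need operator approval, per the text
LAST_STEP = 7


class Pipeline:
    """Illustrative gate: a checkpoint step completes only once approved."""

    def __init__(self):
        self.completed = -1      # index of the last completed step
        self.approvals = {}      # step -> operator who approved it

    def approve(self, step, operator):
        if step not in CHECKPOINT_STEPS:
            raise ValueError(f"step {step} is not a checkpoint")
        if not operator:
            raise ValueError("approval must name an operator")
        self.approvals[step] = operator

    def advance(self):
        """Complete the next step, or raise if it is gated and unapproved."""
        nxt = self.completed + 1
        if nxt > LAST_STEP:
            raise RuntimeError("pipeline already finished")
        if nxt in CHECKPOINT_STEPS and nxt not in self.approvals:
            raise PermissionError(f"step {nxt} awaits operator approval")
        self.completed = nxt
        return nxt
```

Because Step 5 cannot complete without approval, Step 6 cannot start, which matches the rule that exploit work begins only after the operator has reviewed the diagnosis.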

What this pipeline does not do

Comparable systems

Three reference points the procurement audience already knows: