Autonomous CTI investigation.
Bounded. Auditable. Source‑available.

Bounce‑CTI is a research prototype. Given a single observable — a domain, an IP, a hash, a JARM fingerprint, even the bare filename of a malicious binary — an LLM agent pivots across roughly fifty public threat‑intelligence sources, filters infrastructural noise before it propagates, and streams the resulting graph live to the analyst's browser.


Status: active · research
Sources: ~50 public APIs
Pivot depth: 3 hops · ~60 calls
Benchmark: 12 real cases
Licence: source‑available

what it is

An investigation agent, not a SIEM, not a feed.

An analyst working from an alert opens five to ten browser tabs — VirusTotal, URLScan, crt.sh, RDAP, passive DNS — copies fragments between them, and arrives at a provisional picture an hour later. Bounce‑CTI moves the first pass of that work into a bounded autonomous agent: it queries the same public sources in a deliberate order, builds a typed infrastructure graph, and writes a short analyst report. The output is a starting point for human judgement, not a replacement for it.

how it works

The data flow.

The analyst submits a seed. A FastAPI backend spawns a headless Claude Code agent whose only tools are two MCP servers — one that writes to the graph, one that reads from public CTI sources. Every infrastructure node passes through a defusing layer before becoming a pivot point. Writes stream to the browser over WebSocket.

[Diagram: data flow. Legend: component · defusing layer · live event stream]

engineering

Six choices that define the project.

The defensible angle of an autonomous CTI tool in 2026 is not "an LLM orchestrates investigation" — that's commoditised. It's everything around the agent: what it can touch, what it must skip, and how its work is made reproducible.

Bounded, auditable pivoting.

The agent stops at three hops from the seed observable and caps total API calls at around sixty. Every pivot decision is logged, and the trace can be reproduced from the SQLite database after the fact.
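
A minimal sketch of how such a budget could be enforced, assuming a breadth-first traversal and a single SQLite table for the pivot log. The `lookup` callable, the `pivots` table and its columns are hypothetical stand-ins, not the project's actual schema or tooling.

```python
import sqlite3
from collections import deque

MAX_HOPS = 3    # hop limit stated in the text
MAX_CALLS = 60  # approximate call budget stated in the text


def investigate(seed, lookup, db):
    """Breadth-first pivoting that stops at MAX_HOPS and MAX_CALLS.

    `lookup(value)` is a hypothetical stand-in for one CTI source query;
    it returns the observables related to `value`.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS pivots "
        "(seq INTEGER PRIMARY KEY, hop INTEGER, src TEXT, dst TEXT)"
    )
    seen = {seed}
    queue = deque([(seed, 0)])
    calls = 0
    while queue and calls < MAX_CALLS:
        value, hop = queue.popleft()
        if hop == MAX_HOPS:
            continue  # at the hop limit: keep the node, do not pivot on it
        calls += 1
        for related in lookup(value):
            # Log every pivot so the trace can be replayed from the database.
            db.execute(
                "INSERT INTO pivots (hop, src, dst) VALUES (?, ?, ?)",
                (hop + 1, value, related),
            )
            if related not in seen:
                seen.add(related)
                queue.append((related, hop + 1))
    db.commit()
    return seen, calls


# Toy source: every observable "x" relates to "x.a" and "x.b".
db = sqlite3.connect(":memory:")
nodes, calls = investigate("seed", lambda v: [v + ".a", v + ".b"], db)
trace = db.execute("SELECT hop, src, dst FROM pivots ORDER BY seq").fetchall()
```

With the toy source, the traversal spends seven calls (one at hop 0, two at hop 1, four at hop 2) and never pivots on the hop-3 nodes; `trace` holds the full ordered pivot log.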

Pre‑pivot infrastructural defusing.

Before pivoting on an IP or nameserver, the agent checks it against hardcoded lists of CDN ranges, parking nameservers, sinkholes, and DynDNS providers. This is what stops the graph from exploding into noise — a problem common to broad OSINT aggregators.
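
As an illustration only, a defusing check of this kind might look like the sketch below. The list contents are placeholders (one published Cloudflare range, a documentation range, and `.example` hostnames); the labels, the `defuse` function and its signature are assumptions, not the project's code.

```python
import ipaddress

# Placeholder entries; the project's real hardcoded lists are not reproduced here.
IP_RANGES = {
    "cdn": [ipaddress.ip_network("104.16.0.0/13")],      # sample Cloudflare range
    "sinkhole": [ipaddress.ip_network("192.0.2.0/24")],  # documentation range as stand-in
}
NAMESERVER_SUFFIXES = {
    "parking": ("parking-ns.example",),    # hypothetical parking provider
    "dyndns": ("dyn-provider.example",),   # hypothetical DynDNS provider
}


def defuse(kind, value):
    """Return a benign-infrastructure label, or None if the node may be pivoted on."""
    if kind == "ip":
        addr = ipaddress.ip_address(value)
        for label, networks in IP_RANGES.items():
            if any(addr in net for net in networks):
                return label
    elif kind == "nameserver":
        host = value.lower().rstrip(".")
        for label, suffixes in NAMESERVER_SUFFIXES.items():
            if any(host == s or host.endswith("." + s) for s in suffixes):
                return label
    return None
```

A node that comes back with a label would still be recorded in the graph, but the agent would not expand it further.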

Procedural transparency.

The graph streams to the analyst's browser over WebSocket as the investigation unfolds. Each pivot is visible the moment it happens. The intent is auditability and, for less experienced analysts, learning by example.

A reproducible benchmark.

Bounce‑CTI ships with EVAL_PROTOCOL v3, a public protocol that scores twelve real‑world CTI investigations on two separate tracks, capability (methodology) and recall (coverage), and treats any hallucinated node or edge as a hard fail. Every run is committed to the repo. The point is to make claims falsifiable, not just demonstrable.

Sandboxed agent.

The agent communicates exclusively through MCP tools. It has no shell access, no filesystem access, no network access beyond the whitelisted sources. The blast radius of an agent error is bounded by construction.
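
One way the network restriction could be enforced inside the read-side tool is a host allowlist checked before any request is made. The sketch below assumes that approach; the host set is a small sample and `check_allowed` is a hypothetical helper, not the project's API.

```python
from urllib.parse import urlsplit

# Sample hosts only; the real allowlist covers the project's ~50 sources.
ALLOWED_HOSTS = {"crt.sh", "urlscan.io", "www.virustotal.com"}


def check_allowed(url):
    """Return the URL unchanged if its host is allowlisted, else raise."""
    parts = urlsplit(url)
    # Exact host match: "crt.sh.evil.example" must not pass as "crt.sh".
    if parts.scheme != "https" or parts.hostname not in ALLOWED_HOSTS:
        raise PermissionError(f"host not allowlisted: {parts.hostname!r}")
    return url
```

Because the agent has no other route to the network, a rejected URL simply fails the tool call.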

Deterministic graph identity.

Node IDs are SHA‑1 hashes of (investigation_id, type, value). Re‑running the same investigation produces the same graph topology — useful for incident‑response artefacts that need to be referenced across teams and over time.
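
A minimal sketch of such an identifier, assuming the tuple is serialised as a compact JSON array before hashing. The serialisation and the `node_id` name are assumptions; the text only specifies the hash function and the three fields.

```python
import hashlib
import json


def node_id(investigation_id, node_type, value):
    """SHA-1 over (investigation_id, type, value); same inputs, same ID."""
    # Assumption: compact JSON array as the canonical byte encoding.
    payload = json.dumps([investigation_id, node_type, value], separators=(",", ":"))
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


a = node_id("inv-001", "domain", "example.com")
b = node_id("inv-001", "domain", "example.com")
c = node_id("inv-001", "ip", "example.com")
```

Here `a` and `b` are identical, while `c` differs because the node type is part of the hashed tuple.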

case study

A pan‑European fake‑voting PhaaS cluster, from one WhatsApp URL.

A single "School votingFR" fake‑voting URL — received over WhatsApp — was submitted as a seed. The agent recognised Cloudflare fronting, recovered the bulletproof origin, and pulled the origin TLS certificate that bound the sibling domains together. On request it then expanded the cluster to two dozen domains across four TLDs and three languages — reporting its confidence as a gradient, and stopping where the free, passive evidence ran out.

Seed type: domain
Hops: 3
Status: published

What the agent found

One Cloudflare‑fronted cluster over a bulletproof origin (217.145.226[.]186, AS205775). The decisive link was the origin certificate's SAN binding three sibling domains — not the shared JARM, which the agent down‑weighted to corroboration only. Two confirmed members sat at VirusTotal 0/94, and the cluster was absent from every community feed queried.

Cluster domains: 24
Languages: 3 · FR·DE·IT
Shared origin cert: 1
Silent (VT 0/94) in‑cluster: 2
Community feeds hit: 0
Bulletproof origin: AS205775

Read the case study
Anonymised delivery details · all IOCs defanged

benchmark

EVAL_PROTOCOL v3 — capability first, recall second.

The benchmark is published alongside the code so the tool's claims are falsifiable. v3 scores two independent tracks. CAP — the headline capability score — measures methodology: did the agent pick the right pivots, stay within budget, defuse benign infrastructure, and commit to a hypothesis? It is decay‑proof, so a dead indicator can't sink it. REC — recall — is freshness‑gated: it only scores when a liveness probe is present, and is skipped as DATA_DECAYED otherwise. A separate set of benign seeds tests restraint, and any hallucinated node or edge is a hard‑fail gate. Cases are drawn from public write‑ups by Silent Push, Sekoia, Trend Micro, DFIR Report, DomainTools, Intrinsec, DNSFilter and Trellix.
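
The gating described above can be sketched as follows. `DATA_DECAYED` and `LIVE` come from the protocol as described here; the `HARD_FAIL` label, the function name and the choice to void both scores on a hallucination are assumptions made for illustration.

```python
def score_case(cap, rec, live, hallucinated):
    """Apply the hallucination gate first, then the freshness gate on REC."""
    if hallucinated > 0:
        # Hard gate: any invented node or edge fails the case outright.
        return {"status": "HARD_FAIL", "cap": None, "rec": None}
    if not live:
        # CAP still counts on a dead indicator; REC is skipped.
        return {"status": "DATA_DECAYED", "cap": cap, "rec": None}
    return {"status": "LIVE", "cap": cap, "rec": rec}


decayed = score_case(cap=90.0, rec=None, live=False, hallucinated=0)
failed = score_case(cap=100.0, rec=80.0, live=True, hallucinated=1)
```

In this sketch `decayed` keeps its CAP score with no REC value, and `failed` loses both.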

EVAL_PROTOCOL · v3

CAP · REC · restraint — hallucination is a hard gate

Latest run

2026‑06‑01 · de5a31b · Opus 4.8

CAP mean: 92.9 · +6.9 vs 2026‑05‑28
Hallucination: 0/5 · hard gate clear
Recall (REC): 61.1 · LIVE subset · n=3
Restraint (benign): 67 · 3 benign seeds

CAP = 0.40·PS (pivot) + 0.25·EFF (budget) + 0.20·RST (defuse) + 0.15·HYP (hypothesis)

Nightly fresh subset · 5 positive cases

#	Threat	Seed	PS·EFF·RST·HYP	CAP	Δ prior	Liveness
c02	MuddyWater	hash	100·100·100·100	100.0	+0.0	LIVE
c03	Bumblebee → Akira	hash	100·100·100·100	100.0	+9.8	LIVE
c08	Amadey + StealC	hash	100·38·100·100	84.5	+19.5	LIVE
c09	Tycoon 2FA	domain	75·100·100·100	90.0	+11.4	DECAYED
c12	ClearFake	domain	75·100·100·100	90.0	+10.0	DECAYED
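
As a worked check, the CAP weighting can be recomputed from the sub-scores in the table above. The function and variable names are illustrative; the weights and sub-scores are the ones published on this page.

```python
WEIGHTS = {"PS": 0.40, "EFF": 0.25, "RST": 0.20, "HYP": 0.15}


def cap_score(ps, eff, rst, hyp):
    """Weighted sum of the four sub-scores, rounded to one decimal."""
    parts = {"PS": ps, "EFF": eff, "RST": rst, "HYP": hyp}
    return round(sum(WEIGHTS[k] * parts[k] for k in WEIGHTS), 1)


# Sub-scores (PS, EFF, RST, HYP) as listed for the nightly fresh subset.
rows = {
    "c02": (100, 100, 100, 100),
    "c03": (100, 100, 100, 100),
    "c08": (100, 38, 100, 100),
    "c09": (75, 100, 100, 100),
    "c12": (75, 100, 100, 100),
}
scores = {case: cap_score(*subs) for case, subs in rows.items()}
mean_cap = round(sum(scores.values()) / len(scores), 1)
```

For c08 this gives 40 + 9.5 + 20 + 15 = 84.5, and the mean over the five cases is 92.9, matching the reported figures.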

Negative cases · restraint‑only (benign seeds)

Benign seed	Type	Nodes	False tags	RST
Cloudflare anycast	ip	6	0	100
jsDelivr CDN	domain	15	2	50
Wikipedia	domain	22	2	50

Full case catalog · 12 cases · ran 2026‑06‑01: 02, 03, 08, 09, 12

#	Threat	Primary marker	Diff.
01	Salt Typhoon	registrant‑email reverse‑WHOIS	medium
02	MuddyWater	JARM / TLS banner	easy‑med
03	Bumblebee → Akira	file → contacted infra	medium
04	Interlock	Cloudflare tunnel defuse	hard
05	Eye Pyramid	bulletproof ASN + default page	hard
06	LummaC2	SSL SHA‑1 cluster + content	hard
07	SocGholish	shared‑IP co‑residency	medium
08	Amadey + StealC	apex / subdomain disambig.	med‑hard
09	Tycoon 2FA	CT issuance‑date burst	medium
10	Contagious Interview	DNS TXT/MX cross‑ref	med‑hard
11	Smishing Triad	Cloudflare origin unmask	hard
12	ClearFake	cert CN → Shodan origin	medium

The fresh subset is decay‑resistant by design: hash seeds never expire, and the two domain seeds were freshly registered. The remaining seven cases were DATA_DECAYED or out of subset this run; their CAP scores still stand from prior runs. Two known pivot gaps remain open (ct_burst_window on c09, shodan_cert_cn_search on c12). Full protocol, scoring scripts and per‑run artefacts live in EVAL_PROTOCOL.md and under runs/.

open source & licensing

Source‑available, with terms.

Bounce‑CTI is published under a custom licence. Individual, non‑commercial use is permitted free of charge. Any organisational, institutional, or commercial use — including by companies, governments, NGOs, universities, hospitals, and CERTs — requires a written licence. The exact terms are in LICENSE and COMMERCIAL.md on GitHub.

To request a licence or a demo, write to alexandre.pinoteau@protonmail.com. There is no signup form on purpose — a short email is more selective.

contact

Get in touch.

alexandre.pinoteau@protonmail.com
alexandre‑pinoteau.fr
linkedin.com/in/p1n0t34u
github.com/Iskandeur
