SIGINT

All-source analysis of agentic development research. What's signal. What's noise.

An LLM-assisted scan of 1,524 papers on agentic AI, scored on four dimensions: Rigor, Transparency, Claims, and Integrity. Each score is a pass rate of binary fact-checks extracted from the paper. No calibration weights, no reviewer judgment in the math.
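A minimal sketch of the pass-rate idea, assuming each dimension is a set of yes/no checks and the score is the share that pass, scaled to 0-100. The function name `pass_rate` and the check names are hypothetical; the actual checklist is not given here.

```python
from typing import Mapping


def pass_rate(checks: Mapping[str, bool]) -> float:
    """Return the percentage (0-100) of binary checks that passed."""
    if not checks:
        raise ValueError("at least one check is required")
    # True counts as 1 and False as 0, so the sum is the number of passes.
    return 100.0 * sum(checks.values()) / len(checks)


# Hypothetical checks for one dimension of one paper (names are illustrative).
example_checks = {
    "code_released": True,
    "data_released": False,
    "seeds_reported": False,
    "environment_specified": True,
}

score = pass_rate(example_checks)
print(score)
```

Every check carries equal weight in this sketch, which matches the "no calibration weights" description: the only inputs are the pass/fail outcomes.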

4.1% meet full reproducibility standards (code, data, seeds, environment).
66% show overclaiming: conclusions stronger than the evidence supports.
33.1 / 100 median composite score across the corpus. That's the baseline the field is working from.

> Read the methodology →

Papers tracked: 10,663
Scored: 1,524
Composite median: 33.1
Full reproducibility: 4.1%
Rigor median: 53.3
Transparency median: 41.7
Claims median: 63.3
Integrity median: 25.0

> Survey progress (10,663 papers tracked)

V5 Haiku, Legacy, Not scanned (8,899)

Of the scanned papers, 1,524 empirical papers feed the aggregates above; non-empirical papers (surveys, position papers, frameworks) are scored on a reduced rubric and shown individually but not mixed into the aggregates.
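As a rough illustration of keeping non-empirical papers out of the aggregates, the sketch below filters on an `empirical` flag before taking the median. The record layout, field names, and sample scores are assumptions made for the example, not the site's actual data model.

```python
from statistics import median

# Hypothetical records; the scores are made up for illustration.
papers = [
    {"id": "a", "empirical": True, "composite": 20.0},
    {"id": "b", "empirical": True, "composite": 40.0},
    {"id": "c", "empirical": False, "composite": 90.0},  # e.g. a survey
    {"id": "d", "empirical": True, "composite": 30.0},
]


def aggregate_median(records):
    """Median composite score over empirical papers only."""
    scores = [r["composite"] for r in records if r["empirical"]]
    return median(scores)


corpus_median = aggregate_median(papers)
print(corpus_median)
```

The non-empirical record keeps its own score for individual display but never reaches the median.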

What is this?

Every major claim in agentic AI, checked against the primary source. Stop betting your architecture on a blog post someone wrote about a preprint they skimmed.

> Assess claims

Each paper is rated for methodology, sample size, and evidence quality.

> Track trends

What the field is converging on, where papers disagree, and what holds up under scrutiny.

> Source details

Drill into any paper's methodology, limitations, and how it connects to the broader body of work.

> Papers

Search and browse 10,663+ papers

Score distributions, year trends, and field-level patterns

Cross-paper analysis and correlations

Papers that contradict each other