Traditional testing is a bottleneck for AI-accelerated development. JiTTesting automates the test lifecycle—from generation to validation—eliminating maintenance toil and ensuring high-signal bug detection in high-velocity environments.
The rise of agentic software development means code is being written, reviewed, and shipped faster than ever before. Testing frameworks need to evolve for this rapidly changing landscape: faster development demands faster testing that can catch bugs as they land in a codebase, without requiring regular updates and maintenance.
Just-in-Time Tests (JiTTests) are a fundamentally new approach to testing in which tests are automatically generated on the fly by large language models (LLMs) to catch bugs, even ones that traditional testing might miss, just in time, before the code lands in production.
A Catching JiTTest focuses specifically on finding regressions introduced by a code change. This type of testing reimagines decades of software testing theory and practice. While traditional testing relies on static test suites, manual authoring, and ongoing maintenance, Catching JiTTests require no test maintenance and no test code review, meaning engineers can focus their expertise on real bugs, not false positives. Catching JiTTests use sophisticated techniques to maximize test signal value and minimize false positive drag, targeting test signals where they matter most: on serious failures.
Under the traditional paradigm, tests are manually written as new code lands in a codebase and are continually executed, requiring regular updates and maintenance. The engineers writing these tests face the challenge of checking the behavior not only of the current code but of all possible future changes. That inherent uncertainty about the future produces tests that either never catch anything or, when they do fire, raise false positives. Agentic development dramatically increases the pace of code change, inflating the test-development burden and scaling the cost of false positives and test maintenance to the breaking point.
Broadly, JiTTests are bespoke tests, tailored to a specific code change, that give engineers simple, actionable feedback about unexpected behavior changes without the need to read or write test code. LLMs can generate JiTTests automatically the moment a pull request is submitted. And because the tests are LLM-generated, the system can often infer the plausible intention of a code change and simulate possible faults that may result from it.
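As a concrete illustration of the trigger, here is a minimal sketch of prompting an LLM for JiTTests the moment a pull request arrives. The prompt wording, the function names, and the injected `generate` callable are all hypothetical assumptions, not Meta's implementation; any LLM client could be plugged in as `generate`.

```python
def build_jittest_prompt(diff: str) -> str:
    """Build an LLM prompt asking for change-specific catching tests."""
    return (
        "You are generating just-in-time tests for the code change below.\n"
        "1. Infer the plausible intent of the change.\n"
        "2. List faults the change could plausibly introduce.\n"
        "3. Write unit tests that fail if any of those faults is present.\n\n"
        f"--- diff ---\n{diff}\n"
    )

def generate_jittests(diff: str, generate) -> str:
    """Call an LLM (injected as `generate`) when a PR is submitted."""
    return generate(build_jittest_prompt(diff))
```

In a real pipeline, `generate_jittests` would be wired to a pull-request event from the code-review system, so no engineer has to request or write the tests.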
With an understanding of intent, Catching JiTTests can significantly drive down instances of false positives.
At a high level, the Catching JiTTest process works like this: when a pull request is submitted, an LLM generates tests tailored to the change, inferring its plausible intent and simulating faults it could introduce; the candidate tests are then executed and filtered so that only serious, unexpected behavior changes are surfaced to the engineer.
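The filtering step at the heart of the process can be sketched as follows; the pass/fail policy and all names here are illustrative assumptions, not Meta's implementation. A candidate test is run against both the base revision and the changed revision, and only a test that passes on the base but fails on the change is surfaced, which suppresses false positives from candidates that never matched the code's real behavior.

```python
def classify_candidate(passes_on_base: bool, passes_on_change: bool) -> str:
    """Classify one LLM-generated candidate test by its pass/fail pattern."""
    if passes_on_base and not passes_on_change:
        return "likely-regression"   # high-signal: behavior changed
    if not passes_on_base:
        return "discard-invalid"     # test never matched real behavior
    return "no-signal"               # behavior unchanged; nothing to report

def triage(candidates):
    """Keep only candidates whose pass/fail pattern indicates a regression."""
    return [c for c in candidates
            if classify_candidate(*c["results"]) == "likely-regression"]
```

For example, given candidates `{"name": "t1", "results": (True, False)}` and `{"name": "t2", "results": (False, True)}`, only `t1` survives triage: it passed on the base revision and failed after the change.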
Catching JiTTests are designed for the world of AI-powered agentic software development and accelerate testing by focusing on serious, unexpected bugs. With them, engineers no longer have to spend time writing, reviewing, and maintaining complex test code. By design, Catching JiTTests eliminate many of the issues with traditional testing in one stroke: no manual authoring, no test code review, and no ongoing maintenance.
This all amounts to an important shift in testing infrastructure where the focus moves from generic code quality to whether a test actually finds faults in a specific change without raising a false positive. It helps improve testing overall while also allowing it to keep up with the pace of agentic coding.
Just-in-Time Catching Test Generation at Meta
The post The Death of Traditional Testing: Agentic Development Broke a 50-Year-Old Field, JiTTesting Can Revive It appeared first on Engineering at Meta.