OpenAI and Anthropic Move to Secure the AI Agent Pipeline

OpenAI announced on March 9 that it would acquire Promptfoo, the open-source AI red-teaming platform used by more than 125,000 developers and over 30 Fortune 500 companies. On the same day, Anthropic shipped Code Review, a multi-agent system that audits AI-generated pull requests for logical errors.

The two announcements arrived independently and within hours of each other.

Promptfoo was founded in 2024 by Ian Webster and Michael D'Angelo. The company raised $22.68 million in total funding, including an $18.4 million Series A led by Insight Partners in July 2025, at a post-money valuation of $85.5 million. Terms of the OpenAI acquisition were not disclosed.

The platform works as an automated adversary. Rather than relying on manual penetration testing, Promptfoo deploys specialized models and agents that interact with a customer's AI application through its chat interface or APIs, behaving as attackers would. When an attack succeeds, the system records the result, analyzes why it worked, and iterates through a reasoning loop to expose deeper vulnerabilities.

OpenAI will integrate Promptfoo into Frontier, its enterprise AI agent platform launched February 5. Frontier allows companies to build, deploy, and manage AI agents with shared context, permissions, and governance. Early customers include Intuit, State Farm, Thermo Fisher, and Uber. OpenAI said Promptfoo's open-source tools would continue to be developed and supported.

Anthropic's Code Review, available in research preview for Teams and Enterprise customers, dispatches multiple AI agents to examine a pull request from different angles. The agents aggregate findings, remove duplicates, and assign severity ratings: red for critical issues, yellow for items worth reviewing, purple for historical problems. Reviews average 20 minutes and cost between $15 and $25 on a token basis. Anthropic said human developers reject fewer than one percent of the issues the system identifies.

Both launches follow disclosures that underscored how AI development tools can become attack surfaces. In late February, Check Point Research detailed two vulnerabilities in Anthropic's Claude Code. CVE-2025-59536, scored 8.7 on the CVSS scale, allowed arbitrary shell command execution when a developer opened an untrusted project directory. A second flaw, CVE-2026-21852, allowed attackers to exfiltrate a developer's Anthropic API key by overriding a project configuration variable. Anthropic patched both in Claude Code version 2.0.65.

These vulnerabilities exploited the same structural condition: a general-purpose language model operating inside a privileged execution environment, where prompt injection becomes a command execution problem rather than a text output problem.

CrowdStrike's 2026 Global Threat Report, published February 24, found that AI-enabled adversaries increased activity by 89 percent year over year. Forty-two percent of vulnerabilities were exploited before public disclosure. The average breakout time for eCrime actors fell to 29 minutes, a 65 percent increase in speed from 2024. The report documented adversaries exploiting legitimate generative AI tools at more than 90 organizations through malicious prompt injection.

OpenAI, through Promptfoo, will offer enterprises automated testing for the agents they deploy. Anthropic, through Code Review, audits the code those agents and developers produce. The two tools address different segments of the same expanding risk surface.

OpenAI and Anthropic Move to Secure the AI Agent Pipeline

Continue reading

The Sam Altman Home Attack Case Has Become an Attempted-Murder Test

Three Rivals, One Threat: Inside the Frontier Model Forum's Activation Against Chinese Distillation

Stay Informed