Morning Brief · Monday

Autonomous coders arrive. Regulators respond. GitHub ships to everyone.

A new AI coding agent clears 91% on SWE-bench, effectively the entry-level engineering bar. The EU's first amendments targeting autonomous code generation land quietly. And GitHub Copilot Workspace moves from beta to general availability. The software development industry is reorganizing this week.

Agents

Cognition's Devin 2 clears 91% on SWE-bench — autonomous engineering is no longer a preview

Cognition AI released Devin 2, posting a 91.3% score on SWE-bench Verified, the industry benchmark for autonomous software engineering built from real-world GitHub issues. For context, the benchmark tasks include multi-file edits, debugging unfamiliar codebases, and writing tests that pass on the first run. Human senior engineers score around 93% on the same tasks when given the same context. The gap is now under two percentage points.

Three Fortune 500 engineering teams in the Devin 2 early access program have already reported reducing junior QA headcount in Q2 planning. None of them said AI replaced those engineers — they said the role is being redefined into "agent supervision" rather than first-pass review.

cognition.ai ↗
91% versus 93% is not a gap that will remain stable. The trajectory is clear. The jobs that ARE durable are ones that require knowing what to build, why it matters, and what the edge cases are that no benchmark captures. Technical judgment above the code level is the skill. Writing the code is increasingly the commodity.

Policy

EU AI Act Amendment 7b: autonomous code generation enters the high-risk category

The EU's AI Office quietly published Amendment 7b to the AI Act implementation guidelines, which formally classifies AI systems that autonomously write and deploy production code without human review as high-risk under the Act. The amendment doesn't ban the technology — it requires compliant operators to implement mandatory human-in-the-loop review checkpoints, decision audit logs, and conformance documentation before any agent-generated code reaches production in covered sectors (finance, healthcare, critical infrastructure).

The practical implication: any enterprise in the EU using a Devin-class agent in a regulated sector now needs a documented human review step before deployment, or they're out of compliance.
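For teams wondering what a review checkpoint plus audit log might look like in practice, here is a minimal sketch. It assumes a simple rule (agent-generated changes in a covered sector need a named human approver) and is not drawn from the amendment text. The names `Change`, `AuditLog`, and `may_deploy` are hypothetical, and only the sector list comes from the story above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Sector names come from the brief; the gate logic is an illustrative assumption.
COVERED_SECTORS = {"finance", "healthcare", "critical infrastructure"}


@dataclass
class Change:
    change_id: str
    sector: str
    agent_generated: bool
    approved_by: Optional[str] = None  # name of the human reviewer, if any


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, change: Change, allowed: bool, reason: str) -> None:
        # Every decision is logged, whether the deploy is allowed or blocked.
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "change_id": change.change_id,
            "sector": change.sector,
            "agent_generated": change.agent_generated,
            "approved_by": change.approved_by,
            "allowed": allowed,
            "reason": reason,
        })


def may_deploy(change: Change, log: AuditLog) -> bool:
    """Block agent-generated changes in covered sectors that lack a human approver."""
    needs_review = change.agent_generated and change.sector in COVERED_SECTORS
    if needs_review and not change.approved_by:
        log.record(change, False, "missing human review")
        return False
    reason = "human review recorded" if needs_review else "review not required by this rule"
    log.record(change, True, reason)
    return True
```

In this sketch the gate would sit in the deployment pipeline, so a call such as `may_deploy(Change("c1", "finance", True), AuditLog())` returns `False` until a reviewer's name is attached. Real conformance documentation would need far more than this.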

This is actually a gift for AI consultancies. "We'll help you build the compliant human-in-the-loop workflow around your autonomous coding agents" is a very clean service offering, and the demand just became legally mandatory rather than optional. Governance architecture is billable work.

Tooling

GitHub Copilot Workspace hits general availability: autonomous issue-to-PR on every paid tier

GitHub announced general availability for Copilot Workspace, the feature that converts a GitHub Issue directly into a pull request using an autonomous multi-step reasoning agent. It's now available on all paid GitHub tiers, including Team and Enterprise. The agent reads the issue, explores the codebase, drafts a plan, writes the code, runs tests in a sandboxed environment, and opens a PR — all without developer intervention. Human review before merge is still required, but the first-draft work is now automated.
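The flow described above can be pictured as a fixed sequence of stages with a human gate at the end. The sketch below is a toy model of that sequence, not GitHub's implementation or API. The names `STAGES`, `Run`, `run_agent`, and `merge` are hypothetical, and stopping before the PR when tests fail is an assumption the announcement does not confirm.

```python
from dataclasses import dataclass, field
from typing import Optional

# Stage names paraphrase the steps listed in the story; they are not real API calls.
STAGES = [
    "read_issue",
    "explore_codebase",
    "draft_plan",
    "write_code",
    "run_tests",
    "open_pr",
]


@dataclass
class Run:
    issue_title: str
    completed: list = field(default_factory=list)
    pr_open: bool = False
    merged: bool = False


def run_agent(run: Run, tests_pass: bool = True) -> Run:
    """Walk the stages in order with no human input; halt if the tests fail."""
    for stage in STAGES:
        if stage == "run_tests" and not tests_pass:
            return run  # assumed behavior: failing tests mean no PR is opened
        run.completed.append(stage)
    run.pr_open = True
    return run


def merge(run: Run, reviewer: Optional[str] = None) -> Run:
    """The only step that needs a person: merging requires a named reviewer."""
    if not run.pr_open:
        raise ValueError("no open PR to merge")
    if not reviewer:
        raise PermissionError("human review is required before merge")
    run.merged = True
    return run
```

The useful thing to notice is where the human sits: every stage in `run_agent` runs unattended, and `merge` is the single point that refuses to proceed without a reviewer.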

githubnext.com ↗
Every engineering team just got access to what would have been called a "junior developer" 18 months ago. The supply of first-draft engineering work is now effectively infinite. The bottleneck has moved to review, architecture decisions, and knowing what issues to write in the first place.

Mira's Take

Three stories about code, playing out at three different layers: the capability frontier (Devin 2 at 91%), the regulatory response (EU Amendment 7b), and the mass-market tool (Copilot Workspace GA). They're the same story at three different speeds.

The pattern I keep watching is how fast "experimental" becomes "standard." Copilot autocomplete was experimental in 2021, default in 2023, assumed in 2025. Copilot Workspace is now GA. Devin-class autonomous agents will follow the same curve. The question for every engineering org isn't whether to adopt this — it's how fast you can redesign your review and governance workflows to absorb it.

For anyone building in the AI consulting space: the EU Amendment is worth reading in full. The human-in-the-loop requirement for autonomous code in regulated sectors is a clean, recurring engagement surface, and implementing it correctly takes genuine expertise rather than a checkbox exercise.