The landscape of software creation is undergoing a quiet revolution as AI‑driven coding assistants move from experimental novelties to everyday workhorses. At the recent Code with Claude gathering in London, engineers showcased how tools like Anthropic’s Claude Code are now capable of authoring entire pull requests without human intervention, signaling a shift that many observers once thought would take a decade to materialize. The event, held concurrently with Google’s I/O, attracted a crowd of developers eager to see how far large language models have progressed in understanding intent, translating specifications into syntactically correct code, and even handling the nuances of version control. Attendees reported that nearly half of the room had already merged a change generated solely by the AI, and a surprising number admitted they had not reviewed the generated diff before pushing it upstream. This level of trust underscores a broader cultural change: the perception of AI as a junior partner that can shoulder routine coding chores, freeing humans to focus on architecture, product strategy, and complex problem‑solving. While the technology is still maturing, the momentum suggests that organizations that delay adoption may find themselves at a competitive disadvantage, especially as the cost of manual coding continues to rise and talent shortages persist in many markets.

The atmosphere at the two‑day conference was electric, with developers hunched over laptops, alternating between typing prompts and watching live demos on the main stage. When Jeremy Hadfield, an Anthropic engineer, asked the audience who had shipped a pull request wholly written by Claude in the past week, roughly fifty percent of the hands shot up, illustrating how quickly the tool has moved from novelty to routine use. He followed up with a more provocative question: who had merged such a change without ever looking at the generated code? Nervous laughter rippled through the crowd, yet most of those hands stayed raised, revealing a growing comfort with delegating even the review step to the model. This candid exchange highlighted a pivotal moment in developer culture—where the line between human oversight and machine autonomy is blurring, and where traditional gatekeeping practices like manual code inspection are being reconsidered in favor of speed and agility.

To understand why this shift matters, it helps to revisit the role of a pull request in modern software engineering. A pull request represents a proposed modification to a codebase, packaged for peer review, automated testing, and eventual integration into the main branch. Historically, crafting these requests consumed a significant portion of a developer’s day, involving careful thought, manual typing, and iterative feedback loops. The pull request is not merely a technical artifact; it is a social contract that ensures quality, shares knowledge, and maintains accountability across teams. When an AI can generate a complete, coherent pull request that passes automated checks, it can compress the feedback cycle dramatically, potentially reducing the time from idea to production from days to hours. However, this acceleration also raises questions about the depth of understanding embedded in the generated code and whether the social safeguards of peer review are being adequately preserved.

Industry leaders are already broadcasting impressive adoption figures that underscore the rapid normalization of AI‑assisted coding. At Anthropic, Hadfield claimed that the majority of internal software is now authored by Claude, with the model even contributing substantially to its own codebase—a recursive improvement loop that few would have imagined possible just a year ago. Parallel statements have emerged from OpenAI, Google, and Microsoft, each citing internal metrics that show a steep decline in lines of code written manually by engineers. These claims, while sometimes met with skepticism, reflect a broader trend where organizations are measuring productivity gains not just in feature velocity but also in reduced cognitive load on developers. The cumulative effect is a shifting baseline: what once required a team of senior engineers might now be achievable with a smaller core group augmented by ever‑more capable AI agents.

The evolution of Claude Code itself illustrates the swift pace of innovation in this space. Early iterations, such as the original Claude 4 release, could produce rudimentary snippets but struggled with complex logic and contextual consistency. Subsequent updates—particularly Claude 4.6 and Claude 4.7 launched in February and April of this year—introduced refined reasoning abilities, better handling of edge cases, and improved integration with popular development environments. These enhancements enabled the model to navigate multi‑file refactors, understand intricate API contracts, and generate unit tests that align with existing test suites. As each version closed the gap between prototype reliability and production readiness, developer confidence grew, leading to the widespread hands‑off behavior observed at the London event. The trajectory suggests that future releases will continue to push the boundaries of what AI can autonomously manage, from architectural decisions to performance optimizations.

A central tenet of Anthropic’s strategy is to move beyond simple code generation toward a model that can self‑critique and iterate without constant human prompting. Boris Cherny, who leads the Claude Code initiative, articulated this vision in the opening keynote: the default interaction is no longer “I will prompt Claude” but “I will have Claude prompt itself.” In practice, this means the AI spawns subsidiary agents that propose changes, run internal tests, analyze failures, and propose refinements—all within a closed loop that aims to eliminate visible error messages for the human operator. Ravi Trivedi reinforced this idea by advising attendees to “let it cook,” emphasizing that the best outcomes emerge when developers step back and allow the model to explore solution spaces autonomously. This approach promises to reduce context‑switching fatigue and to surface innovative solutions that might not arise from a strictly human‑centric workflow.
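The closed loop described above (propose, test, analyze, refine) can be sketched in a few lines of Python. This is only an illustrative sketch, not Anthropic's implementation: `refine_until_green`, `propose`, `run_tests`, and `max_rounds` are hypothetical names, and the two toy functions stand in for a real agent call and a real test harness.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Attempt:
    patch: str
    rounds: int


def refine_until_green(
    propose: Callable[[List[str]], str],
    run_tests: Callable[[str], List[str]],
    max_rounds: int = 5,
) -> Optional[Attempt]:
    """Propose a patch, test it, and feed failures back until tests pass."""
    failures: List[str] = []
    for round_number in range(1, max_rounds + 1):
        patch = propose(failures)    # hypothetical agent call
        failures = run_tests(patch)  # hypothetical test harness
        if not failures:
            return Attempt(patch=patch, rounds=round_number)
    return None  # budget exhausted: hand the problem to a human


# Toy stand-ins so the sketch runs end to end.
def toy_propose(failures: List[str]) -> str:
    return "fix-v2" if failures else "fix-v1"


def toy_run_tests(patch: str) -> List[str]:
    return [] if patch == "fix-v2" else ["test_edge_case failed"]


result = refine_until_green(toy_propose, toy_run_tests)
print(result)
```

The `max_rounds` cap is a design assumption: "letting it cook" still needs a point at which an unresolved failure is escalated rather than retried forever.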

One of the most intriguing innovations unveiled at the conference is the “dreaming” mechanism, a reflective subsystem that enables Claude Code agents to learn from their own experiences. After completing a coding task, each agent writes a short note capturing what worked, what stumbled, and any interesting patterns observed in the codebase. These notes are stored in a shared repository that subsequent agents can query when they encounter similar challenges. The dreaming process then aggregates these notes, surfacing recurrent themes, common bug signatures, and successful idioms that can inform the agents’ future behavior. In essence, dreaming transforms isolated task‑level experiences into a cumulative knowledge base, allowing the system to improve its proficiency on a specific project over time, much like a senior developer gaining intuition through repeated exposure to a codebase.
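To make the note-and-aggregate idea concrete, here is a minimal sketch of a shared note store. It is an illustration under stated assumptions, not a description of how Claude Code stores or processes notes: the `Note` and `NoteStore` classes, the tag-based lookup, and the `min_count` threshold are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class Note:
    task: str
    tags: List[str]
    text: str


@dataclass
class NoteStore:
    notes: List[Note] = field(default_factory=list)

    def add(self, note: Note) -> None:
        self.notes.append(note)

    def query(self, tag: str) -> List[Note]:
        """Return the notes a later agent could consult for a given topic."""
        return [n for n in self.notes if tag in n.tags]

    def recurring_tags(self, min_count: int = 2) -> List[str]:
        """Aggregate step: tags that appear in at least min_count notes."""
        counts = Counter(tag for n in self.notes for tag in set(n.tags))
        return sorted(t for t, c in counts.items() if c >= min_count)


store = NoteStore()
store.add(Note("refactor legacy module", ["threading"], "Lock order matters in the worker pool."))
store.add(Note("add export feature", ["threading", "io"], "Reused the lock order from the refactor."))
store.add(Note("update docs", ["docs"], "No surprises."))

print(store.recurring_tags())
print([n.task for n in store.query("threading")])
```

A real system would presumably match notes by something richer than exact tags, but the shape is the same: write after each task, query before the next one, and periodically summarize what keeps recurring.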

The practical impact of dreaming becomes evident when multiple agents collaborate on a large, evolving codebase. Imagine a scenario where one agent refactors a legacy module, leaves a note about a subtle threading issue it discovered, and later another agent tasked with adding a feature pulls that note, avoiding the same pitfall and perhaps even suggesting a more robust concurrency pattern. Over successive iterations, the aggregated insights reduce redundant debugging efforts and accelerate the onboarding of new AI contributors to the project. For organizations, this translates into shorter ramp‑up times for AI‑assisted development, higher consistency across contributions, and a measurable uplift in code quality that compounds with each sprint. The feature also hints at a future where AI models can develop domain‑specific expertise without explicit retraining, simply by accumulating experiential data from day‑to‑day operations.

Beyond the technical showcases, the conference highlighted real‑world adopters who have restructured their development pipelines around Claude Code. Companies such as Spotify and Delivery Hero described how they have integrated AI‑generated pull requests into their continuous delivery pipelines, reporting measurable reductions in cycle time and incident rates. Other companies, including Lovable, Base44, and Monday.com, shared stories of building entire minimum viable products with minimal human coding, relying on the AI to handle boilerplate, authentication flows, and even UI component generation. These testimonies underscore that the benefits of AI‑assisted coding are not confined to tech giants; they are accessible to any organization willing to invest in the necessary tooling, training, and cultural shift toward trust in machine‑generated artifacts.

Nevertheless, the enthusiasm at the event was met with an undercurrent of skepticism in developer forums such as Reddit and Hacker News. Critics argue that the push for AI‑driven coding often originates from management seeking short‑term productivity gains, while the reality on the ground can be more cumbersome. Some complain that the volume of auto‑generated code creates an overwhelming review burden, forcing engineers to spend more time deciphering opaque AI output than they would writing the code themselves. Others warn of skill atrophy, noting that reliance on AI for routine tasks may erode fundamental programming abilities over the long term. Security researchers have also raised flags, pointing out that language models can inadvertently introduce vulnerabilities, such as injection flaws or insecure defaults, when they lack a deep understanding of threat models, potentially expanding the attack surface of deployed applications.

Anthropic’s engineering lead Katelyn Lesse and product lead Angela Jiang addressed these concerns head‑on, emphasizing that foundational software engineering best practices remain indispensable. Lesse asserted that practices like thorough code reviews, comprehensive testing, and security scanning have not become obsolete; rather, teams may have momentarily drifted away from them in the excitement of newfound speed. She noted that technical managers within Anthropic themselves feel the strain of overseeing a surge in AI‑produced artifacts, highlighting that rapid output still demands careful prioritization and time management. While Lesse characterized Claude’s current capability as comparable to that of a mid‑level engineer for routine coding tasks, she stressed that expert human judgment is still essential for system architecture, complex debugging, and strategic decision‑making. Jiang echoed this sentiment, describing the ultimate aspiration as a model that could eventually “build itself,” but acknowledging that reaching that endpoint will require a balanced partnership between human oversight and machine autonomy.

For engineering leaders contemplating deeper integration of AI coding assistants, the path forward lies in establishing disciplined workflows that harness speed without sacrificing rigor. Begin by defining clear boundaries for what the AI may autonomously produce—such as boilerplate, data‑access layers, or routine bug fixes—and reserve human review for security‑critical components, architectural changes, and any code that touches sensitive data. Implement automated gatekeeping: unit tests, static analysis, and dependency scanners must run on every AI‑generated pull request before it reaches a human reviewer. Leverage features like dreaming to create project‑specific knowledge bases that reduce repetitive mistakes and accelerate onboarding of new AI agents. Invest in upskilling programs that teach developers how to prompt effectively, how to interpret AI‑generated diffs, and how to intervene when the model strays from intended semantics. Finally, monitor key metrics—cycle time, defect escape rate, and developer satisfaction—to iteratively tune the balance between automation and oversight, ensuring that the organization reaps the productivity benefits of AI while maintaining the quality and security standards that users expect.
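The boundary-setting and gatekeeping advice above can be expressed as a small routing policy. The sketch below is one hypothetical way to encode it: the path patterns, check names, and the three outcome labels are assumptions chosen for illustration, and a real team would substitute its own.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Dict, List

# Hypothetical patterns for code that always needs a human reviewer.
SENSITIVE_PATTERNS = ["auth/*", "payments/*", "migrations/*"]

# Hypothetical names for the automated gates every AI-generated PR must pass.
REQUIRED_CHECKS = ("unit_tests", "static_analysis", "dependency_scan")


@dataclass
class PullRequest:
    changed_files: List[str]
    checks: Dict[str, bool]  # check name -> passed?


def route(pr: PullRequest) -> str:
    """Decide what happens to an AI-generated pull request."""
    # Gate 1: a missing or failing automated check blocks the PR outright.
    if not all(pr.checks.get(name, False) for name in REQUIRED_CHECKS):
        return "blocked"
    # Gate 2: anything touching sensitive paths goes to a human.
    touches_sensitive = any(
        fnmatchcase(path, pattern)
        for path in pr.changed_files
        for pattern in SENSITIVE_PATTERNS
    )
    return "human_review" if touches_sensitive else "fast_track"


all_green = {"unit_tests": True, "static_analysis": True, "dependency_scan": True}

print(route(PullRequest(["docs/readme.md"], all_green)))
print(route(PullRequest(["auth/login.py"], all_green)))
print(route(PullRequest(["docs/readme.md"], {"unit_tests": True})))
```

Treating a missing check the same as a failed one is a deliberately conservative choice: a PR cannot skip a gate simply because the gate never ran.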