📝Software engineering shift: intent to working software
| Stage | Description |
|---|---|
| ~2021 Autocomplete | Simple token predictions; editor guesses next few characters. |
| ~2022 Inline Code Suggestions | Complete entire functions from signature; model understands patterns, not just tokens. |
| ~2023 Chat-Based Generation | Describe features in natural language, receive working implementation; conversation becomes the interface. |
| ~2024-25 Coding Agents | Multi-file edits, tool calling, test execution, iterative self-correction; agent runs its own loop. |
| ~2025-26 Autonomous Agents | Clone repositories, plan architecture, execute in sandboxes, run full test suites, submit pull requests; no human keystrokes required. |
| Dimension | Vibe Coding | Structured AI-Assisted Coding | Agentic Engineering |
|---|---|---|---|
| Intent specification | Casual natural language prompts | Detailed prompts with examples and constraints | Formal specs, architecture docs, memory files |
| Verification | "Does it seem to work?" | Manual testing, spot-checking | Automated test suites, CI/CD gates, LM judges |
| Codebase understanding | Minimal; developer may not read the generated code | Selective review of critical paths | Comprehensive review of architecture; AI handles implementation details |
| Error handling | Copy-paste error messages back to the AI | Developer diagnoses root cause, AI implements fix | Agents self-diagnose within defined bounds; humans handle architectural issues |
| Appropriate scope | Prototypes, scripts, personal projects, hackathons | Features within established codebases | Production systems, team-scale development |
| Risk profile | High; acceptable for disposable code | Moderate; human judgment at key checkpoints | Low; systematic verification at every stage |
| Context Type | Loading Mechanism | Token Cost | Characteristics | Examples |
|---|---|---|---|---|
| Static Context | Always loaded, every interaction | High | Expensive but reliable; agent never forgets | System instructions, rule files (AGENTS.md), global memory, core guardrails |
| Dynamic Context | Loaded on demand, per task | Low per turn | Efficient and scalable; pay only for what you use | Agent Skills (triggered by task match), tool results, retrieved documents (RAG) |
| Phase | Traditional Iterative SDLC | AI-Driven SDLC |
|---|---|---|
| Requirements | 2-3 days | Specs become eval criteria |
| Design | 1-2 days | Architecture decisions amplified at scale |
| Implementation | 1-3 weeks | Minutes to hours (Agent self-corrects) |
| Testing | 3-5 days | Output Eval (Verify what it built AND how it got there), Trajectory Eval |
| Review & Deploy | 2-3 days | Review & Deploy |
| Maintenance | Ongoing | Continuous automation |
| Sprint Cycle | Weeks | Minutes to hours |
| Dimension | Conductor | Orchestrator |
|---|---|---|
| Interaction | Real-time, Synchronous, In-IDE | Asynchronous, High-level, Multi-agent |
| Developer's Role | Prompt, reviews inline, refines | Defines specific task, reviews PR/output, approves or corrects |
| Control Level | Keystroke-level control, immediate feedback, single-file scope, developer always in loop | Goal-level control, delayed feedback, multi-file scope, reviews outcomes not keystrokes |
| Best For | Exploratory coding, prototyping, learning new API | Feature implementation, migrations, test generation |
| Leverage | Fine-grained control | High-leverage delegation |
| Metric | Vibe Coding | Agentic Engineering |
|---|---|---|
| CapEx | Minimal Investment (12%) | Upfront Platform Design (12%) |
| OpEx | High Running Costs | Low Marginal Running Costs |
| Characteristics | Rapid prototyping, slow scaling, high friction for long-term maintenance, economic dead-end for complex systems | Controlled iteration, fast scaling, low friction for automatic updates, economically sustainable for mature codebases |
| Stage | Description |
|---|---|
| ~2021 Autocomplete | Simple token predictions; editor guesses next few characters. |
| ~2022 Inline Code Suggestions | Complete entire functions from signature; model understands patterns, not just tokens. |
| ~2023 Chat-Based Generation | Describe features in natural language, receive working implementation; conversation becomes the interface. |
| ~2024-25 Coding Agents | Multi-file edits, tool calling, test execution, iterative self-correction; agent runs its own loop. |
| ~2025-26 Autonomous Agents | Clone repositories, plan architecture, execute in sandboxes, run full test suites, submit pull requests; no human keystrokes required. |
| Dimension | Vibe Coding | Structured AI-Assisted Coding | Agentic Engineering |
|---|---|---|---|
| Intent specification | Casual natural language prompts | Detailed prompts with examples and constraints | Formal specs, architecture docs, memory files |
| Verification | "Does it seem to work?" | Manual testing, spot-checking | Automated test suites, CI/CD gates, LM judges |
| Codebase understanding | Minimal; developer may not read the generated code | Selective review of critical paths | Comprehensive review of architecture; AI handles implementation details |
| Error handling | Copy-paste error messages back to the AI | Developer diagnoses root cause, AI implements fix | Agents self-diagnose within defined bounds; humans handle architectural issues |
| Appropriate scope | Prototypes, scripts, personal projects, hackathons | Features within established codebases | Production systems, team-scale development |
| Risk profile | High; acceptable for disposable code | Moderate; human judgment at key checkpoints | Low; systematic verification at every stage |
| Context Type | Loading Mechanism | Token Cost | Characteristics | Examples |
|---|---|---|---|---|
| Static Context | Always loaded, every interaction | High | Expensive but reliable; agent never forgets | System instructions, rule files (AGENTS.md), global memory, core guardrails |
| Dynamic Context | Loaded on demand, per task | Low per turn | Efficient and scalable; pay only for what you use | Agent Skills (triggered by task match), tool results, retrieved documents (RAG) |
| Phase | Traditional Iterative SDLC | AI-Driven SDLC |
|---|---|---|
| Requirements | 2-3 days | Specs become eval criteria |
| Design | 1-2 days | Architecture decisions amplified at scale |
| Implementation | 1-3 weeks | Minutes to hours (Agent self-corrects) |
| Testing | 3-5 days | Output Eval (Verify what it built AND how it got there), Trajectory Eval |
| Review & Deploy | 2-3 days | Review & Deploy |
| Maintenance | Ongoing | Continuous automation |
| Sprint Cycle | Weeks | Minutes to hours |
| Dimension | Conductor | Orchestrator |
|---|---|---|
| Interaction | Real-time, Synchronous, In-IDE | Asynchronous, High-level, Multi-agent |
| Developer's Role | Prompt, reviews inline, refines | Defines specific task, reviews PR/output, approves or corrects |
| Control Level | Keystroke-level control, immediate feedback, single-file scope, developer always in loop | Goal-level control, delayed feedback, multi-file scope, reviews outcomes not keystrokes |
| Best For | Exploratory coding, prototyping, learning new API | Feature implementation, migrations, test generation |
| Leverage | Fine-grained control | High-leverage delegation |
| Metric | Vibe Coding | Agentic Engineering |
|---|---|---|
| CapEx | Minimal Investment (12%) | Upfront Platform Design (12%) |
| OpEx | High Running Costs | Low Marginal Running Costs |
| Characteristics | Rapid prototyping, slow scaling, high friction for long-term maintenance, economic dead-end for complex systems | Controlled iteration, fast scaling, low friction for automatic updates, economically sustainable for mature codebases |