{"sourceUrl":null,"sourceType":"file","contentType":"Explainer","apex":{"id":"n1","type":"APEX","label":"From Predictive AI to Autonomous Agents","text":"Artificial intelligence is undergoing a paradigm shift from passive, discrete tasks to autonomous problem-solving and task execution by AI agents.","children":[{"id":"n2","type":"CONC","label":"AI Agents as LM Evolution","text":"Agents represent the natural evolution of Language Models, made useful in software by combining an LM's reasoning with practical action capabilities.","parentId":"n1","children":[{"id":"n3","type":"DETL","label":"Traditional AI Focus","text":"For years, AI focused on passive, discrete tasks like answering questions, translating text, or generating images from prompts, requiring constant human direction.","parentId":"n2","children":[]},{"id":"n4","type":"INSG","label":"Paradigm Shift to Autonomous Agents","text":"There is a paradigm shift from AI that predicts or creates content to new software capable of autonomous problem-solving and task execution.","parentId":"n2","children":[]},{"id":"n5","type":"DETL","label":"Agent Definition","text":"An AI agent is a complete application that makes plans and takes actions to achieve goals, combining an LM's ability to reason with the ability to act.","parentId":"n2","children":[]},{"id":"n6","type":"INSG","label":"Agents Handle Complex Tasks","text":"Agents can handle complex, multi-step tasks independently, figuring out necessary steps to reach a goal without constant human guidance.","parentId":"n2","children":[]}]},{"id":"n7","type":"CONC","label":"Document Purpose","text":"This document is the first in a five-part series, guiding developers, architects, and product leaders in transitioning to robust, production-grade agentic systems.","parentId":"n1","children":[{"id":"n8","type":"DETL","label":"Building Agent Prototypes","text":"Building a simple prototype is straightforward, but ensuring security, quality, and reliability for production is a 
significant challenge.","parentId":"n7","children":[]},{"id":"n9","type":"CONC","label":"Comprehensive Foundation","text":"This paper provides a comprehensive foundation for building, deploying, and managing intelligent applications that reason, act, and observe to accomplish goals.","parentId":"n7","children":[{"id":"n10","type":"SUBC","label":"Core Anatomy","text":"Deconstructs an agent into its three essential components: the reasoning Model, actionable Tools, and the governing Orchestration Layer.","parentId":"n9","children":[]},{"id":"n11","type":"SUBC","label":"Taxonomy of Capabilities","text":"Classifies agents from simple, connected problem-solvers to complex, collaborative multi-agent systems.","parentId":"n9","children":[]},{"id":"n12","type":"SUBC","label":"Architectural Design","text":"Dives into practical design considerations for each component, from model selection to tool implementation.","parentId":"n9","children":[]},{"id":"n13","type":"SUBC","label":"Building for Production","text":"Establishes the Agent Ops discipline needed to evaluate, debug, secure, and scale agentic systems from single instance to enterprise fleet.","parentId":"n9","children":[]}]}]},{"id":"n14","type":"CONC","label":"Introduction to AI Agents","text":"An AI Agent combines models, tools, an orchestration layer, and runtime services, using a Language Model in a loop to accomplish a goal.","parentId":"n1","children":[{"id":"n15","type":"INSG","label":"Anthropomorphizing AI","text":"Words are insufficient to describe human-AI interaction, leading to anthropomorphizing AI with human terms like 'think,' 'reason,' and 'know'.","parentId":"n14","children":[]},{"id":"n16","type":"SUBC","label":"Model (The Brain)","text":"The core language or foundation model serves as the agent's central reasoning engine to process information, evaluate options, and make decisions.","parentId":"n14","children":[{"id":"n17","type":"DETL","label":"Model Type Capabilities","text":"The type of model, 
whether general-purpose, fine-tuned, or multimodal, dictates the agent's cognitive capabilities.","parentId":"n16","children":[]},{"id":"n18","type":"DETL","label":"Context Window Curator","text":"An agentic system ultimately curates the input context window for the Language Model.","parentId":"n16","children":[]}]},{"id":"n19","type":"SUBC","label":"Tools (The Hands)","text":"Tools connect the agent's reasoning to the outside world, enabling actions beyond text generation, including API extensions, code functions, and data stores.","parentId":"n14","children":[{"id":"n20","type":"DETL","label":"Tool Usage Process","text":"An agentic system allows an LM to plan tool usage, execute the tool, and incorporate results into the input context window of the next LM call.","parentId":"n19","children":[]}]},{"id":"n21","type":"SUBC","label":"Orchestration Layer (The Nervous System)","text":"This layer governs the agent's operational loop, managing planning, memory (state), and reasoning strategy execution using prompting frameworks.","parentId":"n14","children":[{"id":"n22","type":"DETL","label":"Reasoning Techniques","text":"The orchestration layer uses reasoning techniques like Chain-of-Thought or ReAct to break down complex goals and decide when to think versus use a tool.","parentId":"n21","children":[]},{"id":"n23","type":"DETL","label":"Memory Management","text":"This layer is responsible for providing agents with the memory to 'remember' information.","parentId":"n21","children":[]}]},{"id":"n24","type":"SUBC","label":"Deployment (The Body and Legs)","text":"Production deployment ensures the agent is a reliable and accessible service, involving hosting on a secure, scalable server with monitoring and management.","parentId":"n14","children":[{"id":"n25","type":"DETL","label":"Accessing Deployed Agents","text":"Once deployed, agents can be accessed by users through a graphical interface or programmatically via an Agent-to-Agent (A2A) 
API.","parentId":"n24","children":[]}]},{"id":"n26","type":"INSG","label":"Developer Role Shift","text":"Building a generative AI agent shifts the developer's role from a 'bricklayer' defining explicit logic to a 'director' setting the scene and guiding autonomous actors.","parentId":"n14","children":[]},{"id":"n27","type":"INSG","label":"LM Flexibility Challenge","text":"A Large Language Model's incredible flexibility, its greatest strength, also makes it difficult to reliably compel it to do one specific thing perfectly.","parentId":"n14","children":[]},{"id":"n28","type":"CONC","label":"Agent Ops for Debugging","text":"'Agent Ops' redefines the debugging cycle of measurement, analysis, and system optimization by monitoring the agent's 'thought process' through traces and logs.","parentId":"n14","children":[]}]},{"id":"n29","type":"CONC","label":"Agentic Problem-Solving Process","text":"An AI agent operates on a continuous, cyclical 5-step process to achieve objectives, integrating a reasoning model, actionable tools, and a governing orchestration layer.","parentId":"n1","children":[{"id":"n30","type":"DETL","label":"5 Fundamental Steps","text":"The agentic problem-solving loop can be broken down into five fundamental steps, detailed in the book Agentic System Design.","parentId":"n29","children":[]},{"id":"n31","type":"SUBC","label":"Step 1: Get the Mission","text":"The process begins with a specific, high-level goal provided by a user or an automated trigger.","parentId":"n29","children":[{"id":"n32","type":"EXMP","label":"Mission Example","text":"An example mission is 'Organize my team's travel for the upcoming conference' or 'A new high-priority customer ticket has arrived'.","parentId":"n31","children":[]}]},{"id":"n33","type":"SUBC","label":"Step 2: Scan the Scene","text":"The agent gathers context by perceiving its environment, accessing available resources like memory, user guidance, tools, calendars, databases, or 
APIs.","parentId":"n29","children":[]},{"id":"n34","type":"SUBC","label":"Step 3: Think It Through","text":"The agent's core 'think' loop, driven by the reasoning model, analyzes the Mission against the Scene and devises a plan.","parentId":"n29","children":[{"id":"n35","type":"DETL","label":"Chain of Reasoning","text":"This involves a chain of reasoning, like planning to use 'get_team_roster' and then 'calendar_api' to book travel.","parentId":"n34","children":[]}]},{"id":"n36","type":"SUBC","label":"Step 4: Take Action","text":"The orchestration layer executes the first concrete step of the plan by selecting and invoking an appropriate tool, such as calling an API or querying a database.","parentId":"n29","children":[]},{"id":"n37","type":"SUBC","label":"Step 5: Observe and Iterate","text":"The agent observes the outcome of its action, adds new information to its context or memory, and repeats the loop by returning to Step 3.","parentId":"n29","children":[]},{"id":"n38","type":"DETL","label":"Continuous Cycle Management","text":"The 'Think, Act, Observe' cycle continues, managed by the Orchestration Layer, reasoned by the Model, and executed by the Tools until the Mission is achieved.","parentId":"n29","children":[]},{"id":"n39","type":"EXMP","label":"Customer Support Agent Example","text":"A customer support agent responding to 'Where is my order #12345?' 
demonstrates the 5-step problem-solving cycle.","parentId":"n29","children":[{"id":"n40","type":"DETL","label":"Example: Agent Strategy","text":"Instead of acting immediately, the agent enters its 'Think It Through' phase to devise a multi-step plan for providing a delivery status.","parentId":"n39","children":[]},{"id":"n41","type":"DETL","label":"Example: Plan - Identify","text":"The agent identifies the need to find the order in the internal database to confirm existence and retrieve details.","parentId":"n39","children":[]},{"id":"n42","type":"DETL","label":"Example: Plan - Track","text":"From order details, the agent extracts the shipping carrier's tracking number and queries the external carrier's API for live status.","parentId":"n39","children":[]},{"id":"n43","type":"DETL","label":"Example: Plan - Report","text":"Finally, the agent synthesizes the gathered information into a clear, helpful response for the user.","parentId":"n39","children":[]},{"id":"n44","type":"DETL","label":"Example: Act - Step 1","text":"In its first 'Act' phase, the agent calls the find_order(\"12345\") tool, observing a full order record with tracking number 'ZYX987'.","parentId":"n39","children":[]},{"id":"n45","type":"DETL","label":"Example: Act - Step 2","text":"The orchestration layer then calls the get_shipping_status(\"ZYX987\") tool, observing the result 'Out for Delivery'.","parentId":"n39","children":[]},{"id":"n46","type":"DETL","label":"Example: Act - Report","text":"With all data gathered, the agent plans the final message and generates the response: 'Your order #12345 is \"Out for Delivery\"!'.","parentId":"n39","children":[]}]}]},{"id":"n47","type":"CONC","label":"Taxonomy of Agentic Systems","text":"Agentic systems can be classified into broad levels, each building on the capabilities of the last, scaling in complexity.","parentId":"n1","children":[{"id":"n48","type":"INSG","label":"Scoping Agent Type","text":"For architects or product leaders, a key initial decision is 
scoping what kind of agent to build based on complexity.","parentId":"n47","children":[]},{"id":"n49","type":"SUBC","label":"Level 0: Core Reasoning System","text":"This level starts with the Language Model as the reasoning engine, operating in isolation based on pre-trained knowledge without external tools or memory.","parentId":"n47","children":[{"id":"n50","type":"DETL","label":"Strength of Level 0","text":"Its strength lies in extensive training, allowing it to explain concepts and plan problem-solving deeply.","parentId":"n49","children":[]},{"id":"n51","type":"DETL","label":"Trade-off of Level 0","text":"The trade-off is a complete lack of real-time awareness, being 'blind' to facts outside its training data.","parentId":"n49","children":[]},{"id":"n52","type":"EXMP","label":"Level 0 Example","text":"A Level 0 agent can explain baseball rules or Yankees history but not the score of last night's game, as it's outside its training data.","parentId":"n49","children":[]}]},{"id":"n53","type":"SUBC","label":"Level 1: Connected Problem-Solver","text":"At this level, the reasoning engine connects to and utilizes external tools, allowing problem-solving beyond static, pre-trained knowledge.","parentId":"n47","children":[{"id":"n54","type":"DETL","label":"Level 1 Capability","text":"This fundamental ability to interact with the world, using tools like search or APIs, is the core capability of a Level 1 agent.","parentId":"n53","children":[]},{"id":"n55","type":"EXMP","label":"Level 1 Example","text":"Given the mission 'What was the final score of the Yankees game last night?', a Level 1 agent uses a Google Search API to find and synthesize the answer 'Yankees won 5-3'.","parentId":"n53","children":[]}]},{"id":"n56","type":"SUBC","label":"Level 2: Strategic Problem-Solver","text":"Level 2 expands capabilities from simple tasks to strategically planning complex, multi-part goals, with context engineering as a key 
skill.","parentId":"n47","children":[{"id":"n57","type":"CONC","label":"Context Engineering","text":"Context engineering is the agent's ability to actively select, package, and manage the most relevant information for each step of its plan.","parentId":"n56","children":[{"id":"n58","type":"JUST","label":"Importance of Context Engineering","text":"Context engineering curates the model's limited attention to prevent overload and ensure efficient performance, thus impacting agent accuracy.","parentId":"n57","children":[]}]},{"id":"n59","type":"EXMP","label":"Level 2 Example Mission","text":"The mission 'Find a good coffee shop halfway between two addresses' demonstrates multi-step strategic planning and tool use.","parentId":"n56","children":[{"id":"n60","type":"DETL","label":"Example: Step 1 Think","text":"The agent first thinks 'I must first find the halfway point'.","parentId":"n59","children":[]},{"id":"n61","type":"DETL","label":"Example: Step 1 Act","text":"The agent calls the Maps tool with both addresses.","parentId":"n59","children":[]},{"id":"n62","type":"DETL","label":"Example: Step 1 Observe","text":"The agent observes 'The halfway point is Millbrae, CA'.","parentId":"n59","children":[]},{"id":"n63","type":"DETL","label":"Example: Step 2 Think","text":"The agent then thinks 'Now I must find coffee shops in Millbrae with a 4-star rating or higher'.","parentId":"n59","children":[]},{"id":"n64","type":"DETL","label":"Example: Step 2 Act","text":"The agent calls the google_places tool with query='coffee shop in Millbrae, CA', min_rating=4.0, demonstrating context engineering.","parentId":"n59","children":[]},{"id":"n65","type":"DETL","label":"Example: Step 2 Observe","text":"The agent observes the search returns 'Millbrae Coffee' and 'The Daily Grind'.","parentId":"n59","children":[]},{"id":"n66","type":"DETL","label":"Example: Step 3 Think","text":"The agent thinks 'I will synthesize these results and present them to the 
user'.","parentId":"n59","children":[]}]},{"id":"n67","type":"DETL","label":"Proactive Assistance","text":"Strategic planning enables proactive assistance, such as an agent reading a flight confirmation email and adding key context to a calendar.","parentId":"n56","children":[]}]},{"id":"n68","type":"SUBC","label":"Level 3: Collaborative Multi-Agent System","text":"This level shifts the paradigm from a single 'super-agent' to a 'team of specialists' working in concert, mirroring human organizations with division of labor.","parentId":"n47","children":[{"id":"n69","type":"DETL","label":"Agents as Tools","text":"Here, agents treat other agents as tools, exemplified by a 'Project Manager' agent delegating tasks to specialized team members.","parentId":"n68","children":[]},{"id":"n70","type":"EXMP","label":"Level 3 Example Mission","text":"A 'Project Manager' agent receives the mission 'Launch our new 'Solaris' headphones' and delegates sub-tasks.","parentId":"n68","children":[{"id":"n71","type":"DETL","label":"Example: Market Research","text":"The Project Manager delegates to a MarketResearchAgent to 'Analyze competitor pricing for noise-canceling headphones' and return a summary by tomorrow.","parentId":"n70","children":[]},{"id":"n72","type":"DETL","label":"Example: Marketing Task","text":"The Project Manager delegates to a MarketingAgent to 'Draft three versions of a press release' using the 'Solaris' product spec sheet.","parentId":"n70","children":[]},{"id":"n73","type":"DETL","label":"Example: Web Development","text":"The Project Manager delegates to a WebDevAgent to 'Generate the new product page HTML' based on design mockups.","parentId":"n70","children":[]}]},{"id":"n74","type":"INSG","label":"Automating Complex Workflows","text":"This collaborative model represents the frontier of automating entire, complex business workflows from start to finish, despite current LM reasoning 
limitations.","parentId":"n68","children":[]}]},{"id":"n75","type":"SUBC","label":"Level 4: Self-Evolving System","text":"Level 4 is a profound leap from delegation to autonomous creation and adaptation, where an agentic system dynamically expands its capabilities.","parentId":"n47","children":[{"id":"n76","type":"DETL","label":"Dynamic Capability Expansion","text":"At this level, an agent can identify gaps in its own capabilities and dynamically create new tools or even new agents to fill them.","parentId":"n75","children":[]},{"id":"n77","type":"EXMP","label":"Level 4 Example Mission","text":"A 'Project Manager' agent, tasked with 'Solaris' launch, realizes it needs social media sentiment monitoring but lacks a tool.","parentId":"n75","children":[{"id":"n78","type":"DETL","label":"Example: Think (Meta-Reasoning)","text":"The agent thinks 'I must track social media buzz for 'Solaris,' but I lack the capability'.","parentId":"n77","children":[]},{"id":"n79","type":"DETL","label":"Example: Act (Autonomous Creation)","text":"Instead of failing, it invokes an AgentCreator tool to build a new agent that monitors social media for keywords 'Solaris headphones', performs sentiment analysis, and reports daily summaries.","parentId":"n77","children":[]},{"id":"n80","type":"DETL","label":"Example: Observe","text":"A new, specialized SentimentAnalysisAgent is created, tested, and added to the team on the fly, contributing to the original mission.","parentId":"n77","children":[]}]},{"id":"n81","type":"INSG","label":"Learning and Evolving Organization","text":"This level of autonomy, where a system dynamically expands its own capabilities, transforms a team of agents into a truly learning and evolving organization.","parentId":"n75","children":[]}]}]},{"id":"n82","type":"CONC","label":"Core Agent Architecture","text":"Building agents involves the specific architectural design of its three core components: Model, Tools, and Orchestration, transitioning from concept to 
code.","parentId":"n1","children":[{"id":"n83","type":"SUBC","label":"Model: The Brain","text":"The Language Model is the reasoning core, and its selection is a critical architectural decision dictating cognitive capabilities, operational cost, and speed.","parentId":"n82","children":[{"id":"n84","type":"DCSN","label":"Model Selection Approach","text":"Treating model choice as simply picking the highest benchmark score is a common path to failure; success in production is rarely determined by generic academic benchmarks.","parentId":"n83","children":[]},{"id":"n85","type":"DETL","label":"Agentic Fundamentals","text":"Real-world success requires a model excelling at agentic fundamentals: superior reasoning for complex problems and reliable tool use.","parentId":"n83","children":[]},{"id":"n86","type":"JUST","label":"Optimal Model Selection","text":"The 'best' model is at the optimal intersection of quality, speed, and price for specific tasks, determined by defining the business problem and testing against direct metrics.","parentId":"n83","children":[]},{"id":"n87","type":"DETL","label":"Multiple Models","text":"You may choose more than one model, a 'team of specialists,' using models like Gemini 2.5 Pro for planning and Gemini 2.5 Flash for simpler tasks.","parentId":"n83","children":[]},{"id":"n88","type":"INSG","label":"Model Routing Strategy","text":"Model routing, either automatic or hard-coded, is a key strategy for optimizing both performance and cost.","parentId":"n83","children":[]},{"id":"n89","type":"DETL","label":"Multimodal Data Handling","text":"Natively multimodal models like Gemini live mode streamline image/audio processing, while specialized tools like Cloud Vision API or Speech-to-Text API convert data to text for language-only models.","parentId":"n83","children":[]},{"id":"n90","type":"INSG","label":"AI Landscape Evolution","text":"The AI landscape evolves rapidly, rendering a 'set it and forget it' mindset unsustainable; models chosen today 
will be superseded in six months.","parentId":"n83","children":[]},{"id":"n91","type":"DCSN","label":"Agent Ops Practice","text":"Building for reality means investing in a nimble operational framework, an 'Agent Ops' practice, with a robust CI/CD pipeline that continuously evaluates new models.","parentId":"n83","children":[]},{"id":"n92","type":"JUST","label":"Agent Ops Benefits","text":"Agent Ops de-risks and accelerates upgrades, ensuring agents are powered by the best models without complete architectural overhaul.","parentId":"n83","children":[]}]},{"id":"n93","type":"SUBC","label":"Tools: The Hands","text":"Tools connect the agent's reasoning (brain) to reality, enabling it to retrieve real-time information and take action beyond static training data.","parentId":"n82","children":[{"id":"n94","type":"DETL","label":"Three-Part Tool Loop","text":"A robust tool interface involves defining what a tool can do, invoking it, and observing the result.","parentId":"n93","children":[]},{"id":"n95","type":"CONC","label":"Retrieving Information","text":"Accessing up-to-date information is the most foundational tool, grounding the agent in reality and dramatically reducing hallucinations.","parentId":"n93","children":[{"id":"n96","type":"DETL","label":"RAG for External Knowledge","text":"Retrieval-Augmented Generation (RAG) enables agents to query external knowledge stored in Vector Databases or Knowledge Graphs.","parentId":"n95","children":[]},{"id":"n97","type":"DETL","label":"NL2SQL for Structured Data","text":"Natural Language to SQL (NL2SQL) tools allow agents to query databases to answer analytic questions, like 'What were our top-selling products last quarter?'.","parentId":"n95","children":[]}]},{"id":"n98","type":"CONC","label":"Executing Actions","text":"Agents unleash true power when they move from reading information to actively performing actions, transforming into autonomous actors.","parentId":"n93","children":[{"id":"n99","type":"DETL","label":"Wrapping 
APIs/Code","text":"Existing APIs and code functions can be wrapped as tools, allowing agents to send emails, schedule meetings, or update customer records.","parentId":"n98","children":[]},{"id":"n100","type":"DETL","label":"Writing/Executing Code","text":"For dynamic tasks, an agent can write and execute code on the fly in a secure sandbox, generating SQL queries or Python scripts.","parentId":"n98","children":[]}]},{"id":"n101","type":"CONC","label":"Human in the Loop (HITL)","text":"HITL tools allow agents to pause workflows and ask for human confirmation or specific information, ensuring human involvement in critical decisions.","parentId":"n93","children":[{"id":"n102","type":"EXMP","label":"HITL Implementation","text":"HITL could be implemented via SMS text messaging or a task in a database.","parentId":"n101","children":[]}]}]},{"id":"n103","type":"SUBC","label":"Function Calling","text":"For agents to reliably do 'function calling' and use tools, clear instructions, secure connections, and orchestration are required.","parentId":"n82","children":[{"id":"n104","type":"DETL","label":"OpenAPI Specification","text":"Longstanding standards like the OpenAPI specification provide a structured contract describing a tool's purpose, parameters, and expected response.","parentId":"n103","children":[]},{"id":"n105","type":"DETL","label":"Model Context Protocol (MCP)","text":"Open standards like the Model Context Protocol (MCP) are popular for simpler discovery and connection to tools due to convenience.","parentId":"n103","children":[]},{"id":"n106","type":"DETL","label":"Native Tools","text":"A few models, like Gemini with native Google Search, invoke functions as part of the LM call itself.","parentId":"n103","children":[]}]},{"id":"n107","type":"SUBC","label":"Orchestration Layer","text":"The orchestration layer, acting as the central nervous system, is the engine that runs the 'Think, Act, Observe' loop and governs agent 
behavior.","parentId":"n82","children":[{"id":"n108","type":"INSG","label":"Orchestration Layer Role","text":"This layer is not just plumbing but the conductor of the agentic symphony, deciding when the model reasons, which tool acts, and how results inform the next movement.","parentId":"n107","children":[]}]}]},{"id":"n109","type":"CONC","label":"Core Design Choices","text":"Architectural decisions for agents involve determining autonomy, implementation methods, and ensuring a production-grade framework.","parentId":"n1","children":[{"id":"n110","type":"DCSN","label":"Agent Autonomy Spectrum","text":"The first architectural decision is determining the agent's degree of autonomy, which exists on a spectrum from deterministic workflows to dynamically adaptive LMs.","parentId":"n109","children":[]},{"id":"n111","type":"DCSN","label":"Implementation Method","text":"No-code builders offer speed for structured tasks and simple agents, while code-first frameworks like Google's Agent Development Kit (ADK) provide deep control for complex systems.","parentId":"n109","children":[]},{"id":"n112","type":"DCSN","label":"Production-Grade Framework","text":"A production-grade framework must be open, allowing plug-in of any model or tool to prevent vendor lock-in.","parentId":"n109","children":[{"id":"n113","type":"DETL","label":"Precise Control","text":"The framework must provide precise control, enabling a hybrid approach where LM reasoning is governed by hard-coded business rules.","parentId":"n112","children":[]},{"id":"n114","type":"DETL","label":"Observability","text":"The framework must be built for observability, generating detailed traces and logs that expose the entire reasoning trajectory for unexpected agent behavior.","parentId":"n112","children":[]}]},{"id":"n115","type":"CONC","label":"Instruct with Domain Knowledge and Persona","text":"Developers' most powerful lever is instructing the agent with domain knowledge and a distinct persona, using a system prompt or 
core instructions.","parentId":"n109","children":[{"id":"n116","type":"INSG","label":"Agent Constitution","text":"This instruction isn't just a simple command; it serves as the agent's constitution, defining constraints, desired output, rules of engagement, and tone.","parentId":"n115","children":[]}]},{"id":"n117","type":"CONC","label":"Augment with Context","text":"The agent's 'memory' is orchestrated into the LM context window at runtime.","parentId":"n109","children":[{"id":"n118","type":"SUBC","label":"Short-Term Memory","text":"This is the agent's active 'scratchpad,' maintaining the running history of the current conversation and action-observation pairs for immediate context.","parentId":"n117","children":[]},{"id":"n119","type":"SUBC","label":"Long-Term Memory","text":"Long-term memory provides persistence across sessions, implemented as a RAG system connected to a vector database or search engine.","parentId":"n117","children":[{"id":"n120","type":"INSG","label":"Personalized Continuous Experience","text":"The orchestrator enables the agent to pre-fetch and query its history, allowing it to 'remember' user preferences or past task outcomes for personalization.","parentId":"n119","children":[]}]}]},{"id":"n121","type":"CONC","label":"Multi-Agent Systems and Design Patterns","text":"As tasks grow in complexity, a 'team of specialists' approach, mirroring human organizations, is more effective than a single 'super-agent'.","parentId":"n109","children":[{"id":"n122","type":"JUST","label":"Benefits of Division of Labor","text":"This division of labor makes each specialized AI agent simpler, more focused, and easier to build, test, and maintain for dynamic or long-running processes.","parentId":"n121","children":[]},{"id":"n123","type":"DETL","label":"Coordinator Pattern","text":"For dynamic or non-linear tasks, a 'manager' agent segments complex requests and intelligently routes sub-tasks to specialist agents, then aggregates 
responses.","parentId":"n121","children":[]},{"id":"n124","type":"DETL","label":"Sequential Pattern","text":"For linear workflows, the Sequential pattern acts like a digital assembly line where one agent's output becomes the next agent's input.","parentId":"n121","children":[]},{"id":"n125","type":"DETL","label":"Iterative Refinement Pattern","text":"This pattern creates a feedback loop, using a 'generator' agent to create content and a 'critic' agent to evaluate it against quality standards.","parentId":"n121","children":[]},{"id":"n126","type":"DETL","label":"Human-in-the-Loop (HITL) Pattern","text":"For high-stakes tasks, the Human-in-the-Loop (HITL) pattern is critical, creating a deliberate pause for human approval before significant agent action.","parentId":"n121","children":[]}]}]},{"id":"n127","type":"CONC","label":"Agent Deployment and Services","text":"Deploying a local agent to a server makes it a reliable, accessible service, requiring several supporting services for effectiveness.","parentId":"n1","children":[{"id":"n128","type":"DETL","label":"Essential Services","text":"An agent requires session history, memory persistence, and other services for effective operation in production.","parentId":"n127","children":[]},{"id":"n129","type":"DETL","label":"Agent Builder Responsibilities","text":"Agent builders are responsible for logging, security measures, data privacy, data residency, and regulation compliance when deploying to production.","parentId":"n127","children":[]},{"id":"n130","type":"DETL","label":"Deployment Options","text":"Agent builders can rely on application hosting infrastructure, including purpose-built platforms like Vertex AI Agent Engine or industry standard runtimes like Cloud Run or GKE.","parentId":"n127","children":[]}]},{"id":"n131","type":"CONC","label":"Agent Ops: Structured Approach to Unpredictable","text":"Building agents requires a new operational philosophy called 'Agent Ops' due to the stochastic nature of agentic 
systems and probabilistic responses.","parentId":"n1","children":[{"id":"n132","type":"DETL","label":"Testing Generative AI","text":"Traditional deterministic software unit tests cannot simply assert output == expected for generative AI, as agent responses are probabilistic by design.","parentId":"n131","children":[]},{"id":"n133","type":"INSG","label":"LM for Quality Evaluation","text":"Evaluating agent 'quality' usually requires an LM to assess if the response fulfills requirements, avoids extra content, and maintains proper tone.","parentId":"n131","children":[]},{"id":"n134","type":"DETL","label":"Agent Ops Definition","text":"Agent Ops is a disciplined, structured approach managing the unique challenges of building, deploying, and governing AI agents, evolving from DevOps and MLOps.","parentId":"n131","children":[]},{"id":"n135","type":"CONC","label":"Measure What Matters: Instrumenting Success","text":"Define 'better' in business context by framing observability like an A/B test and identifying Key Performance Indicators (KPIs) for real-world impact.","parentId":"n131","children":[{"id":"n136","type":"DETL","label":"KPI Examples","text":"KPIs should go beyond technical correctness to include goal completion rates, user satisfaction scores, task latency, operational cost per interaction, and business goals like revenue or retention.","parentId":"n135","children":[]},{"id":"n137","type":"JUST","label":"Metrics-Driven Development","text":"A top-down view of KPIs guides testing, enables metrics-driven development, and allows calculation of return on investment.","parentId":"n135","children":[]}]},{"id":"n138","type":"CONC","label":"Quality Instead of Pass/Fail: LM Judge","text":"Since simple pass/fail evaluation is impossible for agents, quality is assessed using an 'LM as Judge' against a predefined rubric.","parentId":"n131","children":[{"id":"n139","type":"DETL","label":"LM Judge Rubric","text":"The LM Judge evaluates if the agent's output is correct, 
factually grounded, and follows instructions.","parentId":"n138","children":[]},{"id":"n140","type":"DETL","label":"Automated Evaluation","text":"This automated evaluation, run against a golden dataset of prompts, provides a consistent measure of quality.","parentId":"n138","children":[]},{"id":"n141","type":"DETL","label":"Evaluation Dataset Creation","text":"Creating evaluation datasets involves sampling scenarios from existing production or development interactions, covering the full breadth of use cases and unexpected ones.","parentId":"n138","children":[]},{"id":"n142","type":"DCSN","label":"Evaluation Review","text":"Evaluation results should always be reviewed by a domain expert before acceptance as valid.","parentId":"n138","children":[]},{"id":"n143","type":"INSG","label":"Product Manager Responsibility","text":"Curation and maintenance of these evaluations are increasingly a key responsibility for Product Managers with support from domain experts.","parentId":"n138","children":[]}]},{"id":"n144","type":"CONC","label":"Metrics-Driven Development: Go/No-Go","text":"Automating dozens of evaluation scenarios and establishing trusted quality scores allow confident testing of development agent changes.","parentId":"n131","children":[{"id":"n145","type":"DETL","label":"Deployment Process","text":"Run new versions against the evaluation dataset, compare scores to existing production versions, and use A/B deployments for maximum safety.","parentId":"n144","children":[]},{"id":"n146","type":"DETL","label":"Important Factors","text":"Beyond automated evaluations, important factors include latency, cost, and task success rates, which should be compared in A/B deployments.","parentId":"n144","children":[]}]},{"id":"n147","type":"CONC","label":"Debug with OpenTelemetry Traces","text":"OpenTelemetry traces provide a high-fidelity, step-by-step recording of the agent's entire execution path, essential for understanding 'why' metrics dip or bugs 
occur.","parentId":"n131","children":[{"id":"n148","type":"DETL","label":"Trace Details","text":"Traces expose the exact prompt, model's internal reasoning, chosen tool, parameters, and raw observation data.","parentId":"n147","children":[]},{"id":"n149","type":"JUST","label":"Debugging Utility","text":"Trace details diagnose and fix root causes of issues, providing deep insights primarily for debugging, not performance overviews.","parentId":"n147","children":[]},{"id":"n150","type":"DETL","label":"Trace Data Collection","text":"Trace data can be collected seamlessly in platforms like Google Cloud Trace, streamlining root cause analysis by visualizing and searching traces.","parentId":"n147","children":[]}]},{"id":"n151","type":"CONC","label":"Cherish Human Feedback","text":"Human feedback is the most valuable and data-rich resource for improving agents, serving as 'gifts' for new real-world edge cases.","parentId":"n131","children":[{"id":"n152","type":"DETL","label":"Feedback Collection","text":"Collecting and aggregating bug reports or 'thumbs down' feedback is critical to generate insights and trigger alerts for operational issues.","parentId":"n151","children":[]},{"id":"n153","type":"JUST","label":"Closing the Loop","text":"An effective Agent Ops process 'closes the loop' by capturing feedback, replicating the issue, and converting it into a new, permanent test case.","parentId":"n151","children":[]},{"id":"n154","type":"INSG","label":"System Vaccination","text":"Closing the loop ensures the bug is fixed and the system is 'vaccinated' against that entire class of error recurring.","parentId":"n151","children":[]}]}]},{"id":"n155","type":"CONC","label":"Agent Interoperability","text":"Interconnecting high-quality agents with users and other agents is crucial for bringing agents into a wider ecosystem, akin to the 'face of the Agent'.","parentId":"n1","children":[{"id":"n156","type":"DETL","label":"Agents Are Not Tools","text":"There is a distinction between 
connecting to agents and connecting agents with data and APIs; agents are not tools themselves.","parentId":"n155","children":[]}]},{"id":"n157","type":"CONC","label":"Agents and Humans","text":"The most common form of agent-human interaction is through a user interface, ranging from chatbots to rich, dynamic front-end experiences.","parentId":"n1","children":[{"id":"n158","type":"DETL","label":"HITL Interaction Patterns","text":"Human-in-the-Loop (HITL) patterns include intent refinement, goal expansion, confirmation, and clarification requests.","parentId":"n157","children":[]},{"id":"n159","type":"DETL","label":"LM Control of UI","text":"Computer use is a tool category where the LM controls a user interface, with human interaction and oversight, navigating pages or pre-filling forms.","parentId":"n157","children":[]},{"id":"n160","type":"DETL","label":"Dynamic UI Adaptation","text":"The LM can change the UI to meet needs via Tools controlling UI (MCP UI), specialized messaging (AG UI), or generating bespoke interfaces (A2UI).","parentId":"n157","children":[]},{"id":"n161","type":"DETL","label":"Multimodal Communication","text":"Advanced agents are breaking the text barrier with real-time, multimodal communication in 'live mode' for a natural connection.","parentId":"n157","children":[]},{"id":"n162","type":"DETL","label":"Gemini Live API","text":"Technologies like the Gemini Live API enable bidirectional streaming, allowing users to speak to and interrupt agents as in natural conversation.","parentId":"n157","children":[]},{"id":"n163","type":"INSG","label":"Enhanced Agent-Human Collaboration","text":"With camera and microphone access, agents can see and hear users, responding with generated speech at human-like latency, fundamentally changing collaboration.","parentId":"n157","children":[]}]},{"id":"n164","type":"CONC","label":"Agents and Agents","text":"As enterprises scale AI, agents must connect with each other, requiring a common standard for discovery and 
communication.","parentId":"n1","children":[{"id":"n165","type":"DETL","label":"Challenges of Agent Interconnection","text":"Without a common standard, connecting different specialized agents from various teams would create a tangled web of brittle, custom API integrations.","parentId":"n164","children":[]},{"id":"n166","type":"SUBC","label":"Agent2Agent (A2A) Protocol","text":"The Agent2Agent (A2A) protocol is an open standard designed to solve agent communication, acting as a universal handshake for the agentic economy.","parentId":"n164","children":[{"id":"n167","type":"DETL","label":"Agent Card","text":"A2A allows any agent to publish a digital 'business card' (Agent Card), a simple JSON file advertising capabilities, network endpoint, and security credentials.","parentId":"n166","children":[]},{"id":"n168","type":"JUST","label":"Standardized Discovery","text":"Agent Cards make discovery simple and standardized for agent communication.","parentId":"n166","children":[]},{"id":"n169","type":"DETL","label":"Agent Communication Distinction","text":"Unlike MCP, which handles transactional requests, Agent2Agent (A2A) communication is typically used for additional problem solving.","parentId":"n166","children":[]}]},{"id":"n170","type":"SUBC","label":"Task-Oriented Architecture","text":"Once discovered, agents communicate using a task-oriented architecture, framing interactions as asynchronous 'tasks' instead of simple request-response.","parentId":"n164","children":[{"id":"n171","type":"DETL","label":"Streaming Updates","text":"A client agent sends a task request to a server agent, which can provide streaming updates over a long-running connection.","parentId":"n170","children":[]}]},{"id":"n172","type":"INSG","label":"Interoperable Ecosystem","text":"This robust, standardized communication protocol enables collaborative, Level 3 multi-agent systems and transforms isolated agents into an interoperable 
ecosystem.","parentId":"n164","children":[]}]},{"id":"n173","type":"CONC","label":"Agents and Money","text":"As AI agents perform more tasks, some involve buying, selling, or facilitating transactions, creating a trust crisis if something goes wrong.","parentId":"n1","children":[{"id":"n174","type":"INSG","label":"Trust Crisis in Agentic Economy","text":"If an autonomous agent clicks 'buy,' it creates a crisis of trust regarding fault, authorization, authenticity, and accountability.","parentId":"n173","children":[]},{"id":"n175","type":"JUST","label":"Unlocking Agentic Economy","text":"To unlock a true agentic economy, new standards are needed that allow agents to transact securely and reliably on behalf of their users.","parentId":"n173","children":[]},{"id":"n176","type":"SUBC","label":"Agent Payments Protocol (AP2)","text":"AP2 is an open protocol designed as the definitive language for agentic commerce, extending A2A by introducing cryptographically-signed digital 'mandates'.","parentId":"n173","children":[{"id":"n177","type":"DETL","label":"Verifiable User Intent","text":"Digital mandates act as verifiable proof of user intent, creating a non-repudiable audit trail for every transaction.","parentId":"n176","children":[]},{"id":"n178","type":"JUST","label":"Global Transaction Capability","text":"This allows agents to securely browse, negotiate, and transact on a global scale based on delegated authority from the user.","parentId":"n176","children":[]}]},{"id":"n179","type":"SUBC","label":"x402 Protocol","text":"x402 is an open internet payment protocol using the standard HTTP 402 'Payment Required' status code.","parentId":"n173","children":[{"id":"n180","type":"JUST","label":"Frictionless Micropayments","text":"It enables frictionless machine-to-machine micropayments, allowing agents to pay for API access or digital content on a pay-per-use basis without complex accounts.","parentId":"n179","children":[]}]},{"id":"n181","type":"INSG","label":"Foundational 
Trust Layer","text":"Together, AP2 and x402 protocols are building the foundational trust layer for the agentic web.","parentId":"n173","children":[]}]},{"id":"n182","type":"CONC","label":"Securing a Single Agent: Trust Trade-Off","text":"When creating an AI agent, there's a fundamental tension between utility and security, as granting power introduces risk.","parentId":"n1","children":[{"id":"n183","type":"DETL","label":"Agent Power and Risk","text":"To be useful, agents need autonomy to make decisions and tools to perform actions like sending emails or querying databases.","parentId":"n182","children":[]},{"id":"n184","type":"DETL","label":"Primary Security Concerns","text":"Primary security concerns are rogue actions—unintended or harmful behaviors—and sensitive data disclosure.","parentId":"n182","children":[]},{"id":"n185","type":"DCSN","label":"Defense-in-Depth Approach","text":"Managing agent security requires a hybrid, defense-in-depth approach, rather than relying solely on the AI model's judgment due to manipulation risks.","parentId":"n182","children":[]},{"id":"n186","type":"SUBC","label":"Deterministic Guardrails","text":"The first security layer consists of traditional, deterministic guardrails—hardcoded rules acting as a security chokepoint outside the model's reasoning.","parentId":"n182","children":[{"id":"n187","type":"EXMP","label":"Guardrail Example","text":"A policy engine blocking purchases over $100 or requiring explicit user confirmation before external API interaction is an example.","parentId":"n186","children":[]},{"id":"n188","type":"JUST","label":"Guardrail Benefit","text":"This layer provides predictable, auditable hard limits on the agent's power.","parentId":"n186","children":[]}]},{"id":"n189","type":"SUBC","label":"Reasoning-Based Defenses","text":"The second layer uses AI to secure AI, training models to be resilient to attacks and employing specialized 'guard models' as security 
analysts.","parentId":"n182","children":[{"id":"n190","type":"DETL","label":"Guard Model Function","text":"Guard models examine the agent's proposed plan before execution, flagging potentially risky or policy-violating steps for review.","parentId":"n189","children":[]},{"id":"n191","type":"JUST","label":"Robust Security Posture","text":"This hybrid model, combining rigid code certainty with contextual AI awareness, creates a robust security posture ensuring agent power aligns with its purpose.","parentId":"n189","children":[]}]},{"id":"n192","type":"SUBC","label":"Agent Identity: New Principal Class","text":"Agent identity represents a new class of principal beyond human users and services, requiring its own verifiable identity.","parentId":"n1","children":[{"id":"n193","type":"INSG","label":"IAM Paradigm Shift","text":"This is a fundamental shift in how Identity and Access Management (IAM) must be approached in the enterprise.","parentId":"n192","children":[]},{"id":"n194","type":"JUST","label":"Bedrock of Agent Security","text":"Verifying each identity and having access controls is the bedrock of agent security.","parentId":"n192","children":[]},{"id":"n195","type":"DETL","label":"Verifiable Digital Passport","text":"An agent needs a cryptographically verifiable identity, often using standards like SPIFFE, analogous to an employee ID badge.","parentId":"n192","children":[]},{"id":"n196","type":"DETL","label":"Least-Privilege Permissions","text":"Once identified, an agent can be granted specific, least-privilege permissions; for example, a SalesAgent is granted CRM access while an HRonboardingAgent is denied it.","parentId":"n192","children":[]},{"id":"n197","type":"JUST","label":"Containment of Compromise","text":"Granular control ensures that even if a single agent is compromised, the potential blast radius is contained.","parentId":"n192","children":[]},{"id":"n198","type":"DETL","label":"Delegated Authority","text":"Without an agent identity construct, agents cannot work on 
behalf of humans with limited delegated authority.","parentId":"n192","children":[]},{"id":"n199","type":"CMPR","label":"Authentication Categories","text":"Different principal entities—users, agents, and service accounts—have distinct authentication and verification methods.","table":{"cols":["Principal entity","Authentication / Verification","Notes"],"rows":[{"label":"Users","cells":["Authenticated with OAuth or SSO","Human actors with full autonomy and responsibility for their actions"]},{"label":"Agents (new category of principals)","cells":["Verified with SPIFFE","Agents have delegated authority, taking actions on behalf of users"]},{"label":"Service accounts","cells":["Integrated into IAM","Applications and containers, fully deterministic, not responsible for actions"]}]},"parentId":"n192","children":[]}]},{"id":"n200","type":"SUBC","label":"Policies to Constrain Access","text":"Policies are a form of authorization (AuthZ), distinct from authentication (AuthN), used to limit a principal's capabilities.","parentId":"n1","children":[{"id":"n201","type":"EXMP","label":"Policy Example","text":"An example policy is 'Users in Marketing can only access these 27 API endpoints and cannot execute DELETE commands'.","parentId":"n200","children":[]},{"id":"n202","type":"JUST","label":"Principle of Least Privilege","text":"The recommended approach for agents is to constrain access to only the capabilities required for their jobs, applying the principle of least privilege.","parentId":"n200","children":[]}]},{"id":"n203","type":"SUBC","label":"Securing an ADK Agent","text":"Securing an agent built with the Agent Development Kit (ADK) involves practical application of identity and policy concepts through code and configuration.","parentId":"n1","children":[{"id":"n204","type":"DETL","label":"Identity Definition","text":"The process requires clear definition of user accounts (OAuth), service accounts (to run code), and agent identities (to use delegated 
authority).","parentId":"n203","children":[]},{"id":"n205","type":"DETL","label":"Policy Enforcement","text":"After authentication, policies constrain access to services, often done at the API governance layer with MCP and A2A services.","parentId":"n203","children":[]},{"id":"n206","type":"DETL","label":"Guardrails in Tools/Models","text":"The next layer involves building guardrails into tools, models, and sub-agents to enforce policies.","parentId":"n203","children":[]},{"id":"n207","type":"JUST","label":"Predictable Security Baseline","text":"This ensures that tool logic refuses unsafe or out-of-policy actions, providing a predictable and auditable security baseline through concrete, reliable code.","parentId":"n203","children":[]},{"id":"n208","type":"DETL","label":"Callbacks and Plugins","text":"For dynamic security, ADK provides Callbacks and Plugins; a before_tool_callback inspects parameters of a tool call before it runs.","parentId":"n203","children":[]},{"id":"n209","type":"DETL","label":"Gemini as a Judge","text":"A common plugin pattern is 'Gemini as a Judge', using a fast, inexpensive model like Gemini Flash-Lite to screen inputs and outputs for prompt injections or harmful content.","parentId":"n203","children":[]},{"id":"n210","type":"SUBC","label":"Model Armor Service","text":"Model Armor is an optional managed service for dynamic checks, screening prompts and responses for prompt injection, jailbreak attempts, PII leakage, and malicious URLs.","parentId":"n203","children":[{"id":"n211","type":"JUST","label":"Model Armor Benefits","text":"Offloading complex security tasks to Model Armor ensures consistent, robust protection without developers needing to build and maintain guardrails.","parentId":"n210","children":[]}]},{"id":"n212","type":"INSG","label":"Hybrid Security Approach","text":"Combining strong identity, deterministic in-tool logic, dynamic AI-powered guardrails, and managed services like Model Armor builds a powerful and trustworthy 
single agent.","parentId":"n203","children":[]}]},{"id":"n213","type":"CONC","label":"Scaling Up to Enterprise Fleet","text":"Scaling AI agents from a single triumph to a fleet of hundreds across an enterprise presents architectural challenges beyond primary security concerns.","parentId":"n1","children":[{"id":"n214","type":"INSG","label":"Agent Sprawl","text":"When agents and tools proliferate across an organization, the result can be complexity akin to 'API sprawl', requiring systems to handle much more than just individual agent security.","parentId":"n213","children":[]}]},{"id":"n215","type":"CONC","label":"Security and Privacy: Agentic Frontier","text":"An enterprise-grade platform must address unique security and privacy challenges inherent to generative AI, even with a single agent.","parentId":"n1","children":[{"id":"n216","type":"DETL","label":"New Attack Vectors","text":"The agent itself becomes a new attack vector vulnerable to prompt injection, data poisoning, and inadvertent leakage of sensitive data.","parentId":"n215","children":[]},{"id":"n217","type":"DETL","label":"Defense-in-Depth Strategy","text":"A robust platform provides a defense-in-depth strategy, starting with ensuring that enterprise data is not used to train base models and is protected by controls like VPC Service Controls.","parentId":"n215","children":[]},{"id":"n218","type":"DETL","label":"Input and Output Filtering","text":"The strategy requires input and output filtering, acting like a firewall for prompts and responses.","parentId":"n215","children":[]},{"id":"n219","type":"DETL","label":"Contractual Protections","text":"The platform must offer contractual protections, like intellectual property indemnity for training data and generated output, giving enterprises legal and technical confidence.","parentId":"n215","children":[]}]},{"id":"n220","type":"CONC","label":"Agent Governance: Control Plane","text":"Managing agent sprawl requires a higher-order architectural approach: a central gateway serving as a control 
plane for all agentic activity.","parentId":"n1","children":[{"id":"n221","type":"INSG","label":"Gateway as Control System","text":"The gateway approach creates a control system, establishing a mandatory entry point for all agentic traffic, including user-to-agent prompts, agent-to-tool calls, and direct LM inference requests.","parentId":"n220","children":[]},{"id":"n222","type":"DETL","label":"Two Interconnected Functions","text":"This control plane serves two primary, interconnected functions: Runtime Policy Enforcement and Centralized Governance.","parentId":"n220","children":[]},{"id":"n223","type":"SUBC","label":"Runtime Policy Enforcement","text":"The gateway acts as the architectural chokepoint for security, handling authentication ('Do I know who this actor is?') and authorization ('Do they have permission to do this?').","parentId":"n220","children":[{"id":"n224","type":"JUST","label":"Observability Benefit","text":"Centralizing enforcement provides a 'single pane of glass' for observability, creating common logs, metrics, and traces for every transaction.","parentId":"n223","children":[]},{"id":"n225","type":"INSG","label":"Transparent and Auditable System","text":"This transforms disparate agents and workflows into a transparent and auditable system.","parentId":"n223","children":[]}]},{"id":"n226","type":"SUBC","label":"Centralized Governance","text":"A central registry, an enterprise app store for agents and tools, provides a source of truth to enforce policies effectively.","parentId":"n220","children":[{"id":"n227","type":"DETL","label":"Registry Benefits","text":"The registry allows developers to discover and reuse assets, preventing redundant work, and gives administrators a complete inventory.","parentId":"n226","children":[]},{"id":"n228","type":"DETL","label":"Formal Lifecycle","text":"It enables a formal lifecycle for agents and tools, allowing security reviews before publication, versioning, and creation of fine-grained 
policies.","parentId":"n226","children":[]}]},{"id":"n229","type":"INSG","label":"Managed, Secure, Efficient Ecosystem","text":"Combining a runtime gateway with a central governance registry transforms chaotic sprawl into a managed, secure, and efficient ecosystem.","parentId":"n220","children":[]}]},{"id":"n230","type":"CONC","label":"Cost and Reliability: Infrastructure Foundation","text":"Enterprise-grade agents must be both reliable and cost-effective, requiring underlying infrastructure to manage these trade-offs securely and compliantly.","parentId":"n1","children":[{"id":"n231","type":"INSG","label":"Negative ROI Factors","text":"An agent that frequently fails or provides slow results has a negative Return on Investment.","parentId":"n230","children":[]},{"id":"n232","type":"INSG","label":"Scaling Challenges","text":"A prohibitively expensive agent cannot scale to meet business demands, hindering its utility.","parentId":"n230","children":[]},{"id":"n233","type":"DETL","label":"Infrastructure Options","text":"Infrastructure needs range from scale-to-zero for irregular traffic to dedicated capacity like Provisioned Throughput for LM services or 99.9% SLAs for runtimes like Cloud Run.","parentId":"n230","children":[]},{"id":"n234","type":"JUST","label":"Predictable Performance","text":"These infrastructure options, coupled with comprehensive monitoring, ensure predictable performance, making important agents responsive even under heavy load.","parentId":"n230","children":[]},{"id":"n235","type":"INSG","label":"Foundation for AI Scaling","text":"This establishes the final, essential foundation for scaling AI agents from innovation into a core, reliable enterprise component.","parentId":"n230","children":[]}]},{"id":"n236","type":"CONC","label":"How Agents Evolve and Learn","text":"Agents deployed in dynamic environments must adapt to changing policies, technologies, and data formats to avoid performance 
degradation.","parentId":"n1","children":[{"id":"n237","type":"DETL","label":"Agent Aging","text":"Without adaptability, an agent's performance degrades over time, a process called 'aging', leading to loss of utility and trust.","parentId":"n236","children":[]},{"id":"n238","type":"JUST","label":"Scalable Learning Solution","text":"Manually updating a large fleet of agents is uneconomical; a more scalable solution is designing agents that learn and evolve autonomously.","parentId":"n236","children":[]}]},{"id":"n239","type":"CONC","label":"How Agents Learn and Self Evolve","text":"Agents learn from experience and external signals, much like humans, using this information to optimize future behavior.","parentId":"n1","children":[{"id":"n240","type":"SUBC","label":"Runtime Experience","text":"Agents learn from session logs, traces, memory, tool interactions, and decision trajectories, including Human-in-the-Loop (HITL) feedback for guidance.","parentId":"n239","children":[]},{"id":"n241","type":"SUBC","label":"External Signals","text":"Learning is driven by new external documents, such as updated enterprise policies, public regulatory guidelines, or critiques from other agents.","parentId":"n239","children":[]},{"id":"n242","type":"SUBC","label":"Enhanced Context Engineering","text":"Advanced systems continuously refine prompts, few-shot examples, and retrieved memory information to optimize context for each task.","parentId":"n239","children":[]},{"id":"n243","type":"SUBC","label":"Tool Optimization and Creation","text":"Agent reasoning identifies capability gaps, leading to gaining access to new tools, creating tools on the fly (e.g., Python scripts), or modifying existing ones.","parentId":"n239","children":[]},{"id":"n244","type":"DETL","label":"Additional Optimization Techniques","text":"Dynamically reconfiguring multi-agent design patterns or using Reinforcement Learning from Human Feedback (RLHF) are active research 
areas.","parentId":"n239","children":[]},{"id":"n245","type":"EXMP","label":"Learning New Compliance Guidelines","text":"An enterprise agent operating in a heavily regulated industry can learn new compliance guidelines using a multi-agent workflow.","parentId":"n239","children":[{"id":"n246","type":"DETL","label":"Querying Agent Role","text":"A Querying Agent retrieves raw data in response to a user request.","parentId":"n245","children":[]},{"id":"n247","type":"DETL","label":"Reporting Agent Role","text":"A Reporting Agent synthesizes retrieved data into a draft report.","parentId":"n245","children":[]},{"id":"n248","type":"DETL","label":"Critiquing Agent Role","text":"A Critiquing Agent, equipped with compliance guidelines, reviews the report and escalates to a human expert when there is ambiguity or final sign-off is needed.","parentId":"n245","children":[]},{"id":"n249","type":"DETL","label":"Learning Agent Role","text":"A Learning Agent observes the interaction, pays attention to human expert feedback, and generalizes it into new, reusable guidelines.","parentId":"n245","children":[]},{"id":"n250","type":"INSG","label":"Autonomous Adaptation Loop","text":"If a human flags data requiring anonymization, the Learning Agent records it, and the Critiquing Agent applies this new rule, reducing human intervention.","parentId":"n245","children":[]}]}]},{"id":"n251","type":"CONC","label":"Simulation and Agent Gym","text":"More advanced approaches involve a dedicated platform, an Agent Gym, engineered to optimize multi-agent systems in offline processes with advanced tooling.","parentId":"n1","children":[{"id":"n252","type":"DETL","label":"Agent Gym Attribute: Standalone","text":"It is not in the execution path, functioning as a standalone off-production platform with assistance from any LM and offline tools.","parentId":"n251","children":[]},{"id":"n253","type":"DETL","label":"Agent Gym Attribute: Simulation","text":"It offers a simulation environment for agents to 'exercise' on new 
data and learn, excellent for 'trial-and-error' with many optimization pathways.","parentId":"n251","children":[]},{"id":"n254","type":"DETL","label":"Agent Gym Attribute: Synthetic Data","text":"It can call advanced synthetic data generators to guide simulation to be realistic and pressure-test agents, including red-teaming and critiquing agents.","parentId":"n251","children":[]},{"id":"n255","type":"DETL","label":"Agent Gym Attribute: Adaptable Tools","text":"The arsenal of optimization tools is not fixed; it can adopt new tools via open protocols or learn new concepts and craft tools.","parentId":"n251","children":[]},{"id":"n256","type":"DETL","label":"Agent Gym Attribute: Human Connection","text":"Agent Gym can connect to human domain experts for consulting on outcomes, guiding optimizations for edge-cases of 'tribal knowledge'.","parentId":"n251","children":[]}]},{"id":"n257","type":"CONC","label":"Examples of Advanced Agents","text":"Examples demonstrate advanced agent capabilities in scientific research and algorithm optimization.","parentId":"n1","children":[{"id":"n258","type":"SUBC","label":"Google Co-Scientist","text":"Co-Scientist is an advanced AI agent designed as a virtual research collaborator to accelerate scientific discovery by systematically exploring complex problem spaces.","parentId":"n257","children":[{"id":"n259","type":"DETL","label":"Co-Scientist Goal","text":"It enables researchers to define a goal, ground the agent in knowledge sources, and generate/evaluate novel hypotheses.","parentId":"n258","children":[]},{"id":"n260","type":"DETL","label":"Ecosystem of Agents","text":"To achieve its goals, Co-Scientist spawns an ecosystem of collaborating agents.","parentId":"n258","children":[]},{"id":"n261","type":"DETL","label":"Supervisor Agent Role","text":"The AI acts as a research project manager, creating a detailed plan and delegating tasks to specialized agents, distributing 
resources.","parentId":"n258","children":[]},{"id":"n262","type":"JUST","label":"Scalability and Improvement","text":"This structure ensures the project can scale easily and that the team's methods improve as they work toward the final goal.","parentId":"n258","children":[]},{"id":"n263","type":"INSG","label":"Continuous Improvement","text":"Various agents work for hours or days, continuously improving generated hypotheses, running loops that refine ideas and judgment methods.","parentId":"n258","children":[]}]},{"id":"n264","type":"SUBC","label":"AlphaEvolve Agent","text":"AlphaEvolve is an advanced agentic system that discovers and optimizes algorithms for complex problems in mathematics and computer science.","parentId":"n257","children":[{"id":"n265","type":"DETL","label":"AlphaEvolve Mechanism","text":"It combines Gemini language models' creative code generation with an automated evaluation system, using an evolutionary process.","parentId":"n264","children":[]},{"id":"n266","type":"DETL","label":"Evolutionary Process","text":"The AI generates potential solutions, an evaluator scores them, and promising ideas inspire the next generation of code.","parentId":"n264","children":[]},{"id":"n267","type":"DETL","label":"Breakthroughs","text":"This approach has led to significant breakthroughs, including improving Google's data centers, chip design, and AI training.","parentId":"n264","children":[]},{"id":"n268","type":"DETL","label":"Matrix Multiplication","text":"AlphaEvolve has discovered faster matrix multiplication algorithms.","parentId":"n264","children":[]},{"id":"n269","type":"DETL","label":"Mathematical Problems","text":"AlphaEvolve has found new solutions to open mathematical problems.","parentId":"n264","children":[]},{"id":"n270","type":"JUST","label":"AlphaEvolve Strength","text":"AlphaEvolve excels at problems where verifying solution quality is easier than finding the solution 
itself.","parentId":"n264","children":[]},{"id":"n271","type":"DETL","label":"Iterative Human-AI Partnership","text":"AlphaEvolve is designed for a deep, iterative partnership between humans and AI, working in two main ways.","parentId":"n264","children":[]},{"id":"n272","type":"SUBC","label":"Transparent Solutions","text":"The AI generates solutions as human-readable code, allowing users to understand logic, gain insights, trust results, and directly modify code.","parentId":"n264","children":[]},{"id":"n273","type":"SUBC","label":"Expert Guidance","text":"Human expertise is essential for defining the problem, refining evaluation metrics, and steering exploration, preventing unintended loopholes.","parentId":"n264","children":[]},{"id":"n274","type":"INSG","label":"Continuous Code Improvement","text":"The agent continuously improves the code, enhancing metrics specified by humans.","parentId":"n264","children":[]}]}]}]},{"id":"n275","type":"CONC","label":"Conclusion","text":"Generative AI agents represent a pivotal evolution, shifting artificial intelligence from a passive tool to an active, autonomous partner in problem-solving.","parentId":"n1","children":[{"id":"n276","type":"INSG","label":"Formal Framework Provided","text":"This document provided a formal framework for understanding and building these systems, moving beyond prototypes to establish reliable, production-grade architecture.","parentId":"n275","children":[]},{"id":"n277","type":"DETL","label":"Three Essential Components","text":"An agent is deconstructed into its three essential components: the reasoning Model ('Brain'), actionable Tools ('Hands'), and governing Orchestration Layer ('Nervous System').","parentId":"n275","children":[]},{"id":"n278","type":"JUST","label":"Agent's True Potential","text":"Seamless integration of these parts, operating in a continuous 'Think, Act, Observe' loop, unlocks an agent's true 
potential.","parentId":"n275","children":[]},{"id":"n279","type":"DETL","label":"Classifying Agentic Systems","text":"Classifying agentic systems from Level 1 Problem-Solver to Level 3 Multi-Agent System helps architects scope ambitions to task complexity.","parentId":"n275","children":[]},{"id":"n280","type":"INSG","label":"New Developer Paradigm","text":"The central challenge lies in a new developer paradigm where developers become 'architects' and 'directors' rather than 'bricklayers' defining explicit logic.","parentId":"n275","children":[]},{"id":"n281","type":"JUST","label":"Source of Unreliability","text":"The flexibility that makes LMs powerful is also the source of their unreliability.","parentId":"n275","children":[]},{"id":"n282","type":"JUST","label":"Success in Engineering Rigor","text":"Success is found in engineering rigor applied to the entire system, including robust tool contracts, resilient error handling, sophisticated context management, and comprehensive evaluation.","parentId":"n275","children":[]},{"id":"n283","type":"INSG","label":"Foundational Blueprint","text":"The outlined principles and architectural patterns serve as a foundational blueprint for navigating this new software frontier.","parentId":"n275","children":[]},{"id":"n284","type":"INSG","label":"Harnessing Agentic AI Power","text":"This disciplined architectural approach will be the deciding factor in harnessing the full power of agentic AI, building collaborative, capable, and adaptable team members.","parentId":"n275","children":[]}]}]},"slug":"sintroduction-to-agentspdf-7de15f","sharedAt":{"_seconds":1780566147,"_nanoseconds":950000000},"title":"From Predictive AI to Autonomous Agents"}