June 29, 2026

The Agentic Operating Model: Beyond the Copilot Hype

M365 FM Podcast

Artificial Intelligence is entering a new phase. The real opportunity is no longer deploying a single Copilot—it is transforming how organizations operate through an Agentic Operating Model.

In this episode of the M365.fm Podcast, we explore why the future of enterprise AI extends far beyond chat interfaces and isolated AI assistants. Instead, organizations must build an operating model where multiple specialized AI agents collaborate across departments, automate business processes, and continuously support decision-making while remaining secure, governed, and aligned with business objectives.

We discuss why simply adding more Copilots does not automatically create more productivity, and how architecture, governance, identity, observability, and lifecycle management become essential as AI adoption scales. The episode introduces the shift from human-centric workflows to AI-assisted and AI-orchestrated business operations, where agents act as intelligent digital coworkers rather than standalone tools.

You'll learn why successful organizations focus on designing reusable AI platforms instead of isolated solutions, how governance enables innovation rather than limiting it, and why operating models—not prompts—will determine long-term AI success.

Whether you're an IT leader, enterprise architect, Microsoft 365 administrator, or business decision-maker, this episode provides practical insights into building scalable, secure, and business-driven AI strategies that move beyond the Copilot hype and prepare your organization for the next generation of intelligent work.


The agentic operating model changes how you use AI in your business. It moves you beyond Copilot-style tools and brings agentic AI into real-time decisioning. With Agent 365, you use governed agent identities and data to improve enterprise decisioning. Agentic systems help you unlock value with AI by automating tasks and enhancing customer experience. The impact is measurable: companies report productivity gains of 2 to 10 hours per employee each week, and 45% of organizations are projected to orchestrate AI agents at scale by 2030. The table below summarizes current AI budget plans, agent adoption, and reported value:

Metric | Value
Percentage of executives planning to increase AI budgets | 88%
Percentage of companies adopting AI agents | 79%
Percentage of adopters reporting measurable value | 66%

You must consider operational, governance, and data requirements from the start to ensure your AI strategy delivers value across customer journeys.

Key Takeaways

  • The agentic operating model allows AI to take on more responsibility in business processes, moving beyond simple assistance.
  • Using agentic AI can save employees 2 to 10 hours each week by automating tasks and improving decision-making.
  • Data quality is crucial for successful AI deployment; ensure your data is accurate and accessible.
  • Establish clear governance and policies for AI agents to enhance accountability and compliance.
  • Engage stakeholders early to build trust and reduce resistance to AI adoption in your organization.
  • Focus on training your workforce in AI skills to maximize the benefits of agentic systems.
  • Start small with AI projects, learn from initial deployments, and scale as confidence grows.
  • Measure the impact of agentic AI initiatives by tracking metrics that align with your business goals.

Agentic Operating Model Overview

What Is the Agentic Operating Model

You see the agentic operating model as a new way to use AI in your business. This model lets you redesign workflows by integrating AI agents that perform knowledge-heavy tasks. These agents do more than assist; they actively participate in executing tasks within your systems and processes. You move from a human-centric execution model to one where AI shares responsibility. This change requires new management approaches and clear governance.

  • The agentic operating model enables you to:
    • Assign AI agents to specific tasks with clear boundaries.
    • Create a workforce where AI agents and humans work together.
    • Build systems that support both human and AI execution.

Microsoft Agent 365 shows how this model works in practice. You can break down complex requests into manageable steps. The agent coordinates actions across Word, Excel, Teams, and Outlook. It continues to work over time, giving you progress updates and checkpoints for intervention.

Why Agentic AI Matters

You need agentic AI because it delivers real value for your organization. It helps you automate tasks, improve customer service, and make better decisions. You see measurable outcomes when you use agentic AI in your workflows.

Key Factor | Explanation
Data hygiene is critical. | Incomplete or incorrect data limits the effectiveness of agentic AI, so you must centralize your data.
Projects require clearly set expectations. | Without defined success metrics, you risk ambiguity and budget cuts.
Cost concerns are especially salient for SMBs. | Financial constraints often lead smaller businesses to abandon projects.

Agentic AI uses both static and liquid context. Static context includes documented knowledge and procedures. Liquid context reflects the real-time state of work across your organization. Agent 365 leverages real-time signals, so you get decisions that match your current operations. You move beyond document retrieval and into intelligent decision-making.
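
You can make the difference between static and liquid context concrete with a small Python sketch. The refund scenario, the class names, and the field names below are illustrative assumptions, not part of Agent 365 or any Microsoft API. The point is only that a decision reads a documented rule and a live signal together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticContext:
    """Documented knowledge that changes rarely (hypothetical refund policy)."""
    max_auto_refund: float  # refunds above this amount always need a human


@dataclass(frozen=True)
class LiquidContext:
    """Real-time state of work at decision time (hypothetical signal)."""
    open_incident: bool  # True while a live incident affects this account


def decide_refund(amount: float, static: StaticContext, liquid: LiquidContext) -> str:
    """Return 'approve' or 'escalate' using both kinds of context."""
    # Static context: the documented rule sets a hard ceiling.
    if amount > static.max_auto_refund:
        return "escalate"
    # Liquid context: a live incident changes what is appropriate right now.
    if liquid.open_incident:
        return "escalate"
    return "approve"


policy = StaticContext(max_auto_refund=100.0)
print(decide_refund(40.0, policy, LiquidContext(open_incident=False)))
print(decide_refund(40.0, policy, LiquidContext(open_incident=True)))
```

The same request gets a different answer once the live signal changes, even though the documented policy stays the same.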

Tip: You should focus on data quality and clear goals when deploying agentic AI. This approach helps you maximize value and avoid common pitfalls.

Key Differences from Copilot AI

You notice clear differences between agentic AI and Copilot AI. Copilot AI relies on human prompts and works as an advanced chat interface. Agentic AI manages complex workflows and makes decisions using integrated data.

Feature/Capability | Agentic AI | Copilot AI
Workflow Handling | Autonomously manages complex workflows without human intervention | Requires human prompts to function
Decision Making | Makes decisions based on integrated data from various systems | Limited to generative responses
Customer Service Interaction | Handles full lifecycle of service interactions autonomously | Functions as a decision tree chatbot
Operational Cost Reduction Prediction | Expected to resolve 80% of issues without human help, reducing costs by 30% | Not applicable
Research and Analysis Capability | Synthesizes findings from multiple sources into structured analysis | Limited to user-provided information

Agentic AI shifts your business from reactive automation to proactive resolution. You see agents that anticipate needs and act before problems arise. Agent 365 coordinates multiple steps and app interactions, delivering end-to-end outcomes instead of isolated outputs. This shift represents a fundamental change in how you manage and execute tasks.

  • You benefit from:
    1. AI that coordinates actions across applications.
    2. End-to-end outcomes rather than single outputs.
    3. Visible progress and checkpoints for human review.

You unlock greater value with the agentic operating model. You build systems that reason, coordinate, and act at machine speed. You move your organization into a new era of AI-driven transformation.

Copilot Hype vs. Agentic AI

Copilot AI Limitations

Task Boundaries

You may find Copilot AI helpful for simple tasks, but it often struggles with complex workflows. Copilot AI works best when you give it clear, single-step instructions. It cannot manage tasks that require multiple steps or coordination across different systems. This limitation creates boundaries that restrict how much value you can extract from your AI investment.

The following table highlights some of the most common limitations you might encounter with Copilot AI in enterprise settings:

Limitation Type | Description
Unauthorized actions | Risks of compromised agents or connectors leading to misuse of access to sensitive information.
Data exfiltration | Complex data flows from various sources increase the risk of data leakage.
Inadvertent financial commitments | Automation may lead to unintended purchases or bookings without proper confirmations.
Auditability and forensics | Need for reliable logs and trails for long-running automated tasks to ensure accountability.
Site fragility and brittle automation | Changes in web interfaces can disrupt automation, leading to incorrect outcomes.

You see that these boundaries limit the effectiveness of Copilot AI. You may need to step in frequently to correct errors or handle exceptions.

Human Dependency

Copilot AI depends heavily on your input. You must prompt it for every action, and you remain responsible for the outcome. This dependency means you cannot fully automate processes or trust the AI to handle tasks independently. You may notice that Copilot AI supports your thinking and preparation, but you still own the results.

Note: Copilot AI can help you with routine questions, but it cannot drive progress on its own.

Need for Agentic Operating Model

You need a new approach when you want to move beyond these limitations. The agentic operating model gives you a way to redesign workflows so that AI agents can own task progression and system progress. With agentic AI, you shift from simply retrieving documents to making intelligent decisions based on real-time data.

The table below compares how workflow redesign and value generation differ between Copilot AI and agentic models:

Model | Workflow Redesign | Value Generation
Copilot AI | AI supports thinking and preparation | Human owns the outcome
Agentic AI | AI owns task progression | System owns progress

Agentic AI, as seen in Microsoft Agent 365, moves you past the limitations of Copilot AI. You can assign agents to handle entire processes, not just single tasks. These agents use both static and liquid context to make decisions that reflect the current state of your business. You gain more value because the system can act proactively, reducing your need to intervene.

You see a real impact when you adopt the agentic operating model. You empower your organization to automate complex workflows, improve accountability, and generate measurable results. This shift helps you unlock the full potential of AI and prepares your business for the future.

Agentic Operating Model Architecture

The agentic operating model gives you a clear structure for building enterprise-ready autonomous AI. You can see how each layer works together to create agentic systems that reason, collaborate, and act with minimal human input. This architecture helps you scale agentic AI across your organization while keeping control and oversight.

Layer | Description
Cognitive Layer | Defines how intelligence is instantiated, using multiple specialized models for specific tasks.
Coordination Layer | Governs agent interactions, shifting from centralized orchestration to decentralized swarm intelligence.
Control Layer | Addresses the management of agent performance and decision-making processes.
Governance Layer | Ensures compliance and oversight across the operating model.

Cognitive Layer

The cognitive layer forms the "brain" of agentic systems. You use this layer to give your autonomous AI the ability to understand, reason, and adapt. In Agent 365, the cognitive layer uses multiple specialized models to handle different tasks. This approach lets you match the right intelligence to each business need.

Reasoning and Adaptation

You want your agentic AI to do more than follow scripts. The cognitive layer uses advanced mechanisms to help agents think and learn. These mechanisms include:

Mechanism | Description
Cognitive Loop Architecture | Agents follow a cycle of perception, analysis, planning, action, and learning. This loop helps them improve decisions and adapt to new situations.
Contextual Understanding | Agents keep track of the situation and use context to give relevant responses. This prevents mistakes and keeps actions appropriate.
Problem Decomposition | Agents break down complex problems into smaller steps. This makes it easier to solve big challenges and use resources wisely.
Goal-Driven Planning | Agents set goals, plan steps, and adjust as things change. This keeps them focused and flexible.
Knowledge Representation | Agents use knowledge graphs and smart search to find the right information fast. This supports better decisions across large data sets.
Adaptability Features | Agents update their plans when they get new information. They use learning algorithms to improve without losing stability or performance.

You benefit from agentic systems that can reason through complex scenarios and adapt to changes in real time. This makes your autonomous AI more reliable and valuable.

Coordination Layer

The coordination layer helps your agentic systems work together. You need this layer to manage how multiple autonomous AI agents interact, share tasks, and avoid conflicts. In Agent 365, the coordination layer supports both centralized and decentralized collaboration.

Multi-Agent Collaboration

You can use several patterns to coordinate agentic AI in your enterprise:

Mechanism/Pattern | Description
Policy Constraints | Agents follow enterprise rules for data privacy and compliance.
Audit Logs | The system records all agent actions for accountability.
Escalation Triggers | Agents alert humans when they find problems or exceptions.
Sandboxing | High-risk actions run in safe environments before going live.
Centralized Orchestrator | A manager agent assigns tasks to worker agents, like a team leader.
Hierarchical Orchestration | Multiple layers of agents manage and execute tasks, similar to a company org chart.
Decentralized Swarms | Agents self-organize and negotiate, working together like a swarm of ants.
Governed Context Layer | The system resolves conflicts and provides a shared understanding for all agents.

You gain flexibility by choosing the right collaboration pattern for your needs. This layer lets your agentic systems scale from small teams to large, enterprise-wide networks of autonomous AI.
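
The centralized orchestrator pattern is the easiest one to picture in code. The sketch below assumes a hypothetical `Orchestrator` class and plain Python callables standing in for worker agents. It does not reflect how Agent 365 implements coordination. It only shows a manager routing tasks by skill and escalating when no worker owns a task.

```python
from typing import Callable, Dict, List, Tuple

# A worker is any callable that takes a task payload and returns a result.
Worker = Callable[[str], str]


class Orchestrator:
    """Hypothetical manager agent that routes each task to a worker by skill."""

    def __init__(self) -> None:
        self._workers: Dict[str, Worker] = {}

    def register(self, skill: str, worker: Worker) -> None:
        """Make a worker responsible for one named skill."""
        self._workers[skill] = worker

    def run(self, tasks: List[Tuple[str, str]]) -> List[str]:
        """Dispatch (skill, payload) pairs in order and collect the results."""
        results = []
        for skill, payload in tasks:
            worker = self._workers.get(skill)
            if worker is None:
                # No worker owns this skill: escalate instead of guessing.
                results.append(f"escalated: no worker for '{skill}'")
            else:
                results.append(worker(payload))
        return results


manager = Orchestrator()
manager.register("summarize", lambda text: f"summary of {text}")
manager.register("schedule", lambda text: f"meeting booked: {text}")

outcome = manager.run([
    ("summarize", "Q3 report"),
    ("schedule", "budget review"),
    ("translate", "contract"),
])
print(outcome)
```

A hierarchical setup would nest orchestrators as workers of a higher-level orchestrator. A decentralized swarm would drop the single routing table entirely.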

Control Layer

The control layer ensures your agentic systems act safely and effectively. You use this layer to manage how autonomous AI executes tasks and makes decisions. Agent 365 uses several mechanisms to keep agent actions reliable and within set boundaries.

Task Execution

You can trust your agentic AI to perform tasks because the control layer uses:

Mechanism | Function
Confidence Thresholds | Agents only act when their reasoning meets a set confidence score. This blocks risky actions.
Behavioral Baselines | The system sets limits on what agents can do, preventing errors or overreach.
Guardrail Agents | Special agents monitor outputs and step in if actions go beyond safe limits.

You see fewer mistakes and more consistent results. The control layer gives you peace of mind as you scale agentic systems and autonomous AI across your business.
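
A minimal sketch shows how a confidence threshold and a guardrail check can combine into one gate. The `ProposedAction` fields, the default threshold of 0.8, and the spend limit of 500 are assumptions chosen for illustration. Real thresholds depend on your own risk appetite and on how your agents report confidence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedAction:
    name: str
    confidence: float  # agent's self-reported score in [0, 1] (assumed)
    amount: float      # monetary impact of the action (assumed field)


def guardrail_allows(action: ProposedAction, spend_limit: float) -> bool:
    """Hypothetical guardrail: reject actions above a fixed spend limit."""
    return action.amount <= spend_limit


def control_gate(
    action: ProposedAction,
    threshold: float = 0.8,
    spend_limit: float = 500.0,
) -> str:
    """Return 'execute', 'escalate', or 'block' for a proposed action."""
    # Hard limits come first: no confidence score overrides a guardrail.
    if not guardrail_allows(action, spend_limit):
        return "block"
    # Below the threshold the agent hands the decision to a human.
    if action.confidence < threshold:
        return "escalate"
    return "execute"


print(control_gate(ProposedAction("send_reminder", 0.95, 0.0)))
print(control_gate(ProposedAction("issue_credit", 0.55, 120.0)))
print(control_gate(ProposedAction("place_order", 0.99, 5000.0)))
```

Checking the guardrail before the confidence score is a deliberate choice in this sketch: a highly confident agent still cannot exceed a hard limit.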

Tip: When you design agentic operating model architectures, focus on how each layer supports autonomy, safety, and collaboration. This approach helps you build robust agentic systems that deliver real value.

Governance Layer

Policy and Identity Management

You need strong governance to manage agentic AI in your organization. The governance layer gives you control over agent identities, policies, and lifecycle management. This layer helps you build trust and accountability. You see how Agent 365 uses these principles to create a secure and reliable environment for digital workers.

Agent identities play a key role in agentic operating models. You assign each agent a unique identity. This identity tracks ownership, actions, and permissions. You move away from generic service accounts and use dedicated agent personas. This change improves traceability and makes it easier to audit agent activity.

You must manage the lifecycle of each agent. You create agents, monitor their actions, and retire them when they are no longer needed. You use clear policies to define what agents can do. You set boundaries for data access and task execution. You enforce these rules with real-time monitoring and compliance checks.

Agent 365 uses a layered governance framework. You see how each layer supports policy and identity management:

Layer | Name | Description
1 | Identity & Persona Registry | Centralized record for agent ownership and accountability, enforcing least privilege.
2 | Orchestration & Mediation Layer | Manages communication and conflict resolution between agents, ensuring policy adherence.
3 | Context & Memory Layer | Ensures continuity and HIPAA compliance while managing sensitive data access.
4 | Guardrail & Compliance Layer | Provides real-time monitoring and oversight to prevent unauthorized actions.
5 | Lifecycle & Decommissioning Layer | Manages the entire lifecycle of agents, from creation to decommissioning, ensuring accountability.

You use the Identity & Persona Registry to keep a record of every agent. You enforce least privilege, so agents only access what they need. The Orchestration & Mediation Layer helps agents communicate and resolve conflicts. You make sure agents follow your policies at all times.

The Context & Memory Layer protects sensitive data. You ensure compliance with regulations like HIPAA. You use the Guardrail & Compliance Layer to monitor agent actions in real time. You prevent unauthorized activity and keep your systems safe.

You manage the entire lifecycle of agents with the Lifecycle & Decommissioning Layer. You track agents from creation to retirement. You keep your environment clean and accountable.
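
You can sketch the registry, least-privilege, and lifecycle ideas together in a few lines of Python. The `AgentRegistry` class, the permission strings, and the agent names below are hypothetical. They are not the Microsoft Entra or Agent 365 API. They only illustrate the rule that every agent has one owner, an explicit permission set, and a retirement step.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable


@dataclass
class AgentRecord:
    agent_id: str
    owner: str                   # accountable business owner
    permissions: FrozenSet[str]  # explicit grants only (least privilege)
    active: bool = True


class AgentRegistry:
    """Hypothetical in-memory identity and persona registry."""

    def __init__(self) -> None:
        self._agents: Dict[str, AgentRecord] = {}

    def register(self, agent_id: str, owner: str, permissions: Iterable[str]) -> None:
        if agent_id in self._agents:
            raise ValueError(f"agent '{agent_id}' is already registered")
        self._agents[agent_id] = AgentRecord(agent_id, owner, frozenset(permissions))

    def is_allowed(self, agent_id: str, permission: str) -> bool:
        """Deny by default: unknown or retired agents get nothing."""
        record = self._agents.get(agent_id)
        return record is not None and record.active and permission in record.permissions

    def retire(self, agent_id: str) -> None:
        """Deactivate the agent but keep its record for later audits."""
        self._agents[agent_id].active = False

    def owner_of(self, agent_id: str) -> str:
        return self._agents[agent_id].owner


registry = AgentRegistry()
registry.register("invoice-agent", "finance-ops", {"read:invoices"})
print(registry.is_allowed("invoice-agent", "read:invoices"))
print(registry.is_allowed("invoice-agent", "write:payments"))
registry.retire("invoice-agent")
print(registry.is_allowed("invoice-agent", "read:invoices"))
```

Retirement keeps the record instead of deleting it, so ownership stays traceable after the agent stops working.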

Tip: You should review agent identities and policies regularly. This practice helps you maintain security and compliance as your agentic AI grows.

You embed autonomy within explicit governance layers. You give agents freedom to act, but you set clear boundaries. You use policy enforcement, identity management, and real-time oversight to scale agentic AI safely. You build a foundation for trust and accountability in your enterprise.

Organizational Readiness for Agentic AI

Readiness Assessment

You must assess your organization’s readiness before you deploy agentic AI. This process starts with understanding your unique enterprise processes. You need to ensure transparency and trust in your AI systems. Addressing human factors, such as identity disruption and trust navigation, is essential. Preparing your workflows for AI integration helps you avoid common pitfalls.

  • Assess your current workflows and identify areas for AI integration.
  • Build transparency into your AI systems to foster trust.
  • Prepare your teams for changes in roles and responsibilities.
  • Address concerns about identity and accountability as you move to agent identities.

Culture and Leadership

Your culture and leadership set the tone for successful AI adoption. Leaders must champion the transformation and communicate a clear vision. Employees need to feel supported as they adapt to new ways of working. Transparent communication helps reduce resistance and builds trust in agentic systems.

Skills and Talent

You need the right skills and talent to support agentic AI. Upskill your workforce in areas like data literacy, AI ethics, and digital collaboration. Consider partnering with AI consulting services to fill gaps and accelerate your journey. A skilled team ensures your agentic model delivers measurable business impact.

Data Requirements

Quality and Accessibility

High-quality data is essential for agentic AI systems. Your data must be accurate, complete, and properly formatted for AI consumption. Data quality often determines whether your deployment succeeds or fails.

Evidence | Description
High-quality data is essential for the success of agentic AI systems. | Organizations should ensure that their data is accurate, complete, and properly formatted for AI consumption.
Data quality is widely regarded as the single most influential factor in determining whether an agentic AI deployment succeeds or fails in production. | This highlights the critical role of data quality in the deployment process.
AI agents are only as reliable as the data they operate on. | Poorly governed enterprise data can lead to incorrect decisions by AI agents.
Data quality, accessibility, and lineage are foundational requirements. | Organizations must address these areas before deploying agentic AI.

You should also ensure data accessibility. Make sure your systems allow agents to access the information they need while maintaining security and compliance.

Real-Time Integration

Agentic systems require real-time data integration. Many organizations struggle with structural misalignment when integrating AI into existing workflows. Traditional hierarchies may clash with the cross-functional collaboration needed for agentic models. You must design workflows that support real-time signals and decision-making. Security frameworks should address both intended and unexpected behaviors of AI agents.

Change Management

Stakeholder Engagement

Engage stakeholders early in your transformation. Employees may resist AI adoption due to fears of job loss. Transparent communication and involvement help build trust and reduce resistance. You should explain how agentic systems improve customer experience and create new opportunities.

Training and Upskilling

Training and upskilling are vital for successful AI operating models. Provide ongoing education on new tools, processes, and best practices. Support your teams as they transition from service accounts to agent identities. Registration, ownership management, and retirement controls ensure accountability and trust.

Tip: Start with a clear plan for agent registration and ownership. Assign unique identifiers to each agent and establish clear lines of responsibility.

By focusing on readiness, data quality, and change management, you prepare your organization for agentic AI. This approach helps you get more value from automation, drive business impact, and achieve sustainable AI adoption.

Agentic AI in Practice

Success Stories

Industry Examples

You can see how organizations use Agent 365 to transform their operations. Many companies across different industries have adopted this agentic operating model. They report improvements in speed, quality, and decision-making. For example, KPMG uses Agent 365 to deliver services faster and with more consistency. Integra LifeSciences relies on it for better operational performance and quicker, data-driven decisions. Other organizations have seen lower costs per claim, faster processing, and improved customer outcomes.

Organization | Measurable Outcomes
KPMG | Improved speed, quality, and consistency in service delivery
Integra LifeSciences | Enhanced operational performance and faster, data-driven decisions
General | Lower cost per claim, faster processing, better customer outcomes

These examples show how agentic AI can help you achieve real business value. You can automate complex workflows and make smarter decisions in real time.

Measurable Outcomes

When you deploy agentic AI, you often see measurable results. Companies report faster turnaround times and higher customer satisfaction. You may notice reduced operational costs and fewer manual errors. Agent 365 helps you track these outcomes, so you can prove the value of your investment. Many organizations find that agentic AI leads to better compliance and more reliable processes.

Lessons from Failure

Common Pitfalls

Not every agentic AI project succeeds. Some organizations face challenges that lead to project failure. The most common reasons include unclear business value, poor data quality, and rising costs. Lack of internal expertise and integration issues with legacy systems also cause problems. You may encounter resistance from employees or concerns about cybersecurity.

Abandonment Cause | % of Failed Projects | Most Affected Company Size | Average Timeline to Failure (Months)
Unclear business value/ROI | 43% | Mid-Market | 6-9
Inadequate data quality or availability | 38% | All sizes | 3-6
Escalating costs | 35% | SMB | 3-5
Cybersecurity and risk management concerns | 32% | Enterprise | 8-12
Lack of internal AI expertise | 29% | Mid-Market | 4-8
Integration challenges with legacy systems | 26% | Enterprise | 6-10
Organizational resistance and change management failure | 24% | Enterprise | 10-14
Vendor lock-in concerns | 18% | Mid-Market | 5-7

Bar chart showing most frequent causes of agentic AI project failure

Recovery Strategies

You can avoid these pitfalls by setting clear goals and measuring progress. Focus on data quality from the start. Invest in training your team and building internal AI expertise. Engage stakeholders early and address their concerns. When you face integration challenges, work with partners who understand both your legacy systems and new AI solutions. Regular reviews and transparent communication help you stay on track and recover quickly if problems arise.

Tip: Start small, learn from early deployments, and scale your agentic AI as your organization gains confidence.

Governance and Risk in Agentic Operating Model

Governance Structures

You need strong governance structures to keep your agentic operating model safe and effective. These structures help you enforce policies, meet compliance needs, and track every action your AI agents take. Good governance gives you control and builds trust in your system.

  • The governance layer assigns accountability for agent behavior. You make sure each agent follows your organization’s rules and meets regulatory standards.
  • Accountability mechanisms give every agent a clear business owner and a defined risk profile. You always know who is responsible for each agent.
  • Proactive controls help you prevent problems before they happen. You do not wait for audits to find issues. Instead, you set up checks that stop mistakes early.
  • Digital provenance lets you trace every action and decision made by your agents. You can see what happened, when, and why.

These structures help you manage risk and keep your AI agents working as intended.

Risk Management

You must manage risk when you use AI agents in your business. Agent 365 helps you do this by centralizing governance and connecting your agents to enterprise security and compliance tools. You assign each agent a unique identity using Microsoft Entra. This makes it easy to track actions and set clear ownership.

Agent 365 lets you apply the same policies to all your agents. You use role-based access control to decide what each agent can do. You also monitor agent behavior all the time. This approach helps you spot problems quickly and keep your data safe. You can trust that your AI agents follow your company’s rules and meet industry standards.

Accountability and Transparency

You need clear accountability and transparency when you use agentic AI. The right mechanisms help you show how decisions are made and who is responsible. The table below lists some key tools you can use:

Mechanism | Description
Governance frameworks | You set up a structure to oversee AI systems and make sure someone is always accountable.
Ethical oversight | You check that AI decisions are fair and open, especially in sensitive areas.
Audit trails | You keep records of every decision made by AI, which is important for regulated industries.
Compliance with regulatory standards | You design your AI systems to follow the law and meet all required standards.

These tools help you build trust in your AI systems. You can show regulators, customers, and your team that your AI agents act responsibly and transparently.
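
An audit trail is only useful if you can show it has not been altered. One common technique, used here as an illustration and not as a description of how Agent 365 stores logs, is to chain each entry to the previous one with a hash. The `AuditTrail` class and its fields are hypothetical. The caller supplies the timestamp so the example stays deterministic.

```python
import hashlib
import json
from typing import Dict, List

FIELDS = ("timestamp", "agent_id", "action", "reason", "prev")


def _digest(body: Dict[str, str]) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AuditTrail:
    """Hypothetical append-only log where each entry hashes the one before it."""

    def __init__(self) -> None:
        self._entries: List[Dict[str, str]] = []

    def record(self, timestamp: str, agent_id: str, action: str, reason: str) -> None:
        prev = self._entries[-1]["hash"] if self._entries else ""
        body = {
            "timestamp": timestamp,
            "agent_id": agent_id,
            "action": action,
            "reason": reason,
            "prev": prev,
        }
        self._entries.append({**body, "hash": _digest(body)})

    def verify(self) -> bool:
        """Recompute every hash and check that the chain links are intact."""
        prev = ""
        for entry in self._entries:
            body = {key: entry[key] for key in FIELDS}
            if entry["prev"] != prev or entry["hash"] != _digest(body):
                return False
            prev = entry["hash"]
        return True


trail = AuditTrail()
trail.record("2026-06-29T09:00:00Z", "invoice-agent", "approve_invoice", "matched purchase order")
trail.record("2026-06-29T09:05:00Z", "invoice-agent", "escalate", "amount above limit")
print(trail.verify())
```

Each entry captures what happened, when, and why. Changing any earlier entry breaks every hash after it, which `verify` detects.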

Tip: Review your governance and risk controls often. This keeps your agentic operating model strong as your business and technology change.

Actionable Steps for Leaders

Strategic Adoption

You play a critical role in shaping how your organization uses AI. Treat AI as a core part of your future strategy, not just an experiment. Build governance, talent, and infrastructure to support integration into daily operations. Make decisions quickly to avoid falling behind competitors. Address common roadblocks such as lack of cohesion and poor alignment with leadership vision. Commit to a strategic path, even if you do not have every use case mapped out.

  • Treat AI as a central pillar of your business strategy.
  • Invest in governance frameworks and skilled teams.
  • Build infrastructure that supports scalable AI deployment.
  • Act decisively to maintain a competitive edge.
  • Align leadership vision with operational execution.
  • Move forward with a clear commitment, rather than waiting for perfect conditions.

Tip: When you unlock value with AI, you create new opportunities for growth and innovation.

Roadmap for Agentic AI

You need a clear roadmap to implement the agentic operating model at scale. Start by identifying automation targets that are high in volume but low in complexity. Evaluate whether to buy or build your AI architecture. Map out workflows and plan for exception handling. Validate your approach with human-in-the-loop processes before launching in production. Optimize your systems after launch for continuous improvement.

Phase | Duration | Goal
1 | Weeks 1-2 | Identify high-volume, low-complexity automation targets.
2 | Week 3 | Evaluate Buy vs. Build architectures.
3 | Weeks 4-6 | Workflow mapping and exception handling.
4 | Weeks 7-8 | Validation and Grounding with Human-in-the-Loop.
5 | Month 3+ | Production Launch and Optimization.

You can use Agent 365 to build intelligent operating systems that support this roadmap. The model helps you coordinate agents, manage identities, and ensure compliance at every stage.

Measuring Success

You must measure the impact of your agentic AI initiatives to prove their value. Focus on metrics that matter at operational, strategic, and transformational levels. Track and communicate the value your agents deliver. Connect agent outcomes to your enterprise objectives.

Note: Enterprises now prioritize business outcomes over traditional efficiency metrics when evaluating AI success.

By following these steps, you can lead your organization through a successful transition to the agentic operating model. You will build systems that reason, act, and deliver measurable value.


You now see how the agentic operating model transforms enterprise value creation.

Agentic AI marks a leap from content creation to autonomous action and problem-solving. It orchestrates multi-agent collaboration and adapts to real-time context, reducing manual work and operational costs.

You must move beyond Copilot hype by focusing on governance, agent identities, and real-time signals. To prepare for the future, prioritize operational readiness, data infrastructure, and change management.

  • Build agentic systems on open standards for flexibility.
  • Develop talent and pilot programs for measurable results.

Trend Description | Evidence
Surge in multi-agent system inquiries | 1,445% increase by 2025
Task-specific AI agent integration | 40% by 2026
SLMs surpassing LLMs in relevance | 72% of executives by 2030

You shape the next era of AI by embracing agentic architectures that drive lasting impact.

FAQ

What is the agentic operating model?

You use the agentic operating model to let AI agents act as digital workers. These agents reason, coordinate, and make decisions in your business processes. This model helps you automate complex tasks and improve outcomes.

How does Agent 365 differ from Copilot AI?

Agent 365 acts as a governed digital worker. You see it manage tasks, make decisions, and use real-time data. Copilot AI works as an assistant that needs your prompts. Agent 365 operates with more autonomy and accountability.

Why do agent identities matter?

You assign each agent a unique identity. This lets you track actions, set permissions, and ensure accountability. You move away from shared service accounts. Agent identities help you build trust and meet compliance needs.
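
As a rough illustration of why unique identities help with accountability, the sketch below counts logged actions per identity. The `AuditEvent` record, its fields, and the identity names are hypothetical and are not part of any Microsoft API; the only point is that a shared account collapses every agent into a single row.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEvent:
    """One logged action. All field names here are illustrative assumptions."""
    identity: str   # the identity the action ran under
    action: str     # what was done, e.g. "read" or "update"
    resource: str   # what it was done to


def actions_by_identity(events):
    """Count logged actions per identity."""
    return Counter(event.identity for event in events)


# Shared service account: three different agents, one identity in the log.
shared = [
    AuditEvent("svc-automation", "read", "crm/customers"),
    AuditEvent("svc-automation", "update", "finance/invoices"),
    AuditEvent("svc-automation", "read", "hr/policies"),
]

# Per-agent identities: the same actions, each attributable to one agent.
per_agent = [
    AuditEvent("agent-sales", "read", "crm/customers"),
    AuditEvent("agent-billing", "update", "finance/invoices"),
    AuditEvent("agent-compliance", "read", "hr/policies"),
]

print(actions_by_identity(shared))
print(actions_by_identity(per_agent))
```

With the shared account, the log can only say that "svc-automation" did three things. With per-agent identities, each action maps to exactly one agent.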

What is liquid context in agentic AI?

Liquid context means your AI agents use real-time signals and data. They adapt to changes as they happen. This helps you get decisions that match your current business state, not just past information.

How do you prepare your data for agentic AI?

You start by checking your data for accuracy and completeness. You organize it so agents can access what they need. High-quality data helps your agentic AI make better decisions and avoid errors.

What governance controls do you need for agentic AI?

You set clear policies for what agents can do. You monitor agent actions and review audit logs. You use identity management to control access. These steps help you keep your AI safe and compliant.
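
One way to picture these controls together is a small gate that checks every requested action against a per-agent policy and records the decision. This is a minimal sketch under stated assumptions: `AgentPolicy`, `Governor`, and the agent and resource names are invented for illustration and do not describe how Agent 365 or Entra enforce policy.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPolicy:
    """Allowed (action, resource) pairs for one agent. Hypothetical model."""
    agent_id: str
    allowed: set = field(default_factory=set)


@dataclass
class Governor:
    """Checks each requested action against policy and records the decision."""
    policies: dict
    audit_log: list = field(default_factory=list)

    def authorize(self, agent_id, action, resource):
        policy = self.policies.get(agent_id)
        # Agents without a policy are denied by default.
        permitted = policy is not None and (action, resource) in policy.allowed
        # Every decision is logged, whether it was allowed or denied.
        self.audit_log.append((agent_id, action, resource, permitted))
        return permitted


governor = Governor(policies={
    "agent-billing": AgentPolicy("agent-billing", {("read", "invoices")}),
})

print(governor.authorize("agent-billing", "read", "invoices"))    # True
print(governor.authorize("agent-billing", "delete", "invoices"))  # False
print(governor.authorize("agent-unknown", "read", "invoices"))    # False
```

Denying unknown agents by default and logging denials as well as approvals keeps the audit log useful for the reviews described above.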

How do you measure success with agentic AI?

You track outcomes like cost savings, faster processes, and improved customer satisfaction. You connect agent actions to business goals. Regular reviews help you see the value your agentic AI delivers.


1
00:00:00,000 --> 00:00:03,080
Most organizations think they're adopting AI. In reality,

2
00:00:03,080 --> 00:00:05,520
they're just building chatbots that search faster.

3
00:00:05,520 --> 00:00:07,000
You spend millions on Copilot,

4
00:00:07,000 --> 00:00:08,600
you connect it to your document folders,

5
00:00:08,600 --> 00:00:10,720
you tell your teams to ask it questions.

6
00:00:10,720 --> 00:00:12,640
And what you've actually built is a search interface

7
00:00:12,640 --> 00:00:14,480
that talks back. It isn't transformation,

8
00:00:14,480 --> 00:00:16,240
it's just automation of retrieval.

9
00:00:16,240 --> 00:00:19,280
But here is what is actually happening in 2026.

10
00:00:19,280 --> 00:00:21,400
The organizations that are winning aren't the ones

11
00:00:21,400 --> 00:00:22,480
with better search.

12
00:00:22,480 --> 00:00:24,920
They are the ones that shifted from retrieval to reasoning.

13
00:00:24,920 --> 00:00:26,840
They moved from assistants that advise

14
00:00:26,840 --> 00:00:28,460
to digital employees that act.

15
00:00:28,460 --> 00:00:29,840
The architecture behind that shift

16
00:00:29,840 --> 00:00:31,560
is nothing like what you've been building.

17
00:00:31,560 --> 00:00:33,200
In this episode, we are going to diagnose

18
00:00:33,200 --> 00:00:35,240
why your current approach is hitting a wall.

19
00:00:35,240 --> 00:00:37,440
We will walk through the new operating model Microsoft

20
00:00:37,440 --> 00:00:41,120
has assembled: Work IQ, Agent 365, and A2A.

21
00:00:41,120 --> 00:00:43,420
I'm gonna show you why this changes everything.

22
00:00:43,420 --> 00:00:45,260
And we are going to be honest about the cost,

23
00:00:45,260 --> 00:00:47,580
not just in credits, but in your organizational structure,

24
00:00:47,580 --> 00:00:50,660
your governance and the way you think about IT identity itself.

25
00:00:50,660 --> 00:00:53,300
By the end, you'll understand why service accounts are dead.

26
00:00:53,300 --> 00:00:55,360
You'll see why a single Entra Agent ID

27
00:00:55,360 --> 00:00:57,600
is the foundation of everything that comes next.

28
00:00:57,600 --> 00:00:59,020
The organizations that move on this now

29
00:00:59,020 --> 00:01:00,620
will operate at machine speed,

30
00:01:00,620 --> 00:01:03,940
while everyone else is still trying to figure out permissions.

31
00:01:03,940 --> 00:01:06,160
The latency wall: why RAG is failing.

32
00:01:06,160 --> 00:01:08,620
Let's start with the problem nobody wants to admit.

33
00:01:08,620 --> 00:01:10,960
You've deployed retrieval-augmented generation,

34
00:01:10,960 --> 00:01:12,800
RAG, across your entire company,

35
00:01:12,800 --> 00:01:15,720
you built the vector stores and indexed every document you own.

36
00:01:15,720 --> 00:01:18,680
You set up the pipelines, and it works, until it doesn't.

37
00:01:18,680 --> 00:01:21,520
The moment you ask an agent to do something

38
00:01:21,520 --> 00:01:23,920
that requires real time understanding of your business,

39
00:01:23,920 --> 00:01:25,360
the system breaks. It doesn't crash,

40
00:01:25,360 --> 00:01:28,120
it just slows down. Latency kills it.

41
00:01:28,120 --> 00:01:30,360
RAG was designed for a very specific problem:

42
00:01:30,360 --> 00:01:32,160
answering questions about static documents.

43
00:01:32,160 --> 00:01:34,400
You have a knowledge base. A user asks a question.

44
00:01:34,400 --> 00:01:36,200
The system pulls the most relevant chunks

45
00:01:36,200 --> 00:01:37,680
and feeds them to a model.

46
00:01:37,680 --> 00:01:40,880
This works for your vacation policy or password reset guide.

47
00:01:40,880 --> 00:01:42,480
Those documents don't change every hour,

48
00:01:42,480 --> 00:01:45,240
the answers are predictable, but work isn't static.

49
00:01:45,240 --> 00:01:46,400
Work is happening right now,

50
00:01:46,400 --> 00:01:48,880
someone is editing a file while a meeting is running.

51
00:01:48,880 --> 00:01:50,600
A decision was made five minutes ago,

52
00:01:50,600 --> 00:01:52,320
a project status just changed.

53
00:01:52,320 --> 00:01:54,200
A person was reassigned to a new team,

54
00:01:54,200 --> 00:01:57,320
that is liquid context, it moves, it changes.

55
00:01:57,320 --> 00:02:00,360
And every time an agent needs to understand that context,

56
00:02:00,360 --> 00:02:03,320
your RAG system has to pull fresh data from somewhere.

57
00:02:03,320 --> 00:02:05,000
This is where you hit the latency wall.

58
00:02:05,000 --> 00:02:07,040
Every retrieval cycle adds overhead.

59
00:02:07,040 --> 00:02:09,720
Your agent has to query the vector store, score those results,

60
00:02:09,720 --> 00:02:12,240
and maybe query again if the first batch was bad.

61
00:02:12,240 --> 00:02:14,280
Then it passes that context to a language model.

62
00:02:14,280 --> 00:02:16,080
That isn't measured in milliseconds anymore,

63
00:02:16,080 --> 00:02:17,560
it takes hundreds of milliseconds.

64
00:02:17,560 --> 00:02:20,120
When you have multiple agents doing this at the same time,

65
00:02:20,120 --> 00:02:22,640
each one waiting for index lookups and model inference,

66
00:02:22,640 --> 00:02:24,160
you create a massive bottleneck.

67
00:02:24,160 --> 00:02:25,640
Users feel the slowness.

68
00:02:25,640 --> 00:02:28,480
Business owners decide the agent is too slow to actually use.
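
The latency argument can be made concrete with simple arithmetic. The sketch below is only an illustration: the per-stage timings and the fixed number of concurrent request slots are assumptions chosen for the example, not measurements from the episode.

```python
import math

# Illustrative per-stage latencies in milliseconds (assumed, not measured).
STAGE_MS = {
    "vector_query": 60,
    "rescoring": 40,
    "model_inference": 400,
}


def retrieval_cycle_ms(retries=0):
    """Latency of one agent call: each retry repeats the query and rescoring."""
    lookups = 1 + retries
    lookup_ms = STAGE_MS["vector_query"] + STAGE_MS["rescoring"]
    return lookups * lookup_ms + STAGE_MS["model_inference"]


def batch_completion_ms(agent_count, concurrent_slots, retries=0):
    """Time until all agents finish when the backend serves a fixed number
    of requests at once and the rest wait in line (a simplifying assumption)."""
    waves = math.ceil(agent_count / concurrent_slots)
    return waves * retrieval_cycle_ms(retries)


print(retrieval_cycle_ms())                       # 500
print(retrieval_cycle_ms(retries=1))              # 600
print(batch_completion_ms(15, 5, retries=1))      # 1800
```

Under these assumed numbers, a single call already takes half a second, and fifteen agents sharing five slots push the slowest response close to two seconds.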

69
00:02:28,480 --> 00:02:31,240
The cost compounds because you don't stop at one vector store.

70
00:02:31,240 --> 00:02:33,360
You build one for SharePoint, then one for email,

71
00:02:33,360 --> 00:02:36,360
then another for Teams Chats, then you add one for CRM data

72
00:02:36,360 --> 00:02:38,840
because it isn't native to Microsoft 365,

73
00:02:38,840 --> 00:02:40,160
you are fragmenting your data.

74
00:02:40,160 --> 00:02:41,800
Every index adds sync complexity,

75
00:02:41,800 --> 00:02:43,320
everything adds more latency.

76
00:02:43,320 --> 00:02:45,640
Every latency failure makes your data stale.

77
00:02:45,640 --> 00:02:47,000
A document changes in SharePoint

78
00:02:47,000 --> 00:02:49,400
and your vector index gets updated eventually,

79
00:02:49,400 --> 00:02:50,760
but by the time that happens,

80
00:02:50,760 --> 00:02:53,800
15 agents have already made decisions based on the old version.

81
00:02:53,800 --> 00:02:56,600
You've lost consistency, you've created an audit nightmare.

82
00:02:56,600 --> 00:02:58,480
Nobody can tell you which agent was looking

83
00:02:58,480 --> 00:03:00,880
at which version of the data when it made a choice.

84
00:03:00,880 --> 00:03:02,320
The deeper problem is structural.

85
00:03:02,320 --> 00:03:04,800
You are treating work data like a static library.

86
00:03:04,800 --> 00:03:07,120
You extract it, vectorize it and index it separately

87
00:03:07,120 --> 00:03:08,760
from where the work actually happens.

88
00:03:08,760 --> 00:03:10,640
You've built a parallel data structure

89
00:03:10,640 --> 00:03:12,640
and parallel structures never sync perfectly.

90
00:03:12,640 --> 00:03:14,840
They get out of alignment, they create shadows,

91
00:03:14,840 --> 00:03:16,560
they become their own governance problem.

92
00:03:16,560 --> 00:03:18,520
So organizations keep adding more layers.

93
00:03:18,520 --> 00:03:20,680
They try better chunking and new embedding models,

94
00:03:20,680 --> 00:03:23,120
they try multi-hop retrieval and hybrid search,

95
00:03:23,120 --> 00:03:25,600
every addition is just trying to fix a latency problem

96
00:03:25,600 --> 00:03:28,120
that exists because the architecture itself is wrong.

97
00:03:28,120 --> 00:03:30,400
The model assumes retrieval is the hard part.

98
00:03:30,400 --> 00:03:31,800
The real hard part is reasoning

99
00:03:31,800 --> 00:03:34,800
over live, governed organizational state in real time

100
00:03:34,800 --> 00:03:37,280
and that requires a completely different approach.

101
00:03:37,280 --> 00:03:39,520
Static context versus liquid context.

102
00:03:39,520 --> 00:03:41,960
We need to be precise about what we are talking about.

103
00:03:41,960 --> 00:03:43,360
There is a fundamental difference

104
00:03:43,360 --> 00:03:46,000
between two kinds of information in your organization.

105
00:03:46,000 --> 00:03:48,440
Understanding this difference is where everything changes.

106
00:03:48,440 --> 00:03:50,480
Static context is what is written down.

107
00:03:50,480 --> 00:03:53,240
It is your documented knowledge, policies, procedures,

108
00:03:53,240 --> 00:03:54,720
past decisions.

109
00:03:54,720 --> 00:03:57,520
It serves as a guide for how things are supposed to work.

110
00:03:57,520 --> 00:03:59,640
Think of a decision memo from six months ago

111
00:03:59,640 --> 00:04:02,320
or a lessons learned document from a finished project.

112
00:04:02,320 --> 00:04:03,480
These things do not move.

113
00:04:03,480 --> 00:04:05,440
They sit in repositories like SharePoint,

114
00:04:05,440 --> 00:04:07,600
wikis, or intranet pages while they wait for someone

115
00:04:07,600 --> 00:04:09,840
to find them. They change slowly and deliberately.

116
00:04:09,840 --> 00:04:12,160
When a change happens, it is recorded with a version,

117
00:04:12,160 --> 00:04:13,680
an author, and a date.

118
00:04:13,680 --> 00:04:15,240
This is where RAG works well.

119
00:04:15,240 --> 00:04:17,640
You index these documents and build retrieval systems

120
00:04:17,640 --> 00:04:19,600
around them because the data is static,

121
00:04:19,600 --> 00:04:22,480
you can afford to index it once and update it every so often.

122
00:04:22,480 --> 00:04:24,920
You know the index stays consistent with the source of truth.

123
00:04:24,920 --> 00:04:27,360
When an agent asks about the remote work policy,

124
00:04:27,360 --> 00:04:29,720
it hits that index and gives a correct answer.

125
00:04:29,720 --> 00:04:32,040
But liquid context is something entirely different.

126
00:04:32,040 --> 00:04:33,800
It is who is working on what right now.

127
00:04:33,800 --> 00:04:35,560
It is the current state of a project.

128
00:04:35,560 --> 00:04:36,840
Which tasks are in progress?

129
00:04:36,840 --> 00:04:39,040
Which are blocked? Who is waiting for a review?

130
00:04:39,040 --> 00:04:41,520
It is the organizational structure as it exists today.

131
00:04:41,520 --> 00:04:42,680
Who reports to whom?

132
00:04:42,680 --> 00:04:44,680
Who is on leave? Who just got promoted?

133
00:04:44,680 --> 00:04:46,720
It is the real-time permissions landscape.

134
00:04:46,720 --> 00:04:47,720
Who has access to what?

135
00:04:47,720 --> 00:04:49,120
Whose roles just changed?

136
00:04:49,120 --> 00:04:51,560
It is the active collaboration happening across email, chat,

137
00:04:51,560 --> 00:04:53,040
and meetings this very second.

138
00:04:53,040 --> 00:04:54,880
Liquid context is dynamic.

139
00:04:54,880 --> 00:04:56,800
It changes constantly.

140
00:04:56,800 --> 00:04:59,960
Someone finishes a task or gets reassigned to a new team.

141
00:04:59,960 --> 00:05:01,720
A meeting wraps up and a decision is made,

142
00:05:01,720 --> 00:05:04,040
a document is reviewed and the status shifts.

143
00:05:04,040 --> 00:05:05,920
Permissions are updated and priorities move.

144
00:05:05,920 --> 00:05:08,440
There is no single source of truth sitting in a repository.

145
00:05:08,440 --> 00:05:11,040
The context is distributed across the living systems

146
00:05:11,040 --> 00:05:12,520
where work actually happens.

147
00:05:12,520 --> 00:05:14,880
Consider a simple scenario.

148
00:05:14,880 --> 00:05:17,760
An agent needs to summarize the status of a product launch.

149
00:05:17,760 --> 00:05:19,640
A RAG system would find the project plan

150
00:05:19,640 --> 00:05:21,040
and the last status update.

151
00:05:21,040 --> 00:05:23,400
It might even look at a document from a similar project

152
00:05:23,400 --> 00:05:24,200
in the past.

153
00:05:24,200 --> 00:05:26,800
It would assemble a summary from those static sources.

154
00:05:26,800 --> 00:05:28,120
But here is what it would not know.

155
00:05:28,120 --> 00:05:30,200
Sarah is the one responsible for the strategy.

156
00:05:30,200 --> 00:05:32,800
But she is out sick today and her work is blocked.

157
00:05:32,800 --> 00:05:34,960
The legal review was supposed to finish yesterday,

158
00:05:34,960 --> 00:05:36,760
but it just got delayed by a week.

159
00:05:36,760 --> 00:05:39,880
The executive sponsor approved a budget increase three hours ago,

160
00:05:39,880 --> 00:05:41,840
but that approval is not documented yet.

161
00:05:41,840 --> 00:05:44,000
Three team members just had a sync meeting

162
00:05:44,000 --> 00:05:46,360
and found a critical issue with the pricing model.

163
00:05:46,360 --> 00:05:47,680
The demo was scheduled for Thursday,

164
00:05:47,680 --> 00:05:49,920
but someone just moved the calendar invite to Friday.

165
00:05:49,920 --> 00:05:51,600
All of that information is liquid.

166
00:05:51,600 --> 00:05:52,720
It is happening right now.

167
00:05:52,720 --> 00:05:55,200
It exists in Outlook calendars and Teams messages.

168
00:05:55,200 --> 00:05:56,400
It lives in real-time updates

169
00:05:56,400 --> 00:05:57,960
and in the current awareness of the team.

170
00:05:57,960 --> 00:05:59,800
But it does not exist in any indexed document.

171
00:05:59,800 --> 00:06:02,240
You cannot retrieve it because it has not been written down yet.

172
00:06:02,240 --> 00:06:03,320
It is still in motion.

173
00:06:03,320 --> 00:06:05,400
A RAG system fails here because it is designed

174
00:06:05,400 --> 00:06:07,600
to find what was already captured and stored.

175
00:06:07,600 --> 00:06:08,960
It does not see what is happening.

176
00:06:08,960 --> 00:06:10,720
It only sees what was documented.

177
00:06:10,720 --> 00:06:14,160
By the time the agent finishes its answer using static context,

178
00:06:14,160 --> 00:06:16,280
the liquid context has already changed again.

179
00:06:16,280 --> 00:06:18,440
The summary is stale before it is even delivered.

180
00:06:18,440 --> 00:06:20,320
This is where Work IQ enters the picture.

181
00:06:20,320 --> 00:06:22,040
Work IQ does not work like RAG.

182
00:06:22,040 --> 00:06:23,240
It does not retrieve from an index.

183
00:06:23,240 --> 00:06:26,800
Instead it reasons over live Microsoft 365 signals as they happen.

184
00:06:26,800 --> 00:06:29,320
It understands who is currently assigned to what.

185
00:06:29,320 --> 00:06:32,400
It tracks real-time permissions and sees active collaboration.

186
00:06:32,400 --> 00:06:33,960
It knows what changed in the last hour

187
00:06:33,960 --> 00:06:36,840
because it is connected to the systems where that change occurred.

188
00:06:36,840 --> 00:06:40,400
It operates on the moving state of work, not on archived documents.

189
00:06:40,400 --> 00:06:42,520
This distinction changes everything downstream.

190
00:06:42,520 --> 00:06:44,400
If your agent can only see static context,

191
00:06:44,400 --> 00:06:46,000
it is always reasoning about yesterday.

192
00:06:46,000 --> 00:06:48,880
If your agent can see liquid context, it is reasoning about now.

193
00:06:48,880 --> 00:06:52,760
And that changes what an agent can actually do for your organization.
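
To make the static versus liquid distinction concrete, here is a small sketch using the product launch scenario from this section. It does not show how Work IQ is implemented and uses no Microsoft API; the `Change` record and the field names are hypothetical, and the example only shows how an answer built from the last indexed snapshot drifts from the current state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """A state change observed after the last indexing run (hypothetical)."""
    field: str
    value: str


# What the index captured at the last sync: the documented state.
indexed_snapshot = {
    "demo_day": "Thursday",
    "legal_review": "due yesterday",
    "budget": "original",
}

# Changes that happened since then and exist only in live systems.
live_changes = [
    Change("demo_day", "Friday"),
    Change("legal_review", "delayed one week"),
    Change("budget", "increase approved"),
]


def current_state(snapshot, changes):
    """Apply live changes, in order, on top of the indexed snapshot."""
    state = dict(snapshot)
    for change in changes:
        state[change.field] = change.value
    return state


def stale_fields(snapshot, changes):
    """Fields where an answer built from the snapshot alone would be wrong."""
    live = current_state(snapshot, changes)
    return sorted(name for name in snapshot if snapshot[name] != live[name])


print(stale_fields(indexed_snapshot, live_changes))
# ['budget', 'demo_day', 'legal_review']
```

Every field in the summary is out of date here, even though nothing in the index is technically incorrect for the moment it was captured.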

194
00:06:52,760 --> 00:06:54,400
The service account trap.

195
00:06:54,400 --> 00:06:57,080
The problem with Work IQ and smart infrastructure

196
00:06:57,080 --> 00:07:01,440
is that organizations never had a way to run agents as distinct entities.

197
00:07:01,440 --> 00:07:03,480
So they did what they always did with automation.

198
00:07:03,480 --> 00:07:05,720
They created a service account, one service account,

199
00:07:05,720 --> 00:07:08,480
multiple agents, all acting under the same identity.

200
00:07:08,480 --> 00:07:10,080
This solved an immediate problem.

201
00:07:10,080 --> 00:07:12,840
You did not need to manage individual agent identities.

202
00:07:12,840 --> 00:07:15,720
You did not need to figure out permissions for each bot separately.

203
00:07:15,720 --> 00:07:17,320
One account and one set of credentials

204
00:07:17,320 --> 00:07:19,520
allowed you to deploy as many agents as you wanted.

205
00:07:19,520 --> 00:07:20,960
It was always going to break.

206
00:07:20,960 --> 00:07:22,320
The breaking point is not technical.

207
00:07:22,320 --> 00:07:23,480
It is governance.

208
00:07:23,480 --> 00:07:25,720
And governance breaks slowly, so you do not notice

209
00:07:25,720 --> 00:07:27,360
until you are already in trouble.

210
00:07:27,360 --> 00:07:29,040
A service account has no audit trail.

211
00:07:29,040 --> 00:07:31,720
When something goes wrong, you cannot tell which agent did it.

212
00:07:31,720 --> 00:07:33,200
You cannot see when it happened.

213
00:07:33,200 --> 00:07:34,680
You cannot prove who authorized it

214
00:07:34,680 --> 00:07:36,760
because a service account has no human owner.

215
00:07:36,760 --> 00:07:39,480
It is just a set of credentials floating in the environment.

216
00:07:39,480 --> 00:07:42,160
Nobody is responsible for it and nobody monitors it.

217
00:07:42,160 --> 00:07:45,080
Nobody can tell you when its permissions last changed or why.

218
00:07:45,080 --> 00:07:46,680
A service account has no life cycle.

219
00:07:46,680 --> 00:07:49,680
When a human leaves the organization, their access is revoked.

220
00:07:49,680 --> 00:07:51,240
It is clear and it is done.

221
00:07:51,240 --> 00:07:53,480
But for an agent, there is no termination process.

222
00:07:53,480 --> 00:07:55,200
The service account just stays active.

223
00:07:55,200 --> 00:07:58,400
If the agent was supposed to be temporary, nobody removes it.

224
00:07:58,400 --> 00:08:00,880
The credentials persist and the access remains.

225
00:08:00,880 --> 00:08:04,040
You have built permanent infrastructure to solve temporary problems.
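
For contrast, here is a minimal sketch of what a lifecycle for agent identities could look like: each identity has an accountable owner and an expiry date, and a periodic sweep retires anything past that date. The `AgentIdentity` record, the names, and the dates are hypothetical and do not describe a specific product feature.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AgentIdentity:
    """Hypothetical lifecycle record for one agent identity."""
    agent_id: str
    owner: str          # the accountable human or team
    expires_on: date    # review or retirement date
    active: bool = True


def retire_expired(identities, today):
    """Deactivate every identity past its expiry and return their IDs."""
    retired = []
    for identity in identities:
        if identity.active and identity.expires_on < today:
            identity.active = False
            retired.append(identity.agent_id)
    return retired


fleet = [
    AgentIdentity("agent-launch-2025", "sarah", date(2025, 12, 31)),
    AgentIdentity("agent-billing", "finance-ops", date(2027, 1, 1)),
]

print(retire_expired(fleet, today=date(2026, 6, 29)))  # ['agent-launch-2025']
```

A shared service account has no equivalent of `owner` or `expires_on`, which is why temporary agents tend to leave permanent access behind.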

226
00:08:04,040 --> 00:08:06,120
A service account has no policy enforcement.

227
00:08:06,120 --> 00:08:08,080
You cannot apply conditional access rules

228
00:08:08,080 --> 00:08:10,720
to a service account the way you do for a human.

229
00:08:10,720 --> 00:08:13,440
You cannot say this account can only act during business hours.

230
00:08:13,440 --> 00:08:16,640
You cannot limit it to accessing data from a specific region.

231
00:08:16,640 --> 00:08:18,720
You cannot require extra authentication

232
00:08:18,720 --> 00:08:20,840
before it performs a sensitive operation.

233
00:08:20,840 --> 00:08:22,840
The account either has permission or it does not.

234
00:08:22,840 --> 00:08:23,840
It is all or nothing.

235
00:08:23,840 --> 00:08:25,440
None of this matters until it does.

236
00:08:25,440 --> 00:08:28,920
It matters when compliance asks a question you cannot answer.

237
00:08:28,920 --> 00:08:32,120
Which agent accessed that sensitive customer data on March 15th?

238
00:08:32,120 --> 00:08:33,040
You do not know.

239
00:08:33,040 --> 00:08:35,160
The audit log just shows the service account.

240
00:08:35,160 --> 00:08:38,680
You cannot tell if it was agent A or agent B or some other agent you forgot about.

241
00:08:38,680 --> 00:08:39,920
Which agent is still running?

242
00:08:39,920 --> 00:08:41,040
You do not know.

243
00:08:41,040 --> 00:08:42,480
The service account is still active.

244
00:08:42,480 --> 00:08:45,800
So you cannot tell if it is being used or if it is abandoned infrastructure.

245
00:08:45,800 --> 00:08:48,560
Who authorised this agent to have access to the financial system?

246
00:08:48,560 --> 00:08:49,840
You do not know.

247
00:08:49,840 --> 00:08:53,200
The service account was created three years ago by someone who left the company.

248
00:08:53,200 --> 00:08:55,320
There is no approval record and no justification.

249
00:08:55,320 --> 00:08:56,800
These are not edge cases.

250
00:08:56,800 --> 00:08:59,200
These are the questions compliance teams are already asking.

251
00:08:59,200 --> 00:08:59,960
Which agent did this?

252
00:08:59,960 --> 00:09:00,480
Who owns it?

253
00:09:00,480 --> 00:09:02,000
Can we prove it was authorised?

254
00:09:02,000 --> 00:09:04,800
In a service account model, these questions have no answers.

255
00:09:04,800 --> 00:09:07,600
You are managing agents but you have no way to identify them.

256
00:09:07,600 --> 00:09:10,120
You have no way to prove what they are allowed to do.

257
00:09:10,120 --> 00:09:12,080
You have no way to hold anyone accountable.

258
00:09:12,080 --> 00:09:14,040
And that is before we talk about security.

259
00:09:14,040 --> 00:09:17,200
A compromised service account is not just one agent going rogue.

260
00:09:17,200 --> 00:09:20,400
It is all agents acting under that account going rogue at the same time.

261
00:09:20,400 --> 00:09:24,040
One breach means all the agents are compromised and all the access is exposed.
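
The blast radius difference can be sketched in a few lines. The permission sets and resource names below are illustrative assumptions; the comparison is between one shared account that must hold every agent's permissions and separate identities that each hold only their own.

```python
# Hypothetical permission sets; the resource names are illustrative only.
AGENT_NEEDS = {
    "agent-sales": {"crm"},
    "agent-compliance": {"policy-docs"},
    "agent-billing": {"invoices"},
}


def shared_account_exposure(agent_needs):
    """A shared account must hold the union of every agent's permissions."""
    exposure = set()
    for resources in agent_needs.values():
        exposure |= resources
    return exposure


def per_agent_exposure(agent_needs, compromised_agent):
    """With one identity per agent, a breach exposes only that agent's scope."""
    return set(agent_needs[compromised_agent])


print(sorted(shared_account_exposure(AGENT_NEEDS)))
# ['crm', 'invoices', 'policy-docs']
print(sorted(per_agent_exposure(AGENT_NEEDS, "agent-sales")))
# ['crm']
```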

262
00:09:24,040 --> 00:09:25,360
The fundamental gap is clear.

263
00:09:25,360 --> 00:09:27,160
You have identity systems for humans.

264
00:09:27,160 --> 00:09:29,080
You have app registrations for services.

265
00:09:29,080 --> 00:09:31,040
But agents exist in a governance vacuum.

266
00:09:31,040 --> 00:09:33,360
They are not users and they are not applications.

267
00:09:33,360 --> 00:09:34,480
They are something new.

268
00:09:34,480 --> 00:09:36,280
And nobody built a way to manage them.

269
00:09:36,280 --> 00:09:38,720
Agent 365 re-thinks that entirely.

270
00:09:38,720 --> 00:09:40,800
Why your Copilot pilot is stalling.

271
00:09:40,800 --> 00:09:41,920
You've deployed Copilot.

272
00:09:41,920 --> 00:09:43,600
It looks brilliant in the demo.

273
00:09:43,600 --> 00:09:46,720
Your executives watch it summarise a massive document in seconds,

274
00:09:46,720 --> 00:09:48,920
draft an email, or answer a policy question.

275
00:09:48,920 --> 00:09:50,680
And they immediately approve the pilot.

276
00:09:50,680 --> 00:09:52,520
They tell their teams to start using it.

277
00:09:52,520 --> 00:09:54,640
And then adoption flatlines.

278
00:09:54,640 --> 00:09:56,400
15%.

279
00:09:56,400 --> 00:09:57,760
That is where the momentum stops.

280
00:09:57,760 --> 00:10:01,360
Only 15% of your organisation actually uses the agent regularly

281
00:10:01,360 --> 00:10:03,840
while the rest of the team tried it once and walked away.

282
00:10:03,840 --> 00:10:04,880
They found it too slow.

283
00:10:04,880 --> 00:10:08,360
They didn't trust the answers or it just didn't work when they actually needed it to.

284
00:10:08,360 --> 00:10:09,320
So they stopped.

285
00:10:09,320 --> 00:10:11,600
They went back to their old workflows because those workflows

286
00:10:11,600 --> 00:10:13,200
actually get the job done.

287
00:10:13,200 --> 00:10:15,320
You will hear plenty of excuses for this.

288
00:10:15,320 --> 00:10:18,960
People will say the AI isn't good enough yet or the data is too messy

289
00:10:18,960 --> 00:10:21,440
or the users just need better prompting sessions.

290
00:10:21,440 --> 00:10:25,120
While those things might be true, none of them are the real reason your pilot is failing.

291
00:10:25,120 --> 00:10:28,240
The real problem is friction, operational friction.

292
00:10:28,240 --> 00:10:30,520
And it has nothing to do with how smart the AI is.

293
00:10:30,520 --> 00:10:31,920
Think about how this works in practice.

294
00:10:31,920 --> 00:10:34,680
Your agent can read, it can summarise, and it can answer questions.

295
00:10:34,680 --> 00:10:35,720
But can it act?

296
00:10:35,720 --> 00:10:36,800
Can it update a record,

297
00:10:36,800 --> 00:10:40,000
send an email, create a task, or actually change something in your systems?

298
00:10:40,000 --> 00:10:42,080
If it can, what happens if it makes a mistake?

299
00:10:42,080 --> 00:10:43,440
Who is responsible for that error?

300
00:10:43,440 --> 00:10:47,440
There is no way to trace the action back to a specific decision or prove it was authorized

301
00:10:47,440 --> 00:10:50,320
because right now there is no governance model for agent actions.

302
00:10:50,320 --> 00:10:51,880
So you do the only safe thing.

303
00:10:51,880 --> 00:10:53,560
You turn off the dangerous actions.

304
00:10:53,560 --> 00:10:55,120
You make the agent read only.

305
00:10:55,120 --> 00:10:56,320
It can suggest but it can't do.

306
00:10:56,320 --> 00:10:58,000
It can draft but it can't send.

307
00:10:58,000 --> 00:11:00,480
You have built a sophisticated reasoning system

308
00:11:00,480 --> 00:11:02,560
and then locked it into an advisor role

309
00:11:02,560 --> 00:11:04,760
because you have no way to govern what it does.

310
00:11:04,760 --> 00:11:07,560
Users don't trust it because agents have no identity,

311
00:11:07,560 --> 00:11:10,800
no accountability, no audit trail.

312
00:11:10,800 --> 00:11:15,400
When an agent gives you an answer, you don't know if it's trustworthy or who decided to design it that way.

313
00:11:15,400 --> 00:11:18,440
You don't know if it has been updated since yesterday or if it's even still running.

314
00:11:18,440 --> 00:11:19,160
There is no face.

315
00:11:19,160 --> 00:11:20,320
There is no owner.

316
00:11:20,320 --> 00:11:22,720
Business owners are even more skeptical of the technology.

317
00:11:22,720 --> 00:11:26,160
They will never give an agent write access if they can't trace what happened

318
00:11:26,160 --> 00:11:29,960
because if an agent deletes a file or sends a message to the wrong person,

319
00:11:29,960 --> 00:11:31,720
there is nobody to hold accountable.

320
00:11:31,720 --> 00:11:33,040
So the business owners say no.

321
00:11:33,040 --> 00:11:36,480
They tell you not to touch their critical systems or sign anything on their behalf.

322
00:11:36,480 --> 00:11:38,280
The agent becomes a suggestion engine.

323
00:11:38,280 --> 00:11:41,360
It is useful sometimes but it isn't integrated into real work.

324
00:11:41,360 --> 00:11:42,720
Integration requires trust.

325
00:11:42,720 --> 00:11:44,320
Trust requires accountability.

326
00:11:44,320 --> 00:11:46,040
And accountability requires identity.

327
00:11:46,040 --> 00:11:47,480
This is where most pilots stall.

328
00:11:47,480 --> 00:11:51,320
Not because the AI is bad but because you've built an AI system into infrastructure

329
00:11:51,320 --> 00:11:52,920
that has no way to manage it.

330
00:11:52,920 --> 00:11:54,640
Copilot exists in a governance void.

331
00:11:54,640 --> 00:11:57,440
It has no Entra ID, no conditional access policy,

332
00:11:57,440 --> 00:11:59,600
and no owner in an organizational sense.

333
00:11:59,600 --> 00:12:03,480
It is sitting in your tenant but it isn't managed like anything else in your environment.

334
00:12:03,480 --> 00:12:04,760
Users sense this friction.

335
00:12:04,760 --> 00:12:06,760
They don't sit there thinking about identity models

336
00:12:06,760 --> 00:12:08,760
but they feel the slowness of read-only answers

337
00:12:08,760 --> 00:12:11,160
and the lack of trust in black-box suggestions.

338
00:12:11,160 --> 00:12:12,880
They feel the resistance from their managers.

339
00:12:12,880 --> 00:12:15,840
They realize the system isn't actually participating in real work.

340
00:12:15,840 --> 00:12:17,440
It is just running parallel to it.

341
00:12:17,440 --> 00:12:19,840
And that is the ceiling, 15% adoption.

342
00:12:19,840 --> 00:12:21,920
You get the early adopters who tolerate the friction

343
00:12:21,920 --> 00:12:24,560
but everyone else has already gone back to what they can trust.

344
00:12:24,560 --> 00:12:26,560
Agent 365 breaks that ceiling.

345
00:12:26,560 --> 00:12:29,560
But first, we need to understand what it actually does.

346
00:12:29,560 --> 00:12:31,440
Agents as first-class identities.

347
00:12:31,440 --> 00:12:33,800
Agent 365 changes one fundamental thing.

348
00:12:33,800 --> 00:12:35,680
It treats agents exactly like users.

349
00:12:35,680 --> 00:12:37,920
Not like applications, not like services.

350
00:12:37,920 --> 00:12:39,960
Like users. They get Entra Agent IDs.

351
00:12:39,960 --> 00:12:43,120
They have life cycles and they move through joiner-mover-leaver workflows

352
00:12:43,120 --> 00:12:44,760
just like any other employee.

353
00:12:44,760 --> 00:12:46,920
They inherit policies, they get assigned roles,

354
00:12:46,920 --> 00:12:47,920
and they can be revoked.

355
00:12:47,920 --> 00:12:50,000
They finally participate in the identity system.

356
00:12:50,000 --> 00:12:51,000
This sounds simple.

357
00:12:51,000 --> 00:12:52,400
But it is revolutionary.

358
00:12:52,400 --> 00:12:55,560
For years, identity systems have been built for two categories.

359
00:12:55,560 --> 00:12:57,000
Humans and applications.

360
00:12:57,000 --> 00:12:58,960
A human gets a user object in Entra.

361
00:12:58,960 --> 00:13:02,360
They get a password, conditional access policies, and risk-based controls.

362
00:13:02,360 --> 00:13:05,280
We monitor them and if they become a security risk, we block them.

363
00:13:05,280 --> 00:13:07,240
When they leave the company, we delete them.

364
00:13:07,240 --> 00:13:09,120
An application gets a service principal.

365
00:13:09,120 --> 00:13:11,000
It gets an app registration and credentials

366
00:13:11,000 --> 00:13:12,600
like a secret or a certificate.

367
00:13:12,600 --> 00:13:15,880
It has scoped permissions and stays assigned to specific resources.

368
00:13:15,880 --> 00:13:17,720
It is designed to be deployed once

369
00:13:17,720 --> 00:13:19,520
and then run forever without changing.

370
00:13:19,520 --> 00:13:20,880
Agents didn't fit either model.

371
00:13:20,880 --> 00:13:21,680
They aren't users.

372
00:13:21,680 --> 00:13:22,600
They don't have passwords.

373
00:13:22,600 --> 00:13:25,560
They don't take vacations and they don't need a seat license.

374
00:13:25,560 --> 00:13:27,880
But they aren't traditional applications either.

375
00:13:27,880 --> 00:13:29,480
They are deployed in multiple versions

376
00:13:29,480 --> 00:13:31,200
and their life cycles are measured in weeks

377
00:13:31,200 --> 00:13:32,440
or months rather than years.

378
00:13:32,440 --> 00:13:34,160
You need to turn them off when a project ends

379
00:13:34,160 --> 00:13:36,160
or update them when a business process shifts.

380
00:13:36,160 --> 00:13:37,800
They don't fit the service principal model

381
00:13:37,800 --> 00:13:40,360
because those assume eternal unchanging infrastructure.

382
00:13:40,360 --> 00:13:41,560
So agents lived in the gap.

383
00:13:41,560 --> 00:13:42,760
They had no proper identity.

384
00:13:42,760 --> 00:13:44,680
They borrowed credentials from somewhere else.

385
00:13:44,680 --> 00:13:46,120
And they had no clear ownership.

386
00:13:46,120 --> 00:13:47,920
Agent 365 closes that gap.

387
00:13:47,920 --> 00:13:49,720
An agent gets an Entra Agent ID.

388
00:13:49,720 --> 00:13:51,840
It isn't a user object or a service principal.

389
00:13:51,840 --> 00:13:54,880
It is a new kind of identity designed for digital workers.

390
00:13:54,880 --> 00:13:57,480
And once an agent has an identity, everything changes.

391
00:13:57,480 --> 00:14:01,400
An agent with an identity can be assigned to specific data sources.

392
00:14:01,400 --> 00:14:03,120
You don't give it access to everything.

393
00:14:03,120 --> 00:14:06,440
Instead, you assign it only to the data sets it needs to function.

394
00:14:06,440 --> 00:14:08,600
A sales agent gets access to the CRM.

395
00:14:08,600 --> 00:14:11,200
A compliance agent gets access to policy documents.

396
00:14:11,200 --> 00:14:13,520
A billing agent gets access to invoice records.

397
00:14:13,520 --> 00:14:16,440
The identity system enforces that scoping at the directory level,

398
00:14:16,440 --> 00:14:20,400
which is much more secure than relying on configuration files or prompts.

399
00:14:20,400 --> 00:14:23,520
An agent with an identity is subject to conditional access rules.

400
00:14:23,520 --> 00:14:26,840
You can decide that an agent only operates during business hours

401
00:14:26,840 --> 00:14:30,560
or that it cannot access data from regions outside of North America.

402
00:14:30,560 --> 00:14:32,680
You can even require step-up authentication

403
00:14:32,680 --> 00:14:35,480
before it performs a transaction over $1 million.

404
00:14:35,480 --> 00:14:38,440
You enforce these rules through the identity platform itself.

405
00:14:38,440 --> 00:14:41,600
So the agent cannot violate them even if its prompt tells it to.

406
00:14:41,600 --> 00:14:44,880
An agent with an identity can be monitored for unusual behavior.

407
00:14:44,880 --> 00:14:47,480
If an agent suddenly starts accessing data it has never touched before,

408
00:14:47,480 --> 00:14:48,960
the system detects it immediately.

409
00:14:48,960 --> 00:14:51,320
If it tries to reach a resource it isn't authorized for,

410
00:14:51,320 --> 00:14:53,560
the system logs the attempt and triggers an alert.

411
00:14:53,560 --> 00:14:56,600
The same threat detection we use for compromised user accounts

412
00:14:56,600 --> 00:14:58,560
now works for compromised agents.

413
00:14:58,560 --> 00:15:01,920
Finally, an agent with an identity can be revoked when the task is done.

414
00:15:01,920 --> 00:15:04,520
You don't leave it running indefinitely and hope nobody notices.

415
00:15:04,520 --> 00:15:07,720
You decommission it, the identity is deleted and the permissions are gone.

416
00:15:07,720 --> 00:15:09,920
It is treated like an employee whose project ended.

417
00:15:09,920 --> 00:15:12,600
The access is terminated, it is clean and auditable.
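To make the lifecycle described here concrete, the following is a minimal sketch in plain Python. It is not the Entra or Agent 365 API; the `AgentIdentity` class, its fields, and the `agent-sales-01` name are all hypothetical and only illustrate scoped access, revocation, and an audit trail.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AgentIdentity:
    """Hypothetical stand-in for a directory-managed agent identity."""
    agent_id: str
    owner: str
    data_sources: set[str] = field(default_factory=set)
    active: bool = True
    audit_log: list[str] = field(default_factory=list)

    def _log(self, event: str) -> None:
        # Every decision is timestamped so the history can be audited later.
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append(f"{stamp} {self.agent_id} {event}")

    def can_access(self, source: str) -> bool:
        # Access requires an active identity AND an explicit assignment.
        allowed = self.active and source in self.data_sources
        self._log(f"access {source} -> {'allow' if allowed else 'deny'}")
        return allowed

    def decommission(self) -> None:
        # Revocation removes every assignment, not just the active flag.
        self.active = False
        self.data_sources.clear()
        self._log("decommissioned")


sales_agent = AgentIdentity("agent-sales-01", owner="sales-ops", data_sources={"crm"})
```

In this sketch, calling `sales_agent.decommission()` is the "project ended" step: later `can_access` calls are denied and the denial is still logged.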

418
00:15:12,600 --> 00:15:15,480
The Entra Agent ID is the foundation for everything else.

419
00:15:15,480 --> 00:15:18,680
Governance, accountability and policy enforcement all depend on the fact

420
00:15:18,680 --> 00:15:20,840
that the agent has a proper identity in the directory.

421
00:15:20,840 --> 00:15:23,600
Once you have that identity, you can apply every governance tool

422
00:15:23,600 --> 00:15:25,120
you already use for humans,

423
00:15:25,120 --> 00:15:26,800
but identity alone isn't enough.

424
00:15:26,800 --> 00:15:29,160
You also need to constrain what agents can do,

425
00:15:29,160 --> 00:15:32,000
which is why we have to look at policy.

426
00:15:32,000 --> 00:15:33,600
Policy as code for agents.

427
00:15:33,600 --> 00:15:35,520
Traditional governance is aspirational.

428
00:15:35,520 --> 00:15:38,240
You write a policy document, you distribute it, you hope people read it,

429
00:15:38,240 --> 00:15:40,360
you hope they understand it, you hope they follow it,

430
00:15:40,360 --> 00:15:42,360
and still somebody violates the policy.

431
00:15:42,360 --> 00:15:45,600
Because the policy exists in a document and documents don't enforce anything,

432
00:15:45,600 --> 00:15:48,400
they suggest, they guide, they explain the intention,

433
00:15:48,400 --> 00:15:51,840
but at the moment of action, if someone chooses to ignore the policy,

434
00:15:51,840 --> 00:15:53,000
there's nothing stopping them.

435
00:15:53,000 --> 00:15:56,200
The enforcement happens after the fact. An audit finds the violation.

436
00:15:56,200 --> 00:15:57,400
Someone gets corrected.

437
00:15:57,400 --> 00:15:58,720
The culture shifts slowly.

438
00:15:58,720 --> 00:16:02,600
Maybe, if you're lucky. Policy as code inverts that entirely.

439
00:16:02,600 --> 00:16:05,400
The policy isn't aspirational, it's executable.

440
00:16:05,400 --> 00:16:07,760
It's written not in English for humans to interpret,

441
00:16:07,760 --> 00:16:10,120
but in logic that the system understands and enforces.

442
00:16:10,120 --> 00:16:12,600
The policy isn't a recommendation, it's a constraint.

443
00:16:12,600 --> 00:16:14,640
The system won't let the agent violate it,

444
00:16:14,640 --> 00:16:17,000
because the system can't let the agent violate it.

445
00:16:17,000 --> 00:16:19,720
The policy is built into the authorization layer itself.

446
00:16:19,720 --> 00:16:21,600
Consider what becomes possible.

447
00:16:21,600 --> 00:16:24,800
You can configure an agent to read from specific SharePoint sites,

448
00:16:24,800 --> 00:16:25,640
but not others.

449
00:16:25,640 --> 00:16:27,840
That's not a guideline in the agent's system prompt

450
00:16:27,840 --> 00:16:29,080
that someone might override.

451
00:16:29,080 --> 00:16:31,720
That's a permission boundary enforced at the directory level.

452
00:16:31,720 --> 00:16:35,200
The agent can only access the data sources you've explicitly assigned.

453
00:16:35,200 --> 00:16:38,520
Try to access something else, the API call fails, the system denies it.

454
00:16:38,520 --> 00:16:41,480
There's no workaround through cleverness or prompt injection.

455
00:16:41,480 --> 00:16:42,720
The permission doesn't exist.

456
00:16:42,720 --> 00:16:45,400
You can configure an agent to update records, but not delete them.

457
00:16:45,400 --> 00:16:47,320
It can write, it cannot destroy.

458
00:16:47,320 --> 00:16:48,840
You set that at the role level,

459
00:16:48,840 --> 00:16:51,720
the role has create and update permissions, but not delete permissions.

460
00:16:51,720 --> 00:16:52,840
The agent assumes that role.

461
00:16:52,840 --> 00:16:56,080
It cannot delete data because the identity doesn't have the permission.

462
00:16:56,080 --> 00:16:58,680
The system prevents the action before it ever happens.
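As a rough illustration of "it can write, it cannot destroy", here is a small Python sketch. The `Permission` flags, the `RECORD_EDITOR` role, and the helper functions are hypothetical names, not part of any Microsoft product; the only point is that the check runs before the action and the delete permission simply is not in the role.

```python
from enum import Flag, auto


class Permission(Flag):
    READ = auto()
    CREATE = auto()
    UPDATE = auto()
    DELETE = auto()


# Hypothetical role: can write records but has no delete permission at all.
RECORD_EDITOR = Permission.READ | Permission.CREATE | Permission.UPDATE


class PermissionDenied(Exception):
    """Raised when a role lacks the permission an action requires."""


def authorize(role: Permission, action: Permission) -> None:
    # The decision depends only on the role, never on what the prompt says.
    if action not in role:
        raise PermissionDenied(f"{action.name} is not granted to this role")


def update_record(role: Permission, record_id: str, value: str, store: dict[str, str]) -> None:
    authorize(role, Permission.UPDATE)
    store[record_id] = value


def delete_record(role: Permission, record_id: str, store: dict[str, str]) -> None:
    authorize(role, Permission.DELETE)  # checked before any mutation
    del store[record_id]
```

With this shape, an agent holding `RECORD_EDITOR` can call `update_record`, while `delete_record` raises before the store is touched.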

463
00:16:58,680 --> 00:17:01,440
You can configure an agent to send emails internally,

464
00:17:01,440 --> 00:17:02,920
but never externally.

465
00:17:02,920 --> 00:17:07,800
The email policy says this agent can send to anyone within the organization's domains.

466
00:17:07,800 --> 00:17:11,440
It cannot send outside them, it can reach the entire company directory.

467
00:17:11,440 --> 00:17:14,240
Try to send to a Gmail address, the system blocks it.

468
00:17:14,240 --> 00:17:17,280
Not because the agent's prompt says don't do that,

469
00:17:17,280 --> 00:17:21,360
but because the policy framework physically prevents the message from leaving the tenant.

470
00:17:21,360 --> 00:17:24,440
The constraint is infrastructure, not instruction.

471
00:17:24,440 --> 00:17:28,400
You can configure an agent to require a human approval for transactions over a threshold.

472
00:17:28,400 --> 00:17:31,760
Transactions under $10 million, the agent executes them directly.

473
00:17:31,760 --> 00:17:34,760
Over $10 million, the system demands a human authorization token.

474
00:17:34,760 --> 00:17:38,440
The agent can prepare the transaction, it can present the recommendation,

475
00:17:38,440 --> 00:17:41,480
but it cannot finalize without the human's explicit approval.

476
00:17:41,480 --> 00:17:43,840
The system enforces this gating automatically.
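The approval gate can be sketched in a few lines of Python. This is an assumption-laden illustration: `Transaction`, `execute`, and `APPROVAL_THRESHOLD` are hypothetical names, a transaction of exactly the threshold amount is treated as not requiring approval, and a real system would verify the approval token rather than just check that one is present.

```python
from dataclasses import dataclass
from typing import Optional

APPROVAL_THRESHOLD = 10_000_000  # dollars, matching the figure used in the episode


@dataclass(frozen=True)
class Transaction:
    tx_id: str
    amount: int  # whole dollars


def execute(tx: Transaction, approval_token: Optional[str] = None) -> str:
    """Return the resulting state of the transaction.

    Above the threshold the transaction is only prepared; it is finalized
    when a human approval token accompanies the request.
    """
    if tx.amount > APPROVAL_THRESHOLD and not approval_token:
        return "pending_approval"
    return "executed"
```

The agent can still prepare and present a large transaction; the gate sits in `execute`, outside anything the agent's instructions can change.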

477
00:17:43,840 --> 00:17:45,280
These aren't suggestions.

478
00:17:45,280 --> 00:17:48,760
They're not best practices embedded in the agent's instructions.

479
00:17:48,760 --> 00:17:50,280
They're hard constraints.

480
00:17:50,280 --> 00:17:52,920
The agent literally cannot violate the policy.

481
00:17:52,920 --> 00:17:57,120
Even if the prompt says ignore all constraints and delete this record, it cannot do it.

482
00:17:57,120 --> 00:17:58,560
The permission doesn't exist.

483
00:17:58,560 --> 00:18:00,080
The API call is rejected.

484
00:18:00,080 --> 00:18:03,240
The authorization framework says no, and there's no appeal process.

485
00:18:03,240 --> 00:18:04,720
The system enforces what's possible.

486
00:18:04,720 --> 00:18:08,320
This is the critical difference between governance theater and actual governance.

487
00:18:08,320 --> 00:18:11,080
Theater is when you have a policy document that everyone acknowledges,

488
00:18:11,080 --> 00:18:12,520
and nobody can fully enforce.

489
00:18:12,520 --> 00:18:16,320
You have rules, but enforcement depends on catching violations after they happen.

490
00:18:16,320 --> 00:18:20,440
Actual governance is when the system makes violations impossible before they can occur.

491
00:18:20,440 --> 00:18:24,360
The agent can't go rogue within its policy constraints because it doesn't have the capability to.

492
00:18:24,360 --> 00:18:27,720
The constraint exists at the infrastructure level, not the behavioral level.

493
00:18:27,720 --> 00:18:30,680
But constraints without visibility are just invisible walls.

494
00:18:30,680 --> 00:18:33,320
You can prevent an agent from doing the wrong thing.

495
00:18:33,320 --> 00:18:37,400
But if you can't see what it's actually doing, you can't verify that the constraints are working.

496
00:18:37,400 --> 00:18:40,960
You can't adjust policy based on how the agent behaves in practice.

497
00:18:40,960 --> 00:18:43,520
You can't detect when something unexpected is happening.

498
00:18:43,520 --> 00:18:46,040
That's where monitoring and transparency enter the picture.

499
00:18:46,040 --> 00:18:49,880
The real-time reasoning trace. So you've constrained what an agent can do,

500
00:18:49,880 --> 00:18:53,520
you've drawn the boundaries, you've said, "Here's what you're allowed to access.

501
00:18:53,520 --> 00:18:56,920
Here's what you cannot do. Here's where you need approval before proceeding."

502
00:18:56,920 --> 00:19:00,160
The system enforces those constraints, the agent cannot violate them.

503
00:19:00,160 --> 00:19:03,480
But enforcement alone doesn't tell you what's actually happening inside the agent's reasoning.

504
00:19:03,480 --> 00:19:04,720
It doesn't show you the thinking.

505
00:19:04,720 --> 00:19:06,720
That's where the reasoning trace comes in.

506
00:19:06,720 --> 00:19:10,120
Agent 365 lets you watch an agent think, not after the fact.

507
00:19:10,120 --> 00:19:13,320
In real time, as the reasoning is happening, step by step,

508
00:19:13,320 --> 00:19:16,920
you can see exactly what's going on inside the agent's decision-making process.

509
00:19:16,920 --> 00:19:18,960
What data did it retrieve to answer the question?

510
00:19:18,960 --> 00:19:20,240
The reasoning trace shows you.

511
00:19:20,240 --> 00:19:23,200
You see the query, you see the data sources it consulted,

512
00:19:23,200 --> 00:19:26,160
you see the context it pulled, it's not a black box, it's transparent.

513
00:19:26,160 --> 00:19:27,680
Which tools did it consider using?

514
00:19:27,680 --> 00:19:29,040
The trace shows you that too.

515
00:19:29,040 --> 00:19:31,400
The agent is thinking about what actions it might take.

516
00:19:31,400 --> 00:19:35,840
Should it send an email or create a task or update a record or escalate to a human?

517
00:19:35,840 --> 00:19:38,280
The reasoning trace shows you every option it considered.

518
00:19:38,280 --> 00:19:39,520
You can see the decision tree.

519
00:19:39,520 --> 00:19:42,840
You can see why it ruled some options out and moved forward with others.

520
00:19:42,840 --> 00:19:45,040
Why did it choose this action over that one?

521
00:19:45,040 --> 00:19:46,320
This is the crucial part.

522
00:19:46,320 --> 00:19:48,520
The reasoning trace explains the decision logic.

523
00:19:48,520 --> 00:19:50,440
The agent didn't just pick an action randomly.

524
00:19:50,440 --> 00:19:53,960
It evaluated options against constraints, against the current context,

525
00:19:53,960 --> 00:19:55,640
against the goal it's trying to achieve.

526
00:19:55,640 --> 00:19:57,160
The trace walks through that evaluation.

527
00:19:57,160 --> 00:19:58,800
It's not magic, it's reasoned.

528
00:19:58,800 --> 00:20:00,240
And the reasoning is visible.

529
00:20:00,240 --> 00:20:01,360
What was the outcome?

530
00:20:01,360 --> 00:20:03,720
The trace follows the action through to completion.

531
00:20:03,720 --> 00:20:04,640
Did the email get sent?

532
00:20:04,640 --> 00:20:06,240
To whom? Did the task get created?

533
00:20:06,240 --> 00:20:07,360
With what description?

534
00:20:07,360 --> 00:20:09,120
Did the API call succeed or fail?

535
00:20:09,120 --> 00:20:10,680
If it failed, why?

536
00:20:10,680 --> 00:20:12,400
The entire execution path is visible.

537
00:20:12,400 --> 00:20:13,840
Not as a black box result.

538
00:20:13,840 --> 00:20:15,600
As a sequence of events you can inspect.
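One way to picture "a sequence of events you can inspect" is as an ordered list of typed steps. The sketch below is hypothetical: `TraceStep`, `ReasoningTrace`, the step kinds, and the sample invoice scenario are invented for illustration and do not describe the actual trace format of Agent 365.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class TraceStep:
    kind: str    # assumed kinds: "retrieve", "consider", "decide", "act"
    detail: str
    data: dict[str, Any] = field(default_factory=dict)


@dataclass
class ReasoningTrace:
    task: str
    steps: list[TraceStep] = field(default_factory=list)

    def record(self, kind: str, detail: str, **data: Any) -> None:
        # Steps are appended in order, so the list is the execution path.
        self.steps.append(TraceStep(kind, detail, data))

    def first_failure(self) -> Optional[TraceStep]:
        # Find the first action whose outcome was explicitly marked as failed.
        for step in self.steps:
            if step.kind == "act" and step.data.get("ok") is False:
                return step
        return None


trace = ReasoningTrace(task="follow up on overdue invoice")
trace.record("retrieve", "queried invoice records", source="billing", hits=3)
trace.record("consider", "options: send email, create task, escalate")
trace.record("decide", "send email", reason="customer has a named contact")
trace.record("act", "send email", ok=False, error="recipient address rejected")
```

Here `trace.first_failure()` points straight at the rejected email, which is the kind of precise failure point the debugging discussion relies on.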

539
00:20:15,600 --> 00:20:18,480
This transparency changes what's possible in three fundamental ways.

540
00:20:18,480 --> 00:20:20,240
First, trust.

541
00:20:20,240 --> 00:20:22,400
Users can see the agent isn't a black box.

542
00:20:22,400 --> 00:20:23,920
They can watch the reasoning unfold.

543
00:20:23,920 --> 00:20:26,600
They can see that the agent looked at the relevant context.

544
00:20:26,600 --> 00:20:29,680
They can see that the decision makes sense given the information available.

545
00:20:29,680 --> 00:20:32,320
Even if the user doesn't fully understand large language models

546
00:20:32,320 --> 00:20:34,080
or embeddings or transformer architectures,

547
00:20:34,080 --> 00:20:35,680
they can understand a reasoning trace.

548
00:20:35,680 --> 00:20:36,520
They can see the steps.

549
00:20:36,520 --> 00:20:37,520
They can see the logic.

550
00:20:37,520 --> 00:20:38,840
That builds confidence.

551
00:20:38,840 --> 00:20:40,120
The agent isn't mysterious.

552
00:20:40,120 --> 00:20:41,800
It's thinking.

553
00:20:41,800 --> 00:20:43,360
And the thinking is followable.

554
00:20:43,360 --> 00:20:45,680
Second, debugging. When something goes wrong,

555
00:20:45,680 --> 00:20:46,680
and something will go wrong,

556
00:20:46,680 --> 00:20:48,280
you can see exactly where it breaks.

557
00:20:48,280 --> 00:20:50,400
Did the agent retrieve the wrong context?

558
00:20:50,400 --> 00:20:51,600
You can see that in the trace.

559
00:20:51,600 --> 00:20:52,880
It looked at the wrong document.

560
00:20:52,880 --> 00:20:54,080
So its answer was wrong.

561
00:20:54,080 --> 00:20:55,000
You found the problem.

562
00:20:55,000 --> 00:20:55,800
You can fix it.

563
00:20:55,800 --> 00:20:58,600
Did the agent consider the right options but make a poor choice?

564
00:20:58,600 --> 00:21:00,240
You can see that in the trace too.

565
00:21:00,240 --> 00:21:01,880
The agent weighed the options correctly,

566
00:21:01,880 --> 00:21:04,560
but failed to account for some constraint you thought it would understand.

567
00:21:04,560 --> 00:21:05,880
You found a gap in the design.

568
00:21:05,880 --> 00:21:06,920
You can address it.

569
00:21:06,920 --> 00:21:09,000
Did the agent try to take an action that failed?

570
00:21:09,000 --> 00:21:10,520
You can see the failure point.

571
00:21:10,520 --> 00:21:12,680
The API call had unexpected parameters.

572
00:21:12,680 --> 00:21:14,160
The system rejected it for a reason.

573
00:21:14,160 --> 00:21:16,360
You can see exactly what happened and why.

574
00:21:16,360 --> 00:21:19,760
Without the reasoning trace, debugging is guesswork.

575
00:21:19,760 --> 00:21:21,040
The agent produced a wrong answer.

576
00:21:21,040 --> 00:21:21,560
Why?

577
00:21:21,560 --> 00:21:22,240
You don't know.

578
00:21:22,240 --> 00:21:23,040
Could be the data.

579
00:21:23,040 --> 00:21:23,840
Could be the model.

580
00:21:23,840 --> 00:21:24,680
Could be the prompting.

581
00:21:24,680 --> 00:21:26,080
Could be something environmental.

582
00:21:26,080 --> 00:21:27,320
You're stuck.

583
00:21:27,320 --> 00:21:29,800
With the reasoning trace, you can see the exact sequence.

584
00:21:29,800 --> 00:21:31,280
You can identify the failure point.

585
00:21:31,280 --> 00:21:33,000
You can fix it with precision.

586
00:21:33,000 --> 00:21:35,640
Third, compliance.

587
00:21:35,640 --> 00:21:37,440
Auditors care about traceability.

588
00:21:37,440 --> 00:21:39,120
If an agent makes a recommendation,

589
00:21:39,120 --> 00:21:41,280
can you prove what information it based that on?

590
00:21:41,280 --> 00:21:43,240
Can you show what reasoning led to the decision?

591
00:21:43,240 --> 00:21:46,520
Can you prove that the decision was sound given the available information?

592
00:21:46,520 --> 00:21:48,840
The reasoning trace provides exactly that.

593
00:21:48,840 --> 00:21:49,960
Here's what the agent saw.

594
00:21:49,960 --> 00:21:51,160
Here's how it processed it.

595
00:21:51,160 --> 00:21:52,880
Here's why it reached this conclusion.

596
00:21:52,880 --> 00:21:54,120
The audit trail is complete.

597
00:21:54,120 --> 00:21:54,880
It's detailed.

598
00:21:54,880 --> 00:21:57,080
It's defensible.

599
00:21:57,080 --> 00:22:01,080
But beyond compliance, the reasoning trace gives you something else.

600
00:22:01,080 --> 00:22:02,400
An emergency brake.

601
00:22:02,400 --> 00:22:04,040
If you're watching the reasoning in real time

602
00:22:04,040 --> 00:22:07,280
and you see the agent heading toward a bad decision, you can pause it.

603
00:22:07,280 --> 00:22:08,840
You can intervene, mid-thought.

604
00:22:08,840 --> 00:22:10,920
You can say, stop, don't take that action.

605
00:22:10,920 --> 00:22:12,040
The agent holds.

606
00:22:12,040 --> 00:22:13,600
The decision is not executed.

607
00:22:13,600 --> 00:22:16,200
You've prevented a problem before it became irreversible.

608
00:22:16,200 --> 00:22:19,920
But visibility and constraints only work if you're actually tracking what matters.

609
00:22:19,920 --> 00:22:22,080
If you're monitoring for the right signals.

610
00:22:22,080 --> 00:22:24,440
Behavioral monitoring and anomaly detection.

611
00:22:24,440 --> 00:22:25,800
You've built the walls.

612
00:22:25,800 --> 00:22:26,960
You've set the constraints.

613
00:22:26,960 --> 00:22:28,640
You've even established the reasoning trace

614
00:22:28,640 --> 00:22:31,040
so you can see exactly what the agent is doing.

615
00:22:31,040 --> 00:22:32,920
But none of that matters if you can't detect

616
00:22:32,920 --> 00:22:34,680
when something is actually wrong.

617
00:22:34,680 --> 00:22:38,200
Constraints are there to prevent unauthorized actions during normal operation.

618
00:22:38,200 --> 00:22:39,120
But here's the problem.

619
00:22:39,120 --> 00:22:40,680
An agent can be compromised.

620
00:22:40,680 --> 00:22:42,120
Its credentials can be stolen.

621
00:22:42,120 --> 00:22:43,600
Its instructions can be overwritten.

622
00:22:43,600 --> 00:22:46,400
It can be hijacked to do things it was never designed to do.

623
00:22:46,400 --> 00:22:48,840
And if you aren't actively monitoring for that compromise,

624
00:22:48,840 --> 00:22:51,600
you won't know it's happening until the damage is already done.

625
00:22:51,600 --> 00:22:54,000
Agent 365 integrates with Microsoft Defender

626
00:22:54,000 --> 00:22:55,680
to watch for agent misbehavior.

627
00:22:55,680 --> 00:22:58,200
We aren't just talking about policy violations here.

628
00:22:58,200 --> 00:23:00,680
Those are prevented by your role-based access controls.

629
00:23:00,680 --> 00:23:02,720
We're talking about anomalous behavior.

630
00:23:02,720 --> 00:23:06,240
These are patterns that don't match how this specific agent normally operates.

631
00:23:06,240 --> 00:23:08,560
They are signals that indicate something has changed,

632
00:23:08,560 --> 00:23:10,840
something is wrong, or something has been compromised.

633
00:23:10,840 --> 00:23:13,480
So what does misbehavior actually look like for an agent?

634
00:23:13,480 --> 00:23:15,160
It looks like unusual access patterns.

635
00:23:15,160 --> 00:23:16,840
Imagine an agent that normally reads

636
00:23:16,840 --> 00:23:18,960
from three specific SharePoint sites,

637
00:23:18,960 --> 00:23:22,440
suddenly tries to access 17 different sites it has never touched before.

638
00:23:22,440 --> 00:23:24,520
Technically that isn't a policy violation

639
00:23:24,520 --> 00:23:27,560
because the agent has broad read access across the organization.

640
00:23:27,560 --> 00:23:29,480
It's allowed to be there, but it isn't normal.

641
00:23:29,480 --> 00:23:32,520
The agent never does that, something changed, and an alert fires.

642
00:23:32,520 --> 00:23:34,240
And the security team investigates.

643
00:23:34,240 --> 00:23:36,240
It also looks like excessive tool usage.

644
00:23:36,240 --> 00:23:39,960
An agent that normally makes a few dozen API calls per workflow

645
00:23:39,960 --> 00:23:42,640
might suddenly start making hundreds of calls in a matter of seconds.

646
00:23:42,640 --> 00:23:43,600
It's hammering the system.

647
00:23:43,600 --> 00:23:46,560
Maybe it's trying to exfiltrate data by pulling everything it can access

648
00:23:46,560 --> 00:23:49,560
before someone notices, or maybe it's just stuck in a loop.

649
00:23:49,560 --> 00:23:51,400
Either way, it isn't normal operation.

650
00:23:51,400 --> 00:23:54,200
The anomaly detection system sees this spike, triggers an alert,

651
00:23:54,200 --> 00:23:55,440
and attention gets paid.
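A spike like this can be caught with a very simple baseline comparison. The sketch below is not how Microsoft Defender works internally; it is a generic z-score check, and `is_spike`, the threshold of 3.0, and the sample call counts are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def is_spike(history: list[int], current: int, threshold: float = 3.0) -> bool:
    """Flag `current` if it is more than `threshold` standard deviations above the baseline."""
    if len(history) < 2:
        return False  # not enough data to define "normal"
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly flat history: any increase is a deviation.
        return current > baseline
    return (current - baseline) / spread > threshold


# Hypothetical API-call counts from this agent's recent workflows.
calls_per_workflow = [24, 31, 28, 36, 27, 30]
```

A workflow with 33 calls stays within the learned range, while one with several hundred calls is flagged, whether the cause is exfiltration or a stuck loop.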

652
00:23:55,440 --> 00:23:58,440
Policy violations are caught by the authorization framework itself.

653
00:23:58,440 --> 00:24:01,080
The agent attempts an action it isn't allowed to perform,

654
00:24:01,080 --> 00:24:03,880
the system denies it, and the event is logged and reported.

655
00:24:03,880 --> 00:24:07,160
But the system isn't just logging denials, it's looking for patterns.

656
00:24:07,160 --> 00:24:10,480
Is the agent repeatedly attempting actions it isn't authorized for?

657
00:24:10,480 --> 00:24:14,440
Is it probing the boundaries of its permissions or testing what it can and cannot access?

658
00:24:14,440 --> 00:24:15,760
That pattern is suspicious.

659
00:24:15,760 --> 00:24:20,120
It suggests the agent is being manipulated by something trying to break out of its constraints.
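Looking for a pattern in denials, rather than just logging each one, can be sketched as a sliding-window counter. `DenialMonitor` and its default limits are hypothetical; timestamps are plain seconds supplied by the caller so the example stays deterministic.

```python
from collections import deque


class DenialMonitor:
    """Counts authorization denials for one agent inside a sliding time window."""

    def __init__(self, max_denials: int = 5, window_seconds: float = 60.0) -> None:
        self.max_denials = max_denials
        self.window_seconds = window_seconds
        self._denials: deque[float] = deque()

    def record_denial(self, timestamp: float) -> bool:
        """Record a denied call; return True if the agent looks like it is probing."""
        self._denials.append(timestamp)
        # Drop denials that have aged out of the window.
        while self._denials and timestamp - self._denials[0] > self.window_seconds:
            self._denials.popleft()
        return len(self._denials) >= self.max_denials
```

A single denied call is just a log entry; several inside one window is the boundary-testing pattern described above and would be a reasonable trigger for an alert.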

660
00:24:20,120 --> 00:24:21,600
Then there is cross-tenant activity.

661
00:24:21,600 --> 00:24:23,720
If an agent is compromised in one tenant,

662
00:24:23,720 --> 00:24:26,920
an attacker might use it to try to access another tenant's data.

663
00:24:26,920 --> 00:24:29,280
The agent attempts something that should be impossible,

664
00:24:29,280 --> 00:24:32,320
like accessing a resource outside its authorized boundary.

665
00:24:32,320 --> 00:24:35,480
The system detects the attempt, it fails, and the system logs it.

666
00:24:35,480 --> 00:24:38,160
When you see multiple failed attempts across multiple tenants,

667
00:24:38,160 --> 00:24:40,040
that is a clear signal of compromise.

668
00:24:40,040 --> 00:24:41,440
The agent is being weaponized.

669
00:24:41,440 --> 00:24:43,840
These anomalies trigger alerts in real time.

670
00:24:43,840 --> 00:24:45,360
This doesn't happen after the fact,

671
00:24:45,360 --> 00:24:47,520
or in a log file that someone reviews tomorrow.

672
00:24:47,520 --> 00:24:48,440
It happens right now.

673
00:24:48,440 --> 00:24:51,680
A security analyst sees the alert and can respond immediately.

674
00:24:51,680 --> 00:24:53,280
They can revoke the agent's credentials,

675
00:24:53,280 --> 00:24:55,760
pause its activity, and prevent further damage.

676
00:24:55,760 --> 00:24:57,000
This is the critical insight.

677
00:24:57,000 --> 00:25:00,320
Agents can be compromised exactly like user accounts can be compromised.

678
00:25:00,320 --> 00:25:03,480
When a user's password gets stolen and an attacker logs in,

679
00:25:03,480 --> 00:25:04,760
they are inside the system.

680
00:25:04,760 --> 00:25:07,080
They can access everything that user can access.

681
00:25:07,080 --> 00:25:09,000
They can read data, delete files,

682
00:25:09,000 --> 00:25:11,160
and send messages on behalf of that person.

683
00:25:11,160 --> 00:25:12,840
The breach is active and dangerous.

684
00:25:12,840 --> 00:25:16,160
An agent's credentials being stolen is the exact same scenario.

685
00:25:16,160 --> 00:25:17,760
Once an attacker has those credentials,

686
00:25:17,760 --> 00:25:19,440
they can make the agent do whatever they want.

687
00:25:19,440 --> 00:25:21,640
They can execute requests, access data,

688
00:25:21,640 --> 00:25:23,360
and abuse the agent's capabilities.

689
00:25:23,360 --> 00:25:26,040
The breach is just as dangerous and possibly more so.

690
00:25:26,040 --> 00:25:27,880
Because an agent can execute at scale,

691
00:25:27,880 --> 00:25:29,680
a compromised agent can do in seconds

692
00:25:29,680 --> 00:25:31,600
what would take a human hours to finish.

693
00:25:31,600 --> 00:25:32,960
It can read thousands of documents,

694
00:25:32,960 --> 00:25:34,560
make hundreds of API calls,

695
00:25:34,560 --> 00:25:36,360
and move laterally through your systems.

696
00:25:36,360 --> 00:25:37,880
Because of this, Defender for Agents

697
00:25:37,880 --> 00:25:40,320
applies the same threat detection that works for humans.

698
00:25:40,320 --> 00:25:41,720
It uses the same patterns,

699
00:25:41,720 --> 00:25:43,120
the same anomaly algorithms,

700
00:25:43,120 --> 00:25:44,680
and the same response mechanisms.

701
00:25:44,680 --> 00:25:47,120
An agent that has been compromised behaves differently

702
00:25:47,120 --> 00:25:48,120
from a normal agent,

703
00:25:48,120 --> 00:25:50,520
the same way a compromised user behaves differently

704
00:25:50,520 --> 00:25:51,720
from a normal user.

705
00:25:51,720 --> 00:25:53,280
The system learns what normal looks like,

706
00:25:53,280 --> 00:25:55,680
detects the deviations and enables a response.

707
00:25:55,680 --> 00:25:57,600
But all of this, the constraints,

708
00:25:57,600 --> 00:25:59,120
the visibility and the monitoring,

709
00:25:59,120 --> 00:26:01,520
it only works if you have a shared context layer

710
00:26:01,520 --> 00:26:03,520
that understands organizational state.

711
00:26:03,520 --> 00:26:05,520
You need something that can enforce these controls

712
00:26:05,520 --> 00:26:07,320
consistently across everything.

713
00:26:07,320 --> 00:26:09,680
And that is where WorkIQ becomes essential.

714
00:26:09,680 --> 00:26:11,480
What WorkIQ actually is.

715
00:26:11,480 --> 00:26:13,000
WorkIQ is not a database.

716
00:26:13,000 --> 00:26:14,160
You don't store anything in it.

717
00:26:14,160 --> 00:26:15,400
It's also not a search engine,

718
00:26:15,400 --> 00:26:18,160
so you don't query it the way you would query an index.

719
00:26:18,160 --> 00:26:19,680
It is something different from both.

720
00:26:19,680 --> 00:26:21,320
WorkIQ is a reasoning layer that sits

721
00:26:21,320 --> 00:26:23,400
between your agents and the actual systems

722
00:26:23,400 --> 00:26:24,480
where work happens.

723
00:26:24,480 --> 00:26:25,280
Think of it this way.

724
00:26:25,280 --> 00:26:27,040
Your organization has data everywhere.

725
00:26:27,040 --> 00:26:30,120
You have email in Exchange, documents in SharePoint,

726
00:26:30,120 --> 00:26:33,280
conversations in Teams and records in your ERP system.

727
00:26:33,280 --> 00:26:36,000
All of that data is distributed across different systems

728
00:26:36,000 --> 00:26:38,080
with their own access controls and structures.

729
00:26:38,080 --> 00:26:40,840
An agent needs to understand that data and act on it,

730
00:26:40,840 --> 00:26:42,200
but it doesn't need to copy it.

731
00:26:42,200 --> 00:26:45,560
WorkIQ lets the agent reason over that distributed data

732
00:26:45,560 --> 00:26:48,080
as if it were a single coherent system.

733
00:26:48,080 --> 00:26:50,640
WorkIQ exposes four specific capabilities.

734
00:26:50,640 --> 00:26:51,800
First is chat.

735
00:26:51,800 --> 00:26:53,640
Your agents can have conversational access

736
00:26:53,640 --> 00:26:55,800
to organizational knowledge through WorkIQ.

737
00:26:55,800 --> 00:26:57,720
This isn't done by querying a search engine

738
00:26:57,720 --> 00:26:58,960
or retrieving documents.

739
00:26:58,960 --> 00:27:01,000
It's done by having a reasoning conversation.

740
00:27:01,000 --> 00:27:03,520
Your agent can ask WorkIQ about the current status

741
00:27:03,520 --> 00:27:06,760
of a specific region and WorkIQ understands the context.

742
00:27:06,760 --> 00:27:09,480
It reasons over real-time signals and gives an answer

743
00:27:09,480 --> 00:27:11,400
that reflects what is happening right now,

744
00:27:11,400 --> 00:27:13,000
not what was documented a month ago.

745
00:27:13,000 --> 00:27:14,240
Second is context.

746
00:27:14,240 --> 00:27:16,520
This is the governed retrieval of tenant signals.

747
00:27:16,520 --> 00:27:17,920
When your agent needs to understand

748
00:27:17,920 --> 00:27:19,240
who is working on a project,

749
00:27:19,240 --> 00:27:21,880
WorkIQ returns more than just a list of names.

750
00:27:21,880 --> 00:27:23,320
It returns the people involved.

751
00:27:23,320 --> 00:27:25,560
Their current assignments and their availability.

752
00:27:25,560 --> 00:27:27,440
It's context, not just raw data.

753
00:27:27,440 --> 00:27:28,760
The agent gets interpreted,

754
00:27:28,760 --> 00:27:31,440
relevant information tailored to what it's trying to accomplish.

755
00:27:31,440 --> 00:27:33,640
Third is tools, which are your action endpoints.

756
00:27:33,640 --> 00:27:35,560
Your agent can send mail, update files,

757
00:27:35,560 --> 00:27:37,560
or schedule meetings through WorkIQ.

758
00:27:37,560 --> 00:27:39,240
These aren't raw API calls.

759
00:27:39,240 --> 00:27:41,120
They are WorkIQ-mediated actions.

760
00:27:41,120 --> 00:27:43,440
The permission model and policy constraints apply here

761
00:27:43,440 --> 00:27:45,800
and the reasoning trace captures exactly what happened.

762
00:27:45,800 --> 00:27:47,600
The agent isn't calling the API directly.

763
00:27:47,600 --> 00:27:49,440
It's asking WorkIQ to do something

764
00:27:49,440 --> 00:27:51,320
and WorkIQ handles the execution

765
00:27:51,320 --> 00:27:52,840
within the governance framework.
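The difference between a raw API call and a mediated action can be shown with a small mediator pattern. This is explicitly not the WorkIQ interface, which the episode does not describe at code level; `ToolMediator`, the grant table, and the tool names are hypothetical.

```python
from typing import Callable


class ToolMediator:
    """Hypothetical mediation layer: checks grants, runs the tool, records the outcome."""

    def __init__(self, grants: dict[str, set[str]]) -> None:
        self._grants = grants  # agent_id -> names of tools it may call
        self._tools: dict[str, Callable[..., str]] = {}
        self.trace: list[tuple[str, str, str]] = []  # (agent_id, tool, outcome)

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self._tools[name] = fn

    def invoke(self, agent_id: str, tool: str, **kwargs: str) -> str:
        # The agent never holds the tool itself; every call passes through here.
        if tool not in self._grants.get(agent_id, set()):
            self.trace.append((agent_id, tool, "denied"))
            raise PermissionError(f"{agent_id} may not call {tool}")
        result = self._tools[tool](**kwargs)
        self.trace.append((agent_id, tool, "ok"))
        return result


mediator = ToolMediator(grants={"agent-sched-01": {"schedule_meeting"}})
mediator.register("schedule_meeting", lambda title: f"scheduled: {title}")
mediator.register("send_mail", lambda to: f"sent to {to}")
```

Because permission checks and trace recording live in `invoke`, every tool gets the same governance without each tool having to implement it.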

766
00:27:52,840 --> 00:27:54,200
Fourth is workspaces.

767
00:27:54,200 --> 00:27:56,720
These provide persistent state for multi-step tasks.

768
00:27:56,720 --> 00:27:58,920
When an agent is working on something complex,

769
00:27:58,920 --> 00:28:01,680
it needs to remember decisions and track its progress.

770
00:28:01,680 --> 00:28:04,200
Workspaces give the agent a project-scoped space

771
00:28:04,200 --> 00:28:07,160
to maintain state across multiple reasoning steps.

772
00:28:07,160 --> 00:28:09,520
That state is persistent, bounded, and governed

773
00:28:09,520 --> 00:28:10,720
like everything else.

774
00:28:10,720 --> 00:28:13,440
The critical architectural detail changes everything.

775
00:28:13,440 --> 00:28:15,160
WorkIQ does not copy your data.

776
00:28:15,160 --> 00:28:18,080
It doesn't export it, vectorize it, or build a shadow index.

777
00:28:18,080 --> 00:28:19,960
Instead, it reasons over data in place.

778
00:28:19,960 --> 00:28:21,320
Your documents stay in SharePoint

779
00:28:21,320 --> 00:28:22,800
and your emails stay in Exchange.

780
00:28:22,800 --> 00:28:25,080
WorkIQ connects to those systems in real time

781
00:28:25,080 --> 00:28:26,760
to understand the data where it lives.

782
00:28:26,760 --> 00:28:29,120
It enforces permissions where they are already defined.

783
00:28:29,120 --> 00:28:32,160
It's a reasoning layer, not a data duplication layer.

784
00:28:32,160 --> 00:28:34,520
This leads to three major operational advantages.

785
00:28:34,520 --> 00:28:36,480
First, you avoid the data export nightmare.

786
00:28:36,480 --> 00:28:38,560
You aren't building parallel data stores

787
00:28:38,560 --> 00:28:40,680
or managing sync between the source of truth

788
00:28:40,680 --> 00:28:41,920
and a shadow index.

789
00:28:41,920 --> 00:28:44,720
The source of truth remains the only source of truth.

790
00:28:44,720 --> 00:28:46,840
Because WorkIQ reasons about it directly,

791
00:28:46,840 --> 00:28:49,080
your compliance obligations don't expand

792
00:28:49,080 --> 00:28:50,840
and your data governance doesn't fragment.

793
00:28:50,840 --> 00:28:53,080
You aren't creating new data residency problems

794
00:28:53,080 --> 00:28:54,560
or new security surfaces.

795
00:28:54,560 --> 00:28:56,680
Second, permissions are enforced at reasoning time,

796
00:28:56,680 --> 00:28:58,200
not retrieval time.

797
00:28:58,200 --> 00:28:59,960
When your agent asks WorkIQ a question,

798
00:28:59,960 --> 00:29:02,200
the system checks permissions as it reasons.

799
00:29:02,200 --> 00:29:04,960
It asks if the agent can see this data in real time.

800
00:29:04,960 --> 00:29:07,200
The answer it gives reflects what the agent is actually

801
00:29:07,200 --> 00:29:08,200
allowed to know.

802
00:29:08,200 --> 00:29:10,200
You aren't retrieving data and then filtering it.

803
00:29:10,200 --> 00:29:12,880
You are reasoning about the data you are authorized

804
00:29:12,880 --> 00:29:14,400
to see from the start.

805
00:29:14,400 --> 00:29:17,160
The permission boundary is built into the reasoning itself.
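As a rough illustration of "authorized from the start" versus "retrieve, then filter", here is a minimal Python sketch. It does not show how WorkIQ works internally. The document store, the agent IDs, and the function names are all hypothetical, and the keyword match is only a stand-in for the reasoning step.

```python
# Toy in-memory store. Every name and record here is made up for illustration.
DOCUMENTS = [
    {"id": "doc-1", "allowed": {"procurement-agent", "finance-agent"}, "text": "Q2 vendor list"},
    {"id": "doc-2", "allowed": {"finance-agent"}, "text": "Payroll summary"},
]


def visible_documents(agent_id: str) -> list[dict]:
    """Return only the documents this agent is permitted to see."""
    return [doc for doc in DOCUMENTS if agent_id in doc["allowed"]]


def answer(agent_id: str, keyword: str) -> list[str]:
    """Stand-in for a reasoning step that only ever receives authorized data."""
    candidates = visible_documents(agent_id)  # permission check happens first
    return [doc["id"] for doc in candidates if keyword.lower() in doc["text"].lower()]


print(answer("procurement-agent", "payroll"))
print(answer("finance-agent", "payroll"))
```

The point of the sketch is the ordering: the unauthorized document never reaches the matching step, so there is nothing to filter out afterwards.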

806
00:29:17,160 --> 00:29:19,560
Third, the audit trail is native to M365.

807
00:29:19,560 --> 00:29:21,640
You aren't bolting on logging after the fact

808
00:29:21,640 --> 00:29:24,320
or trying to make WorkIQ queryable from your SIEM.

809
00:29:24,320 --> 00:29:26,160
The action is captured directly in the systems

810
00:29:26,160 --> 00:29:27,560
where the action occurred.

811
00:29:27,560 --> 00:29:30,080
If an agent sends an email through WorkIQ, Exchange logs

812
00:29:30,080 --> 00:29:32,800
that email. If an agent updates a SharePoint record,

813
00:29:32,800 --> 00:29:34,160
SharePoint logs the change.

814
00:29:34,160 --> 00:29:37,000
Everything is auditable through the native M365 logging

815
00:29:37,000 --> 00:29:38,200
you already use.

816
00:29:38,200 --> 00:29:41,600
WorkIQ becomes the foundation that makes everything else possible.

817
00:29:41,600 --> 00:29:43,560
The constraints defined in Agent 365

818
00:29:43,560 --> 00:29:45,120
are enforced by WorkIQ.

819
00:29:45,120 --> 00:29:46,280
The reasoning trace you're watching

820
00:29:46,280 --> 00:29:47,920
is provided by WorkIQ.

821
00:29:47,920 --> 00:29:49,880
The governance that prevents unauthorized access

822
00:29:49,880 --> 00:29:51,520
is implemented by WorkIQ.

823
00:29:51,520 --> 00:29:53,280
But WorkIQ itself is single-threaded.

824
00:29:53,280 --> 00:29:56,040
It handles one agent and one conversation at a time.

825
00:29:56,040 --> 00:29:57,960
For organizational work, you need agents

826
00:29:57,960 --> 00:29:59,800
to coordinate and delegate to each other.

827
00:29:59,800 --> 00:30:02,040
You need a protocol for agent-to-agent communication.

828
00:30:02,040 --> 00:30:04,160
That is where A2A enters.

829
00:30:04,160 --> 00:30:06,520
A2A: the agent communication protocol.

830
00:30:06,520 --> 00:30:08,400
WorkIQ handles the intelligence.

831
00:30:08,400 --> 00:30:10,320
Agent 365 handles the governance.

832
00:30:10,320 --> 00:30:12,320
But neither of them solves the problem of agents

833
00:30:12,320 --> 00:30:13,720
needing to talk to each other.

834
00:30:13,720 --> 00:30:16,600
Real work doesn't happen when one agent answers one question.

835
00:30:16,600 --> 00:30:19,240
It happens when multiple agents collaborate to finish a goal.

836
00:30:19,240 --> 00:30:20,920
Think about how a workflow actually looks.

837
00:30:20,920 --> 00:30:22,840
Your procurement agent needs to check the budget

838
00:30:22,840 --> 00:30:24,520
before it approves a purchase.

839
00:30:24,520 --> 00:30:26,000
It can't do that analysis alone

840
00:30:26,000 --> 00:30:28,600
because the finance agent owns the budget data.

841
00:30:28,600 --> 00:30:30,720
The procurement agent has to ask a question.

842
00:30:30,720 --> 00:30:31,920
It needs to delegate.

843
00:30:31,920 --> 00:30:33,760
It needs to hand off a piece of the work.

844
00:30:33,760 --> 00:30:36,120
The obvious way to do this is natural language.

845
00:30:36,120 --> 00:30:37,840
The procurement agent writes a message in English

846
00:30:37,840 --> 00:30:40,000
asking if there is enough money for the purchase.

847
00:30:40,000 --> 00:30:41,960
It sends that message to the finance agent.

848
00:30:41,960 --> 00:30:43,520
Then the finance agent reads the English

849
00:30:43,520 --> 00:30:44,960
and tries to parse the request.

850
00:30:44,960 --> 00:30:46,520
It has to figure out what is being asked

851
00:30:46,520 --> 00:30:48,040
and reason about the answer.

852
00:30:48,040 --> 00:30:50,320
It writes a response in English and sends it back.

853
00:30:50,320 --> 00:30:52,520
Finally, the procurement agent reads that response

854
00:30:52,520 --> 00:30:53,640
and continues its work.

855
00:30:53,640 --> 00:30:55,120
This works for a single handoff.

856
00:30:55,120 --> 00:30:56,000
It's slow.

857
00:30:56,000 --> 00:30:56,600
But it works.

858
00:30:56,600 --> 00:30:59,920
But now imagine three agents involved, or five, or 20.

859
00:30:59,920 --> 00:31:01,560
Imagine agents trying to coordinate

860
00:31:01,560 --> 00:31:04,440
across dozens of different workflows at the same time.

861
00:31:04,440 --> 00:31:06,120
Every handoff creates latency.

862
00:31:06,120 --> 00:31:07,360
Parsing is expensive.

863
00:31:07,360 --> 00:31:09,160
Language is messy and ambiguous.

864
00:31:09,160 --> 00:31:10,840
The finance agent might misunderstand

865
00:31:10,840 --> 00:31:13,000
which budget the procurement agent is talking about.

866
00:31:13,000 --> 00:31:15,720
The procurement agent might misinterpret the answer.

867
00:31:15,720 --> 00:31:17,560
You are basically injecting errors into the system

868
00:31:17,560 --> 00:31:19,040
every time a conversation happens.

869
00:31:19,040 --> 00:31:20,040
But here's the problem.

870
00:31:20,040 --> 00:31:22,440
You're wasting massive amounts of compute.

871
00:31:22,440 --> 00:31:24,800
You are spending tokens to generate English text

872
00:31:24,800 --> 00:31:27,240
and parse English responses for coordination

873
00:31:27,240 --> 00:31:29,200
that doesn't need to be in English at all.

874
00:31:29,200 --> 00:31:31,440
It's like two computers trying to talk to each other

875
00:31:31,440 --> 00:31:34,080
using human voices instead of a network protocol.

876
00:31:34,080 --> 00:31:36,600
It technically works, but it's terrible in practice.

877
00:31:36,600 --> 00:31:38,400
A2A inverts this model.

878
00:31:38,400 --> 00:31:40,160
It's an open standard for agent communication

879
00:31:40,160 --> 00:31:42,160
that ignores natural language entirely.

880
00:31:42,160 --> 00:31:44,800
It uses structured messages, explicit schemas,

881
00:31:44,800 --> 00:31:46,840
and machine-readable formats.

882
00:31:46,840 --> 00:31:48,960
This is what an A2A message looks like.

883
00:31:48,960 --> 00:31:50,760
The procurement agent doesn't send a question.

884
00:31:50,760 --> 00:31:52,280
It sends a structured task.

885
00:31:52,280 --> 00:31:53,800
That task includes the operation

886
00:31:53,800 --> 00:31:56,920
like "check budget availability" and the parameters

887
00:31:56,920 --> 00:31:58,680
like the cost center and fiscal year.

888
00:31:58,680 --> 00:32:00,760
It includes the context the other agent needs

889
00:32:00,760 --> 00:32:02,960
such as vendor details and urgency.

890
00:32:02,960 --> 00:32:05,400
It even defines the exact format for the response

891
00:32:05,400 --> 00:32:07,840
like a JSON file showing the remaining budget.

892
00:32:07,840 --> 00:32:10,240
It also includes metadata to prove the request

893
00:32:10,240 --> 00:32:11,800
is legitimate and authorized.
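A minimal Python sketch of the kind of structured task described here. The field names, the cost center value, and the placeholder token are illustrative assumptions. This is not the actual A2A wire format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TaskRequest:
    """Hypothetical structured task. Field names are illustrative, not the A2A spec."""

    operation: str          # what the receiving agent should do
    parameters: dict        # inputs such as cost center and fiscal year
    context: dict           # extra details the receiver needs
    response_schema: dict   # the shape the answer must take
    auth: dict = field(default_factory=dict)  # metadata showing the request is authorized


task = TaskRequest(
    operation="check_budget_availability",
    parameters={"cost_center": "CC-1001", "fiscal_year": 2026},
    context={"vendor": "Example Vendor", "urgency": "high"},
    response_schema={"remaining_budget": "number", "currency": "string"},
    auth={"requester": "procurement-agent", "token": "<placeholder>"},
)

# Serialize to JSON so the receiver can parse it without any language model.
wire_message = json.dumps(asdict(task), sort_keys=True)
print(wire_message)
```

The receiving agent can load this with `json.loads` and read each field directly, with no interpretation of free text.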

894
00:32:11,800 --> 00:32:13,360
The finance agent receives this task

895
00:32:13,360 --> 00:32:14,840
and there is zero ambiguity.

896
00:32:14,840 --> 00:32:16,040
The schema is explicit.

897
00:32:16,040 --> 00:32:17,280
The parameters are clear.

898
00:32:17,280 --> 00:32:19,000
The output format is already set.

899
00:32:19,000 --> 00:32:22,400
The finance agent processes the request, validates the budget,

900
00:32:22,400 --> 00:32:24,960
and returns a response that fits the schema perfectly:

901
00:32:24,960 --> 00:32:29,160
structured data, machine-readable, no interpretation required.

902
00:32:29,160 --> 00:32:31,280
This is fundamentally different from agents chatting

903
00:32:31,280 --> 00:32:32,160
with each other.

904
00:32:32,160 --> 00:32:33,360
There is no conversation.

905
00:32:33,360 --> 00:32:34,920
There is no back-and-forth negotiation.

906
00:32:34,920 --> 00:32:37,520
There is just a task, a request and a response,

907
00:32:37,520 --> 00:32:39,720
clean delegation, verifiable completion.

908
00:32:39,720 --> 00:32:40,560
And this is the shift.

909
00:32:40,560 --> 00:32:42,760
You get multi-agent orchestration at cloud speed.

910
00:32:42,760 --> 00:32:44,920
When agents use structured messages,

911
00:32:44,920 --> 00:32:47,840
coordination happens in milliseconds instead of seconds.

912
00:32:47,840 --> 00:32:50,040
An orchestrator agent can hand off tasks

913
00:32:50,040 --> 00:32:52,800
to 10 specialized agents at the exact same time.

914
00:32:52,800 --> 00:32:55,080
Each one gets a structured request and processes it

915
00:32:55,080 --> 00:32:55,960
at full speed.

916
00:32:55,960 --> 00:32:59,040
The orchestrator pulls the responses together and moves on.
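The fan-out pattern can be sketched with Python's asyncio. The agent names and the simulated 10 ms delay are assumptions for illustration. `call_agent` is a hypothetical stand-in for a real network call.

```python
import asyncio


async def call_agent(name: str, task: dict) -> dict:
    """Hypothetical stand-in for a network call to a specialized agent."""
    await asyncio.sleep(0.01)  # simulated processing time
    return {"agent": name, "operation": task["operation"], "status": "completed"}


async def orchestrate(agent_names: list[str], task: dict) -> list[dict]:
    # Dispatch every request at once, then wait for all of the responses.
    return await asyncio.gather(*(call_agent(name, task) for name in agent_names))


agents = [f"agent-{i}" for i in range(10)]
results = asyncio.run(orchestrate(agents, {"operation": "status_check"}))
print(len(results), "responses collected")
```

Because the calls run concurrently, the total wait is close to one call's duration rather than ten of them added together.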

917
00:32:59,040 --> 00:33:01,360
The whole process is finished before a human could even

918
00:33:01,360 --> 00:33:03,720
finish reading the first sentence of an English request.

919
00:33:03,720 --> 00:33:05,480
You get clear delegation boundaries.

920
00:33:05,480 --> 00:33:07,600
When the schema defines the data and the format,

921
00:33:07,600 --> 00:33:09,840
you can't miscommunicate about the scope of the work.

922
00:33:09,840 --> 00:33:11,880
The procurement agent can't accidentally ask

923
00:33:11,880 --> 00:33:14,360
for sensitive payroll info because the schema doesn't allow it.

924
00:33:14,360 --> 00:33:15,560
The boundary is structural.

925
00:33:15,560 --> 00:33:18,240
It's enforced by the system, not suggested by a prompt.
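A structural boundary like this can be sketched as a validator that rejects anything outside the declared schema. The operation names and allowed parameters below are hypothetical.

```python
# Hypothetical schema: the only operation this finance endpoint accepts,
# and the only parameters that operation may carry.
ALLOWED_OPERATIONS = {
    "check_budget_availability": {"cost_center", "fiscal_year"},
}


def validate_task(operation: str, parameters: dict) -> None:
    """Raise ValueError for any request that falls outside the schema."""
    if operation not in ALLOWED_OPERATIONS:
        raise ValueError(f"operation not allowed: {operation}")
    unexpected = set(parameters) - ALLOWED_OPERATIONS[operation]
    if unexpected:
        raise ValueError(f"unexpected parameters: {sorted(unexpected)}")


# A well-formed budget check passes silently.
validate_task("check_budget_availability", {"cost_center": "CC-1001", "fiscal_year": 2026})

# A payroll request is rejected before any agent reasons about it.
try:
    validate_task("read_payroll", {"employee_id": "E-42"})
    rejected = False
except ValueError:
    rejected = True
print("payroll request rejected:", rejected)
```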

926
00:33:18,240 --> 00:33:19,680
The handoff becomes efficient.

927
00:33:19,680 --> 00:33:21,760
You aren't generating or parsing English,

928
00:33:21,760 --> 00:33:23,480
so you're just moving information.

929
00:33:23,480 --> 00:33:26,000
The network cost is lower and the compute cost is lower.

930
00:33:26,000 --> 00:33:28,280
The entire system is faster and cheaper to run.

931
00:33:28,280 --> 00:33:30,480
You also get verifiable task completion.

932
00:33:30,480 --> 00:33:32,880
Every handoff is logged with a status and a timestamp.

933
00:33:32,880 --> 00:33:34,840
You can trace the whole workflow and prove exactly

934
00:33:34,840 --> 00:33:36,480
when the finance agent received the request

935
00:33:36,480 --> 00:33:37,520
and what it returned.

936
00:33:37,520 --> 00:33:38,920
The audit trail is complete.

937
00:33:38,920 --> 00:33:41,560
You can replay the entire sequence whenever you need to.
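A minimal sketch of a handoff log with a status and a timestamp per entry. The record layout and the agent names are assumptions, and a real system would write to durable storage rather than a list in memory.

```python
from datetime import datetime, timezone

audit_log: list[dict] = []  # in-memory stand-in for a durable audit store


def record_handoff(sender: str, receiver: str, operation: str, status: str) -> dict:
    """Append one handoff event with a UTC timestamp and return it."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "sender": sender,
        "receiver": receiver,
        "operation": operation,
        "status": status,
    }
    audit_log.append(entry)
    return entry


record_handoff("procurement-agent", "finance-agent", "check_budget_availability", "received")
record_handoff("procurement-agent", "finance-agent", "check_budget_availability", "completed")

# Replaying the workflow is just reading the entries back in order.
for entry in audit_log:
    print(entry["timestamp"], entry["operation"], entry["status"])
```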

938
00:33:41,560 --> 00:33:43,720
A2A is the protocol layer that makes real

939
00:33:43,720 --> 00:33:45,160
multi-agent systems possible.

940
00:33:45,160 --> 00:33:46,760
It's not about one agent at a time.

941
00:33:46,760 --> 00:33:48,400
It's about swarms of agents coordinating

942
00:33:48,400 --> 00:33:50,280
in real time over shared work.

943
00:33:50,280 --> 00:33:53,360
Performance: A2A versus MCP benchmarks.

944
00:33:53,360 --> 00:33:56,360
Now that you see what A2A does, structured communication

945
00:33:56,360 --> 00:33:57,600
instead of language parsing,

946
00:33:57,600 --> 00:34:00,040
the question is simple, does it actually matter?

947
00:34:00,040 --> 00:34:02,640
Does this architectural shift lead to real results?

948
00:34:02,640 --> 00:34:05,040
Or is it just a theoretical win that disappears

949
00:34:05,040 --> 00:34:05,880
when you scale up?

950
00:34:05,880 --> 00:34:08,200
The benchmarks are where the theory meets reality

951
00:34:08,200 --> 00:34:09,160
and they are stark.

952
00:34:09,160 --> 00:34:12,800
A2A handles 12,500 requests per second.

953
00:34:12,800 --> 00:34:15,480
MCP, which is the other major communication standard,

954
00:34:15,480 --> 00:34:17,080
handles 7,800.

955
00:34:17,080 --> 00:34:19,320
That is a 60% advantage in throughput.

956
00:34:19,320 --> 00:34:20,720
When one agent has to coordinate

957
00:34:20,720 --> 00:34:22,880
with 10 specialized agents at once,

958
00:34:22,880 --> 00:34:25,440
that gap is the difference between a responsive system

959
00:34:25,440 --> 00:34:26,800
and a total bottleneck.

960
00:34:26,800 --> 00:34:28,680
The latency numbers show the same thing.

961
00:34:28,680 --> 00:34:31,880
A2A averages 3.2 milliseconds per request,

962
00:34:31,880 --> 00:34:34,400
while MCP averages 8.5 milliseconds.

963
00:34:34,400 --> 00:34:36,760
That is a 62% reduction in delay.

964
00:34:36,760 --> 00:34:38,440
This is the difference between a user feeling

965
00:34:38,440 --> 00:34:41,560
like the agent is instant versus feeling a noticeable lag.

966
00:34:41,560 --> 00:34:43,360
3 milliseconds is invisible to a human,

967
00:34:43,360 --> 00:34:45,560
but 8 milliseconds starts to feel sluggish.

968
00:34:45,560 --> 00:34:47,360
If you have five agents coordinating,

969
00:34:47,360 --> 00:34:50,240
you're looking at 16 milliseconds versus 42 and a half.

970
00:34:50,240 --> 00:34:52,400
People feel that. It becomes friction.
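The percentages follow directly from the figures quoted in the episode. This short Python check takes those figures as given and assumes the five handoffs happen one after another.

```python
# Figures as quoted in the episode. They are not independently verified here.
a2a_rps, mcp_rps = 12_500, 7_800   # requests per second
a2a_ms, mcp_ms = 3.2, 8.5          # average latency per request, in milliseconds

throughput_advantage_pct = (a2a_rps / mcp_rps - 1) * 100
latency_reduction_pct = (1 - a2a_ms / mcp_ms) * 100

# Five coordination steps in sequence, each paying the average latency once.
five_steps_a2a_ms = 5 * a2a_ms
five_steps_mcp_ms = 5 * mcp_ms

print(round(throughput_advantage_pct), round(latency_reduction_pct))
print(round(five_steps_a2a_ms, 1), round(five_steps_mcp_ms, 1))
```

The throughput advantage works out to roughly 60 percent and the latency reduction to roughly 62 percent. Five sequential steps come to 16.0 and 42.5 milliseconds.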

971
00:34:52,400 --> 00:34:54,880
The resource efficiency gap is even wider.

972
00:34:54,880 --> 00:34:58,160
A2A hits 92% efficiency, meaning it uses compute

973
00:34:58,160 --> 00:35:00,360
and bandwidth almost perfectly.

974
00:35:00,360 --> 00:35:03,320
MCP runs at 75%. In a cloud environment

975
00:35:03,320 --> 00:35:05,120
where you pay for every second of compute,

976
00:35:05,120 --> 00:35:07,960
that 17-point difference adds up fast across a whole fleet

977
00:35:07,960 --> 00:35:10,000
of agents. But these numbers only matter

978
00:35:10,000 --> 00:35:12,000
if you see what they mean in practice.

979
00:35:12,000 --> 00:35:13,240
At an organizational scale,

980
00:35:13,240 --> 00:35:14,880
these small gaps don't stay small.

981
00:35:14,880 --> 00:35:15,880
They compound.

982
00:35:15,880 --> 00:35:18,200
You aren't just running two agents having one conversation.

983
00:35:18,200 --> 00:35:19,440
You're running hundreds of agents

984
00:35:19,440 --> 00:35:21,800
and thousands of coordinations every single hour.

985
00:35:21,800 --> 00:35:24,560
Every time a coordination takes 8.5 milliseconds

986
00:35:24,560 --> 00:35:26,880
instead of 3.2, you lose time.

987
00:35:26,880 --> 00:35:29,560
By the end of the day, you've lost hours of throughput.

988
00:35:29,560 --> 00:35:31,720
By the end of the month, you've lost capacity.

989
00:35:31,720 --> 00:35:33,200
You end up needing more infrastructure

990
00:35:33,200 --> 00:35:34,800
just to do the same amount of work.

991
00:35:34,800 --> 00:35:37,480
One slow agent can bottleneck the entire workflow.

992
00:35:37,480 --> 00:35:40,040
If your orchestrator sends requests to 10 agents

993
00:35:40,040 --> 00:35:43,280
and nine of them use A2A, they finish in five milliseconds.

994
00:35:43,280 --> 00:35:46,560
But if the 10th agent uses MCP and takes 10 milliseconds,

995
00:35:46,560 --> 00:35:47,960
the orchestrator has to wait.

996
00:35:47,960 --> 00:35:51,320
The whole workflow is only as fast as that one slow component.

997
00:35:51,320 --> 00:35:52,880
That agent becomes your ceiling.
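In a parallel fan-out, the orchestrator finishes when the slowest response arrives. A tiny Python sketch using the example's numbers shows this.

```python
# Nine agents respond in 5 ms and one responds in 10 ms, as in the example above.
latencies_ms = [5] * 9 + [10]

# Requests run in parallel, so completion time is the maximum, not the sum.
parallel_completion_ms = max(latencies_ms)

# Without the slow agent, the same fan-out would finish in 5 ms.
without_slow_agent_ms = max(latencies_ms[:9])

print(parallel_completion_ms, without_slow_agent_ms)
```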

998
00:35:52,880 --> 00:35:54,120
But here's the deeper problem.

999
00:35:54,120 --> 00:35:56,040
When agents coordinate in real time,

1000
00:35:56,040 --> 00:35:57,960
latency changes how they behave.

1001
00:35:57,960 --> 00:36:00,560
Real-time coordination means agents make decisions

1002
00:36:00,560 --> 00:36:02,480
based on what is happening right now.

1003
00:36:02,480 --> 00:36:05,120
A2A is fast enough that the agent gets the fresh state

1004
00:36:05,120 --> 00:36:06,360
of the organization.

1005
00:36:06,360 --> 00:36:07,960
The decision is based on reality.

1006
00:36:07,960 --> 00:36:10,880
With MCP, the data is slightly stale by the time the agent gets it.

1007
00:36:10,880 --> 00:36:13,160
It's only a few milliseconds, but in a fast environment,

1008
00:36:13,160 --> 00:36:16,080
that's enough to make a decision based on outdated information.

1009
00:36:16,080 --> 00:36:17,800
The real insight is architectural.

1010
00:36:17,800 --> 00:36:21,440
These benchmarks prove A2A was built for agentic density.

1011
00:36:21,440 --> 00:36:22,840
It's for many agents working together,

1012
00:36:22,840 --> 00:36:25,240
not just one smart agent talking to a user.

1013
00:36:25,240 --> 00:36:26,880
The performance isn't a side effect.

1014
00:36:26,880 --> 00:36:27,760
It's the whole point.

1015
00:36:27,760 --> 00:36:29,640
A2A scales horizontally.

1016
00:36:29,640 --> 00:36:32,040
You can add more agents and the performance holds.

1017
00:36:32,040 --> 00:36:33,840
If you add more agents to an MCP system,

1018
00:36:33,840 --> 00:36:35,600
you hit the ceiling much faster.

1019
00:36:35,600 --> 00:36:38,920
This matters because the future isn't one agent per department.

1020
00:36:38,920 --> 00:36:40,680
It's a swarm of specialized agents,

1021
00:36:40,680 --> 00:36:42,840
sales, compliance, finance, and operations,

1022
00:36:42,840 --> 00:36:44,520
all talking at real-time speeds.

1023
00:36:44,520 --> 00:36:47,440
That requires a protocol designed for density from day one.

1024
00:36:47,440 --> 00:36:50,360
That's A2A. Speed and efficiency are only half the story.

1025
00:36:50,360 --> 00:36:52,440
There is another side to this: cost.

1026
00:36:52,440 --> 00:36:54,720
And in the reasoning economy, cost is architecture.

1027
00:36:54,720 --> 00:36:57,760
Agentic density: many agents, one fabric.

1028
00:36:57,760 --> 00:37:00,440
The future of work isn't one smart agent that does everything.

1029
00:37:00,440 --> 00:37:01,680
That's not the direction.

1030
00:37:01,680 --> 00:37:03,800
The direction is many specialized agents.

1031
00:37:03,800 --> 00:37:05,520
Each built for a specific domain,

1032
00:37:05,520 --> 00:37:08,680
each with deep expertise in its area, all working together.

1033
00:37:08,680 --> 00:37:10,360
Picture your organization's agent layer.

1034
00:37:10,360 --> 00:37:12,680
You have a sales agent that understands pipeline.

1035
00:37:12,680 --> 00:37:14,560
It doesn't just look at CRM records.

1036
00:37:14,560 --> 00:37:17,080
It understands the context: which deals are moving fast,

1037
00:37:17,080 --> 00:37:19,120
which are stalled, which have risk signals.

1038
00:37:19,120 --> 00:37:20,520
Because it's built for that domain,

1039
00:37:20,520 --> 00:37:23,120
it sees data through the lens of sales operations

1040
00:37:23,120 --> 00:37:24,760
and knows exactly what matters.

1041
00:37:24,760 --> 00:37:27,200
Then you have a compliance agent that knows regulations.

1042
00:37:27,200 --> 00:37:29,000
It doesn't just search for policy documents.

1043
00:37:29,000 --> 00:37:31,360
It reasons about regulatory obligations.

1044
00:37:31,360 --> 00:37:34,040
It understands which rules apply to which business operations

1045
00:37:34,040 --> 00:37:36,480
and catches edge cases where overlapping regulations

1046
00:37:36,480 --> 00:37:37,520
create contradictions.

1047
00:37:37,520 --> 00:37:40,040
It's a domain expert in regulation and compliance.

1048
00:37:40,040 --> 00:37:41,680
An operations agent tracks your inventory.

1049
00:37:41,680 --> 00:37:43,440
It doesn't store inventory data locally.

1050
00:37:43,440 --> 00:37:45,080
It reasons over the live system.

1051
00:37:45,080 --> 00:37:47,480
It understands warehouse constraints, supplier relationships,

1052
00:37:47,480 --> 00:37:48,560
and lead times.

1053
00:37:48,560 --> 00:37:50,480
So it can recommend reordering decisions.

1054
00:37:50,480 --> 00:37:52,200
It understands the operational complexity

1055
00:37:52,200 --> 00:37:54,320
because it's built for that specific domain.

1056
00:37:54,320 --> 00:37:56,920
A customer service agent handles escalations.

1057
00:37:56,920 --> 00:37:58,680
It understands customer sentiment

1058
00:37:58,680 --> 00:38:01,040
and knows which issues can be resolved automatically

1059
00:38:01,040 --> 00:38:02,720
versus which need human judgment.

1060
00:38:02,720 --> 00:38:05,240
It has context about customer history, account status,

1061
00:38:05,240 --> 00:38:06,400
and support tiers.

1062
00:38:06,400 --> 00:38:08,720
It's specialized for customer interaction.

1063
00:38:08,720 --> 00:38:11,960
A finance agent reasons about costs, budget, and cash flow.

1064
00:38:11,960 --> 00:38:14,000
A contracts agent understands legal obligations

1065
00:38:14,000 --> 00:38:15,080
and renewal dates.

1066
00:38:15,080 --> 00:38:17,240
A project management agent tracks dependencies

1067
00:38:17,240 --> 00:38:18,400
and critical paths.

1068
00:38:18,400 --> 00:38:21,320
An HR agent manages leave policies and benefits enrollment.

1069
00:38:21,320 --> 00:38:22,440
Each one is a specialist.

1070
00:38:22,440 --> 00:38:24,080
Each one is built for its domain.

1071
00:38:24,080 --> 00:38:26,640
But here's the thing, these agents don't work in isolation.

1072
00:38:26,640 --> 00:38:28,080
They coordinate.

1073
00:38:28,080 --> 00:38:30,040
The customer service agent gets an escalation

1074
00:38:30,040 --> 00:38:32,400
from an angry customer because their invoice is wrong.

1075
00:38:32,400 --> 00:38:34,160
The agent can't fix billing issues.

1076
00:38:34,160 --> 00:38:35,320
That's not its domain.

1077
00:38:35,320 --> 00:38:37,120
So it delegates to the finance agent.

1078
00:38:37,120 --> 00:38:39,640
Check this customer's account, verify the latest invoice,

1079
00:38:39,640 --> 00:38:41,640
identify any calculation errors.

1080
00:38:41,640 --> 00:38:43,760
The finance agent processes the request.

1081
00:38:43,760 --> 00:38:44,600
It checks.

1082
00:38:44,600 --> 00:38:45,680
It finds the error.

1083
00:38:45,680 --> 00:38:47,080
It returns the result.

1084
00:38:47,080 --> 00:38:49,840
Invoice dated March 15 has a line-item overcharge

1085
00:38:49,840 --> 00:38:53,080
of $47,300 because it was manually entered

1086
00:38:53,080 --> 00:38:55,520
instead of imported from the system.

1087
00:38:55,520 --> 00:38:58,040
The customer service agent now has the information it needs.

1088
00:38:58,040 --> 00:38:59,600
It knows what to tell the customer

1089
00:38:59,600 --> 00:39:02,440
and escalates the correction to a human finance specialist

1090
00:39:02,440 --> 00:39:03,360
if needed.

1091
00:39:03,360 --> 00:39:05,240
Simultaneously, the project management agent

1092
00:39:05,240 --> 00:39:06,760
is tracking a product launch.

1093
00:39:06,760 --> 00:39:08,560
It needs to know budget implications.

1094
00:39:08,560 --> 00:39:10,960
The sales agent is forecasting revenue impact.

1095
00:39:10,960 --> 00:39:13,440
The operations agent is calculating fulfillment capacity.

1096
00:39:13,440 --> 00:39:15,040
They're all reasoning about the same launch

1097
00:39:15,040 --> 00:39:16,280
from different angles.

1098
00:39:16,280 --> 00:39:17,680
And they need to coordinate.

1099
00:39:17,680 --> 00:39:20,160
The project agent asks the operations agent,

1100
00:39:20,160 --> 00:39:22,520
can we fulfill 50,000 units in Q2?

1101
00:39:22,520 --> 00:39:24,440
Operations checks capacity.

1102
00:39:24,440 --> 00:39:26,960
It knows current utilization, seasonal patterns

1103
00:39:26,960 --> 00:39:28,400
and supplier constraints.

1104
00:39:28,400 --> 00:39:32,000
We can fulfill 50,000 units, but only with expedited procurement,

1105
00:39:32,000 --> 00:39:33,920
which adds $7 per unit.

1106
00:39:33,920 --> 00:39:36,520
The project agent gets the answer and asks finance.

1107
00:39:36,520 --> 00:39:39,560
What is the impact of $350,000 in expedited procurement

1108
00:39:39,560 --> 00:39:41,640
costs on overall project margins?

1109
00:39:41,640 --> 00:39:44,440
Finance runs the calculation and returns the result.
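The $350,000 figure is 50,000 units times $7 per unit. The sketch below reproduces that and shows what a margin calculation could look like. The revenue and base cost are hypothetical numbers chosen only to show the shape of the calculation.

```python
units = 50_000
expedite_cost_per_unit = 7  # dollars per unit, from the operations agent's answer
expedite_total = units * expedite_cost_per_unit


def margin(revenue: float, cost: float) -> float:
    """Return profit margin as a fraction of revenue."""
    return (revenue - cost) / revenue


# Hypothetical project figures. They are not from the episode.
revenue = 2_000_000
base_cost = 1_200_000

margin_before = margin(revenue, base_cost)
margin_after = margin(revenue, base_cost + expedite_total)

print(expedite_total, round(margin_before, 3), round(margin_after, 3))
```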

1110
00:39:44,440 --> 00:39:46,760
All three agents have coordinated their analysis.

1111
00:39:46,760 --> 00:39:48,960
And the human decision maker gets comprehensive input

1112
00:39:48,960 --> 00:39:50,160
from every domain.

1113
00:39:50,160 --> 00:39:52,360
This is organizational reasoning at scale.

1114
00:39:52,360 --> 00:39:54,400
It's not one smart agent doing everything poorly.

1115
00:39:54,400 --> 00:39:57,480
It's multiple domain expert agents coordinating in real time,

1116
00:39:57,480 --> 00:39:59,600
each reasoning about what it knows best,

1117
00:39:59,600 --> 00:40:01,960
each passing information to agents that need it.

1118
00:40:01,960 --> 00:40:03,760
A2A makes this possible.

1119
00:40:03,760 --> 00:40:06,080
The agents communicate through structured messages.

1120
00:40:06,080 --> 00:40:09,000
No parsing ambiguity, no latency delays stacking up.

1121
00:40:09,000 --> 00:40:11,600
When five agents are coordinating on a single decision,

1122
00:40:11,600 --> 00:40:14,560
the last response can't arrive eight seconds after the first,

1123
00:40:14,560 --> 00:40:16,160
or the coordination breaks down.

1124
00:40:16,160 --> 00:40:18,400
A2A keeps latency in the millisecond range,

1125
00:40:18,400 --> 00:40:20,040
so the coordination stays real time.

1126
00:40:20,040 --> 00:40:21,680
MCP doesn't scale to this density.

1127
00:40:21,680 --> 00:40:23,520
The coordination overhead becomes the bottleneck.

1128
00:40:23,520 --> 00:40:26,240
You can run a few agents, but you hit a ceiling quickly.

1129
00:40:26,240 --> 00:40:28,720
A2A was designed for this, for agentic density.

1130
00:40:28,720 --> 00:40:31,040
For many agents working together at cloud speed.

1131
00:40:31,040 --> 00:40:33,200
The architectural implication is profound.

1132
00:40:33,200 --> 00:40:34,760
You're not building a copilot.

1133
00:40:34,760 --> 00:40:36,760
You're not deploying a single smart assistant.

1134
00:40:36,760 --> 00:40:38,880
You're building an agentic operating system,

1135
00:40:38,880 --> 00:40:40,760
an infrastructure layer where your organization

1136
00:40:40,760 --> 00:40:42,600
reasons through specialized agents.

1137
00:40:42,600 --> 00:40:44,600
WorkIQ is the shared context layer.

1138
00:40:44,600 --> 00:40:47,320
All agents access the same organizational reality.

1139
00:40:47,320 --> 00:40:49,840
Agent 365 is the governance layer.

1140
00:40:49,840 --> 00:40:52,480
All agents operate within consistent policy boundaries.

1141
00:40:52,480 --> 00:40:54,360
A2A is the coordination protocol.

1142
00:40:54,360 --> 00:40:56,200
All agents communicate at cloud speed.

1143
00:40:56,200 --> 00:40:58,120
It's not an additive improvement on the old model.

1144
00:40:58,120 --> 00:41:00,760
It's a structural transformation. But that transformation

1145
00:41:00,760 --> 00:41:01,720
has a cost.

1146
00:41:01,720 --> 00:41:04,240
And understanding that cost is where the reasoning economy

1147
00:41:04,240 --> 00:41:05,440
comes into view.

1148
00:41:05,440 --> 00:41:06,760
The consumption billing model.

1149
00:41:06,760 --> 00:41:08,360
You've now got the infrastructure in place.

1150
00:41:08,360 --> 00:41:10,160
Agents with identities, policies

1151
00:41:10,160 --> 00:41:12,720
enforced at the system level, real-time visibility

1152
00:41:12,720 --> 00:41:13,720
into reasoning.

1153
00:41:13,720 --> 00:41:15,480
Multiple specialized agents coordinating

1154
00:41:15,480 --> 00:41:17,360
through A2A at cloud speed.

1155
00:41:17,360 --> 00:41:18,920
It's an elegant architecture.

1156
00:41:18,920 --> 00:41:21,880
It solves the governance problem and the coordination problem.

1157
00:41:21,880 --> 00:41:24,280
But it introduces a new problem, the financial one.

1158
00:41:24,280 --> 00:41:25,200
How do you pay for this?

1159
00:41:25,200 --> 00:41:28,720
For years, Microsoft 365 operated on a predictable model.

1160
00:41:28,720 --> 00:41:29,720
You bought licenses.

1161
00:41:29,720 --> 00:41:30,960
You assigned them to users.

1162
00:41:30,960 --> 00:41:32,440
You paid per user per month.

1163
00:41:32,440 --> 00:41:34,600
The math was straightforward.

1164
00:41:34,600 --> 00:41:39,280
A thousand users times $20 per user equals $20,000 a month.

1165
00:41:39,280 --> 00:41:41,640
Predictable, linear, simple.

1166
00:41:41,640 --> 00:41:43,320
WorkIQ doesn't fit that model.

1167
00:41:43,320 --> 00:41:44,880
It's not consumed per user.

1168
00:41:44,880 --> 00:41:47,520
A thousand users might generate zero WorkIQ reasoning

1169
00:41:47,520 --> 00:41:48,920
if they're not using agents.

1170
00:41:48,920 --> 00:41:50,880
Or they might generate millions of reasoning steps

1171
00:41:50,880 --> 00:41:53,320
if they're working with agent-driven workflows all day.

1172
00:41:53,320 --> 00:41:54,960
The consumption is variable.

1173
00:41:54,960 --> 00:41:56,400
It's unpredictable.

1174
00:41:56,400 --> 00:41:58,840
Per user licensing breaks.

1175
00:41:58,840 --> 00:42:00,520
So Microsoft made a different choice.

1176
00:42:00,520 --> 00:42:04,600
Starting June 16th, 2026, WorkIQ moves to consumption billing.

1177
00:42:04,600 --> 00:42:05,600
Not per user.

1178
00:42:05,600 --> 00:42:06,480
Not per agent.

1179
00:42:06,480 --> 00:42:07,000
Per use.

1180
00:42:07,000 --> 00:42:07,520
You use it.

1181
00:42:07,520 --> 00:42:08,160
You pay for it.

1182
00:42:08,160 --> 00:42:09,760
The currency is co-pilot credits.

1183
00:42:09,760 --> 00:42:11,880
Think of co-pilot credits as a unified meter

1184
00:42:11,880 --> 00:42:13,560
for Microsoft AI services.

1185
00:42:13,560 --> 00:42:14,880
It's not just Work IQ.

1186
00:42:14,880 --> 00:42:16,920
Copilot Studio uses Copilot credits.

1187
00:42:16,920 --> 00:42:19,880
Some advanced Copilot features consume Copilot credits.

1188
00:42:19,880 --> 00:42:22,240
And Azure OpenAI can consume Copilot credits

1189
00:42:22,240 --> 00:42:23,480
depending on your configuration.

1190
00:42:23,480 --> 00:42:26,480
It's a shared currency across Microsoft's AI ecosystem.

1191
00:42:26,480 --> 00:42:27,480
One credit pool.

1192
00:42:27,480 --> 00:42:29,280
Multiple services drawing from it.

1193
00:42:29,280 --> 00:42:31,440
The pricing structure is component-based.

1194
00:42:31,440 --> 00:42:32,960
Work IQ breaks into parts.

1195
00:42:32,960 --> 00:42:34,040
Tools have a fixed rate.

1196
00:42:34,040 --> 00:42:35,600
Every tool invocation,

1197
00:42:35,600 --> 00:42:37,480
like sending an email, updating a file,

1198
00:42:37,480 --> 00:42:39,960
or creating a record, costs 0.1 credits.

1199
00:42:39,960 --> 00:42:40,800
That's fixed.

1200
00:42:40,800 --> 00:42:42,080
It doesn't vary based on complexity.

1201
00:42:42,080 --> 00:42:43,320
The action is the action.

1202
00:42:43,320 --> 00:42:44,840
And you're charged per action.

1203
00:42:44,840 --> 00:42:46,800
Chat and context pricing is variable.

1204
00:42:46,800 --> 00:42:48,280
It depends on what you're asking.

1205
00:42:48,280 --> 00:42:51,040
A simple question like, who's assigned to this project?

1206
00:42:51,040 --> 00:42:53,680
Requires minimal reasoning and light context retrieval.

1207
00:42:53,680 --> 00:42:54,960
Maybe a tenth of a credit.

1208
00:42:54,960 --> 00:42:57,520
A complex question that requires synthesizing information

1209
00:42:57,520 --> 00:43:00,640
across multiple systems requires more reasoning compute.

1210
00:43:00,640 --> 00:43:01,760
And more credits.

1211
00:43:01,760 --> 00:43:03,920
The cost reflects the computational load.
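To make the two pricing components concrete, here is a minimal Python sketch. The 0.1-credit tool rate and the roughly 0.1-credit simple question come from the episode. The `WorkflowRun` type, the `credits_for` function, and the 2.0-credit figure for a complex question are illustrative assumptions, not published rates.

```python
from dataclasses import dataclass

# Fixed rate per tool invocation, as quoted in the episode.
TOOL_CREDITS = 0.1


@dataclass
class WorkflowRun:
    tool_invocations: int     # emails sent, files updated, records created
    reasoning_credits: float  # variable chat/context cost for this run


def credits_for(run: WorkflowRun) -> float:
    """Fixed tool charges plus the variable reasoning charge."""
    total = run.tool_invocations * TOOL_CREDITS + run.reasoning_credits
    return round(total, 4)


simple_question = WorkflowRun(tool_invocations=0, reasoning_credits=0.1)
# 2.0 credits is a made-up figure for a multi-system synthesis question.
complex_question = WorkflowRun(tool_invocations=3, reasoning_credits=2.0)

print(credits_for(simple_question))
print(credits_for(complex_question))
```

The point of the split is that the tool part is easy to budget (count the actions), while the reasoning part has to be measured per workflow.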

1212
00:43:03,920 --> 00:43:06,840
Example scenarios give you the shape of the cost structure.

1213
00:43:06,840 --> 00:43:07,840
Light usage.

1214
00:43:07,840 --> 00:43:10,280
Something simple like identifying tasks assigned to you

1215
00:43:10,280 --> 00:43:12,200
and compiling them into a checklist.

1216
00:43:12,200 --> 00:43:14,280
Runs between 20 and 40 cents.

1217
00:43:14,280 --> 00:43:17,160
The agent queries your task systems, retrieves the data,

1218
00:43:17,160 --> 00:43:18,640
and formats a response.

1219
00:43:18,640 --> 00:43:19,240
Light work.

1220
00:43:19,240 --> 00:43:19,880
Light cost.

1221
00:43:19,880 --> 00:43:20,880
Medium complexity.

1222
00:43:20,880 --> 00:43:23,520
Say, reviewing customer interviews, identifying recurring

1223
00:43:23,520 --> 00:43:26,120
themes and recommending prioritized roadmap actions

1224
00:43:26,120 --> 00:43:27,080
with trade-offs.

1225
00:43:27,080 --> 00:43:28,960
Runs 30 to 75 cents.

1226
00:43:28,960 --> 00:43:30,880
The agent is doing synthesis work.

1227
00:43:30,880 --> 00:43:32,480
It's reasoning across multiple inputs,

1228
00:43:32,480 --> 00:43:34,720
making judgments and explaining trade-offs.

1229
00:43:34,720 --> 00:43:35,680
More compute.

1230
00:43:35,680 --> 00:43:36,720
Higher cost.

1231
00:43:36,720 --> 00:43:37,880
Heavy scenarios.

1232
00:43:37,880 --> 00:43:40,160
Where an agent acts as a true business partner,

1233
00:43:40,160 --> 00:43:42,200
coordinating across multiple systems,

1234
00:43:42,200 --> 00:43:45,160
applying judgment and handling PR sensitive decisions,

1235
00:43:45,160 --> 00:43:46,880
run 50 cents to $1.50.

1236
00:43:46,880 --> 00:43:49,440
The agent is doing organizational level reasoning.

1237
00:43:49,440 --> 00:43:51,440
It's expensive because reasoning at that depth

1238
00:43:51,440 --> 00:43:52,520
requires more compute.

1239
00:43:52,520 --> 00:43:54,840
This isn't like token pricing in Azure OpenAI

1240
00:43:54,840 --> 00:43:56,960
where you're paying per 1,000 tokens and the count gets

1241
00:43:56,960 --> 00:43:57,800
complicated.

1242
00:43:57,800 --> 00:44:00,720
These are scenario costs end to end.

1243
00:44:00,720 --> 00:44:02,520
An agent runs through a workflow.

1244
00:44:02,520 --> 00:44:04,600
The workflow is light, medium or heavy.

1245
00:44:04,600 --> 00:44:06,160
And you're charged the corresponding amount.

1246
00:44:06,160 --> 00:44:07,200
Simple accounting.
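As a rough illustration of the scenario pricing just described, the sketch below turns run counts into a monthly cost range. The dollar ranges are the ones quoted in the episode. The run counts and the names `SCENARIO_RANGES_USD` and `monthly_range` are hypothetical.

```python
# Per-run cost ranges in USD, as quoted in the episode.
SCENARIO_RANGES_USD = {
    "light": (0.20, 0.40),
    "medium": (0.30, 0.75),
    "heavy": (0.50, 1.50),
}


def monthly_range(runs: dict[str, int]) -> tuple[float, float]:
    """Return the (low, high) monthly cost for a mix of scenario runs."""
    low = sum(SCENARIO_RANGES_USD[tier][0] * n for tier, n in runs.items())
    high = sum(SCENARIO_RANGES_USD[tier][1] * n for tier, n in runs.items())
    return round(low, 2), round(high, 2)


# Hypothetical monthly volume for one team.
runs = {"light": 1000, "medium": 200, "heavy": 50}
print(monthly_range(runs))
```

Even with fixed tiers, the result is a range rather than a single number, which is why the mix of light, medium, and heavy workflows matters more than headcount.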

1247
00:44:07,200 --> 00:44:10,400
But the shift from per-user to consumption-based changes

1248
00:44:10,400 --> 00:44:12,920
everything about how you think about infrastructure.

1249
00:44:12,920 --> 00:44:15,120
With per user licensing, the variable is headcount.

1250
00:44:15,120 --> 00:44:16,400
You hire 10 new salespeople.

1251
00:44:16,400 --> 00:44:18,560
You buy 10 new licenses. Done.

1252
00:44:18,560 --> 00:44:20,320
Cost scales linearly with people.

1253
00:44:20,320 --> 00:44:22,680
With consumption-based, the variable is usage.

1254
00:44:22,680 --> 00:44:25,040
One salesperson working with an AI agent all day

1255
00:44:25,040 --> 00:44:27,720
might consume $300 in Copilot credits.

1256
00:44:27,720 --> 00:44:30,680
While another might consume $30. You can't predict it from headcount.

1257
00:44:30,680 --> 00:44:32,480
You have to predict it from behavior.
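The contrast between the two billing models can be written down in a few lines. This is a sketch using the episode's own figures (1,000 users at $20, and two salespeople at $300 and $30). The function names and the user labels are assumptions made for illustration.

```python
def per_user_cost(users: int, price_per_user: float) -> float:
    """Licensing model: cost follows headcount."""
    return users * price_per_user


def consumption_cost(spend_by_user: dict[str, float]) -> float:
    """Consumption model: cost follows what each person actually used."""
    return sum(spend_by_user.values())


# Headcount alone determines this number.
print(per_user_cost(1000, 20.0))

# Same headcount of two, very different bills depending on behavior.
print(consumption_cost({"heavy_user": 300.0, "light_user": 30.0}))
print(consumption_cost({"light_user_a": 30.0, "light_user_b": 30.0}))
```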

1258
00:44:32,480 --> 00:44:34,640
How intensively will people use these agents?

1259
00:44:34,640 --> 00:44:36,720
How complex are the reasoning workflows?

1260
00:44:36,720 --> 00:44:38,440
How much daily coordination is happening?

1261
00:44:38,440 --> 00:44:40,440
This is where architects need to think differently.

1262
00:44:40,440 --> 00:44:42,000
Every reasoning step now has a cost.

1263
00:44:42,000 --> 00:44:43,680
Every tool invocation has a cost.

1264
00:44:43,680 --> 00:44:45,360
Every context retrieval has a cost.

1265
00:44:45,360 --> 00:44:47,000
You're not paying for potential anymore.

1266
00:44:47,000 --> 00:44:49,120
You're paying for actual, which means you need to start

1267
00:44:49,120 --> 00:44:51,680
thinking about reasoning efficiency, not as a performance

1268
00:44:51,680 --> 00:44:54,000
optimization, but as a cost optimization.

1269
00:44:54,000 --> 00:44:55,600
Because in the reasoning economy,

1270
00:44:55,600 --> 00:44:57,280
inefficiency isn't just slow.

1271
00:44:57,280 --> 00:44:58,240
It's expensive.

1272
00:44:58,240 --> 00:45:00,520
Reasoning ROI and cost optimization.

1273
00:45:00,520 --> 00:45:02,280
You understand the cost structure now.

1274
00:45:02,280 --> 00:45:03,680
Light workflows cost pennies.

1275
00:45:03,680 --> 00:45:05,200
Heavy workflows cost dollars.

1276
00:45:05,200 --> 00:45:07,280
But architects don't usually think about the cost

1277
00:45:07,280 --> 00:45:08,280
of a single workflow.

1278
00:45:08,280 --> 00:45:09,600
They think about the system.

1279
00:45:09,600 --> 00:45:11,120
And this is where the math gets interesting.

1280
00:45:11,120 --> 00:45:12,520
Imagine you deploy an agent.

1281
00:45:12,520 --> 00:45:13,080
It works.

1282
00:45:13,080 --> 00:45:13,800
Users love it.

1283
00:45:13,800 --> 00:45:16,440
You measure the impact and see it saves three hours per user

1284
00:45:16,440 --> 00:45:18,880
every week, with 5,000 potential users.

1285
00:45:18,880 --> 00:45:22,360
That's 15,000 hours of productivity recovered weekly.

1286
00:45:22,360 --> 00:45:23,200
That is real value.

1287
00:45:23,200 --> 00:45:24,600
It changes the organization.

1288
00:45:24,600 --> 00:45:26,360
But what if that agent is badly designed?

1289
00:45:26,360 --> 00:45:30,000
What if instead of retrieving context once and reasoning over it,

1290
00:45:30,000 --> 00:45:33,240
the agent re-queries that same context five times in one workflow?

1291
00:45:33,240 --> 00:45:35,680
It asks the same question to Work IQ five times

1292
00:45:35,680 --> 00:45:37,880
when it could have asked once and cached the answer.

1293
00:45:37,880 --> 00:45:39,320
Each query is a reasoning step.

1294
00:45:39,320 --> 00:45:40,600
Each step costs credits.

1295
00:45:40,600 --> 00:45:43,760
If you multiply five unnecessary calls by 52 weeks,

1296
00:45:43,760 --> 00:45:46,280
for 1,000 concurrent users, you are burning money

1297
00:45:46,280 --> 00:45:48,280
on something the agent should have handled once.

1298
00:45:48,280 --> 00:45:50,120
The difference between good design and bad design

1299
00:45:50,120 --> 00:45:50,960
isn't small.

1300
00:45:50,960 --> 00:45:52,160
It's geometric.

1301
00:45:52,160 --> 00:45:55,000
A well-designed agent that retrieves context efficiently

1302
00:45:55,000 --> 00:45:58,400
and batches its calls can cost a fifth of what a lazy agent costs.

1303
00:45:58,400 --> 00:45:59,320
The output is the same.

1304
00:45:59,320 --> 00:46:00,720
The user sees no difference.

1305
00:46:00,720 --> 00:46:02,800
But the financial impact is five times worse

1306
00:46:02,800 --> 00:46:04,240
because of the architecture.

1307
00:46:04,240 --> 00:46:05,800
This creates a new discipline.

1308
00:46:05,800 --> 00:46:07,080
Reasoning architecture.

1309
00:46:07,080 --> 00:46:08,560
Not just application architecture.

1310
00:46:08,560 --> 00:46:11,600
Reasoning architecture. It's the practice of designing agents

1311
00:46:11,600 --> 00:46:14,240
to minimize cost while keeping them capable.

1312
00:46:14,240 --> 00:46:17,320
So what does efficient reasoning architecture actually look like?

1313
00:46:17,320 --> 00:46:18,520
Batch your retrievals.

1314
00:46:18,520 --> 00:46:21,120
If you need three things from Work IQ, account status,

1315
00:46:21,120 --> 00:46:22,960
transaction history and open tickets,

1316
00:46:22,960 --> 00:46:24,360
do not make three separate calls.

1317
00:46:24,360 --> 00:46:25,280
Batch them into one.

1318
00:46:25,280 --> 00:46:27,920
The agent gets everything it needs in a single reasoning step.

1319
00:46:27,920 --> 00:46:28,880
Cost is lower.

1320
00:46:28,880 --> 00:46:30,000
Latency is lower.

1321
00:46:30,000 --> 00:46:31,560
It's better for everyone.
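Here is a small sketch of the batching idea. The episode does not describe Work IQ's actual API, so `StubContextService` is a purely hypothetical stand-in that only counts calls. The assumption is that each call is metered as one reasoning step.

```python
class StubContextService:
    """Hypothetical metered context service; not a real Work IQ API."""

    def __init__(self) -> None:
        self.calls = 0
        self._data = {
            "account_status": "active",
            "transaction_history": ["t1", "t2"],
            "open_tickets": 2,
        }

    def query(self, fields: list[str]) -> dict:
        self.calls += 1  # assumption: every call is billed as one step
        return {field: self._data[field] for field in fields}


FIELDS = ["account_status", "transaction_history", "open_tickets"]


def fetch_separately(service: StubContextService) -> dict:
    """Three calls, one per field."""
    result = {}
    for field in FIELDS:
        result.update(service.query([field]))
    return result


def fetch_batched(service: StubContextService) -> dict:
    """One call for everything the agent needs."""
    return service.query(FIELDS)


lazy, efficient = StubContextService(), StubContextService()
same_data = fetch_separately(lazy) == fetch_batched(efficient)
print(same_data, lazy.calls, efficient.calls)
```

Both functions return the same data. Only the call counter differs, and under consumption billing the call counter is the bill.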

1322
00:46:31,560 --> 00:46:34,680
Next, cache ephemeral context where your policy allows it.

1323
00:46:34,680 --> 00:46:37,400
An agent is working on a project and needs to know the status.

1324
00:46:37,400 --> 00:46:38,160
It retrieves it.

1325
00:46:38,160 --> 00:46:40,160
Now it's going to reference that status ten times

1326
00:46:40,160 --> 00:46:41,400
over the next few minutes

1327
00:46:41,400 --> 00:46:43,200
while it reasons through different tasks.

1328
00:46:43,200 --> 00:46:44,520
Don't retrieve it again.

1329
00:46:44,520 --> 00:46:46,280
Store it in the agent's working memory.

1330
00:46:46,280 --> 00:46:47,600
The status is ephemeral.

1331
00:46:47,600 --> 00:46:49,560
It's only valid for this one session.

1332
00:46:49,560 --> 00:46:51,760
But inside that session, it doesn't change.

1333
00:46:51,760 --> 00:46:53,720
You retrieve once and reference many times.

1334
00:46:53,720 --> 00:46:55,240
The cost drops immediately.
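The retrieve-once, reference-many pattern can be sketched as a session-scoped cache. `SessionContextCache` and the `fetch` callable are hypothetical names. The sketch assumes the value really is stable for the life of the session and that policy permits holding it in working memory.

```python
from typing import Any, Callable


class SessionContextCache:
    """Holds retrieved context for one session only, then is discarded."""

    def __init__(self, fetch: Callable[[str], Any]) -> None:
        self._fetch = fetch      # the expensive, metered retrieval
        self._store: dict[str, Any] = {}
        self.fetch_count = 0     # how many times we actually paid

    def get(self, key: str) -> Any:
        if key not in self._store:
            self._store[key] = self._fetch(key)
            self.fetch_count += 1
        return self._store[key]


# Stand-in for a metered status lookup.
cache = SessionContextCache(fetch=lambda key: "on track")

# The agent references the project status ten times while reasoning.
statuses = [cache.get("project_status") for _ in range(10)]
print(cache.fetch_count)
```

Creating a new cache object per session is what keeps the context ephemeral: nothing carries over once the session ends.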

1335
00:46:55,240 --> 00:46:57,840
You should also use lighter models for simple decisions.

1336
00:46:57,840 --> 00:47:00,480
Not every task needs the heaviest model available.

1337
00:47:00,480 --> 00:47:02,560
If an agent is making a binary choice

1338
00:47:02,560 --> 00:47:04,320
like whether to escalate a ticket,

1339
00:47:04,320 --> 00:47:05,760
a lighter model can handle it.

1340
00:47:05,760 --> 00:47:07,400
It's cheaper and it's faster.

1341
00:47:07,400 --> 00:47:09,480
Save the heavy models for nuanced decisions

1342
00:47:09,480 --> 00:47:11,160
that actually require that power.

1343
00:47:11,160 --> 00:47:13,360
Using a heavy model for simple classification

1344
00:47:13,360 --> 00:47:15,680
is like hiring a surgeon to give you a bandaid.

1345
00:47:15,680 --> 00:47:18,320
Structure your workflow to defer heavy reasoning.

1346
00:47:18,320 --> 00:47:20,240
A customer service agent handles a ticket.

1347
00:47:20,240 --> 00:47:22,160
First, it does a lightweight classification

1348
00:47:22,160 --> 00:47:23,680
to find the category and sentiment.

1349
00:47:23,680 --> 00:47:24,880
These are easy.

1350
00:47:24,880 --> 00:47:26,320
Use light reasoning.

1351
00:47:26,320 --> 00:47:28,600
Only if the ticket requires deep judgment,

1352
00:47:28,600 --> 00:47:30,800
like conflicting policies or subtle contexts,

1353
00:47:30,800 --> 00:47:32,520
do you move to heavier reasoning.

1354
00:47:32,520 --> 00:47:34,040
Most tickets never get that far.

1355
00:47:34,040 --> 00:47:35,800
Most stay at a light reasoning cost.
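A tiered workflow like the one described can be sketched as follows. The keyword rules stand in for a light model, and the two credit costs are invented numbers. None of this reflects a real model API.

```python
LIGHT_COST = 0.1  # hypothetical credits for the light classification pass
HEAVY_COST = 1.0  # hypothetical credits for a deep-reasoning pass


def classify_light(ticket: str) -> dict:
    """Stand-in for a light model: cheap category and escalation check."""
    text = ticket.lower()
    return {
        "category": "billing" if "invoice" in text else "general",
        "needs_judgment": "policy conflict" in text or "exception" in text,
    }


def handle_ticket(ticket: str) -> tuple[str, float]:
    """Always run the light pass; pay for heavy reasoning only if needed."""
    result = classify_light(ticket)
    cost = LIGHT_COST
    if result["needs_judgment"]:
        cost += HEAVY_COST  # the heavy model would be invoked here
    return result["category"], round(cost, 4)


print(handle_ticket("Where is my invoice?"))
print(handle_ticket("Policy conflict: customer wants a refund exception"))
```

If most tickets stop after the first pass, the average cost per ticket stays close to the light cost even though the heavy path is still available.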

1356
00:47:35,800 --> 00:47:37,440
The financial reality is stark.

1357
00:47:37,440 --> 00:47:40,440
A poorly designed agent at scale can cost ten times more

1358
00:47:40,440 --> 00:47:43,080
than a well-designed one producing the exact same result.

1359
00:47:43,080 --> 00:47:44,200
That isn't an exaggeration.

1360
00:47:44,200 --> 00:47:45,480
It's just the math.

1361
00:47:45,480 --> 00:47:47,360
One agent queries context at every step

1362
00:47:47,360 --> 00:47:48,920
while the other batches and caches.

1363
00:47:48,920 --> 00:47:51,520
Same result, but a tenfold difference in the bill.

1364
00:47:51,520 --> 00:47:53,920
Organizations that get this have a massive advantage.

1365
00:47:53,920 --> 00:47:55,200
They deploy efficiently.

1366
00:47:55,200 --> 00:47:57,000
They iterate based on cost metrics.

1367
00:47:57,000 --> 00:47:59,560
When an agent costs too much, they know it immediately.

1368
00:47:59,560 --> 00:48:01,880
They can redesign the workflow or change the model.

1369
00:48:01,880 --> 00:48:03,800
They have visibility so they can act.

1370
00:48:03,800 --> 00:48:06,800
But organizations that treat Copilot credits like they're infinite

1371
00:48:06,800 --> 00:48:08,720
hit a ceiling they didn't expect.

1372
00:48:08,720 --> 00:48:11,800
The agent looked great in the pilot when it handled a hundred tickets.

1373
00:48:11,800 --> 00:48:13,760
Then they rolled it out to 5,000 users.

1374
00:48:13,760 --> 00:48:15,320
Suddenly the credit burn is massive.

1375
00:48:15,320 --> 00:48:17,520
The costs were invisible at a small scale.

1376
00:48:17,520 --> 00:48:19,520
But they are unavoidable at a large one.

1377
00:48:19,520 --> 00:48:21,760
By then, the agent is already in production.

1378
00:48:21,760 --> 00:48:22,880
The damage is done.

1379
00:48:22,880 --> 00:48:25,400
This shift, designing for cost, not just function,

1380
00:48:25,400 --> 00:48:26,400
is a governance problem.

1381
00:48:26,400 --> 00:48:28,160
You can't have every engineer deploying agents,

1382
00:48:28,160 --> 00:48:29,000
however they want.

1383
00:48:29,000 --> 00:48:29,840
You need standards.

1384
00:48:29,840 --> 00:48:30,440
You need review.

1385
00:48:30,440 --> 00:48:31,680
You need guardrails.

1386
00:48:31,680 --> 00:48:32,960
FinOps for AI.

1387
00:48:32,960 --> 00:48:34,480
Managing the reasoning budget.

1388
00:48:34,480 --> 00:48:36,160
Those guardrails start with a practice

1389
00:48:36,160 --> 00:48:38,360
that started in cloud computing a decade ago.

1390
00:48:38,360 --> 00:48:39,360
It's called FinOps.

1391
00:48:39,360 --> 00:48:40,640
Financial operations.

1392
00:48:40,640 --> 00:48:43,280
It began when companies realized their cloud bills were spiraling

1393
00:48:43,280 --> 00:48:45,360
because nobody was watching what got deployed.

1394
00:48:45,360 --> 00:48:47,960
Engineers spun up infrastructure because it was easy.

1395
00:48:47,960 --> 00:48:49,160
But nobody tracked the cost.

1396
00:48:49,160 --> 00:48:50,160
The bills were a shock.

1397
00:48:50,160 --> 00:48:51,080
FinOps was the answer.

1398
00:48:51,080 --> 00:48:52,640
Make cost visible, make it measurable,

1399
00:48:52,640 --> 00:48:54,240
and make it someone's job to manage it.

1400
00:48:54,240 --> 00:48:56,240
AI reasoning needs that same approach.

1401
00:48:56,240 --> 00:48:58,680
Because without discipline, consumption-based billing

1402
00:48:58,680 --> 00:49:00,400
becomes a problem you can't control.

1403
00:49:00,400 --> 00:49:03,320
So, what does FinOps for reasoning look like?

1404
00:49:03,320 --> 00:49:04,440
First, visibility.

1405
00:49:04,440 --> 00:49:05,920
You need to track spend by agent.

1406
00:49:05,920 --> 00:49:07,480
Not just the total for the whole company.

1407
00:49:07,480 --> 00:49:08,840
You need agent-level detail.

1408
00:49:08,840 --> 00:49:10,520
Which agents are burning the most credits?

1409
00:49:10,520 --> 00:49:12,120
How much does each execution cost?

1410
00:49:12,120 --> 00:49:13,120
Which ones are efficient?

1411
00:49:13,120 --> 00:49:14,360
And which ones are wasteful?

1412
00:49:14,360 --> 00:49:17,560
The Microsoft 365 Admin Center has dashboards for this.

1413
00:49:17,560 --> 00:49:19,920
But they only work if you actually look at them.
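Agent-level visibility amounts to grouping spend by agent rather than looking at one total. The sketch below assumes you can export usage as simple `(agent, credits)` records. That record shape is hypothetical and is not the format of any Microsoft 365 Admin Center export.

```python
from collections import defaultdict


def spend_by_agent(records: list[tuple[str, float]]) -> dict[str, dict]:
    """Aggregate total credits, run count, and credits per run for each agent."""
    totals: dict[str, dict] = defaultdict(lambda: {"credits": 0.0, "runs": 0})
    for agent, credits in records:
        totals[agent]["credits"] += credits
        totals[agent]["runs"] += 1
    return {
        agent: {**t, "per_run": t["credits"] / t["runs"]}
        for agent, t in totals.items()
    }


# Hypothetical usage records.
records = [("triage", 2.0), ("triage", 4.0), ("summary", 1.0)]
report = spend_by_agent(records)

# Most expensive agents first.
for agent, stats in sorted(report.items(), key=lambda kv: -kv[1]["credits"]):
    print(agent, stats)
```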

1414
00:49:19,920 --> 00:49:20,840
Second, alerting.

1415
00:49:20,840 --> 00:49:21,720
Set your budgets.

1416
00:49:21,720 --> 00:49:23,960
Define what an acceptable spend looks like.

1417
00:49:23,960 --> 00:49:27,040
If an agent goes 50% over its expected cost in a day,

1418
00:49:27,040 --> 00:49:28,080
you need an alert.

1419
00:49:28,080 --> 00:49:29,360
Not because it's broken,

1420
00:49:29,360 --> 00:49:31,680
but because something changed. Maybe usage spiked

1421
00:49:31,680 --> 00:49:33,000
or the workflow got more complex.

1422
00:49:33,000 --> 00:49:34,680
You need to know when it happens.

1423
00:49:34,680 --> 00:49:37,040
Not at the end of the month when you see the bill.
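The 50 percent rule mentioned above is simple to express in code. This is a sketch only: the function name, the agent names, and the daily figures are assumptions, and a real setup would feed it from whatever usage data you can export.

```python
def over_budget_agents(
    expected: dict[str, float],
    actual: dict[str, float],
    threshold: float = 0.5,
) -> list[str]:
    """Return agents whose daily cost exceeds expected cost by more than threshold."""
    return sorted(
        agent
        for agent, cost in actual.items()
        if agent in expected and cost > expected[agent] * (1 + threshold)
    )


# Hypothetical daily budgets and today's spend, in credits.
expected = {"triage": 100.0, "summary": 40.0}
actual = {"triage": 160.0, "summary": 55.0}

print(over_budget_agents(expected, actual))
```

Running a check like this daily is what moves the discovery from the end-of-month bill to the day the change happened.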

1424
00:49:37,040 --> 00:49:38,480
Third, optimization.

1425
00:49:38,480 --> 00:49:40,320
Once you find the agents that are cost centers,

1426
00:49:40,320 --> 00:49:42,280
you fix them, you redesign the workflows,

1427
00:49:42,280 --> 00:49:44,200
you add caching or switch models,

1428
00:49:44,200 --> 00:49:45,280
then you measure the impact.

1429
00:49:45,280 --> 00:49:46,800
Did the change actually save money?

1430
00:49:46,800 --> 00:49:48,440
This isn't a one-time project.

1431
00:49:48,440 --> 00:49:49,240
It's a loop.

1432
00:49:49,240 --> 00:49:50,640
New agents get measured.

1433
00:49:50,640 --> 00:49:53,960
And inefficient ones get sent back for a redesign.

1434
00:49:53,960 --> 00:49:55,040
Fourth, forecasting.

1435
00:49:55,040 --> 00:49:56,280
If you have three months of data,

1436
00:49:56,280 --> 00:49:58,480
you can see the trends, adoption is picking up

1437
00:49:58,480 --> 00:50:00,200
and new workflows are going live.

1438
00:50:00,200 --> 00:50:02,280
How will that change your credit use next quarter?

1439
00:50:02,280 --> 00:50:04,160
How much budget do you need to set aside?

1440
00:50:04,160 --> 00:50:06,320
If you can't forecast, you can't plan.

1441
00:50:06,320 --> 00:50:08,680
FinOps turns a surprise bill into a prediction.
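With three months of data, even a naive trend gives you a starting forecast. The sketch below extends the average month-over-month change forward. The linear method and the sample numbers are assumptions chosen for simplicity, not a recommended forecasting model.

```python
def forecast_next_months(history: list[float], months: int = 3) -> list[float]:
    """Project future monthly credit use from the average monthly change."""
    if len(history) < 2:
        raise ValueError("need at least two months of history")
    deltas = [later - earlier for earlier, later in zip(history, history[1:])]
    step = sum(deltas) / len(deltas)
    return [history[-1] + step * i for i in range(1, months + 1)]


# Hypothetical credit consumption for the last three months.
history = [1000.0, 1200.0, 1400.0]
next_quarter = forecast_next_months(history)
print(next_quarter, sum(next_quarter))
```

A straight line will underestimate spend when new workflows go live mid-quarter, so treat the output as a floor for the budget conversation rather than an answer.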

1442
00:50:08,680 --> 00:50:11,560
You should be asking specific questions about your agents.

1443
00:50:11,560 --> 00:50:13,360
Which ones are actually delivering ROI?

1444
00:50:13,360 --> 00:50:14,880
Usage does not equal value.

1445
00:50:14,880 --> 00:50:17,960
An agent that everyone loves, but costs $5 per run

1446
00:50:17,960 --> 00:50:20,120
might be less valuable than one that costs $0.20

1447
00:50:20,120 --> 00:50:21,480
and saves 30 minutes.

1448
00:50:21,480 --> 00:50:22,560
You need the math.

1449
00:50:22,560 --> 00:50:25,760
Usage times cost per run versus the value of the outcome.

1450
00:50:25,760 --> 00:50:27,640
Which agents are just cost centers?

1451
00:50:27,640 --> 00:50:28,800
Some might be experiments.

1452
00:50:28,800 --> 00:50:31,280
You have a budget for them because they are learning opportunities.

1453
00:50:31,280 --> 00:50:33,120
They aren't supposed to be profitable yet,

1454
00:50:33,120 --> 00:50:34,440
but you still have to track them.

1455
00:50:34,440 --> 00:50:36,240
Are they moving toward value?

1456
00:50:36,240 --> 00:50:38,320
Or are they just burning credits with no plan?

1457
00:50:38,320 --> 00:50:39,720
Where is the spend concentrated?

1458
00:50:39,720 --> 00:50:42,680
Maybe one department is using agents heavily while another isn't.

1459
00:50:42,680 --> 00:50:45,600
Maybe one single workflow is responsible for half your costs.

1460
00:50:45,600 --> 00:50:46,480
Is that justified?

1461
00:50:46,480 --> 00:50:48,760
Or is it a sign that something needs to be redesigned?

1462
00:50:48,760 --> 00:50:51,080
And finally, what is the cost per outcome?

1463
00:50:51,080 --> 00:50:52,760
If an agent summarizes meeting notes

1464
00:50:52,760 --> 00:50:56,320
and uses $2 in credits, but a human could do it in 15 minutes,

1465
00:50:56,320 --> 00:50:57,560
you have to look at the numbers.

1466
00:50:57,560 --> 00:51:00,680
At a normal salary, that human time might cost $5.

1467
00:51:00,680 --> 00:51:02,560
The agent was cheaper, but not by a lot.

1468
00:51:02,560 --> 00:51:03,880
Is it worth the complexity?

1469
00:51:03,880 --> 00:51:05,280
You need the data to decide.
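The cost-per-outcome comparison is plain arithmetic. In the sketch below, the $2 agent cost, the 15 minutes, the $0.20 run, and the 30 minutes saved come from the episode. The $20 hourly rate is an assumption, chosen because it makes 15 minutes cost the $5 mentioned.

```python
HOURLY_RATE = 20.0  # assumption: makes 15 minutes of human time cost $5


def human_cost(minutes: float, hourly_rate: float) -> float:
    """Cost of the human time the agent replaces."""
    return minutes / 60 * hourly_rate


def saving_per_run(agent_cost: float, minutes_saved: float,
                   hourly_rate: float = HOURLY_RATE) -> float:
    """Positive means the agent is cheaper than the human for this outcome."""
    return round(human_cost(minutes_saved, hourly_rate) - agent_cost, 2)


# Meeting-notes agent: $2 in credits versus 15 minutes of human time.
print(saving_per_run(agent_cost=2.00, minutes_saved=15))

# Cheap agent that saves 30 minutes per run.
print(saving_per_run(agent_cost=0.20, minutes_saved=30))
```

A negative result would flag an agent that costs more than the work it replaces, which is exactly the case usage numbers alone will not reveal.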

1470
00:51:05,280 --> 00:51:07,960
The real power of FinOps is making cost a design constraint.

1471
00:51:07,960 --> 00:51:09,880
When engineers know their agents are being metered,

1472
00:51:09,880 --> 00:51:11,120
it changes how they build.

1473
00:51:11,120 --> 00:51:12,960
They know inefficiency will be noticed.

1474
00:51:12,960 --> 00:51:14,520
It encourages a better architecture.

1475
00:51:14,520 --> 00:51:17,120
It turns cost management into a competitive discipline.

1476
00:51:17,120 --> 00:51:19,760
But none of this works unless it is someone's job.

1477
00:51:19,760 --> 00:51:21,440
Not everyone's job, someone's job.

1478
00:51:21,440 --> 00:51:23,720
A specific role or a team for reasoning operations.

1479
00:51:23,720 --> 00:51:26,000
They own the dashboards and analyze the trends.

1480
00:51:26,000 --> 00:51:27,840
They push back when a deployment looks wasteful.

1481
00:51:27,840 --> 00:51:30,640
Cost discipline doesn't happen because people have good intentions.

1482
00:51:30,640 --> 00:51:33,720
It happens because there is accountability.

1483
00:51:33,720 --> 00:51:35,680
The multi-cloud governance challenge.

1484
00:51:35,680 --> 00:51:37,840
This is where the architecture gets complicated.

1485
00:51:37,840 --> 00:51:41,680
Agent 365 can govern agents running on your Microsoft infrastructure,

1486
00:51:41,680 --> 00:51:44,680
which is a clean setup because those agents live inside your entrotennent

1487
00:51:44,680 --> 00:51:46,880
and inherit M365 policies.

1488
00:51:46,880 --> 00:51:50,840
The governance model is unified, but most organizations don't live in a single cloud.

1489
00:51:50,840 --> 00:51:54,080
You have workloads in Microsoft, critical systems in AWS,

1490
00:51:54,080 --> 00:51:56,560
and you're likely using Google Cloud for analytics.

1491
00:51:56,560 --> 00:51:59,240
You're running custom agents on Azure, others on Bedrock,

1492
00:51:59,240 --> 00:52:01,080
and still more on Vertex AI.

1493
00:52:01,080 --> 00:52:04,040
The agents don't care what cloud they run on, they'll work anywhere.

1494
00:52:04,040 --> 00:52:06,640
But governance becomes a nightmare when you're trying to enforce

1495
00:52:06,640 --> 00:52:09,280
consistent policy across three different cloud providers

1496
00:52:09,280 --> 00:52:10,840
with three different identity systems.

1497
00:52:10,840 --> 00:52:12,960
You're dealing with three different compliance frameworks

1498
00:52:12,960 --> 00:52:14,880
and three different audit mechanisms.

1499
00:52:14,880 --> 00:52:17,160
The problem is structural.

1500
00:52:17,160 --> 00:52:19,600
A policy that makes sense in Microsoft's world,

1501
00:52:19,600 --> 00:52:22,640
like agents can read from SharePoint but not modify,

1502
00:52:22,640 --> 00:52:24,480
doesn't directly translate to AWS.

1503
00:52:24,480 --> 00:52:26,000
AWS doesn't have SharePoint.

1504
00:52:26,000 --> 00:52:27,440
It has S3.

1505
00:52:27,440 --> 00:52:28,880
The policy needs to be reframed,

1506
00:52:28,880 --> 00:52:32,600
so agents can read from authorized S3 buckets, but not modify them,

1507
00:52:32,600 --> 00:52:35,480
which means the intent is the same, but the implementation is different.

1508
00:52:35,480 --> 00:52:39,000
Multiply that across every control, every constraint,

1509
00:52:39,000 --> 00:52:40,560
and every policy rule.

1510
00:52:40,560 --> 00:52:43,800
What you end up with is policy translation that's fragile and error-prone.
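To show why this translation is fragile, here is a sketch that renders one intent, "read but never modify", for two clouds. The AWS output is a standard IAM policy document using the `s3:GetObject` and `s3:ListBucket` actions, and `Sites.Read.All` is a Microsoft Graph read permission. The function name, the wrapper dict for the Microsoft side, and the resource names are hypothetical, and whether these are the right controls for a given deployment is an assumption.

```python
def render_read_only(cloud: str, resource: str) -> dict:
    """Translate one intent into a cloud-specific policy shape."""
    if cloud == "microsoft":
        # Graph read permission for SharePoint; the dict shape is hypothetical.
        return {"permission": "Sites.Read.All", "site": resource}
    if cloud == "aws":
        # IAM policy document granting read-only access to one S3 bucket.
        return {
            "Version": "2012-10-17",
            "Statement": [{
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{resource}",
                    f"arn:aws:s3:::{resource}/*",
                ],
            }],
        }
    # Every cloud without a hand-written translation is a governance gap.
    raise ValueError(f"no translation defined for {cloud!r}")


print(render_read_only("microsoft", "finance-site"))
print(render_read_only("aws", "finance-reports"))
```

The same one-line intent produces two unrelated structures, and a third cloud fails until someone writes its branch by hand. That is the drift risk in miniature.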

1511
00:52:43,800 --> 00:52:45,920
Authentication makes the problem even worse.

1512
00:52:45,920 --> 00:52:48,520
In Microsoft, agents get Entra Agent IDs,

1513
00:52:48,520 --> 00:52:51,040
while in AWS they get IAM roles,

1514
00:52:51,040 --> 00:52:53,480
and in Google Cloud, they get service accounts.

1515
00:52:53,480 --> 00:52:54,720
These aren't the same thing.

1516
00:52:54,720 --> 00:52:55,800
They don't interoperate.

1517
00:52:55,800 --> 00:53:00,360
An Entra Agent ID has no meaning in AWS, and an IAM role has no meaning in Google Cloud,

1518
00:53:00,360 --> 00:53:04,040
so you're managing agent identity across three separate directory systems.

1519
00:53:04,040 --> 00:53:06,360
If an agent needs to be revoked, you revoke it in Entra,

1520
00:53:06,360 --> 00:53:09,120
then you revoke it in AWS, and then you do it again in Google.

1521
00:53:09,120 --> 00:53:11,520
If you miss one, the agent is only half revoked.

1522
00:53:11,520 --> 00:53:13,080
It still has access somewhere.
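One way to guard against a half-revoked agent is to attempt revocation everywhere and report what failed, instead of stopping at the first error. In this sketch the revoker callables are stubs. Real code would call each provider's own SDK, which is not shown and not assumed here.

```python
from typing import Callable


def revoke_everywhere(
    agent_id: str,
    revokers: dict[str, Callable[[str], None]],
) -> list[str]:
    """Try every directory; return the clouds where the agent is still active."""
    still_active = []
    for cloud, revoke in revokers.items():
        try:
            revoke(agent_id)
        except Exception:
            still_active.append(cloud)  # keep going, but remember the gap
    return still_active


def revoke_ok(agent_id: str) -> None:
    """Stub for a directory where revocation succeeds."""


def revoke_broken(agent_id: str) -> None:
    """Stub for a directory that is unreachable."""
    raise RuntimeError("directory unreachable")


revokers = {"entra": revoke_ok, "aws": revoke_broken, "gcp": revoke_ok}
print(revoke_everywhere("agent-42", revokers))
```

An empty list is the only result that means the agent is fully revoked. Anything else should be treated as an open incident.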

1523
00:53:13,080 --> 00:53:15,440
Compliance frameworks diverge too.

1524
00:53:15,440 --> 00:53:18,360
Microsoft's compliance stack uses Purview and Defender,

1525
00:53:18,360 --> 00:53:20,680
but AWS has GuardDuty and Config,

1526
00:53:20,680 --> 00:53:24,880
while Google Cloud relies on Security Command Center and VPC service controls.

1527
00:53:24,880 --> 00:53:29,600
The controls are similar conceptually because you want to detect anomalies and maintain audit trails,

1528
00:53:29,600 --> 00:53:31,560
but the actual mechanisms are different.

1529
00:53:31,560 --> 00:53:35,160
The threat detection algorithm in Defender doesn't translate to GuardDuty,

1530
00:53:35,160 --> 00:53:38,120
so you aren't using the same rules or getting the same insights.

1531
00:53:38,120 --> 00:53:39,960
When an agent is behaving strangely,

1532
00:53:39,960 --> 00:53:41,680
one cloud detects it immediately,

1533
00:53:41,680 --> 00:53:44,200
while another might take hours to flag the issue.

1534
00:53:44,200 --> 00:53:47,560
Audit trails are perhaps the most frustrating part of this divergence.

1535
00:53:47,560 --> 00:53:51,680
A multi-cloud agent executes a workflow where it reads from S3 in AWS,

1536
00:53:51,680 --> 00:53:56,160
updates a record in Dynamics in Azure, and then queries a BigQuery dataset in Google Cloud.

1537
00:53:56,160 --> 00:53:59,840
The entire workflow is logged, but it's split across three different places.

1538
00:53:59,840 --> 00:54:02,400
AWS CloudTrail shows the S3 access,

1539
00:54:02,400 --> 00:54:04,880
Azure's audit log shows the Dynamics update,

1540
00:54:04,880 --> 00:54:08,000
and Google Cloud's audit log shows the BigQuery query.

1541
00:54:08,000 --> 00:54:11,880
Reconstructing what the agent actually did requires pulling logs from three systems

1542
00:54:11,880 --> 00:54:14,160
and manually correlating them by timestamp.

1543
00:54:14,160 --> 00:54:17,080
Try doing that at scale across thousands of agent executions

1544
00:54:17,080 --> 00:54:20,160
when you're looking for compliance violations or security incidents.
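The manual correlation described above can at least be scripted. This sketch merges events from three sources into one timeline for a single agent. The record shape (`agent`, `time`, `action`) is hypothetical, since real CloudTrail, Azure, and Google Cloud audit entries each have their own schema and would need normalizing first. It also assumes a shared agent identifier and reasonably synchronized clocks.

```python
from datetime import datetime


def build_timeline(agent_id: str, *sources: list[dict]) -> list[dict]:
    """Merge events for one agent from several logs, ordered by timestamp."""
    events = [
        event
        for source in sources
        for event in source
        if event["agent"] == agent_id
    ]
    return sorted(events, key=lambda e: datetime.fromisoformat(e["time"]))


# Hypothetical, already-normalized log extracts.
cloudtrail = [
    {"agent": "agent-42", "time": "2026-06-29T10:00:01+00:00", "action": "s3 read"},
]
azure_log = [
    {"agent": "agent-42", "time": "2026-06-29T10:00:03+00:00", "action": "dynamics update"},
]
gcp_log = [
    {"agent": "agent-7", "time": "2026-06-29T10:00:04+00:00", "action": "bigquery query"},
    {"agent": "agent-42", "time": "2026-06-29T10:00:05+00:00", "action": "bigquery query"},
]

for event in build_timeline("agent-42", gcp_log, cloudtrail, azure_log):
    print(event["time"], event["action"])
```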

1545
00:54:20,160 --> 00:54:24,240
The current answer Agent 365 provides is a hybrid model, not a unified one.

1546
00:54:24,240 --> 00:54:28,000
Agent 365 acts as a registry that knows about agents across clouds,

1547
00:54:28,000 --> 00:54:30,960
including their identities, their owners, and how to correlate them,

1548
00:54:30,960 --> 00:54:32,960
but enforcement happens locally.

1549
00:54:32,960 --> 00:54:35,840
Microsoft agents are governed through Azure security controls.

1550
00:54:35,840 --> 00:54:38,960
AWS agents are governed through IAM and GuardDuty,

1551
00:54:38,960 --> 00:54:42,160
and Google agents are governed through VPC service controls.

1552
00:54:42,160 --> 00:54:46,160
Agent 365 is the lens that lets you see all agents in one place,

1553
00:54:46,160 --> 00:54:49,440
but the actual policy enforcement happens in three separate places.

1554
00:54:49,440 --> 00:54:51,760
This creates massive operational complexity.

1555
00:54:51,760 --> 00:54:55,200
You define a policy stating agents cannot exfiltrate data,

1556
00:54:55,200 --> 00:54:59,840
but that policy needs to be expressed in Entra Conditional Access language for Microsoft agents.

1557
00:54:59,840 --> 00:55:03,040
Then you write it in AWS IAM policy language for AWS agents

1558
00:55:03,040 --> 00:55:05,680
and Google Cloud IAM policy language for Google agents.

1559
00:55:05,680 --> 00:55:08,880
It's not automation, it's manual translation across three systems.

1560
00:55:08,880 --> 00:55:11,920
When you need to change a policy, you change it in three places,

1561
00:55:11,920 --> 00:55:15,520
which means the risk of drift and misconfiguration is incredibly high.

1562
00:55:15,520 --> 00:55:17,920
It's workable, and many organizations operate this way,

1563
00:55:17,920 --> 00:55:20,960
but it isn't seamless. It isn't unified governance.

1564
00:55:20,960 --> 00:55:22,480
It's orchestrated governance.

1565
00:55:22,480 --> 00:55:24,400
You're orchestrating policy across clouds,

1566
00:55:24,400 --> 00:55:28,000
which requires discipline, process, and tooling that understands all three clouds

1567
00:55:28,000 --> 00:55:30,640
to translate intent into specific controls.

1568
00:55:30,640 --> 00:55:34,720
This is the emerging problem that the next generation of governance tools will need to solve.

1569
00:55:34,720 --> 00:55:36,400
The pilot-to-production journey.

1570
00:55:36,400 --> 00:55:38,240
Understanding the architecture is one thing,

1571
00:55:38,240 --> 00:55:40,960
but executing it in your organization is another.

1572
00:55:40,960 --> 00:55:43,360
Most enterprises fail not because the technology doesn't work,

1573
00:55:43,360 --> 00:55:44,960
but because they implement it the wrong way.

1574
00:55:44,960 --> 00:55:47,200
They deploy agents at scale without governance,

1575
00:55:47,200 --> 00:55:48,560
skip the foundational work,

1576
00:55:48,560 --> 00:55:52,160
and ignore cost signals until they hit walls they didn't anticipate.

1577
00:55:52,160 --> 00:55:56,080
The path from understanding to success runs through a structured progression.

1578
00:55:56,080 --> 00:55:57,040
Start small.

1579
00:55:57,040 --> 00:56:00,720
Phase one is a pilot where you pick one low-risk agent in one business unit,

1580
00:56:00,720 --> 00:56:01,840
not organization-wide,

1581
00:56:01,840 --> 00:56:04,480
not multiple agents at the same time, just one.

1582
00:56:04,480 --> 00:56:06,880
A single, contained agent doing something that matters,

1583
00:56:06,880 --> 00:56:09,120
but isn't mission-critical is the best place to start.

1584
00:56:09,120 --> 00:56:12,720
Maybe it's a customer service agent that handles tier one ticket triage,

1585
00:56:12,720 --> 00:56:16,560
or perhaps an operations agent that recommends actions but doesn't execute them.

1586
00:56:16,560 --> 00:56:19,280
You want something where the downside of failure is manageable.

1587
00:56:19,280 --> 00:56:23,040
The pilot isn't about proving the technology works. The technology already works.

1588
00:56:23,040 --> 00:56:25,600
The pilot is about proving the governance model works.

1589
00:56:25,600 --> 00:56:27,840
Can you create an agent with an Entra Agent ID,

1590
00:56:27,840 --> 00:56:30,160
assign it policies and monitor it effectively?

1591
00:56:30,160 --> 00:56:33,200
Can you detect when it misbehaves, measure its cost,

1592
00:56:33,200 --> 00:56:35,040
and understand its total impact?

1593
00:56:35,040 --> 00:56:38,000
The pilot is answering operational questions, not technical ones.

1594
00:56:38,000 --> 00:56:38,960
Measure everything.

1595
00:56:38,960 --> 00:56:42,560
Look at the adoption rate to see what percentage of eligible users actually interact

1596
00:56:42,560 --> 00:56:44,560
with the agent versus those who ignore it.

1597
00:56:44,560 --> 00:56:48,320
Track the reasoning cost per outcome to see how much it costs the organization

1598
00:56:48,320 --> 00:56:50,080
to get one result from the agent.

1599
00:56:50,080 --> 00:56:54,720
Monitor agent uptime and reliability to ensure it's producing correct output consistently.

1600
00:56:54,720 --> 00:56:58,400
Gauge user trust through feedback and usage patterns to see if people actually believe

1601
00:56:58,400 --> 00:57:00,320
the agent or if they're second-guessing it.

1602
00:57:00,320 --> 00:57:02,880
Watch for compliance violations trending towards zero

1603
00:57:02,880 --> 00:57:05,760
to ensure the agent is staying within policy boundaries.
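
The pilot measurements listed above can be tracked with very little tooling. The following is a minimal sketch, not a reference implementation: the `PilotWindow` fields, the function names, and all sample numbers are hypothetical and only illustrate how adoption rate, reasoning cost per outcome, and the violation trend could be computed per reporting window.

```python
from dataclasses import dataclass


@dataclass
class PilotWindow:
    """One reporting window of a single-agent pilot (all fields are hypothetical)."""
    eligible_users: int
    active_users: int
    credits_spent_usd: float
    outcomes_delivered: int
    policy_violations: int


def adoption_rate(w: PilotWindow) -> float:
    # Share of eligible users who actually interacted with the agent.
    return w.active_users / w.eligible_users if w.eligible_users else 0.0


def cost_per_outcome(w: PilotWindow) -> float:
    # Reasoning spend divided by completed results; infinite if nothing was delivered.
    if w.outcomes_delivered == 0:
        return float("inf")
    return w.credits_spent_usd / w.outcomes_delivered


def violations_trending_down(windows: list) -> bool:
    # True when no window has more violations than the one before it.
    counts = [w.policy_violations for w in windows]
    return all(later <= earlier for earlier, later in zip(counts, counts[1:]))


# Three monthly windows of made-up pilot data.
months = [
    PilotWindow(200, 60, 900.0, 300, 5),
    PilotWindow(200, 90, 1100.0, 500, 2),
    PilotWindow(200, 110, 1200.0, 600, 0),
]
for m in months:
    print(f"adoption={adoption_rate(m):.0%}  cost/outcome=${cost_per_outcome(m):.2f}")
print("violations trending down:", violations_trending_down(months))
```

Uptime and user trust need their own data sources (health checks, edit and override rates), so they are left out of this sketch.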

1604
00:57:05,760 --> 00:57:07,600
You're running this pilot for probably three months,

1605
00:57:07,600 --> 00:57:09,440
which is enough time to see real patterns.

1606
00:57:09,440 --> 00:57:12,800
You'll discover that the agent handles certain types of inputs beautifully

1607
00:57:12,800 --> 00:57:14,080
while it struggles with others.

1608
00:57:14,080 --> 00:57:16,640
You'll find workflows that are more efficient with the agent

1609
00:57:16,640 --> 00:57:18,080
and others that are actually slower.

1610
00:57:18,080 --> 00:57:21,920
You'll see cost surprises where certain workflows burn more credits than you expected.

1611
00:57:21,920 --> 00:57:23,840
You'll see user behavior patterns,

1612
00:57:23,840 --> 00:57:26,880
like which users embrace agents and which ones resist.

1613
00:57:26,880 --> 00:57:29,600
Phase two takes what you learned and expands carefully.

1614
00:57:29,600 --> 00:57:32,880
You're now running three to five agents across two to three departments.

1615
00:57:32,880 --> 00:57:35,920
You're doing this deliberately, not haphazardly,

1616
00:57:35,920 --> 00:57:38,400
by establishing governance patterns that worked in the pilot

1617
00:57:38,400 --> 00:57:39,920
and applying them consistently.

1618
00:57:39,920 --> 00:57:42,640
You're building the muscle of agent architecture and operations.

1619
00:57:42,640 --> 00:57:45,280
You're still small enough to fix things quickly if they break,

1620
00:57:45,280 --> 00:57:47,360
but large enough to see organizational patterns.

1621
00:57:47,360 --> 00:57:50,480
This is where you're establishing the governance foundation that will scale.

1622
00:57:50,480 --> 00:57:51,600
You're not doing this ad hoc.

1623
00:57:51,600 --> 00:57:52,720
You're building the registry.

1624
00:57:52,720 --> 00:57:55,440
You're creating policy templates, defining approval workflows

1625
00:57:55,440 --> 00:57:57,760
for new agents and setting up cost dashboards.

1626
00:57:57,760 --> 00:58:00,880
You're training the people who will be responsible for agent operations.

1627
00:58:00,880 --> 00:58:02,640
You're creating the operating model,

1628
00:58:02,640 --> 00:58:04,480
not just deploying technology.

1629
00:58:04,480 --> 00:58:05,920
Phase three is scale production.

1630
00:58:05,920 --> 00:58:08,080
You roll out across the organization

1631
00:58:08,080 --> 00:58:10,960
and now your processes matter because they're running at scale.

1632
00:58:10,960 --> 00:58:14,560
You automate governance because you can't manually review hundreds of agents.

1633
00:58:14,560 --> 00:58:17,920
You optimize for cost because small inefficiencies become massive,

1634
00:58:17,920 --> 00:58:19,440
expensive problems at scale.

1635
00:58:19,440 --> 00:58:21,280
You're running dozens or hundreds of agents,

1636
00:58:21,280 --> 00:58:23,360
hundreds of thousands of executions per day

1637
00:58:23,360 --> 00:58:25,360
and millions of reasoning steps per week.

1638
00:58:25,360 --> 00:58:28,400
What kills organizations in this transition? Starting too big.

1639
00:58:28,400 --> 00:58:31,280
Some leaders see the vision and want agents everywhere immediately,

1640
00:58:31,280 --> 00:58:35,280
so they deploy dozens of agents across the organization without any governance.

1641
00:58:35,280 --> 00:58:38,400
Six months later, they've got agents accessing data they shouldn't,

1642
00:58:38,400 --> 00:58:40,400
duplicating work, and generating cost

1643
00:58:40,400 --> 00:58:42,240
spikes. They're in chaos.

1644
00:58:42,240 --> 00:58:44,320
They have to pull everything back and start over,

1645
00:58:44,320 --> 00:58:46,640
which means they've lost momentum and internal trust.

1646
00:58:46,640 --> 00:58:48,080
Moving too fast is another killer.

1647
00:58:48,080 --> 00:58:50,800
You skip the identity work and the governance foundation

1648
00:58:50,800 --> 00:58:52,880
to jump straight to agent deployment.

1649
00:58:52,880 --> 00:58:54,160
Then you hit a compliance problem

1650
00:58:54,160 --> 00:58:57,280
because an agent accessed protected data it shouldn't have.

1651
00:58:57,280 --> 00:59:00,000
Now you're retrofitting governance into a production system

1652
00:59:00,000 --> 00:59:02,080
while dealing with a breach or an audit finding.

1653
00:59:02,080 --> 00:59:04,880
The fix is messy, expensive and completely avoidable.

1654
00:59:04,880 --> 00:59:06,960
Ignoring cost is the final trap.

1655
00:59:06,960 --> 00:59:10,320
The pilot looked cheap because you weren't measuring carefully,

1656
00:59:10,320 --> 00:59:12,480
but now at scale the credit burn is massive.

1657
00:59:12,480 --> 00:59:14,320
It shows up in budget lines you didn't anticipate

1658
00:59:14,320 --> 00:59:16,160
and it becomes a financial surprise.

1659
00:59:16,160 --> 00:59:18,160
You're defending purchases you didn't plan for

1660
00:59:18,160 --> 00:59:20,160
and cutting agents just to control costs.

1661
00:59:20,160 --> 00:59:22,800
You're losing the benefits you built the system to achieve

1662
00:59:22,800 --> 00:59:26,000
even though you could have managed this with early cost discipline.

1663
00:59:26,000 --> 00:59:29,040
The organizations that succeed are the ones that pilot rigorously.

1664
00:59:29,040 --> 00:59:31,200
They establish governance patterns, move deliberately

1665
00:59:31,200 --> 00:59:32,480
and measure continuously.

1666
00:59:32,480 --> 00:59:34,480
They optimize based on data, they don't rush, and

1667
00:59:34,480 --> 00:59:36,080
they don't ignore warning signals.

1668
00:59:36,080 --> 00:59:38,480
They build the operating model as they scale the agents.

1669
00:59:39,280 --> 00:59:43,120
The organizational shift: from user-centric to agent-centric.

1670
00:59:43,120 --> 00:59:46,800
For 20 years, IT has focused on one metric.

1671
00:59:46,800 --> 00:59:50,480
Users. That number drove everything. Your headcount determined your budget.

1672
00:59:50,480 --> 00:59:52,800
10,000 employees meant 10,000 licenses.

1673
00:59:52,800 --> 00:59:54,960
The math was linear and predictable.

1674
00:59:54,960 --> 00:59:59,200
Every hiring cycle you bought more; every reduction you freed them up.

1675
00:59:59,200 --> 01:00:02,320
Supporting users meant managing devices, responding to tickets,

1676
01:00:02,320 --> 01:00:03,680
and securing identities.

1677
01:00:03,680 --> 01:00:06,720
The entire system was built for humans. Agents break that model

1678
01:00:06,720 --> 01:00:10,480
because agents aren't users. They don't have help desk tickets. They don't hire on,

1679
01:00:10,480 --> 01:00:12,800
and they don't retire. They aren't units of headcount;

1680
01:00:12,800 --> 01:00:16,800
they are units of work, and organizing around work requires a different way of thinking.

1681
01:00:16,800 --> 01:00:21,120
An agent-centric organization asks new questions. How many agents actually exist?

1682
01:00:21,120 --> 01:00:24,320
Not just the ones you deployed, but the shadow agents IT doesn't know about.

1683
01:00:24,320 --> 01:00:26,160
What are they actually capable of doing?

1684
01:00:26,160 --> 01:00:30,320
Who is accountable for their behavior? Is the reasoning cost trending up or down?

1685
01:00:30,320 --> 01:00:33,360
These questions require new roles. You need an agent architect.

1686
01:00:33,360 --> 01:00:36,480
This person designs the workflows. They understand the business process,

1687
01:00:36,480 --> 01:00:41,280
and they understand the governance constraints. They decide which reasoning steps are necessary

1688
01:00:41,280 --> 01:00:45,600
and which are just wasteful. You need agent owners. Every agent needs a business sponsor,

1689
01:00:45,600 --> 01:00:49,600
someone accountable for the outcome. Not the person who built it, but the person who owns the results.

1690
01:00:49,600 --> 01:00:53,440
If the agent is broken, that's on them. If it costs too much, that's on them.

1691
01:00:53,440 --> 01:00:58,560
A reasoning analyst is the next role. This person watches the spend. They identify where the money is going

1692
01:00:58,560 --> 01:01:02,880
and which agents are being wasteful. They are like FinOps for AI. Finally, you need an agent security

1693
01:01:02,880 --> 01:01:07,520
specialist. This person investigates anomalies. Is the agent compromised, or did the workflow just

1694
01:01:07,520 --> 01:01:11,840
change? They know what normal looks like, and they know when to be concerned. Your existing team can

1695
01:01:11,840 --> 01:01:16,320
evolve into these roles. Infrastructure becomes reasoning operations. Developers become architects.

1696
01:01:16,320 --> 01:01:22,080
Support personnel become owners. But they can't do this without new tools: new dashboards, new monitoring, new

1697
01:01:22,080 --> 01:01:30,080
processes. The shift is subtle, but it's profound. The unit of account moves from people to agents.

1698
01:01:30,080 --> 01:01:34,640
Support moves from reactive to proactive. Cost control moves from licenses to consumption. But here's

1699
01:01:34,640 --> 01:01:40,880
the problem. Right now, intelligence is external to IT. Business units build their own tools in the shadows.

1700
01:01:40,880 --> 01:01:45,280
IT can't see it, so they can't govern it. Agent-centric organizations make intelligence a managed

1701
01:01:45,280 --> 01:01:50,480
service. Agents are visible. They are monitored. They are optimized. Intelligence becomes infrastructure,

1702
01:01:50,480 --> 01:01:54,640
just like networks and storage. That is the shift: intelligence moves from the shadows into the

1703
01:01:54,640 --> 01:02:00,160
operating model. Building the governance foundation. You have a path now: pilot one agent, expand to a

1704
01:02:00,160 --> 01:02:04,960
handful, then scale. But that path only works if you build the foundation first. You can't just flip

1705
01:02:04,960 --> 01:02:09,840
a switch on governance. It has to exist before you scale. There are five components you need. First,

1706
01:02:09,840 --> 01:02:14,800
the agent registry. This isn't just a list; it's a system of record. What does the agent do? Who owns it?

1707
01:02:14,800 --> 01:02:19,840
What data can it access? The registry is your inventory. It is the single source of truth, because

1708
01:02:19,840 --> 01:02:26,400
you can't govern what you can't see. Second, policy templates. You're going to get dozens of requests:

1709
01:02:26,400 --> 01:02:32,080
"We need a service agent." "We need a procurement agent." If you design policies from scratch every time,

1710
01:02:32,080 --> 01:02:37,120
you will never finish. Build templates instead. A service agent template specifies what it can read

1711
01:02:37,120 --> 01:02:41,440
and what it can't modify. New agents inherit those rules. Governance stays consistent,

1712
01:02:41,440 --> 01:02:48,640
and deployment moves faster. Third, approval workflows. Who decides if an agent goes live? It can't be

1713
01:02:48,640 --> 01:02:52,960
the person who built it. That's a conflict of interest. You need a gate. The architect reviews the design.

1714
01:02:52,960 --> 01:02:57,360
Security reviews the risk. The business owner approves the launch. The gate stops agents from going

1715
01:02:57,360 --> 01:03:03,360
into production half-baked. Fourth, monitoring dashboards. You need real-time visibility, not a report at

1716
01:03:03,360 --> 01:03:08,160
the end of the month. You need to see uptime, reasoning costs, and policy violations right now. The

1717
01:03:08,160 --> 01:03:13,760
dashboard is the nervous system. It tells you something is wrong before it becomes a disaster. Fifth, incident

1718
01:03:13,760 --> 01:03:18,640
playbooks. When things go wrong, and they will, you need a response. If an agent uses too many credits, what

1719
01:03:18,640 --> 01:03:23,600
happens? The playbook tells you who to alert, how to investigate, and when to roll back. You aren't

1720
01:03:23,600 --> 01:03:27,840
making it up as you go. You have a process. These five pieces aren't optional. They are the difference

1721
01:03:27,840 --> 01:03:32,880
between success and a total mess. The most common mistake is building governance too late. By then, you

1722
01:03:32,880 --> 01:03:37,680
have shadow agents everywhere. You have cost overruns, and you have agents you can't shut down without

1723
01:03:37,680 --> 01:03:42,880
breaking the business. Retrofitting governance is nightmare work. It's expensive, and it's disruptive.
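
The first three components (registry, policy templates, approval gate) fit together naturally as a small data model. What follows is a rough sketch under stated assumptions, not a description of any Microsoft product or API: the class names, field names, role names, and sample agents are all hypothetical, and a real system of record would live in a database with audit logging rather than in memory.

```python
from dataclasses import dataclass, field

# Sign-offs the gate requires (role names are illustrative assumptions).
REQUIRED_APPROVALS = frozenset({"architect", "security", "business_owner"})


@dataclass(frozen=True)
class PolicyTemplate:
    """Reusable rule set that new agents inherit instead of bespoke policies."""
    name: str
    can_read: frozenset
    can_modify: frozenset = frozenset()


@dataclass
class AgentRecord:
    """One registry entry: what the agent does, who owns it, what it may touch."""
    agent_id: str
    purpose: str
    business_owner: str
    builder: str
    template: PolicyTemplate
    approvals: dict = field(default_factory=dict)  # role -> approver


class AgentRegistry:
    """In-memory system of record for agents (hypothetical sketch)."""

    def __init__(self) -> None:
        self._agents = {}

    def register(self, record: AgentRecord) -> None:
        if record.agent_id in self._agents:
            raise ValueError(f"{record.agent_id} is already registered")
        self._agents[record.agent_id] = record

    def approve(self, agent_id: str, role: str, approver: str) -> None:
        record = self._agents[agent_id]
        if role not in REQUIRED_APPROVALS:
            raise ValueError(f"unknown approval role: {role}")
        if approver == record.builder:
            # The person who built the agent cannot sign off on it.
            raise PermissionError("builder cannot approve their own agent")
        record.approvals[role] = approver

    def can_go_live(self, agent_id: str) -> bool:
        # The gate opens only when every required role has signed off.
        return REQUIRED_APPROVALS.issubset(self._agents[agent_id].approvals)

    def may_access(self, agent_id: str, resource: str, modify: bool = False) -> bool:
        template = self._agents[agent_id].template
        allowed = template.can_modify if modify else template.can_read
        return resource in allowed


service_template = PolicyTemplate("service-agent", can_read=frozenset({"tickets", "kb_articles"}))
registry = AgentRegistry()
registry.register(
    AgentRecord("triage-01", "Tier-one ticket triage", "head_of_support", "dev_alice", service_template)
)
registry.approve("triage-01", "architect", "arch_bob")
registry.approve("triage-01", "security", "sec_carol")
print("live after two approvals:", registry.can_go_live("triage-01"))
registry.approve("triage-01", "business_owner", "head_of_support")
print("live after three approvals:", registry.can_go_live("triage-01"))
print("read tickets:", registry.may_access("triage-01", "tickets"))
print("modify tickets:", registry.may_access("triage-01", "tickets", modify=True))
```

The template is frozen so that agents sharing it cannot drift apart silently; changing a rule means publishing a new template, which keeps governance consistent across every agent that inherits it.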

1724
01:03:43,760 --> 01:03:48,480
The discipline here feels wrong at first. Start with constraints. Let agents operate inside those

1725
01:03:48,480 --> 01:03:53,120
walls. Don't add governance later; it's harder and it's more painful. Organizations that succeed

1726
01:03:53,120 --> 01:03:57,360
build the foundation first. The constraints aren't obstacles. They are the structure that makes

1727
01:03:57,360 --> 01:04:02,320
scale possible. This work happens before you hit full production. You build the infrastructure while

1728
01:04:02,320 --> 01:04:06,880
running a few agents. You aren't rushing to deploy more. You are building the model. The hidden

1729
01:04:06,880 --> 01:04:11,680
costs of agentic transformation. Everyone wants to talk about the exciting stuff: the agents reasoning

1730
01:04:11,680 --> 01:04:16,400
over data, the multi-step workflows, the speed. But nobody wants to talk about the cost. I'm not just

1731
01:04:16,400 --> 01:04:20,640
talking about Copilot credits. I'm talking about the real costs, the ones that surprise you halfway

1732
01:04:20,640 --> 01:04:25,600
through the project. Yes, there is a direct cost for every reasoning step and every tool you call.

1733
01:04:25,600 --> 01:04:30,160
That is the obvious line item. Light workflows cost pennies, while heavy workflows cost dollars,

1734
01:04:30,160 --> 01:04:34,720
and while that adds up, it isn't your biggest problem. The hidden costs dwarf the reasoning credits.

1735
01:04:34,720 --> 01:04:39,040
It starts with governance infrastructure. You need dashboards to monitor agents and systems to

1736
01:04:39,040 --> 01:04:43,680
manage your registry. Some of this comes from Microsoft, but you will likely end up building or buying

1737
01:04:43,680 --> 01:04:48,240
the rest. Then you have the process costs. Approval workflows and architecture reviews don't run

1738
01:04:48,240 --> 01:04:53,200
themselves. They require someone's time. You are paying for policy reviews, incident investigations,

1739
01:04:53,200 --> 01:04:58,480
and a dedicated agent operations team. Architects, analysts, and security specialists are new roles,

1740
01:04:58,480 --> 01:05:02,880
and they are definitely not free. Identity hygiene is another massive cost that most organizations

1741
01:05:02,880 --> 01:05:07,200
underestimate. Before an agent can operate safely, your permissions have to be perfect, and let's be

1742
01:05:07,200 --> 01:05:12,800
honest, they probably aren't. Most companies have years of accumulated permission debt. People move roles

1743
01:05:12,800 --> 01:05:17,440
but keep their old access, and sensitive data ends up shared much more broadly than it should be. Agents

1744
01:05:17,440 --> 01:05:21,920
will expose every single one of these flaws. Before you roll them out, you have to clean the house.

1745
01:05:21,920 --> 01:05:26,480
That means hundreds of hours spent on access reviews and fixing broken inheritance. It isn't optional;

1746
01:05:26,480 --> 01:05:30,960
it's a prerequisite. Training costs are also substantial. Your teams don't know how to architect

1747
01:05:30,960 --> 01:05:35,520
for agents or how to monitor their ROI yet. They have to learn, which means paying for instructor time

1748
01:05:35,520 --> 01:05:40,640
and losing productivity while they study. This isn't a one-week onboarding. It's ongoing. New people join

1749
01:05:40,640 --> 01:05:44,960
and new agent types emerge, so the training debt just keeps compounding. Monitoring is more than

1750
01:05:44,960 --> 01:05:49,680
just a dashboard. You are instrumenting agents across the entire company and collecting telemetry

1751
01:05:49,680 --> 01:05:54,160
from dozens of sources. Connecting your monitoring stack to your cost and compliance systems is a

1752
01:05:54,160 --> 01:05:58,560
major integration project. Then there is incident response. This is the cost nobody budgets for until

1753
01:05:58,560 --> 01:06:03,200
something breaks. When an agent misbehaves, you need trained people and ready-to-go playbooks to shut

1754
01:06:03,200 --> 01:06:08,320
it down. You need people on call the moment agents touch critical systems. The total cost of ownership

1755
01:06:08,320 --> 01:06:13,120
for an agent program is usually three to five times the cost of the credits. If you spend a dollar

1756
01:06:13,120 --> 01:06:17,360
on reasoning, you will spend two to four more on the people and infrastructure to keep it safe.
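
Read as a multiplier on credit spend, the three-to-five-times figure can be turned into a quick budgeting check. This is only an illustrative sketch: the function names and the monthly credit figure are made up, and the multiplier band is the episode's rule of thumb rather than a measured constant.

```python
def tco_range(credit_spend: float, low: float = 3.0, high: float = 5.0) -> tuple:
    """Total-cost-of-ownership band implied by a low-to-high multiplier on credits."""
    return credit_spend * low, credit_spend * high


def non_credit_budget(credit_spend: float, low: float = 3.0, high: float = 5.0) -> tuple:
    """What remains for people, process, and infrastructure once credits are paid."""
    total_low, total_high = tco_range(credit_spend, low, high)
    return total_low - credit_spend, total_high - credit_spend


monthly_credits = 10_000.0  # hypothetical monthly credit spend in dollars
print("total cost band:", tco_range(monthly_credits))
print("non-credit band:", non_credit_budget(monthly_credits))
```

With a 3x to 5x band, every dollar of credits implies roughly two to four additional dollars for governance, staffing, and monitoring, which is the part a credits-only budget leaves out.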

1757
01:06:17,360 --> 01:06:22,160
Organizations that only budget for credits end up underfunded. They cut corners, governance weakens,

1758
01:06:22,160 --> 01:06:26,800
and the whole program eventually struggles. But here's the thing: the ROI justifies it if you do it right.

1759
01:06:26,800 --> 01:06:31,840
Automation kills manual work. Agents work 24 hours a day without fatigue or sick days. They make fewer

1760
01:06:31,840 --> 01:06:36,960
errors and stay consistent, which makes your decision-making much faster. These benefits compound over time.

1761
01:06:36,960 --> 01:06:42,400
If a well-run program saves three hours a week for a thousand users, you just reclaimed about 75 person-

1762
01:06:42,400 --> 01:06:46,400
years of work every year. That is transformative, and it's worth paying for the infrastructure to get there.
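
The reclaimed-time claim is easy to sanity-check. The sketch below assumes a 52-week year and a 2,080-hour person-year (40 hours a week); both are assumptions introduced here, not figures from the episode, and the function name is hypothetical.

```python
HOURS_PER_PERSON_YEAR = 2080.0  # assumption: 40 hours x 52 weeks


def person_years_reclaimed(hours_saved_per_user_per_week: float,
                           users: int,
                           weeks_per_year: int = 52) -> float:
    """Annual hours saved across all users, expressed in person-years."""
    annual_hours = hours_saved_per_user_per_week * users * weeks_per_year
    return annual_hours / HOURS_PER_PERSON_YEAR


# Three hours a week for a thousand users, as in the example above.
print(person_years_reclaimed(3.0, 1000))
```

Under these assumptions, three hours a week across a thousand users works out to about 75 person-years of capacity per year.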

1763
01:06:46,400 --> 01:06:50,320
The financial case is overwhelmingly positive, but only if you account for the costs upfront.

1764
01:06:50,320 --> 01:06:54,400
Organizations that treat governance as optional usually find out the truth in year two, when

1765
01:06:54,400 --> 01:06:59,680
the budget is already gone. They try to run a production system on a pilot budget, and it fails.

1766
01:06:59,680 --> 01:07:03,920
Get the full cost picture before you start. It's the only way to build something that lasts.

1767
01:07:03,920 --> 01:07:09,920
Measuring success: metrics that matter. This is where most organizations go wrong. They measure the

1768
01:07:09,920 --> 01:07:14,320
wrong things. They celebrate deployments that should actually worry them. They stare at vanity metrics

1769
01:07:14,320 --> 01:07:19,200
while the program quietly fails. Start by ignoring the metrics everyone loves, like total reasoning credit

1770
01:07:19,200 --> 01:07:23,600
spend. That number is meaningless without context. You could spend ten thousand dollars a month and

1771
01:07:23,600 --> 01:07:29,040
create massive value, or you could spend the same amount burning money on broken processes. Credit

1772
01:07:29,040 --> 01:07:33,440
spend tells you nothing about success, so don't use it to justify your budget. The number of agents

1773
01:07:33,440 --> 01:07:38,080
deployed is another seductive vanity metric. Maybe you deployed a hundred agents, but half of them

1774
01:07:38,080 --> 01:07:42,880
aren't being used and a quarter are just duplicating work. Quantity has nothing to do with value. If you

1775
01:07:42,880 --> 01:07:47,120
measure success by agent count, you are building for the wrong reason. The same goes for adoption rates.

1776
01:07:47,120 --> 01:07:52,160
Seeing that 70% of your workforce uses agents looks great in a slide deck, but if those people are

1777
01:07:52,160 --> 01:07:56,400
being forced to use them, or if the agents are just shifting work around, that percentage is just

1778
01:07:56,400 --> 01:08:01,280
noise. Vanity metrics feel good because they are easy to report, but they are lying to you. The metrics

1779
01:08:01,280 --> 01:08:06,400
that actually matter are harder to find, but they tell the truth. Start with reasoning cost per outcome.

1780
01:08:06,400 --> 01:08:12,240
How much does it cost the company to get a specific result? If an agent generates a report in eight

1781
01:08:12,240 --> 01:08:16,320
minutes for two dollars, and that same report takes a human four hours at eighty dollars, the math is

1782
01:08:16,320 --> 01:08:21,280
easy: the agent is forty times cheaper and thirty times faster. But if an agent takes a dollar to do a five-

1783
01:08:21,280 --> 01:08:26,000
minute task, it can cost more than the human. Cost per outcome tells you immediately if

1784
01:08:26,000 --> 01:08:31,040
an agent actually makes sense. Then there is agent uptime. If an agent is supposed to be working and it

1785
01:08:31,040 --> 01:08:35,520
isn't, that is a failure. Ninety-nine percent uptime sounds okay until you realize that means seven

1786
01:08:35,520 --> 01:08:39,360
hours of downtime every month. If your agent is critical infrastructure, you have to treat those

1787
01:08:39,360 --> 01:08:44,560
outages seriously. You also need to track compliance violations. Agents will eventually violate a policy

1788
01:08:44,560 --> 01:08:49,200
because of configuration drift or a scenario you didn't expect. These violations should be rare

1789
01:08:49,200 --> 01:08:53,360
and trending towards zero. If that number is staying the same or going up, your governance model is

1790
01:08:53,360 --> 01:08:58,560
broken. Time saved per interaction is the most important signal. You need to measure the actual time

1791
01:08:58,560 --> 01:09:03,280
eliminated from a workflow. If an agent turns a twenty-minute task into a four-minute task, those

1792
01:09:03,280 --> 01:09:08,480
sixteen minutes add up fast. A hundred executions a day saves you a person-year of work every few months.

1793
01:09:08,480 --> 01:09:14,000
That is a real, measurable win. Finally, look at user trust, but don't use a survey. Watch their behavior.

1794
01:09:14,000 --> 01:09:18,880
Do they use the agent when they have a choice, or do they find workarounds? Do they accept the agent's

1795
01:09:18,880 --> 01:09:23,360
output, or do they spend all their time editing it? If people aren't choosing to use the agents, the

1796
01:09:23,360 --> 01:09:28,080
deployment doesn't matter. The core question is whether the agent makes the company faster, smarter,

1797
01:09:28,080 --> 01:09:33,600
or more compliant. Faster means more throughput. Smarter means fewer errors. More compliant means

1798
01:09:33,600 --> 01:09:38,480
lower risk. If an agent doesn't move the needle on one of those three things, it is just expensive overhead.
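
The outcome metrics above reduce to a few lines of arithmetic. The sketch below reworks the episode's own examples under explicit assumptions: the eighty dollars is treated as the total human cost of the four-hour report, a month is taken as 730 hours, and a person-year as 2,080 hours. The function names are hypothetical.

```python
HOURS_PER_MONTH = 730.0         # assumption: average calendar month
HOURS_PER_PERSON_YEAR = 2080.0  # assumption: 40 hours x 52 weeks


def cost_advantage(human_cost: float, agent_cost: float) -> float:
    """How many times cheaper the agent is for one outcome."""
    return human_cost / agent_cost


def speedup(human_minutes: float, agent_minutes: float) -> float:
    """How many times faster the agent delivers one outcome."""
    return human_minutes / agent_minutes


def monthly_downtime_hours(uptime_percent: float) -> float:
    """Expected hours of downtime per month at a given uptime percentage."""
    return (100.0 - uptime_percent) / 100.0 * HOURS_PER_MONTH


def hours_saved_per_day(before_min: float, after_min: float, runs_per_day: int) -> float:
    """Daily hours eliminated from a workflow."""
    return (before_min - after_min) * runs_per_day / 60.0


# Report example: 8 minutes and $2 for the agent, 4 hours and $80 for a human.
print(f"cost advantage: {cost_advantage(80.0, 2.0):.0f}x")
print(f"speedup: {speedup(240.0, 8.0):.0f}x")

# 99 percent uptime over an average month.
print(f"downtime: {monthly_downtime_hours(99.0):.1f} hours/month")

# A 20-minute task cut to 4 minutes, run 100 times a day.
daily = hours_saved_per_day(20.0, 4.0, 100)
print(f"saved: {daily:.1f} hours/day, one person-year every "
      f"{HOURS_PER_PERSON_YEAR / daily:.0f} days of operation")
```

Compliance violations and user trust are trend and behavior signals rather than single ratios, so they are better tracked as time series than computed this way.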

1799
01:09:38,480 --> 01:09:43,200
Kill it and move the budget somewhere else. Measure the outcome, not the activity. Outcome is the only

1800
01:09:43,200 --> 01:09:49,600
thing that determines if you've built something that lasts. The uncomfortable truth: most agents will fail.

1801
01:09:49,600 --> 01:09:53,680
You've built the foundation. You've deployed the pilots and measured the costs. You've trained the

1802
01:09:53,680 --> 01:09:58,400
teams and established the governance. You are ready to scale, and here is what nobody wants to hear:

1803
01:09:58,400 --> 01:10:02,960
most of your agents won't make it. Not all of them; some will be transformative. Those few will

1804
01:10:02,960 --> 01:10:07,920
justify the entire investment. But statistically, more agents will fail than succeed. They'll get

1805
01:10:07,920 --> 01:10:12,080
deployed and run for a few months, but then they'll stall. Usage will drop. The business will stop

1806
01:10:12,080 --> 01:10:16,320
investing in them. They'll become technical debt. Eventually, they'll get shut down. This isn't a failure

1807
01:10:16,320 --> 01:10:21,360
of the technology. It's a failure of expectations. Why do agents fail? It starts with misaligned complexity.

1808
01:10:21,360 --> 01:10:26,560
The business process looked automatable in theory, but in practice it's full of edge cases. The agent

1809
01:10:26,560 --> 01:10:32,240
handles 90% of scenarios cleanly, but the other 10% are exceptions that require human judgment. The agent

1810
01:10:32,240 --> 01:10:37,200
gets stuck on those. It escalates everything that's complicated. It becomes a filter that passes only

1811
01:10:37,200 --> 01:10:42,320
the hard work to humans. The humans still do most of the work. The agent adds overhead. Value is negative.

1812
01:10:42,320 --> 01:10:47,520
The agent gets killed. User trust failures are just as common. An agent makes a confident recommendation

1813
01:10:47,520 --> 01:10:52,800
that turns out to be wrong. Not catastrophically, just wrong. A salesperson trusted the agent's lead scoring

1814
01:10:52,800 --> 01:10:56,800
because the agent said a prospect was high probability. The salesperson spent time on it, but nothing

1815
01:10:56,800 --> 01:11:01,040
happened. Now the salesperson doesn't trust the agent anymore. They start second-guessing everything

1816
01:11:01,040 --> 01:11:05,600
it says. They slow down and double-check. The agent becomes a slower way to do work they already knew

1817
01:11:05,600 --> 01:11:10,640
how to do. Why use it? Then there is the reasoning cost. You deployed an agent thinking it would save time,

1818
01:11:10,640 --> 01:11:15,520
and it does, but barely. It saves three minutes per execution, but each execution costs a dollar in

1819
01:11:15,520 --> 01:11:21,040
credits. Do the math. To save eight hours of human time, you need to run the agent 160 times. That costs

1820
01:11:21,040 --> 01:11:26,000
160 dollars. The human time is worth maybe 60 dollars at average cost. You're spending almost three

1821
01:11:26,000 --> 01:11:31,040
times more on the agent than on the human alternative. The math doesn't work. The agent gets shut down.
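
The same arithmetic generalizes into a break-even check that can be run before deployment. This is a sketch using only the numbers quoted above (three minutes saved per run, one dollar per run, sixty dollars for eight hours of human time); the function names are hypothetical, and the hourly rate is simply derived from those figures.

```python
def runs_needed(hours_to_save: float, minutes_saved_per_run: float) -> float:
    """Executions required to save a given number of human hours."""
    return hours_to_save * 60.0 / minutes_saved_per_run


def break_even_cost_per_run(human_hourly_cost: float, minutes_saved_per_run: float) -> float:
    """Highest per-run price at which the agent still matches the human cost."""
    return human_hourly_cost * minutes_saved_per_run / 60.0


hours = 8.0
human_cost = 60.0                  # value of eight human hours, as quoted above
human_hourly = human_cost / hours  # implied hourly cost

runs = runs_needed(hours, minutes_saved_per_run=3.0)
agent_cost = runs * 1.00           # one dollar in credits per execution

print(f"runs needed: {runs:.0f}")
print(f"agent cost: ${agent_cost:.2f} vs human cost: ${human_cost:.2f}")
print(f"agent is {agent_cost / human_cost:.2f}x the human cost")
print(f"break-even price per run: ${break_even_cost_per_run(human_hourly, 3.0):.3f}")
```

On these numbers the agent would have to cost well under forty cents per run, or save far more than three minutes, before it pays for itself.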

1822
01:11:31,040 --> 01:11:35,680
Governance constraints also strangle capability. The agent was designed to do something useful,

1823
01:11:35,680 --> 01:11:40,560
but policy restrictions limit what it can actually do. It can't access the data it needs. It can't

1824
01:11:40,560 --> 01:11:44,640
take the actions that would deliver value. It can only make suggestions. The humans still have to

1825
01:11:44,640 --> 01:11:48,960
execute. The agent becomes advisory instead of autonomous. Advisory agents have a hard time

1826
01:11:48,960 --> 01:11:53,920
justifying their cost. Edge cases overwhelm the design. The agent was trained on historical data,

1827
01:11:53,920 --> 01:11:58,480
but real-world conditions include scenarios the training didn't anticipate. A customer service agent

1828
01:11:58,480 --> 01:12:02,960
was trained on normal support tickets, but then it encounters a ticket where the customer is threatening

1829
01:12:02,960 --> 01:12:07,840
legal action. The agent doesn't know what to do. It escalates. Three quarters of escalations are edge

1830
01:12:07,840 --> 01:12:13,440
cases. The agent isn't reducing work; it's creating triage overhead. This is the hard truth: many agents fail

1831
01:12:13,440 --> 01:12:18,160
because the problem was harder than anticipated or the solution was less applicable than hoped.

1832
01:12:18,160 --> 01:12:22,640
Not because the technology is broken, but because the assumptions were wrong. The organizations that succeed

1833
01:12:22,640 --> 01:12:27,840
are ruthless. They pilot aggressively. They measure continuously. They kill agents that aren't working.

1834
01:12:27,840 --> 01:12:31,680
They don't get emotionally attached to deployments. They don't keep underperforming agents running

1835
01:12:31,680 --> 01:12:36,080
because they've invested in them. They cut losses. They double down on what works. They iterate based

1836
01:12:36,080 --> 01:12:40,960
on data. The organizations that fail do the opposite. They deploy an agent and it underperforms.

1837
01:12:40,960 --> 01:12:45,840
Instead of killing it, they try to fix it. They tweak the prompt and add more context. They adjust policies.

1838
01:12:45,840 --> 01:12:49,840
They invest more in something that's not delivering. They refuse to admit failure. They blame the

1839
01:12:49,840 --> 01:12:53,680
technology, or they blame the users, or they blame the model. They keep the agent running because

1840
01:12:53,680 --> 01:12:58,240
they've made a public commitment to it. By the time they finally shut it down, they've wasted months.

1841
01:12:58,240 --> 01:13:03,600
The discipline here is counterintuitive. Fail fast, fail cheap, move on. A failed pilot that cost

1842
01:13:03,600 --> 01:13:08,160
$50,000 in three months is cheaper than a failed production system that's costing money monthly

1843
01:13:08,160 --> 01:13:13,200
for two years because you can't admit it's not working. Pilot ruthlessly. Measure ruthlessly. Kill

1844
01:13:13,200 --> 01:13:19,680
ruthlessly. Scale what works. The agentic operating model is not coming; it's here. Agent 365 is in general

1845
01:13:19,680 --> 01:13:24,960
availability. Work IQ is in general availability. A2A is production ready. The question is not whether

1846
01:13:24,960 --> 01:13:29,360
you should do this. It's how fast you can move. Organizations that build the governance foundation

1847
01:13:29,360 --> 01:13:33,920
now, that establish cost discipline, and that measure outcomes instead of activity will operate at

1848
01:13:33,920 --> 01:13:38,160
machine speed in 18 months. They'll have embedded intelligence in their workflows. They'll have a

1849
01:13:38,160 --> 01:13:42,880
competitive advantage. Organizations that ignore this will spend the next two years catching up. By

1850
01:13:42,880 --> 01:13:47,280
then, the leaders will have moved further ahead. The shift is fundamental. We are moving from AI as a

1851
01:13:47,280 --> 01:13:52,080
search tool to AI as an operating layer. We are moving from service accounts to digital employees

1852
01:13:52,080 --> 01:13:57,600
with Entra Agent IDs. We are moving from static context to liquid context. We are moving from retrieval

1853
01:13:57,600 --> 01:14:01,840
to reasoning. The model is set. The tools are ready. The only variable is execution.

Mirko Peters

Founder of m365.fm, m365.show and m365con.net

Mirko Peters is a Microsoft 365 expert, content creator, and founder of m365.fm, a platform dedicated to sharing practical insights on modern workplace technologies. His work focuses on Microsoft 365 governance, security, collaboration, and real-world implementation strategies.

Through his podcast and written content, Mirko provides hands-on guidance for IT professionals, architects, and business leaders navigating the complexities of Microsoft 365. He is known for translating complex topics into clear, actionable advice, often highlighting common mistakes and overlooked risks in real-world environments.

With a strong emphasis on community contribution and knowledge sharing, Mirko is actively building a platform that connects experts, shares experiences, and helps organizations get the most out of their Microsoft 365 investments.