Your PowerShell Scripts Are Obsolete


For years, PowerShell scripts were the foundation of Microsoft 365 automation. IT admins built massive script libraries to onboard users, assign licenses, provision devices, configure Exchange, manage permissions, and automate repetitive operational work across cloud and hybrid environments.
But enterprise IT is changing fast.
In this episode, we explore why traditional PowerShell-driven automation is becoming increasingly obsolete in modern Microsoft 365 environments. Static scripts struggle to keep up with rapidly changing APIs, evolving security models, Zero Trust architectures, AI-driven workflows, and the growing complexity of cloud-native services.
We break down how Microsoft Graph, event-driven architectures, low-code automation, Copilot, AI agents, and modern orchestration platforms are reshaping enterprise automation. Instead of maintaining fragile scripts that constantly require updates, organizations are moving toward adaptive, API-first, and AI-assisted automation models that can react dynamically to business events and security requirements.
The episode also examines the growing security and governance challenges surrounding legacy automation approaches, including excessive permissions, credential handling, deprecated modules, and maintenance overhead. With Microsoft continuing to standardize around Microsoft Graph and modern authentication models, many older PowerShell approaches are already reaching their limits.
If you work with Microsoft 365, Azure, automation, identity, or enterprise operations, this episode provides a practical look at where automation is heading next — and why the future may rely less on manually written scripts and more on intelligent orchestration powered by APIs, AI, and cloud-native services.
Relying on outdated PowerShell scripts puts your organization at risk. Modern IT environments have grown complex, with cloud-native and hybrid systems becoming the norm. Legacy scripts struggle to keep up because they often depend on platform-specific features and lack structured error handling. You need tools that adapt to dynamic workflows and unpredictable requests. Orchestration engines like Semantic Kernel can now place a reasoning layer on top of your PowerShell cmdlets, making automation smarter and more resilient. Ask yourself whether your current automation approach matches the pace of today’s enterprise evolution.
Key Takeaways
- Outdated PowerShell scripts can lead to serious risks, including automation failures and security breaches.
- Regularly review and update your scripts to avoid compatibility issues with new PowerShell versions.
- Use modern error handling techniques to catch problems early and keep your automation reliable.
- Secure your scripts by avoiding hardcoded credentials and using encrypted storage methods.
- Leverage tools like PSScriptAnalyzer to identify and fix outdated code in your scripts.
- Adopt intelligent orchestration tools like Semantic Kernel to enhance automation and adapt to changing needs.
- Document your scripts clearly to improve maintainability and help your team understand their purpose.
- Focus on modular code to make updates easier and improve the overall efficiency of your automation.
Risks of Outdated PowerShell Scripts

Outdated scripts can create serious problems for your organization. You may not notice these issues right away, but they can disrupt your daily operations and put your business at risk. Let’s look at the main dangers you face when you rely on obsolete automation.
Script Failures
Version Incompatibility
You might run a script that worked last year, only to see it fail today. PowerShell changes over time, and new versions occasionally introduce breaking changes, most visibly in the move from Windows PowerShell 5.1 to PowerShell 7. If your scripts use features or parameters that no longer exist, they will not run as expected. This can stop important tasks and cause confusion for your team.
Deprecated Cmdlets
Cmdlets are the building blocks of PowerShell. When Microsoft removes or replaces cmdlets, your scripts can stop working. You may see errors that are hard to understand. This often happens when you use obsolete scripts that depend on old commands. You need to update your scripts to keep up with these changes.
Security Gaps
Exposure to Exploits
Old scripts often lack modern security features, and attackers look for these weaknesses. Scripts written for older PowerShell versions also become increasingly difficult to maintain and operate safely. If your scripts do not follow current security standards, attackers can use the gaps to steal data or disrupt your systems. Update your scripts to align with current standards before an incident forces you to.
Weak Credential Handling
Many obsolete scripts store passwords in plain text or use weak methods to handle credentials. This puts sensitive information at risk. Modern PowerShell provides safer ways to manage credentials, but old scripts do not use these tools. You should review your scripts and remove any unsafe code.
Productivity Loss
Troubleshooting Legacy Code
When scripts fail, you spend time searching for the problem. Legacy code can be hard to read and understand. You may not know why a script breaks or how to fix it. This slows down your work and takes time away from other important tasks.
Manual Workarounds
If your scripts do not work, you may need to do tasks by hand. Manual workarounds take time and increase the chance of mistakes. You lose the benefits of automation. Keeping your scripts up to date helps you avoid these problems and keeps your team productive.
Tip: Regularly review and update your PowerShell scripts to avoid these risks. Modern automation tools can help you stay secure and efficient.
Why PowerShell Scripts Become Obsolete
You may wonder why your scripts stop working or become unreliable over time. The answer lies in how PowerShell changes and how best practices evolve. Let’s break down the main reasons.
PowerShell Evolution
Core Changes
PowerShell updates its core engine regularly. These updates bring new features and fix old problems. Sometimes, they remove functions that your scripts depend on. If you use an older version, your scripts may not run as expected. You need to check which version you use and make sure your scripts match the latest standards.
Module Updates
Modules add extra tools to PowerShell. Developers update these modules to support new technology. When a module changes, your scripts may lose access to certain commands. You must review your scripts after every module update. This helps you avoid surprises and keeps your automation running smoothly.
Note: Always test your scripts after updating PowerShell or its modules. This step prevents unexpected failures.
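A pre-flight check is one way to make version and module mismatches fail loudly instead of halfway through a run. The sketch below is illustrative only: the function name Test-ScriptPrerequisite and its parameters are assumptions made for this example, and only $PSVersionTable and Get-Module are built into PowerShell.

```powershell
# Hypothetical helper: returns a list of human-readable problems, or nothing if all checks pass.
function Test-ScriptPrerequisite {
    [CmdletBinding()]
    param(
        # Minimum engine version, compared on major.minor only.
        [version]$MinimumPSVersion = '5.1',
        # Module name -> minimum version, e.g. @{ 'PSScriptAnalyzer' = '1.20.0' }
        [hashtable]$RequiredModules = @{}
    )

    $problems = @()

    # Build a plain [version] so the comparison behaves the same on 5.1 and 7.x.
    $current = [version]::new($PSVersionTable.PSVersion.Major, $PSVersionTable.PSVersion.Minor)
    if ($current -lt $MinimumPSVersion) {
        $problems += "PowerShell $MinimumPSVersion or later is required; found $current."
    }

    foreach ($name in $RequiredModules.Keys) {
        # Pick the newest installed copy of the module, if any.
        $installed = Get-Module -ListAvailable -Name $name |
            Sort-Object -Property Version -Descending |
            Select-Object -First 1

        if (-not $installed) {
            $problems += "Module '$name' is not installed."
        }
        elseif ($installed.Version -lt [version]$RequiredModules[$name]) {
            $problems += "Module '$name' $($installed.Version) is older than required $($RequiredModules[$name])."
        }
    }

    $problems
}
```

A script could call this first and stop if anything comes back, for example by throwing when @(Test-ScriptPrerequisite -MinimumPSVersion '7.2').Count is greater than zero.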
Obsolete Parameters
Deprecated Features
PowerShell sometimes removes features that are no longer useful. If your scripts rely on these features, they become obsolete. You may see errors or warnings when you run them. You should replace deprecated features with modern alternatives. This keeps your automation safe and reliable.
Unsupported Syntax
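Parameter values and syntax change as PowerShell evolves, and old scripts may use forms that are no longer supported. For example, this call works in Windows PowerShell 5.1 but fails in PowerShell 6 and later:
Get-Content -Path .\image.bin -Encoding Byte
PowerShell 6 removed Byte as a valid value for -Encoding and introduced the -AsByteStream parameter in its place. The default text encoding also changed to UTF-8 without a byte order mark, so scripts that relied on the old default can behave differently. You need to update your code to match the current syntax.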
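One documented change of this kind affects reading raw bytes: Windows PowerShell 5.1 uses -Encoding Byte, while PowerShell 6 and later use -AsByteStream. The sketch below shows one way a script could support both during a migration. The function name Read-FileBytes is an assumption made for this example.

```powershell
# Hypothetical helper: reads a file as a byte array on both Windows PowerShell 5.1 and PowerShell 6+.
function Read-FileBytes {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )

    if ($PSVersionTable.PSVersion.Major -ge 6) {
        # PowerShell 6+: the Byte encoding value was removed in favor of -AsByteStream.
        Get-Content -LiteralPath $Path -AsByteStream -Raw
    }
    else {
        # Windows PowerShell 5.1 and earlier.
        Get-Content -LiteralPath $Path -Encoding Byte -Raw
    }
}
```

Once every host runs PowerShell 7, the version branch can be dropped and only the -AsByteStream call kept.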
Shifting Best Practices
Modern Error Handling
You must use modern error handling to catch problems early. Old scripts often ignore errors or use outdated methods. Today, you can use structured error handling to make your scripts more resilient. This practice helps you find issues faster and keeps your automation reliable.
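As a minimal sketch of structured error handling, the function below uses try, catch, and finally together with -ErrorAction Stop, which turns a non-terminating cmdlet error into one that catch can handle. The function name Get-ConfigText is an assumption made for this example.

```powershell
# Hypothetical helper: returns the file text, or nothing (with a warning) if the file is missing.
function Get-ConfigText {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )

    try {
        # Without -ErrorAction Stop, a missing file would only write an error and continue.
        Get-Content -Path $Path -Raw -ErrorAction Stop
    }
    catch [System.Management.Automation.ItemNotFoundException] {
        # An expected failure: report it clearly and return nothing.
        Write-Warning "Config file not found: $Path"
    }
    catch {
        # Anything unexpected is rethrown so the caller sees the original error.
        throw
    }
    finally {
        # Runs whether the read succeeded or failed.
        Write-Verbose "Finished read attempt for $Path"
    }
}
```

The typed catch keeps the expected case (a missing file) separate from genuine surprises, which are passed on rather than silently swallowed.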
Output Formatting
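Old scripts may produce messy or hard-to-read results, often because they build text with Write-Host or string concatenation. Return objects from your functions and apply formatting only at the end, for example with Select-Object, Format-Table, or Export-Csv. Clear, structured output makes it easier to understand what your automation does and to reuse its results.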
- Review your scripts often.
- Replace obsolete parameters and features.
- Use modern error handling and output formatting.
By keeping your scripts up to date, you avoid common pitfalls and ensure your automation stays effective.
Real-World Issues with Obsolete PowerShell Scripts
You face real challenges when you depend on obsolete scripts in your daily operations. These problems can disrupt your workflow, expose your data, and create compatibility headaches. Let’s explore the most common issues you might encounter.
Automation Breakdowns
Failed Scheduled Tasks
Scheduled tasks often rely on scripts to run at specific times. When you use obsolete scripts, these tasks can fail without warning. You might see errors because the scripts use outdated parameters or unsupported code. Failed tasks can stop important processes, such as backups or user provisioning. You need to monitor your scheduled jobs and update scripts regularly to avoid these breakdowns.
Automation failures can lead to missed deadlines and lost productivity. Always check your scripts after PowerShell updates.
Data Loss Scenarios
Obsolete scripts sometimes lack proper error handling. If a script fails during a data transfer or backup, you risk losing valuable information. You may not notice the problem until it’s too late. Modern PowerShell offers better ways to protect your data, but old scripts do not use these features. You should review your scripts to ensure they handle errors and protect your files.
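One way to keep a backup step from failing silently is to verify the copy before reporting success. The sketch below compares SHA-256 hashes of source and destination. The function name Copy-FileVerified is an assumption made for this example, and a real backup job would likely add logging and retries.

```powershell
# Hypothetical helper: copies a file and returns $true only if the copy is byte-identical.
function Copy-FileVerified {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string]$Source,
        [Parameter(Mandatory)]
        [string]$Destination
    )

    try {
        Copy-Item -LiteralPath $Source -Destination $Destination -Force -ErrorAction Stop

        # Compare content hashes instead of trusting that the copy completed.
        $sourceHash = (Get-FileHash -LiteralPath $Source -Algorithm SHA256 -ErrorAction Stop).Hash
        $targetHash = (Get-FileHash -LiteralPath $Destination -Algorithm SHA256 -ErrorAction Stop).Hash
        if ($sourceHash -ne $targetHash) {
            throw "Hash mismatch after copying '$Source'."
        }
        $true
    }
    catch {
        # Surface the failure to the caller's error stream and signal it in the return value.
        Write-Error -Message "Backup of '$Source' failed: $($_.Exception.Message)"
        $false
    }
}
```

A scheduled job can then branch on the return value and alert someone, rather than discovering the missing backup weeks later.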
Security Incidents
Hardcoded Credentials
Many legacy scripts store usernames and passwords directly in the code. Attackers look for these weaknesses because they are easy targets. For example:
- Sitecore XP versions 10.1–10.4 contain hardcoded internal user accounts with weak, pre-set passwords. These credentials are identical across all default installations.
- Attackers discovered a PowerShell script with hardcoded credentials that provided admin access to Uber’s Privileged Access Management system. This led to the compromise of multiple services, including AWS, GCP, and Slack.
You must remove hardcoded credentials from your scripts. Use secure methods to manage sensitive information.
Unpatched Vulnerabilities
Obsolete scripts often miss critical security updates. Attackers exploit these gaps to gain access to your systems. You need to update your scripts and patch vulnerabilities as soon as possible. Regular reviews help you spot weaknesses before they become problems.
Compatibility Errors
Module Conflicts
PowerShell modules change over time. When you run obsolete scripts, you may see conflicts between modules. These conflicts can cause scripts to fail or produce unexpected results. You should test your scripts after every module update to ensure compatibility.
Syntax Failures
Syntax rules evolve with new PowerShell versions. Old scripts may use syntax that no longer works. For example, a script might use a command that has changed or been removed. You need to update your scripts to match the latest syntax and avoid failures.
Tip: Keep your scripts up to date to prevent automation breakdowns, security incidents, and compatibility errors.
Identifying and Updating Old PowerShell Scripts

Keeping your automation reliable means you must regularly review and update your scripts. Old PowerShell scripts can become obsolete quickly as technology changes. You need to know how to spot problems, use the right tools, and follow a clear audit process.
Signs of Obsolescence
Deprecated Cmdlets
You should watch for deprecated cmdlets in your scripts. When Microsoft removes or replaces a cmdlet, your automation may break. If you see warnings about deprecated commands, that is a sign your scripts need attention. These warnings often appear after PowerShell updates or when you install new modules.
Frequent Errors
Frequent errors signal that your scripts are becoming obsolete. If you notice scripts failing more often or producing unexpected results, you should investigate. Errors can come from unsupported syntax, missing modules, or obsolete parameters. Do not ignore these signs. They point to deeper issues in your automation.
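Deprecated commands can also be found before they fail, by scanning script text with PowerShell's built-in parser. The sketch below is a rough illustration: the function name Find-DeprecatedCommand is an assumption, and the default list only contains a few cmdlets from the deprecated MSOnline and AzureAD modules, so you would maintain your own list.

```powershell
# Hypothetical helper: reports calls to commands named in a deny list, with line numbers.
function Find-DeprecatedCommand {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string]$ScriptText,
        # Example entries only; extend this with the commands that matter in your environment.
        [string[]]$DeprecatedNames = @('Connect-MsolService', 'Get-MsolUser', 'Connect-AzureAD', 'Get-AzureADUser')
    )

    $tokens = $null
    $parseErrors = $null
    # Parse without executing anything.
    $ast = [System.Management.Automation.Language.Parser]::ParseInput($ScriptText, [ref]$tokens, [ref]$parseErrors)

    # Find every command invocation in the script, including nested script blocks.
    $commands = $ast.FindAll(
        { param($node) $node -is [System.Management.Automation.Language.CommandAst] },
        $true
    )

    foreach ($command in $commands) {
        $name = $command.GetCommandName()
        if ($name -and ($DeprecatedNames -contains $name)) {
            [pscustomobject]@{
                Command = $name
                Line    = $command.Extent.StartLineNumber
            }
        }
    }
}
```

To scan a file, pass its content as one string, for example with Get-Content -Raw.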
Review Tools
Script Analyzer
You can use tools to review your scripts for outdated code and defects. One popular tool is PSScriptAnalyzer. It checks your scripts for common problems and helps you find obsolete code. Here is a quick overview:
| Feature | Description |
|---|---|
| Static Code Checker | Identifies outdated code and potential defects in PowerShell scripts. |
| Built-in Rules | Checks for uninitialized variables, use of PSCredential type, and more. |
| Diagnostic Results | Provides errors and warnings to inform users about potential code defects. |
| Installation Command | Install-Module -Name PSScriptAnalyzer for easy installation from PowerShell Gallery. |
Using a script analyzer helps you catch issues before they cause failures.
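As a sketch of how such a review could be run across a folder of scripts, the wrapper below calls Invoke-ScriptAnalyzer when the module is installed and warns otherwise. The wrapper name Get-ScriptFinding is an assumption made for this example. Invoke-ScriptAnalyzer and its -Path, -Recurse, and -Severity parameters come from the PSScriptAnalyzer module.

```powershell
# Hypothetical wrapper: lists warnings and errors for every script under a folder.
function Get-ScriptFinding {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )

    # Fail softly if the analyzer has not been installed yet.
    if (-not (Get-Module -ListAvailable -Name PSScriptAnalyzer)) {
        Write-Warning 'PSScriptAnalyzer is not installed. Run Install-Module -Name PSScriptAnalyzer first.'
        return
    }

    Invoke-ScriptAnalyzer -Path $Path -Recurse -Severity Warning, Error |
        Select-Object -Property ScriptName, Line, Severity, RuleName, Message
}
```

Piping the result to Export-Csv gives you a simple backlog of findings to work through.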
Version Control
Version control systems help you manage your scripts over time. You can track changes, see who made edits, and roll back to earlier versions if needed. Here are some benefits:
- You make changes to scripts trackable and reversible.
- You avoid script sprawl and accidental overwrites.
- You see a history of what changed, when, and by whom.
- You collaborate with others using branches and pull requests.
Version control keeps your automation organized and secure.
Audit Process
Prioritizing Critical Scripts
You should audit your scripts regularly. Start by focusing on the most critical automation. Follow these steps:
- Enable logging of PowerShell activity. Use module logging, script block logging, and transcription to capture details.
- Configure a suitable log size. Set it to at least 150MB to keep enough data for review.
- Continuously track PowerShell events. Monitor logs in Event Viewer or use third-party tools for better tracking.
This process helps you find and update scripts that matter most.
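The first two steps could be sketched as follows for Windows PowerShell on a single machine. This is an assumption-heavy illustration: the function name is hypothetical, the registry path shown is the policy location for Windows PowerShell, and in most organizations these settings would be applied through Group Policy rather than a script. The function supports -WhatIf so you can preview the changes, and it needs an elevated session to apply them.

```powershell
# Hypothetical helper: enables script block logging and enlarges the PowerShell operational log.
function Enable-ScriptBlockLoggingPolicy {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        # Policy key for Windows PowerShell; treat as an assumption and verify for your environment.
        [string]$KeyPath = 'HKLM:\SOFTWARE\Policies\Microsoft\Windows\PowerShell\ScriptBlockLogging',
        # 150MB expressed in bytes (157286400).
        [long]$MaxLogBytes = 150MB
    )

    if ($PSCmdlet.ShouldProcess($KeyPath, 'Enable script block logging')) {
        if (-not (Test-Path -Path $KeyPath)) {
            New-Item -Path $KeyPath -Force | Out-Null
        }
        Set-ItemProperty -Path $KeyPath -Name 'EnableScriptBlockLogging' -Value 1 -Type DWord
    }

    $logName = 'Microsoft-Windows-PowerShell/Operational'
    if ($PSCmdlet.ShouldProcess($logName, "Set maximum log size to $MaxLogBytes bytes")) {
        # wevtutil is the Windows event log utility; sl sets log properties, /ms is the maximum size.
        wevtutil.exe sl $logName "/ms:$MaxLogBytes"
    }
}
```

Running Enable-ScriptBlockLoggingPolicy -WhatIf prints the planned changes without touching the registry or the event log.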
Update Checklist
Create a checklist for updating scripts. Include steps like removing deprecated cmdlets, fixing obsolete parameters, and improving error handling. Document your changes and test each script before putting it back into production.
Regular script maintenance keeps your automation safe, efficient, and ready for the future.
Modernizing PowerShell Automation
Modernizing your PowerShell automation helps you avoid the pitfalls of obsolete scripts. You gain reliability, security, and easier maintenance. Let’s explore how you can refactor your scripts, enhance security, and improve maintainability.
Refactoring Scripts
Refactoring your scripts means making them more efficient and easier to manage. You should focus on updating cmdlets and removing obsolete parameters.
Update Cmdlets
You need to use the latest cmdlets in your scripts. PowerShell’s verb-noun syntax makes it simple to find and use updated commands. When you update cmdlets, you improve compatibility and performance. Filtering data early in your script reduces processing time. You can also optimize data retrieval by combining commands and removing unnecessary variables.
- Filter data at the start to speed up execution.
- Use parallel processing with background jobs for faster results.
- Combine commands to reduce complexity and improve reliability.
If you need CSV output, use Export-Csv rather than hand-built strings, so that quoting and delimiters are handled correctly and your data stays consistent.
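The filtering and export advice above can be sketched with a small, self-contained example. It creates a temporary folder with sample files, which are stand-ins invented for this illustration, then filters at the source with -Filter instead of piping everything to Where-Object, and writes the result with Export-Csv.

```powershell
# Set up a throwaway folder with sample files (illustrative data only).
$workDir = Join-Path ([System.IO.Path]::GetTempPath()) ("demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $workDir | Out-Null
Set-Content -LiteralPath (Join-Path $workDir 'app.log')   -Value 'first log'
Set-Content -LiteralPath (Join-Path $workDir 'audit.log') -Value 'second log'
Set-Content -LiteralPath (Join-Path $workDir 'notes.txt') -Value 'not a log'

$csvPath = Join-Path $workDir 'log-report.csv'

# Filter early: the provider returns only *.log files, so later stages process less data.
Get-ChildItem -Path $workDir -Filter '*.log' -File |
    Select-Object -Property Name, Length |
    Export-Csv -Path $csvPath -NoTypeInformation

# Read the report back to confirm its shape.
$report = @(Import-Csv -Path $csvPath)
```

The same pattern applies to remote queries: ask the service for only the rows and properties you need instead of downloading everything and filtering locally.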
Remove Obsolete Parameters
Obsolete parameters cause errors and make your scripts unreliable. You should review your scripts for parameters that no longer work or have been replaced. Remove these parameters to prevent failures and keep your automation running smoothly. PowerShell code that uses unsupported syntax often fails after updates, so always check for changes in parameter usage.
Tip: Regularly check Microsoft’s documentation for cmdlet updates and parameter changes.
Security Enhancements
Security is critical when you automate tasks. You must protect sensitive data and ensure your scripts are safe from threats.
Secure Credentials
Storing credentials securely prevents data leaks and unauthorized access. Avoid hardcoding passwords in your scripts. Use secure methods like the Get-Credential cmdlet or encrypted credential stores. Secure coding practices help you mitigate risks and protect your environment.
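As a minimal sketch of keeping secrets out of script text, the function below builds a PSCredential from values supplied at run time and falls back to an interactive Get-Credential prompt. The function name and the environment variable names AUTOMATION_USER and AUTOMATION_PASSWORD are assumptions made for this example. They stand in for whatever your pipeline or secret store injects, and a vault-backed option such as the Microsoft.PowerShell.SecretManagement module may be a better fit for production.

```powershell
# Hypothetical helper: returns a PSCredential without any password appearing in the script.
function Get-AutomationCredential {
    [CmdletBinding()]
    param(
        # Names of the environment variables to read; both are illustrative.
        [string]$UserVariable = 'AUTOMATION_USER',
        [string]$SecretVariable = 'AUTOMATION_PASSWORD'
    )

    $user = [Environment]::GetEnvironmentVariable($UserVariable)
    $secret = [Environment]::GetEnvironmentVariable($SecretVariable)

    if ([string]::IsNullOrEmpty($user) -or [string]::IsNullOrEmpty($secret)) {
        # Nothing was injected, so ask a human instead of guessing.
        return Get-Credential -Message 'Enter the automation account credentials'
    }

    # Convert the injected value to a SecureString and wrap it in a credential object.
    $secure = ConvertTo-SecureString -String $secret -AsPlainText -Force
    [System.Management.Automation.PSCredential]::new($user, $secure)
}
```

The script itself now contains no secret, so it can be stored in version control and shared without exposing the account.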
Script Signing
Signing your scripts ensures their integrity and authenticity. You control script execution by configuring execution policies. Script signing prevents unauthorized changes and helps you verify the source of your automation.
- Configure execution policies to restrict script execution.
- Sign your scripts to guarantee authenticity.
- Follow secure coding practices to protect sensitive data.
- Review and update your PowerShell environment regularly.
Note: Script signing and secure credential handling are essential for compliance and security.
Maintainability
Maintainable scripts save you time and reduce errors. You should focus on documentation and modular code to make your automation easier to manage.
Documentation
Clear documentation helps you and your team understand what each script does. Describe the purpose, inputs, and expected outputs. Use comments in your PowerShell code to explain complex logic. Good documentation makes troubleshooting easier and speeds up onboarding for new team members.
Modular Code
Modular code improves reusability and maintainability. Create functions that you can use across different scripts. When you update code in one place, all scripts using that function benefit. Functions can accept input from the pipeline and use built-in parameters like -Verbose and -ErrorAction. This approach enhances functionality and keeps your automation organized.
- Reuse functions to save time and reduce mistakes.
- Update code in one location to improve all related scripts.
- Use built-in parameters for better control and error handling.
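A small sketch of such a reusable function is shown below. It accepts pipeline input and, because of [CmdletBinding()], automatically supports common parameters like -Verbose and -ErrorAction. The function name ConvertTo-NormalizedUpn and the sample addresses are assumptions made for this example.

```powershell
# Hypothetical helper: trims and lower-cases user principal names from the pipeline.
function ConvertTo-NormalizedUpn {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [string]$UserPrincipalName
    )

    process {
        # The process block runs once for each item that arrives on the pipeline.
        $normalized = $UserPrincipalName.Trim().ToLowerInvariant()
        Write-Verbose "Normalized '$UserPrincipalName' to '$normalized'"
        $normalized
    }
}

# Usage: pipe values in; add -Verbose to see what the function did.
$upns = ' Adele@Contoso.example', 'ALEX@contoso.example ' | ConvertTo-NormalizedUpn
```

Placing functions like this in a shared module means a fix made once reaches every script that imports it.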
PowerShell’s try-catch-finally error handling mechanism gives you sophisticated control over errors. You can manage failures gracefully and keep your automation reliable.
Callout: Modular scripts and clear documentation make your automation scalable and easier to maintain.
The Future: Intelligent Orchestration with Semantic Kernel
From Scripts to Reasoning
Adaptive Workflows
You have seen how traditional PowerShell scripts follow a fixed path. They run the same way every time, which can make them brittle in complex environments. With Semantic Kernel, you move beyond this limitation. You gain adaptive workflows that respond to real-time needs. AI agents can analyze your requests, understand your goals, and build plans that fit the situation. This means your automation can adjust to changes without you rewriting code.
Here is how adaptive workflows benefit your organization:
| Benefit | Description |
|---|---|
| Improved Efficiency | AI agents complete tasks faster, reducing project timelines. |
| Cost Savings | Automation lowers labor costs and prevents expensive mistakes. |
| Improved Precision | Data-driven insights help you make better decisions. |
| Scalability | You can add more agents to handle growth without extra overhead. |
Contextual Automation
Context matters in modern IT. Semantic Kernel brings contextual automation to your environment. It understands the intent behind your requests, not just the commands. For example, you might ask for a report or a CSV export. The kernel interprets your goal and chooses the best way to deliver results. This approach reduces errors and makes automation more reliable, even as your needs change.
Semantic Kernel Overview
AI-Callable Cmdlets
Semantic Kernel turns your trusted PowerShell cmdlets into AI-callable tools. This means AI can select and run the right code for each task. You do not need to worry about chaining scripts or handling every detail. The kernel orchestrates the process, connecting AI models, structured data, and automation tools. It also bridges different systems, so your workflows can span multiple platforms.
- AI orchestration connects your automation with smart decision-making.
- Automated task execution lets AI handle routine work, freeing up your time.
- Dynamic orchestration adapts to new information, keeping your automation flexible.
- Enhanced interoperability allows different services to work together smoothly.
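To make the idea of AI-callable tools concrete, the sketch below shows the general pattern in plain PowerShell: each tool has a name, a description a model could read, and a handler, and a dispatcher executes a tool call expressed as JSON. This is a conceptual illustration only and does not use Semantic Kernel's actual API. The registry layout, the function name Invoke-ToolCall, the tool get_user_summary, and the stub handler are all assumptions made for this example.

```powershell
# Hypothetical registry: tool name -> description plus the code that implements it.
$ToolRegistry = @{
    'get_user_summary' = @{
        Description = 'Returns basic information about a user, given a userId.'
        Handler     = {
            param($Arguments)
            # Stub result; a real handler would call a cmdlet such as Get-MgUser here.
            [pscustomobject]@{ UserId = $Arguments.userId; Source = 'stub' }
        }
    }
}

# Hypothetical dispatcher: runs the tool named in a JSON tool call.
function Invoke-ToolCall {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string]$ToolCallJson,
        [Parameter(Mandatory)]
        [hashtable]$Registry
    )

    $call = $ToolCallJson | ConvertFrom-Json

    # Refuse anything that was not explicitly registered.
    if (-not $Registry.ContainsKey($call.name)) {
        throw "Unknown tool '$($call.name)'."
    }

    & $Registry[$call.name].Handler $call.arguments
}

# A tool call in the shape a model might emit after reading the descriptions.
$result = Invoke-ToolCall -ToolCallJson '{"name":"get_user_summary","arguments":{"userId":"adele@contoso.example"}}' -Registry $ToolRegistry
```

The point of the pattern is that the orchestration layer decides which tool to call and with what arguments, while the cmdlets you already trust stay the only code that touches your environment.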
Dynamic Planning
With Semantic Kernel, you get dynamic planning. The kernel does not follow an obsolete, rigid script. It builds workflows on the fly, based on your current context and business logic. This means you can automate complex processes without constant updates. The kernel ensures that your automation stays reliable as your environment evolves.
Strategic Imperative
Staying Competitive
You need to stay ahead in a fast-changing world. Intelligent orchestration tools like Semantic Kernel give you a competitive edge. The kernel acts as a central engine for managing AI resources. It supports multi-agent orchestration, so specialized agents can work together. You also gain better control over AI usage, which helps you meet compliance and brand standards. By adopting Semantic Kernel, you reduce development time and risk, helping you bring solutions to market faster.
M365 FM Insights
M365 FM highlights how leading companies use Semantic Kernel to transform their automation. Fortune 500 organizations choose the kernel for its flexibility and modular design. Developers can focus on creating great user experiences while the kernel manages AI coordination and compliance. This shift from static scripting to intelligent orchestration ensures your business remains agile and ready for the future.
Tip: Embrace intelligent orchestration now to future-proof your automation and keep your organization competitive.
You cannot afford to let obsolete scripts slow down your business. Modern automation with PowerShell gives you better maintainability, scalability, and security. You also gain cross-platform compatibility and structured data handling. Tools like Semantic Kernel help you move beyond the limits of static scripts. Start by auditing your scripts, updating your automation, and staying informed about new solutions. This approach keeps your organization secure and ready for the future.
- Maintainability improves with modular scripts.
- Scalability supports enterprise growth.
- Security features reduce risks.
- Cross-platform support increases flexibility.
FAQ
What are the main dangers of using outdated PowerShell scripts?
You risk automation failures, security breaches, and wasted time. Old scripts often break after updates or expose sensitive data. You should review and update your scripts regularly to avoid these problems.
How can I tell if my PowerShell script is obsolete?
Watch for frequent errors, warnings about deprecated cmdlets, or failed tasks. If your script uses old syntax or parameters, it may not work with new PowerShell versions. Use tools like PSScriptAnalyzer to check your code.
Why should I consider Semantic Kernel for automation?
Semantic Kernel lets you build adaptive workflows. You gain automation that understands your goals and context. This approach helps you handle complex tasks and reduces the need for constant script updates.
How do I secure credentials in PowerShell scripts?
Never store passwords in plain text. Use Get-Credential or encrypted credential stores. Script signing and secure credential handling protect your data and help you meet compliance requirements.
What is the best way to update old scripts?
Start by identifying critical scripts. Remove deprecated cmdlets and obsolete parameters. Test your changes before deploying. Document updates for your team. Use version control to track changes.
Can I automate across multiple Microsoft 365 services with Semantic Kernel?
Yes. Semantic Kernel connects PowerShell cmdlets and AI models. You can automate tasks across Microsoft 365 services with dynamic, context-aware workflows.
1
00:00:00,000 --> 00:00:02,400
Your PowerShell scripts are about to become a liability.
2
00:00:02,400 --> 00:00:03,960
This isn't because PowerShell is dying
3
00:00:03,960 --> 00:00:05,360
or because your code is bad.
4
00:00:05,360 --> 00:00:08,040
The problem is that the assumptions you used to build them
5
00:00:08,040 --> 00:00:10,800
no longer match how modern enterprises actually function.
6
00:00:10,800 --> 00:00:12,240
You designed those scripts for a world
7
00:00:12,240 --> 00:00:15,120
that doesn't exist anymore, where problems were predictable,
8
00:00:15,120 --> 00:00:17,880
inputs were clean, and every decision tree could fit
9
00:00:17,880 --> 00:00:19,080
on a single whiteboard.
10
00:00:19,080 --> 00:00:20,200
By the end of this video,
11
00:00:20,200 --> 00:00:22,760
you will understand how to evolve your entire script library
12
00:00:22,760 --> 00:00:24,320
into something completely different.
13
00:00:24,320 --> 00:00:25,880
We aren't talking about a replacement,
14
00:00:25,880 --> 00:00:28,280
but an evolution into a fleet of thinking workers.
15
00:00:28,280 --> 00:00:29,800
These agents authenticate themselves,
16
00:00:29,800 --> 00:00:31,360
reason about their environment,
17
00:00:31,360 --> 00:00:34,280
and execute tasks autonomously using Microsoft Graph
18
00:00:34,280 --> 00:00:36,000
as their single source of truth.
19
00:00:36,000 --> 00:00:38,120
You will see why Semantic Kernel is much more than a wrapper
20
00:00:38,120 --> 00:00:41,200
for PowerShell, acting instead as an orchestration engine
21
00:00:41,200 --> 00:00:44,000
that teaches models when and why to use your tools.
22
00:00:44,000 --> 00:00:47,440
The stakes are very real for your career and your organization.
23
00:00:47,440 --> 00:00:50,440
Companies that fail to operationalize this shift by 2026
24
00:00:50,440 --> 00:00:52,920
will find themselves at a massive disadvantage.
25
00:00:52,920 --> 00:00:55,640
While they are still writing manual conditional logic,
26
00:00:55,640 --> 00:00:57,480
their competitors will be deploying agents
27
00:00:57,480 --> 00:01:00,200
that adapt to variations without a single code change.
28
00:01:00,200 --> 00:01:02,600
One team will be stuck managing permanent service accounts
29
00:01:02,600 --> 00:01:04,200
with dangerous tenant-wide access,
30
00:01:04,200 --> 00:01:05,600
while the other uses just-in-time
31
00:01:05,600 --> 00:01:07,800
identity scoped to specific tasks.
32
00:01:07,800 --> 00:01:09,600
This is not a theoretical exercise or a vision
33
00:01:09,600 --> 00:01:10,760
of the distant future.
34
00:01:10,760 --> 00:01:13,360
We are taking the cmdlets you already use every day,
35
00:01:13,360 --> 00:01:16,120
like Get-MgUser and Update-MgUserMailboxSetting,
36
00:01:16,120 --> 00:01:18,120
and plugging them into a reasoning layer.
37
00:01:18,120 --> 00:01:20,000
This layer decides when to call them,
38
00:01:20,000 --> 00:01:21,640
determines the correct sequence,
39
00:01:21,640 --> 00:01:24,760
and figures out how to recover when things go sideways.
40
00:01:24,760 --> 00:01:26,280
The assumption that broke.
41
00:01:26,280 --> 00:01:29,000
The assumption that is currently breaking your automation
42
00:01:29,000 --> 00:01:31,400
is that your enterprise environment is predictable
43
00:01:31,400 --> 00:01:33,120
and unchanging.
44
00:01:33,120 --> 00:01:34,720
Think about how you build a script today.
45
00:01:34,720 --> 00:01:36,440
You start by defining the happy path,
46
00:01:36,440 --> 00:01:37,640
which is the perfect scenario
47
00:01:37,640 --> 00:01:39,520
where everything goes exactly as planned.
48
00:01:39,520 --> 00:01:41,840
The user exists, the license is available,
49
00:01:41,840 --> 00:01:44,760
and there are no strange edge cases to worry about.
50
00:01:44,760 --> 00:01:46,360
You write your steps in a linear order,
51
00:01:46,360 --> 00:01:48,520
so step one checks the account status,
52
00:01:48,520 --> 00:01:50,120
step two applies the license,
53
00:01:50,120 --> 00:01:51,880
and step three sends a notification.
54
00:01:51,880 --> 00:01:53,440
You test it and deploy it,
55
00:01:53,440 --> 00:01:56,080
and it works perfectly until a user has special characters
56
00:01:56,080 --> 00:01:59,840
in their name or Microsoft changes an API response format.
57
00:01:59,840 --> 00:02:01,160
When those failures happen,
58
00:02:01,160 --> 00:02:04,360
you start adding conditional branches to handle the mess.
59
00:02:04,360 --> 00:02:07,000
You add error handling to catch missing users,
60
00:02:07,000 --> 00:02:08,240
or log license failures,
61
00:02:08,240 --> 00:02:10,200
and your logic begins to accumulate.
62
00:02:10,200 --> 00:02:11,840
The script grows longer and more complex,
63
00:02:11,840 --> 00:02:14,880
making it harder to follow, test, and maintain over time.
64
00:02:14,880 --> 00:02:16,800
But here's the thing that nobody wants to admit.
65
00:02:16,800 --> 00:02:18,560
You aren't actually solving the core problem
66
00:02:18,560 --> 00:02:19,760
by adding more code.
67
00:02:19,760 --> 00:02:22,640
You're just building a massive pile of if-then-else blocks
68
00:02:22,640 --> 00:02:25,600
that represent one person's guess about what might go wrong.
69
00:02:25,600 --> 00:02:28,000
Tomorrow, something you never predicted will happen,
70
00:02:28,000 --> 00:02:29,600
like a mailbox migration getting stuck
71
00:02:29,600 --> 00:02:31,040
in a state you've never seen before.
72
00:02:31,040 --> 00:02:33,000
Maybe the Graph API throttles your requests
73
00:02:33,000 --> 00:02:35,040
because they are hitting the limit too fast,
74
00:02:35,040 --> 00:02:37,120
or an error message comes back in a format
75
00:02:37,120 --> 00:02:38,520
your regex can't read.
76
00:02:38,520 --> 00:02:39,600
Every time this happens,
77
00:02:39,600 --> 00:02:41,120
someone has to open the script,
78
00:02:41,120 --> 00:02:42,280
remember how it works,
79
00:02:42,280 --> 00:02:43,840
add another branch, and redeploy it.
80
00:02:43,840 --> 00:02:45,880
That is a massive amount of maintenance debt
81
00:02:45,880 --> 00:02:47,960
that most teams never bother to track.
82
00:02:47,960 --> 00:02:51,240
Every new scenario requires a full cycle of revision and testing,
83
00:02:51,240 --> 00:02:54,040
creating constant friction for the smallest changes.
84
00:02:54,040 --> 00:02:56,560
Real enterprises are messy and unpredictable.
85
00:02:56,560 --> 00:02:59,280
User requests vary wildly from one day to the next,
86
00:02:59,280 --> 00:03:02,120
and error conditions branch out in ways you can't foresee.
87
00:03:02,120 --> 00:03:04,320
Business logic shifts the moment an organization
88
00:03:04,320 --> 00:03:06,640
reorganizes or adopts a new tool.
89
00:03:06,640 --> 00:03:08,480
A simple ticket about shared drive access
90
00:03:08,480 --> 00:03:10,160
could mean 15 different things,
91
00:03:10,160 --> 00:03:13,120
and each one requires a different diagnostic path to fix.
92
00:03:13,120 --> 00:03:15,000
The failing assumption is that you can predict
93
00:03:15,000 --> 00:03:17,000
every single one of these variations
94
00:03:17,000 --> 00:03:18,600
and encode them into a script.
95
00:03:18,600 --> 00:03:19,800
You can't do it profitably,
96
00:03:19,800 --> 00:03:22,040
and you certainly can't do it in a way that scales.
97
00:03:22,040 --> 00:03:23,880
The real issue isn't the script itself,
98
00:03:23,880 --> 00:03:25,200
but the architecture behind it.
99
00:03:25,200 --> 00:03:28,360
Scripts are deterministic engines that follow a set sequence,
100
00:03:28,360 --> 00:03:31,240
and they are excellent at doing exactly what they are told.
101
00:03:31,240 --> 00:03:34,600
However, they have no ability to reason about what they are doing.
102
00:03:34,600 --> 00:03:36,000
They can't look at a ticket and realize
103
00:03:36,000 --> 00:03:38,320
it's a licensing issue rather than an access problem.
104
00:03:38,320 --> 00:03:40,680
They can't read an error and decide to wait 30 seconds
105
00:03:40,680 --> 00:03:41,800
before trying again.
106
00:03:41,800 --> 00:03:44,080
That gap between what the script is programmed to do
107
00:03:44,080 --> 00:03:45,880
and what the situation actually requires
108
00:03:45,880 --> 00:03:47,280
is where the friction lives.
109
00:03:47,280 --> 00:03:48,960
This gap is growing wider,
110
00:03:48,960 --> 00:03:51,800
as environments become more complex and less predictable.
111
00:03:51,800 --> 00:03:53,160
This is where the architecture breaks,
112
00:03:53,160 --> 00:03:55,120
and it's why we need a new approach.
113
00:03:55,120 --> 00:03:56,920
Where automation hits its ceiling.
114
00:03:56,920 --> 00:03:59,320
Automation is powerful, but it has a hard limit.
115
00:03:59,320 --> 00:04:01,680
That limit is exactly where scripts start to break.
116
00:04:01,680 --> 00:04:03,280
We know that automation works brilliantly
117
00:04:03,280 --> 00:04:05,160
when three specific things are true.
118
00:04:05,160 --> 00:04:06,720
First, the input has to be structured,
119
00:04:06,720 --> 00:04:08,720
meaning you know exactly what you are going to receive
120
00:04:08,720 --> 00:04:10,400
and what format it will be in.
121
00:04:10,400 --> 00:04:12,400
Second, the decision tree must be finite,
122
00:04:12,400 --> 00:04:14,120
so you can map out every possible path
123
00:04:14,120 --> 00:04:15,600
and turn it into code.
124
00:04:15,600 --> 00:04:17,920
Third, the outcomes have to be deterministic.
125
00:04:17,920 --> 00:04:20,320
The same input should always produce the same output.
126
00:04:20,320 --> 00:04:23,120
If you run a script to set up a user with a specific name,
127
00:04:23,120 --> 00:04:25,360
that user gets created the same way every single time.
128
00:04:25,360 --> 00:04:27,480
But in reality, IT tickets do not arrive
129
00:04:27,480 --> 00:04:28,720
in a structured format.
130
00:04:28,720 --> 00:04:31,080
They show up as messy, natural language
131
00:04:31,080 --> 00:04:32,760
with a lot of hidden context.
132
00:04:32,760 --> 00:04:34,600
Someone sends an email saying their email is not
133
00:04:34,600 --> 00:04:37,480
syncing on their laptop, they cannot access the shared folder,
134
00:04:37,480 --> 00:04:39,800
and for some reason this only happens on Tuesdays.
135
00:04:39,800 --> 00:04:41,480
That is one single ticket, but it contains
136
00:04:41,480 --> 00:04:42,840
at least three separate problems.
137
00:04:42,840 --> 00:04:44,600
The person writing it does not understand
138
00:04:44,600 --> 00:04:46,560
how those issues relate to each other,
139
00:04:46,560 --> 00:04:49,240
and the ticket does not tell you if it is a network issue,
140
00:04:49,240 --> 00:04:51,520
a device problem, or a permission error.
141
00:04:51,520 --> 00:04:54,600
The real context is buried deep inside the description.
142
00:04:54,600 --> 00:04:56,240
Here is what actually happens today.
143
00:04:56,240 --> 00:04:58,080
The ticket lands in your queue and you read it
144
00:04:58,080 --> 00:04:59,920
and then you have to stop and think.
145
00:04:59,920 --> 00:05:02,200
You ask yourself, what is actually happening here,
146
00:05:02,200 --> 00:05:04,720
whether this is a networking problem or an identity problem,
147
00:05:04,720 --> 00:05:07,360
and if you should check the device or the cloud first.
148
00:05:07,360 --> 00:05:09,400
You have to figure out which questions will narrow down
149
00:05:09,400 --> 00:05:12,160
the cause so you run diagnostics and look at error messages
150
00:05:12,160 --> 00:05:14,040
to build a mental model of the problem.
151
00:05:14,040 --> 00:05:16,520
Only after all that thinking do you know which script to run,
152
00:05:16,520 --> 00:05:18,320
or if a script will even help at all.
153
00:05:18,320 --> 00:05:20,280
That entire middle part is the thinking part,
154
00:05:20,280 --> 00:05:23,640
and that is where the real bottleneck lives in your organization.
155
00:05:23,640 --> 00:05:25,840
The delay is not in the execution of the fix,
156
00:05:25,840 --> 00:05:29,000
it is in the diagnosis, the triage, and the decision making.
157
00:05:29,000 --> 00:05:30,360
This is why traditional automation
158
00:05:30,360 --> 00:05:32,360
has hit a ceiling in IT operations.
159
00:05:32,360 --> 00:05:34,640
Scripts handle the execution phase beautifully,
160
00:05:34,640 --> 00:05:36,080
but they cannot handle the reasoning
161
00:05:36,080 --> 00:05:38,440
that has to happen before the execution starts.
162
00:05:38,440 --> 00:05:40,320
A script cannot read a ticket and realize
163
00:05:40,320 --> 00:05:42,720
that because a mailbox and a device are both involved,
164
00:05:42,720 --> 00:05:44,560
it needs to check Azure AD health
165
00:05:44,560 --> 00:05:46,480
before looking at device compliance.
166
00:05:46,480 --> 00:05:47,880
It cannot see an error message
167
00:05:47,880 --> 00:05:49,560
and suggest that it looks like a caching issue
168
00:05:49,560 --> 00:05:51,800
that should be cleared before escalating to support.
169
00:05:51,800 --> 00:05:53,480
It cannot ask a user for more details
170
00:05:53,480 --> 00:05:55,560
when a ticket is too vague to understand.
171
00:05:55,560 --> 00:05:58,400
What organizations do instead is keep a human in the loop
172
00:05:58,400 --> 00:06:00,000
to act as the brain.
173
00:06:00,000 --> 00:06:01,880
The script handles the predictable steps,
174
00:06:01,880 --> 00:06:03,680
but the human has to handle the reasoning
175
00:06:03,680 --> 00:06:06,520
by reading the ticket and deciding what to do next.
176
00:06:06,520 --> 00:06:08,160
The human is the reasoning engine,
177
00:06:08,160 --> 00:06:10,120
while the script is just the executor,
178
00:06:10,120 --> 00:06:11,920
and this makes the human the bottleneck
179
00:06:11,920 --> 00:06:14,000
because people are slower than code.
180
00:06:14,000 --> 00:06:15,840
Humans get tired, they miss small details,
181
00:06:15,840 --> 00:06:17,800
and they are incredibly expensive to scale
182
00:06:17,800 --> 00:06:18,880
as the workload grows.
183
00:06:18,880 --> 00:06:20,840
You have probably noticed this in your own work,
184
00:06:20,840 --> 00:06:22,520
the slowest part of closing a ticket
185
00:06:22,520 --> 00:06:24,200
is not running the actual fix,
186
00:06:24,200 --> 00:06:26,920
but rather understanding what the problem is in the first place.
187
00:06:26,920 --> 00:06:28,800
You spend 90% of your cognitive effort
188
00:06:28,800 --> 00:06:31,280
figuring out which tool to use or which assumption to test
189
00:06:31,280 --> 00:06:33,480
while the actual command only takes a second.
190
00:06:33,480 --> 00:06:36,200
That fast part is the only thing scripts currently handle,
191
00:06:36,200 --> 00:06:38,200
so the ceiling we are hitting is not a limitation
192
00:06:38,200 --> 00:06:40,280
of PowerShell or any other language.
193
00:06:40,280 --> 00:06:42,680
It is a limitation of the architecture itself.
194
00:06:42,680 --> 00:06:45,040
Scripts are designed to be deterministic executors,
195
00:06:45,040 --> 00:06:47,200
which means they are not built to understand context
196
00:06:47,200 --> 00:06:49,360
or handle the ambiguity of a human request.
197
00:06:49,360 --> 00:06:51,240
They are not designed to look at a complex situation
198
00:06:51,240 --> 00:06:52,920
and ask what is actually going on.
199
00:06:52,920 --> 00:06:55,120
This is where the issue stops being about technology
200
00:06:55,120 --> 00:06:56,600
and starts being about architecture.
201
00:06:56,600 --> 00:06:58,120
You need a reasoning layer that can read
202
00:06:58,120 --> 00:07:00,680
an unstructured ticket, understand the context,
203
00:07:00,680 --> 00:07:02,560
and decide which diagnostics to run.
204
00:07:02,560 --> 00:07:04,400
You need something that can interpret results
205
00:07:04,400 --> 00:07:06,560
and decide on the next execution steps,
206
00:07:06,560 --> 00:07:08,520
even if that involves branching or backtracking
207
00:07:08,520 --> 00:07:09,600
when things go wrong.
208
00:07:09,600 --> 00:07:11,160
You need something that can think,
209
00:07:11,160 --> 00:07:13,360
and that is exactly where autonomous agents
210
00:07:13,360 --> 00:07:16,600
change the game. The authentication paradox.
211
00:07:16,600 --> 00:07:18,360
There is a massive security problem
212
00:07:18,360 --> 00:07:19,800
that nobody is talking about,
213
00:07:19,800 --> 00:07:21,480
and it is baked directly into how
214
00:07:21,480 --> 00:07:23,280
PowerShell automation works right now.
215
00:07:23,280 --> 00:07:25,320
When you write a script, the very first thing you do
216
00:07:25,320 --> 00:07:26,840
is authenticate by running a command
217
00:07:26,840 --> 00:07:28,960
like Connect-MgGraph with a service principal.
218
00:07:28,960 --> 00:07:30,040
The script gets a token,
219
00:07:30,040 --> 00:07:32,040
and that token stays active for the entire time
220
00:07:32,040 --> 00:07:33,040
the script is running.
221
00:07:33,040 --> 00:07:34,720
If the process takes an hour to finish,
222
00:07:34,720 --> 00:07:37,920
that token remains valid and usable for that entire hour,
223
00:07:37,920 --> 00:07:40,440
even if the script needs to call the Microsoft Graph
224
00:07:40,440 --> 00:07:41,680
50 different times.
225
00:07:41,680 --> 00:07:44,000
On the surface, this seems fine because it is efficient.
226
00:07:44,000 --> 00:07:45,960
There is no obvious reason to authenticate
227
00:07:45,960 --> 00:07:48,880
multiple times when you can just do it once at the start,
228
00:07:48,880 --> 00:07:50,040
but here is the paradox.
229
00:07:50,040 --> 00:07:52,360
You have essentially created a permanent access key
230
00:07:52,360 --> 00:07:53,520
to your entire tenant.
231
00:07:53,520 --> 00:07:54,680
The moment that script starts,
232
00:07:54,680 --> 00:07:56,400
it has broad access to everything
233
00:07:56,400 --> 00:07:57,920
it was granted permission to do,
234
00:07:57,920 --> 00:08:00,040
and it keeps that access until it finishes.
235
00:08:00,040 --> 00:08:01,880
If something goes wrong in the middle of the run,
236
00:08:01,880 --> 00:08:03,800
or if an error causes the script to behave
237
00:08:03,800 --> 00:08:05,200
in a way you did not intend,
238
00:08:05,200 --> 00:08:06,840
that token is still live.
239
00:08:06,840 --> 00:08:09,000
That is still full unrestricted access.
240
00:08:09,000 --> 00:08:11,520
Now consider what happens if that script gets compromised.
241
00:08:11,520 --> 00:08:13,680
If someone injects malicious code into your script
242
00:08:13,680 --> 00:08:15,360
or hacks your code repository,
243
00:08:15,360 --> 00:08:17,640
they can add a single line to steal your data.
244
00:08:17,640 --> 00:08:19,480
The attacker now owns that token,
245
00:08:19,480 --> 00:08:21,440
and they have access to your entire tenant
246
00:08:21,440 --> 00:08:23,080
for the rest of the token's life.
247
00:08:23,080 --> 00:08:25,600
By default, that gives them 60 minutes to do whatever they want
248
00:08:25,600 --> 00:08:28,000
inside your Microsoft 365 environment,
249
00:08:28,000 --> 00:08:29,160
and here is the kicker.
250
00:08:29,160 --> 00:08:31,120
In most companies, that service principal
251
00:08:31,120 --> 00:08:32,880
has incredibly broad permissions.
252
00:08:32,880 --> 00:08:35,520
It might have the power to reset passwords, manage groups,
253
00:08:35,520 --> 00:08:37,320
and read every mailbox in the organization
254
00:08:37,320 --> 00:08:38,200
all at the same time.
255
00:08:38,200 --> 00:08:40,040
One compromised script means an attacker
256
00:08:40,040 --> 00:08:42,640
gets all of those permissions at once across the whole tenant,
257
00:08:42,640 --> 00:08:44,280
which makes your automation service account
258
00:08:44,280 --> 00:08:46,440
a high-value target for any hacker.
259
00:08:46,440 --> 00:08:48,000
This is the standard practice today,
260
00:08:48,000 --> 00:08:49,760
and it usually works because the threats
261
00:08:49,760 --> 00:08:52,760
that actually exploit these gaps are still relatively rare.
262
00:08:52,760 --> 00:08:53,720
But this is a ticking clock,
263
00:08:53,720 --> 00:08:55,200
and as automation becomes more common,
264
00:08:55,200 --> 00:08:56,840
and more people have access to scripts,
265
00:08:56,840 --> 00:08:58,800
the surface area for an attack grows.
266
00:08:58,800 --> 00:09:00,480
Eventually something is going to break.
267
00:09:00,480 --> 00:09:02,680
The security model for 2026
268
00:09:02,680 --> 00:09:04,280
inverts this logic completely.
269
00:09:04,280 --> 00:09:07,040
Instead of using one permanent token with broad permissions,
270
00:09:07,040 --> 00:09:10,040
you get a unique token for every single individual task.
271
00:09:10,040 --> 00:09:12,920
The token is issued just in time, right when you need it,
272
00:09:12,920 --> 00:09:15,160
and it is scoped to do exactly one thing.
273
00:09:15,160 --> 00:09:17,960
It might allow you to reset a password for one specific user
274
00:09:17,960 --> 00:09:19,640
or get the settings for one mailbox,
275
00:09:19,640 --> 00:09:21,120
but it cannot do anything else.
276
00:09:21,120 --> 00:09:23,320
Once that specific task is finished,
277
00:09:23,320 --> 00:09:25,440
the token expires immediately.
278
00:09:25,440 --> 00:09:27,600
Think about how much this changes your risk profile.
279
00:09:27,600 --> 00:09:30,240
If a script is compromised, an attacker might get a token,
280
00:09:30,240 --> 00:09:33,000
but that token was only issued for one tiny operation.
281
00:09:33,000 --> 00:09:35,600
It cannot be used to look at every user in your directory,
282
00:09:35,600 --> 00:09:37,680
it cannot be used to read other mailboxes,
283
00:09:37,680 --> 00:09:40,680
and it definitely cannot be used to create new admin accounts.
284
00:09:40,680 --> 00:09:43,120
The blast radius shrinks from the entire tenant
285
00:09:43,120 --> 00:09:45,080
down to just one single task.
286
00:09:45,080 --> 00:09:46,880
The architectural shift here moves us away
287
00:09:46,880 --> 00:09:49,040
from permanent admin keys and toward a just-in-time
288
00:09:49,040 --> 00:09:51,160
identity scoped to a single operation.
289
00:09:51,160 --> 00:09:53,480
An external policy engine decides if an agent is
290
00:09:53,480 --> 00:09:56,760
allowed to do a task right now by checking the risk and the context.
291
00:09:56,760 --> 00:09:59,680
It issues a token that is only valid for 30 or 60 seconds,
292
00:09:59,680 --> 00:10:01,480
which is just long enough to get the job done
293
00:10:01,480 --> 00:10:03,320
before it is revoked automatically.
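A minimal sketch of that idea, assuming a hypothetical in-process `PolicyEngine` rather than any real Microsoft service: each token names one action on one target, lasts 60 seconds, and is spent after a single use. The class names, the allow-list shape, and the injected clock are all assumptions made for illustration.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskToken:
    value: str
    agent: str
    action: str        # e.g. "reset_password"
    target: str        # e.g. "jsmith"
    expires_at: float


class PolicyEngine:
    """Hypothetical policy engine: issues one short-lived token per task."""

    def __init__(self, allowed, ttl_seconds=60, clock=time.time):
        self.allowed = allowed            # set of (agent, action) pairs
        self.ttl = ttl_seconds
        self.clock = clock
        self._used = set()

    def issue(self, agent, action, target):
        if (agent, action) not in self.allowed:
            raise PermissionError(f"{agent} may not perform {action}")
        return TaskToken(secrets.token_hex(16), agent, action, target,
                         self.clock() + self.ttl)

    def authorize(self, token, action, target):
        """True only for the exact task, before expiry, and only once."""
        ok = (token.value not in self._used
              and token.action == action
              and token.target == target
              and self.clock() < token.expires_at)
        if ok:
            self._used.add(token.value)   # single use: spent after the task
        return ok


# Usage with a fake clock so the expiry check is repeatable.
now = [1000.0]
engine = PolicyEngine({("helpdesk-agent", "reset_password")},
                      ttl_seconds=60, clock=lambda: now[0])
token = engine.issue("helpdesk-agent", "reset_password", "jsmith")
wrong_target = engine.authorize(token, "reset_password", "adavis")
first_use = engine.authorize(token, "reset_password", "jsmith")
second_use = engine.authorize(token, "reset_password", "jsmith")

late = engine.issue("helpdesk-agent", "reset_password", "jsmith")
now[0] += 61                              # let the 60-second lifetime pass
expired = engine.authorize(late, "reset_password", "jsmith")
```

A stolen token here is useful for one password reset on one account, for at most a minute, which is the shrinking blast radius in code form.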
294
00:10:03,320 --> 00:10:05,040
This is vital for autonomous agents
295
00:10:05,040 --> 00:10:06,960
because those agents will be running constantly.
296
00:10:06,960 --> 00:10:10,120
They will be processing tickets and executing workflows all day long,
297
00:10:10,120 --> 00:10:13,040
and the old security model simply cannot scale to meet that demand.
298
00:10:13,040 --> 00:10:16,080
You cannot give an agent a permanent tenant-wide key
299
00:10:16,080 --> 00:10:18,480
and expect your environment to stay secure.
300
00:10:18,480 --> 00:10:21,680
Autonomous reasoning requires autonomous authentication,
301
00:10:21,680 --> 00:10:23,360
which means using a different identity
302
00:10:23,360 --> 00:10:25,880
for every single decision the system makes.
303
00:10:25,880 --> 00:10:28,200
Semantic Kernel as the reasoning layer.
304
00:10:28,200 --> 00:10:30,760
We need to talk about how you actually build this reasoning layer.
305
00:10:30,760 --> 00:10:32,560
This is where Semantic Kernel comes in,
306
00:10:32,560 --> 00:10:35,160
but I want to be clear about what this tool actually is
307
00:10:35,160 --> 00:10:36,680
because most people get it wrong.
308
00:10:36,680 --> 00:10:38,400
Semantic Kernel isn't just a wrapper,
309
00:10:38,400 --> 00:10:41,480
it isn't a library that makes it easier to call PowerShell from your code.
310
00:10:41,480 --> 00:10:42,880
It's something fundamentally different.
311
00:10:42,880 --> 00:10:47,320
It's an orchestration engine designed to teach large language models how to use tools.
312
00:10:47,320 --> 00:10:51,360
The kernel sits directly between the LLM and your PowerShell functions.
313
00:10:51,360 --> 00:10:54,800
It manages the conversation, it interprets what the model decides to do.
314
00:10:54,800 --> 00:10:56,400
It executes those decisions,
315
00:10:56,400 --> 00:10:58,160
and it feeds the results back to the model
316
00:10:58,160 --> 00:11:00,080
so the AI can decide what to do next.
317
00:11:00,080 --> 00:11:01,640
The core concept is simple.
318
00:11:01,640 --> 00:11:03,960
The kernel decides when to call PowerShell,
319
00:11:03,960 --> 00:11:05,440
not just how to call it.
320
00:11:05,440 --> 00:11:08,520
You describe your PowerShell functions to the kernel in plain language.
321
00:11:08,520 --> 00:11:09,960
You explain what each function does,
322
00:11:09,960 --> 00:11:12,920
what parameters it needs, and what kind of results it returns.
323
00:11:12,920 --> 00:11:16,800
The kernel then presents these functions to the LLM as a set of available tools.
324
00:11:16,800 --> 00:11:20,200
The model reads those descriptions, it understands the purpose.
325
00:11:20,200 --> 00:11:22,600
It reasons about when a specific tool should be used.
326
00:11:22,600 --> 00:11:24,240
Should I call this function now?
327
00:11:24,240 --> 00:11:25,680
Or should I call a different one?
328
00:11:25,680 --> 00:11:28,200
What parameters should I pass to get the right result?
329
00:11:28,200 --> 00:11:30,280
What will the output tell me about the next step?
330
00:11:30,280 --> 00:11:34,240
This is the shift from deterministic execution to adaptive reasoning.
331
00:11:34,240 --> 00:11:37,960
A traditional script says, "Do this, then do that, then do the other thing."
332
00:11:37,960 --> 00:11:40,920
The agent says, "Here is my goal, these are the tools I have,
333
00:11:40,920 --> 00:11:44,000
and here is what I should do first based on what I know right now."
334
00:11:44,000 --> 00:11:47,480
Semantic Kernel supports three main ways to integrate your tools.
335
00:11:47,480 --> 00:11:49,080
First, you have native plugins,
336
00:11:49,080 --> 00:11:51,800
where you write C# code and decorate methods with attributes
337
00:11:51,800 --> 00:11:53,680
to compile them into your agent.
338
00:11:53,680 --> 00:11:57,960
Second, you can use OpenAPI imports to point the kernel at a specification
339
00:11:57,960 --> 00:12:00,680
and automatically expose those endpoints as tools.
340
00:12:00,680 --> 00:12:02,760
Third, it supports MCP servers,
341
00:12:02,760 --> 00:12:07,360
which are Model Context Protocol servers that standardize how tools are shown to AI systems.
342
00:12:07,360 --> 00:12:11,880
For PowerShell specifically, the native plugin pattern is the most direct way to get started.
343
00:12:11,880 --> 00:12:15,800
You create a .NET class and write methods that call your PowerShell cmdlets.
344
00:12:15,800 --> 00:12:17,800
Each method should do one specific thing.
345
00:12:17,800 --> 00:12:20,440
Maybe one method retrieves a user from Microsoft Graph
346
00:12:20,440 --> 00:12:23,080
while another resets a password or checks mailbox permissions.
347
00:12:23,080 --> 00:12:25,640
You decorate each method with a KernelFunction attribute,
348
00:12:25,640 --> 00:12:27,400
so the kernel knows it's a tool.
349
00:12:27,400 --> 00:12:31,440
Then you add a description attribute that explains what the function does in plain English.
350
00:12:31,440 --> 00:12:34,640
When you register this class, the kernel parses every single description.
351
00:12:34,640 --> 00:12:36,240
It learns the purpose of every function.
352
00:12:36,240 --> 00:12:38,440
It understands the parameters you defined.
353
00:12:38,440 --> 00:12:41,840
It creates an internal map of the tools, the data they need,
354
00:12:41,840 --> 00:12:43,480
and the results they produce.
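The registration step can be sketched in plain Python. This is a stand-in for the idea, not the Semantic Kernel API: the `tool` decorator, the `TOOLS` registry, and both stub functions are hypothetical names invented for this example.

```python
import inspect

TOOLS = {}


def tool(description):
    """Register a function as a tool, recording its description and parameters."""
    def register(func):
        params = list(inspect.signature(func).parameters)
        TOOLS[func.__name__] = {
            "description": description,
            "parameters": params,
            "call": func,
        }
        return func
    return register


@tool("Resets a user's password and returns a confirmation message.")
def reset_password(username):
    return f"temporary password issued for {username}"    # stub, no real call


@tool("Retrieves a user's account record, including license status.")
def get_user(username):
    return {"username": username, "licensed": True}       # stub data


def tool_catalog():
    """The structured list that would be shown to the model with the question."""
    return [
        {"name": name, "description": t["description"], "parameters": t["parameters"]}
        for name, t in TOOLS.items()
    ]


catalog = tool_catalog()
```

The catalog holds only names, descriptions, and parameter names. That is all the model gets to reason with, so the wording of each description carries a lot of weight.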
355
00:12:43,480 --> 00:12:47,120
Now, when you ask the model to solve a problem, it doesn't just see your question.
356
00:12:47,120 --> 00:12:50,400
It sees your question alongside a structured list of available tools.
357
00:12:50,400 --> 00:12:52,640
It reasons about which tools are relevant to the problem.
358
00:12:52,640 --> 00:12:53,800
It builds a plan.
359
00:12:53,800 --> 00:12:58,120
The model tells the kernel, "I need to call the get user function with the username jsmith."
360
00:12:58,120 --> 00:12:59,440
The kernel runs the code.
361
00:12:59,440 --> 00:13:00,720
The model sees the result.
362
00:13:00,720 --> 00:13:02,600
Then it decides if the goal is finished,
363
00:13:02,600 --> 00:13:05,280
or if it needs to call another function to keep going.
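That call, observe, decide loop can be sketched as follows. The model here is a scripted Python function standing in for a real LLM, and the two tools return made-up data. `run_agent`, `scripted_model`, and the decision dictionary shape are assumptions for illustration only.

```python
def get_user(username):
    # Stub standing in for a directory lookup; the data is made up.
    return {"username": username, "licensed": False}


def assign_license(username):
    # Stub standing in for a license assignment.
    return {"username": username, "licensed": True}


TOOLS = {"get_user": get_user, "assign_license": assign_license}


def scripted_model(goal, history):
    """Stand-in for the LLM: picks the next step from what it has seen so far."""
    if not history:
        return {"tool": "get_user", "args": {"username": "jsmith"}}
    last = history[-1]["result"]
    if not last["licensed"]:
        return {"tool": "assign_license", "args": {"username": last["username"]}}
    return {"done": f"{last['username']} is licensed"}


def run_agent(goal, model, tools, max_steps=5):
    """Ask the model for a step, run it, feed the result back, repeat."""
    history = []
    for _ in range(max_steps):
        decision = model(goal, history)
        if "done" in decision:
            return decision["done"], history
        result = tools[decision["tool"]](**decision["args"])
        history.append({"tool": decision["tool"], "result": result})
    return "stopped: step limit reached", history


answer, steps = run_agent("Fix jsmith's access", scripted_model, TOOLS)
```

The sequence of calls is not written anywhere in `run_agent`. It comes out of the model's decisions, and the step limit is there so a confused model cannot loop forever.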
364
00:13:05,280 --> 00:13:07,240
This is why your descriptions matter so much.
365
00:13:07,240 --> 00:13:10,440
If you write a description that says "resets a user's password",
366
00:13:10,440 --> 00:13:12,680
the model understands exactly when to use it.
367
00:13:12,680 --> 00:13:15,480
But if your description is vague like "updates user account",
368
00:13:15,480 --> 00:13:16,680
the model gets confused.
369
00:13:16,680 --> 00:13:20,000
It won't know if that function is for passwords, licenses, or something else.
370
00:13:20,000 --> 00:13:21,920
And it might make the wrong choice.
371
00:13:21,920 --> 00:13:26,120
The kernel uses these descriptions to teach the model the "why" behind the tool.
372
00:13:26,120 --> 00:13:28,720
This allows the LLM to compose multi-step workflows
373
00:13:28,720 --> 00:13:31,080
instead of just running one command at a time.
374
00:13:31,080 --> 00:13:33,120
The model doesn't follow a fixed sequence.
375
00:13:33,120 --> 00:13:34,880
It reads a ticket, understands the goal,
376
00:13:34,880 --> 00:13:37,080
and chains tools together to reach that goal.
377
00:13:37,080 --> 00:13:38,760
If something unexpected happens,
378
00:13:38,760 --> 00:13:41,760
like a user missing or a permission failing, the model adapts.
379
00:13:41,760 --> 00:13:44,640
It tries a different approach because it can reason about what went wrong.
380
00:13:44,640 --> 00:13:45,520
This is orchestration.
381
00:13:45,520 --> 00:13:47,080
It's not just execution.
382
00:13:47,080 --> 00:13:49,240
Microsoft Graph as the data fabric.
383
00:13:49,240 --> 00:13:51,120
The kernel now knows how to orchestrate PowerShell,
384
00:13:51,120 --> 00:13:53,080
but PowerShell needs something to talk to.
385
00:13:53,080 --> 00:13:54,680
It needs access to data.
386
00:13:54,680 --> 00:13:56,560
That's where Microsoft Graph enters the picture.
387
00:13:56,560 --> 00:14:01,480
Microsoft Graph is the unified API layer for every Microsoft 365 workload.
388
00:14:01,480 --> 00:14:05,200
It covers mail, Teams, SharePoint, and user identities in Azure AD.
389
00:14:05,200 --> 00:14:09,520
It handles device management through Intune, Planner tasks, and OneNote notebooks.
390
00:14:09,520 --> 00:14:13,360
Instead of learning a dozen different APIs for a dozen different services, you go through Graph.
391
00:14:13,360 --> 00:14:16,680
One interface, one way to authenticate, one consistent data model.
392
00:14:16,680 --> 00:14:19,320
For autonomous agents, Graph is more than just an API.
393
00:14:19,320 --> 00:14:22,320
It's the source of truth for the entire state of your organization.
394
00:14:22,320 --> 00:14:25,480
When an agent needs to know something about your enterprise, it goes to Graph.
395
00:14:25,480 --> 00:14:26,600
Is this user licensed?
396
00:14:26,600 --> 00:14:27,880
Is this device compliant?
397
00:14:27,880 --> 00:14:31,120
What are the email forwarding rules on this specific mailbox?
398
00:14:31,120 --> 00:14:32,840
Graph knows the answer to all of those.
399
00:14:32,840 --> 00:14:34,240
This changes how you work.
400
00:14:34,240 --> 00:14:37,880
Right now, when you write a script, you manually craft every single query.
401
00:14:37,880 --> 00:14:39,240
You decide what information you need.
402
00:14:39,240 --> 00:14:42,360
You run a command like Get-MgUser, and you look at the properties.
403
00:14:42,360 --> 00:14:45,400
You see a value and decide whether to move to the next step.
404
00:14:45,400 --> 00:14:47,120
You read the error message if it fails.
405
00:14:47,120 --> 00:14:48,680
You interpret what that error means.
406
00:14:48,680 --> 00:14:52,720
The script is just a sequence of manual queries where you encoded every decision point
407
00:14:52,720 --> 00:14:53,720
up front.
408
00:14:53,720 --> 00:14:54,960
An agent doesn't work that way.
409
00:14:54,960 --> 00:14:59,720
The agent queries Graph, receives the data, and analyzes the results to decide the next move.
410
00:14:59,720 --> 00:15:04,080
It makes decisions based on what actually happened, not what you predicted would happen.
411
00:15:04,080 --> 00:15:07,680
If a mailbox is in a strange state you've never seen before, the agent doesn't panic.
412
00:15:07,680 --> 00:15:12,560
It reads the state, understands the implications, and reasons about what fix makes sense.
413
00:15:12,560 --> 00:15:16,200
It might try one approach, see it didn't work, and then try something else.
414
00:15:16,200 --> 00:15:18,680
The power of this is obvious in complex workflows.
415
00:15:18,680 --> 00:15:22,200
Suppose you're diagnosing why a user can't access a shared document.
416
00:15:22,200 --> 00:15:26,200
The problem could be the user, the device, the network, or the permissions.
417
00:15:26,200 --> 00:15:28,880
An autonomous agent works through this systematically.
418
00:15:28,880 --> 00:15:32,760
It checks the identity in Azure AD and the device compliance in Intune.
419
00:15:32,760 --> 00:15:36,680
It looks at the sharing permissions in SharePoint and checks for conditional access policies.
420
00:15:36,680 --> 00:15:39,720
Then it correlates all that data to build a hypothesis.
421
00:15:39,720 --> 00:15:42,040
This is where agents become force multipliers.
422
00:15:42,040 --> 00:15:45,800
A human doing this investigation might take an hour to click through all those screens.
423
00:15:45,800 --> 00:15:49,240
An agent does it in seconds because it can query all these systems at once without getting
424
00:15:49,240 --> 00:15:50,240
distracted.
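Querying several systems at once and then correlating the answers can be sketched like this. The four checks are stubs returning made-up results, where real ones would call the services just mentioned. `diagnose`, the check names, and the one-line hypothesis rule are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


# Stub checks with made-up results; none of these calls a real service.
def check_identity(user):
    return {"ok": True, "detail": "account enabled"}


def check_device(user):
    return {"ok": False, "detail": "device not compliant"}


def check_sharing(user):
    return {"ok": True, "detail": "user has read access"}


def check_conditional_access(user):
    return {"ok": True, "detail": "no blocking policy found"}


CHECKS = {
    "identity": check_identity,
    "device": check_device,
    "sharing": check_sharing,
    "conditional_access": check_conditional_access,
}


def diagnose(user):
    """Run every check concurrently, then summarize the failures."""
    with ThreadPoolExecutor(max_workers=len(CHECKS)) as pool:
        futures = {name: pool.submit(check, user) for name, check in CHECKS.items()}
        results = {name: future.result() for name, future in futures.items()}
    failures = [name for name, r in results.items() if not r["ok"]]
    if failures:
        hypothesis = "likely cause: " + ", ".join(failures)
    else:
        hypothesis = "no failing check found"
    return results, hypothesis


results, hypothesis = diagnose("jsmith")
```

In an agent, the correlation step would be done by the model reasoning over `results` rather than by a fixed rule. The fixed rule is here only to keep the sketch self-contained.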
425
00:15:50,240 --> 00:15:51,960
But we have to talk about security.
426
00:15:51,960 --> 00:15:55,840
Permissions must be scoped to the specific task, not the whole tenant. This goes back
427
00:15:55,840 --> 00:15:58,320
to our talk about just-in-time identities.
428
00:15:58,320 --> 00:16:01,520
An agent that resets passwords doesn't need to read your email.
429
00:16:01,520 --> 00:16:04,480
An agent that checks device compliance doesn't need to write to groups.
430
00:16:04,480 --> 00:16:07,960
Each agent gets only the Graph permissions it actually needs to do the job.
431
00:16:07,960 --> 00:16:09,840
The policy engine enforces this.
432
00:16:09,840 --> 00:16:14,040
When an agent requests a token for a task, that token only has the permissions required
433
00:16:14,040 --> 00:16:15,480
for that one moment.
434
00:16:15,480 --> 00:16:19,960
Graph gives agents access to the entire enterprise data fabric in a structured way, but that
435
00:16:19,960 --> 00:16:22,200
access is always controlled and auditable.
436
00:16:22,200 --> 00:16:24,600
The agent sees what it needs to see and nothing more.
437
00:16:24,600 --> 00:16:27,480
That is how you scale these operations safely.
438
00:16:27,480 --> 00:16:30,200
Headless authentication and identity governance.
439
00:16:30,200 --> 00:16:33,280
Headless means the agent runs without any user interaction at all.
440
00:16:33,280 --> 00:16:36,600
There is no human sitting at a keyboard and there is no browser window popping up to
441
00:16:36,600 --> 00:16:37,880
ask for MFA.
442
00:16:37,880 --> 00:16:41,760
The agent has to prove who it is and get access to do its work, but it does all of that
443
00:16:41,760 --> 00:16:45,040
automatically in the background, without anyone getting involved.
444
00:16:45,040 --> 00:16:48,880
This is fundamentally different from how you probably use Microsoft 365 today.
445
00:16:48,880 --> 00:16:51,080
When you open Outlook, you authenticate yourself.
446
00:16:51,080 --> 00:16:54,240
The browser might ask for your password or your Authenticator app and you respond to get
447
00:16:54,240 --> 00:16:55,240
logged in.
448
00:16:55,240 --> 00:16:56,800
That is called delegated authentication.
449
00:16:56,800 --> 00:17:00,560
It uses your identity, your credentials and your permissions because you are the actor
450
00:17:00,560 --> 00:17:02,480
and the system is responding to you.
451
00:17:02,480 --> 00:17:03,680
Agents cannot work that way.
452
00:17:03,680 --> 00:17:07,720
An agent that needed to prompt for credentials would just hang there waiting forever.
453
00:17:07,720 --> 00:17:11,760
It cannot interact with an MFA app and it cannot click approval buttons, so it has to
454
00:17:11,760 --> 00:17:14,040
authenticate in a way that is fully automated.
455
00:17:14,040 --> 00:17:15,400
This is app-only authentication.
456
00:17:15,400 --> 00:17:19,400
The agent has a credential, which is usually a certificate or a client secret stored somewhere
457
00:17:19,400 --> 00:17:24,840
secure and it uses that to request access tokens from Azure AD with no human involved.
458
00:17:24,840 --> 00:17:27,720
But this is where 2026 changes the model entirely.
459
00:17:27,720 --> 00:17:32,160
The traditional approach is to issue one credential to the service principal, store it somewhere,
460
00:17:32,160 --> 00:17:33,800
and use it for every single task.
461
00:17:33,800 --> 00:17:37,160
The agent runs and authenticates with that static credential to get a token that is valid
462
00:17:37,160 --> 00:17:38,160
for an hour or more.
463
00:17:38,160 --> 00:17:40,480
It uses that same token for everything it does.
464
00:17:40,480 --> 00:17:44,400
If someone needs to change what the agent can do, they have to update the credential storage,
465
00:17:44,400 --> 00:17:46,800
the app registration and redeploy the whole thing.
466
00:17:46,800 --> 00:17:50,520
The shift we are seeing now is toward just-in-time token issuance. Instead of one permanent
467
00:17:50,520 --> 00:17:54,600
credential that the agent uses to get its own tokens, the agent requests a token from
468
00:17:54,600 --> 00:17:57,080
a policy engine whenever it needs to do work.
469
00:17:57,080 --> 00:18:01,160
The policy engine looks at what the agent is trying to do and checks a specific policy
470
00:18:01,160 --> 00:18:05,320
to decide if this agent should be allowed to perform this specific task right now.
471
00:18:05,320 --> 00:18:09,480
If the answer is yes, it issues a token scoped exactly to that one task.
472
00:18:09,480 --> 00:18:11,800
That token might be valid for only 60 seconds.
473
00:18:11,800 --> 00:18:15,960
It can only be used to reset one user's password or read one mailbox or update one device's
474
00:18:15,960 --> 00:18:17,800
configuration and then it expires.
475
00:18:17,800 --> 00:18:19,320
This is the big architectural shift.
476
00:18:19,320 --> 00:18:23,680
We are moving from a permanent identity credential that unlocks everything to time-limited, task-specific
477
00:18:23,680 --> 00:18:25,840
tokens issued by a policy engine.
478
00:18:25,840 --> 00:18:28,520
The policy engine is the new component in the setup.
479
00:18:28,520 --> 00:18:32,360
It is an external service that understands your organization's risk posture, your security
480
00:18:32,360 --> 00:18:34,800
policies, and your operational context.
481
00:18:34,800 --> 00:18:38,320
When an agent wants to do something, it goes to the policy engine and the engine starts
482
00:18:38,320 --> 00:18:39,320
asking questions.
483
00:18:39,320 --> 00:18:42,720
Is this task allowed? Has this agent done this task before successfully?
484
00:18:42,720 --> 00:18:44,360
Is there unusual activity happening?
485
00:18:44,360 --> 00:18:48,000
Only if the answers to all these questions support the request, does the policy engine issue
486
00:18:48,000 --> 00:18:49,200
a token.
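The just-in-time model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the policy table, the `TaskToken` type, and `issue_token` are hypothetical names invented for this example, not part of any Microsoft service, and a real policy engine would weigh far richer risk signals than a single flag.

```python
import secrets
from dataclasses import dataclass
from typing import Optional

# Hypothetical policy table: which agent may perform which kind of task.
ALLOWED_TASKS = {"helpdesk-agent": {"reset_password", "read_mailbox"}}


@dataclass(frozen=True)
class TaskToken:
    value: str
    agent_id: str
    task: str
    target: str
    expires_at: float

    def permits(self, task: str, target: str, now: float) -> bool:
        # Valid only for the exact task and target it was issued for, until expiry.
        return self.task == task and self.target == target and now < self.expires_at


def issue_token(agent_id: str, task: str, target: str, unusual_activity: bool,
                now: float, ttl_seconds: int = 60) -> Optional[TaskToken]:
    """Issue a short-lived, single-task token only if every policy check passes."""
    if task not in ALLOWED_TASKS.get(agent_id, set()):
        return None  # the task is not allowed for this agent
    if unusual_activity:
        return None  # a risk signal blocks issuance
    return TaskToken(secrets.token_urlsafe(16), agent_id, task, target, now + ttl_seconds)
```

A token issued for resetting one user's password is useless for any other user or task, and it stops working after the 60-second window passes.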
487
00:18:49,200 --> 00:18:51,120
Managed identity is another piece of this puzzle.
488
00:18:51,120 --> 00:18:55,320
If your agent runs inside Azure, like in a function app or a container, that agent can
489
00:18:55,320 --> 00:18:57,640
use Azure's managed identity system.
490
00:18:57,640 --> 00:18:59,680
The agent does not store any credentials at all.
491
00:18:59,680 --> 00:19:03,680
It simply proves that it is running in that specific Azure resource and Azure issues
492
00:19:03,680 --> 00:19:05,280
it a token automatically.
493
00:19:05,280 --> 00:19:08,560
You configure the managed identity with exactly the permissions it needs.
494
00:19:08,560 --> 00:19:12,280
And when the credentials need to rotate, Azure handles that in the background.
495
00:19:12,280 --> 00:19:16,000
You are not manually rotating secrets or storing credentials in code because the system
496
00:19:16,000 --> 00:19:17,960
handles identity automatically.
497
00:19:17,960 --> 00:19:20,440
But there is one more layer to consider.
498
00:19:20,440 --> 00:19:25,080
Continuous access evaluation, or CAE, is how Microsoft revokes access immediately if something
499
00:19:25,080 --> 00:19:26,080
changes.
500
00:19:26,080 --> 00:19:30,600
Normally, a token issued at 2pm stays valid until 3pm, even if something bad happens at
501
00:19:30,600 --> 00:19:31,600
2.30pm.
502
00:19:31,600 --> 00:19:32,920
CAE changes that logic.
503
00:19:32,920 --> 00:19:36,880
If a risk signal triggers during your token's lifetime, like an impossible travel alert
504
00:19:36,880 --> 00:19:40,280
or a policy violation, CAE revokes your token immediately.
505
00:19:40,280 --> 00:19:42,000
Your access stops within seconds.
506
00:19:42,000 --> 00:19:44,160
For autonomous agents, this is absolutely essential.
507
00:19:44,160 --> 00:19:48,600
An agent that has been compromised or starts acting abnormally should lose its access immediately
508
00:19:48,600 --> 00:19:51,080
rather than waiting for the next token refresh.
509
00:19:51,080 --> 00:19:55,160
The continuous evaluation of risk means the policy engine can revoke a token and prevent
510
00:19:55,160 --> 00:19:59,000
further damage before the compromised agent even realises what is happening.
511
00:19:59,000 --> 00:20:04,000
This is why autonomous agents actually need tighter identity boundaries than humans do.
512
00:20:04,000 --> 00:20:05,000
Humans have judgment.
513
00:20:05,000 --> 00:20:08,800
You can see a prompt that looks suspicious and choose to ignore it, but agents do not have
514
00:20:08,800 --> 00:20:09,800
that ability.
515
00:20:09,800 --> 00:20:14,080
An agent just follows its logic, so the system has to make sure the agent's identity gives
516
00:20:14,080 --> 00:20:17,640
it access to do exactly what it is supposed to do and nothing else.
517
00:20:17,640 --> 00:20:21,600
Just-in-time tokens, continuous evaluation, and immediate revocation are the pillars of
518
00:20:21,600 --> 00:20:25,400
an identity architecture that supports secure operation at scale.
519
00:20:25,400 --> 00:20:27,680
Wrapping PowerShell as Semantic Kernel plugins.
520
00:20:27,680 --> 00:20:31,600
Now let's talk about the mechanics of actually connecting PowerShell to Semantic Kernel.
521
00:20:31,600 --> 00:20:34,840
This is the moment where the reasoning layer meets the execution layer and it is
522
00:20:34,840 --> 00:20:37,240
actually much simpler than you might expect.
523
00:20:37,240 --> 00:20:39,400
The process follows three basic steps.
524
00:20:39,400 --> 00:20:44,480
First, you identify which PowerShell cmdlets your agent actually needs to function.
525
00:20:44,480 --> 00:20:47,680
You are not wrapping every command in the library because you want to be deliberate.
526
00:20:47,680 --> 00:20:52,320
If your agent handles password resets, you would need Get-MgUser to find the user and
527
00:20:52,320 --> 00:20:56,320
Reset-MgUserAuthenticationMethodPassword to change the password.
528
00:20:56,320 --> 00:20:59,520
You would also need Send-MgUserMail to notify them of the change.
529
00:20:59,520 --> 00:21:03,480
That is your cmdlet list, which might only be a dozen functions total.
530
00:21:03,480 --> 00:21:06,320
Second, you create a .NET class that wraps these cmdlets.
531
00:21:06,320 --> 00:21:10,440
This is straightforward C# code where each method in the class calls one or more PowerShell
532
00:21:10,440 --> 00:21:11,440
cmdlets.
533
00:21:11,440 --> 00:21:16,200
A method named FindUserByEmail calls Get-MgUser and a method named ResetUserPassword
534
00:21:16,200 --> 00:21:18,040
calls the reset cmdlet.
535
00:21:18,040 --> 00:21:20,720
Each method does one thing and performs one clear action.
536
00:21:20,720 --> 00:21:24,040
Third, you decorate these methods with two specific attributes.
537
00:21:24,040 --> 00:21:28,040
The KernelFunction attribute tells Semantic Kernel that this method is a callable tool.
538
00:21:28,040 --> 00:21:31,320
The Description attribute contains a plain English explanation of what the function
539
00:21:31,320 --> 00:21:32,320
does.
540
00:21:32,320 --> 00:21:34,840
It is the critical part that most people underestimate.
541
00:21:34,840 --> 00:21:37,320
Here is why those descriptions matter so much.
542
00:21:37,320 --> 00:21:41,080
When you register your plugin with the kernel, the kernel does not actually execute your
543
00:21:41,080 --> 00:21:42,080
code yet.
544
00:21:42,080 --> 00:21:45,400
It reads those descriptions and builds an internal model of what is available.
545
00:21:45,400 --> 00:21:50,160
It notes that there is a function called ResetUserPassword that needs a user ID and a temporary
546
00:21:50,160 --> 00:21:51,160
password.
547
00:21:51,160 --> 00:21:54,840
The kernel communicates this to the LLM and the model reads these descriptions to learn
548
00:21:54,840 --> 00:21:55,840
how to use them.
549
00:21:55,840 --> 00:21:59,920
Now when you ask the agent to solve a problem, the model has all this information ready.
550
00:21:59,920 --> 00:22:03,520
It knows what each function does and understands when each function is relevant to the task
551
00:22:03,520 --> 00:22:04,520
at hand.
552
00:22:04,520 --> 00:22:07,280
It reasons about whether to use that function based on the goal.
553
00:22:07,280 --> 00:22:11,960
This is why a description like "Resets a user's password" is completely different from something vague
554
00:22:11,960 --> 00:22:14,760
like "Modifies user account settings".
555
00:22:14,760 --> 00:22:18,560
The first one tells the model exactly what the function is for while the second one leaves
556
00:22:18,560 --> 00:22:19,840
the model guessing.
557
00:22:19,840 --> 00:22:22,880
You should also document the side effects and the prerequisites.
558
00:22:22,880 --> 00:22:26,800
If your reset function requires the user to have a mailbox, you need to say that.
559
00:22:26,800 --> 00:22:30,600
If it sends a notification automatically or logs the action to an audit system, that is
560
00:22:30,600 --> 00:22:32,240
important context for the agent.
561
00:22:32,240 --> 00:22:35,920
The model needs to understand not just what the function does, but what happens as a result
562
00:22:35,920 --> 00:22:36,920
of calling it.
563
00:22:36,920 --> 00:22:39,560
If it is a high-risk operation, you should say that explicitly.
564
00:22:39,560 --> 00:22:42,440
A good description for a password reset function might look like this:
565
00:22:42,440 --> 00:22:46,320
It resets a user's password to a temporary value and sends that new password to their
566
00:22:46,320 --> 00:22:47,720
registered email address.
567
00:22:47,720 --> 00:22:51,440
It requires that the user exists in Azure AD and has an active mailbox.
568
00:22:51,440 --> 00:22:55,280
It also triggers an audit log entry and returns true if successful or false if the mailbox
569
00:22:55,280 --> 00:22:56,280
is unreachable.
570
00:22:56,280 --> 00:22:57,280
That is specific.
571
00:22:57,280 --> 00:23:01,040
The model reads that and knows exactly what the function does, what could go wrong and
572
00:23:01,040 --> 00:23:02,720
what the result will be.
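To make the role of descriptions concrete, here is a minimal Python sketch of a tool registry. It is not Semantic Kernel's API (the transcript describes C# attributes for that); the `tool` decorator, `TOOL_REGISTRY`, and `describe_tools` are hypothetical names used only to show how a description and a signature could be collected and handed to a model before any code runs.

```python
import inspect
from typing import Callable

# Hypothetical registry: tool name -> metadata the model gets to read.
TOOL_REGISTRY: dict = {}


def tool(description: str) -> Callable:
    """Register a function together with its description and parameter types."""
    def register(func: Callable) -> Callable:
        params = {
            name: getattr(p.annotation, "__name__", str(p.annotation))
            for name, p in inspect.signature(func).parameters.items()
        }
        TOOL_REGISTRY[func.__name__] = {
            "description": description,
            "parameters": params,
            "callable": func,
        }
        return func
    return register


@tool(
    "Resets a user's password to a temporary value and sends the new password "
    "to their registered email address. Requires that the user exists in Azure AD "
    "and has an active mailbox. Triggers an audit log entry. Returns True if "
    "successful or False if the mailbox is unreachable."
)
def reset_user_password(user_id: str, temporary_password: str) -> bool:
    # Placeholder body: a real wrapper would invoke the PowerShell cmdlet here.
    return True


def describe_tools() -> list:
    """Return only what the model sees: names, descriptions, parameters."""
    return [
        {"name": name, "description": meta["description"], "parameters": meta["parameters"]}
        for name, meta in TOOL_REGISTRY.items()
    ]
```

Registering the function does not execute it. `describe_tools()` returns the text the model reasons over, which is why the wording of the description carries so much weight.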
573
00:23:02,720 --> 00:23:06,600
When the model encounters a ticket about someone unable to access their account, it sees
574
00:23:06,600 --> 00:23:10,280
this function description and realises it is the right tool for the job.
575
00:23:10,280 --> 00:23:14,320
The kernel parses these descriptions and uses them to teach the model how to reason about
576
00:23:14,320 --> 00:23:15,320
your tools.
577
00:23:15,320 --> 00:23:17,800
It is not magic, but rather a form of structured learning.
578
00:23:17,800 --> 00:23:21,720
The model studies the descriptions, the parameters and the expected outputs to learn patterns
579
00:23:21,720 --> 00:23:24,120
about when each tool is actually valuable.
580
00:23:24,120 --> 00:23:27,840
This is why good descriptions are the difference between an agent that makes smart decisions
581
00:23:27,840 --> 00:23:29,400
and one that makes random ones.
582
00:23:29,400 --> 00:23:33,680
An agent that understands the purpose and the implications of each tool can chain them together
583
00:23:33,680 --> 00:23:34,880
intelligently.
584
00:23:34,880 --> 00:23:38,440
An agent given vague descriptions will just stumble around trying different functions at
585
00:23:38,440 --> 00:23:40,200
random until something works.
586
00:23:40,200 --> 00:23:44,080
So when you wrap your PowerShell cmdlets, you should treat the descriptions like documentation
587
00:23:44,080 --> 00:23:46,040
written specifically for the AI.
588
00:23:46,040 --> 00:23:47,520
Be explicit and be specific.
589
00:23:47,520 --> 00:23:51,640
Describe not just what the function does, but why someone would want to use it and what
590
00:23:51,640 --> 00:23:53,760
the final result actually means.
591
00:23:53,760 --> 00:23:58,840
This is how static PowerShell cmdlets finally become intelligent, reasoning agents.
592
00:23:58,840 --> 00:24:00,840
Designing function signatures for reasoning.
593
00:24:00,840 --> 00:24:03,960
Now that you understand the description layer, we need to talk about the actual shape
594
00:24:03,960 --> 00:24:05,520
of your functions.
595
00:24:05,520 --> 00:24:08,640
Descriptions alone aren't enough to guide an agent because the structure of your functions,
596
00:24:08,640 --> 00:24:13,360
how you define parameters and what you return all dictate how well an LLM can reason about
597
00:24:13,360 --> 00:24:14,360
them.
598
00:24:14,360 --> 00:24:17,280
The core principle here is single-purpose design.
599
00:24:17,280 --> 00:24:21,800
One function should do exactly one thing, which means one clear and specific action.
600
00:24:21,800 --> 00:24:25,080
This isn't just a suggestion for clean code, it's how agents think.
601
00:24:25,080 --> 00:24:29,640
When you build functions for an agent to call, every parameter you add increases the cognitive
602
00:24:29,640 --> 00:24:30,920
load on the model.
603
00:24:30,920 --> 00:24:35,240
Every possible return state creates a new layer of ambiguity that the agent has to navigate.
604
00:24:35,240 --> 00:24:38,680
The narrower and more focused your function is, the clearer the model's decision making
605
00:24:38,680 --> 00:24:39,680
becomes.
606
00:24:39,680 --> 00:24:43,560
A function that takes 15 parameters with 10 optional branches is a puzzle that forces
607
00:24:43,560 --> 00:24:47,640
the model to reason about parameter combinations that might not even make sense together.
608
00:24:47,640 --> 00:24:52,280
On the other hand, a function that takes two required parameters and returns one simple result
609
00:24:52,280 --> 00:24:53,280
is transparent.
610
00:24:53,280 --> 00:24:56,760
So the model understands exactly what happens when it triggers that call.
611
00:24:56,760 --> 00:24:58,960
Let's look at a concrete example of bad design.
612
00:24:58,960 --> 00:25:03,160
You might write a function that takes a JSON blob as input to handle multiple tasks.
613
00:25:03,160 --> 00:25:07,680
This JSON could contain a user ID to reset a password, a group ID to add a member, or even
614
00:25:07,680 --> 00:25:10,120
a device ID to trigger compliance checks.
615
00:25:10,120 --> 00:25:14,400
The function reads the JSON, figures out what action to perform, and then executes it.
616
00:25:14,400 --> 00:25:18,600
While this saves you from code duplication and feels efficient to write, it's actually terrible
617
00:25:18,600 --> 00:25:19,840
for agent reasoning.
618
00:25:19,840 --> 00:25:23,400
The model sees this function and starts guessing what should go in the JSON or what might
619
00:25:23,400 --> 00:25:24,400
come out of it.
620
00:25:24,400 --> 00:25:28,360
Because the input is ambiguous, the model might construct a JSON object that doesn't make
621
00:25:28,360 --> 00:25:30,440
sense for the operation it's trying to perform.
622
00:25:30,440 --> 00:25:34,960
It might pass fields that the function doesn't use, or it might miss required fields entirely.
623
00:25:34,960 --> 00:25:38,720
You've essentially made the model's job harder by trying to make your own code simpler.
624
00:25:38,720 --> 00:25:41,480
Good design separates these into three distinct functions.
625
00:25:41,480 --> 00:25:46,120
One function finds a user by email, another resets that user's password, and a third notifies
626
00:25:46,120 --> 00:25:47,320
them of the change.
627
00:25:47,320 --> 00:25:51,400
Each function has a clear purpose and takes only the parameters it actually needs to function.
628
00:25:51,400 --> 00:25:55,040
This ensures each function returns exactly what the model needs to know to decide what
629
00:25:55,040 --> 00:25:56,440
to do next.
630
00:25:56,440 --> 00:25:59,200
Parameters should always be explicit and strongly typed.
631
00:25:59,200 --> 00:26:01,880
Don't accept generic strings that could mean multiple things.
632
00:26:01,880 --> 00:26:06,360
If you need a user ID, you should accept a GUID, and if you need a temporary password,
633
00:26:06,360 --> 00:26:07,840
you should accept a secure string.
634
00:26:07,840 --> 00:26:11,360
If you need a mailbox address, you should accept an email address and validate the format.
635
00:26:11,360 --> 00:26:15,480
Strong typing prevents the model from passing invalid data, and it tells the model exactly
636
00:26:15,480 --> 00:26:17,520
what kind of data the function expects.
637
00:26:17,520 --> 00:26:20,800
The model reasons better with narrow, well-defined tools because it doesn't have to figure
638
00:26:20,800 --> 00:26:22,400
out whether its input is valid.
639
00:26:22,400 --> 00:26:26,120
It doesn't have to guess what the function will do because the contract is clear.
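As a rough illustration of narrow, strongly typed tools, the sketch below splits the work into three small Python functions. Everything here is an assumption made for the example: the in-memory `_DIRECTORY`, the `EmailAddress` wrapper, and the function names are hypothetical stand-ins for the real directory and cmdlet calls, and Python has no direct equivalent of a secure string, so the temporary password parameter is left out.

```python
import re
from dataclasses import dataclass
from typing import Optional
from uuid import UUID

_EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


@dataclass(frozen=True)
class EmailAddress:
    """An email address that is format-checked at construction time."""
    value: str

    def __post_init__(self) -> None:
        if not _EMAIL_PATTERN.match(self.value):
            raise ValueError(f"not a valid email address: {self.value!r}")


# Hypothetical in-memory directory standing in for the real user store.
_DIRECTORY = {"jsmith@company.com": UUID("00000000-0000-0000-0000-000000000123")}


def find_user_by_email(email: EmailAddress) -> Optional[UUID]:
    """Look up one user. Returns the user ID, or None if no such user exists."""
    return _DIRECTORY.get(email.value.lower())


def reset_user_password(user_id: UUID) -> bool:
    """Reset one user's password. Returns True only for a known user ID."""
    return user_id in _DIRECTORY.values()


def notify_user(email: EmailAddress, message: str) -> bool:
    """Send one notification. Returns True if there was something to send."""
    return bool(message)
```

Because each function accepts a `UUID` or an `EmailAddress` rather than a free-form string, malformed input is rejected before any action is attempted.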
640
00:26:26,120 --> 00:26:27,840
Return values should also be structured.
641
00:26:27,840 --> 00:26:31,200
Don't just return a string, return a JSON object instead.
642
00:26:31,200 --> 00:26:35,640
If you're checking whether a user exists, return a structured object like
643
00:26:35,640 --> 00:26:43,360
{"exists": true, "userId": "ABC123", "displayName": "John Smith"} rather than a simple true
644
00:26:43,360 --> 00:26:44,760
or false.
645
00:26:44,760 --> 00:26:49,960
If you're attempting a password reset, return {"success": true, "temporaryPassword": null,
646
00:26:49,960 --> 00:26:53,920
"notificationSent": true, "errors": []}
647
00:26:53,920 --> 00:26:58,000
instead of just the word "success". This matters because the agent needs to parse the result
648
00:26:58,000 --> 00:26:59,640
and decide on the next move.
649
00:26:59,640 --> 00:27:02,800
If you return a plain string, the model has to interpret what that string means, which
650
00:27:02,800 --> 00:27:06,000
leads to questions about whether it's an error or a success.
651
00:27:06,000 --> 00:27:10,400
If you return a structured object with explicit fields for success, status and errors encountered,
652
00:27:10,400 --> 00:27:12,160
the model knows exactly what happened.
653
00:27:12,160 --> 00:27:14,360
It can reason clearly about what to do next.
654
00:27:14,360 --> 00:27:16,840
Structured returns also make your logging much easier.
655
00:27:16,840 --> 00:27:20,600
You can serialize the return object directly to create an audit trail that shows exactly
656
00:27:20,600 --> 00:27:22,120
what each function returned.
657
00:27:22,120 --> 00:27:25,840
This allows you to analyze patterns in agent behavior because the data remains consistent
658
00:27:25,840 --> 00:27:27,320
across every execution.
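A structured return value and its audit trail might look like the following Python sketch. The `PasswordResetResult` type and `to_audit_line` helper are hypothetical names chosen for this example; the fields mirror the ones mentioned above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class PasswordResetResult:
    success: bool
    temporary_password: Optional[str]
    notification_sent: bool
    errors: list = field(default_factory=list)


def to_audit_line(function_name: str, result: PasswordResetResult) -> str:
    """Serialize a result object into one consistent JSON audit record."""
    return json.dumps({"function": function_name, "result": asdict(result)}, sort_keys=True)
```

The same object serves both purposes: the agent reads explicit fields such as `success` and `errors`, and the audit log receives the identical data without any string parsing.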
659
00:27:27,320 --> 00:27:30,360
Think about function design as teaching the model a new language.
660
00:27:30,360 --> 00:27:34,520
Each function is a word, the signature is the definition of that word, and the parameters
661
00:27:34,520 --> 00:27:36,320
are what the word needs to work.
662
00:27:36,320 --> 00:27:38,200
Return values are what the word produces.
663
00:27:38,200 --> 00:27:42,080
If your words are vague and overloaded, the sentences won't make sense, but if your words
664
00:27:42,080 --> 00:27:45,200
are precise and single-purpose, thoughts flow clearly.
665
00:27:45,200 --> 00:27:49,120
An agent that calls a dozen narrow, well-defined functions will reason circles around an agent
666
00:27:49,120 --> 00:27:51,400
that calls three kitchen sink functions.
667
00:27:51,400 --> 00:27:55,240
Every time you remove complexity from a function, you make the agent smarter.
668
00:27:55,240 --> 00:27:57,320
The plan-execute-refine loop.
669
00:27:57,320 --> 00:27:58,840
This is where everything changes.
670
00:27:58,840 --> 00:28:01,800
Up until now, we've been talking about the building blocks like the reasoning engine,
671
00:28:01,800 --> 00:28:03,800
the data access, and the function design.
672
00:28:03,800 --> 00:28:07,400
Now we're going to talk about the actual workflow that ties it all together, and it's
673
00:28:07,400 --> 00:28:09,800
radically different from how scripts work.
674
00:28:09,800 --> 00:28:12,200
Standard scripts work in a linear deterministic way.
675
00:28:12,200 --> 00:28:15,640
They go from step one to step two to step three, and then they're done.
676
00:28:15,640 --> 00:28:17,560
They execute a predetermined sequence.
677
00:28:17,560 --> 00:28:20,960
And if an unexpected condition occurs, they either handle it with predefined logic or
678
00:28:20,960 --> 00:28:21,960
they fail.
679
00:28:21,960 --> 00:28:25,960
There's no adaptation and no recovery path except for what you explicitly programmed into
680
00:28:25,960 --> 00:28:27,200
the code.
681
00:28:27,200 --> 00:28:31,280
An agent works differently by following a cycle of thinking, acting, evaluating, and adjusting.
682
00:28:31,280 --> 00:28:35,080
The agent reads the ticket and builds a plan, then it decides what steps are needed based
683
00:28:35,080 --> 00:28:36,560
on the current context.
684
00:28:36,560 --> 00:28:40,440
After it executes those steps, it looks at what happened to see if the plan worked.
685
00:28:40,440 --> 00:28:43,800
If the goal wasn't achieved, the agent asks what went wrong and what that means for the
686
00:28:43,800 --> 00:28:44,960
next attempt.
687
00:28:44,960 --> 00:28:47,640
Then it goes back to planning and tries a different approach.
688
00:28:47,640 --> 00:28:51,480
This is the core agent workflow, plan, execute, and refine.
689
00:28:51,480 --> 00:28:55,400
The key insight is that all three stages happen inside the agent's reasoning loop.
690
00:28:55,400 --> 00:28:58,680
The agent isn't just following a script you wrote, it's building a mental model of the
691
00:28:58,680 --> 00:29:02,120
problem, and updating that model based on what it learned.
692
00:29:02,120 --> 00:29:03,680
Step one is the planning phase.
693
00:29:03,680 --> 00:29:07,480
The agent reads the incoming ticket and extracts the key information to understand what the
694
00:29:07,480 --> 00:29:08,760
user is trying to accomplish.
695
00:29:08,760 --> 00:29:12,720
It looks at its available tools, which are all those PowerShell functions you wrapped, and
696
00:29:12,720 --> 00:29:14,160
builds a sequence of actions.
697
00:29:14,160 --> 00:29:16,600
But it doesn't execute yet because it needs to think first.
698
00:29:16,600 --> 00:29:20,720
It reasons about the order, asking if it makes sense to check if the user exists before
699
00:29:20,720 --> 00:29:22,280
resetting their password.
700
00:29:22,280 --> 00:29:26,280
The agent realizes it shouldn't notify the user before confirming the reset succeeded,
701
00:29:26,280 --> 00:29:28,160
so it builds a plan based on that logic.
702
00:29:28,160 --> 00:29:29,480
Step two is execution.
703
00:29:29,480 --> 00:29:33,080
Now the agent actually calls those PowerShell functions in the sequence it planned.
704
00:29:33,080 --> 00:29:36,760
It passes the right parameters and gets back structured results.
705
00:29:36,760 --> 00:29:39,520
Step three is refinement, and this is where it gets interesting.
706
00:29:39,520 --> 00:29:42,760
The agent looks at what came back to see if the first function succeeded.
707
00:29:42,760 --> 00:29:46,920
If it did, the agent confirms that and moves to the next step in the plan.
708
00:29:46,920 --> 00:29:50,360
But what if the function returned "exists": false?
709
00:29:50,360 --> 00:29:52,920
It wasn't in the plan, but the agent doesn't panic.
710
00:29:52,920 --> 00:29:54,600
It recalculates.
711
00:29:54,600 --> 00:29:58,440
If the user doesn't exist, the agent might assume the email address in the ticket is wrong
712
00:29:58,440 --> 00:30:00,040
or contains a typo.
713
00:30:00,040 --> 00:30:04,160
It might decide to search for users with a similar name, or it might ask the original requester
714
00:30:04,160 --> 00:30:05,640
for clarification.
715
00:30:05,640 --> 00:30:09,240
The plan adapts based on what actually happened in the environment.
716
00:30:09,240 --> 00:30:11,960
This is not linear execution, it's adaptive reasoning.
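The loop itself can be sketched in Python as follows. This is a simplified illustration, not how any particular framework implements it: `simple_planner` is a hard-coded stand-in for the model's reasoning, and `DIRECTORY`, `find_user`, and `search_users_by_name` are hypothetical tools invented for the example.

```python
from typing import Callable

# Hypothetical directory and tools standing in for wrapped PowerShell functions.
DIRECTORY = {"jsmith@company.com": "John Smith"}


def find_user(email: str) -> dict:
    exists = email in DIRECTORY
    return {"ok": exists, "exists": exists}


def search_users_by_name(name: str) -> dict:
    matches = [mail for mail, full_name in DIRECTORY.items() if full_name == name]
    return {"ok": bool(matches), "matches": matches}


TOOLS = {"find_user": find_user, "search_users_by_name": search_users_by_name}


def simple_planner(goal: dict, history: list) -> list:
    """Stand-in for the model: choose the next steps from what has happened so far."""
    if not history:
        return [("find_user", {"email": goal["email"]})]
    last_tool, _, last_result = history[-1]
    if last_tool == "find_user" and not last_result["ok"]:
        # The email may contain a typo, so fall back to a name search.
        return [("search_users_by_name", {"name": goal["name"]})]
    return []  # nothing left to try


def run_agent(goal: dict, plan_fn: Callable, tools: dict, max_rounds: int = 3) -> dict:
    history = []
    for _ in range(max_rounds):
        plan = plan_fn(goal, history)                 # plan
        if not plan:
            break
        for tool_name, args in plan:                  # execute
            result = tools[tool_name](**args)
            history.append((tool_name, args, result))
            if not result.get("ok", False):           # refine: stop and replan
                break
        else:
            return {"done": True, "history": history}
    return {"done": False, "history": history}
```

Given a ticket with a mistyped address, the first plan fails at `find_user`, the planner sees that result in the history, and the second round searches by name instead.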
717
00:30:11,960 --> 00:30:16,040
This matters because in real enterprise work, things rarely go exactly as planned.
718
00:30:16,040 --> 00:30:20,080
A user might have a weird permission state you never anticipated, or a PowerShell function
719
00:30:20,080 --> 00:30:22,480
might return an error you didn't write logic to handle.
720
00:30:22,480 --> 00:30:25,520
Sometimes a prerequisite isn't met or an API call times out.
721
00:30:25,520 --> 00:30:27,760
Scripts fail on these issues, but an agent adapts.
722
00:30:27,760 --> 00:30:29,200
Here's a concrete example.
723
00:30:29,200 --> 00:30:32,560
An agent receives a ticket saying, "I can't log into my laptop."
724
00:30:32,560 --> 00:30:35,840
The agent plans the diagnosis by getting the device from Azure AD,
725
00:30:35,840 --> 00:30:38,480
checking compliance, and verifying the user's license.
726
00:30:38,480 --> 00:30:39,640
The plan is reasonable.
727
00:30:39,640 --> 00:30:43,560
It executes and calls the device lookup function, which works perfectly.
728
00:30:43,560 --> 00:30:46,720
Then it calls the compliance check, but the device is non-compliant.
729
00:30:46,720 --> 00:30:49,280
The plan was based on the assumption that the device was compliant
730
00:30:49,280 --> 00:30:50,720
and that assumption just broke.
731
00:30:50,720 --> 00:30:52,320
The agent recalculates.
732
00:30:52,320 --> 00:30:56,160
If the device is non-compliant, the issue might not be about licensing,
733
00:30:56,160 --> 00:30:58,960
it might be that conditional access is blocking the login.
734
00:30:58,960 --> 00:31:03,440
The agent loops back to planning and creates a new plan to check the conditional access policies.
735
00:31:03,440 --> 00:31:06,960
It executes the new plan and evaluates the result: success.
736
00:31:06,960 --> 00:31:09,960
The agent found the specific policy that's blocking the login.
737
00:31:09,960 --> 00:31:11,360
Now it needs to decide what to do.
738
00:31:11,360 --> 00:31:15,600
It probably shouldn't fix this autonomously because changing a conditional access policy is high risk.
739
00:31:15,600 --> 00:31:18,320
This is a decision point where the agent needs human judgment,
740
00:31:18,320 --> 00:31:21,200
so it builds a structured proposal instead of executing a fix.
741
00:31:21,200 --> 00:31:24,720
It tells an IT manager what the issue is and asks if it should apply the fix.
742
00:31:24,720 --> 00:31:28,080
The manager approves it in Teams and then the agent executes.
743
00:31:28,080 --> 00:31:30,880
This is the plan-execute-refine loop in action.
744
00:31:30,880 --> 00:31:33,200
The agent isn't following a predetermined script.
745
00:31:33,200 --> 00:31:36,160
It's building a mental model and adjusting based on what it learns.
746
00:31:36,160 --> 00:31:39,440
When things go sideways, it doesn't fail. It adapts.
747
00:31:39,440 --> 00:31:41,280
Orchestrating multi-step workflows.
748
00:31:41,280 --> 00:31:44,000
Now we need to look at what actually makes this useful for real work,
749
00:31:44,000 --> 00:31:47,200
because the reality is that IT tickets are almost never simple.
750
00:31:47,200 --> 00:31:49,920
They don't just show up as a clean request to reset a password.
751
00:31:49,920 --> 00:31:53,280
Instead, they arrive as a messy tangle of interconnected problems
752
00:31:53,280 --> 00:31:55,200
that touch five different systems at once.
753
00:31:55,200 --> 00:31:58,240
Take a common ticket like "user can't access Teams."
754
00:31:58,240 --> 00:32:00,080
On the surface, that sounds like a single problem,
755
00:32:00,080 --> 00:32:01,760
but in reality, it's a dozen.
756
00:32:01,760 --> 00:32:03,360
To figure out what's actually happening,
757
00:32:03,360 --> 00:32:05,680
you have to check if the user exists in Azure AD,
758
00:32:05,680 --> 00:32:06,960
verify their Teams license,
759
00:32:06,960 --> 00:32:09,280
and see if the service is even enabled for the company.
760
00:32:09,280 --> 00:32:11,120
Then you have to look at their device compliance,
761
00:32:11,120 --> 00:32:12,400
check for network blocks,
762
00:32:12,400 --> 00:32:14,080
investigate mailbox sync states,
763
00:32:14,080 --> 00:32:15,840
and confirm they're in the right groups.
764
00:32:15,840 --> 00:32:18,800
That isn't a two-step process. It's a 10-step investigation,
765
00:32:18,800 --> 00:32:20,320
where the order of operations matters.
766
00:32:20,320 --> 00:32:23,600
You can't check for a license if you haven't confirmed the user exists yet,
767
00:32:23,600 --> 00:32:25,600
and you can't verify device compliance
768
00:32:25,600 --> 00:32:28,000
if you don't even know which hardware they're using.
769
00:32:28,000 --> 00:32:29,760
Some of these checks are prerequisites,
770
00:32:29,760 --> 00:32:30,720
some are fallbacks,
771
00:32:30,720 --> 00:32:33,280
and others only matter if the first three things fail.
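One way to picture these prerequisite relationships is as a small dependency graph. The Python sketch below uses the standard library's `graphlib` to derive an order that respects them; the check names are hypothetical labels for the steps listed above, and an agent would reason about this ordering dynamically rather than read it from a fixed table.

```python
from graphlib import TopologicalSorter

# Hypothetical checks mapped to the checks that must complete first.
PREREQUISITES = {
    "user_exists": set(),
    "teams_license": {"user_exists"},
    "group_membership": {"user_exists"},
    "registered_devices": {"user_exists"},
    "device_compliance": {"registered_devices"},
}


def valid_order(prerequisites: dict) -> list:
    """Return one ordering of checks in which every prerequisite runs first."""
    return list(TopologicalSorter(prerequisites).static_order())
```

Any order this produces starts with the user lookup and never checks device compliance before the registered devices are known.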
772
00:32:33,280 --> 00:32:36,000
A traditional script tries to hard-code this entire sequence
773
00:32:36,000 --> 00:32:37,600
from step one to step 10.
774
00:32:37,600 --> 00:32:38,720
If anything goes sideways,
775
00:32:38,720 --> 00:32:40,400
the script relies on conditional logic
776
00:32:40,400 --> 00:32:42,400
that you had to anticipate months ago,
777
00:32:42,400 --> 00:32:43,760
but you can't predict everything.
778
00:32:43,760 --> 00:32:45,680
Maybe a serial number is missing from the ticket,
779
00:32:45,680 --> 00:32:47,120
or an API call times out,
780
00:32:47,120 --> 00:32:48,720
or the mailbox is in a weird state
781
00:32:48,720 --> 00:32:50,640
that your code wasn't built to handle.
782
00:32:50,640 --> 00:32:52,720
When that happens, the script either hits a dead end,
783
00:32:52,720 --> 00:32:53,840
or it just fails.
784
00:32:53,840 --> 00:32:56,240
An agent doesn't follow a hard-coded map.
785
00:32:56,240 --> 00:32:57,360
It reads the ticket,
786
00:32:57,360 --> 00:33:00,000
and decides what needs to happen next based on the evidence.
787
00:33:00,000 --> 00:33:02,480
It understands that the first logical move
788
00:33:02,480 --> 00:33:04,080
is to confirm the user exists,
789
00:33:04,080 --> 00:33:06,240
so it makes that decision and executes it.
790
00:33:06,240 --> 00:33:07,520
When the result comes back,
791
00:33:07,520 --> 00:33:10,320
that data either confirms or destroys the next assumption,
792
00:33:10,320 --> 00:33:12,560
and the agent adjusts its plan accordingly.
793
00:33:12,560 --> 00:33:14,480
Watch how this workflow actually unfolds.
794
00:33:14,480 --> 00:33:16,720
The agent reads a ticket from jsmith@company.com
795
00:33:16,720 --> 00:33:18,160
saying they can't get into Teams.
796
00:33:18,160 --> 00:33:20,880
The agent plans to check the user's state and the service state,
797
00:33:20,880 --> 00:33:22,640
so it calls the lookup function first.
798
00:33:22,640 --> 00:33:24,720
The result shows the user exists with ID,
799
00:33:24,720 --> 00:33:26,400
ABC123.
800
00:33:26,400 --> 00:33:28,480
Next, it checks the license and sees it's active.
801
00:33:28,480 --> 00:33:30,080
It checks if the service is enabled,
802
00:33:30,080 --> 00:33:31,520
and the answer is yes.
803
00:33:31,520 --> 00:33:32,560
Then it hits a snag.
804
00:33:32,560 --> 00:33:34,320
It needs to check the device state,
805
00:33:34,320 --> 00:33:36,960
but the ticket doesn't say which device is the problem.
806
00:33:36,960 --> 00:33:39,120
The agent sees two registered devices
807
00:33:39,120 --> 00:33:40,720
and realizes it can't move forward
808
00:33:40,720 --> 00:33:42,240
without knowing which one is failing.
809
00:33:42,240 --> 00:33:44,640
It leaves a note for the human asking for clarification,
810
00:33:44,640 --> 00:33:46,160
but it doesn't just stop there.
811
00:33:46,160 --> 00:33:48,400
It keeps digging and finds a conditional access policy
812
00:33:48,400 --> 00:33:50,880
that's currently blocking all unmanaged devices.
813
00:33:50,880 --> 00:33:52,320
Now the agent has the full picture.
814
00:33:52,320 --> 00:33:54,320
The likely culprit is that one of the user's devices
815
00:33:54,320 --> 00:33:55,120
isn't managed,
816
00:33:55,120 --> 00:33:56,880
and the security policy is doing its job.
817
00:33:56,880 --> 00:33:58,000
The agent decides on a fix,
818
00:33:58,000 --> 00:34:00,400
like enrolling the device or creating an exception,
819
00:34:00,400 --> 00:34:02,400
but it knows these are high-risk moves.
820
00:34:02,400 --> 00:34:04,160
It won't act autonomously here.
821
00:34:04,160 --> 00:34:07,360
Instead, it builds a structured proposal for an IT manager.
822
00:34:07,360 --> 00:34:10,320
The manager gets a notification in Teams with a clear breakdown.
823
00:34:10,320 --> 00:34:12,800
The user is blocked on an unmanaged device.
824
00:34:12,800 --> 00:34:14,640
And here are the three ways to fix it.
825
00:34:14,640 --> 00:34:17,040
The manager clicks "Approve for device enrollment"
826
00:34:17,040 --> 00:34:18,880
and the agent handles the rest.
827
00:34:18,880 --> 00:34:21,360
It runs the enrollment, verifies the fix,
828
00:34:21,360 --> 00:34:23,440
and tells the user they're good to go.
829
00:34:23,440 --> 00:34:25,440
This entire loop, the diagnosis,
830
00:34:25,440 --> 00:34:26,800
the adaptation, the escalation,
831
00:34:26,800 --> 00:34:28,160
and the final execution
832
00:34:28,160 --> 00:34:30,240
is what a multi-step workflow looks like.
833
00:34:30,240 --> 00:34:31,680
The agent decided the sequence
834
00:34:31,680 --> 00:34:33,680
and adapted when it hit a gap in the data.
835
00:34:33,680 --> 00:34:36,160
It didn't break just because a condition was unexpected.
836
00:34:36,160 --> 00:34:37,360
It reasoned its way through.
837
00:34:37,360 --> 00:34:38,800
A hard-coded script fails here
838
00:34:38,800 --> 00:34:41,440
because the person writing it couldn't predict every single variable.
839
00:34:41,440 --> 00:34:43,920
You can only code for the problems you've already thought of.
840
00:34:43,920 --> 00:34:46,800
An agent reasons about what matters based on the live ticket,
841
00:34:46,800 --> 00:34:48,160
deciding what to query,
842
00:34:48,160 --> 00:34:50,560
and in what order based on what it learns at every step.
843
00:34:50,560 --> 00:34:52,400
This is why agents handle variety
844
00:34:52,400 --> 00:34:54,640
without needing a script rewrite every week.
845
00:34:54,640 --> 00:34:57,200
The human in the loop isn't a feature you bolt on later.
846
00:34:57,200 --> 00:34:58,480
It's built into the logic.
847
00:34:58,480 --> 00:35:01,200
The agent knows exactly when it's moving from looking at data
848
00:35:01,200 --> 00:35:02,560
to changing things.
849
00:35:02,560 --> 00:35:05,200
And it stops to ask for permission before it crosses that line.
850
00:35:05,200 --> 00:35:07,840
Error handling and fallback strategies.
851
00:35:07,840 --> 00:35:09,600
We have to be honest about scripts.
852
00:35:09,600 --> 00:35:10,880
They are incredibly brittle.
853
00:35:10,880 --> 00:35:12,320
The moment an error happens,
854
00:35:12,320 --> 00:35:13,600
the whole thing stops,
855
00:35:13,600 --> 00:35:14,960
an exception is thrown,
856
00:35:14,960 --> 00:35:16,720
and the process dies.
857
00:35:16,720 --> 00:35:18,640
Then a human has to step in to find the bug,
858
00:35:18,640 --> 00:35:19,760
edit the code, test it,
859
00:35:19,760 --> 00:35:21,040
and redeploy the whole thing.
860
00:35:21,040 --> 00:35:23,520
That recovery cycle is slow, manual, and expensive.
861
00:35:23,520 --> 00:35:25,040
Agents have to work differently.
862
00:35:25,040 --> 00:35:26,080
When a problem pops up,
863
00:35:26,080 --> 00:35:28,080
they shouldn't just quit and report a failure.
864
00:35:28,080 --> 00:35:29,920
They need to recover, try a new path,
865
00:35:29,920 --> 00:35:31,040
and stay resilient.
866
00:35:31,040 --> 00:35:32,320
The difference is in the logic.
867
00:35:32,320 --> 00:35:33,200
A script says,
868
00:35:33,200 --> 00:35:34,480
"Do this and if it fails, stop."
869
00:35:34,480 --> 00:35:36,640
An agent says, "Try this,
870
00:35:36,640 --> 00:35:39,440
and if it fails, let's look at why and try the next best thing."
871
00:35:39,440 --> 00:35:40,880
Let's look at a password reset.
872
00:35:40,880 --> 00:35:42,400
The agent calls the reset function,
873
00:35:42,400 --> 00:35:44,960
but the system returns an error saying the user wasn't found.
874
00:35:44,960 --> 00:35:47,680
A standard script would just stop and escalate to a human.
875
00:35:47,680 --> 00:35:49,360
The agent, however, reads that error
876
00:35:49,360 --> 00:35:51,520
and thinks about what it actually means.
877
00:35:51,520 --> 00:35:53,680
Maybe the user exists in a different directory,
878
00:35:53,680 --> 00:35:55,840
or perhaps their name was misspelled in the ticket.
879
00:35:55,840 --> 00:35:56,800
Instead of quitting,
880
00:35:56,800 --> 00:35:58,560
the agent tries a different angle.
881
00:35:58,560 --> 00:36:00,240
It might search by a partial name,
882
00:36:00,240 --> 00:36:01,040
a phone number,
883
00:36:01,040 --> 00:36:02,880
or look for recently updated accounts.
884
00:36:02,880 --> 00:36:04,960
It can loop through these fallback strategies
885
00:36:04,960 --> 00:36:07,120
until it finds a match that works.
886
00:36:07,120 --> 00:36:09,440
The secret is that the agent has to be able to classify
887
00:36:09,440 --> 00:36:10,880
what kind of error it's looking at.
888
00:36:10,880 --> 00:36:12,480
Not all errors are created equal.
889
00:36:12,480 --> 00:36:15,360
A temporary error might be a service throttling your requests,
890
00:36:15,360 --> 00:36:18,080
which just means you need to wait 30 seconds and try again.
891
00:36:18,080 --> 00:36:20,400
A permanent error is something like a deleted account
892
00:36:20,400 --> 00:36:22,960
where no amount of retrying is going to help.
893
00:36:22,960 --> 00:36:24,160
Then you have recoverable errors
894
00:36:24,160 --> 00:36:26,080
like a missing user that might be found elsewhere
895
00:36:26,080 --> 00:36:28,240
and fatal errors like a lack of permissions.
896
00:36:28,240 --> 00:36:30,080
An agent that understands these categories
897
00:36:30,080 --> 00:36:32,000
can handle them with some intelligence.
898
00:36:32,000 --> 00:36:33,040
If it's temporary,
899
00:36:33,040 --> 00:36:35,520
it uses an exponential back-off to retry.
900
00:36:35,520 --> 00:36:38,160
If it's permanent, it skips to the next logical step.
901
00:36:38,160 --> 00:36:40,480
If it's recoverable, it tries a different tool.
902
00:36:40,480 --> 00:36:41,680
And if it's truly fatal,
903
00:36:41,680 --> 00:36:42,960
it stops and asks for help.
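Here is a minimal sketch of those four categories in Python. Everything in it is an assumption for illustration: the names ErrorKind, ToolError, and run_with_recovery are hypothetical, and the tools are plain callables standing in for real API calls. It treats permanent and recoverable errors the same way (move on to the next tool), which is a simplification.

```python
import random
import time
from enum import Enum


class ErrorKind(Enum):
    TEMPORARY = "temporary"      # e.g. throttling: wait and retry
    PERMANENT = "permanent"      # e.g. deleted account: retrying won't help
    RECOVERABLE = "recoverable"  # e.g. user not found here: try another tool
    FATAL = "fatal"              # e.g. missing permissions: ask a human


class ToolError(Exception):
    """Hypothetical error type that carries its own classification."""

    def __init__(self, kind, message):
        super().__init__(message)
        self.kind = kind


def run_with_recovery(tools, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Try each tool in order, reacting to the category of error it raises."""
    attempts = []  # (tool name, error kind, message) for the human handoff
    for tool in tools:
        for retry in range(max_retries + 1):
            try:
                return {"status": "ok", "result": tool(), "attempts": attempts}
            except ToolError as err:
                attempts.append((tool.__name__, err.kind.value, str(err)))
                if err.kind is ErrorKind.TEMPORARY and retry < max_retries:
                    # Exponential back-off with a little jitter.
                    sleep(base_delay * (2 ** retry) + random.uniform(0, 0.1))
                    continue
                if err.kind is ErrorKind.FATAL:
                    return {"status": "escalate", "attempts": attempts}
                break  # permanent, recoverable, or out of retries: next tool
    return {"status": "escalate", "attempts": attempts}
```

The attempts list is the useful part: when the function gives up, the human sees every tool that was tried and why it failed.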
904
00:36:42,960 --> 00:36:46,080
In practice, this looks like an agent handling an access ticket.
905
00:36:46,080 --> 00:36:48,400
It tries the email address and gets nothing.
906
00:36:48,400 --> 00:36:50,400
That's a failure for that specific tool.
907
00:36:50,400 --> 00:36:52,640
But the agent knows it's a recoverable situation.
908
00:36:52,640 --> 00:36:54,400
It tries searching by the display name,
909
00:36:54,400 --> 00:36:56,320
finds the right person, and keeps moving.
910
00:36:56,320 --> 00:36:58,000
If every single search method fails,
911
00:36:58,000 --> 00:37:00,080
the agent finally stops and tells the human
912
00:37:00,080 --> 00:37:02,320
that the user doesn't exist anywhere in the system.
913
00:37:02,320 --> 00:37:03,440
When the human gets that message,
914
00:37:03,440 --> 00:37:05,760
they have all the context they need to make a fast call.
915
00:37:05,760 --> 00:37:07,840
They can see it's a typo or a new hire
916
00:37:07,840 --> 00:37:09,200
that hasn't been onboarded yet.
917
00:37:09,200 --> 00:37:10,800
Because the agent did the legwork,
918
00:37:10,800 --> 00:37:12,560
the human isn't starting from scratch.
919
00:37:12,560 --> 00:37:14,240
The goal here is graceful degradation.
920
00:37:14,240 --> 00:37:16,240
If the agent can't solve the whole ticket on its own,
921
00:37:16,240 --> 00:37:17,600
it doesn't just crash.
922
00:37:17,600 --> 00:37:18,960
It hands the problem to a human
923
00:37:18,960 --> 00:37:20,800
and explains exactly how far it got,
924
00:37:20,800 --> 00:37:22,880
what it tried, and what finally blocked it.
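A rough sketch of that handoff, assuming a tiny in-memory directory in place of a real identity system. The Handoff fields, the DIRECTORY data, and the two lookup functions are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Handoff:
    """What the agent gives a human when it can't finish on its own."""
    resolved: bool
    user_id: Optional[str] = None
    tried: List[str] = field(default_factory=list)
    blocked_by: Optional[str] = None


# Hypothetical directory standing in for a real identity system.
DIRECTORY = [
    {"id": "u1", "email": "dana.lee@contoso.example", "display_name": "Dana Lee"},
]


def by_email(ticket):
    email = ticket.get("email")
    return next((u["id"] for u in DIRECTORY if u["email"] == email), None)


def by_display_name(ticket):
    name = ticket.get("name", "").lower()
    return next((u["id"] for u in DIRECTORY if u["display_name"].lower() == name), None)


def find_user(ticket, strategies=(by_email, by_display_name)):
    """Walk the fallback strategies in order and record each one tried."""
    tried = []
    for strategy in strategies:
        tried.append(strategy.__name__)
        user_id = strategy(ticket)
        if user_id is not None:
            return Handoff(resolved=True, user_id=user_id, tried=tried)
    return Handoff(resolved=False, tried=tried,
                   blocked_by="no directory match for any search method")
```

If the email in the ticket has a typo but the display name is right, find_user still resolves the user. If nothing matches, the Handoff tells the human which searches already ran.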
925
00:37:22,880 --> 00:37:25,520
The human isn't wasting time on diagnostics anymore.
926
00:37:25,520 --> 00:37:26,960
They're just the tiebreaker.
927
00:37:26,960 --> 00:37:27,680
To make this work,
928
00:37:27,680 --> 00:37:29,680
you have to build explicit error classification
929
00:37:29,680 --> 00:37:30,960
into your functions.
930
00:37:30,960 --> 00:37:33,760
You can't just return a success or fail flag.
931
00:37:33,760 --> 00:37:35,120
You need to provide enough data,
932
00:37:35,120 --> 00:37:37,520
so the agent can decide if there's a recovery path
933
00:37:37,520 --> 00:37:39,200
or if it's time to escalate.
934
00:37:39,200 --> 00:37:41,280
Resilient systems are designed for failure.
935
00:37:41,280 --> 00:37:42,960
You have to expect that things will break
936
00:37:42,960 --> 00:37:45,760
and give your agent the tools to try a plan B.
937
00:37:45,760 --> 00:37:48,480
By letting the agent navigate these small hurdles on its own,
938
00:37:48,480 --> 00:37:49,760
you keep the workflow moving
939
00:37:49,760 --> 00:37:52,400
instead of letting every minor error turn into a manual task
940
00:37:52,400 --> 00:37:53,280
for your team.
941
00:37:53,280 --> 00:37:54,880
Human-in-the-loop approval gates.
942
00:37:54,880 --> 00:37:56,640
You need to draw a very clear line.
943
00:37:56,640 --> 00:37:59,600
On one side, you have things an agent can decide on its own.
944
00:37:59,600 --> 00:38:02,480
On the other, you have things that require a human judgment call.
945
00:38:02,480 --> 00:38:04,080
Getting this boundary right is the difference
946
00:38:04,080 --> 00:38:06,800
between a helpful tool and one that goes rogue.
947
00:38:06,800 --> 00:38:08,400
Not every decision should be autonomous.
948
00:38:08,400 --> 00:38:09,520
Some actions are low risk
949
00:38:09,520 --> 00:38:11,360
because they don't actually change anything.
950
00:38:11,360 --> 00:38:13,280
If an agent wants to pull a list of users,
951
00:38:13,280 --> 00:38:14,880
check forwarding rules on a mailbox
952
00:38:14,880 --> 00:38:17,280
or look at device compliance data, that's safe.
953
00:38:17,280 --> 00:38:19,760
It can do those things without asking for permission,
954
00:38:19,760 --> 00:38:21,200
but other actions are high risk.
955
00:38:21,200 --> 00:38:22,400
Deleting a user account,
956
00:38:22,400 --> 00:38:24,800
resetting MFA or changing access permissions
957
00:38:24,800 --> 00:38:26,640
can break someone's entire workday.
958
00:38:26,640 --> 00:38:28,800
These actions change the state of your environment
959
00:38:28,800 --> 00:38:30,800
and could even create a security hole.
960
00:38:30,800 --> 00:38:32,480
Because of that, you need a human to say yes
961
00:38:32,480 --> 00:38:33,920
before the agent pulls the trigger.
962
00:38:33,920 --> 00:38:36,720
The best practice for 2026 is simple.
963
00:38:36,720 --> 00:38:38,960
Agents propose and humans approve.
964
00:38:38,960 --> 00:38:41,840
The agent does all the heavy lifting during the diagnostic phase.
965
00:38:41,840 --> 00:38:44,320
It gathers the context, looks at the situation,
966
00:38:44,320 --> 00:38:46,480
and reaches a conclusion about what needs to happen,
967
00:38:46,480 --> 00:38:49,120
but it doesn't execute high-risk changes on its own.
968
00:38:49,120 --> 00:38:50,880
Instead, it creates a structured proposal
969
00:38:50,880 --> 00:38:53,120
that explains what it found, what it wants to do,
970
00:38:53,120 --> 00:38:54,960
and why it thinks that's the right move.
971
00:38:54,960 --> 00:38:56,880
Then it waits.
972
00:38:56,880 --> 00:38:59,200
The word "structured" is the most important part here.
973
00:38:59,200 --> 00:39:02,240
The proposal isn't just a big block of text that's hard to read.
974
00:39:02,240 --> 00:39:05,200
It's a specific object with clear fields like the proposed action,
975
00:39:05,200 --> 00:39:07,440
the reason, the risk level, and the potential impact.
976
00:39:07,440 --> 00:39:10,160
It shows which checks passed and if there are any blockers.
977
00:39:10,160 --> 00:39:12,560
By laying it out this way, a human can review the facts
978
00:39:12,560 --> 00:39:14,480
in seconds and make a fast decision.
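One way to picture that object is a small Python dataclass. The field names below simply mirror the fields just listed; the class name, the example values, and the ready_for_review helper are assumptions, not a standard schema.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class Proposal:
    proposed_action: str
    reason: str
    risk_level: str          # "low", "medium", or "high"
    potential_impact: str
    checks: Dict[str, bool] = field(default_factory=dict)  # check name -> passed?
    blockers: List[str] = field(default_factory=list)

    def ready_for_review(self):
        """Reviewable when every check passed and nothing blocks the action."""
        return all(self.checks.values()) and not self.blockers


proposal = Proposal(
    proposed_action="Enroll the user's unmanaged device",
    reason="A conditional access policy blocks all unmanaged devices",
    risk_level="high",
    potential_impact="User regains access; device becomes managed",
    checks={"account_active": True, "device_identified": True},
)

# asdict() gives a plain dictionary that can be logged or sent to a reviewer.
payload = asdict(proposal)
```

Because every proposal has the same shape, the reviewer always finds the action, the reason, and the risk in the same place.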
979
00:39:14,480 --> 00:39:17,280
Think about how this looks inside Microsoft Teams.
980
00:39:17,280 --> 00:39:19,680
When an agent hits a situation that needs a human eye,
981
00:39:19,680 --> 00:39:21,040
it builds that proposal.
982
00:39:21,040 --> 00:39:24,640
But instead of sending a plain text message, it sends an Adaptive Card.
983
00:39:24,640 --> 00:39:26,160
The card is clean and easy to scan.
984
00:39:26,160 --> 00:39:29,840
You see the username, the status, the proposed action, and the impact.
985
00:39:29,840 --> 00:39:32,000
It tells you if the user will be notified
986
00:39:32,000 --> 00:39:33,760
or if an audit log will be triggered.
987
00:39:33,760 --> 00:39:36,560
At the bottom, you have three simple buttons.
988
00:39:36,560 --> 00:39:39,360
Approve, reject, or ask me more questions.
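As a sketch, the card described here could be built as a Python dictionary and serialized to JSON. The element types (TextBlock, FactSet, Action.Submit) come from the Adaptive Cards schema; the function name, the proposalId and decision keys in the button data, and the sample values are assumptions. How the card actually gets posted to Teams is not shown.

```python
import json


def build_approval_card(user, status, action, impact, proposal_id):
    """Build an Adaptive Card payload with the three review buttons."""

    def button(title, decision):
        # The data keys are hypothetical; the agent would read them back
        # when the reviewer clicks a button.
        return {
            "type": "Action.Submit",
            "title": title,
            "data": {"proposalId": proposal_id, "decision": decision},
        }

    return {
        "$schema": "http://adaptivecards.io/schemas/adaptive-card.json",
        "type": "AdaptiveCard",
        "version": "1.4",
        "body": [
            {"type": "TextBlock", "text": "Approval needed",
             "weight": "Bolder", "size": "Medium"},
            {"type": "FactSet", "facts": [
                {"title": "User", "value": user},
                {"title": "Status", "value": status},
                {"title": "Proposed action", "value": action},
                {"title": "Impact", "value": impact},
            ]},
        ],
        "actions": [
            button("Approve", "approve"),
            button("Reject", "reject"),
            button("Ask me more questions", "ask"),
        ],
    }


card = build_approval_card(
    user="dana.lee@contoso.example",
    status="Blocked on an unmanaged device",
    action="Enroll device",
    impact="User will be notified; an audit log entry is written",
    proposal_id="p-001",
)
card_json = json.dumps(card, indent=2)
```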
989
00:39:39,360 --> 00:39:43,200
A human gets the notification and looks at the card.
990
00:39:43,200 --> 00:39:45,200
It takes maybe 10 seconds to review the details.
991
00:39:45,200 --> 00:39:46,960
If they hit "Approve," the agent goes to work.
992
00:39:46,960 --> 00:39:49,040
If they reject it, the agent takes that feedback
993
00:39:49,040 --> 00:39:50,160
and tries a different path.
994
00:39:50,160 --> 00:39:53,040
If they have questions, the agent can provide the diagnostic data
995
00:39:53,040 --> 00:39:54,640
it used to reach that conclusion.
996
00:39:54,640 --> 00:39:56,960
The human is making an informed choice, not a guess.
997
00:39:56,960 --> 00:39:58,880
This is what we call risk-based tiering.
998
00:39:58,880 --> 00:40:00,640
Low-risk actions stay autonomous,
999
00:40:00,640 --> 00:40:03,680
but medium-risk tasks might need a team lead to sign off.
1000
00:40:03,680 --> 00:40:05,360
High-risk actions like deleting a user
1001
00:40:05,360 --> 00:40:07,600
should require a manager or a security officer.
1002
00:40:07,600 --> 00:40:10,800
Your policy layer defines exactly what fits into each tier.
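A minimal sketch of that tier lookup in Python. The action names and the tier assignments are assumptions chosen to match the examples in this episode; "assign_license" as a medium-risk action is purely illustrative. Your own policy layer would define the real table.

```python
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical policy table: which tier each action falls into.
ACTION_RISK = {
    "list_users": Risk.LOW,
    "check_forwarding_rules": Risk.LOW,
    "read_device_compliance": Risk.LOW,
    "assign_license": Risk.MEDIUM,
    "reset_mfa": Risk.HIGH,
    "change_access_permissions": Risk.HIGH,
    "delete_user": Risk.HIGH,
}

# Who has to sign off at each tier (None means the agent may act alone).
APPROVER_FOR = {
    Risk.LOW: None,
    Risk.MEDIUM: "team_lead",
    Risk.HIGH: "manager_or_security_officer",
}


def required_approver(action):
    """Look up the approver; unknown actions fall into the highest tier."""
    risk = ACTION_RISK.get(action, Risk.HIGH)
    return APPROVER_FOR[risk]
```

Defaulting unknown actions to the highest tier is a deliberate choice in this sketch: anything the policy table has never heard of waits for a human.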
1003
00:40:10,800 --> 00:40:13,760
This gives humans control over the decisions that actually matter
1004
00:40:13,760 --> 00:40:15,840
while letting the agents handle the small stuff.
1005
00:40:15,840 --> 00:40:18,000
The beauty of this model is that it scales.
1006
00:40:18,000 --> 00:40:20,320
One person approving decisions won't become a bottleneck
1007
00:40:20,320 --> 00:40:22,240
if most of the work doesn't need approval.
1008
00:40:22,240 --> 00:40:23,840
The agent handles the easy stuff,
1009
00:40:23,840 --> 00:40:26,960
like running diagnostics and fixing low-risk queries.
1010
00:40:26,960 --> 00:40:30,320
Humans are no longer stuck doing the busy work or running routine tasks.
1011
00:40:30,320 --> 00:40:32,160
They are making the calls that matter.
1012
00:40:32,160 --> 00:40:34,160
Their time is spent on actual decision-making
1013
00:40:34,160 --> 00:40:37,200
rather than the administrative work that leads up to it.
1014
00:40:37,200 --> 00:40:40,320
In this setup, the agent is a decision support system.
1015
00:40:40,320 --> 00:40:42,240
It isn't there to replace human judgment,
1016
00:40:42,240 --> 00:40:43,280
but to augment it.
1017
00:40:43,280 --> 00:40:44,960
It gathers all the relevant info,
1018
00:40:44,960 --> 00:40:46,000
analyzes it,
1019
00:40:46,000 --> 00:40:47,840
and hands over a clear recommendation.
1020
00:40:47,840 --> 00:40:50,080
The human still brings their expertise
1021
00:40:50,080 --> 00:40:51,760
and organizational knowledge to the table.
1022
00:40:51,760 --> 00:40:54,560
The difference is that they're making that choice with complete information
1023
00:40:54,560 --> 00:40:56,000
instead of operating in the dark.
1024
00:40:56,000 --> 00:40:57,360
This is also how you build trust.
1025
00:40:57,360 --> 00:40:59,120
Organizations don't trust machines
1026
00:40:59,120 --> 00:41:01,760
that make irreversible changes without asking first.
1027
00:41:01,760 --> 00:41:03,760
But they do trust systems that propose a plan,
1028
00:41:03,760 --> 00:41:05,840
explain the logic and wait for a green light.
1029
00:41:05,840 --> 00:41:08,000
You aren't handing over the keys to the kingdom.
1030
00:41:08,000 --> 00:41:10,400
You're just letting the machine do the analytical work
1031
00:41:10,400 --> 00:41:12,640
so you can focus on the final decision.
1032
00:41:12,640 --> 00:41:16,160
The agent becomes a collaborator rather than a replacement worker.
1033
00:41:16,160 --> 00:41:17,280
It does what it's best at,
1034
00:41:17,280 --> 00:41:19,920
which is rapid analysis and calling APIs.
1035
00:41:19,920 --> 00:41:21,200
You do what you're best at,
1036
00:41:21,200 --> 00:41:23,200
which is weighing risk and understanding nuance.
1037
00:41:23,200 --> 00:41:24,640
When you work together like this,
1038
00:41:24,640 --> 00:41:26,720
you move much faster than either of you could alone.
1039
00:41:26,720 --> 00:41:28,640
Case study.
1040
00:41:28,640 --> 00:41:30,240
Autonomous ticket resolution.
1041
00:41:30,240 --> 00:41:32,720
Let's look at a real scenario to see how this works.
1042
00:41:32,720 --> 00:41:35,440
This is the step-by-step process of an autonomous agent
1043
00:41:35,440 --> 00:41:37,440
handling a ticket from start to finish.
1044
00:41:37,440 --> 00:41:40,080
A user sends in a ticket at 9:47 a.m.
1045
00:41:40,080 --> 00:41:41,840
They say they can't log into their laptop
1046
00:41:41,840 --> 00:41:43,280
because of a compliance error
1047
00:41:43,280 --> 00:41:44,880
and they need it fixed immediately.
1048
00:41:44,880 --> 00:41:46,720
There's no serial number, no error code,
1049
00:41:46,720 --> 00:41:48,080
and no extra context.
1050
00:41:48,080 --> 00:41:49,440
It's just a frustrated message.
1051
00:41:49,440 --> 00:41:51,600
The agent picks up the ticket the second it arrives.
1052
00:41:51,600 --> 00:41:54,240
The first thing it does is pull the email from the metadata.
1053
00:41:54,240 --> 00:41:55,440
It knows a device is involved,
1054
00:41:55,440 --> 00:41:56,880
but it doesn't know which one yet.
1055
00:41:56,880 --> 00:41:58,240
The agent has a starting point,
1056
00:41:58,240 --> 00:42:00,080
but it needs to go find the missing data.
1057
00:42:00,080 --> 00:42:03,440
It starts the diagnosis by querying Azure AD to find the user.
1058
00:42:03,440 --> 00:42:06,320
The account is active and there are no obvious issues there.
1059
00:42:06,320 --> 00:42:09,920
Next, it checks Intune to see what devices are registered to that person.
1060
00:42:09,920 --> 00:42:11,600
It finds two: a laptop and a phone.
1061
00:42:11,600 --> 00:42:13,040
Since the ticket mentioned a laptop,
1062
00:42:13,040 --> 00:42:15,200
the agent pulls the compliance status for both.
1063
00:42:15,200 --> 00:42:16,480
It sees that the phone is fine,
1064
00:42:16,480 --> 00:42:18,640
but the laptop is marked as non-compliant.
1065
00:42:18,640 --> 00:42:20,720
The laptop hasn't checked in for three days.
1066
00:42:20,720 --> 00:42:22,640
The operating system is on an old build
1067
00:42:22,640 --> 00:42:25,280
and the current policy requires a much newer version.
1068
00:42:25,280 --> 00:42:26,720
Now the agent has the answer.
1069
00:42:26,720 --> 00:42:29,200
The device is blocked by a conditional access policy
1070
00:42:29,200 --> 00:42:30,880
because the OS is outdated.
1071
00:42:30,880 --> 00:42:32,480
But the agent can't just force an update.
1072
00:42:32,480 --> 00:42:35,440
That's a high-risk move that could fail or interrupt the user's work.
1073
00:42:35,440 --> 00:42:38,400
The agent doesn't have the authority to make that call on its own.
1074
00:42:38,400 --> 00:42:40,880
Instead of stopping, the agent builds a proposal.
1075
00:42:40,880 --> 00:42:42,640
It organizes everything it learned,
1076
00:42:42,640 --> 00:42:44,160
including the device ID,
1077
00:42:44,160 --> 00:42:45,200
the current build,
1078
00:42:45,200 --> 00:42:47,280
and the specific policy that's blocking access.
1079
00:42:47,280 --> 00:42:50,240
It knows the fix is to update the OS to the latest version.
1080
00:42:50,240 --> 00:42:52,880
It could also suggest replacing the device if it were too old,
1081
00:42:52,880 --> 00:42:55,040
but an update is the fastest way forward.
1082
00:42:55,040 --> 00:42:58,000
The agent sends this proposal to the IT manager on duty.
1083
00:42:58,000 --> 00:42:59,920
It doesn't send an email that will get buried.
1084
00:42:59,920 --> 00:43:03,040
It sends a Teams Adaptive Card that pops up as a notification.
1085
00:43:03,040 --> 00:43:04,880
The manager sees the user impact,
1086
00:43:04,880 --> 00:43:06,000
the technical details,
1087
00:43:06,000 --> 00:43:07,920
and the risk level all in one place.
1088
00:43:07,920 --> 00:43:10,480
The manager reads the card in about 20 seconds.
1089
00:43:10,480 --> 00:43:11,600
They understand the requirement
1090
00:43:11,600 --> 00:43:13,360
and they see that the fix makes sense.
1091
00:43:13,360 --> 00:43:14,800
They hit Approve.
1092
00:43:14,800 --> 00:43:16,000
The moment that button is pressed,
1093
00:43:16,000 --> 00:43:18,160
the agent moves into the execution phase.
1094
00:43:18,160 --> 00:43:19,760
This is where delegation is key.
1095
00:43:19,760 --> 00:43:21,920
The agent might not push the update directly,
1096
00:43:21,920 --> 00:43:24,160
but it can trigger the workflow in your management system.
1097
00:43:24,160 --> 00:43:27,120
It moves the device into an update group and sends a note to the user.
1098
00:43:27,120 --> 00:43:29,200
It tells them a security update is ready
1099
00:43:29,200 --> 00:43:31,200
and asks them to save their work and restart.
1100
00:43:31,200 --> 00:43:33,760
It can even alert the manager once the restart happens,
1101
00:43:33,760 --> 00:43:35,120
so they can keep an eye on it.
1102
00:43:35,120 --> 00:43:37,120
The user gets the notification on their phone.
1103
00:43:37,120 --> 00:43:38,800
They probably expected this to take all day,
1104
00:43:38,800 --> 00:43:41,680
but now they have a clear instruction to just restart their machine.
1105
00:43:41,680 --> 00:43:43,840
They reboot, the update runs for about 15 minutes,
1106
00:43:43,840 --> 00:43:46,560
and the device comes back online with the correct OS build.
1107
00:43:46,560 --> 00:43:49,120
The compliance status flips to green almost immediately.
1108
00:43:49,120 --> 00:43:50,960
The agent keeps watching that status.
1109
00:43:50,960 --> 00:43:52,880
As soon as the device shows as compliant,
1110
00:43:52,880 --> 00:43:54,480
it knows the problem is solved.
1111
00:43:54,480 --> 00:43:57,840
It checks Azure AD one last time to confirm the block is gone.
1112
00:43:57,840 --> 00:44:00,160
Then it sends a final message to the user,
1113
00:44:00,160 --> 00:44:02,240
letting them know they can log back in.
1114
00:44:02,240 --> 00:44:04,560
It closes the ticket with a full summary of what happened
1115
00:44:04,560 --> 00:44:05,680
and how it was fixed.
1116
00:44:05,680 --> 00:44:07,200
The total time from the first message
1117
00:44:07,200 --> 00:44:09,360
to the final resolution was 15 minutes.
1118
00:44:09,360 --> 00:44:11,120
If a human had handled this manually,
1119
00:44:11,120 --> 00:44:13,840
they would have spent an hour just going back and forth with the user.
1120
00:44:13,840 --> 00:44:16,000
They would have had to remote in, check Intune,
1121
00:44:16,000 --> 00:44:17,920
and manually walk the user through the update.
1122
00:44:17,920 --> 00:44:19,600
That's two or three hours of wasted time.
1123
00:44:19,600 --> 00:44:21,200
The agent didn't take the human's job.
1124
00:44:21,200 --> 00:44:22,720
The manager still made the final call.
1125
00:44:22,720 --> 00:44:24,960
The agent just did the boring diagnostic work
1126
00:44:24,960 --> 00:44:27,280
and handled the execution once it had permission.
1127
00:44:27,280 --> 00:44:30,080
The human handled the judgment and the agent handled the speed.
1128
00:44:30,080 --> 00:44:32,320
This is the force multiplier effect in action.
1129
00:44:32,320 --> 00:44:35,280
The agent takes care of the complexity and the data analysis.
1130
00:44:35,280 --> 00:44:37,200
The human takes care of the authorization.
1131
00:44:37,200 --> 00:44:39,680
The result is a level of speed and accuracy
1132
00:44:39,680 --> 00:44:42,480
that you just can't get when you're working alone.
1133
00:44:42,480 --> 00:44:44,400
Scaling to enterprise operations.
1134
00:44:44,400 --> 00:44:47,200
One agent resolving one ticket is a great proof of concept,
1135
00:44:47,200 --> 00:44:48,400
but in a real organization,
1136
00:44:48,400 --> 00:44:50,080
you don't deal with one ticket a day.
1137
00:44:50,080 --> 00:44:51,920
You deal with hundreds or thousands.
1138
00:44:51,920 --> 00:44:53,360
This is where the single-agent model breaks.
1139
00:44:53,360 --> 00:44:55,120
We have to stop talking about a single agent
1140
00:44:55,120 --> 00:44:57,600
and start talking about a fleet of agents running in parallel.
1141
00:44:57,600 --> 00:45:00,400
The architecture changes fundamentally when you move to this level.
1142
00:45:00,400 --> 00:45:02,160
Instead of one agent and one orchestrator,
1143
00:45:02,160 --> 00:45:04,720
you have a pool of agents, a work dispatcher,
1144
00:45:04,720 --> 00:45:07,120
and a monitoring system that watches everything.
1145
00:45:07,120 --> 00:45:10,400
Think of the dispatcher as the nervous system of the entire operation.
1146
00:45:10,400 --> 00:45:12,960
Tickets come in and get routed to the first available agent
1147
00:45:12,960 --> 00:45:15,440
with the right skill set, which keeps the flow moving.
1148
00:45:15,440 --> 00:45:17,840
An identity agent handles user account issues,
1149
00:45:17,840 --> 00:45:20,160
while an infrastructure agent manages device compliance
1150
00:45:20,160 --> 00:45:22,320
and an application agent deals with licensing.
1151
00:45:22,320 --> 00:45:24,480
The dispatcher looks at the incoming request,
1152
00:45:24,480 --> 00:45:25,920
categorizes the problem,
1153
00:45:25,920 --> 00:45:27,680
and sends it to the right specialist.
1154
00:45:27,680 --> 00:45:29,280
If every identity agent is busy,
1155
00:45:29,280 --> 00:45:32,160
the ticket sits in a queue until someone finishes their current task
1156
00:45:32,160 --> 00:45:33,200
and picks up the next one.
1157
00:45:33,200 --> 00:45:34,480
This is load balancing,
1158
00:45:34,480 --> 00:45:37,760
but it is much smarter than just assigning tasks in a circle.
1159
00:45:37,760 --> 00:45:40,560
Each agent tracks its own performance metrics in real time,
1160
00:45:40,560 --> 00:45:42,560
measuring success rates, resolution times,
1161
00:45:42,560 --> 00:45:44,800
and how often it has to escalate a problem.
1162
00:45:44,800 --> 00:45:47,600
The dispatcher uses this data to route work intelligently.
1163
00:45:47,600 --> 00:45:50,720
If one agent consistently resolves password resets in eight minutes
1164
00:45:50,720 --> 00:45:52,240
while another takes 18,
1165
00:45:52,240 --> 00:45:55,680
the dispatcher routes more of those specific tasks to the faster agent.
1166
00:45:55,680 --> 00:45:59,360
This isn't a permanent change because agents improve or decline over time,
1167
00:45:59,360 --> 00:46:02,960
so the system adjusts dynamically based on current capability.
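The routing idea can be sketched in a few lines of Python. This is a simplified stand-in, not a real dispatcher: the Agent and Dispatcher classes are hypothetical, "fastest idle agent with the right skill" is one possible routing rule, and the rolling average uses an assumed smoothing factor of 0.2.

```python
from collections import deque


class Agent:
    def __init__(self, name, skill, avg_minutes):
        self.name = name
        self.skill = skill
        self.avg_minutes = avg_minutes  # rolling average resolution time
        self.busy = False

    def record(self, minutes, alpha=0.2):
        """Blend the latest resolution time into the rolling average."""
        self.avg_minutes = (1 - alpha) * self.avg_minutes + alpha * minutes


class Dispatcher:
    def __init__(self, agents):
        self.agents = agents
        self.queue = deque()  # tickets waiting for a free specialist

    def submit(self, ticket):
        """Route to the fastest idle agent with the right skill, else queue."""
        idle = [a for a in self.agents
                if a.skill == ticket["category"] and not a.busy]
        if not idle:
            self.queue.append(ticket)
            return None
        agent = min(idle, key=lambda a: a.avg_minutes)
        agent.busy = True
        return agent.name


fleet = [
    Agent("identity-a", "identity", avg_minutes=8),
    Agent("identity-b", "identity", avg_minutes=18),
    Agent("infra-a", "infrastructure", avg_minutes=12),
]
dispatcher = Dispatcher(fleet)
```

Because record() keeps updating the average, an agent that slows down stops winning the min() comparison, which is the "adjusts dynamically" behavior in miniature.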
1168
00:46:02,960 --> 00:46:05,760
To make this work, your infrastructure has to be cloud native.
1169
00:46:05,760 --> 00:46:08,480
You aren't running fixed agents on fixed servers anymore.
1170
00:46:08,480 --> 00:46:11,520
Instead, you are spinning up agents as the load increases
1171
00:46:11,520 --> 00:46:13,600
and tearing them down when things get quiet.
1172
00:46:13,600 --> 00:46:15,280
This is orchestration at a different level
1173
00:46:15,280 --> 00:46:17,280
where Kubernetes handles the container lifecycle
1174
00:46:17,280 --> 00:46:20,080
while the dispatcher manages the actual flow of work.
1175
00:46:20,080 --> 00:46:22,080
The monitoring layer watches both the hardware
1176
00:46:22,080 --> 00:46:24,960
and the agents themselves to ensure everything stays healthy.
1177
00:46:24,960 --> 00:46:27,520
Monitoring is the most critical piece of this puzzle.
1178
00:46:27,520 --> 00:46:30,960
A central dashboard shows you exactly how many tickets are in the queue,
1179
00:46:30,960 --> 00:46:34,240
how many are being worked on, and how many were resolved in the last hour.
1180
00:46:34,240 --> 00:46:36,720
It tracks agent health and availability constantly.
1181
00:46:36,720 --> 00:46:38,720
If an agent starts crashing repeatedly,
1182
00:46:38,720 --> 00:46:42,480
the system pulls it from the pool and flags the issue for the operations team.
1183
00:46:42,480 --> 00:46:44,240
It also reveals escalation patterns.
1184
00:46:44,240 --> 00:46:47,760
If a certain type of ticket is being kicked back to humans more than expected,
1185
00:46:47,760 --> 00:46:50,880
it tells you that something is wrong with the training or the classification.
1186
00:46:50,880 --> 00:46:53,840
The data shows you whether the problem is outside the agent's scope
1187
00:46:53,840 --> 00:46:55,600
or if it just needs better documentation.
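The health check mentioned a moment ago, pulling an agent that keeps crashing, might look something like this in Python. The AgentPool class and the threshold of three consecutive crashes are assumptions for the sake of the example.

```python
from collections import defaultdict


class AgentPool:
    def __init__(self, agent_names, max_consecutive_crashes=3):
        self.active = set(agent_names)
        self.flagged = []  # agents pulled from the pool for the ops team
        self.max_crashes = max_consecutive_crashes
        self._crashes = defaultdict(int)

    def report(self, agent, crashed):
        """Record the outcome of one run; pull the agent after repeated crashes."""
        if agent not in self.active:
            return
        if not crashed:
            self._crashes[agent] = 0  # a clean run resets the streak
            return
        self._crashes[agent] += 1
        if self._crashes[agent] >= self.max_crashes:
            self.active.discard(agent)
            self.flagged.append(agent)
```

Counting consecutive crashes rather than total crashes means one bad run on a busy day does not take a healthy agent out of rotation.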
1188
00:46:55,600 --> 00:46:58,080
This is where feedback loops finally start to matter.
1189
00:46:58,080 --> 00:46:59,440
Every ticket an agent resolves
1190
00:46:59,440 --> 00:47:02,800
and every decision it makes becomes part of a massive pool of training data.
1191
00:47:02,800 --> 00:47:04,320
Your system is actually learning.
1192
00:47:04,320 --> 00:47:07,360
Some organizations feed this data directly back into the model
1193
00:47:07,360 --> 00:47:09,600
while others use it for periodic retraining,
1194
00:47:09,600 --> 00:47:11,040
but the result is the same.
1195
00:47:11,040 --> 00:47:14,000
The agent that used to take eight minutes per ticket gets faster
1196
00:47:14,000 --> 00:47:16,240
and the escalation rate on difficult issues drops
1197
00:47:16,240 --> 00:47:18,560
as the agent learns from human corrections.
1198
00:47:18,560 --> 00:47:21,840
Load balancing also gives you the power to handle massive traffic spikes.
1199
00:47:21,840 --> 00:47:24,160
On a typical Monday morning, you might get a wave of tickets
1200
00:47:24,160 --> 00:47:26,160
because everyone had issues over the weekend,
1201
00:47:26,160 --> 00:47:28,640
but you can just spin up more agents to clear the queue.
1202
00:47:28,640 --> 00:47:30,560
By Wednesday evening, when things slow down,
1203
00:47:30,560 --> 00:47:33,200
you scale back and stop paying for compute you don't need.
1204
00:47:33,200 --> 00:47:35,920
Parallel execution changes your throughput completely.
1205
00:47:35,920 --> 00:47:38,240
If you had one script running one ticket at a time,
1206
00:47:38,240 --> 00:47:39,760
you would be limited by the clock,
1207
00:47:39,760 --> 00:47:42,400
and a hundred tickets would take hours to finish.
1208
00:47:42,400 --> 00:47:44,240
With a fleet running in parallel,
1209
00:47:44,240 --> 00:47:46,720
those same hundred tickets might be done in 30 minutes.
1210
00:47:46,720 --> 00:47:49,920
The time it takes to resolve an individual ticket doesn't change much
1211
00:47:49,920 --> 00:47:51,680
but your total capacity multiplies.
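To put rough numbers on that, here is a back-of-the-envelope calculation in Python. It assumes every ticket takes the same eight minutes (borrowed from the password reset figure earlier) and that each agent works one ticket at a time; real tickets vary, so treat the output as an estimate.

```python
import math


def makespan_minutes(tickets, agents, minutes_per_ticket):
    """Wall-clock time to clear a batch when each agent takes one ticket at a time."""
    rounds = math.ceil(tickets / agents)  # how many tickets the busiest agent handles
    return rounds * minutes_per_ticket


sequential = makespan_minutes(100, agents=1, minutes_per_ticket=8)   # 800 minutes
parallel = makespan_minutes(100, agents=34, minutes_per_ticket=8)    # 24 minutes
```

Under these assumptions, one worker needs more than 13 hours for a hundred tickets, while a fleet of 34 clears them in three rounds. Each ticket still takes eight minutes; only the number of tickets in flight changes.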
1212
00:47:51,680 --> 00:47:54,800
This is the shift from running a script to running an operation.
1213
00:47:54,800 --> 00:47:57,040
You aren't just launching a PowerShell command anymore.
1214
00:47:57,040 --> 00:48:00,800
You are operating a complex system of agents, orchestration and governance.
1215
00:48:00,800 --> 00:48:02,320
The agents are just the workers,
1216
00:48:02,320 --> 00:48:05,520
while everything else is the machinery that makes them reliable at scale.
1217
00:48:05,520 --> 00:48:07,520
This requires a real infrastructure shift.
1218
00:48:07,520 --> 00:48:09,360
You need a message queue for work distribution
1219
00:48:09,360 --> 00:48:11,760
and deep observability through logging and metrics.
1220
00:48:11,760 --> 00:48:14,080
You also need policy engines to enforce rules
1221
00:48:14,080 --> 00:48:16,640
and human approval workflows to keep things safe.
1222
00:48:16,640 --> 00:48:19,840
Most importantly, you need an audit trail for every single action.
1223
00:48:19,840 --> 00:48:22,480
It also forces you to change how you think about governance.
1224
00:48:22,480 --> 00:48:24,560
You have to decide who approves what,
1225
00:48:24,560 --> 00:48:28,080
how the escalation path works and what happens if an agent makes a mistake
1226
00:48:28,080 --> 00:48:28,880
or gets attacked.
1227
00:48:28,880 --> 00:48:31,520
These are operational questions, not development ones.
1228
00:48:31,520 --> 00:48:35,040
Moving from a pilot to full production isn't about writing better code.
1229
00:48:35,040 --> 00:48:37,840
It's about building the system that the agents live inside.
1230
00:48:37,840 --> 00:48:40,560
Governance and guardrails.
1231
00:48:40,560 --> 00:48:43,680
Autonomous agents running at scale need boundaries that are clear,
1232
00:48:43,680 --> 00:48:45,120
enforced and unchangeable.
1233
00:48:45,120 --> 00:48:47,760
Without these limits, you are just giving away access
1234
00:48:47,760 --> 00:48:49,040
and hoping for the best.
1235
00:48:49,040 --> 00:48:52,320
The governance layer is what turns an agent from a wildcard
1236
00:48:52,320 --> 00:48:54,240
into a predictable, reliable system.
1237
00:48:54,560 --> 00:48:57,280
Think about the level of access you are actually granting here.
1238
00:48:57,280 --> 00:48:59,680
You have agents that can query sensitive user data,
1239
00:48:59,680 --> 00:49:01,920
reset passwords, modify permissions,
1240
00:49:01,920 --> 00:49:04,640
and even send emails on behalf of the entire company.
1241
00:49:04,640 --> 00:49:06,480
Any one of those powers is a massive risk
1242
00:49:06,480 --> 00:49:08,080
if it's used in the wrong context.
1243
00:49:08,080 --> 00:49:10,800
Putting all of them together in an agent without guardrails
1244
00:49:10,800 --> 00:49:12,960
is a guaranteed way to end up explaining
1245
00:49:12,960 --> 00:49:15,120
a massive security breach to your board.
1246
00:49:15,120 --> 00:49:17,840
The policy layer is what defines these guardrails.
1247
00:49:17,840 --> 00:49:21,120
It is a declarative model that states exactly what an agent can do,
1248
00:49:21,120 --> 00:49:24,080
which data it can touch and under what specific conditions it can act.
1249
00:49:24,320 --> 00:49:27,920
This isn't buried in a block of code or a forgotten configuration file.
1250
00:49:27,920 --> 00:49:30,320
It is a living document that the governance team controls
1251
00:49:30,320 --> 00:49:31,920
and auditors can review at any time
1252
00:49:31,920 --> 00:49:33,520
without needing to redeploy the agents.
1253
00:49:33,520 --> 00:49:35,360
In practice, this looks very specific.
1254
00:49:35,360 --> 00:49:37,120
You might define a policy that says
1255
00:49:37,120 --> 00:49:40,720
a password reset agent can only help users in the HR department
1256
00:49:40,720 --> 00:49:42,880
if it gets approval from an HR manager first.
1257
00:49:42,880 --> 00:49:44,880
You can even add a rule that it only works
1258
00:49:44,880 --> 00:49:48,160
if the user hasn't asked for a reset in the last 30 days.
1259
00:49:48,160 --> 00:49:50,160
Because the rules are measurable and specific,
1260
00:49:50,160 --> 00:49:52,720
the agent checks them every single time it tries to act.
1261
00:49:52,720 --> 00:49:54,720
If the conditions aren't met, the agent stops
1262
00:49:54,720 --> 00:49:57,440
and either refuses the request or escalates it to a human.
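As a rough illustration of the HR password reset rule just described, here is a minimal sketch of a policy check. The `POLICY` dictionary, the `ResetRequest` fields, and the choice to escalate rather than refuse when approval is missing are all assumptions made for the example, not part of any real product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical policy values taken from the example in the transcript.
POLICY = {
    "allowed_department": "HR",
    "approver_role": "HR manager",
    "min_days_between_resets": 30,
}

@dataclass
class ResetRequest:
    user_department: str
    approved_by_role: Optional[str]   # None means no approval yet
    last_reset: Optional[datetime]    # None means no earlier reset on record

def evaluate(request: ResetRequest, now: datetime) -> str:
    """Return 'allow', 'escalate', or 'refuse' for a password reset request."""
    # Out-of-scope department: the agent must not act at all.
    if request.user_department != POLICY["allowed_department"]:
        return "refuse"
    # A recent reset is unusual, so hand the request to a human.
    window = timedelta(days=POLICY["min_days_between_resets"])
    if request.last_reset is not None and now - request.last_reset < window:
        return "escalate"
    # No sign-off from the required role yet: ask for it.
    if request.approved_by_role != POLICY["approver_role"]:
        return "escalate"
    return "allow"
```

The point of the sketch is that the rule lives in data (`POLICY`), so the governance team can change it without touching `evaluate`.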
1263
00:49:57,440 --> 00:49:59,360
Policies can be incredibly granular.
1264
00:49:59,360 --> 00:50:02,000
You don't just say an agent can read all user data,
1265
00:50:02,000 --> 00:50:04,480
you say it can read six specific fields for users
1266
00:50:04,480 --> 00:50:06,480
in the sales department during business hours.
1267
00:50:06,480 --> 00:50:09,440
The access is scoped to exactly what is needed for the task
1268
00:50:09,440 --> 00:50:10,400
and nothing more.
1269
00:50:10,400 --> 00:50:13,120
These policies also determine when a human needs to step in.
1270
00:50:13,120 --> 00:50:16,160
A simple read-only query might not need any oversight,
1271
00:50:16,160 --> 00:50:19,520
but modifying an email address or deleting a device certainly does.
1272
00:50:19,520 --> 00:50:21,440
Your policy layer maps every possible action
1273
00:50:21,440 --> 00:50:22,960
to a specific approval tier.
1274
00:50:22,960 --> 00:50:25,360
This means the agent knows whether it needs a yes
1275
00:50:25,360 --> 00:50:28,080
from a human before it even attempts to perform the action.
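One way to picture the action-to-tier mapping is a simple lookup table. This is only a sketch: the action names and the two-tier split are hypothetical, and defaulting unknown actions to the strictest tier is a design assumption rather than something stated in the episode.

```python
from enum import Enum

class Tier(Enum):
    NONE = 0    # agent may act on its own
    HUMAN = 1   # a human must say yes first

# Hypothetical action names; the tiers follow the examples in the transcript.
APPROVAL_TIERS = {
    "read_user_profile": Tier.NONE,
    "modify_email_address": Tier.HUMAN,
    "delete_device": Tier.HUMAN,
}

def required_tier(action: str) -> Tier:
    # Unknown actions fall back to the strictest tier instead of slipping through.
    return APPROVAL_TIERS.get(action, Tier.HUMAN)

def may_proceed(action: str, human_approved: bool) -> bool:
    """True if the agent is allowed to attempt the action right now."""
    return required_tier(action) is Tier.NONE or human_approved
```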
1276
00:50:28,080 --> 00:50:30,960
The audit trail is the necessary partner to these policies.
1277
00:50:30,960 --> 00:50:32,880
Every single move an agent makes is logged,
1278
00:50:32,880 --> 00:50:34,400
including what triggered the action,
1279
00:50:34,400 --> 00:50:35,680
which policy allowed it,
1280
00:50:35,680 --> 00:50:37,440
and what the final outcome was.
1281
00:50:37,440 --> 00:50:38,560
If something goes wrong,
1282
00:50:38,560 --> 00:50:40,720
you can rewind the tape to see exactly what happened
1283
00:50:40,720 --> 00:50:41,760
and who approved it.
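To make the audit trail concrete, here is a minimal sketch of what one logged record could contain. The field names and the JSON-lines format are assumptions chosen for illustration; the episode only says that the trigger, the allowing policy, the outcome, and the approver are recorded.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass(frozen=True)
class AuditRecord:
    timestamp: str                    # ISO 8601, supplied by the caller
    agent_id: str
    trigger: str                      # what triggered the action
    policy_id: str                    # which policy allowed it
    action: str
    outcome: str                      # what the final outcome was
    approved_by: Optional[str] = None # who approved it, if anyone

def to_log_line(record: AuditRecord) -> str:
    """Serialize one record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Writing one self-contained line per action is what makes it practical to replay the log later, one decision at a time.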
1284
00:50:41,760 --> 00:50:44,400
This audit trail serves several purposes at once.
1285
00:50:44,400 --> 00:50:46,160
Regulators need to see that you are tracking
1286
00:50:46,160 --> 00:50:47,440
who accessed what and when,
1287
00:50:47,440 --> 00:50:49,360
but it's also vital for troubleshooting.
1288
00:50:49,360 --> 00:50:51,440
If an agent starts behaving strangely,
1289
00:50:51,440 --> 00:50:53,120
you can replay the logs to see exactly
1290
00:50:53,120 --> 00:50:54,880
where its reasoning went off the rails.
1291
00:50:54,880 --> 00:50:57,360
It also provides the feedback you need to improve the system
1292
00:50:57,360 --> 00:50:59,280
by showing you where policies are too tight
1293
00:50:59,280 --> 00:51:01,600
or where agents are making repeated mistakes.
1294
00:51:01,600 --> 00:51:04,320
Compliance requirements are baked directly into this layer.
1295
00:51:04,320 --> 00:51:06,320
If you have rules about data classification,
1296
00:51:06,320 --> 00:51:07,760
the agent has to respect them.
1297
00:51:07,760 --> 00:51:10,800
It won't export confidential files or keep personal data
1298
00:51:10,800 --> 00:51:12,400
longer than the retention window allows.
1299
00:51:12,400 --> 00:51:14,320
The policy isn't just a suggestion.
1300
00:51:14,320 --> 00:51:16,800
It is a hard constraint that the agent cannot break.
1301
00:51:17,680 --> 00:51:19,440
The shift happening here is fundamental
1302
00:51:19,440 --> 00:51:21,440
because you are moving from trusting the script
1303
00:51:21,440 --> 00:51:22,640
to trusting the policy.
1304
00:51:22,640 --> 00:51:23,840
With old school scripts,
1305
00:51:23,840 --> 00:51:25,920
you had to trust that the author wrote secure code,
1306
00:51:25,920 --> 00:51:28,400
which meant you had to audit the code itself before it ran.
1307
00:51:28,400 --> 00:51:30,160
If something broke, the script was the problem.
1308
00:51:30,160 --> 00:51:32,640
With agents, you trust that the policy is correct
1309
00:51:32,640 --> 00:51:34,480
and the enforcement is airtight.
1310
00:51:34,480 --> 00:51:36,320
The agent is just a worker following orders.
1311
00:51:36,320 --> 00:51:38,880
If something goes wrong, it means the policy failed,
1312
00:51:38,880 --> 00:51:40,000
not the agent.
1313
00:51:40,000 --> 00:51:42,000
This also changes how you handle updates.
1314
00:51:42,000 --> 00:51:43,120
With traditional scripts,
1315
00:51:43,120 --> 00:51:45,440
every update is terrifying because a small error
1316
00:51:45,440 --> 00:51:46,720
could break everything.
1317
00:51:46,720 --> 00:51:49,600
With agents, you don't change the code to change the behavior.
1318
00:51:49,600 --> 00:51:51,040
You just update the policy.
1319
00:51:51,040 --> 00:51:53,920
If you want an agent to be more autonomous,
1320
00:51:53,920 --> 00:51:55,520
you adjust the governance layer.
1321
00:51:55,520 --> 00:51:57,520
If you need to restrict it, you do the same.
1322
00:51:57,520 --> 00:51:59,920
The capability of the agent stays the same in the code,
1323
00:51:59,920 --> 00:52:01,840
but its permissions change in the governance.
1324
00:52:01,840 --> 00:52:04,080
Governance is actually what enables you to scale.
1325
00:52:04,080 --> 00:52:06,240
It isn't a hurdle or a constraint.
1326
00:52:06,240 --> 00:52:08,560
It's the foundation that lets you spin up more agents
1327
00:52:08,560 --> 00:52:11,200
with total confidence because the boundaries are clear
1328
00:52:11,200 --> 00:52:12,160
and the audit is complete.
1329
00:52:12,160 --> 00:52:14,240
You don't have to worry about what might go wrong.
1330
00:52:14,240 --> 00:52:16,400
You have a perfect record of everything that happens
1331
00:52:16,400 --> 00:52:18,720
and a system that ensures it stays within the lines.
1332
00:52:18,720 --> 00:52:21,680
Security considerations for autonomous agents.
1333
00:52:21,680 --> 00:52:24,160
The moment you give an agent the power to execute commands,
1334
00:52:24,160 --> 00:52:26,000
you have created a high-value target.
1335
00:52:26,000 --> 00:52:28,640
Unlike a human employee, an agent never gets tired,
1336
00:52:28,640 --> 00:52:31,280
and it won't stop to question a suspicious instruction.
1337
00:52:31,280 --> 00:52:32,960
It simply follows its logic to the end.
1338
00:52:32,960 --> 00:52:35,840
This creates a massive risk if the agent is compromised
1339
00:52:35,840 --> 00:52:38,400
or if someone manages to inject malicious instructions
1340
00:52:38,400 --> 00:52:39,520
into its context.
1341
00:52:39,520 --> 00:52:40,640
You have to ask yourself,
1342
00:52:40,640 --> 00:52:42,560
what is the blast radius if this goes wrong?
1343
00:52:42,560 --> 00:52:44,000
You need a real-world threat model
1344
00:52:44,000 --> 00:52:46,880
based on what actually matters in your specific environment.
1345
00:52:46,880 --> 00:52:49,040
The danger isn't just that an agent might malfunction.
1346
00:52:49,040 --> 00:52:51,680
The real threat is that someone outside your organization
1347
00:52:51,680 --> 00:52:55,120
or even a malicious insider turns that agent into a weapon.
1348
00:52:55,120 --> 00:52:57,200
Isolation is your first line of defense here.
1349
00:52:57,200 --> 00:53:00,480
Agents shouldn't run on the same infrastructure as your critical systems.
1350
00:53:00,480 --> 00:53:02,320
Instead, they need to live in containers,
1351
00:53:02,320 --> 00:53:05,280
Kubernetes pods, or isolated network segments.
1352
00:53:05,280 --> 00:53:06,800
If an agent is compromised,
1353
00:53:06,800 --> 00:53:09,120
that isolation keeps the damage contained.
1354
00:53:09,120 --> 00:53:11,200
It prevents the attacker from immediately pivoting
1355
00:53:11,200 --> 00:53:13,520
to your domain controllers or email servers.
1356
00:53:13,520 --> 00:53:14,720
Think of it as a sandbox.
1357
00:53:14,720 --> 00:53:17,120
This is a major shift from how we usually run scripts.
1358
00:53:17,120 --> 00:53:18,960
A script running on a server with admin rights
1359
00:53:18,960 --> 00:53:20,560
can reach your entire environment,
1360
00:53:20,560 --> 00:53:22,640
but an agent in a properly configured container
1361
00:53:22,640 --> 00:53:25,840
can only see what you explicitly allow through network policies.
1362
00:53:25,840 --> 00:53:27,440
Credential handling has to change as well.
1363
00:53:27,440 --> 00:53:29,360
Traditional scripts usually store credentials
1364
00:53:29,360 --> 00:53:31,360
in an encrypted vault or a config file.
1365
00:53:31,360 --> 00:53:32,880
The script authenticates once,
1366
00:53:32,880 --> 00:53:33,760
gets its token,
1367
00:53:33,760 --> 00:53:36,080
and uses that token for as long as it lives.
1368
00:53:36,080 --> 00:53:37,280
If that credential leaks,
1369
00:53:37,280 --> 00:53:38,960
it stays valid for a long time.
1370
00:53:38,960 --> 00:53:41,280
An agent should never store credentials like that.
1371
00:53:41,280 --> 00:53:43,040
It shouldn't have long-lived tokens at all.
1372
00:53:43,040 --> 00:53:45,040
Every time it needs to perform a task,
1373
00:53:45,040 --> 00:53:47,680
it should request a new credential from an identity broker
1374
00:53:47,680 --> 00:53:49,520
that is only valid for that specific job.
1375
00:53:49,520 --> 00:53:51,280
Because these credentials expire in minutes,
1376
00:53:51,280 --> 00:53:54,400
they are worthless to a thief almost immediately after they are issued.
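The per-job credential idea can be sketched with a toy in-memory broker. Everything here is hypothetical: `IdentityBroker`, `JobCredential`, and the five-minute default lifetime are illustrative names and values, and a real deployment would rely on an actual identity platform rather than a Python dictionary.

```python
import secrets
import time
from dataclasses import dataclass

@dataclass(frozen=True)
class JobCredential:
    token: str
    job_id: str
    expires_at: float   # seconds on the same clock the broker uses

class IdentityBroker:
    """Toy in-memory broker: one credential per job, valid for a few minutes."""

    def __init__(self, ttl_seconds: float = 300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._issued = {}

    def issue(self, job_id: str) -> JobCredential:
        cred = JobCredential(secrets.token_urlsafe(32), job_id,
                             self._clock() + self._ttl)
        self._issued[cred.token] = cred
        return cred

    def is_valid(self, token: str, job_id: str) -> bool:
        cred = self._issued.get(token)
        # A token is useless for any job other than the one it was issued for.
        if cred is None or cred.job_id != job_id:
            return False
        return self._clock() < cred.expires_at

    def revoke(self, token: str) -> None:
        # Revocation takes effect immediately, before the token would expire.
        self._issued.pop(token, None)
```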
1377
00:53:54,400 --> 00:53:56,960
This ties back to headless authentication and CAE,
1378
00:53:56,960 --> 00:53:59,040
but the security angle is much more aggressive.
1379
00:53:59,040 --> 00:54:01,040
It isn't just about checking a compliance box.
1380
00:54:01,040 --> 00:54:03,920
It's about ensuring that even if someone dumps the agent's memory
1381
00:54:03,920 --> 00:54:05,120
or reads its logs,
1382
00:54:05,120 --> 00:54:07,600
there are no usable credentials sitting there.
1383
00:54:07,600 --> 00:54:09,600
If that agent starts behaving strangely
1384
00:54:09,600 --> 00:54:12,880
by making unauthorized API calls or accessing data it doesn't need,
1385
00:54:12,880 --> 00:54:15,520
the policy engine can revoke its access instantly.
1386
00:54:15,520 --> 00:54:18,240
Behavioral analytics serves as your early warning system.
1387
00:54:18,240 --> 00:54:21,120
If an agent that usually processes password resets
1388
00:54:21,120 --> 00:54:23,920
suddenly starts scanning every user in the directory,
1389
00:54:23,920 --> 00:54:25,120
that is a major signal.
1390
00:54:25,120 --> 00:54:27,360
If an agent that typically runs during business hours
1391
00:54:27,360 --> 00:54:28,880
suddenly wakes up at 3am,
1392
00:54:28,880 --> 00:54:31,360
or if it jumps from 10 API calls to 1,000,
1393
00:54:31,360 --> 00:54:33,120
your monitoring system needs to flag it.
1394
00:54:33,120 --> 00:54:34,960
This is actually easier than traditional monitoring
1395
00:54:34,960 --> 00:54:37,600
because agent behavior is incredibly predictable.
1396
00:54:37,600 --> 00:54:39,680
You know exactly which commands it should run,
1397
00:54:39,680 --> 00:54:40,640
what data it needs,
1398
00:54:40,640 --> 00:54:42,400
and how long the process should take.
1399
00:54:42,400 --> 00:54:44,640
Any deviation from that baseline is a red flag
1400
00:54:44,640 --> 00:54:46,480
that likely indicates an attack.
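A baseline check along these lines might look like the sketch below. The `Baseline` fields and the thresholds are assumptions for illustration; a production system would more likely learn the baseline from history than hard-code it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Baseline:
    allowed_actions: frozenset   # commands the agent is expected to run
    active_hours: range          # local hours in which it normally runs
    max_calls_per_hour: int

def find_anomalies(baseline: Baseline, action: str, hour: int,
                   calls_this_hour: int) -> list:
    """Return human-readable reasons the observation is off baseline."""
    reasons = []
    if action not in baseline.allowed_actions:
        reasons.append(f"unexpected action: {action}")
    if hour not in baseline.active_hours:
        reasons.append(f"activity outside normal hours: {hour}:00")
    if calls_this_hour > baseline.max_calls_per_hour:
        reasons.append(
            f"call volume {calls_this_hour} exceeds {baseline.max_calls_per_hour}")
    return reasons
```

An empty list means the observation sits inside the narrow path; anything else is a candidate for investigation.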
1401
00:54:46,480 --> 00:54:48,080
While a script has broad boundaries
1402
00:54:48,080 --> 00:54:50,240
because humans might run it in different ways,
1403
00:54:50,240 --> 00:54:52,240
a production agent has a very narrow path.
1404
00:54:52,240 --> 00:54:53,440
Anything outside that path
1405
00:54:53,440 --> 00:54:55,200
deserves an immediate investigation.
1406
00:54:55,200 --> 00:54:56,560
The principle of least privilege
1407
00:54:56,560 --> 00:54:58,400
becomes much tighter with agents.
1408
00:54:58,400 --> 00:55:00,080
A human might need broad permissions
1409
00:55:00,080 --> 00:55:02,000
because their job requires flexibility
1410
00:55:02,000 --> 00:55:03,520
across different domains.
1411
00:55:03,520 --> 00:55:06,000
An agent only needs exactly what is required
1412
00:55:06,000 --> 00:55:07,600
for its one specific function.
1413
00:55:07,600 --> 00:55:11,120
A password reset agent has no business reading email servers
1414
00:55:11,120 --> 00:55:14,400
and a device provisioning agent doesn't need to see financial data.
1415
00:55:14,400 --> 00:55:16,000
You enforce this at the API level
1416
00:55:16,000 --> 00:55:17,680
so the agent can call a function,
1417
00:55:17,680 --> 00:55:20,880
but the function itself is restricted to specific objects.
1418
00:55:20,880 --> 00:55:24,000
This represents a profound shift in how we think about security.
1419
00:55:24,000 --> 00:55:26,240
With scripts, you focus on securing the code
1420
00:55:26,240 --> 00:55:28,720
by scanning for vulnerabilities and auditing changes.
1421
00:55:28,720 --> 00:55:30,320
It's a code security problem.
1422
00:55:30,320 --> 00:55:33,520
With agents, you are securing the identity and the context.
1423
00:55:33,520 --> 00:55:35,120
The code is actually secondary.
1424
00:55:35,120 --> 00:55:37,680
The real question is what this agent is authorized to do
1425
00:55:37,680 --> 00:55:39,520
right now in this exact moment
1426
00:55:39,520 --> 00:55:40,880
with this specific identity.
1427
00:55:40,880 --> 00:55:42,640
If that answer is tightly scoped
1428
00:55:42,640 --> 00:55:44,320
and constantly re-evaluated,
1429
00:55:44,320 --> 00:55:47,120
the agent stays secure even if the code has flaws.
1430
00:55:47,120 --> 00:55:50,480
Security architecture is moving away from the traditional perimeter.
1431
00:55:50,480 --> 00:55:52,320
We are moving toward a zero-trust model
1432
00:55:52,320 --> 00:55:54,960
where every single agent action is evaluated,
1433
00:55:54,960 --> 00:55:56,720
every identity is verified
1434
00:55:56,720 --> 00:55:58,480
and every behavior is monitored.
1435
00:55:58,480 --> 00:56:00,320
The agent doesn't get a free pass
1436
00:56:00,320 --> 00:56:01,760
just because it's software.
1437
00:56:01,760 --> 00:56:03,840
In fact, you should treat it with less trust
1438
00:56:03,840 --> 00:56:06,480
than a human user because it operates autonomously.
1439
00:56:06,480 --> 00:56:08,960
Your security team will find their roles changing too.
1440
00:56:08,960 --> 00:56:10,960
They won't just be auditing lines of code anymore.
1441
00:56:10,960 --> 00:56:12,960
Instead, they will be defining policies,
1442
00:56:12,960 --> 00:56:14,560
reviewing behavioral baselines
1443
00:56:14,560 --> 00:56:16,240
and managing the identity lifecycle
1444
00:56:16,240 --> 00:56:18,240
for an entire fleet of agents.
1445
00:56:18,240 --> 00:56:19,760
They have to think about attack surfaces
1446
00:56:19,760 --> 00:56:21,760
that simply didn't exist a few years ago
1447
00:56:21,760 --> 00:56:23,920
which requires a completely different skill set.
1448
00:56:23,920 --> 00:56:25,440
Measuring agent performance.
1449
00:56:25,440 --> 00:56:27,040
When you move from scripts to agents,
1450
00:56:27,040 --> 00:56:29,360
your measurement system has to change completely.
1451
00:56:29,360 --> 00:56:30,640
You can't rely on the old metrics
1452
00:56:30,640 --> 00:56:33,200
because you aren't evaluating the same type of work anymore.
1453
00:56:33,200 --> 00:56:35,600
With scripts, success was a binary outcome.
1454
00:56:35,600 --> 00:56:36,880
You checked to see if it ran,
1455
00:56:36,880 --> 00:56:38,880
if it crashed or if the logs showed any errors.
1456
00:56:38,880 --> 00:56:40,240
These are mechanical measurements.
1457
00:56:40,240 --> 00:56:42,480
The script either finished its sequence or it didn't.
1458
00:56:42,480 --> 00:56:45,440
You measured things like runtime and error frequency.
1459
00:56:45,440 --> 00:56:47,440
And if the script finished without a crash,
1460
00:56:47,440 --> 00:56:48,880
you just assumed it worked.
1461
00:56:48,880 --> 00:56:50,080
The logic was hidden inside
1462
00:56:50,080 --> 00:56:52,240
so you couldn't see why it made certain decisions.
1463
00:56:52,240 --> 00:56:53,520
Agents are a different story.
1464
00:56:53,520 --> 00:56:54,720
A script might run perfectly
1465
00:56:54,720 --> 00:56:56,880
but still fail to solve the actual problem.
1466
00:56:56,880 --> 00:56:58,560
On the other hand, an agent that stops
1467
00:56:58,560 --> 00:57:00,320
and escalates a ticket to a human
1468
00:57:00,320 --> 00:57:02,640
might actually be making the smartest possible move.
1469
00:57:02,640 --> 00:57:04,960
Success isn't about the agent running smoothly.
1470
00:57:04,960 --> 00:57:07,280
It's about the agent achieving the final goal.
1471
00:57:07,280 --> 00:57:09,280
You should start by looking at your automation rate.
1472
00:57:09,280 --> 00:57:10,640
This is the percentage of tasks
1473
00:57:10,640 --> 00:57:12,160
that are fully resolved by the agent
1474
00:57:12,160 --> 00:57:13,680
without any human stepping in.
1475
00:57:13,680 --> 00:57:15,360
If you handle a thousand tickets a month
1476
00:57:15,360 --> 00:57:17,360
and your agents resolve 400 of them,
1477
00:57:17,360 --> 00:57:18,960
your automation rate is 40%.
1478
00:57:18,960 --> 00:57:20,080
This is your headline metric
1479
00:57:20,080 --> 00:57:22,080
because it shows exactly how much work the agents
1480
00:57:22,080 --> 00:57:23,120
are taking off your plate.
1481
00:57:23,120 --> 00:57:24,720
You should expect this number to climb
1482
00:57:24,720 --> 00:57:26,160
as you refine the agents
1483
00:57:26,160 --> 00:57:28,080
though it will never hit 100%.
1484
00:57:28,080 --> 00:57:30,400
Some problems will always be outside the agent's scope
1485
00:57:30,400 --> 00:57:31,520
but the trend will tell you
1486
00:57:31,520 --> 00:57:33,120
if you're moving in the right direction.
1487
00:57:33,120 --> 00:57:35,600
The escalation rate is the other side of that coin.
1488
00:57:35,600 --> 00:57:37,360
When an agent can't finish a task,
1489
00:57:37,360 --> 00:57:39,440
you need to know how often it escalates
1490
00:57:39,440 --> 00:57:41,440
versus how often it just fails.
1491
00:57:41,440 --> 00:57:43,920
A well-designed agent will escalate more than it fails.
1492
00:57:43,920 --> 00:57:45,760
Escalation means the agent tried,
1493
00:57:45,760 --> 00:57:46,960
realized it couldn't finish
1494
00:57:46,960 --> 00:57:48,560
and handed the ticket to a human
1495
00:57:48,560 --> 00:57:50,240
with all the necessary context.
1496
00:57:50,240 --> 00:57:51,760
Failure means it just crashed
1497
00:57:51,760 --> 00:57:54,160
or threw an error with no way forward.
1498
00:57:54,160 --> 00:57:55,680
You want to see low failure rates
1499
00:57:55,680 --> 00:57:57,120
and high escalation rates.
1500
00:57:57,120 --> 00:57:59,440
An agent that hands over a complete diagnostic report
1501
00:57:59,440 --> 00:58:01,120
to a human is much more valuable
1502
00:58:01,120 --> 00:58:02,560
than one that fails silently.
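The three rates discussed here can be computed from a plain list of ticket outcomes. This is a sketch under one assumption: every ticket ends in exactly one of the labels `resolved`, `escalated`, or `failed`, and each rate is a share of all tickets.

```python
from collections import Counter

def ticket_rates(outcomes):
    """outcomes: iterable of 'resolved', 'escalated', or 'failed', one per ticket."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tickets to measure")
    return {
        "automation_rate": counts["resolved"] / total,
        "escalation_rate": counts["escalated"] / total,
        "failure_rate": counts["failed"] / total,
    }
```

With the thousand-ticket month from the example, 400 resolved tickets give an automation rate of 0.4; how the remaining 600 split between escalations and failures is what tells you whether the agent is well designed.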
1503
00:58:02,560 --> 00:58:04,320
Approval time is another critical factor
1504
00:58:04,320 --> 00:58:06,160
because humans are often the bottleneck.
1505
00:58:06,160 --> 00:58:07,600
When an agent suggests a fix
1506
00:58:07,600 --> 00:58:09,520
that needs a sign off, how long does that take?
1507
00:58:09,520 --> 00:58:10,560
If your agents are fast,
1508
00:58:10,560 --> 00:58:12,800
but your humans take two days to review the work,
1509
00:58:12,800 --> 00:58:14,560
you've just traded one bottleneck for another.
1510
00:58:14,560 --> 00:58:15,760
You want these approvals happening
1511
00:58:15,760 --> 00:58:17,200
in minutes rather than hours.
1512
00:58:17,200 --> 00:58:19,760
This metric tells you if your workflow is actually efficient
1513
00:58:19,760 --> 00:58:21,520
or if it needs to be redesigned.
1514
00:58:21,520 --> 00:58:23,200
Quality metrics are harder to track,
1515
00:58:23,200 --> 00:58:24,480
but they are essential.
1516
00:58:24,480 --> 00:58:26,800
Accuracy asks if the agent did the right thing,
1517
00:58:26,800 --> 00:58:28,400
diagnosed the problem correctly
1518
00:58:28,400 --> 00:58:29,920
and applied the proper fix.
1519
00:58:29,920 --> 00:58:32,240
You measure this by having a human expert sample agent
1520
00:58:32,240 --> 00:58:34,160
decisions to verify they were correct.
1521
00:58:34,160 --> 00:58:36,640
Over time, this builds a reliable quality score.
1522
00:58:36,640 --> 00:58:38,240
You also need to measure completeness.
1523
00:58:38,240 --> 00:58:39,840
Did the agent solve the whole problem?
1524
00:58:39,840 --> 00:58:43,040
Or did it stop after fixing just one part of a multi-step issue?
1525
00:58:43,040 --> 00:58:44,240
Then you have the business metrics.
1526
00:58:44,240 --> 00:58:47,040
Cost per resolution is the total cost of your compute power
1527
00:58:47,040 --> 00:58:48,320
and human approval time
1528
00:58:48,320 --> 00:58:50,240
divided by the number of tickets resolved.
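Written out as a small function, the calculation looks like this. Pricing human approval time with an hourly rate is an assumption added for the sketch; the episode does not say how that time is converted into cost.

```python
def cost_per_resolution(compute_cost: float, approval_hours: float,
                        hourly_rate: float, tickets_resolved: int) -> float:
    """(compute cost + cost of human approval time) / tickets resolved."""
    if tickets_resolved <= 0:
        raise ValueError("tickets_resolved must be positive")
    return (compute_cost + approval_hours * hourly_rate) / tickets_resolved
```

For example, 200 in compute plus 10 approval hours at a rate of 60, spread over 400 resolved tickets, works out to 2.0 per ticket.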
1529
00:58:50,240 --> 00:58:53,280
This tells you if automation is actually saving you money.
1530
00:58:53,280 --> 00:58:55,440
If your agents end up costing more than your human staff,
1531
00:58:55,440 --> 00:58:56,960
your model is likely broken.
1532
00:58:56,960 --> 00:59:00,480
You also want to track the mean time to resolution or MTTR.
1533
00:59:00,480 --> 00:59:02,880
When agents are involved, the time from ticket submission
1534
00:59:02,880 --> 00:59:05,040
to closure should drop significantly.
1535
00:59:05,040 --> 00:59:07,520
Finally, check user satisfaction to see if people are actually
1536
00:59:07,520 --> 00:59:09,520
happier with the faster response times.
1537
00:59:09,520 --> 00:59:11,120
The shift here is moving from,
1538
00:59:11,120 --> 00:59:12,560
did the system work?
1539
00:59:12,560 --> 00:59:15,040
To, did the system solve the problem?
1540
00:59:15,040 --> 00:59:16,720
Scripts let you measure execution,
1541
00:59:16,720 --> 00:59:18,800
but agents require you to measure helpfulness.
1542
00:59:18,800 --> 00:59:20,320
This also changes your feedback loop.
1543
00:59:20,320 --> 00:59:23,280
With scripts, you look at logs to find bugs in the code.
1544
00:59:23,280 --> 00:59:25,200
With agents, you look at quality metrics
1545
00:59:25,200 --> 00:59:27,360
to see where the decision making went wrong
1546
00:59:27,360 --> 00:59:30,160
so you can retrain the model or update your policies.
1547
00:59:30,160 --> 00:59:31,920
You aren't just debugging code anymore.
1548
00:59:31,920 --> 00:59:34,080
You are improving the way the system thinks.
1549
00:59:34,080 --> 00:59:37,360
New observability practices are starting to emerge around this.
1550
00:59:37,360 --> 00:59:39,520
You need real-time dashboards that show these trends
1551
00:59:39,520 --> 00:59:42,800
and historical data to see if your improvements are actually sticking.
1552
00:59:42,800 --> 00:59:44,480
You need to break things down by category
1553
00:59:44,480 --> 00:59:46,640
so you can see if your agents are great at one task
1554
00:59:46,640 --> 00:59:47,760
but struggling with another.
1555
00:59:47,760 --> 00:59:49,280
You also need drill-down views
1556
00:59:49,280 --> 00:59:51,200
that let you inspect a specific ticket
1557
00:59:51,200 --> 00:59:52,960
to see exactly what the agent decided
1558
00:59:52,960 --> 00:59:54,640
and why it made that choice.
1559
00:59:54,640 --> 00:59:56,560
Measurement becomes a constant conversation
1560
00:59:56,560 --> 00:59:58,480
between you and your agent fleet.
1561
00:59:58,480 --> 01:00:01,120
The metrics will show you exactly where to focus your energy next.
1562
01:00:01,120 --> 01:00:03,440
If your automation rate is stuck at 30%,
1563
01:00:03,440 --> 01:00:06,400
you need to look at the 70% that is still escalating.
1564
01:00:06,400 --> 01:00:07,920
If your approval time is creeping up,
1565
01:00:07,920 --> 01:00:09,840
your process is becoming a bottleneck.
1566
01:00:09,840 --> 01:00:11,920
If the cost per resolution is climbing,
1567
01:00:11,920 --> 01:00:13,600
your agent might be over-engineered
1568
01:00:13,600 --> 01:00:15,520
or your infrastructure might be inefficient.
1569
01:00:15,520 --> 01:00:18,080
Without these specific metrics, you are flying blind.
1570
01:00:18,080 --> 01:00:20,880
You won't know if your agents are actually making things better
1571
01:00:20,880 --> 01:00:24,160
or if you're just enjoying the feeling of having things automated.
1572
01:00:24,160 --> 01:00:27,360
The roadmap to operationalization.
1573
01:00:27,360 --> 01:00:30,640
Moving an agent from a lab experiment to a production workhorse
1574
01:00:30,640 --> 01:00:32,160
requires a specific path.
1575
01:00:32,160 --> 01:00:34,080
The technology isn't actually the hard part
1576
01:00:34,080 --> 01:00:36,640
because you have already built the core logic.
1577
01:00:36,640 --> 01:00:39,040
The real challenge is shifting how your team works,
1578
01:00:39,040 --> 01:00:41,520
what they monitor, and how they build trust in the system.
1579
01:00:41,520 --> 01:00:44,080
You have to follow a roadmap with distinct phases
1580
01:00:44,080 --> 01:00:46,320
and if you try to skip them, it will cost you.
1581
01:00:46,320 --> 01:00:47,360
Phase one is the pilot.
1582
01:00:47,360 --> 01:00:50,240
You pick one narrow domain and one specific capability
1583
01:00:50,240 --> 01:00:52,480
like password resets and that is it.
1584
01:00:52,480 --> 01:00:54,640
You build a single agent that does one thing.
1585
01:00:54,640 --> 01:00:58,160
It only runs queries with read-only access to Azure AD and Intune
1586
01:00:58,160 --> 01:01:00,240
so there are no writes and no approvals yet.
1587
01:01:00,240 --> 01:01:02,800
When a user submits a ticket saying their password expired,
1588
01:01:02,800 --> 01:01:05,680
the agent reads the account status, confirms they exist,
1589
01:01:05,680 --> 01:01:07,360
and checks that they aren't locked out.
1590
01:01:07,360 --> 01:01:09,920
It resets the password and notifies the user
1591
01:01:09,920 --> 01:01:11,360
before closing the ticket.
1592
01:01:11,360 --> 01:01:13,840
This phase proves the fundamental concept
1593
01:01:13,840 --> 01:01:16,240
that an agent can interpret natural language
1594
01:01:16,240 --> 01:01:19,360
and execute a sequence of actions to solve a real problem.
1595
01:01:19,360 --> 01:01:22,400
You aren't doing anything fancy or using complex reasoning yet
1596
01:01:22,400 --> 01:01:24,960
but you are showing the team that the idea works
1597
01:01:24,960 --> 01:01:26,480
so they can gain confidence.
1598
01:01:26,480 --> 01:01:28,560
This pilot usually runs for four to eight weeks
1599
01:01:28,560 --> 01:01:30,960
while you measure success rates and error patterns.
1600
01:01:30,960 --> 01:01:33,200
This is where you discover the operational infrastructure
1601
01:01:33,200 --> 01:01:35,440
you are missing, like better logging or a plan
1602
01:01:35,440 --> 01:01:37,120
for updating agent behavior.
1603
01:01:37,120 --> 01:01:38,640
You might realize that approval workflows
1604
01:01:38,640 --> 01:01:39,920
are much harder than you thought
1605
01:01:39,920 --> 01:01:42,720
but these discoveries are exactly why you run a pilot.
1606
01:01:42,720 --> 01:01:45,760
You are learning what it takes to run the system at a low scale
1607
01:01:45,760 --> 01:01:48,000
where mistakes don't cause a massive collapse.
1608
01:01:48,000 --> 01:01:49,360
Phase two is expansion.
1609
01:01:49,360 --> 01:01:51,680
You add a second domain, perhaps access requests
1610
01:01:51,680 --> 01:01:54,480
or device provisioning, which means you now have two agents
1611
01:01:54,480 --> 01:01:55,600
running in parallel.
1612
01:01:55,600 --> 01:01:57,440
This is where scaling starts to matter
1613
01:01:57,440 --> 01:02:00,800
because you are adding complexity instead of just doing more of the same.
1614
01:02:00,800 --> 01:02:04,080
The orchestration layer now has to route tickets to the right agent
1615
01:02:04,080 --> 01:02:07,680
and your monitoring must track which agent handled which specific task.
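A two-agent routing step can be sketched as a lookup plus a record of who handled what. The category names, agent names, and the `human-queue` fallback are hypothetical and only stand in for whatever your orchestration layer actually uses.

```python
from collections import defaultdict

# Hypothetical category-to-agent table for a two-agent expansion phase.
ROUTES = {
    "password_reset": "password-reset-agent",
    "access_request": "access-request-agent",
}

def route(ticket: dict, handled_by: dict) -> str:
    """Pick an agent for the ticket and record the assignment for monitoring."""
    # Tickets no agent covers go to people rather than to a best guess.
    agent = ROUTES.get(ticket["category"], "human-queue")
    handled_by[agent].append(ticket["id"])
    return agent

# Usage: a defaultdict keeps a per-agent list of ticket ids.
handled = defaultdict(list)
route({"id": "T1", "category": "password_reset"}, handled)
```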
1616
01:02:07,680 --> 01:02:10,320
You can start introducing low-risk write operations here.
1617
01:02:10,320 --> 01:02:12,720
The access request agent might assign a user to a group
1618
01:02:12,720 --> 01:02:13,920
if the request is valid
1619
01:02:13,920 --> 01:02:17,600
but you still avoid high risk changes like deletions or MFA resets.
1620
01:02:17,600 --> 01:02:19,440
You only allow straightforward modifications
1621
01:02:19,440 --> 01:02:21,840
that you can easily undo if something goes wrong.
1622
01:02:21,840 --> 01:02:23,760
You also start introducing approval workflows
1623
01:02:23,760 --> 01:02:26,960
for medium risk decisions during this six to ten week period.
1624
01:02:26,960 --> 01:02:29,360
You are proving that multiple agents can coexist
1625
01:02:29,360 --> 01:02:32,480
and that the routing works without the approval system becoming a bottleneck.
1626
01:02:32,480 --> 01:02:33,920
This is when you start seeing patterns
1627
01:02:33,920 --> 01:02:36,320
in what the agents get right and where they struggle
1628
01:02:36,320 --> 01:02:40,160
and that feedback becomes the fuel for your retraining and policy adjustments.
1629
01:02:40,160 --> 01:02:41,600
Phase three is scale.
1630
01:02:41,600 --> 01:02:42,640
You have a fleet now
1631
01:02:42,640 --> 01:02:44,880
with five or six agents handling different domains
1632
01:02:44,880 --> 01:02:46,320
from a shared ticket queue.
1633
01:02:46,320 --> 01:02:48,880
The orchestrator distributes work based on capability
1634
01:02:48,880 --> 01:02:50,000
and availability.
1635
01:02:50,000 --> 01:02:52,160
And while high risk operations are finally in the mix
1636
01:02:52,160 --> 01:02:54,080
they all have strict approval gates.
1637
01:02:54,080 --> 01:02:57,440
This phase is about proving the system can handle real production volume.
1638
01:02:57,440 --> 01:02:59,920
You aren't testing 100 tickets a week anymore
1639
01:02:59,920 --> 01:03:02,240
but instead you are running hundreds every single day.
1640
01:03:02,240 --> 01:03:04,400
This is where your infrastructure scaling happens
1641
01:03:04,400 --> 01:03:07,600
and you find out if your Kubernetes setup can actually handle the load.
1642
01:03:07,600 --> 01:03:09,600
You will see which types of tickets the agents nail
1643
01:03:09,600 --> 01:03:11,440
and which ones cause them to stumble.
1644
01:03:11,440 --> 01:03:13,520
This phase typically runs for three or four months
1645
01:03:13,520 --> 01:03:14,960
as you ramp up the volume
1646
01:03:14,960 --> 01:03:16,960
and the business impact becomes visible.
1647
01:03:16,960 --> 01:03:19,760
Tickets that used to take two hours now take 15 minutes
1648
01:03:19,760 --> 01:03:21,200
which clears your backlog
1649
01:03:21,200 --> 01:03:23,440
and frees your IT staff from routine drudgery.
1650
01:03:23,440 --> 01:03:25,920
This is the moment you get the organizational buy-in
1651
01:03:25,920 --> 01:03:27,680
needed for the final stage.
1652
01:03:27,680 --> 01:03:29,280
Phase four is optimization.
1653
01:03:29,280 --> 01:03:30,880
Your agents are production grade
1654
01:03:30,880 --> 01:03:32,800
so now you focus on making them better.
1655
01:03:32,800 --> 01:03:34,640
You feed resolution data back into the model
1656
01:03:34,640 --> 01:03:36,880
so the agents can learn from human corrections.
1657
01:03:36,880 --> 01:03:38,960
You adjust policies based on the patterns you see
1658
01:03:38,960 --> 01:03:39,920
and optimize routing
1659
01:03:39,920 --> 01:03:42,320
so the right agent gets the right ticket more often.
1660
01:03:42,320 --> 01:03:44,160
You reduce approval times by refining
1661
01:03:44,160 --> 01:03:45,760
what actually needs human judgment
1662
01:03:45,760 --> 01:03:48,080
versus what the agent can decide on its own.
1663
01:03:48,080 --> 01:03:50,800
This phase is continuous because it isn't a checkpoint
1664
01:03:50,800 --> 01:03:52,640
but an ongoing operational mode.
1665
01:03:52,640 --> 01:03:55,280
You are constantly measuring ROI and tuning performance
1666
01:03:55,280 --> 01:03:57,600
while adding new domains as opportunities appear.
1667
01:03:57,600 --> 01:03:59,600
The entire journey from pilot to production
1668
01:03:59,600 --> 01:04:01,440
usually takes six to 12 months.
1669
01:04:01,440 --> 01:04:03,120
It isn't because the technology is slow
1670
01:04:03,120 --> 01:04:06,160
but because operationalization requires a high level of discipline.
1671
01:04:06,160 --> 01:04:09,360
You cannot skip phases or try to pilot everything at once.
1672
01:04:09,360 --> 01:04:11,760
You have to prove at each stage that the system works
1673
01:04:11,760 --> 01:04:13,280
that your team knows how to run it
1674
01:04:13,280 --> 01:04:16,080
and that the business impact justifies the next step.
1675
01:04:16,080 --> 01:04:18,400
The mindset shift moves from "can we build this?"
1676
01:04:18,400 --> 01:04:20,160
to "how do we run this?"
1677
01:04:20,160 --> 01:04:21,760
And that is where the real work happens.
1678
01:04:21,760 --> 01:04:23,280
The technology is the easy part.
1679
01:04:23,280 --> 01:04:26,240
What changes when agents are autonomous?
1680
01:04:26,240 --> 01:04:28,480
Everything changes the moment your agents start running
1681
01:04:28,480 --> 01:04:29,760
without permission gates.
1682
01:04:29,760 --> 01:04:32,640
Your operations transform not because the technology is different
1683
01:04:32,640 --> 01:04:34,320
but because the work distribution
1684
01:04:34,320 --> 01:04:37,280
and the organizational structure have to reorganize.
1685
01:04:37,280 --> 01:04:39,040
Think about your operational tempo.
1686
01:04:39,040 --> 01:04:40,720
Right now your IT team is reactive
1687
01:04:40,720 --> 01:04:43,040
because a problem happens, a user calls
1688
01:04:43,040 --> 01:04:45,120
and then someone diagnoses and fixes it.
1689
01:04:45,120 --> 01:04:47,040
This is a respond-to-crisis model
1690
01:04:47,040 --> 01:04:50,720
but with autonomous agents at scale, the game completely inverts.
1691
01:04:50,720 --> 01:04:53,520
Agents can detect that a device is drifting out of compliance
1692
01:04:53,520 --> 01:04:55,520
before the user even notices a problem.
1693
01:04:55,520 --> 01:04:58,240
They see patterns of failed logins from a specific location
1694
01:04:58,240 --> 01:04:59,840
and flag them for security
1695
01:04:59,840 --> 01:05:02,240
or they notice outdated applications across departments
1696
01:05:02,240 --> 01:05:04,080
and proactively push updates.
1697
01:05:04,080 --> 01:05:06,320
They watch your tenant health metrics constantly
1698
01:05:06,320 --> 01:05:08,560
and predict when you are going to hit a service limit.
1699
01:05:08,560 --> 01:05:11,520
This shift from reactive to proactive is preventive, meaning
1700
01:05:11,520 --> 01:05:14,000
problems get solved before they ever become outages.
1701
01:05:14,000 --> 01:05:16,560
This changes the types of fires that end up on your desk.
1702
01:05:16,560 --> 01:05:18,960
Instead of spending your day fighting active incidents
1703
01:05:18,960 --> 01:05:21,040
you spend it reviewing what the agents prevented.
1704
01:05:21,040 --> 01:05:23,600
Your monitoring doesn't show you crisis dashboards anymore
1705
01:05:23,600 --> 01:05:25,120
but instead shows you early warnings
1706
01:05:25,120 --> 01:05:26,960
and the agents' proposed fixes.
1707
01:05:26,960 --> 01:05:29,520
Instead of getting paged at midnight because something broke
1708
01:05:29,520 --> 01:05:31,360
you get an alert that something is about to break
1709
01:05:31,360 --> 01:05:34,080
and the agent is asking for approval to fix it ahead of time.
1710
01:05:34,080 --> 01:05:37,120
The organizational structure has to follow this shift.
1711
01:05:37,120 --> 01:05:38,880
Right now you have a lot of people doing level one work
1712
01:05:38,880 --> 01:05:40,080
like ticket triage,
1713
01:05:40,080 --> 01:05:42,240
password resets and access requests.
1714
01:05:42,240 --> 01:05:45,280
These roles exist because you need human judgment to decide
1715
01:05:45,280 --> 01:05:48,080
where a ticket goes and what the best path forward is.
1716
01:05:48,080 --> 01:05:50,960
Autonomous agents eliminate that entire category of work.
1717
01:05:50,960 --> 01:05:53,840
You don't need as many junior technicians doing routine resolution
1718
01:05:53,840 --> 01:05:56,160
but the people you do have must be more skilled.
1719
01:05:56,160 --> 01:05:58,400
You need architects to design what agents can do
1720
01:05:58,400 --> 01:06:00,800
and policy experts to define the guardrails.
1721
01:06:00,800 --> 01:06:03,040
You need people who can read agent behavior analytics
1722
01:06:03,040 --> 01:06:05,520
to determine if an agent is making good decisions.
1723
01:06:05,520 --> 01:06:07,440
When an agent struggles in a specific domain
1724
01:06:07,440 --> 01:06:09,840
you need someone who can actually debug the workflow.
1725
01:06:09,840 --> 01:06:12,000
The skills conversation changes entirely.
1726
01:06:12,000 --> 01:06:14,400
You stop hiring people who are just good at following procedures
1727
01:06:14,400 --> 01:06:18,080
and start hiring people who can design procedures for agents to execute.
1728
01:06:18,080 --> 01:06:20,960
You stop training people on how to reset a password in Exchange
1729
01:06:20,960 --> 01:06:23,040
and start training them on how to design an agent
1730
01:06:23,040 --> 01:06:25,680
that understands when a reset is the right answer.
1731
01:06:25,680 --> 01:06:29,440
You are moving from execution focused skills to design focused skills,
1732
01:06:29,440 --> 01:06:32,320
shifting from task specialists to system architects.
1733
01:06:32,320 --> 01:06:33,920
The cost structure inverts as well.
1734
01:06:33,920 --> 01:06:35,600
Right now you are paying salaries
1735
01:06:35,600 --> 01:06:38,000
and labor is your biggest operational expense.
1736
01:06:38,000 --> 01:06:39,680
As you move to autonomous agents
1737
01:06:39,680 --> 01:06:43,600
you are paying for compute, cloud infrastructure and API calls.
1738
01:06:43,600 --> 01:06:46,240
These are capital expenses, not labor expenses
1739
01:06:46,240 --> 01:06:48,880
and that matters because capital scales differently than people.
1740
01:06:48,880 --> 01:06:51,120
You can spin up more agents instantly
1741
01:06:51,120 --> 01:06:54,160
but you cannot hire a dozen good technicians overnight.
1742
01:06:54,160 --> 01:06:56,880
The constraint shifts from how many people you can hire
1743
01:06:56,880 --> 01:06:58,400
to what your cloud budget looks like
1744
01:06:58,400 --> 01:07:00,720
and whether your infrastructure can handle the throughput.
1745
01:07:00,720 --> 01:07:02,880
For some this is a dramatic cost reduction
1746
01:07:02,880 --> 01:07:04,560
while for others it is a cost shift
1747
01:07:04,560 --> 01:07:06,160
where you replace people with infrastructure.
1748
01:07:06,160 --> 01:07:07,520
Either way the economics are different.
1749
01:07:07,520 --> 01:07:10,880
The institutional knowledge problem gets solved and created at the same time.
1750
01:07:10,880 --> 01:07:12,800
Right now if your best technician leaves
1751
01:07:12,800 --> 01:07:14,640
you lose years of troubleshooting knowledge
1752
01:07:14,640 --> 01:07:16,240
and all the workarounds they've learned.
1753
01:07:16,240 --> 01:07:17,440
Their departure is painful
1754
01:07:17,440 --> 01:07:20,960
but with agents that knowledge is codified in the decision making logic.
1755
01:07:20,960 --> 01:07:22,960
When an agent escalates to a human
1756
01:07:22,960 --> 01:07:24,800
it isn't because it doesn't know what to do
1757
01:07:24,800 --> 01:07:27,360
but because the situation requires human judgment.
1758
01:07:27,360 --> 01:07:29,040
The knowledge doesn't walk out the door
1759
01:07:29,040 --> 01:07:30,800
but you do have a new problem.
1760
01:07:30,800 --> 01:07:32,400
The agent only knows what it knows
1761
01:07:32,400 --> 01:07:35,600
and updating that knowledge requires retraining or policy changes.
1762
01:07:35,600 --> 01:07:38,080
You can't just call a senior tech and ask them a question.
1763
01:07:38,080 --> 01:07:40,400
You have to change the model or the policy.
1764
01:07:40,400 --> 01:07:44,080
The speed to resolution changes your entire relationship with your users.
1765
01:07:44,080 --> 01:07:46,160
Problems get fixed in minutes instead of hours
1766
01:07:46,160 --> 01:07:49,600
and users start to see IT as a responsive service they can trust.
1767
01:07:49,600 --> 01:07:50,880
This isn't a small detail.
1768
01:07:50,880 --> 01:07:52,880
This is how you shift the perception of IT
1769
01:07:52,880 --> 01:07:55,520
from a bottleneck to a service that actually works.
1770
01:07:55,520 --> 01:07:58,080
It changes user behavior, ticket patterns
1771
01:07:58,080 --> 01:08:00,320
and how the business views technology as a whole.
1772
01:08:00,320 --> 01:08:03,680
The real opportunity underneath all of this is the human one.
1773
01:08:03,680 --> 01:08:07,200
Your team isn't spending their energy on repetitive tasks anymore
1774
01:08:07,200 --> 01:08:09,200
so they are finally free to work on things
1775
01:08:09,200 --> 01:08:10,960
that require real human thinking.
1776
01:08:10,960 --> 01:08:13,680
They can design better systems, understand business context
1777
01:08:13,680 --> 01:08:16,160
and strategize about how technology supports the company.
1778
01:08:16,160 --> 01:08:18,720
This is the best version of where technology should go.
1779
01:08:18,720 --> 01:08:20,240
It isn't about replacing humans
1780
01:08:20,240 --> 01:08:23,280
but about removing the drudgery so humans can do the thinking.
1781
01:08:23,280 --> 01:08:24,800
The competitive advantage.
1782
01:08:24,800 --> 01:08:26,560
The companies that get this right first
1783
01:08:26,560 --> 01:08:28,080
have something the others don't
1784
01:08:28,080 --> 01:08:30,320
and it's not just faster ticket resolution.
1785
01:08:30,320 --> 01:08:32,000
It's not just lower operational costs
1786
01:08:32,000 --> 01:08:32,960
though both of those matter.
1787
01:08:32,960 --> 01:08:35,680
The real advantage is that they're building a capability stack
1788
01:08:35,680 --> 01:08:37,120
that compounds over time
1789
01:08:37,120 --> 01:08:39,120
while everyone else is still writing if-then logic.
1790
01:08:39,120 --> 01:08:40,960
Here's how it actually breaks down.
1791
01:08:40,960 --> 01:08:44,160
If your organization moves to autonomous agents in 2026
1792
01:08:44,160 --> 01:08:46,800
you get a two to three year head start on everyone who waits.
1793
01:08:46,800 --> 01:08:48,960
This isn't because the technology is hard to copy.
1794
01:08:48,960 --> 01:08:51,600
Eventually other companies will build similar systems
1795
01:08:51,600 --> 01:08:53,600
but the organizations that start now
1796
01:08:53,600 --> 01:08:55,920
accumulate operational knowledge, feedback data
1797
01:08:55,920 --> 01:08:58,720
and policy patterns that the latecomers simply won't have.
1798
01:08:58,720 --> 01:09:01,280
Think about what happens during your first year operating agents.
1799
01:09:01,280 --> 01:09:03,440
You're not just running automation, you're learning.
1800
01:09:03,440 --> 01:09:05,600
You're seeing patterns in what kinds of tickets agents
1801
01:09:05,600 --> 01:09:07,840
handle beautifully and what kinds trip them up.
1802
01:09:07,840 --> 01:09:09,440
You're building institutional knowledge about
1803
01:09:09,440 --> 01:09:11,520
how your specific environment behaves
1804
01:09:11,520 --> 01:09:13,760
and you're refining policies dozens of times
1805
01:09:13,760 --> 01:09:15,440
based on real production feedback.
1806
01:09:15,440 --> 01:09:18,080
By the time the year is up you don't just have working agents.
1807
01:09:18,080 --> 01:09:20,240
You have a mature operational model.
1808
01:09:20,240 --> 01:09:23,760
An organization that starts this in 2027 or 2028
1809
01:09:23,760 --> 01:09:25,040
doesn't get to follow your path.
1810
01:09:25,040 --> 01:09:26,160
They have to chart their own.
1811
01:09:26,160 --> 01:09:28,240
They might avoid some of your early mistakes
1812
01:09:28,240 --> 01:09:31,200
but they don't get the benefit of your years of operational refinement.
1813
01:09:31,200 --> 01:09:34,160
They're starting from scratch while you're already optimized.
1814
01:09:34,160 --> 01:09:35,840
The second advantage is data.
1815
01:09:35,840 --> 01:09:38,560
Every ticket an agent resolves, every decision it makes
1816
01:09:38,560 --> 01:09:40,080
and every correction it receives.
1817
01:09:40,080 --> 01:09:41,040
That's training data.
1818
01:09:41,040 --> 01:09:43,040
Your agents get smarter over time
1819
01:09:43,040 --> 01:09:45,680
because they're operating in your actual environment
1820
01:09:45,680 --> 01:09:47,040
against your actual problems.
1821
01:09:47,040 --> 01:09:48,000
If you move fast,
1822
01:09:48,000 --> 01:09:50,400
you accumulate 18 months of production feedback
1823
01:09:50,400 --> 01:09:52,480
before competitors even launch pilots.
1824
01:09:52,480 --> 01:09:54,240
That feedback becomes a learning advantage.
1825
01:09:54,240 --> 01:09:56,000
Your agents understand your environment better
1826
01:09:56,000 --> 01:09:57,840
than any generic agent ever could
1827
01:09:57,840 --> 01:09:59,600
because they've seen your specific patterns,
1828
01:09:59,600 --> 01:10:01,360
your specific edge cases
1829
01:10:01,360 --> 01:10:03,200
and your specific business context.
1830
01:10:03,200 --> 01:10:05,520
The third advantage is organizational readiness.
1831
01:10:05,520 --> 01:10:07,600
By the time your industry is having conversations
1832
01:10:07,600 --> 01:10:09,920
about autonomous agents in 2027,
1833
01:10:09,920 --> 01:10:11,440
you've already retrained your team.
1834
01:10:11,440 --> 01:10:13,360
You've already figured out what skills matter.
1835
01:10:13,360 --> 01:10:15,120
You've already built the cultural capability
1836
01:10:15,120 --> 01:10:17,040
to work alongside autonomous systems.
1837
01:10:17,040 --> 01:10:18,400
You've already had the hard conversations
1838
01:10:18,400 --> 01:10:20,560
about what gets automated and what stays human.
1839
01:10:20,560 --> 01:10:23,600
Organizations coming later don't get that cultural head start.
1840
01:10:23,600 --> 01:10:25,440
They're still figuring out how to think about this
1841
01:10:25,440 --> 01:10:26,960
while you're already executing.
1842
01:10:26,960 --> 01:10:29,440
The fourth advantage is market consolidation.
1843
01:10:29,440 --> 01:10:30,800
In every technology transition,
1844
01:10:30,800 --> 01:10:33,120
the leaders capture disproportionate market share,
1845
01:10:33,120 --> 01:10:35,280
not because they have better technology (eventually
1846
01:10:35,280 --> 01:10:36,320
competitors catch up),
1847
01:10:36,320 --> 01:10:38,400
but because they hit escape velocity first.
1848
01:10:38,400 --> 01:10:40,240
They're solving customer problems faster.
1849
01:10:40,240 --> 01:10:42,000
They're cheaper, they're more reliable.
1850
01:10:42,000 --> 01:10:43,040
They start winning business.
1851
01:10:43,040 --> 01:10:46,480
By the time competitors launch similar solutions,
1852
01:10:46,480 --> 01:10:48,640
the leader has already grown their customer base,
1853
01:10:48,640 --> 01:10:50,080
built stronger partnerships,
1854
01:10:50,080 --> 01:10:51,840
and established market presence.
1855
01:10:51,840 --> 01:10:54,960
The technology is table stakes after the first mover advantage.
1856
01:10:54,960 --> 01:10:56,880
The market position is what lasts,
1857
01:10:56,880 --> 01:10:58,160
but there's also a risk
1858
01:10:58,160 --> 01:11:00,480
that makes this a necessity rather than an option.
1859
01:11:00,480 --> 01:11:03,040
Organizations that don't move toward autonomous agents
1860
01:11:03,040 --> 01:11:06,720
by 2027 or 2028 start falling behind operationally.
1861
01:11:06,720 --> 01:11:08,480
Not gradually, sharply.
1862
01:11:08,480 --> 01:11:10,560
Imagine your competitor has reduced their mean time
1863
01:11:10,560 --> 01:11:11,920
to resolution to 40 minutes,
1864
01:11:11,920 --> 01:11:13,600
while yours is still three hours.
1865
01:11:13,600 --> 01:11:17,360
Imagine they're handling 80% of routine tickets autonomously
1866
01:11:17,360 --> 01:11:18,640
while you're at 20%.
1867
01:11:18,640 --> 01:11:22,240
Imagine their IT operational cost per resolution is half of yours.
1868
01:11:22,240 --> 01:11:23,520
These aren't minor differences.
1869
01:11:23,520 --> 01:11:26,000
These are the differences that show up in your margins,
1870
01:11:26,000 --> 01:11:27,680
your customer satisfaction scores,
1871
01:11:27,680 --> 01:11:29,440
and your ability to compete.
1872
01:11:29,440 --> 01:11:32,080
This is why the shift from "automation is optional"
1873
01:11:32,080 --> 01:11:34,080
to "automation is strategic" matters.
1874
01:11:34,080 --> 01:11:35,520
For the last 15 years,
1875
01:11:35,520 --> 01:11:38,000
you could be competitive with decent scripting
1876
01:11:38,000 --> 01:11:39,920
and solid operational practices.
1877
01:11:39,920 --> 01:11:41,360
That era is ending.
1878
01:11:41,360 --> 01:11:44,080
The organizations that build autonomous agent capabilities
1879
01:11:44,080 --> 01:11:47,280
now will set the standard for what IT operations look like.
1880
01:11:47,280 --> 01:11:48,640
Everyone else will be playing catch-up.
1881
01:11:48,640 --> 01:11:50,800
Not because the technology is complex; it's not.
1882
01:11:50,800 --> 01:11:52,320
It's because the organizational capability
1883
01:11:52,320 --> 01:11:54,560
and operational knowledge compound over time.
1884
01:11:54,560 --> 01:11:57,280
You can't catch up on learning by throwing money at the problem.
1885
01:11:57,280 --> 01:11:58,640
You catch up by committing now.
1886
01:11:58,640 --> 01:12:00,560
The competitive advantage isn't permanent.
1887
01:12:00,560 --> 01:12:02,720
It's an 18-to-24-month window
1888
01:12:02,720 --> 01:12:04,800
where the first movers establish a lead.
1889
01:12:04,800 --> 01:12:07,520
After that, the technology is available to everyone.
1890
01:12:07,520 --> 01:12:08,960
But the organizations that move first
1891
01:12:08,960 --> 01:12:10,560
have already built the muscle memory.
1892
01:12:10,560 --> 01:12:12,160
They know what works in their environment.
1893
01:12:12,160 --> 01:12:13,440
They've optimized their workflows.
1894
01:12:13,440 --> 01:12:14,720
They've trained their teams.
1895
01:12:14,720 --> 01:12:17,360
They've embedded autonomous operations into their DNA.
1896
01:12:17,360 --> 01:12:19,040
That advantage takes years to overcome.
1897
01:12:19,040 --> 01:12:22,400
Static PowerShell scripts worked for a predictable world.
1898
01:12:22,400 --> 01:12:23,680
That world doesn't exist anymore.
1899
01:12:23,680 --> 01:12:25,120
Your enterprise is dynamic.
1900
01:12:25,120 --> 01:12:26,480
Your problems are complex.
1901
01:12:26,480 --> 01:12:29,680
And automation without reasoning is just repetition at high speed.
1902
01:12:29,680 --> 01:12:32,320
The shift to autonomous agents powered by Semantic Kernel
1903
01:12:32,320 --> 01:12:34,480
and Microsoft Graph isn't a technology change.
1904
01:12:34,480 --> 01:12:35,600
It's an operational one.
1905
01:12:35,600 --> 01:12:37,760
It's how IT operations work starting now.
1906
01:12:37,760 --> 01:12:40,400
Start building your first agent in the next quarter.
1907
01:12:40,400 --> 01:12:42,240
Pick a domain where you have high ticket volume
1908
01:12:42,240 --> 01:12:43,360
and low variation.
1909
01:12:43,360 --> 01:12:44,560
Get it running in production.
1910
01:12:44,560 --> 01:12:45,360
Measure what matters.
1911
01:12:45,360 --> 01:12:47,600
Use those results to expand.
1912
01:12:47,600 --> 01:12:49,600
The time to start isn't sometime next year.
1913
01:12:49,600 --> 01:12:50,160
It's now.
1914
01:12:50,160 --> 01:12:52,000
Your competitive advantage depends on it.

Founder of m365.fm, m365.show and m365con.net
Mirko Peters is a Microsoft 365 expert, content creator, and founder of m365.fm, a platform dedicated to sharing practical insights on modern workplace technologies. His work focuses on Microsoft 365 governance, security, collaboration, and real-world implementation strategies.
Through his podcast and written content, Mirko provides hands-on guidance for IT professionals, architects, and business leaders navigating the complexities of Microsoft 365. He is known for translating complex topics into clear, actionable advice, often highlighting common mistakes and overlooked risks in real-world environments.
With a strong emphasis on community contribution and knowledge sharing, Mirko is actively building a platform that connects experts, shares experiences, and helps organizations get the most out of their Microsoft 365 investments.









