The 10 Architectural Mandates That Stop Copilot Chaos
Most organizations think Copilot is just a helpful layer that writes drafts faster. That misunderstanding is exactly how silent data leaks, invented policies, and irreversible automation changes begin. This episode argues that Copilot is not a colleague or assistant at all, but a distributed decision engine built on Microsoft Graph that executes whatever boundaries you actually configure, not the intent you think you expressed. When leaders rely on casual prompts, implicit trust, or “user has access” as a boundary, Copilot faithfully compiles that ambiguity into behavior, pulling in overshared HR or legal data, inventing authoritative-sounding procedures, and triggering real system changes without consent. The core lesson is that probabilistic language models are safe only when confined to reasoning and drafting; the moment outputs drive decisions or actions, determinism, enforced scopes, refusal states, and citations become mandatory. The episode walks through real failure patterns, from quiet data leakage and confident fiction to automation that mutates live systems, and shows that these are not model errors but architectural ones. The takeaway is blunt: Copilot entropy grows by default. If you do not deliberately encode hard boundaries, source authority, separation between reasoning and execution, and strong observability, the system will operationalize your optimism. Control is not about better prompts, it is about designing the control plane so silence beats fiction and helpfulness cannot escape the fence you built.
Most organizations treat Copilot like a helpful feature. That assumption is the root cause of nearly every Copilot incident. In reality, Copilot is a distributed decision engine riding Microsoft Graph, compiling intent, permissions, and ambiguity into real actions. When boundaries aren’t encoded, ambiguity becomes policy. In this episode, we move past theory and features and lay out ten enforceable architectural mandates that turn Copilot from a chaos amplifier into a governed control plane. This is a masterclass for architects, security leaders, and operators who own the blast radius when Copilot goes wrong.
What This Episode Delivers
- A clear explanation of why Copilot failures are architectural, not model errors
- The single misunderstanding that creates data leakage, hallucinated authority, and irreversible automation
- A practical control pattern you can implement immediately
- Ten mandates that convert intent into enforceable design
- A red-flag test to identify Copilot chaos before the incident ticket arrives
This is not a tour of Copilot features. It’s a system-level blueprint for controlling them.
The Core Insight
Copilot is not a colleague or assistant. It is a control plane component.
It does not ask clarifying questions.
It evaluates the state you designed and executes inside it. If intent is not encoded in scopes, identities, gates, and refusals, Copilot will faithfully compile ambiguity into behavior. Confidently. At scale.
The 10 Architectural Mandates (High-Level)
- Define the System, Not the Feature – Name the control plane you’re operating.
- Boundaries First – Constrain Graph scope before writing prompts.
- Structured Output or Nothing – Prose drafts are safe; actions require schemas.
- Separate Reasoning from Execution – Reason → Plan → Gate → Execute. Always.
- Authority Gating – No citations, no answers. Truth or silence.
- Explicit State – Session contracts and visible context ledgers only.
- Observability, Budgets, and Drift – Cost is a security signal.
- Identity & Least Privilege – Agents are roles, not people.
- Teams & Outlook Controls – Conversation is a high-risk edge.
- Power Automate Guardrails – Where hallucinations become incidents.
Each mandate is tied directly to real failure modes already showing up in enterprises: silent data leakage, confidently wrong decisions, unauthorized automation, false trust from “memory,” and runaway cost.
Who This Episode Is For
- Enterprise architects and platform owners
- Security, identity, and governance teams
- Copilot Studio and Power Automate builders
- Leaders accountable for compliance, audit, and incident response
If you are responsible for outcomes, not demos, this episode is for you.
Key Takeaway
Copilot does not create chaos. Unencoded intent does.
Acceleration is easy. Control requires architecture.
Encode the boundaries.
Gate authority.
Separate thinking from doing.
Instrument everything.
That’s how you stop Copilot chaos without slowing the business.
1
00:00:00,000 --> 00:00:02,360
Most people treat Copilot as a helpful feature layer.
2
00:00:02,360 --> 00:00:03,200
It is not.
3
00:00:03,200 --> 00:00:05,920
It's a distributed decision engine sitting on top of Microsoft Graph
4
00:00:05,920 --> 00:00:09,500
that amplifies whatever you encode, intent or ambiguity.
5
00:00:09,500 --> 00:00:11,700
Get this wrong and you don't get a harmless draft.
6
00:00:11,700 --> 00:00:14,540
You get discoverable leakage, confidently wrong actions,
7
00:00:14,540 --> 00:00:16,780
and incident tickets that arrive before the logs.
8
00:00:16,780 --> 00:00:20,480
Legal, security, and IT will firefight silent failures at scale.
9
00:00:20,480 --> 00:00:23,220
In this masterclass, I'll give you one control pattern,
10
00:00:23,220 --> 00:00:25,320
ten enforceable mandates, and a red flag test
11
00:00:25,320 --> 00:00:26,520
you can run tomorrow.
12
00:00:26,520 --> 00:00:28,000
This isn't a tour of features.
13
00:00:28,000 --> 00:00:30,520
It's a control system masterclass for the people accountable
14
00:00:30,520 --> 00:00:31,920
when Copilot goes wrong.
15
00:00:31,920 --> 00:00:32,440
Pause here.
16
00:00:32,440 --> 00:00:33,480
You are at the starting line.
17
00:00:33,480 --> 00:00:35,560
Next, we name the single misunderstanding
18
00:00:35,560 --> 00:00:37,720
that creates every downstream failure,
19
00:00:37,720 --> 00:00:41,240
and then we map exactly how those failures show up.
20
00:00:41,240 --> 00:00:43,040
The foundational misunderstanding.
21
00:00:43,040 --> 00:00:44,720
Copilot is not a colleague.
22
00:00:44,720 --> 00:00:46,160
Let's set the stake in the ground.
23
00:00:46,160 --> 00:00:49,560
Copilot is not a colleague, an assistant, or a junior analyst.
24
00:00:49,560 --> 00:00:51,800
Architecturally, it is a control plane component
25
00:00:51,800 --> 00:00:54,680
riding Microsoft Graph, compiling decisions in real time.
26
00:00:54,680 --> 00:00:56,520
It does not simply generate drafts.
27
00:00:56,520 --> 00:00:59,240
It routes, grounds, and executes across your estate
28
00:00:59,240 --> 00:01:01,400
according to whatever boundaries and refusal states
29
00:01:01,400 --> 00:01:02,960
you actually enforce.
30
00:01:02,960 --> 00:01:04,840
That distinction matters because a colleague
31
00:01:04,840 --> 00:01:06,440
will ask for clarification.
32
00:01:06,440 --> 00:01:09,840
Copilot will evaluate the state you designed and act inside it.
33
00:01:09,840 --> 00:01:12,800
If your intent is not encoded, ambiguity becomes policy.
34
00:01:12,800 --> 00:01:15,760
This is the only theory you need. Everything else is enforcement.
35
00:01:15,760 --> 00:01:17,680
The foundational mistake is framing Copilot
36
00:01:17,680 --> 00:01:19,320
as a conversational helper.
37
00:01:19,320 --> 00:01:21,080
That framing invites casual prompting,
38
00:01:21,080 --> 00:01:23,800
unbounded context, and implicit trust in memory.
39
00:01:23,800 --> 00:01:26,720
In reality, you're operating a distributed decision engine.
40
00:01:26,720 --> 00:01:28,120
Each request becomes a program.
41
00:01:28,120 --> 00:01:30,640
Retrieve data through Graph scopes, apply instructions,
42
00:01:30,640 --> 00:01:32,800
call tools, and optionally write back.
43
00:01:32,800 --> 00:01:35,560
The engine compiles your configuration into behavior.
44
00:01:35,560 --> 00:01:37,120
It doesn't care about your intent slides.
45
00:01:37,120 --> 00:01:39,800
It cares about scopes, connectors, labels, and gates.
46
00:01:39,800 --> 00:01:42,680
Now, let's bring in determinism versus probability.
47
00:01:42,680 --> 00:01:45,280
Probabilistic language models are fine for drafting,
48
00:01:45,280 --> 00:01:47,120
brainstorming, and synthesis.
49
00:01:47,120 --> 00:01:50,240
You can tolerate variation when the output is a paragraph.
50
00:01:50,240 --> 00:01:53,080
You cannot tolerate variation when the paragraph is a database
51
00:01:53,080 --> 00:01:54,080
write.
52
00:01:54,080 --> 00:01:56,560
Execution requires determinism, clear boundaries,
53
00:01:56,560 --> 00:01:58,480
structured outputs, and a refusal model
54
00:01:58,480 --> 00:02:00,160
that prefers silence over fiction.
55
00:02:00,160 --> 00:02:02,280
Every place you allow unconstrained free text
56
00:02:02,280 --> 00:02:04,920
without structure, you convert a deterministic execution
57
00:02:04,920 --> 00:02:06,240
path into a probabilistic one.
58
00:02:06,240 --> 00:02:08,000
That's where conditional chaos starts.
59
00:02:08,000 --> 00:02:09,640
Before we continue, you need to understand
60
00:02:09,640 --> 00:02:11,280
the environment you already run.
61
00:02:11,280 --> 00:02:14,320
Teams conversations feed Graph, Outlook bodies feed Graph,
62
00:02:14,320 --> 00:02:16,760
SharePoint and OneDrive inventories feed Graph.
63
00:02:16,760 --> 00:02:19,440
Power Automate turns prompts into actions.
64
00:02:19,440 --> 00:02:21,360
The system did exactly what you allowed.
65
00:02:21,360 --> 00:02:24,520
It inherited your permissions model and your oversharing.
66
00:02:24,520 --> 00:02:27,080
If your tenant has "Everyone except external users" libraries
67
00:02:27,080 --> 00:02:29,680
with HR files, Copilot doesn't break in.
68
00:02:29,680 --> 00:02:31,800
It walks through the front door you left open.
69
00:02:31,800 --> 00:02:33,800
Intent versus configuration is the gap
70
00:02:33,800 --> 00:02:35,120
that creates incidents.
71
00:02:35,120 --> 00:02:38,280
Leaders say, "Only summarize project conversations."
72
00:02:38,280 --> 00:02:40,560
But the configuration says "user has access,"
73
00:02:40,560 --> 00:02:42,240
so content is in scope.
74
00:02:42,240 --> 00:02:46,000
Leaders assume "summarize this channel" means just this channel.
75
00:02:46,000 --> 00:02:48,000
The authorization engine hears "everything
76
00:02:48,000 --> 00:02:49,440
the caller can read."
77
00:02:49,440 --> 00:02:53,080
You think you drew a boundary. You didn't. You described one.
78
00:02:53,080 --> 00:02:55,240
Let me make the psychological trap explicit.
79
00:02:55,240 --> 00:02:58,800
Because Copilot speaks confidently, people assume authority.
80
00:02:58,800 --> 00:03:01,760
That voice invites obedience under time pressure.
81
00:03:01,760 --> 00:03:04,000
When Copilot cites a nonexistent policy,
82
00:03:04,000 --> 00:03:05,400
it's not a quality problem.
83
00:03:05,400 --> 00:03:06,680
It's a control problem.
84
00:03:06,680 --> 00:03:09,160
No authoritative source gating, no refusal state,
85
00:03:09,160 --> 00:03:11,000
no citations. Therefore, confident fiction
86
00:03:11,000 --> 00:03:12,240
becomes operational truth.
87
00:03:12,240 --> 00:03:13,840
That's why this is the most dangerous failure
88
00:03:13,840 --> 00:03:15,600
psychologically, not technically.
89
00:03:15,600 --> 00:03:17,120
People follow text that sounds right.
90
00:03:17,120 --> 00:03:19,360
This is also where memory illusions creep in.
91
00:03:19,360 --> 00:03:22,360
Users project human continuity onto a stateless front paired
92
00:03:22,360 --> 00:03:23,440
with stateful caches.
93
00:03:23,440 --> 00:03:25,920
They expect "remember when I said" to work across sessions
94
00:03:25,920 --> 00:03:26,480
and channels.
95
00:03:26,480 --> 00:03:29,680
It does not, unless you design explicit state contracts
96
00:03:29,680 --> 00:03:32,480
with visible context ledgers and reset semantics.
97
00:03:32,480 --> 00:03:34,480
Otherwise, prompts drift, embeddings linger,
98
00:03:34,480 --> 00:03:37,960
and outcomes vary. Trust decays, outputs diverge,
99
00:03:37,960 --> 00:03:40,360
users compensate by overprompting, which widens
100
00:03:40,360 --> 00:03:41,320
the blast radius.
101
00:03:41,320 --> 00:03:43,360
The thing most people miss: Graph scoping
102
00:03:43,360 --> 00:03:45,560
and sensitivity labels are not suggestions.
103
00:03:45,560 --> 00:03:47,480
They are the only hard edges you have.
104
00:03:47,480 --> 00:03:49,760
A label with no enforced policy is a sticker.
105
00:03:49,760 --> 00:03:51,960
A scope with no allow list is an invitation.
106
00:03:51,960 --> 00:03:53,680
Default deny is not a slogan.
107
00:03:53,680 --> 00:03:55,680
It's the difference between bounded retrieval
108
00:03:55,680 --> 00:03:57,200
and lateral discovery.
109
00:03:57,200 --> 00:03:59,280
You don't fix oversharing with prompt instructions.
110
00:03:59,280 --> 00:04:02,360
You fix oversharing with identity, labels, and access reviews.
111
00:04:02,360 --> 00:04:04,480
Then you let Copilot inherit the safer world.
112
00:04:04,480 --> 00:04:06,640
This clicked for me when I started thinking of Copilot
113
00:04:06,640 --> 00:04:08,480
as an authorization compiler.
114
00:04:08,480 --> 00:04:10,640
Every except clause, every broad connector,
115
00:04:10,640 --> 00:04:12,760
every wildcard scope adds branches.
116
00:04:12,760 --> 00:04:14,640
Those branches become untested code paths
117
00:04:14,640 --> 00:04:16,680
that users will inevitably execute
118
00:04:16,680 --> 00:04:18,080
the moment they're rushed.
119
00:04:18,080 --> 00:04:20,800
Over time, policies drift away from intent.
120
00:04:20,800 --> 00:04:21,920
Exceptions accumulate.
121
00:04:21,920 --> 00:04:23,880
These pathways don't disappear.
122
00:04:23,880 --> 00:04:25,200
They accumulate.
123
00:04:25,200 --> 00:04:27,000
Here's the shortcut nobody teaches.
124
00:04:27,000 --> 00:04:29,720
Separate reasoning from execution by design.
125
00:04:29,720 --> 00:04:31,800
Even before we get to the formal pattern,
126
00:04:31,800 --> 00:04:34,840
force structure on outputs (tables, JSON, schemas)
127
00:04:34,840 --> 00:04:36,880
so downstream actions can be deterministic.
128
00:04:36,880 --> 00:04:38,360
Require citations for any claim
129
00:04:38,360 --> 00:04:39,880
that could drive a decision.
130
00:04:39,880 --> 00:04:42,080
Prefer no answer over hallucinated detail.
131
00:04:42,080 --> 00:04:44,000
If you remember nothing else, remember this.
132
00:04:44,000 --> 00:04:45,880
Silence beats fiction every time.
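To make "structure or nothing" concrete, here is a minimal Python sketch of a parser that sits between model output and any downstream action. The action names, the field names (`action`, `target`, `citations`), and the `parse_action` helper are all assumptions made for illustration; they are not part of Copilot, Copilot Studio, or Power Automate.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical allow list of actions a downstream flow is permitted to run.
ALLOWED_ACTIONS = {"create_ticket", "send_summary"}


@dataclass(frozen=True)
class ProposedAction:
    action: str
    target: str
    citations: tuple


def parse_action(raw: str) -> Optional[ProposedAction]:
    """Return a validated action, or None (refuse) when the output is not well-formed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # free text is never executed
    if not isinstance(data, dict):
        return None
    action = data.get("action")
    target = data.get("target")
    citations = data.get("citations")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return None  # unknown or malformed action
    if not isinstance(target, str) or not target:
        return None
    if not isinstance(citations, list) or not citations:
        return None  # no citation, no action
    if not all(isinstance(c, str) and c for c in citations):
        return None
    return ProposedAction(action, target, tuple(citations))
```

The design choice is that every failure path returns `None` rather than a best-effort guess, so the caller has exactly two states to handle: a validated action or a refusal.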
133
00:04:45,880 --> 00:04:47,600
Real-world impact: misframe this,
134
00:04:47,600 --> 00:04:50,720
and you get five anchor failures we'll map precisely to mandates.
135
00:04:50,720 --> 00:04:52,680
You get Teams summaries that quietly ingest
136
00:04:52,680 --> 00:04:55,400
HR and legal content because "user has access"
137
00:04:55,400 --> 00:04:56,680
wasn't treated as a boundary.
138
00:04:56,680 --> 00:04:59,080
You get invented procedures executed like policy
139
00:04:59,080 --> 00:05:01,120
because no authority gate existed.
140
00:05:01,120 --> 00:05:03,680
You get Power Automate flows that modify records
141
00:05:03,680 --> 00:05:05,720
without human consent because planning and execution
142
00:05:05,720 --> 00:05:06,560
were fused.
143
00:05:06,560 --> 00:05:09,080
You get trust collapses from memory illusions.
144
00:05:09,080 --> 00:05:11,360
And you get cost spikes that reveal invisible agents
145
00:05:11,360 --> 00:05:12,840
running without telemetry.
146
00:05:12,840 --> 00:05:14,600
This is not about model unpredictability.
147
00:05:14,600 --> 00:05:16,400
It's about architectural responsibility.
148
00:05:16,400 --> 00:05:18,080
Copilot entropy grows by default.
149
00:05:18,080 --> 00:05:19,760
Determinism only exists by design.
150
00:05:19,760 --> 00:05:21,320
You own the boundaries or you don't.
151
00:05:21,320 --> 00:05:22,720
You own refusal or you don't.
152
00:05:22,720 --> 00:05:25,800
You encode intent or the engine compiles your ambiguity
153
00:05:25,800 --> 00:05:26,720
into behavior.
154
00:05:26,720 --> 00:05:30,240
From here on, we stop talking theory and enforce it.
155
00:05:30,240 --> 00:05:34,240
Anchor failure one: Graph overreach, silent data leakage.
156
00:05:34,240 --> 00:05:35,960
Start with a scene you've already lived.
157
00:05:35,960 --> 00:05:38,080
Someone asks Copilot in Teams,
158
00:05:38,080 --> 00:05:40,640
"Summarize last week's discussion on the rollout."
159
00:05:40,640 --> 00:05:42,120
The summary arrives fast, polished,
160
00:05:42,120 --> 00:05:44,000
and on the surface useful.
161
00:05:44,000 --> 00:05:45,920
A few hours later, a manager replies,
162
00:05:45,920 --> 00:05:48,000
"Why is HR remediation language in this?
163
00:05:48,000 --> 00:05:49,160
Who authorized that?"
164
00:05:49,160 --> 00:05:49,960
Nobody did.
165
00:05:49,960 --> 00:05:51,520
The engine compiled your estate.
166
00:05:51,520 --> 00:05:53,240
It followed scopes, not intentions.
167
00:05:53,240 --> 00:05:56,160
"User has access" became "content is in scope."
168
00:05:56,160 --> 00:05:58,360
Your summary didn't leak because Copilot broke in.
169
00:05:58,360 --> 00:06:01,480
It leaked because your boundary never existed in configuration.
170
00:06:01,480 --> 00:06:02,920
This is the uncomfortable truth.
171
00:06:02,920 --> 00:06:05,080
Microsoft Graph is not a polite librarian.
172
00:06:05,080 --> 00:06:06,960
It is an index and retrieval surface
173
00:06:06,960 --> 00:06:09,240
that honors the effective permissions of the caller.
174
00:06:09,240 --> 00:06:12,200
If your Teams channel sits alongside a SharePoint library
175
00:06:12,200 --> 00:06:14,920
that's broadly shared ("Everyone except external users")
176
00:06:14,920 --> 00:06:17,000
and that library happens to contain HR
177
00:06:17,000 --> 00:06:18,960
or legal documents with weak labeling,
178
00:06:18,960 --> 00:06:20,720
the retrieval step will see them.
179
00:06:20,720 --> 00:06:23,000
The model does not need to be clever to overreach.
180
00:06:23,000 --> 00:06:25,320
It only needs your authorization layer to be generous
181
00:06:25,320 --> 00:06:26,840
and your scoping to be unbounded.
182
00:06:26,840 --> 00:06:28,280
That's why this failure is silent.
183
00:06:28,280 --> 00:06:32,440
There's no denied access event, no firewall log, no obvious tripwire.
184
00:06:32,440 --> 00:06:36,040
The content was allowed, therefore, retrieval was compliant.
185
00:06:36,040 --> 00:06:38,440
The only alarm is the downstream human
186
00:06:38,440 --> 00:06:41,960
who recognizes language that never belonged in a project recap.
187
00:06:41,960 --> 00:06:44,160
By then, the damage is already discoverable.
188
00:06:44,160 --> 00:06:47,320
That summary can be forwarded, filed or subpoenaed.
189
00:06:47,320 --> 00:06:50,120
Passive leakage looks harmless until it's exhibit material.
190
00:06:50,120 --> 00:06:51,800
The root cause is architectural:
191
00:06:51,800 --> 00:06:54,480
relying on "user has access" as a boundary.
192
00:06:54,480 --> 00:06:57,360
That phrase describes a capability, not a constraint.
193
00:06:57,360 --> 00:07:00,320
In a distributed decision engine, capability expands context.
194
00:07:00,320 --> 00:07:02,960
Context expands surface, surface expands risk.
195
00:07:02,960 --> 00:07:05,080
When you leave Graph scopes unconstrained,
196
00:07:05,080 --> 00:07:08,680
with no connector allow lists, no retrieval filters tied to sensitivity,
197
00:07:08,680 --> 00:07:11,240
and no default deny on high-risk sources,
198
00:07:11,240 --> 00:07:14,360
you hand the engine a map without borders and tell it to be helpful.
199
00:07:14,360 --> 00:07:14,920
It is.
200
00:07:14,920 --> 00:07:15,680
Excessively.
201
00:07:15,680 --> 00:07:17,960
Sensitivity labels get misunderstood here.
202
00:07:17,960 --> 00:07:20,520
A label that isn't enforced in policy is a sticker.
203
00:07:20,520 --> 00:07:22,200
If your policy doesn't drive deny,
204
00:07:22,200 --> 00:07:24,840
redact or refusal behavior at retrieval time,
205
00:07:24,840 --> 00:07:26,440
the label won't stop inclusion.
206
00:07:26,440 --> 00:07:27,720
Inheritance matters as well.
207
00:07:27,720 --> 00:07:29,520
If a parent site is broadly readable
208
00:07:29,520 --> 00:07:31,960
and a child library contains confidential HR,
209
00:07:31,960 --> 00:07:34,800
any weak break in inheritance means the Graph index reflects
210
00:07:34,800 --> 00:07:36,400
that effective openness.
211
00:07:36,400 --> 00:07:37,880
Copilot doesn't reason about your intent
212
00:07:37,880 --> 00:07:39,640
to keep HR out of project summaries.
213
00:07:39,640 --> 00:07:41,800
It reasons about your tenant's access graph.
214
00:07:41,800 --> 00:07:44,760
The thing most people miss: oversharing precedes overreach.
215
00:07:44,760 --> 00:07:48,440
Teams, OneDrive, SharePoint: years of share-to-collaborate
216
00:07:48,440 --> 00:07:50,880
produce islands of content where owners have changed,
217
00:07:50,880 --> 00:07:54,000
stewardship is unclear, and expiration never existed.
218
00:07:54,000 --> 00:07:56,160
Add Copilot and those islands become tributaries
219
00:07:56,160 --> 00:07:58,640
into every helpful answer. You didn't create a new leak;
220
00:07:58,640 --> 00:07:59,720
you created a new pump.
221
00:07:59,720 --> 00:08:01,000
So what prevents it?
222
00:08:01,000 --> 00:08:02,840
Hard edges. Boundaries first:
223
00:08:02,840 --> 00:08:05,840
enforce default deny for data sources by risk class.
224
00:08:05,840 --> 00:08:09,240
If a source contains HR, legal, finance, or regulated data,
225
00:08:09,240 --> 00:08:12,720
it is opt-in by exception with expiry, not opt-out by oversight.
226
00:08:12,720 --> 00:08:16,280
Constrain Graph scopes with allow lists per agent and per channel.
227
00:08:16,280 --> 00:08:18,480
In Teams, configure summarization boundaries
228
00:08:18,480 --> 00:08:20,920
to exclude private channels and sensitive domains.
229
00:08:20,920 --> 00:08:23,200
In Outlook, require label-aware summarization
230
00:08:23,200 --> 00:08:24,600
with redaction or refusal.
231
00:08:24,600 --> 00:08:27,520
If the retrieval step can't prove the source is in scope,
232
00:08:27,520 --> 00:08:30,720
the answer is no content, not best effort.
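The default-deny rule described here can be sketched as a small retrieval filter. This is an illustrative Python sketch only: the `Source` type, the risk-class names, the site identifiers, and the exception table with expiry dates are assumptions, not a Microsoft Graph or Purview API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Source:
    site: str        # hypothetical site identifier
    risk_class: str  # e.g. "general", "hr", "legal"


# Risk classes that are opt-in by exception only (assumed names).
HIGH_RISK = {"hr", "legal", "finance", "regulated"}


def in_scope(source: Source, allow_list: FrozenSet[str],
             exceptions: Dict[str, date], today: date) -> bool:
    """Default deny: a source is in scope only if it can be proven to be."""
    if source.site not in allow_list:
        return False
    if source.risk_class in HIGH_RISK:
        # High-risk sources need an explicit, unexpired exception.
        expiry = exceptions.get(source.site)
        return expiry is not None and today <= expiry
    return True


def ground(sources: List[Source], allow_list: FrozenSet[str],
           exceptions: Dict[str, date], today: date) -> List[Source]:
    """Return only the sources the agent may ground on; an empty list means no content."""
    return [s for s in sources if in_scope(s, allow_list, exceptions, today)]
```

An empty result is a legitimate outcome in this sketch: the caller is expected to answer "no content" rather than widen the search.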
233
00:08:30,720 --> 00:08:34,640
Next, enforce sensitivity with effect, not aspiration.
234
00:08:34,640 --> 00:08:37,640
Labels must trigger behavior in Copilot interactions.
235
00:08:37,640 --> 00:08:40,800
Deny content with specific labels from grounding.
236
00:08:40,800 --> 00:08:43,280
Force redaction of detected sensitive types
237
00:08:43,280 --> 00:08:46,800
and log any attempted inclusion in Activity Explorer.
238
00:08:46,800 --> 00:08:48,800
Use Data Security Posture Management
239
00:08:48,800 --> 00:08:50,880
to find where sensitive data is concentrated
240
00:08:50,880 --> 00:08:53,320
and where it's being referenced by Copilot interactions.
241
00:08:53,320 --> 00:08:56,280
If you see sensitive interactions rising in harmless apps,
242
00:08:56,280 --> 00:08:57,600
that's not an adoption victory.
243
00:08:57,600 --> 00:09:00,120
It's an exposure metric. Identity is your third lever.
244
00:09:00,120 --> 00:09:02,680
The caller's context determines blast radius.
245
00:09:02,680 --> 00:09:04,960
Separate agent identities from humans.
246
00:09:04,960 --> 00:09:08,520
Never let human makers' broad read rights flow into agent runs.
247
00:09:08,520 --> 00:09:10,720
Use Entra Conditional Access with authentication context,
248
00:09:10,720 --> 00:09:14,680
so high-risk users, elevated insiders, departing employees,
249
00:09:14,680 --> 00:09:16,520
operate under tighter Copilot constraints
250
00:09:16,520 --> 00:09:18,840
or are blocked from sensitive scopes entirely.
251
00:09:18,840 --> 00:09:20,800
Least privilege is not a governance slogan.
252
00:09:20,800 --> 00:09:24,000
It's how you shrink what the engine can see before it synthesizes.
253
00:09:24,000 --> 00:09:25,760
Then there's observability.
254
00:09:25,760 --> 00:09:27,640
Leakage is silent until you make it loud.
255
00:09:27,640 --> 00:09:30,440
Budget costs to telemetry, not to usage.
256
00:09:30,440 --> 00:09:32,480
Require prompt-response audit trails,
257
00:09:32,480 --> 00:09:34,840
label-aware metrics, and anomaly alerts:
258
00:09:34,840 --> 00:09:37,200
Sudden jumps in sensitive label mentions,
259
00:09:37,200 --> 00:09:39,360
cross-domain co-occurrence you didn't expect,
260
00:09:39,360 --> 00:09:41,840
or Teams summaries citing SharePoint paths
261
00:09:41,840 --> 00:09:44,520
outside the project boundary.
262
00:09:44,520 --> 00:09:46,760
Treat cost spikes as smoke.
263
00:09:46,760 --> 00:09:48,440
Invisible agents, shadow prompts,
264
00:09:48,440 --> 00:09:51,600
and uncontrolled execution often show up as tokens first,
265
00:09:51,600 --> 00:09:52,880
incidents later.
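One way to treat cost as a security signal is a simple per-agent spike check over daily token counts. The sketch below is an assumption-laden illustration: the input shape, the `factor` threshold of 3, and the `cost_spikes` helper are invented for this example, and real telemetry would come from whatever usage reporting your tenant exposes.

```python
from statistics import median
from typing import Dict, List


def cost_spikes(daily_tokens: Dict[str, List[int]],
                factor: float = 3.0, min_history: int = 3) -> List[str]:
    """Flag agents whose latest daily token count far exceeds their own baseline.

    Agents with too little history are flagged too: an agent with no
    baseline is treated as an agent running without telemetry.
    """
    flagged = []
    for agent, series in daily_tokens.items():
        if not series:
            continue  # nothing recorded, nothing to compare
        history, latest = series[:-1], series[-1]
        if len(history) < min_history:
            flagged.append(agent)
        elif latest > factor * median(history):
            flagged.append(agent)
    return sorted(flagged)
```

Using the median of the agent's own history keeps one earlier spike from inflating the baseline; the threshold itself is a tuning choice, not a recommendation.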
266
00:09:52,880 --> 00:09:54,240
Common anti-patterns:
267
00:09:54,240 --> 00:09:57,160
Relying on prompt instructions like "only summarize this channel"
268
00:09:57,160 --> 00:09:58,440
and believing that's a boundary.
269
00:09:58,440 --> 00:10:00,320
It is not. Hoping sensitivity labels
270
00:10:00,320 --> 00:10:03,880
without corresponding DLP or Purview enforcement will self-police.
271
00:10:03,880 --> 00:10:04,880
They do not.
272
00:10:04,880 --> 00:10:07,440
Enabling org-wide access for convenience and trusting
273
00:10:07,440 --> 00:10:10,000
that people won't place HR content there. They will.
274
00:10:10,000 --> 00:10:12,480
And publishing agent experiences without connector allow lists
275
00:10:12,480 --> 00:10:14,080
because "we'll tighten it later."
276
00:10:14,080 --> 00:10:14,760
You won't.
277
00:10:14,760 --> 00:10:16,200
Entropy always wins unopposed.
278
00:10:16,200 --> 00:10:19,200
If this sounds heavy, it's because execution is heavy.
279
00:10:19,200 --> 00:10:21,000
But the controls are straightforward
280
00:10:21,000 --> 00:10:24,360
when you treat Copilot as a control plane, not a chatbot.
281
00:10:24,360 --> 00:10:26,320
Define the system, constrain the surfaces,
282
00:10:26,320 --> 00:10:28,280
enforce sensitivity with outcomes,
283
00:10:28,280 --> 00:10:30,520
separate identities, instrument aggressively.
284
00:10:30,520 --> 00:10:32,200
Then when someone asks for a summary,
285
00:10:32,200 --> 00:10:33,920
the engine compiles your boundaries,
286
00:10:33,920 --> 00:10:36,160
not your optimism, into behavior.
287
00:10:36,160 --> 00:10:38,840
Remember, Copilot did not overreach. Your estate did.
288
00:10:38,840 --> 00:10:40,600
You're not misconfiguring Entra;
289
00:10:40,600 --> 00:10:42,200
you are under-designing your graph.
290
00:10:42,200 --> 00:10:45,200
Bound the context and the helpful answer stays inside the fence.
291
00:10:45,200 --> 00:10:48,680
Leave it open and the quiet leak becomes a record.
292
00:10:48,680 --> 00:10:51,320
Anchor failure two: hallucinated authority,
293
00:10:51,320 --> 00:10:52,920
confidently wrong action.
294
00:10:52,920 --> 00:10:54,120
You've seen this one.
295
00:10:54,120 --> 00:10:55,400
A project lead asks,
296
00:10:55,400 --> 00:10:57,960
"What's our approval path for vendor onboarding?"
297
00:10:57,960 --> 00:11:01,160
Copilot replies, cleanly formatted, confident tone:
298
00:11:01,160 --> 00:11:03,720
"Per section 4.2 of the procurement policy,
299
00:11:03,720 --> 00:11:06,760
Finance signs off after security due diligence.
300
00:11:06,760 --> 00:11:09,560
Exceptions require VP approval."
301
00:11:09,560 --> 00:11:11,320
It cites nothing. Under time pressure,
302
00:11:11,320 --> 00:11:13,480
the lead forwards it and moves on.
303
00:11:13,480 --> 00:11:15,080
An hour later, security asks,
304
00:11:15,080 --> 00:11:17,360
"What policy is section 4.2?"
305
00:11:17,360 --> 00:11:19,240
It doesn't exist. The text sounded right,
306
00:11:19,240 --> 00:11:20,760
therefore people obeyed it.
307
00:11:20,760 --> 00:11:23,560
This is the most dangerous failure psychologically,
308
00:11:23,560 --> 00:11:24,400
not technically.
309
00:11:24,400 --> 00:11:26,400
The harm doesn't come from model inaccuracy,
310
00:11:26,400 --> 00:11:29,440
it comes from human obedience to confident prose under deadline.
311
00:11:29,440 --> 00:11:31,440
The system's voice carries implied authority.
312
00:11:31,440 --> 00:11:33,760
In organizations trained to trust official tone,
313
00:11:33,760 --> 00:11:35,920
credible phrasing is a control surface.
314
00:11:35,920 --> 00:11:37,200
Absent a gate,
315
00:11:37,200 --> 00:11:39,200
that surface becomes the policy.
316
00:11:39,200 --> 00:11:41,600
Root cause: no authoritative source gating,
317
00:11:41,600 --> 00:11:44,800
no refusal state, no citations that bind answers to canon.
318
00:11:44,800 --> 00:11:47,600
Without a source registry and a truth-or-silence rule,
319
00:11:47,600 --> 00:11:50,560
you've built an idea generator inside an execution environment.
320
00:11:50,560 --> 00:11:52,800
People will operationalize plausible fiction.
321
00:11:52,800 --> 00:11:53,840
They won't intend to.
322
00:11:53,840 --> 00:11:55,600
They'll be late, overloaded,
323
00:11:55,600 --> 00:11:57,920
and a sentence that sounds like a manual will win.
324
00:11:57,920 --> 00:12:00,080
Architecturally, this isn't an accuracy problem,
325
00:12:00,080 --> 00:12:01,520
it's a sourcing problem.
326
00:12:01,520 --> 00:12:04,320
You didn't define what counts as law for a given question,
327
00:12:04,320 --> 00:12:05,880
so the engine compiled patterns.
328
00:12:05,880 --> 00:12:07,000
Patterns are persuasive.
329
00:12:07,000 --> 00:12:09,240
They approximate the shape of internal policy language,
330
00:12:09,240 --> 00:12:10,560
the cadence of your writing.
331
00:12:10,560 --> 00:12:13,400
The probability distribution delivered a fluent paragraph.
332
00:12:13,400 --> 00:12:15,880
The control system accepted it as authority.
333
00:12:15,880 --> 00:12:18,320
Most teams try to fix this with prompt language:
334
00:12:18,320 --> 00:12:19,960
"Only use official sources."
335
00:12:19,960 --> 00:12:21,720
That's not a gate. That's a wish.
336
00:12:21,720 --> 00:12:22,920
A gate is a registry:
337
00:12:22,920 --> 00:12:25,320
a catalog of SharePoint sites, Fabric lakehouses,
338
00:12:25,320 --> 00:12:27,680
and Graph scopes that define canon by domain:
339
00:12:27,680 --> 00:12:29,760
procurement, HR, security.
340
00:12:29,760 --> 00:12:32,760
A gate is an evaluation step that rejects any answer
341
00:12:32,760 --> 00:12:35,560
without inline citations that resolve to that registry.
342
00:12:35,560 --> 00:12:37,120
A gate is a refusal model that says,
343
00:12:37,120 --> 00:12:40,160
"I don't have an answer from the authorized corpus," and stops.
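The three parts of the gate (registry, citation check, refusal) can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the registry contents, the policy IDs, and the `[DOC-ID#section]` inline citation format are hypothetical, and a real registry would live in a governed store rather than a dictionary.

```python
import re

# Hypothetical source registry: which document IDs count as canon per domain.
REGISTRY = {
    "procurement": {"POL-PROC-001"},
    "hr": {"POL-HR-014"},
}

# Assumed inline citation format: [DOC-ID#section], e.g. [POL-PROC-001#4.2]
CITATION = re.compile(r"\[([A-Z0-9-]+)#([\w.-]+)\]")

REFUSAL = "I don't have an answer from the authorized corpus."


def gate(answer: str, domain: str) -> str:
    """Pass the answer through only if every citation resolves to the domain's canon."""
    cited = [m.group(1) for m in CITATION.finditer(answer)]
    canon = REGISTRY.get(domain, set())
    if not cited or any(doc not in canon for doc in cited):
        return REFUSAL  # truth or silence
    return answer
```

Under these assumptions, the fluent "per section 4.2 of the procurement policy" reply from the scene above would be refused, because it contains no citation that resolves to the registry. Note that this sketch only checks that the cited document is canon; verifying that the cited section actually supports the claim is a separate evaluation step.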
344
00:12:40,160 --> 00:12:41,760
Inline citations are not decorative.
345
00:12:41,760 --> 00:12:44,640
They are the contract. No citation, no execution path.
346
00:12:44,640 --> 00:12:46,520
If the answer might drive a decision,
347
00:12:46,520 --> 00:12:49,080
require links into the source with stable IDs,
348
00:12:49,080 --> 00:12:50,920
version stamps, and label awareness.
349
00:12:50,920 --> 00:12:53,480
If the source is a policy, cite the policy record
350
00:12:53,480 --> 00:12:56,200
and the section anchor, not a file path that will rot.
351
00:12:56,200 --> 00:12:59,440
In practice, this nudges your knowledge stewardship as well.
352
00:12:59,440 --> 00:13:02,080
Wikis become versioned documents with anchors,
353
00:13:02,080 --> 00:13:04,840
emails become knowledge summaries in governed sites.
354
00:13:04,840 --> 00:13:06,360
Now consider the operational edge:
355
00:13:06,360 --> 00:13:08,960
even with a registry, your corpus can be wrong or stale.
356
00:13:08,960 --> 00:13:10,800
That's fine, because the control pattern
357
00:13:10,800 --> 00:13:12,920
we'll use later separates reasoning from execution.
358
00:13:12,920 --> 00:13:15,440
You can route risky recommendations into the gate.
359
00:13:15,440 --> 00:13:17,120
Human or policy check happens there.
360
00:13:17,120 --> 00:13:18,480
The machine proposes.
361
00:13:18,480 --> 00:13:20,160
The human disposes.
362
00:13:20,160 --> 00:13:22,880
When the human declines, you feed the evaluator.
363
00:13:22,880 --> 00:13:25,400
Reason for refusal: source outdated,
364
00:13:25,400 --> 00:13:29,080
which retrains your quality metrics, not the model.
365
00:13:29,080 --> 00:13:31,200
There's a second trap here: assistants
366
00:13:31,200 --> 00:13:34,000
that propose actions with implicit authority.
367
00:13:34,000 --> 00:13:37,240
In Power Automate, a Copilot-suggested flow might say,
368
00:13:37,240 --> 00:13:39,520
route non-compliant vendors to provisional approval,
369
00:13:39,520 --> 00:13:41,480
notify security post facto.
370
00:13:41,480 --> 00:13:43,000
It reads like a codified norm.
371
00:13:43,000 --> 00:13:45,120
Without a source bound rule, that suggestion
372
00:13:45,120 --> 00:13:47,200
will be accepted by someone who assumes,
373
00:13:47,200 --> 00:13:48,520
this is how we do it.
374
00:13:48,520 --> 00:13:51,560
That's how confidently wrong becomes irreversible change.
375
00:13:51,560 --> 00:13:52,600
The fix is the same.
376
00:13:52,600 --> 00:13:54,840
Bind suggestions to policy citations,
377
00:13:54,840 --> 00:13:58,600
enforce refusal when absent, and require plan and gate before execute.
378
00:13:58,600 --> 00:14:01,280
What about "we'll train people to ask for citations"?
379
00:14:01,280 --> 00:14:02,320
You won't scale that.
380
00:14:02,320 --> 00:14:03,280
Humans forget.
381
00:14:03,280 --> 00:14:04,600
New hires don't know the canon.
382
00:14:04,600 --> 00:14:06,240
Contractors have no context.
383
00:14:06,240 --> 00:14:09,040
Design the refusal model so the agent stops itself.
384
00:14:09,040 --> 00:14:11,480
I cannot answer from approved sources.
385
00:14:11,480 --> 00:14:14,160
Make that message friendly but non-negotiable.
386
00:14:14,160 --> 00:14:16,960
Then give the user a frictionless recovery.
387
00:14:16,960 --> 00:14:19,880
Would you like me to search only the procurement policy library?
388
00:14:19,880 --> 00:14:22,040
Offer bounded pathways, not apologies.
389
00:14:22,040 --> 00:14:24,040
Detecting this failure before it becomes an incident
390
00:14:24,040 --> 00:14:25,520
requires telemetry.
391
00:14:25,520 --> 00:14:27,720
In your prompt/response audit, flag answers
392
00:14:27,720 --> 00:14:30,360
without citations, classify topics by domain,
393
00:14:30,360 --> 00:14:32,680
and alert when citationless answers cluster
394
00:14:32,680 --> 00:14:34,560
around policy-heavy domains.
395
00:14:34,560 --> 00:14:36,200
Communication compliance can help.
396
00:14:36,200 --> 00:14:39,000
Jailbreak detectors can be tuned to policy evasion attempts,
397
00:14:39,000 --> 00:14:41,120
phrases like, as per our standard practice,
398
00:14:41,120 --> 00:14:42,800
without a binding reference.
399
00:14:42,800 --> 00:14:44,520
When you see them, it's not a user problem.
400
00:14:44,520 --> 00:14:47,080
It's a design absence. Common anti-patterns:
401
00:14:47,080 --> 00:14:49,480
Letting Copilot search the open web for internal policy
402
00:14:49,480 --> 00:14:51,080
questions to augment answers.
403
00:14:51,080 --> 00:14:53,560
Allowing helpful paraphrases of policies
404
00:14:53,560 --> 00:14:55,640
without a link to canonical text.
405
00:14:55,640 --> 00:14:58,080
Accepting summaries that include "company policy states"
406
00:14:58,080 --> 00:15:00,080
without pointing to the precise source.
407
00:15:00,080 --> 00:15:02,760
And worst, approving actions (approvals, escalations,
408
00:15:02,760 --> 00:15:05,360
exceptions) triggered by text that sounds official
409
00:15:05,360 --> 00:15:06,640
but are anchored to nothing.
410
00:15:06,640 --> 00:15:08,960
If this sounds heavy, it's because execution is heavy,
411
00:15:08,960 --> 00:15:10,440
but the controls are concrete.
412
00:15:10,440 --> 00:15:12,920
Build an authoritative source registry per domain.
413
00:15:12,920 --> 00:15:16,560
Require inline citations that resolve to those registries.
414
00:15:16,560 --> 00:15:18,680
Enforce a refuse-when-uncertain model:
415
00:15:18,680 --> 00:15:20,200
silence beats fiction.
416
00:15:20,200 --> 00:15:23,200
Separate reasoning from execution so proposals can be reviewed.
417
00:15:23,200 --> 00:15:26,200
Instrument answers without citations and act on clusters.
418
00:15:26,200 --> 00:15:28,720
Teach the organization one phrase: truth or silence.
419
00:15:28,720 --> 00:15:30,800
Remember, Copilot didn't assert authority.
420
00:15:30,800 --> 00:15:32,040
You granted it by omission.
421
00:15:32,040 --> 00:15:33,800
In a distributed decision engine, anything
422
00:15:33,800 --> 00:15:35,840
that sounds like policy becomes policy,
423
00:15:35,840 --> 00:15:37,680
unless you enforce what policy is.
424
00:15:37,680 --> 00:15:39,880
Encode that boundary and confident text
425
00:15:39,880 --> 00:15:41,240
goes back to being a draft.
426
00:15:41,240 --> 00:15:43,200
Leave it open and prose will run your process.
427
00:15:43,200 --> 00:15:44,440
Anchor Failure 3:
428
00:15:44,440 --> 00:15:46,040
Automation without consent,
429
00:15:46,040 --> 00:15:47,280
Irreversible change.
430
00:15:47,280 --> 00:15:49,120
Here's how quiet turns into permanent.
431
00:15:49,120 --> 00:15:51,480
A product manager asks Copilot to streamline
432
00:15:51,480 --> 00:15:52,920
the exception workflow.
433
00:15:52,920 --> 00:15:55,760
In Power Automate, the agent proposes a flow.
434
00:15:55,760 --> 00:15:57,840
When a vendor is missing a security document,
435
00:15:57,840 --> 00:16:00,400
create a provisional approval, notify the owner,
436
00:16:00,400 --> 00:16:02,800
and update the vendor record with a temporary flag.
437
00:16:02,800 --> 00:16:04,240
It reads like housekeeping.
438
00:16:04,240 --> 00:16:06,520
Someone hits create.
439
00:16:06,520 --> 00:16:09,080
Two weeks later, finance discovers dozens of vendors
440
00:16:09,080 --> 00:16:11,280
with active spend and no completed due diligence.
441
00:16:11,280 --> 00:16:13,760
There was no meeting, no gate, planning and execution
442
00:16:13,760 --> 00:16:14,840
fused in one gesture.
443
00:16:14,840 --> 00:16:17,240
The draft became a write. Root cause:
444
00:16:17,240 --> 00:16:20,640
You let reasoning and execution share a rail.
445
00:16:20,640 --> 00:16:23,000
The same surface that brainstormed improvements generated
446
00:16:23,000 --> 00:16:25,120
a runnable flow, connected to production
447
00:16:25,120 --> 00:16:28,480
with maker credentials standing in for a least-privileged agent.
448
00:16:28,480 --> 00:16:32,000
There was no required plan, no human gate by risk tier,
449
00:16:32,000 --> 00:16:35,160
no sandbox dry run, no rollback path.
450
00:16:35,160 --> 00:16:37,160
In other words, no design separation between here's
451
00:16:37,160 --> 00:16:39,400
what we could do and we did it.
452
00:16:39,400 --> 00:16:41,960
This isn't a tooling critique, it's a control failure.
453
00:16:41,960 --> 00:16:43,520
By default, Power Automate will happily
454
00:16:43,520 --> 00:16:45,040
let a maker with broad rights produce
455
00:16:45,040 --> 00:16:47,600
an always-on listener that mutates records.
456
00:16:47,600 --> 00:16:49,240
The model supplied plausible glue,
457
00:16:49,240 --> 00:16:51,360
the platform supplied capability.
458
00:16:51,360 --> 00:16:55,000
The missing piece was architecture: reason, plan, gate, execute.
459
00:16:55,000 --> 00:16:58,000
Without it, helpful suggestions become state changes.
460
00:16:58,000 --> 00:16:59,520
Let's make the failure concrete.
461
00:16:59,520 --> 00:17:02,120
A flow watches a new-vendor SharePoint list.
462
00:17:02,120 --> 00:17:03,960
Condition: if security doc is empty,
463
00:17:03,960 --> 00:17:06,960
set approval status to provisional, send an email,
464
00:17:06,960 --> 00:17:08,840
and copy the row to an exceptions list.
465
00:17:08,840 --> 00:17:10,080
Hidden costs.
466
00:17:10,080 --> 00:17:12,520
Provisional is treated by downstream systems
467
00:17:12,520 --> 00:17:14,520
as approved for limited spend.
468
00:17:14,520 --> 00:17:17,160
Email alerts get filtered and no one ever reconciles
469
00:17:17,160 --> 00:17:18,680
the exceptions list.
470
00:17:18,680 --> 00:17:20,880
The mutation looks reversible, flip the status later,
471
00:17:20,880 --> 00:17:22,200
but the side effects aren't.
472
00:17:22,200 --> 00:17:24,280
Purchase orders get issued, payments queued,
473
00:17:24,280 --> 00:17:25,600
audit trails stamped.
474
00:17:25,600 --> 00:17:28,640
Undo doesn't exist, compensating transactions do,
475
00:17:28,640 --> 00:17:29,720
and you didn't design them.
476
00:17:29,720 --> 00:17:31,320
What's the architectural correction?
477
00:17:31,320 --> 00:17:32,360
Separate the planes.
478
00:17:32,360 --> 00:17:35,760
Reason produces a documented plan with sources, assumptions,
479
00:17:35,760 --> 00:17:37,840
risks, and a shape you can evaluate.
480
00:17:37,840 --> 00:17:40,560
Plan becomes a machine readable artifact.
481
00:17:40,560 --> 00:17:44,080
JSON, table, schema, not a wall of text.
482
00:17:44,080 --> 00:17:46,880
Gate evaluates by risk tier and data class.
483
00:17:46,880 --> 00:17:49,040
Low-risk proposals can auto-approve.
484
00:17:49,040 --> 00:17:51,480
Medium-risk require human-in-the-loop approvals.
485
00:17:51,480 --> 00:17:54,800
High-risk require human on the loop, plus policy checks.
486
00:17:54,800 --> 00:17:56,640
Execute runs least-privileged actions
487
00:17:56,640 --> 00:17:58,720
with rollback and audit by default.
488
00:17:58,720 --> 00:18:01,200
That pattern turns idea into governed change.
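The reason, plan, gate, execute separation can be sketched with a machine-readable plan and a gate function that decides by risk tier. This is an illustrative sketch under stated assumptions: the `Plan` fields, the `Risk` tiers, and the returned strings are hypothetical names, and refusing plans that lack sources or a rollback path is one possible reading of "rollback and audit by default," not a documented Power Automate feature.

```python
from dataclasses import dataclass, field
from enum import Enum


class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Plan:
    """Machine-readable plan artifact (field names are illustrative)."""
    action: str
    target: str
    risk: Risk
    sources: list = field(default_factory=list)  # citations backing the proposal
    rollback: str = ""                           # name of the compensating action


def gate(plan: Plan, human_approved: bool = False, policy_ok: bool = False) -> str:
    """Decide what happens to a plan; nothing executes inside this function."""
    if not plan.sources:
        return "refuse: no cited sources"
    if not plan.rollback:
        return "refuse: no rollback path"
    if plan.risk is Risk.LOW:
        return "execute"
    if plan.risk is Risk.MEDIUM:
        return "execute" if human_approved else "hold: human approval required"
    # High risk: human approval plus a policy check.
    if human_approved and policy_ok:
        return "execute"
    return "hold: human approval and policy check required"
```

Because the gate only returns a decision, the surface that brainstorms a flow never holds the credentials that run it.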
489
00:18:01,200 --> 00:18:03,520
In practice, Power Automate gives you enforcement handles
490
00:18:03,520 --> 00:18:04,600
if you design for them.
491
00:18:04,600 --> 00:18:07,720
Mandatory approvals: any flow that updates or deletes
492
00:18:07,720 --> 00:18:10,040
in tier one systems routes through an approval action
493
00:18:10,040 --> 00:18:13,440
to a defined gatekeeper group. Simulated execution:
494
00:18:13,440 --> 00:18:15,520
Every proposed flow runs in a dry-run mode
495
00:18:15,520 --> 00:18:17,480
that logs intended changes without writing
496
00:18:17,480 --> 00:18:19,840
so you can review the blast radius.
497
00:18:19,840 --> 00:18:21,880
Connector governance: high-risk connectors
498
00:18:21,880 --> 00:18:24,240
and unknown APIs are blocked by policy in production
499
00:18:24,240 --> 00:18:26,080
environments; only approved connectors
500
00:18:26,080 --> 00:18:28,040
are available in dev and test.
501
00:18:28,040 --> 00:18:31,000
Role separation: agent identities separate from humans,
502
00:18:31,000 --> 00:18:32,400
with narrowly-scoped permissions
503
00:18:32,400 --> 00:18:34,000
just for the intended actions.
504
00:18:34,000 --> 00:18:36,560
Rollback reality needs to be stated plainly:
505
00:18:36,560 --> 00:18:39,040
without pre-planned compensating transactions,
506
00:18:39,040 --> 00:18:40,840
undo is a story you tell yourself.
507
00:18:40,840 --> 00:18:44,080
For every write capability you introduce, define the inverse.
508
00:18:44,080 --> 00:18:45,840
If a flow can set provisional,
509
00:18:45,840 --> 00:18:47,680
it must be paired with a closure path
510
00:18:47,680 --> 00:18:49,800
that either escalates to completed due diligence
511
00:18:49,800 --> 00:18:51,760
or rolls back and flags spend.
512
00:18:51,760 --> 00:18:53,920
Those are distinct flows owned by different roles
513
00:18:53,920 --> 00:18:55,560
with SLAs and metrics.
514
00:18:55,560 --> 00:18:58,040
Audit must see both directions not just the happy path.
515
00:18:58,040 --> 00:18:59,400
Common anti-patterns:
516
00:18:59,400 --> 00:19:01,640
Letting Copilot wire flows directly in production
517
00:19:01,640 --> 00:19:02,960
because it's faster.
518
00:19:02,960 --> 00:19:04,360
Relying on maker credentials
519
00:19:04,360 --> 00:19:06,200
that have read/write across multiple systems
520
00:19:06,200 --> 00:19:08,760
for convenience which widens every blast radius.
521
00:19:08,760 --> 00:19:11,040
Accepting free-text approvals ("looks fine")
522
00:19:11,040 --> 00:19:13,800
with no machine-readable plan to compare against.
523
00:19:13,800 --> 00:19:16,000
Treating the test button as validation,
524
00:19:16,000 --> 00:19:18,680
it validates the button not the outcome in live data.
525
00:19:18,680 --> 00:19:20,920
And believing a notification equals control.
526
00:19:20,920 --> 00:19:21,680
It doesn't.
527
00:19:21,680 --> 00:19:23,120
No one has time to read your email.
528
00:19:23,120 --> 00:19:25,000
Detection is telemetry, not luck.
529
00:19:25,000 --> 00:19:27,600
Instrument Power Automate with anomaly alerts:
530
00:19:27,600 --> 00:19:30,560
spikes in update actions, new flows executing writes
531
00:19:30,560 --> 00:19:31,880
without recorded approvals,
532
00:19:31,880 --> 00:19:34,240
usage of connectors outside the allow list
533
00:19:34,240 --> 00:19:36,080
or flows created by human identities
534
00:19:36,080 --> 00:19:38,640
but running as agents with broader scopes.
535
00:19:38,640 --> 00:19:42,200
Tie cost anomalies to execution: token or API spikes
536
00:19:42,200 --> 00:19:44,200
often accompany unconstrained flows.
537
00:19:44,200 --> 00:19:45,960
When you see them, don't ask for better prompts.
538
00:19:45,960 --> 00:19:48,480
Ask which gate failed or which gate never existed.
539
00:19:48,480 --> 00:19:51,280
If this sounds heavy, it's because execution is heavy.
540
00:19:51,280 --> 00:19:52,920
That refrain applies here more than anywhere.
541
00:19:52,920 --> 00:19:55,240
Automation magnifies your design choices.
542
00:19:55,240 --> 00:19:57,000
Either you enforce separation
543
00:19:57,000 --> 00:19:58,680
and live in a deterministic world
544
00:19:58,680 --> 00:20:01,240
where proposals are reviewed and actions are scoped
545
00:20:01,240 --> 00:20:02,920
or you let assistants drive your data
546
00:20:02,920 --> 00:20:05,200
and discover policies you didn't intend,
547
00:20:05,200 --> 00:20:07,080
encoded in flows you didn't review.
548
00:20:07,080 --> 00:20:09,360
One last point on psychology.
549
00:20:09,360 --> 00:20:11,360
The UI cadence suggests safety.
550
00:20:11,360 --> 00:20:13,160
It feels like building a spreadsheet rule.
551
00:20:13,160 --> 00:20:14,000
It's not.
552
00:20:14,000 --> 00:20:15,520
Each trigger plus action is a program
553
00:20:15,520 --> 00:20:17,560
that runs at scale on live data.
554
00:20:17,560 --> 00:20:19,680
The friendlier it feels, the more rigor you need,
555
00:20:19,680 --> 00:20:21,440
but that's not an argument against Copilot.
556
00:20:21,440 --> 00:20:23,320
It's a mandate to encode your control system
557
00:20:23,320 --> 00:20:26,200
so the engine compiles your guardrails, not your optimism.
558
00:20:26,200 --> 00:20:28,640
Automation without consent isn't dramatic when it happens.
559
00:20:28,640 --> 00:20:29,640
It's mundane.
560
00:20:29,640 --> 00:20:33,160
A checkbox here, a condition there, an email that no one reads
561
00:20:33,160 --> 00:20:35,200
and then one day your ledger, your CRM
562
00:20:35,200 --> 00:20:37,760
or your access model reflects weeks of temporary states
563
00:20:37,760 --> 00:20:38,840
that never reversed.
564
00:20:38,840 --> 00:20:40,240
That's irreversible change.
565
00:20:40,240 --> 00:20:42,880
Separate the planes or accept that every helpful suggestion
566
00:20:42,880 --> 00:20:44,080
is a write waiting to happen.
567
00:20:44,080 --> 00:20:45,160
Breathe.
568
00:20:45,160 --> 00:20:47,840
You've just heard why Copilot fails.
569
00:20:47,840 --> 00:20:51,080
Passive leakage, confident fiction, irreversible action.
570
00:20:51,080 --> 00:20:53,360
From here on, we enforce how it stops.
571
00:20:53,360 --> 00:20:55,720
The mandates that turn architecture into control.
572
00:20:56,960 --> 00:20:59,840
Anchor Failure 4: Memory Illusion, False Trust.
573
00:20:59,840 --> 00:21:01,440
This one hides in plain sight.
574
00:21:01,440 --> 00:21:03,560
A manager asks Copilot, "Draft the follow-up
575
00:21:03,560 --> 00:21:06,600
we discussed yesterday, same tone, same constraints."
576
00:21:06,600 --> 00:21:09,880
The reply looks familiar but nudges a detail, drops a qualifier
577
00:21:09,880 --> 00:21:12,480
or reintroduces an exclusion you thought you killed.
578
00:21:12,480 --> 00:21:13,960
The user assumes continuity.
579
00:21:13,960 --> 00:21:15,400
Copilot didn't forget.
580
00:21:15,400 --> 00:21:17,080
It never promised to remember.
581
00:21:17,080 --> 00:21:19,640
You projected human continuity onto a stateless front
582
00:21:19,640 --> 00:21:21,320
paired to stateful caches.
583
00:21:21,320 --> 00:21:24,360
That projection creates false trust. Root cause:
584
00:21:24,360 --> 00:21:26,720
Uncontrolled prompt chaining and cached embeddings
585
00:21:26,720 --> 00:21:29,000
operating without an explicit state contract.
586
00:21:29,000 --> 00:21:32,000
In English, yesterday's context lived in the transient chat
587
00:21:32,000 --> 00:21:34,320
history and whatever embeddings the system
588
00:21:34,320 --> 00:21:35,760
cached from your documents.
589
00:21:35,760 --> 00:21:38,120
Today's request arrived in a new session,
590
00:21:38,120 --> 00:21:40,840
a different channel or a different agent surface.
591
00:21:40,840 --> 00:21:42,520
The visible thread felt continuous.
592
00:21:42,520 --> 00:21:44,080
The underlying state was not.
593
00:21:44,080 --> 00:21:46,320
Without a designed boundary (start, scope, sources,
594
00:21:46,320 --> 00:21:48,160
reset), drift is inevitable.
595
00:21:48,160 --> 00:21:50,960
Architecturally, this matters because continuity is a safety
596
00:21:50,960 --> 00:21:51,840
claim.
597
00:21:51,840 --> 00:21:54,200
When users believe "remember when I said" persists,
598
00:21:54,200 --> 00:21:56,000
they lower their guard, shorten prompts
599
00:21:56,000 --> 00:21:58,120
and stop re-declaring constraints. That
600
00:21:58,120 --> 00:22:00,000
collapses determinism.
601
00:22:00,000 --> 00:22:02,440
The model backfills missing pieces from whatever
602
00:22:02,440 --> 00:22:05,680
is nearby, a stale vector index, a similar thread,
603
00:22:05,680 --> 00:22:07,520
or a default behavioral instruction.
604
00:22:07,520 --> 00:22:08,360
Outputs diverge.
605
00:22:08,360 --> 00:22:09,560
The divergence isn't dramatic.
606
00:22:09,560 --> 00:22:10,800
It's cumulative.
607
00:22:10,800 --> 00:22:12,960
Over days, small deltas become policy.
608
00:22:12,960 --> 00:22:15,040
The stateless-front, stateful-back pattern
609
00:22:15,040 --> 00:22:18,200
is the right foundation, but only if you expose it.
610
00:22:18,200 --> 00:22:21,280
Stateless by default means every session begins at 0
611
00:22:21,280 --> 00:22:23,400
unless the user explicitly attaches state.
612
00:22:23,400 --> 00:22:25,920
Stateful back means you can persist context, documents,
613
00:22:25,920 --> 00:22:28,080
decisions, constraints under an ID
614
00:22:28,080 --> 00:22:30,000
with a time to live and provenance.
615
00:22:30,000 --> 00:22:32,960
The illusion is pretending stateless behaves like memory
616
00:22:32,960 --> 00:22:36,200
without the contract that ties requests to persisted context.
617
00:22:36,200 --> 00:22:38,800
That illusion breeds errors users won't catch.
618
00:22:38,800 --> 00:22:41,680
The mandate here is explicit state, session contracts,
619
00:22:41,680 --> 00:22:43,720
and a user-visible context ledger.
620
00:22:43,720 --> 00:22:46,440
A session contract defines what this conversation will use,
621
00:22:46,440 --> 00:22:50,360
sources, time window, labels, refusal rules, and for how long.
622
00:22:50,360 --> 00:22:52,200
The ledger shows, in plain text:
623
00:22:52,200 --> 00:22:57,520
This answer used Procurement Policy v3.4, label Confidential;
624
00:22:57,520 --> 00:23:00,680
last week's project minutes, Team A channel; assumptions:
625
00:23:00,680 --> 00:23:02,240
no vendor exceptions.
626
00:23:02,240 --> 00:23:03,920
Every answer updates the ledger.
627
00:23:03,920 --> 00:23:06,960
Every new channel or day resets, unless the user explicitly
628
00:23:06,960 --> 00:23:08,680
reattaches the prior session.
629
00:23:08,680 --> 00:23:12,040
No phantom carryover. Design clear reset semantics:
630
00:23:12,040 --> 00:23:14,880
changing channels resets, crossing midnight resets,
631
00:23:14,880 --> 00:23:17,040
moving from Teams to Outlook resets.
632
00:23:17,040 --> 00:23:19,560
Users can override by reattaching the contract.
633
00:23:19,560 --> 00:23:22,400
Continue session 7F3C until Friday.
634
00:23:22,400 --> 00:23:25,000
Force an explicit action, attach, extend, or end.
635
00:23:25,000 --> 00:23:27,680
If there's any ambiguity, default to refusal:
636
00:23:27,680 --> 00:23:29,480
I don't have your prior constraints attached.
637
00:23:29,480 --> 00:23:34,400
Would you like to reattach session 7F3C or restate them?
638
00:23:34,400 --> 00:23:35,960
Silence beats fiction.
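A session contract with explicit reset semantics might look like the sketch below. It is a hedged illustration: `SessionContract`, `reattach`, and `answer` are hypothetical names, the ledger entry shape is assumed, and "crossing midnight resets" is modeled simply by setting the contract's expiry to midnight unless the user extends it.

```python
from dataclasses import dataclass, field, replace
from datetime import datetime


@dataclass
class SessionContract:
    session_id: str
    channel: str       # e.g. "teams" or "outlook" (illustrative values)
    sources: list
    constraints: list
    expires: datetime  # default would be next midnight; the user may extend it
    ledger: list = field(default_factory=list)

    def active(self, channel: str, now: datetime) -> bool:
        # Changing channels or passing the expiry both count as a reset.
        return channel == self.channel and now < self.expires


def reattach(contract: SessionContract, channel: str, until: datetime) -> SessionContract:
    """Explicit user action: rebind the same contract to a channel and a new expiry."""
    return replace(contract, channel=channel, expires=until)


def answer(contract, channel: str, now: datetime, prompt: str) -> str:
    if contract is None or not contract.active(channel, now):
        sid = f"session {contract.session_id}" if contract else "a prior session"
        return ("I don't have your prior constraints attached. "
                f"Would you like to reattach {sid} or restate them?")
    # Every answer updates the user-visible ledger.
    contract.ledger.append({
        "prompt": prompt,
        "sources": list(contract.sources),
        "constraints": list(contract.constraints),
        "at": now.isoformat(),
    })
    return f"[draft grounded in {', '.join(contract.sources)}]"
```

Carryover only happens through `reattach`, so the ledger can always explain why a given source or constraint was in play.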
639
00:23:35,960 --> 00:23:38,280
Cache governance matters more than most realize.
640
00:23:38,280 --> 00:23:41,160
Embeddings feel like harmless acceleration.
641
00:23:41,160 --> 00:23:42,520
They are memory without governance
642
00:23:42,520 --> 00:23:44,560
if you let them persist indefinitely.
643
00:23:44,560 --> 00:23:47,760
Set TTLs based on data class: hours for sensitive drafts,
644
00:23:47,760 --> 00:23:49,240
days for public docs.
645
00:23:49,240 --> 00:23:51,040
Purge cycles and audits must exist.
646
00:23:51,040 --> 00:23:52,680
Track who created the vector store,
647
00:23:52,680 --> 00:23:55,280
what source labels it contains, when it last refreshed,
648
00:23:55,280 --> 00:23:56,760
and which agents can read it.
649
00:23:56,760 --> 00:24:00,120
When users attach a contract, you bind to a versioned snapshot,
650
00:24:00,120 --> 00:24:02,160
not a constantly mutating index.
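TTLs by data class and a purge cycle can be expressed as a small routine. This is a sketch only: the data class names, the specific durations (4 hours, 7 days), and the entry shape are assumptions chosen to match "hours for sensitive drafts, days for public docs," and unknown classes are treated as already expired so nothing persists by accident.

```python
from datetime import datetime, timedelta

# Assumed durations; the transcript only says "hours" and "days".
TTL_BY_CLASS = {
    "sensitive_draft": timedelta(hours=4),
    "public_doc": timedelta(days=7),
}


def is_expired(entry: dict, now: datetime) -> bool:
    """An entry expires once its age reaches the TTL for its data class."""
    ttl = TTL_BY_CLASS.get(entry["data_class"], timedelta(0))  # unknown class: expire
    return now - entry["refreshed_at"] >= ttl


def purge(store: list, now: datetime):
    """Return (kept entries, ids purged) so the purge itself is auditable."""
    kept = [e for e in store if not is_expired(e, now)]
    purged = [e["id"] for e in store if is_expired(e, now)]
    return kept, purged
```

Returning the purged ids alongside the surviving entries gives the audit trail something concrete to record on every cycle.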
651
00:24:02,160 --> 00:24:04,720
Common anti-patterns: sticky sidebars
652
00:24:04,720 --> 00:24:07,320
that quietly carry instructions across tenants, channels,
653
00:24:07,320 --> 00:24:09,880
and days. Hidden system prompts that inject tone
654
00:24:09,880 --> 00:24:12,040
and behavior without user awareness.
655
00:24:12,040 --> 00:24:15,120
Auto-attached recent files that change as users browse,
656
00:24:15,120 --> 00:24:16,960
subtly shifting grounding.
657
00:24:16,960 --> 00:24:19,920
And the worst, long running threads that migrate topics,
658
00:24:19,920 --> 00:24:23,000
sources, and stakes without a single visible boundary.
659
00:24:23,000 --> 00:24:24,320
Users trust the thread.
660
00:24:24,320 --> 00:24:25,680
The system silently drifted.
661
00:24:25,680 --> 00:24:27,200
Detection is behavioral.
662
00:24:27,200 --> 00:24:29,360
Look for answer variance on identical prompts
663
00:24:29,360 --> 00:24:31,680
across channels for policy-heavy domains.
664
00:24:31,680 --> 00:24:33,520
Variance is a smell.
665
00:24:33,520 --> 00:24:35,760
Instrument context change events (channel switches,
666
00:24:35,760 --> 00:24:37,560
model switches, index refreshes)
667
00:24:37,560 --> 00:24:40,640
and log whether a session was explicitly reattached.
668
00:24:40,640 --> 00:24:42,160
Alert on high-risk answers produced
669
00:24:42,160 --> 00:24:44,960
without an active contract. In audits, sample answers
670
00:24:44,960 --> 00:24:46,120
against ledgers.
671
00:24:46,120 --> 00:24:48,080
If the ledger can't explain an inclusion,
672
00:24:48,080 --> 00:24:49,680
you have phantom carryover.
673
00:24:49,680 --> 00:24:52,600
If this sounds heavy, it's because execution is heavy,
674
00:24:52,600 --> 00:24:54,600
but the controls are simple to teach.
675
00:24:54,600 --> 00:24:57,680
One phrase for users: attach or restate.
676
00:24:57,680 --> 00:25:00,320
One habit for builders: show the ledger.
677
00:25:00,320 --> 00:25:01,360
One policy for owners:
678
00:25:01,360 --> 00:25:03,760
memory is opt-in, scoped, dated, and revocable.
679
00:25:03,760 --> 00:25:05,640
Remember, Copilot didn't forget you.
680
00:25:05,640 --> 00:25:06,960
You forgot to define memory.
681
00:25:06,960 --> 00:25:10,000
In a distributed decision engine, continuity isn't a vibe.
682
00:25:10,000 --> 00:25:11,000
It's a contract.
683
00:25:11,000 --> 00:25:12,720
Encode it and trust has a basis.
684
00:25:12,720 --> 00:25:15,920
Leave it implied, and drift will write your outcomes for you.
685
00:25:16,760 --> 00:25:19,320
Anchor Failure 5: Cost and Chaos Explosion.
686
00:25:19,320 --> 00:25:21,240
Here's how entropy invoices you.
687
00:25:21,240 --> 00:25:22,360
Adoption spikes.
688
00:25:22,360 --> 00:25:24,200
Leaders green-light pilots everywhere.
689
00:25:24,200 --> 00:25:27,120
Within a week, token burn jumps, API calls triple,
690
00:25:27,120 --> 00:25:29,680
and someone asks finance why the Copilot line item looks
691
00:25:29,680 --> 00:25:30,920
like a seasonal surge.
692
00:25:30,920 --> 00:25:32,240
No breach, no scandal.
693
00:25:32,240 --> 00:25:34,720
Just money evaporating into productivity.
694
00:25:34,720 --> 00:25:36,240
Cost looks like a finance problem.
695
00:25:36,240 --> 00:25:36,760
It is not.
696
00:25:36,760 --> 00:25:38,280
It's a security and governance problem
697
00:25:38,280 --> 00:25:39,560
wearing an easy metric.
698
00:25:39,560 --> 00:25:40,840
The pattern is consistent.
699
00:25:40,840 --> 00:25:43,200
Shadow prompts proliferate in Teams and Outlook,
700
00:25:43,200 --> 00:25:46,240
helpful sidebars quietly hit Graph and external endpoints,
701
00:25:46,240 --> 00:25:48,520
maker-built agents run with human credentials,
702
00:25:48,520 --> 00:25:50,960
flows trigger on noisy lists, unknown connectors
703
00:25:50,960 --> 00:25:52,680
slip into dev that's actually prod.
704
00:25:52,680 --> 00:25:54,720
Nobody sees it because nobody instrumented it.
705
00:25:54,720 --> 00:25:56,120
Usage feels like success.
706
00:25:56,120 --> 00:25:58,280
Spend becomes the only visible telemetry,
707
00:25:58,280 --> 00:26:00,640
so cost becomes your first alarm bell.
708
00:26:00,640 --> 00:26:04,320
Root cause: zero observability, no routing discipline,
709
00:26:04,320 --> 00:26:07,680
unmanaged environments, and absent ceilings.
710
00:26:07,680 --> 00:26:10,040
You didn't classify agents by environment group.
711
00:26:10,040 --> 00:26:11,600
You didn't allocate budgets per team,
712
00:26:11,600 --> 00:26:12,760
per agent, per environment.
713
00:26:12,760 --> 00:26:14,440
You didn't block ambiguous connectors.
714
00:26:14,440 --> 00:26:17,720
You didn't capture prompt response logs or cost per answer.
715
00:26:17,720 --> 00:26:19,760
The system did exactly what you allowed.
716
00:26:19,760 --> 00:26:23,400
It scaled without boundaries. Tie cost to security debt.
717
00:26:23,400 --> 00:26:24,480
Spikes aren't random.
718
00:26:24,480 --> 00:26:27,960
They correlate with invisible agents and uncontrolled execution.
719
00:26:27,960 --> 00:26:29,840
A long-running thread with sticky instructions
720
00:26:29,840 --> 00:26:31,640
becomes a silent superuser.
721
00:26:31,640 --> 00:26:34,320
A Copilot-suggested flow polling every minute hits
722
00:26:34,320 --> 00:26:38,880
APIs 1,440 times a day per list, multiplied by departments.
723
00:26:38,880 --> 00:26:41,080
A summarizer grounded to a broad SharePoint scope
724
00:26:41,080 --> 00:26:43,440
slurps pages to answer a trivial question.
725
00:26:43,440 --> 00:26:44,840
You don't have a finance anomaly.
726
00:26:44,840 --> 00:26:46,280
You have an attack surface expansion
727
00:26:46,280 --> 00:26:47,520
you can finally graph.
728
00:26:47,520 --> 00:26:50,200
Start with observability as a mandate, not a dashboard wish.
729
00:26:50,200 --> 00:26:52,200
You need DSPM for AI to show where
730
00:26:52,200 --> 00:26:54,800
sensitive data intersects copilot usage.
731
00:26:54,800 --> 00:26:56,840
You need Activity Explorer logging prompts
732
00:26:56,840 --> 00:26:58,920
and responses with label awareness.
733
00:26:58,920 --> 00:27:01,560
You need per-agent, per-environment cost curves
734
00:27:01,560 --> 00:27:03,040
and volumetrics.
735
00:27:03,040 --> 00:27:05,960
And you need anomaly alerts tied to policy:
736
00:27:05,960 --> 00:27:08,320
sudden growth in sensitive label mentions,
737
00:27:08,320 --> 00:27:11,640
cross-tenant traffic, connector usage outside allow lists,
738
00:27:11,640 --> 00:27:14,120
or cost per answer doubling for the same task.
739
00:27:14,120 --> 00:27:16,000
Budgets are controls, not punishments.
740
00:27:16,000 --> 00:27:17,800
Allocate ceilings by environment group
741
00:27:17,800 --> 00:27:20,280
(dev, test, prod), then by team and by agent.
742
00:27:20,280 --> 00:27:23,040
When an agent hits 80% of its monthly budget,
743
00:27:23,040 --> 00:27:26,480
it notifies owners with a usage decomposition: top prompts,
744
00:27:26,480 --> 00:27:28,600
top connectors, average grounding size,
745
00:27:28,600 --> 00:27:31,920
average answer length, and sensitive content references.
746
00:27:31,920 --> 00:27:34,520
At 100%, it pauses and routes to a review queue
747
00:27:34,520 --> 00:27:36,920
with a raise, redesign, or retire choice.
748
00:27:36,920 --> 00:27:39,640
Money becomes a circuit breaker that forces clarity.
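The 80% notify and 100% pause thresholds reduce to a small decision function. This is a minimal sketch: `budget_action` and its return values are hypothetical names, and how spend is measured (tokens, API calls, currency) is left as an assumption for whoever wires it to real telemetry.

```python
def budget_action(spend: float, ceiling: float) -> str:
    """Map an agent's monthly spend against its ceiling to a control action."""
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    ratio = spend / ceiling
    if ratio >= 1.0:
        # Circuit breaker: pause the agent and send it to the review queue
        # for a raise, redesign, or retire decision.
        return "pause_and_review"
    if ratio >= 0.8:
        # Early warning: notify owners with the usage decomposition.
        return "notify_owners"
    return "ok"
```

For example, an agent at 85 units of a 100-unit ceiling returns `notify_owners`, and the same agent at 100 returns `pause_and_review`.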
749
00:27:39,640 --> 00:27:41,920
Routing discipline comes next: default deny
750
00:27:41,920 --> 00:27:43,640
on connectors by risk class.
751
00:27:43,640 --> 00:27:46,200
Only approved connectors in managed environments.
752
00:27:46,200 --> 00:27:49,160
Block unknown APIs entirely in prod; for dev and test,
753
00:27:49,160 --> 00:27:51,360
allow with expiry and cost caps.
754
00:27:51,360 --> 00:27:53,320
Use environment groups to stamp these rules
755
00:27:53,320 --> 00:27:54,880
once for many environments.
756
00:27:54,880 --> 00:27:58,000
Anyone trying to wire a flow with a disallowed connector
757
00:27:58,000 --> 00:28:00,840
gets a refusal with a link to the approval path,
758
00:28:00,840 --> 00:28:02,600
not a "temporarily allow it."
759
00:28:02,600 --> 00:28:04,760
Entropy loves temporary.
760
00:28:04,760 --> 00:28:06,760
Design for cost-aware behavior:
761
00:28:06,760 --> 00:28:10,520
force structured output so you aren't paying for walls of text,
762
00:28:10,520 --> 00:28:13,680
require bounded context windows, sources by site,
763
00:28:13,680 --> 00:28:15,920
time window, and label. Cache governance
764
00:28:15,920 --> 00:28:19,080
with TTLs prevents stale oversized vector stores.
765
00:28:19,080 --> 00:28:22,400
Teach agents short answers unless the plan requires depth.
766
00:28:22,400 --> 00:28:24,480
Measure cost per action for automations
767
00:28:24,480 --> 00:28:26,560
and reject proposals that exceed thresholds
768
00:28:26,560 --> 00:28:27,960
without justification.
769
00:28:27,960 --> 00:28:30,720
"Think step by step" is free; "fetch the internet" isn't.
770
00:28:30,720 --> 00:28:33,520
Anti-patterns to kill: "We'll monitor cost quarterly."
771
00:28:33,520 --> 00:28:34,560
You'll be too late.
772
00:28:34,560 --> 00:28:36,080
"Prompts are just conversations."
773
00:28:36,080 --> 00:28:37,520
They're programs with bills.
774
00:28:37,520 --> 00:28:39,240
"Let's open the web, then refine."
775
00:28:39,240 --> 00:28:42,200
That's how your spend becomes someone else's training data.
776
00:28:42,200 --> 00:28:43,280
"We'll optimize later."
777
00:28:43,280 --> 00:28:45,160
Later is when finance locks the account.
778
00:28:45,160 --> 00:28:47,680
If this sounds heavy, it's because execution is heavy.
779
00:28:47,680 --> 00:28:49,080
But this is mostly plumbing.
780
00:28:49,080 --> 00:28:52,240
Instrument, budget, route, cap, alert.
781
00:28:52,240 --> 00:28:53,920
When cost surges, treat it like smoke
782
00:28:53,920 --> 00:28:55,320
from an electrical panel.
783
00:28:55,320 --> 00:28:57,240
Don't fan it with more pilots. Open the box
784
00:28:57,240 --> 00:28:58,960
and pull the wrong wires. Remember:
785
00:28:58,960 --> 00:29:00,880
The invoice is the only honest stakeholder
786
00:29:00,880 --> 00:29:02,840
you have in an unobserved system.
787
00:29:02,840 --> 00:29:04,280
It tells you where entropy won.
788
00:29:04,280 --> 00:29:04,880
Use it.
789
00:29:04,880 --> 00:29:07,240
Enforce ceilings. Make spend the forcing function
790
00:29:07,240 --> 00:29:09,200
that drags invisible agents into daylight.
791
00:29:09,200 --> 00:29:10,240
Cost isn't the problem.
792
00:29:10,240 --> 00:29:11,320
Cost is the light.
793
00:29:11,320 --> 00:29:13,640
Mandate one: define the system, not the feature.
794
00:29:13,640 --> 00:29:15,120
This is the first non-negotiable.
795
00:29:15,120 --> 00:29:17,480
Before you enable anything, define the system you're actually
796
00:29:17,480 --> 00:29:18,240
running.
797
00:29:18,240 --> 00:29:19,520
You are not turning on a feature.
798
00:29:19,520 --> 00:29:21,320
You are declaring a control plane
799
00:29:21,320 --> 00:29:24,760
with data, identity, tools, and execution semantics.
800
00:29:24,760 --> 00:29:26,080
If you don't name it explicitly,
801
00:29:26,080 --> 00:29:28,840
the engine will compile your ambiguity into behavior.
802
00:29:28,840 --> 00:29:31,520
Start with a system map you can read aloud, one page,
803
00:29:31,520 --> 00:29:34,680
four columns: surfaces, data, identity, and execution.
804
00:29:34,680 --> 00:29:38,360
Surfaces: Teams, Outlook, Copilot in Word and Excel,
805
00:29:38,360 --> 00:29:40,880
Copilot Studio agents, and Power Automate.
806
00:29:40,880 --> 00:29:44,520
Data: SharePoint and OneDrive sites by risk class,
807
00:29:44,520 --> 00:29:48,280
Exchange mailboxes, Teams stores, Fabric lakehouses,
808
00:29:48,280 --> 00:29:50,480
and external connectors you will allow.
809
00:29:50,480 --> 00:29:53,680
Identity: humans, agent personas, service principals,
810
00:29:53,680 --> 00:29:56,280
and authentication context for risk-based rules.
811
00:29:56,280 --> 00:29:58,920
Execution: which surfaces can only reason,
812
00:29:58,920 --> 00:30:01,200
which can plan and which are permitted to execute
813
00:30:01,200 --> 00:30:03,520
under what gates, with what audit, and with what rollback.
814
00:30:03,520 --> 00:30:04,960
Name every surface.
815
00:30:04,960 --> 00:30:06,440
Teams is not a surface.
816
00:30:06,440 --> 00:30:09,680
Tenant-wide Teams summarization with channel exclusion controls
817
00:30:09,680 --> 00:30:11,840
plus private channel deny is.
818
00:30:11,840 --> 00:30:13,760
Outlook is not a surface.
819
00:30:13,760 --> 00:30:16,400
Message level summarization with label-aware refusal
820
00:30:16,400 --> 00:30:18,520
and cross-mailbox deny is.
821
00:30:18,520 --> 00:30:20,400
Power Automate is not a surface.
822
00:30:20,400 --> 00:30:23,480
Dev, test, and prod environments with connector allow lists,
823
00:30:23,480 --> 00:30:26,720
mandatory approvals, dry run, and JIT elevation is.
824
00:30:26,720 --> 00:30:28,640
If you can't list it, you can't bound it.
825
00:30:28,640 --> 00:30:30,360
If you can't bound it, it will overreach.
826
00:30:30,360 --> 00:30:34,080
Next, define the intended outcomes and the unacceptable states.
827
00:30:34,080 --> 00:30:35,960
Outcomes are verbs with subjects.
828
00:30:35,960 --> 00:30:39,480
Summarize project channels, excluding HR and legal domains.
829
00:30:39,480 --> 00:30:42,680
Draft, but do not send, vendor emails with citations.
830
00:30:42,680 --> 00:30:46,480
Propose, not publish, flows to reduce polling.
831
00:30:46,480 --> 00:30:48,840
Unacceptable states are refusal criteria:
832
00:30:48,840 --> 00:30:51,280
Any answer without citations for policy claims,
833
00:30:51,280 --> 00:30:53,760
any grounding that includes content labeled
834
00:30:53,760 --> 00:30:55,960
confidential HR or legal.
835
00:30:55,960 --> 00:30:59,040
Any execution proposal without a machine readable plan.
836
00:30:59,040 --> 00:31:01,800
Put the refusal states on the same page as the outcomes.
837
00:31:01,800 --> 00:31:03,240
Silence beats fiction.
838
00:31:03,240 --> 00:31:05,000
If you don't codify no-go zones,
839
00:31:05,000 --> 00:31:07,400
someone will operationalize a maybe.
840
00:31:07,400 --> 00:31:10,240
Translate intent into configuration objects, not narratives:
841
00:31:10,240 --> 00:31:12,880
policies, allow lists, labels with enforcement,
842
00:31:12,880 --> 00:31:16,320
authentication contexts, environment groups, agent registries.
843
00:31:16,320 --> 00:31:18,920
For each agent experience (Teams Copilot, Outlook Copilot,
844
00:31:18,920 --> 00:31:21,640
a Copilot Studio agent), bind it to: one, an identity
845
00:31:21,640 --> 00:31:25,480
that is not a person, two, a data scope that is default deny
846
00:31:25,480 --> 00:31:27,320
with named inclusions and expiries,
847
00:31:27,320 --> 00:31:30,800
and three, an execution role that cannot exceed its tier.
848
00:31:30,800 --> 00:31:33,680
Reason-only identities can never call write connectors.
849
00:31:33,680 --> 00:31:35,840
Plan-tier identities can produce JSON plans,
850
00:31:35,840 --> 00:31:37,680
but never invoke side effects.
851
00:31:37,680 --> 00:31:40,400
Execute-tier identities act only through gate-approved
852
00:31:40,400 --> 00:31:43,320
workflows with least privilege and rollback.
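
One way to picture the three tiers is as a capability check that runs before any operation is attempted. The sketch below is illustrative only: the `Tier` enum, the operation names in `REQUIRED_TIER`, and the `is_permitted` function are hypothetical and do not correspond to a real Copilot, Graph, or Power Platform interface.

```python
from enum import IntEnum


class Tier(IntEnum):
    REASON = 1   # read and analyze only
    PLAN = 2     # may also emit machine-readable plans
    EXECUTE = 3  # may act, but only behind a gate


# Hypothetical operation names mapped to the minimum tier that may perform them.
REQUIRED_TIER = {
    "read_scoped_source": Tier.REASON,
    "emit_json_plan": Tier.PLAN,
    "call_write_connector": Tier.EXECUTE,
}


def is_permitted(identity_tier: Tier, operation: str, gate_approved: bool = False) -> bool:
    """Default deny: unknown operations and under-tiered identities are refused."""
    required = REQUIRED_TIER.get(operation)
    if required is None:
        return False
    if identity_tier < required:
        return False
    # Even an execute-tier identity needs a recorded gate approval to write.
    if required == Tier.EXECUTE and not gate_approved:
        return False
    return True
```

Note that the execute tier alone is not sufficient for a write: the approval flag models the "gate-approved workflows" requirement from the transcript.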
853
00:31:43,320 --> 00:31:45,160
Model the decision graph explicitly.
854
00:31:45,160 --> 00:31:47,400
A request hits a surface which invokes retrieval
855
00:31:47,400 --> 00:31:49,080
through Graph-scoped allow lists, which
856
00:31:49,080 --> 00:31:51,560
grounds into an evaluator that checks citations
857
00:31:51,560 --> 00:31:53,720
and refusal rules, which optionally
858
00:31:53,720 --> 00:31:55,880
emits a machine readable plan, which
859
00:31:55,880 --> 00:31:58,560
routes to a gate, which if approved executes
860
00:31:58,560 --> 00:32:00,840
as an agent with constrained permissions
861
00:32:00,840 --> 00:32:03,000
and logs everything to audit.
862
00:32:03,000 --> 00:32:05,600
Draw the arrows, put names on the boxes.
863
00:32:05,600 --> 00:32:09,360
Teams summarizer. Retrieval filter: project sites A, B;
864
00:32:09,360 --> 00:32:13,680
time window: 14 days; labels: exclude HR, legal. Evaluator:
865
00:32:13,680 --> 00:32:18,040
citations required for policy terms. Output: summary only.
866
00:32:18,040 --> 00:32:19,640
That is not documentation theater.
867
00:32:19,640 --> 00:32:22,200
That is executable intent. Define your source registry
868
00:32:22,200 --> 00:32:22,800
by domain.
869
00:32:22,800 --> 00:32:25,720
Procurement policy lives here, HR guidance lives there.
870
00:32:25,720 --> 00:32:27,360
Security standards live here.
871
00:32:27,360 --> 00:32:29,800
Each registry entry has an owner, a label, a version,
872
00:32:29,800 --> 00:32:31,120
and a stable anchor.
873
00:32:31,120 --> 00:32:32,920
Copilot doesn't search everything.
874
00:32:32,920 --> 00:32:35,600
It answers from registries or it refuses.
875
00:32:35,600 --> 00:32:38,960
Teach listeners one phrase across the org: truth or silence.
876
00:32:38,960 --> 00:32:40,960
Then wire the registry to your evaluators,
877
00:32:40,960 --> 00:32:43,320
so any claim that sounds like policy must carry
878
00:32:43,320 --> 00:32:47,040
an inline citation that resolves to canon, or the answer stops.
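
The registry-plus-evaluator idea can be sketched in a few lines of Python. Everything named here is hypothetical: the `RegistryEntry` fields mirror the owner, label, version, and stable anchor mentioned in the transcript, the `policy://` anchor is a made-up identifier, and how a claim gets flagged as "sounds like policy" is left as an assumption (a boolean on `Claim`).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegistryEntry:
    domain: str
    owner: str
    label: str
    version: str
    anchor: str  # stable identifier that citations must resolve to


# Hypothetical registry keyed by stable anchor.
REGISTRY = {
    "policy://procurement/approvals": RegistryEntry(
        "procurement", "Procurement Lead", "General", "3.1",
        "policy://procurement/approvals",
    ),
}


@dataclass
class Claim:
    text: str
    sounds_like_policy: bool
    citation: Optional[str] = None


def evaluate(claims):
    """Truth or silence: refuse if any policy-like claim lacks a registered citation."""
    for claim in claims:
        if claim.sounds_like_policy and claim.citation not in REGISTRY:
            return {
                "status": "refused",
                "refusal_reason": f"no registered source for: {claim.text!r}",
            }
    return {"status": "answered"}
```

A narrative claim passes through untouched; a policy-like claim with a missing or unregistered citation stops the whole answer rather than being quietly dropped.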
879
00:32:47,040 --> 00:32:50,240
Establish environment groups as your scale lever.
880
00:32:50,240 --> 00:32:52,360
Personal dev environments: reason only,
881
00:32:52,360 --> 00:32:55,840
no external connectors, tight budgets, owner-only, sharing disabled.
882
00:32:55,840 --> 00:32:59,640
Team test: plan plus dry run, limited connectors,
883
00:32:59,640 --> 00:33:03,480
budget ceilings with alerts at 80% and stops at 100%.
884
00:33:03,480 --> 00:33:06,520
Production: gate-controlled execute, approved connectors only,
885
00:33:06,520 --> 00:33:08,600
cost per answer and cost per action metrics,
886
00:33:08,600 --> 00:33:11,360
activity explorer logging of prompts and responses
887
00:33:11,360 --> 00:33:12,600
with label awareness.
888
00:33:12,600 --> 00:33:15,040
Stamp policies once at the group, not 500 times
889
00:33:15,040 --> 00:33:17,920
per environment. Entropy hates inherited rigor.
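
As a rough illustration of "stamp once, inherit everywhere", the sketch below keeps policy at the group level and lets each environment look it up. The group names, environment names, budget figures, and both functions are hypothetical; applying the 80% alert and 100% stop to every group is an assumption, since the transcript states those thresholds only for the team test tier.

```python
# Policy is defined once per group; environments only name their group.
GROUP_POLICIES = {
    "personal-dev": {"tier": "reason", "external_connectors": False, "budget_ceiling": 50.0},
    "team-test": {"tier": "plan", "external_connectors": True, "budget_ceiling": 500.0},
    "production": {"tier": "execute", "external_connectors": True, "budget_ceiling": 5000.0},
}

# Hypothetical environments mapped to their group.
ENVIRONMENT_GROUP = {
    "alice-dev": "personal-dev",
    "bob-dev": "personal-dev",
    "finance-test": "team-test",
    "finance-prod": "production",
}


def policy_for(environment: str) -> dict:
    """An environment has no policy of its own; it inherits from its group."""
    return GROUP_POLICIES[ENVIRONMENT_GROUP[environment]]


def budget_state(spend: float, ceiling: float) -> str:
    """Alert at 80% of the ceiling, stop at 100%."""
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    ratio = spend / ceiling
    if ratio >= 1.0:
        return "stop"
    if ratio >= 0.8:
        return "alert"
    return "ok"
```

Changing one entry in `GROUP_POLICIES` changes every environment in that group, which is the inherited rigor the transcript is arguing for.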
890
00:33:17,920 --> 00:33:19,840
Instrument the system before you launch it.
891
00:33:19,840 --> 00:33:21,320
Budget is a control, not a report.
892
00:33:21,320 --> 00:33:23,240
Attach per-agent ceilings and anomaly alerts
893
00:33:23,240 --> 00:33:25,600
for sensitive-label mentions, cross-domain co-occurrence,
894
00:33:25,600 --> 00:33:28,680
connector deviations and execution without recorded approvals.
895
00:33:28,680 --> 00:33:31,320
Define owners: human names for every agent identity
896
00:33:31,320 --> 00:33:32,440
and every environment.
897
00:33:32,440 --> 00:33:34,160
No owner, no runtime.
898
00:33:34,160 --> 00:33:36,640
Establish drift reviews: monthly checks on scopes,
899
00:33:36,640 --> 00:33:38,400
connectors and refusal rates.
900
00:33:38,400 --> 00:33:41,280
If refusal is dropping to zero, your gates are eroding.
901
00:33:41,280 --> 00:33:43,840
If cost per answer is rising, your retrieval windows
902
00:33:43,840 --> 00:33:45,360
or sources are bloated.
903
00:33:45,360 --> 00:33:47,120
Common anti-patterns you will kill here:
904
00:33:47,120 --> 00:33:48,960
"We'll start open and tighten later."
905
00:33:48,960 --> 00:33:49,760
You won't.
906
00:33:49,760 --> 00:33:51,080
"Labels first, policies later."
907
00:33:51,080 --> 00:33:52,760
Stickers don't stop retrieval.
908
00:33:52,760 --> 00:33:54,920
Prompts as policy: wishes are not gates.
909
00:33:54,920 --> 00:33:56,720
Maker credentials as runtime.
910
00:33:56,720 --> 00:33:58,800
That's how blast radii become incidents.
911
00:33:58,800 --> 00:34:01,720
And the softest failure: "We'll keep a wiki of rules."
912
00:34:01,720 --> 00:34:03,240
Control planes don't read wikis.
913
00:34:03,240 --> 00:34:05,160
They read policies, scopes, and roles.
914
00:34:05,160 --> 00:34:07,640
If this sounds heavy, it's because execution is heavy.
915
00:34:07,640 --> 00:34:09,160
But this is the mandate that prevents you
916
00:34:09,160 --> 00:34:12,120
from narrating intent while the engine compiles entropy.
917
00:34:12,120 --> 00:34:13,440
Define the system once.
918
00:34:13,440 --> 00:34:15,240
Everything else is enforcement.
919
00:34:15,240 --> 00:34:19,400
Mandate two: boundaries first. Constrain context, then language.
920
00:34:19,400 --> 00:34:22,920
If mandate one names the system, mandate two draws the fence.
921
00:34:22,920 --> 00:34:24,520
You don't start with clever prompts.
922
00:34:24,520 --> 00:34:26,480
You start with hard edges that constrain what
923
00:34:26,480 --> 00:34:29,480
the engine can see and touch. Context before language,
924
00:34:29,480 --> 00:34:31,880
boundaries before tone.
925
00:34:31,880 --> 00:34:34,720
Architecturally, that means three concrete controls.
926
00:34:34,720 --> 00:34:37,520
Graph scoping, sensitivity enforcement, and connector
927
00:34:37,520 --> 00:34:38,640
allow lists.
928
00:34:38,640 --> 00:34:40,480
In combination, they convert "everything
929
00:34:40,480 --> 00:34:44,080
the caller can read" into "only what this agent may retrieve."
930
00:34:44,080 --> 00:34:47,040
If this sounds heavy, it's because execution is heavy.
931
00:34:47,040 --> 00:34:49,800
But this is plumbing you set once, then inherit everywhere.
932
00:34:49,800 --> 00:34:51,320
Begin with graph scoping.
933
00:34:51,320 --> 00:34:53,320
Default deny at the agent level.
934
00:34:53,320 --> 00:34:57,160
Then add explicit inclusions by site, team, mailbox, time
935
00:34:57,160 --> 00:34:58,360
window and label.
936
00:34:58,360 --> 00:35:00,200
A Teams summarizer for Project A doesn't
937
00:35:00,200 --> 00:35:02,120
get all tenant SharePoint.
938
00:35:02,120 --> 00:35:05,480
It gets sites: Project A; channels: Project A General;
939
00:35:05,480 --> 00:35:09,440
time window: 14 days; labels: exclude HR, legal.
940
00:35:09,440 --> 00:35:12,240
In Outlook, summarization can see only the current mailbox,
941
00:35:12,240 --> 00:35:14,360
not shared mailboxes or shared folders,
942
00:35:14,360 --> 00:35:16,960
and it refuses cross-mailbox search, full stop.
943
00:35:16,960 --> 00:35:18,920
If a scope isn't named, it doesn't exist.
944
00:35:18,920 --> 00:35:20,640
That is the difference between bounded retrieval
945
00:35:20,640 --> 00:35:21,680
and lateral discovery.
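
The default-deny scoping rule lends itself to a small predicate. This is a sketch under stated assumptions, not how Microsoft Graph or Copilot actually evaluates access: `Scope`, `Item`, and `may_retrieve` are hypothetical names, and the site and label strings are placeholders.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Scope:
    sites: frozenset            # explicitly named inclusions
    max_age_days: int           # time window
    excluded_labels: frozenset  # labels that are never grounded on


@dataclass(frozen=True)
class Item:
    site: str
    label: str
    modified: datetime


def may_retrieve(scope: Optional[Scope], item: Item, now: datetime) -> bool:
    """If a scope isn't named, it doesn't exist: everything not included is denied."""
    if scope is None:
        return False
    if item.site not in scope.sites:
        return False
    if item.label in scope.excluded_labels:
        return False
    if now - item.modified > timedelta(days=scope.max_age_days):
        return False
    return True


# Example scope for the Project A summarizer described above.
PROJECT_A = Scope(
    sites=frozenset({"Project A"}),
    max_age_days=14,
    excluded_labels=frozenset({"HR", "Legal"}),
)
```

The caller's own permissions never appear in the function. That omission is deliberate: the agent's scope, not "user has access", is the boundary.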
946
00:35:21,680 --> 00:35:25,000
Next, enforce sensitivity with effect, not aspiration.
947
00:35:25,000 --> 00:35:27,280
A label must change behavior at retrieval time.
948
00:35:27,280 --> 00:35:30,320
For high-risk labels (confidential HR, legal privilege,
949
00:35:30,320 --> 00:35:32,840
restricted finance), bind policies that either deny
950
00:35:32,840 --> 00:35:35,920
grounding entirely or redact recognized sensitive types
951
00:35:35,920 --> 00:35:36,840
before synthesis.
952
00:35:36,840 --> 00:35:38,120
Don't rely on model etiquette.
953
00:35:38,120 --> 00:35:41,160
Use Purview policy to force deny or redact,
954
00:35:41,160 --> 00:35:43,600
and instrument every attempted inclusion
955
00:35:43,600 --> 00:35:48,120
so Activity explorer shows sensitive interactions per AI app.
956
00:35:48,120 --> 00:35:49,840
That metric is your canary.
957
00:35:49,840 --> 00:35:52,320
Rising counts mean your fences have gaps.
958
00:35:52,320 --> 00:35:54,560
Connector allow lists are the third rail.
959
00:35:54,560 --> 00:35:57,600
Managed environments get a named set of connectors per risk tier.
960
00:35:57,600 --> 00:35:59,680
Dev can experiment with a small expansion pack
961
00:35:59,680 --> 00:36:01,560
under budget and with expiry.
962
00:36:01,560 --> 00:36:03,560
Prod gets only approved connectors:
963
00:36:03,560 --> 00:36:07,080
no unknown APIs, no generic HTTP in the dark.
964
00:36:07,080 --> 00:36:09,320
If a proposed agent or flow needs a new connector,
965
00:36:09,320 --> 00:36:11,360
it routes to the gate you'll define later.
966
00:36:11,360 --> 00:36:13,440
"Temporarily allow it" is how entropy moves in.
967
00:36:13,440 --> 00:36:14,720
Don't sign that waiver.
968
00:36:14,720 --> 00:36:16,920
Language boundaries come after context boundaries.
969
00:36:16,920 --> 00:36:18,920
Prompt contracts do not replace fences.
970
00:36:18,920 --> 00:36:19,840
They live inside them.
971
00:36:19,840 --> 00:36:21,640
So, once scopes and policies are tight,
972
00:36:21,640 --> 00:36:24,440
you define prompt constraints as a device, not a defense.
973
00:36:24,440 --> 00:36:28,240
In Teams: "Summarize this channel, exclude names,
974
00:36:28,240 --> 00:36:30,720
and output as bullets with citations."
975
00:36:30,720 --> 00:36:32,880
In Outlook: "Draft but never send.
976
00:36:32,880 --> 00:36:35,800
Reference only attached or in-thread content."
977
00:36:35,800 --> 00:36:37,200
The model cannot be polite enough
978
00:36:37,200 --> 00:36:39,000
to overcome a wide open scope.
979
00:36:39,000 --> 00:36:40,280
But with a bounded corpus,
980
00:36:40,280 --> 00:36:43,920
prompt constraints make outputs machine readable and auditable.
981
00:36:43,920 --> 00:36:46,760
Persona is a constraint device, not decoration.
982
00:36:46,760 --> 00:36:48,720
A procurement policy analyst persona
983
00:36:48,720 --> 00:36:50,160
narrows verbs and sources:
984
00:36:50,160 --> 00:36:52,720
cite policy anchors, refuse without canon,
985
00:36:52,720 --> 00:36:54,080
never propose execution.
986
00:36:54,080 --> 00:36:57,880
A flow architect persona outputs JSON plans, never actions.
987
00:36:57,880 --> 00:36:59,720
Tie personas to identities and scopes
988
00:36:59,720 --> 00:37:01,760
so the engine cannot act out of character,
989
00:37:01,760 --> 00:37:03,960
even if the prompt tries to jailbreak tone.
990
00:37:03,960 --> 00:37:06,040
Make default deny the operational habit.
991
00:37:06,040 --> 00:37:08,400
High-risk sources start out of scope.
992
00:37:08,400 --> 00:37:11,120
Access is opt-in with explicit owner, purpose,
993
00:37:11,120 --> 00:37:12,240
expiry and budget.
994
00:37:12,240 --> 00:37:14,480
Periodic reviews re-prove scope and revoke
995
00:37:14,480 --> 00:37:16,440
by default if owners don't respond.
996
00:37:16,440 --> 00:37:17,400
This isn't bureaucracy.
997
00:37:17,400 --> 00:37:19,160
It's how you prevent "one time for a demo"
998
00:37:19,160 --> 00:37:22,240
from turning into "always available to every summary."
999
00:37:22,240 --> 00:37:23,720
Add two critical refusal rules.
1000
00:37:23,720 --> 00:37:26,240
First: no citation, no answer, for any claim
1001
00:37:26,240 --> 00:37:27,440
that sounds like policy.
1002
00:37:27,440 --> 00:37:29,400
The evaluator checks the registry you define
1003
00:37:29,400 --> 00:37:31,200
and stops if the link isn't present.
1004
00:37:31,200 --> 00:37:32,760
Second, no scope, no ground.
1005
00:37:32,760 --> 00:37:35,840
If retrieval can't prove the source is authorized
1006
00:37:35,840 --> 00:37:37,880
for this agent, the answer does not attempt
1007
00:37:37,880 --> 00:37:39,040
a best effort.
1008
00:37:39,040 --> 00:37:41,280
The refusal message offers a bounded alternative,
1009
00:37:41,280 --> 00:37:43,280
"attach a project or policy library."
1010
00:37:43,280 --> 00:37:45,360
You give users a ladder back into the fence.
1011
00:37:45,360 --> 00:37:47,000
For Teams and Outlook, bake boundaries
1012
00:37:47,000 --> 00:37:48,800
into configuration, not culture.
1013
00:37:48,800 --> 00:37:50,880
Teams summarization excludes private channels
1014
00:37:50,880 --> 00:37:54,000
and sensitive domains by policy, not reminder.
1015
00:37:54,000 --> 00:37:55,720
Outlook summarization is label-aware
1016
00:37:55,720 --> 00:37:57,640
and redacts by rule, not by etiquette.
1017
00:37:57,640 --> 00:38:00,200
If a user needs a one-off, they attach a scope contract
1018
00:38:00,200 --> 00:38:01,160
with expiry.
1019
00:38:01,160 --> 00:38:03,160
Your defaults remain safe.
1020
00:38:03,160 --> 00:38:06,880
Common anti-patterns: trusting "only summarize this channel"
1021
00:38:06,880 --> 00:38:09,160
in a prompt to stop cross-site retrieval,
1022
00:38:09,160 --> 00:38:12,600
putting labels on content without deny/redact policies,
1023
00:38:12,600 --> 00:38:16,920
enabling Copilot on org-wide sites whose owners are former employees,
1024
00:38:16,920 --> 00:38:19,960
and treating connector governance as a "future hardening step."
1025
00:38:19,960 --> 00:38:22,720
Future never arrives, entropy does.
1026
00:38:22,720 --> 00:38:24,080
This mandate is simple to test.
1027
00:38:24,080 --> 00:38:26,000
Ask: "Can this agent read X?"
1028
00:38:26,000 --> 00:38:28,840
If the answer isn't a provable no, your fence has a hole.
1029
00:38:28,840 --> 00:38:32,680
Constrain context first, then and only then refine language.
1030
00:38:32,680 --> 00:38:35,960
Mandate three: demand structured output or don't automate.
1031
00:38:35,960 --> 00:38:39,080
If mandate one defines the system and mandate two draws the fence,
1032
00:38:39,080 --> 00:38:42,120
this is the first mandate that enables automation under control.
1033
00:38:42,120 --> 00:38:43,600
Without structure, you can draft.
1034
00:38:43,600 --> 00:38:44,600
You cannot act.
1035
00:38:44,600 --> 00:38:48,600
A wall of text is untestable, unauditable, and unenforceable.
1036
00:38:48,600 --> 00:38:50,480
Structure turns prose into a contract
1037
00:38:50,480 --> 00:38:52,560
the rest of your control plane can evaluate.
1038
00:38:52,560 --> 00:38:53,240
Here's the rule.
1039
00:38:53,240 --> 00:38:56,000
If it's not machine readable, it's not actionable.
1040
00:38:56,000 --> 00:38:58,840
That means tables, JSON, or schemas, never freeform paragraphs,
1041
00:38:58,840 --> 00:39:02,160
whenever an answer might drive a decision, a ticket or a write.
1042
00:39:02,160 --> 00:39:04,280
Human readers can infer, gates cannot.
1043
00:39:04,280 --> 00:39:08,080
Evaluators need shape to check completeness, citations and refusal reasons
1044
00:39:08,080 --> 00:39:11,400
before anything leaves the reason/plan lane.
1045
00:39:11,400 --> 00:39:14,000
Start by separating outputs into three classes.
1046
00:39:14,000 --> 00:39:16,400
Class A, narrative only: drafting emails,
1047
00:39:16,400 --> 00:39:18,320
rephrasing updates, summarizing meetings.
1048
00:39:18,320 --> 00:39:19,960
These never trigger actions.
1049
00:39:19,960 --> 00:39:23,960
Let them be prose, but still require citations when they assert policy.
1050
00:39:23,960 --> 00:39:27,640
Class B, decisions that inform humans: recommendations,
1051
00:39:27,640 --> 00:39:29,840
option matrices, risk assessments.
1052
00:39:29,840 --> 00:39:31,440
These must be structured.
1053
00:39:31,440 --> 00:39:34,840
Tables with defined columns, JSON with required keys,
1054
00:39:34,840 --> 00:39:37,960
enumerations with explicit constraints. Class C,
1055
00:39:37,960 --> 00:39:40,240
anything that could be executed: plans for flows,
1056
00:39:40,240 --> 00:39:42,680
remediation steps, access changes.
1057
00:39:42,680 --> 00:39:44,600
These must be structured and validatable
1058
00:39:44,600 --> 00:39:47,040
against a schema that your gate understands.
1059
00:39:47,040 --> 00:39:48,680
Tables are your friend for Class B.
1060
00:39:48,680 --> 00:39:52,240
Define columns once per use case and reuse them relentlessly.
1061
00:39:52,240 --> 00:39:56,320
For a risk recommendation: option, prerequisites, citations,
1062
00:39:56,320 --> 00:40:00,200
predicted impact, residual risk and refusal reason if incomplete.
1063
00:40:00,200 --> 00:40:02,640
In Teams, that renders readable. In your gate,
1064
00:40:02,640 --> 00:40:05,000
those columns become fields to verify.
1065
00:40:05,000 --> 00:40:08,200
Citations resolve to the registry, prerequisites exist
1066
00:40:08,200 --> 00:40:11,320
in the environment, residual risk falls below a threshold.
1067
00:40:11,320 --> 00:40:14,320
If a cell is missing, the evaluator refuses.
1068
00:40:14,320 --> 00:40:16,520
Silence beats fiction.
1069
00:40:16,520 --> 00:40:19,600
JSON is the right fit for Class C and plan artifacts.
1070
00:40:19,600 --> 00:40:20,840
The pattern is simple.
1071
00:40:20,840 --> 00:40:25,840
Goal, assumptions, sources (an array of canonical links),
1072
00:40:25,840 --> 00:40:30,000
steps (each with action, target, preconditions,
1073
00:40:30,000 --> 00:40:34,400
rollback), risk tier, approvals required.
1074
00:40:34,400 --> 00:40:37,360
[unintelligible]
1075
00:40:37,360 --> 00:40:40,000
Your evaluator enforces required keys,
1076
00:40:40,000 --> 00:40:43,240
checks that each source resolves to the authorized corpus,
1077
00:40:43,240 --> 00:40:46,040
confirms rollback exists for each step
1078
00:40:46,040 --> 00:40:49,480
and rejects any action that touches a disallowed connector.
1079
00:40:49,480 --> 00:40:52,120
Because it's shape-checked, humans can approve quickly.
1080
00:40:52,120 --> 00:40:53,640
Because it's bound to the registry,
1081
00:40:53,640 --> 00:40:55,760
hallucinated authority can't sneak in.
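
The evaluator checks listed above (required keys, sources that resolve to the authorized corpus, rollback on every step, no disallowed connectors) can be sketched as a plain validation function. The key names follow the fields named in the transcript, but the exact schema is an assumption: treating `risk_tier` as a plan-level key and adding a per-step `connector` field are illustrative choices, and `validate_plan` is a hypothetical function, not a real gate API.

```python
REQUIRED_PLAN_KEYS = {
    "goal", "assumptions", "sources", "steps", "risk_tier", "approvals_required",
}
REQUIRED_STEP_KEYS = {"action", "target", "preconditions", "rollback"}


def validate_plan(plan: dict, authorized_sources: set, allowed_connectors: set) -> list:
    """Return a list of validation errors; an empty list means the plan passed."""
    errors = []
    for key in sorted(REQUIRED_PLAN_KEYS - set(plan)):
        errors.append(f"missing key: {key}")
    for source in plan.get("sources", []):
        if source not in authorized_sources:
            errors.append(f"unauthorized source: {source}")
    for i, step in enumerate(plan.get("steps", [])):
        for key in sorted(REQUIRED_STEP_KEYS - set(step)):
            errors.append(f"step {i}: missing {key}")
        if "rollback" in step and not step["rollback"]:
            errors.append(f"step {i}: empty rollback")
        # "connector" is a hypothetical field naming what the step touches.
        connector = step.get("connector")
        if connector is not None and connector not in allowed_connectors:
            errors.append(f"step {i}: disallowed connector {connector}")
    return errors
```

Because the result is a list of named failures rather than a yes or no, the same function can feed both the gate decision and the field-level feedback shown to the person who proposed the plan.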
1082
00:40:55,760 --> 00:40:58,920
Now build evaluators. Real ones: not people reading and nodding,
1083
00:40:58,920 --> 00:41:01,360
but code that enforces three things on every structured output:
1084
00:41:01,360 --> 00:41:03,400
shape, citations, and refusal.
1085
00:41:03,400 --> 00:41:06,280
Shape: does the output match the table or JSON schema?
1086
00:41:06,280 --> 00:41:07,480
Citations:
1087
00:41:07,480 --> 00:41:09,520
do the links resolve to the registered canon
1088
00:41:09,520 --> 00:41:12,440
for this domain with stable anchors and acceptable labels?
1089
00:41:12,440 --> 00:41:13,440
Refusal:
1090
00:41:13,440 --> 00:41:15,080
if any required element is missing
1091
00:41:15,080 --> 00:41:17,240
or an assertion lacks a valid citation,
1092
00:41:17,240 --> 00:41:19,400
the output must carry a refusal-reason field
1093
00:41:19,400 --> 00:41:20,560
explaining why it stopped.
1094
00:41:20,560 --> 00:41:23,160
You are not training politeness, you are compiling safety.
1095
00:41:23,160 --> 00:41:24,440
This is where teams stumble.
1096
00:41:24,440 --> 00:41:26,560
They treat structure as formatting. It isn't.
1097
00:41:26,560 --> 00:41:27,720
It's governance.
1098
00:41:27,720 --> 00:41:30,160
For example, a Power Automate plan that arrives
1099
00:41:30,160 --> 00:41:32,760
as a bulleted list can't be diffed, can't be linted,
1100
00:41:32,760 --> 00:41:34,440
can't be compared to what ran.
1101
00:41:34,440 --> 00:41:36,840
The same plan as JSON can be checked against the policy
1102
00:41:36,840 --> 00:41:39,800
that forbids loops, blocks unapproved connectors,
1103
00:41:39,800 --> 00:41:41,040
and requires rollback.
1104
00:41:41,040 --> 00:41:42,480
The difference is not pedantry.
1105
00:41:42,480 --> 00:41:45,240
It's whether execute will ever be deterministic.
1106
00:41:45,240 --> 00:41:47,520
Make structure ergonomic so people adopt it.
1107
00:41:47,520 --> 00:41:50,440
Ship prompt scaffolds that produce the shape by default:
1108
00:41:50,440 --> 00:41:52,960
"When proposing a vendor onboarding exception process,
1109
00:41:52,960 --> 00:41:55,880
output JSON with these keys." Embed those scaffolds
1110
00:41:55,880 --> 00:41:57,560
in your agent persona so users don't
1111
00:41:57,560 --> 00:41:58,840
have to memorize them.
1112
00:41:58,840 --> 00:42:01,320
Provide one-click validators in Teams and Outlook:
1113
00:42:01,320 --> 00:42:03,880
"Validate plan" runs shape and citation checks
1114
00:42:03,880 --> 00:42:05,960
and shows exactly which fields failed.
1115
00:42:05,960 --> 00:42:09,320
Reduce friction at the point of thought, not after the fact.
1116
00:42:09,320 --> 00:42:12,480
Common anti-patterns: letting quick answers be free text
1117
00:42:12,480 --> 00:42:14,400
because "we'll paste into a ticket."
1118
00:42:14,400 --> 00:42:16,800
You won't, and copy-paste loses context.
1119
00:42:16,800 --> 00:42:19,400
Accepting tables that vary columns per answer:
1120
00:42:19,400 --> 00:42:21,760
evaluators can't enforce shifting shapes.
1121
00:42:21,760 --> 00:42:24,800
Allowing execution proposals that cite "internal practice"
1122
00:42:24,800 --> 00:42:27,520
without a link: that's just polite hallucination.
1123
00:42:27,520 --> 00:42:30,160
And worst, retrofitting structure after the gate.
1124
00:42:30,160 --> 00:42:32,080
If you didn't demand shape before review,
1125
00:42:32,080 --> 00:42:34,600
you're rubber-stamping narrative. Detection is easy
1126
00:42:34,600 --> 00:42:37,280
if you instrument. Track refusal rates by evaluator.
1127
00:42:37,280 --> 00:42:40,080
If refusal is near zero, your checks are theater.
1128
00:42:40,080 --> 00:42:41,720
If refusal spikes on citations,
1129
00:42:41,720 --> 00:42:44,040
your registry is missing canon for that domain.
1130
00:42:44,040 --> 00:42:46,720
If validators flag missing rollback repeatedly,
1131
00:42:46,720 --> 00:42:48,720
your makers are defaulting to writes.
1132
00:42:48,720 --> 00:42:51,160
These are not content problems. They're control signals.
1133
00:42:51,160 --> 00:42:54,080
If this sounds heavy, it's because execution is heavy.
1134
00:42:54,080 --> 00:42:54,880
Embrace that.
1135
00:42:54,880 --> 00:42:57,560
Structure is how you make proposals explorable, comparable,
1136
00:42:57,560 --> 00:42:58,680
and safe to approve.
1137
00:42:58,680 --> 00:43:00,160
It's also how you make them teachable.
1138
00:43:00,160 --> 00:43:02,600
Your JSON and tables become living examples
1139
00:43:02,600 --> 00:43:03,840
for the next agent run.
1140
00:43:03,840 --> 00:43:06,960
Over time, your organization will internalize the shape
1141
00:43:06,960 --> 00:43:08,960
and Copilot will meet them there.
1142
00:43:08,960 --> 00:43:11,360
Remember the chain: mandate one defines the system.
1143
00:43:11,360 --> 00:43:12,960
Mandate two constrains context.
1144
00:43:12,960 --> 00:43:15,520
Mandate three supplies the first enforceable artifact.
1145
00:43:15,520 --> 00:43:18,960
From here on, reason outputs structure, plan carries schema,
1146
00:43:18,960 --> 00:43:21,640
gate enforces it, and execute refuses anything
1147
00:43:21,640 --> 00:43:22,840
that arrives as prose.
1148
00:43:22,840 --> 00:43:25,040
Without this mandate, you're back to walls of text
1149
00:43:25,040 --> 00:43:26,640
and conditional chaos.
1150
00:43:26,640 --> 00:43:27,920
With it, you finally have something
1151
00:43:27,920 --> 00:43:29,800
a control plane can govern.
1152
00:43:29,800 --> 00:43:33,040
Mandate four: separate reasoning from execution. Reason, plan,
1153
00:43:33,040 --> 00:43:34,200
gate, execute.
1154
00:43:34,200 --> 00:43:35,800
This is the reference control pattern.
1155
00:43:35,800 --> 00:43:38,560
Memorize it: reason, plan, gate, execute.
1156
00:43:38,560 --> 00:43:40,400
It turns "helpful" into governed change
1157
00:43:40,400 --> 00:43:42,240
and converts probabilistic suggestions
1158
00:43:42,240 --> 00:43:43,600
into deterministic actions.
1159
00:43:43,600 --> 00:43:45,600
If you implement nothing else, implement this.
1160
00:43:45,600 --> 00:43:47,800
It's the spine that keeps the rest upright.
1161
00:43:47,800 --> 00:43:50,320
Reason is where the model thinks, compares options,
1162
00:43:50,320 --> 00:43:51,640
and gathers sources.
1163
00:43:51,640 --> 00:43:54,200
It produces analysis, alternatives, and trade-offs.
1164
00:43:54,200 --> 00:43:56,440
But reason never touches data or systems.
1165
00:43:56,440 --> 00:43:57,920
It's read-only by contract.
1166
00:43:57,920 --> 00:44:00,200
Its only deliverables are a structured recommendation
1167
00:44:00,200 --> 00:44:02,440
and the citations that bind it to canon.
1168
00:44:02,440 --> 00:44:05,040
If an output looks like prose that could drive a decision,
1169
00:44:05,040 --> 00:44:07,680
reason refuses: structure required.
1170
00:44:07,680 --> 00:44:11,120
If an assertion lacks a registered source, reason refuses.
1171
00:44:11,120 --> 00:44:12,360
Truth or silence.
1172
00:44:12,360 --> 00:44:14,320
Plan transforms that structured recommendation
1173
00:44:14,320 --> 00:44:16,400
into a machine readable plan artifact.
1174
00:44:16,400 --> 00:44:19,360
Not a wall of text. A schema: goal, assumptions,
1175
00:44:19,360 --> 00:44:21,800
sources with stable anchors, risk tier,
1176
00:44:21,800 --> 00:44:24,360
and a step list where every step has action, target,
1177
00:44:24,360 --> 00:44:26,800
preconditions, side effects, and rollback.
1178
00:44:26,800 --> 00:44:28,160
The plan is a diffable object.
1179
00:44:28,160 --> 00:44:30,360
You can lint it, compare it to policy,
1180
00:44:30,360 --> 00:44:32,680
and validate it without touching production.
1181
00:44:32,680 --> 00:44:35,720
Planning is not a euphemism for "almost execute."
1182
00:44:35,720 --> 00:44:37,840
It is the blueprint you'll put in front of a gate.
1183
00:44:37,840 --> 00:44:40,600
Gate evaluates the plan against risk tier and data class.
1184
00:44:40,600 --> 00:44:43,400
Low-risk, low-impact plans (read-only, no writes,
1185
00:44:43,400 --> 00:44:46,920
no identity change) can auto-approve under policy checks.
1186
00:44:46,920 --> 00:44:50,560
Medium-risk plans (writes with rollback in a constrained data set)
1187
00:44:50,560 --> 00:44:53,160
require human-in-the-loop approval by a named role.
1188
00:44:53,160 --> 00:44:56,360
High-risk plans (identity, finance, HR, compliance surfaces)
1189
00:44:56,360 --> 00:44:58,880
require a human on the loop plus policy checks,
1190
00:44:58,880 --> 00:45:02,320
separation of duties (SoD) conflict checks, and countersignatures.
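
The three risk tiers map naturally onto a routing function. The sketch below is hypothetical throughout: `gate_decision`, the tier strings, the check names, and the route labels are illustrative, and a real gate would also need the separation-of-duties logic that is only named here. It also assumes the gate refuses outright when any validation check fails.

```python
# Route labels are illustrative stand-ins for real approval workflows.
ROUTES = {
    "low": "auto_approve",
    "medium": "human_approval",
    "high": "human_approval_with_countersignature",
}


def gate_decision(risk_tier: str, checks: dict) -> dict:
    """Refuse on any failed check or unknown tier; otherwise route by risk tier.

    `checks` maps a check name (for example "shape", "citations", "rollback")
    to True if it passed.
    """
    failed = sorted(name for name, passed in checks.items() if not passed)
    if failed:
        return {"decision": "refuse", "refusal_reason": failed}
    if risk_tier not in ROUTES:
        return {"decision": "refuse", "refusal_reason": ["unknown risk tier"]}
    return {"decision": ROUTES[risk_tier]}
```

An unknown tier is refused rather than defaulted to low, which keeps the gate default-deny in the same way the retrieval scopes are.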
1191
00:45:02,320 --> 00:45:05,080
Gate is where "silence beats fiction" is enforced in code.
1192
00:45:05,080 --> 00:45:07,320
If shape fails, citations don't resolve
1193
00:45:07,320 --> 00:45:09,440
or rollback is missing, gate refuses.
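The gate checks described here can be sketched in a few lines. This is only an illustration under stated assumptions: the `Plan` and `Step` classes, the `REGISTRY` set, and the decision strings are hypothetical names, not part of any Copilot or Power Platform API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical registry of citable source IDs (assumption for this sketch).
REGISTRY = {"PROC-POLICY-3.4"}

@dataclass
class Step:
    action: str
    target: str
    writes: bool = False
    rollback: Optional[str] = None  # compensating action designed at plan time

@dataclass
class Plan:
    goal: str
    risk_tier: str  # "low", "medium", or "high"
    citations: List[str] = field(default_factory=list)
    steps: List[Step] = field(default_factory=list)

def gate(plan: Plan) -> Tuple[str, List[str]]:
    """Refuse on any failed check; otherwise route by risk tier."""
    reasons = []
    if plan.risk_tier not in ("low", "medium", "high") or not plan.steps:
        reasons.append("shape check failed")
    reasons += [f"citation does not resolve: {c}"
                for c in plan.citations if c not in REGISTRY]
    reasons += [f"write without rollback: {s.action}"
                for s in plan.steps if s.writes and not s.rollback]
    if reasons:
        return "refuse", reasons  # the gate does not negotiate
    has_writes = any(s.writes for s in plan.steps)
    if plan.risk_tier == "high":
        return "needs_countersignature", []
    if plan.risk_tier == "medium" or has_writes:
        return "needs_named_approver", []
    return "auto_approve", []
```

Because the plan is plain data, this check can run against a proposed plan without touching production.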
1194
00:45:09,440 --> 00:45:12,200
It doesn't negotiate. Execute runs the approved plan
1195
00:45:12,200 --> 00:45:14,800
in the smallest blast radius with a separate agent identity
1196
00:45:14,800 --> 00:45:16,960
that has only the permissions needed for those steps,
1197
00:45:16,960 --> 00:45:19,920
only for the duration, and only in the approved environment.
1198
00:45:19,920 --> 00:45:22,440
Every action is logged, every step checks preconditions,
1199
00:45:22,440 --> 00:45:24,560
every write pairs with a compensating transaction
1200
00:45:24,560 --> 00:45:26,040
you designed at plan time.
1201
00:45:26,040 --> 00:45:27,840
If any check fails, execute stops
1202
00:45:27,840 --> 00:45:30,160
and emits a structured failure report back to gate.
1203
00:45:30,160 --> 00:45:32,000
There is no best effort.
1204
00:45:32,000 --> 00:45:33,520
There is no "we'll fix it later."
1205
00:45:33,520 --> 00:45:35,120
Now, wire this into your platforms.
1206
00:45:35,120 --> 00:45:38,000
In Teams, reason outputs a structured recommendation table.
1207
00:45:38,000 --> 00:45:39,520
Plan emits JSON.
1208
00:45:39,520 --> 00:45:41,600
Gate is an approval card that renders plan shape,
1209
00:45:41,600 --> 00:45:44,880
flags failed validations, and requires the correct approval role.
1210
00:45:44,880 --> 00:45:47,120
Execute is an orchestrated back end process
1211
00:45:47,120 --> 00:45:49,440
that runs only after gate stamps the plan.
1212
00:45:49,440 --> 00:45:52,880
In Outlook, reason drafts but never sends.
1213
00:45:52,880 --> 00:45:54,840
Plan produces a send-ready object
1214
00:45:54,840 --> 00:45:57,720
with recipients, content hash, and citations,
1215
00:45:57,720 --> 00:46:00,720
gate routes to the sender's manager or a policy mailbox,
1216
00:46:00,720 --> 00:46:04,160
execute sends under an agent identity with DKIM
1217
00:46:04,160 --> 00:46:07,200
and SPF alignment, and logs the exact body.
1218
00:46:07,200 --> 00:46:09,080
In Power Automate, codify the pattern.
1219
00:46:09,080 --> 00:46:12,840
Reason is a Copilot suggestion that can only output a plan schema.
1220
00:46:12,840 --> 00:46:16,680
Plan captures the proposed trigger, actions, connectors, and rollback.
1221
00:46:16,680 --> 00:46:18,200
Gate is a required approval step
1222
00:46:18,200 --> 00:46:20,640
that rejects any plan using disallowed connectors,
1223
00:46:20,640 --> 00:46:24,040
missing rollback or lacking citations for policy-dependent steps.
1224
00:46:24,040 --> 00:46:26,760
Execute is the flow running in managed environments
1225
00:46:26,760 --> 00:46:29,440
with an agent identity, not the maker's account,
1226
00:46:29,440 --> 00:46:30,880
and with dry run mode available
1227
00:46:30,880 --> 00:46:33,720
so you can inspect the blast radius before writes happen.
1228
00:46:33,720 --> 00:46:36,480
A word on rollback: you either design compensating transactions
1229
00:46:36,480 --> 00:46:38,680
up front or you accept irreversibility.
1230
00:46:38,680 --> 00:46:41,120
For every proposed write, define the inverse
1231
00:46:41,120 --> 00:46:43,000
and the conditions that make it valid.
1232
00:46:43,000 --> 00:46:45,960
If a plan sets provisional, the paired rollback clears it
1233
00:46:45,960 --> 00:46:47,800
reverses downstream side effects
1234
00:46:47,800 --> 00:46:49,720
and creates an audit link between them.
1235
00:46:49,720 --> 00:46:52,640
When gate sees a write without a rollback step, it refuses.
1236
00:46:52,640 --> 00:46:54,040
There is no "we'll add it later."
1237
00:46:54,040 --> 00:46:55,640
Telemetry is part of the pattern.
1238
00:46:55,640 --> 00:46:58,080
Instrument refusal rates in reason and gate.
1239
00:46:58,080 --> 00:47:00,920
When refusal drops to zero, the controls are theater.
1240
00:47:00,920 --> 00:47:03,200
Track gate latency and approval sources.
1241
00:47:03,200 --> 00:47:05,080
When approvals bunch around a single person,
1242
00:47:05,080 --> 00:47:07,360
you have a bottleneck and a single point of failure.
1243
00:47:07,360 --> 00:47:09,920
Measure cost per plan and cost per execution.
1244
00:47:09,920 --> 00:47:12,000
When cost doubles for the same outcome,
1245
00:47:12,000 --> 00:47:15,000
your retrieval bounds or execution environment drifted.
1246
00:47:15,000 --> 00:47:16,560
Anti-patterns to kill:
1247
00:47:16,560 --> 00:47:19,160
Brainstorming that directly emits runnable flows,
1248
00:47:19,160 --> 00:47:21,360
approvals that read like LGTM
1249
00:47:21,360 --> 00:47:23,840
with no machine readable plan underneath.
1250
00:47:23,840 --> 00:47:26,400
Running execute under human credentials,
1251
00:47:26,400 --> 00:47:29,720
and executing in production without a dry run option.
1252
00:47:29,720 --> 00:47:32,360
If you see them, you don't need more prompting workshops.
1253
00:47:32,360 --> 00:47:34,400
You need to enforce the pattern at the platform edges.
1254
00:47:34,400 --> 00:47:37,240
If this sounds heavy, it's because execution is heavy.
1255
00:47:37,240 --> 00:47:40,000
But discipline here buys you speed everywhere else.
1256
00:47:40,000 --> 00:47:43,200
Reason moves fast, plan is reusable, gate is predictable,
1257
00:47:43,200 --> 00:47:46,640
execute is safe. You are no longer outsourcing architecture to a prompt.
1258
00:47:46,640 --> 00:47:49,120
You are encoding it in a sequence the engine can't skip.
1259
00:47:49,120 --> 00:47:52,160
Mandate 5: authority gating, truth or silence.
1260
00:47:52,160 --> 00:47:55,400
We're going to remove the single largest psychological risk in the system,
1261
00:47:55,400 --> 00:47:57,560
confident fiction treated as policy.
1262
00:47:57,560 --> 00:48:01,360
Authority gating is how you force truth or silence at the point of answer,
1263
00:48:01,360 --> 00:48:04,280
not later in training, not in a guideline, at runtime.
1264
00:48:04,280 --> 00:48:06,880
Start with an authoritative source registry, by domain.
1265
00:48:06,880 --> 00:48:11,360
Procurement policy here, HR guidance there, security standards somewhere else.
1266
00:48:11,360 --> 00:48:15,360
Each entry has an owner, label, stable anchor, and version. Not a folder path:
1267
00:48:15,360 --> 00:48:19,360
a canonical ID you can cite, audit, and revoke. The registry is the law library.
1268
00:48:19,360 --> 00:48:22,440
Everything outside it is noise. Bind agents to that registry
1269
00:48:22,440 --> 00:48:26,000
through an evaluator that enforces one rule: no citation, no answer.
1270
00:48:26,000 --> 00:48:29,760
If a claim sounds like policy ("as per section 4.2,"
1271
00:48:29,760 --> 00:48:31,520
"company standard requires,"
1272
00:48:31,520 --> 00:48:36,320
"exceptions must"), the answer carries inline citations that resolve to registry entries.
1273
00:48:36,320 --> 00:48:40,240
If the links fail, the answer stops. Not a softer phrasing, not a hedged guess.
1274
00:48:40,240 --> 00:48:43,280
A refusal with a reason: "no approved source."
1275
00:48:43,280 --> 00:48:44,800
Silence beats fiction.
1276
00:48:44,800 --> 00:48:46,880
This is not a prompt trick, it's a gate.
1277
00:48:46,880 --> 00:48:49,440
Prompts can request; only gates can refuse.
1278
00:48:49,440 --> 00:48:52,640
Implement it as code in the same path that already checks shape.
1279
00:48:52,640 --> 00:48:56,160
The evaluator validates three things: structure matches schema,
1280
00:48:56,160 --> 00:48:59,520
citations resolve to canonical sources with acceptable labels
1281
00:48:59,520 --> 00:49:02,160
and refusal is present whenever either condition fails.
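A minimal sketch of such an evaluator follows. The registry entries, field names, and label values are hypothetical, chosen only to illustrate the three checks; a real deployment would resolve citations against its own registry service.

```python
# Hypothetical registry: canonical ID -> metadata (assumption for this sketch).
REGISTRY = {
    "PROC-POLICY#4.2@v3.4": {"owner": "procurement", "label": "internal"},
    "HR-GUIDE#1.1@v2.0": {"owner": "hr", "label": "confidential"},
}
REQUIRED_FIELDS = {"claim", "citations"}
REFUSAL = {"refused": True, "reason": "no approved source"}

def evaluate(answer: dict, allowed_labels: set) -> dict:
    """Pass the answer through only if structure and citations both hold."""
    # Check 1: structure matches the expected schema.
    if not REQUIRED_FIELDS.issubset(answer):
        return dict(REFUSAL)
    # Check 2: at least one citation, and every citation resolves to a
    # registry entry whose label is acceptable for this audience.
    if not answer["citations"]:
        return dict(REFUSAL)
    for cid in answer["citations"]:
        entry = REGISTRY.get(cid)
        if entry is None or entry["label"] not in allowed_labels:
            return dict(REFUSAL)
    # Check 3 is implicit: every failure path above returns a refusal.
    return {"refused": False, **answer}
```

The point of the design is that refusal is produced by code on the answer path, so a persuasive prompt cannot talk its way past it.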
1282
00:49:02,160 --> 00:49:03,600
You're not asking the model to behave,
1283
00:49:03,600 --> 00:49:06,800
you're compelling the system to hold when authority isn't proven.
1284
00:49:06,800 --> 00:49:10,240
Make refusal a first-class outcome, friendly but immovable.
1285
00:49:10,240 --> 00:49:12,000
"I can't answer from approved sources.
1286
00:49:12,000 --> 00:49:15,760
Would you like me to search the procurement policy library or notify its owner?"
1287
00:49:15,760 --> 00:49:20,000
Give the user bounded ladders back into canon but never best effort.
1288
00:49:20,000 --> 00:49:22,720
People under deadline will accept anything that looks official.
1289
00:49:22,720 --> 00:49:23,680
Remove the option.
1290
00:49:23,680 --> 00:49:25,600
Now scope the registry to the work.
1291
00:49:25,600 --> 00:49:28,560
In Teams, the project summarizer cites only project canon:
1292
00:49:28,560 --> 00:49:31,200
minutes, design decisions, and posted policy summaries,
1293
00:49:31,200 --> 00:49:34,320
never raw HR or legal. In Outlook, message summaries
1294
00:49:34,320 --> 00:49:37,680
cite the current thread or attached policy extracts that live in the registry,
1295
00:49:37,680 --> 00:49:39,360
not scraped text from a drive.
1296
00:49:39,360 --> 00:49:43,600
For Copilot Studio agents, bind each persona to its domain.
1297
00:49:43,600 --> 00:49:47,200
The policy analyst cites only policy with anchors.
1298
00:49:47,200 --> 00:49:50,800
The controls engineer cites standards and procedural work instructions.
1299
00:49:50,800 --> 00:49:53,680
Cross domain requires an explicit attach,
1300
00:49:53,680 --> 00:49:56,720
with a session contract that names both registries and an expiry.
1301
00:49:56,720 --> 00:50:00,080
Here's the enforcement detail most people skip: citation proof.
1302
00:50:00,080 --> 00:50:03,520
The link must resolve to a registered artifact with a stable anchor,
1303
00:50:03,520 --> 00:50:05,520
a version and a permissible label.
1304
00:50:05,520 --> 00:50:08,560
A pasted URL to a random doc is not a citation.
1305
00:50:08,560 --> 00:50:11,920
A sentence that paraphrases canon without pointing to it is not a citation.
1306
00:50:11,920 --> 00:50:16,400
The evaluator follows links, checks labels (no confidential HR or legal in a public answer),
1307
00:50:16,400 --> 00:50:17,920
and rejects if any element fails.
1308
00:50:17,920 --> 00:50:19,040
You don't debate nuance.
1309
00:50:19,040 --> 00:50:19,440
You stop.
1310
00:50:19,440 --> 00:50:22,000
Guard against policy evasion.
1311
00:50:22,000 --> 00:50:24,560
Communication Compliance can flag jailbreak language:
1312
00:50:24,560 --> 00:50:25,760
"standard practice,"
1313
00:50:25,760 --> 00:50:28,480
"as everyone knows," "we typically," "per usual."
1314
00:50:28,480 --> 00:50:31,920
When these appear in a policy-shaped answer without citations,
1315
00:50:31,920 --> 00:50:34,160
route to refusal and log an event.
1316
00:50:34,160 --> 00:50:36,720
You're training the organization that policy is not vibes.
1317
00:50:36,720 --> 00:50:39,440
It's anchors and owners. What about stale or wrong canon?
1318
00:50:39,440 --> 00:50:41,760
Authority gating doesn't guarantee correctness.
1319
00:50:41,760 --> 00:50:43,360
It guarantees traceability.
1320
00:50:43,360 --> 00:50:47,200
When gate refuses a plan because the citation points to an outdated control,
1321
00:50:47,200 --> 00:50:49,920
the human on the loop updates canon, not the prompt.
1322
00:50:49,920 --> 00:50:52,160
That change propagates to every future answer,
1323
00:50:52,160 --> 00:50:54,400
because the registry is the single source.
1324
00:50:54,400 --> 00:50:56,400
You fix the system, not the sentence.
1325
00:50:56,400 --> 00:50:59,040
Make the gate ergonomic or people will route around it.
1326
00:50:59,040 --> 00:51:01,920
Provide "insert citation" actions in Teams and Outlook
1327
00:51:01,920 --> 00:51:03,680
that search only the registry.
1328
00:51:03,680 --> 00:51:07,280
Surface the most relevant anchors with preview snippets and labels.
1329
00:51:07,280 --> 00:51:09,680
Offer "file a gap" when canon is missing,
1330
00:51:09,680 --> 00:51:12,640
pre-filled with the refused answer and the needed domain.
1331
00:51:12,640 --> 00:51:15,840
Owners get queued requests with context, not vague pings.
1332
00:51:15,840 --> 00:51:16,720
Pause here.
1333
00:51:16,720 --> 00:51:17,920
You've crossed the midpoint.
1334
00:51:17,920 --> 00:51:21,200
Mandates 1 through 5 defined the system, drew the fence,
1335
00:51:21,200 --> 00:51:24,960
demanded structure, split thinking from doing, and gated authority.
1336
00:51:24,960 --> 00:51:28,800
Next we make state explicit, see drift, and bind identities so enforcement
1337
00:51:28,800 --> 00:51:30,240
survives contact with production.
1338
00:51:30,240 --> 00:51:31,920
Detect drift with telemetry.
1339
00:51:31,920 --> 00:51:36,880
Track the ratio of policy-shaped answers with valid citations to total policy-shaped answers.
1340
00:51:36,880 --> 00:51:40,240
When it dips, your evaluators are off or builders are bypassing.
1341
00:51:40,240 --> 00:51:41,840
Group refusals by domain.
1342
00:51:41,840 --> 00:51:44,720
Spikes in no canon tell you where to invest.
1343
00:51:44,720 --> 00:51:48,640
Log attempted external web citations for internal policy questions.
1344
00:51:48,640 --> 00:51:51,840
Those are design defects, not user mistakes.
1345
00:51:51,840 --> 00:51:56,800
Anti-patterns to kill: allowing "for internal use only" web results to pad answers,
1346
00:51:56,800 --> 00:52:00,160
accepting summaries that claim company policy without anchors,
1347
00:52:00,160 --> 00:52:04,160
letting builders mark whole SharePoint sites as canon without owner,
1348
00:52:04,160 --> 00:52:08,400
label and version, and worst, letting execution proceed on recommendations
1349
00:52:08,400 --> 00:52:10,400
that read official but cite nothing.
1350
00:52:10,400 --> 00:52:13,840
If this sounds heavy, it's because execution is heavy,
1351
00:52:13,840 --> 00:52:15,200
but the contract is simple.
1352
00:52:15,200 --> 00:52:19,360
Define the library, enforce the rule, teach the phrase: truth or silence.
1353
00:52:19,360 --> 00:52:22,080
When authority is gated, prose stops being policy,
1354
00:52:22,080 --> 00:52:24,560
and Copilot goes back to being a controlled contributor,
1355
00:52:24,560 --> 00:52:26,400
not an uninvited lawmaker.
1356
00:52:26,400 --> 00:52:29,760
Mandate 6: explicit state, session contracts, and context ledger.
1357
00:52:29,760 --> 00:52:31,200
Continuity is not kindness.
1358
00:52:31,200 --> 00:52:32,400
It is a safety claim.
1359
00:52:32,400 --> 00:52:35,600
If you don't encode it, users will assume memory that doesn't exist,
1360
00:52:35,600 --> 00:52:39,040
prompts will shorten, and drift will write your outcomes.
1361
00:52:39,040 --> 00:52:40,640
Explicit state prevents that.
1362
00:52:40,640 --> 00:52:44,480
Start by declaring sessions as first-class objects, stateless by default.
1363
00:52:44,480 --> 00:52:47,520
Nothing carries unless the user attaches a session contract.
1364
00:52:47,520 --> 00:52:50,800
The contract names four things: scope, sources, rules, and time.
1365
00:52:50,800 --> 00:52:54,640
Scope: which project, team, or mailbox this session covers.
1366
00:52:54,640 --> 00:52:58,240
Sources: which registries, sites, or indices are attached,
1367
00:52:58,240 --> 00:52:59,840
with versions and labels.
1368
00:52:59,840 --> 00:53:03,360
Rules: refusal criteria, citation requirements, and output structure.
1369
00:53:03,360 --> 00:53:06,000
Time: start, TTL, and expiry behavior.
1370
00:53:06,000 --> 00:53:09,840
If any element is missing, the agent refuses and offers a bounded alternative.
1371
00:53:09,840 --> 00:53:13,120
"Attach Project Alpha policy registry and minutes for the last 14 days?"
1372
00:53:13,120 --> 00:53:15,680
No attach, no assumed memory.
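One way to picture the contract is as a small record that must be complete and unexpired before the agent answers. The sketch below is an assumption-laden illustration: `SessionContract`, `check_contract`, and the returned strings are hypothetical names, not a product feature.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional

@dataclass
class SessionContract:
    scope: Optional[str] = None              # project, team, or mailbox
    sources: Optional[List[str]] = None      # attached registries or indices
    rules: Optional[Dict[str, str]] = None   # refusal, citation, output rules
    start: Optional[datetime] = None
    ttl: Optional[timedelta] = None

def check_contract(contract: Optional[SessionContract], now: datetime) -> str:
    """Stateless by default: refuse unless a complete, live contract exists."""
    if contract is None:
        return "refuse: no contract attached"
    missing = [name for name in ("scope", "sources", "rules", "start", "ttl")
               if not getattr(contract, name)]
    if missing:
        return "refuse: missing " + ", ".join(missing)
    if now >= contract.start + contract.ttl:
        return "refuse: contract expired, offer reattach"
    return "ok"
```

A caller would run this check before every answer, so an expired or partial contract never silently carries state forward.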
1373
00:53:15,680 --> 00:53:19,040
Expose a context ledger to the user every time the engine answers.
1374
00:53:19,040 --> 00:53:20,400
Plain text, no magic.
1375
00:53:20,400 --> 00:53:25,360
"This answer used: Project Alpha minutes 2025-06-04,
1376
00:53:25,360 --> 00:53:30,640
public team; procurement policy v3.4, confidential policy.
1377
00:53:30,640 --> 00:53:35,040
Assumptions: no vendor exceptions. Tone: external. Refusal trigger:
1378
00:53:35,040 --> 00:53:36,400
missing citation."
1379
00:53:36,400 --> 00:53:37,680
The ledger is not decoration.
1380
00:53:37,680 --> 00:53:40,560
It is the explainer that lets a human spot inappropriate
1381
00:53:40,560 --> 00:53:41,760
carryover in five seconds.
1382
00:53:41,760 --> 00:53:44,480
It also becomes the artifact your gate and audit rely on.
1383
00:53:44,480 --> 00:53:47,200
If an inclusion cannot be explained by the ledger, it was phantom state.
1384
00:53:47,200 --> 00:53:48,720
That's a defect, not a discussion.
1385
00:53:48,720 --> 00:53:50,640
Define clear reset semantics.
1386
00:53:50,640 --> 00:53:53,040
Channel changes reset, surface changes reset,
1387
00:53:53,040 --> 00:53:55,120
midnight reset, unless a user extends.
1388
00:53:55,120 --> 00:53:56,560
Cross tenant resets always.
1389
00:53:56,560 --> 00:53:59,280
You can soften that by allowing explicit reattachment.
1390
00:53:59,280 --> 00:54:02,240
"Continue session 7F3C for three hours."
1391
00:54:02,240 --> 00:54:04,320
But ambiguity defaults to refusal.
1392
00:54:04,320 --> 00:54:06,560
"I don't have your prior constraints attached."
1393
00:54:06,560 --> 00:54:09,040
Users learn one habit: attach or restate.
1394
00:54:09,040 --> 00:54:10,800
Treat embeddings like fuel, not memory.
1395
00:54:10,800 --> 00:54:13,360
Cache governance gets the same rigor you give data.
1396
00:54:13,360 --> 00:54:15,840
Vector stores have TTLs by data class.
1397
00:54:15,840 --> 00:54:18,560
Hours for drafts, days for project minutes,
1398
00:54:18,560 --> 00:54:20,000
longer for canon.
1399
00:54:20,000 --> 00:54:22,320
Every store tracks provenance, who created it,
1400
00:54:22,320 --> 00:54:24,320
from which labeled sources, when it refreshed,
1401
00:54:24,320 --> 00:54:25,680
and which agents may read it.
1402
00:54:25,680 --> 00:54:28,240
When a user attaches a session, bind to a snapshot,
1403
00:54:28,240 --> 00:54:30,720
not a live index that mutates mid-conversation.
1404
00:54:30,720 --> 00:54:32,480
Purge on schedule, audit what you purge.
1405
00:54:32,480 --> 00:54:35,040
If this sounds heavy, it's because execution is heavy.
1406
00:54:35,040 --> 00:54:37,920
Bind session contracts to identity and environment.
1407
00:54:37,920 --> 00:54:40,320
A developer's reason-only persona in dev
1408
00:54:40,320 --> 00:54:42,960
can attach broad project minutes but never HR.
1409
00:54:42,960 --> 00:54:45,760
A plan-tier agent in test can attach policy registries
1410
00:54:45,760 --> 00:54:48,320
and plan templates, but never execution credentials.
1411
00:54:48,320 --> 00:54:51,120
A summarizer in teams can attach only the current channel
1412
00:54:51,120 --> 00:54:52,560
and registered canon.
1413
00:54:52,560 --> 00:54:54,800
Contracts include identity class and environment groups
1414
00:54:54,800 --> 00:54:57,200
so your evaluators and gates can refuse politely
1415
00:54:57,200 --> 00:55:00,080
when someone tries to carry a dev contract into prod.
1416
00:55:00,080 --> 00:55:01,840
Make state visible at the right moments.
1417
00:55:01,840 --> 00:55:03,520
Start of session: show the contract.
1418
00:55:03,520 --> 00:55:05,440
Every answer, show the ledger.
1419
00:55:05,440 --> 00:55:07,840
On reset events, emit a visible banner,
1420
00:55:07,840 --> 00:55:10,960
"Context reset due to channel change," and pause.
1421
00:55:10,960 --> 00:55:13,200
On attach, link the exact sources and versions.
1422
00:55:13,200 --> 00:55:15,280
On expiry, refuse and offer reattach.
1423
00:55:15,280 --> 00:55:18,480
Users stop guessing because the system narrates what it's using.
1424
00:55:18,480 --> 00:55:20,880
Prevent silent drift with sticky instructions
1425
00:55:20,880 --> 00:55:23,680
by scoping behavioral prompts to the contract.
1426
00:55:23,680 --> 00:55:26,480
Tone, audience and structure live in the contract.
1427
00:55:26,480 --> 00:55:29,440
They don't bleed across tenants, channels or days.
1428
00:55:29,440 --> 00:55:32,320
System prompts that set organization-wide defaults
1429
00:55:32,320 --> 00:55:34,560
are visible in the ledger as organization behavior
1430
00:55:34,560 --> 00:55:37,520
with an ID and owner, no hidden autopilot.
1431
00:55:37,520 --> 00:55:40,320
Detect failure with variance testing and telemetry.
1432
00:55:40,320 --> 00:55:42,240
Run canary prompts across channels.
1433
00:55:42,240 --> 00:55:44,800
Variance in policy domains is a smell.
1434
00:55:44,800 --> 00:55:48,480
Log context changes and whether a valid contract was present.
1435
00:55:48,480 --> 00:55:51,200
Alert on high-risk answers without contracts.
1436
00:55:51,200 --> 00:55:52,960
Sample answers against ledgers.
1437
00:55:52,960 --> 00:55:56,240
If an inclusion lacks a ledger entry, you have ghost state.
1438
00:55:56,240 --> 00:55:58,240
Fix the pipeline, not the prompt.
1439
00:55:58,240 --> 00:56:00,160
Anti-patterns you will kill:
1440
00:56:00,160 --> 00:56:02,960
long chat threads that meander across topics and sources,
1441
00:56:02,960 --> 00:56:05,840
auto-attaching recent files as grounding,
1442
00:56:05,840 --> 00:56:08,320
carrying instructions invisibly between products
1443
00:56:08,320 --> 00:56:11,120
and helpful summarizers that infer memory from phrasing
1444
00:56:11,120 --> 00:56:12,960
like "as we said yesterday."
1445
00:56:12,960 --> 00:56:14,160
Yesterday isn't a contract.
1446
00:56:14,160 --> 00:56:16,000
Explicit state buys you determinism.
1447
00:56:16,000 --> 00:56:17,680
Users attach or restate.
1448
00:56:17,680 --> 00:56:19,120
Ledgers explain.
1449
00:56:19,120 --> 00:56:21,680
Resets are predictable, embeddings are governed.
1450
00:56:21,680 --> 00:56:23,440
And when something drifts, you have a date,
1451
00:56:23,440 --> 00:56:25,040
a source and a lever to pull.
1452
00:56:25,040 --> 00:56:26,880
That's not ceremony, that's control.
1453
00:56:26,880 --> 00:56:29,440
Mandate 7: observability, budgets, and drift.
1454
00:56:29,440 --> 00:56:31,280
You can't control what you don't see.
1455
00:56:31,280 --> 00:56:34,320
And in a distributed decision engine, invisibility is permission.
1456
00:56:34,320 --> 00:56:36,160
Observability is not a report.
1457
00:56:36,160 --> 00:56:39,040
It's the nervous system that turns cost into a signal, behavior
1458
00:56:39,040 --> 00:56:41,280
into metrics and drift into alarms.
1459
00:56:41,280 --> 00:56:43,040
Budgets are not finance artifacts.
1460
00:56:43,040 --> 00:56:45,280
They're circuit breakers that force explanations
1461
00:56:45,280 --> 00:56:47,360
before entropy becomes policy.
1462
00:56:47,360 --> 00:56:48,320
Drift isn't a feeling.
1463
00:56:48,320 --> 00:56:50,640
It's a measurable departure from your declared shape,
1464
00:56:50,640 --> 00:56:51,920
sources and refusal model.
1465
00:56:51,920 --> 00:56:54,880
Start with runtime visibility, not retrospective audits.
1466
00:56:54,880 --> 00:56:57,920
Turn on DSPM for AI so sensitivity labels intersecting
1467
00:56:57,920 --> 00:57:01,760
Copilot activity light up by app, by agent, and by domain.
1468
00:57:01,760 --> 00:57:04,640
Use Activity Explorer to capture prompts, responses,
1469
00:57:04,640 --> 00:57:06,880
grounding calls and label context.
1470
00:57:06,880 --> 00:57:08,400
That's not surveillance theater.
1471
00:57:08,400 --> 00:57:10,960
It's the only way to answer two questions on demand.
1472
00:57:10,960 --> 00:57:12,800
What did this agent read?
1473
00:57:12,800 --> 00:57:14,720
And what did this answer rely on?
1474
00:57:14,720 --> 00:57:17,680
If this sounds heavy, it's because execution is heavy.
1475
00:57:17,680 --> 00:57:20,480
But without these traces, every incident becomes folklore.
1476
00:57:20,480 --> 00:57:22,640
Make cost an owned metric, not a blame ritual.
1477
00:57:22,640 --> 00:57:24,720
Track cost per answer for narrative tasks
1478
00:57:24,720 --> 00:57:27,040
and cost per action for automations.
1479
00:57:27,040 --> 00:57:29,840
Decompose spend by agent, connector, retrieval window
1480
00:57:29,840 --> 00:57:31,040
and answer length.
1481
00:57:31,040 --> 00:57:33,200
When cost doubles for the same outcome,
1482
00:57:33,200 --> 00:57:34,720
you don't have a budgeting issue.
1483
00:57:34,720 --> 00:57:37,200
You have retrieval bloat, unbounded context
1484
00:57:37,200 --> 00:57:38,960
or a connector out of policy.
1485
00:57:38,960 --> 00:57:41,760
Tie alerts to deltas, not absolute numbers.
1486
00:57:41,760 --> 00:57:43,920
"Cost per answer for Project Alpha summarizer
1487
00:57:43,920 --> 00:57:47,440
up 80% week over week, average grounding size doubled."
1488
00:57:47,440 --> 00:57:50,240
That's a design problem, not a team performance discussion.
1489
00:57:50,240 --> 00:57:52,400
Budgets are control levers. Allocate ceilings
1490
00:57:52,400 --> 00:57:55,520
per environment group, then per team, then per agent.
1491
00:57:55,520 --> 00:57:59,600
At 80%, notify owners with usage anatomy and top offenders:
1492
00:58:00,400 --> 00:58:03,360
"These three prompts drove 60% of spend.
1493
00:58:03,360 --> 00:58:06,320
Connector X added 40% latency and cost."
1494
00:58:06,320 --> 00:58:09,520
At 100%, pause the agent and route a decision to gate:
1495
00:58:09,520 --> 00:58:12,960
raise, redesign, or retire. No silent overages.
1496
00:58:12,960 --> 00:58:17,280
Money is the only hard stop users respect under deadline.
1497
00:58:17,280 --> 00:58:20,800
Make it an engineered stop with a recovery path.
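The two thresholds and the delta-based alert can be sketched as follows. The function names, return strings, and the 50% default threshold are assumptions made for illustration; only the 80% and 100% ceilings come from the discussion above.

```python
def budget_action(spend: float, ceiling: float) -> str:
    """Map spend against a ceiling to the action the circuit breaker takes."""
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    ratio = spend / ceiling
    if ratio >= 1.0:
        return "pause_and_route_to_gate"  # raise, redesign, or retire
    if ratio >= 0.8:
        return "notify_owners"
    return "ok"

def cost_delta_alert(previous: float, current: float, threshold: float = 0.5) -> bool:
    """Alert on relative change in cost per answer, not on absolute numbers."""
    if previous <= 0:
        return current > 0  # any spend on a previously free task is a change
    return (current - previous) / previous > threshold
```

For example, `budget_action(850, 1000)` returns `"notify_owners"`, and a jump in cost per answer from 1.0 to 1.8 trips `cost_delta_alert` at the default threshold.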
1498
00:58:20,800 --> 00:58:22,000
Define drift.
1499
00:58:22,000 --> 00:58:23,840
Like an SRE defines error budgets,
1500
00:58:23,840 --> 00:58:27,120
you declare shape, sources, refusal rates and latency.
1501
00:58:27,120 --> 00:58:29,440
Drift is deviation from that declaration.
1502
00:58:29,440 --> 00:58:33,520
Instrument four drifts. Shape drift: plan artifacts no longer match schema,
1503
00:58:33,520 --> 00:58:36,720
new fields sneak in, required ones vanish. Source drift:
1504
00:58:36,720 --> 00:58:39,760
citations resolve to non-registry content
1505
00:58:39,760 --> 00:58:42,560
or registered labels degrade from policy to wiki.
1506
00:58:42,560 --> 00:58:47,360
Refusal drift: refusal rates collapse to near zero in policy-heavy domains;
1507
00:58:47,360 --> 00:58:48,560
your gates are eroding.
1508
00:58:48,560 --> 00:58:52,800
Latency and cost drift: the same tasks take longer and cost more;
1509
00:58:52,800 --> 00:58:55,520
retrieval windows widened or caches bloated.
1510
00:58:55,520 --> 00:58:57,440
When any drift exceeds thresholds,
1511
00:58:57,440 --> 00:58:59,440
the agent enters degraded mode,
1512
00:58:59,440 --> 00:59:03,120
reason only, refuse execution and demand a review.
1513
00:59:03,120 --> 00:59:05,680
Alert design matters. Noise kills adoption;
1514
00:59:05,680 --> 00:59:06,800
silence kills control.
1515
00:59:06,800 --> 00:59:10,400
Route alerts to owners with fix-or-refuse options,
1516
00:59:10,400 --> 00:59:13,200
not FYIs. Bundle incidents by cause:
1517
00:59:13,200 --> 00:59:16,880
ten "no citation" refusals across procurement is one registry gap,
1518
00:59:16,880 --> 00:59:19,120
not ten tickets. Provide one-click diffs:
1519
00:59:19,120 --> 00:59:21,040
"Plan schema changed.
1520
00:59:21,040 --> 00:59:23,600
Here's the diff, here's the policy it violates."
1521
00:59:23,600 --> 00:59:25,920
Give people ladders up the stack: update registry,
1522
00:59:25,920 --> 00:59:29,040
tighten scope, re-enable once thresholds are back in range.
1523
00:59:29,040 --> 00:59:31,520
Observability must be persona-aware.
1524
00:59:31,520 --> 00:59:34,240
For reason-only agents, emphasize answer quality,
1525
00:59:34,240 --> 00:59:36,320
citation coverage and cost per answer.
1526
00:59:36,320 --> 00:59:38,960
For plan tier, emphasize schema conformance,
1527
00:59:38,960 --> 00:59:41,760
refusal discipline and approval latency.
1528
00:59:41,760 --> 00:59:44,560
For execute tier, emphasize precondition failures,
1529
00:59:44,560 --> 00:59:47,520
rollback rates and compensating transactions executed.
1530
00:59:47,520 --> 00:59:49,280
Plan security and reliability,
1531
00:59:49,280 --> 00:59:52,960
a spike in rollbacks without paired adjustments to plans is a signal.
1532
00:59:52,960 --> 00:59:55,760
Execution is compensating for bad thinking upstream.
1533
00:59:55,760 --> 00:59:58,480
Block the classic anti-patterns. Quarterly cost reviews:
1534
00:59:58,480 --> 01:00:02,480
too late. Dashboards nobody owns: entropy feed. "We'll rely on user training":
1535
01:00:02,480 --> 01:00:04,160
that's delegation, not design.
1536
01:00:04,160 --> 01:00:06,000
"We'll trust labels alone":
1537
01:00:06,000 --> 01:00:07,760
they're stickers without deny or redact.
1538
01:00:07,760 --> 01:00:09,760
And "we'll optimize after launch":
1539
01:00:09,760 --> 01:00:11,680
that's how drift becomes policy.
1540
01:00:11,680 --> 01:00:13,360
This mandate implements pressure early,
1541
01:00:13,360 --> 01:00:15,920
so small deviations become teachable moments,
1542
01:00:15,920 --> 01:00:19,120
not audit findings. One more lever: canaries.
1543
01:00:19,120 --> 01:00:22,000
Run the same prompt weekly across channels for policy domains,
1544
01:00:22,000 --> 01:00:24,960
variance beyond a narrow band triggers investigation.
1545
01:00:24,960 --> 01:00:26,960
Rehearse red flag tests monthly.
1546
01:00:26,960 --> 01:00:29,760
"If we turn this agent on for everyone tomorrow,
1547
01:00:29,760 --> 01:00:30,960
what would scare us?"
1548
01:00:30,960 --> 01:00:32,960
Let the answers drive registry updates,
1549
01:00:32,960 --> 01:00:34,960
scope tightening or budget changes.
1550
01:00:34,960 --> 01:00:36,960
Observability is not a one-time install,
1551
01:00:36,960 --> 01:00:37,760
it's a practice.
1552
01:00:37,760 --> 01:00:39,760
Remember the thesis.
1553
01:00:39,760 --> 01:00:41,760
Copilot entropy grows by default.
1554
01:00:41,760 --> 01:00:43,760
Determinism exists only by design.
1555
01:00:43,760 --> 01:00:45,760
Observability turns guesses into graphs,
1556
01:00:45,760 --> 01:00:47,760
budgets force hard choices, drift budgets
1557
01:00:47,760 --> 01:00:49,360
keep your declarations honest.
1558
01:00:49,360 --> 01:00:52,160
Implement them and cost becomes a light, not a fire.
1559
01:00:52,160 --> 01:00:54,560
Ignore them and your first telemetry will be an invoice.
1560
01:00:54,560 --> 01:00:58,560
Mandate 8: identity, least privilege, and agent personas.
1561
01:00:58,560 --> 01:01:00,560
Identity is the blast radius.
1562
01:01:00,560 --> 01:01:02,560
If you get this wrong, everything else is ornamental.
1563
01:01:02,560 --> 01:01:04,560
Least privilege is not a slogan,
1564
01:01:04,560 --> 01:01:07,360
it's how you turn smart suggestions into scoped actions
1565
01:01:07,360 --> 01:01:08,560
that can't wander.
1566
01:01:08,560 --> 01:01:10,560
Start by separating humans from agents.
1567
01:01:10,560 --> 01:01:12,560
A person has broad context, shifting tasks
1568
01:01:12,560 --> 01:01:14,560
and legacy access you haven't cleaned up yet.
1569
01:01:14,560 --> 01:01:16,560
An agent has a single job, a narrow scope,
1570
01:01:16,560 --> 01:01:18,560
and a lifetime you can measure in weeks.
1571
01:01:18,560 --> 01:01:20,560
If an agent runs with a human's token,
1572
01:01:20,560 --> 01:01:22,560
you've already lost.
1573
01:01:22,560 --> 01:01:24,560
That's a shared identity disguised as convenience.
1574
01:01:24,560 --> 01:01:26,560
Create dedicated agent identities in
1575
01:01:26,560 --> 01:01:30,560
Entra for every Copilot surface and Copilot Studio agent
1576
01:01:30,560 --> 01:01:32,560
that can plan or execute.
1577
01:01:32,560 --> 01:01:34,560
Each identity gets three bindings,
1578
01:01:34,560 --> 01:01:36,560
role, scope and time.
1579
01:01:36,560 --> 01:01:38,560
Role is what the agent does,
1580
01:01:38,560 --> 01:01:40,560
summarizer, analyst, planner, executor.
1581
01:01:40,560 --> 01:01:42,560
Scope is the data and systems it may touch:
1582
01:01:42,560 --> 01:01:44,560
named sites, channels, mailboxes, connectors.
1583
01:01:44,560 --> 01:01:46,560
Time is the duration of privilege:
1584
01:01:46,560 --> 01:01:48,560
start and renewal policy.
1585
01:01:48,560 --> 01:01:50,560
No binding, no runtime.
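As a rough illustration of the three bindings, here is a minimal Python sketch. The `AgentBinding` class, the `may_run` helper and the resource name `site:project-x` are hypothetical names invented for this example; they are not Entra objects or APIs, only a way to picture "role, scope and time" as a default-deny check.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class AgentBinding:
    role: str             # what the agent does, e.g. "summarizer" or "executor"
    scopes: frozenset     # named resources the agent may touch
    not_before: datetime  # start of privilege
    expires: datetime     # end of privilege; renewal issues a new binding


def may_run(binding: Optional[AgentBinding], resource: str, now: datetime) -> bool:
    """No binding, no runtime: deny unless role, scope and time all hold."""
    if binding is None or not binding.role:
        return False
    if resource not in binding.scopes:
        return False
    return binding.not_before <= now < binding.expires


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
binding = AgentBinding("summarizer", frozenset({"site:project-x"}),
                       now, now + timedelta(weeks=2))
print(may_run(binding, "site:project-x", now))  # bound role, in scope, in time
print(may_run(None, "site:project-x", now))     # no binding at all
```

The point of the sketch is that an absent or expired binding is a denial, not a fallback to some broader default.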
1586
01:01:50,560 --> 01:01:51,560
Tie identities to authentication context.
1587
01:01:51,560 --> 01:01:53,560
Risk-based controls apply to agents too.
1588
01:01:53,560 --> 01:01:55,560
An executor agent that touches finance
1589
01:01:55,560 --> 01:01:57,560
runs only from managed networks
1590
01:01:57,560 --> 01:01:59,560
through managed devices with conditional access
1591
01:01:59,560 --> 01:02:01,560
that denies if risk is elevated.
1592
01:02:01,560 --> 01:02:03,560
Think of it as JIT for machines.
1593
01:02:03,560 --> 01:02:05,560
Elevation lasts minutes, not days.
1594
01:02:05,560 --> 01:02:07,560
It's approved by a gate, not implied by a prompt.
1595
01:02:07,560 --> 01:02:09,560
Personas are constraints encoded as identities,
1596
01:02:09,560 --> 01:02:11,560
not costumes in a prompt.
1597
01:02:11,560 --> 01:02:15,560
A procurement policy analyst persona can read the policy registry,
1598
01:02:15,560 --> 01:02:19,560
summarize decisions and refuse without canon.
1599
01:02:19,560 --> 01:02:21,560
A flow architect persona can emit JSON plans.
1600
01:02:21,560 --> 01:02:23,560
It cannot invoke connectors.
1601
01:02:23,560 --> 01:02:25,560
An executor persona can call a narrow set of actions
1602
01:02:25,560 --> 01:02:27,560
in a single environment for a single workflow,
1603
01:02:27,560 --> 01:02:31,560
only after gate approval and only with preconditions satisfied.
1604
01:02:31,560 --> 01:02:33,560
If the persona can do everything,
1605
01:02:33,560 --> 01:02:35,560
it's not a persona, it's a permission leak.
1606
01:02:35,560 --> 01:02:37,560
Apply least privilege at three layers.
1607
01:02:37,560 --> 01:02:41,560
Data: default-deny scopes with explicit inclusions and expiries.
1608
01:02:41,560 --> 01:02:43,560
Actions: minimal connector permissions
1609
01:02:43,560 --> 01:02:45,560
per agent, per environment, with deny by default in prod.
1610
01:02:45,560 --> 01:02:47,560
Identity: no role union.
1611
01:02:47,560 --> 01:02:51,560
Don't let a single identity reason, plan and execute.
1612
01:02:51,560 --> 01:02:53,560
Separate them so telemetry tells you who thought,
1613
01:02:53,560 --> 01:02:55,560
who proposed and who acted.
1614
01:02:55,560 --> 01:02:57,560
Enforce separation of duties.
1615
01:02:57,560 --> 01:02:59,560
The human who approves a plan cannot be the owner
1616
01:02:59,560 --> 01:03:01,560
of the executor identity that will run it.
1617
01:03:01,560 --> 01:03:03,560
The team that maintains the registry cannot be the team
1618
01:03:03,560 --> 01:03:05,560
that grants connector approvals.
1619
01:03:05,560 --> 01:03:09,560
The person who built the flow cannot approve its elevation to prod.
1620
01:03:09,560 --> 01:03:11,560
If this sounds heavy, it's because execution is heavy.
1621
01:03:11,560 --> 01:03:13,560
But this is how you prevent friendly fraud
1622
01:03:13,560 --> 01:03:15,560
and honest mistakes from becoming incidents.
1623
01:03:15,560 --> 01:03:17,560
Introduce JIT elevation for agents.
1624
01:03:17,560 --> 01:03:19,560
By default, executor identities are inert.
1625
01:03:19,560 --> 01:03:21,560
A gate-approved plan wakes them,
1626
01:03:21,560 --> 01:03:23,560
grants the smallest necessary permissions for the shortest time
1627
01:03:23,560 --> 01:03:25,560
and logs every action.
1628
01:03:25,560 --> 01:03:27,560
When the plan completes or time expires,
1629
01:03:27,560 --> 01:03:29,560
permissions collapse to zero.
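The inert-by-default executor can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `ExecutorIdentity` class, the ten-minute default, and the permission string `invoice.update` are invented, and a real deployment would lean on the platform's own privileged-access tooling rather than an in-memory dictionary.

```python
from datetime import datetime, timedelta, timezone


class ExecutorIdentity:
    """Hypothetical executor that holds no permissions until a gate wakes it."""

    def __init__(self, name):
        self.name = name
        self._grants = {}  # permission -> expiry time
        self.audit = []    # every elevation, action and collapse is logged

    def elevate(self, gate_ref, permissions, now, ttl=timedelta(minutes=10)):
        # Elevation is approved by a gate, never implied by a prompt.
        if not gate_ref:
            raise PermissionError("elevation requires a gate reference")
        for permission in permissions:
            self._grants[permission] = now + ttl
        self.audit.append(("elevate", gate_ref, tuple(sorted(permissions))))

    def active(self, now):
        return {p for p, expiry in self._grants.items() if now < expiry}

    def act(self, permission, gate_ref, now):
        allowed = permission in self.active(now)
        self.audit.append(("act" if allowed else "denied", gate_ref, permission))
        return allowed

    def collapse(self):
        # Plan completed or time expired: permissions go back to zero.
        self._grants.clear()
        self.audit.append(("collapse", None, ()))


start = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
executor = ExecutorIdentity("exec-finance")
executor.elevate("gate-123", {"invoice.update"}, start)
print(executor.act("invoice.update", "gate-123", start + timedelta(minutes=5)))
print(executor.act("invoice.update", "gate-123", start + timedelta(minutes=11)))
```

The second call is denied because the clock ran out, which is the behavior the mandate asks for: expiry does the revoking, not a person remembering to.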
1630
01:03:29,560 --> 01:03:31,560
Store no standing secrets.
1631
01:03:31,560 --> 01:03:33,560
Use managed identities wherever the platform supports them.
1632
01:03:33,560 --> 01:03:35,560
If you must store a secret, keep it in a tenant vault,
1633
01:03:35,560 --> 01:03:37,560
scoped to identity and plan, with rotation on every run.
1634
01:03:37,560 --> 01:03:39,560
Watch for anti-patterns.
1635
01:03:39,560 --> 01:03:41,560
Maker credentials used as service accounts.
1636
01:03:41,560 --> 01:03:43,560
Kill that.
1637
01:03:43,560 --> 01:03:45,560
Run as owner in flows.
1638
01:03:45,560 --> 01:03:47,560
Kill that too.
1639
01:03:47,560 --> 01:03:49,560
Long-lived app registrations with tenant-wide Graph scopes.
1640
01:03:49,560 --> 01:03:51,560
Replace with scoped app roles bound to environment groups.
1641
01:03:51,560 --> 01:03:55,560
Broad power-user groups that bleed into prod: split them and expire them.
1642
01:03:55,560 --> 01:03:57,560
And the worst.
1643
01:03:57,560 --> 01:04:01,560
Copying an executor identity across multiple agents for simplicity.
1644
01:04:01,560 --> 01:04:05,560
Simplicity is how an outage in one area becomes a Saturday for everyone.
1645
01:04:05,560 --> 01:04:07,560
Telemetry closes the loop.
1646
01:04:07,560 --> 01:04:09,560
Record which identity produced each answer, plan, and action.
1647
01:04:09,560 --> 01:04:11,560
Alert on executor identities acting without gate references.
1648
01:04:11,560 --> 01:04:13,560
Track permission changes per identity.
1649
01:04:13,560 --> 01:04:15,560
Drift means someone's granting without policy.
1650
01:04:15,560 --> 01:04:19,560
Measure the ratio of requested permissions to used permissions per run.
1651
01:04:19,560 --> 01:04:23,560
High unused permission is a signal to trim.
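One way to make that signal concrete is to compute the share of requested permissions that a run never used. This is a sketch under assumptions: the function name and the permission strings are made up, and it reports the unused fraction rather than a raw requested-to-used ratio because a fraction stays defined when nothing was used.

```python
def unused_permission_ratio(requested, used):
    """Fraction of requested permissions that the run never exercised."""
    requested, used = set(requested), set(used)
    if not requested:
        return 0.0
    return len(requested - used) / len(requested)


# Hypothetical run: four permissions requested, one actually used.
ratio = unused_permission_ratio(
    {"sites.read", "mail.read", "mail.send", "files.write"},
    {"sites.read"},
)
print(ratio)  # 0.75
```

A value that stays high across runs suggests the identity's grant can be trimmed.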
1652
01:04:23,560 --> 01:04:25,560
Rotate client secrets out of existence.
1653
01:04:25,560 --> 01:04:27,560
Count them, then count down to zero.
1654
01:04:27,560 --> 01:04:29,560
This mandate sounds like plumbing because it is.
1655
01:04:29,560 --> 01:04:33,560
But identity is the only firm edge in a probabilistic system.
1656
01:04:33,560 --> 01:04:35,560
Humans stay human.
1657
01:04:35,560 --> 01:04:37,560
Agents become roles with scopes and clocks.
1658
01:04:37,560 --> 01:04:39,560
Least privilege becomes measurable.
1659
01:04:39,560 --> 01:04:41,560
And when something goes wrong, you don't argue about prompts.
1660
01:04:41,560 --> 01:04:45,560
You revoke a key, tighten a scope, shorten a clock,
1661
01:04:45,560 --> 01:04:47,560
and your blast radius shrinks on command.
1662
01:04:47,560 --> 01:04:49,560
Mandate 9.
1663
01:04:49,560 --> 01:04:51,560
Teams and Outlook controls.
1664
01:04:51,560 --> 01:04:53,560
The conversational edge.
1665
01:04:53,560 --> 01:04:55,560
Conversations are the most dangerous edge because they feel harmless.
1666
01:04:55,560 --> 01:04:57,560
You ask, it answers.
1667
01:04:57,560 --> 01:04:59,560
No apparent stakes, no visible scope.
1668
01:04:59,560 --> 01:05:01,560
But Teams and Outlook are where leakage begins and policy becomes prose.
1669
01:05:01,560 --> 01:05:05,560
If you don't engineer the edge,
1670
01:05:05,560 --> 01:05:07,560
the edge engineers your incidents.
1671
01:05:07,560 --> 01:05:09,560
Start with Teams summarization boundaries.
1672
01:05:09,560 --> 01:05:13,560
Configure channel-level scopes as default deny: public channels in,
1673
01:05:13,560 --> 01:05:15,560
private channels out by policy, not by etiquette.
1674
01:05:15,560 --> 01:05:19,560
Exclude HR and legal domains outright,
1675
01:05:19,560 --> 01:05:23,560
unless a named session contract attaches those registries with expiry.
1676
01:05:23,560 --> 01:05:27,560
Summaries cite only project canon and channel content within a defined time window,
1677
01:05:27,560 --> 01:05:29,560
say 14 days, never tenant-wide context.
1678
01:05:29,560 --> 01:05:31,560
If a user wants broader grounding, they attach it explicitly.
1679
01:05:31,560 --> 01:05:33,560
No attach, no broadened retrieval.
1680
01:05:33,560 --> 01:05:37,560
The model cannot be polite enough to cross a fence you didn't build.
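The Teams boundary described here amounts to a filter that runs before retrieval. Below is a minimal Python sketch, assuming an invented `Message` record and label names; it is not how Copilot or Microsoft Graph evaluates scope, only an illustration of default deny with a time window and explicit attachment.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Message:
    channel: str
    is_private: bool
    label: str
    sent: datetime


# Hypothetical label names for the domains excluded outright.
BLOCKED_LABELS = {"HR", "Legal"}


def in_scope(msg, allowed_channels, now,
             window=timedelta(days=14), attached_labels=frozenset()):
    """Default deny: a message is retrievable only if every rule admits it."""
    if msg.channel not in allowed_channels:
        return False  # channel was never explicitly included
    if msg.is_private:
        return False  # private channels are out by policy, not etiquette
    if msg.label in BLOCKED_LABELS and msg.label not in attached_labels:
        return False  # no attach, no broadened retrieval
    return now - msg.sent <= window


now = datetime(2025, 3, 15)
recent = Message("project-x-general", False, "General", datetime(2025, 3, 10))
stale = Message("project-x-general", False, "General", datetime(2025, 2, 1))
print(in_scope(recent, {"project-x-general"}, now))
print(in_scope(stale, {"project-x-general"}, now))
```

Broader grounding only happens when a caller passes `attached_labels` explicitly, which mirrors the session contract idea.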
1681
01:05:37,560 --> 01:05:39,560
Apply persona constraints as identities, not vibes.
1682
01:05:39,560 --> 01:05:43,560
The project summarizer for Teams reads only designated channels and registered canon.
1683
01:05:43,560 --> 01:05:47,560
It outputs bullets, not essays, with inline citations that resolve to stable anchors.
1684
01:05:47,560 --> 01:05:51,560
It never proposes execution and it refuses when a claim sounds like policy without a source.
1685
01:05:51,560 --> 01:05:57,560
Tie that persona to a reason-only identity with narrow Graph scopes and a budget ceiling.
1686
01:05:57,560 --> 01:05:59,560
That heaviness is where safety lives.
1687
01:05:59,560 --> 01:06:01,560
This is how you convert conversation
1688
01:06:01,560 --> 01:06:05,560
into governed retrieval instead of accidental discovery.
1689
01:06:05,560 --> 01:06:09,560
Outlook requires label-aware refusal and redaction.
1690
01:06:09,560 --> 01:06:12,560
Summarization can see only the current mailbox and current thread.
1691
01:06:12,560 --> 01:06:17,560
Cross-mailbox search is a hard deny. When messages or attachments carry high-risk labels
1692
01:06:17,560 --> 01:06:22,560
(Confidential, HR, Legal Privileged), Copilot refuses to synthesize, offers a bounded alternative
1693
01:06:22,560 --> 01:06:30,560
(summarize headers only), or redacts recognizable sensitive types before synthesis, by policy, not by prompt.
1694
01:06:30,560 --> 01:06:36,560
Teach users that refusal is a feature, silence beats fiction, and redaction beats best effort with secrets.
1695
01:06:36,560 --> 01:06:40,560
Disable silent send. Draft but never send is the law.
1696
01:06:40,560 --> 01:06:45,560
A reason-tier Outlook persona produces a structured draft with citations and a tone field.
1697
01:06:45,560 --> 01:06:51,560
Plan tier can generate a send-ready object with recipients, subject, content hash, and citations.
1698
01:06:51,560 --> 01:06:54,560
Gate routes to the sender's approver or policy mailbox.
1699
01:06:54,560 --> 01:07:01,560
Execute sends only under an agent identity with DKIM, SPF alignment, and full audit.
1700
01:07:01,560 --> 01:07:05,560
Any attempt to bypass gate is a refusal with a link to the approval path.
1701
01:07:05,560 --> 01:07:13,560
"Temporarily allow" is how mishaps ship. Kill sticky behavior across the edge. Behavioral defaults (tone, audience, structure)
1702
01:07:13,560 --> 01:07:17,560
belong in the session contract and appear in the context ledger.
1703
01:07:17,560 --> 01:07:19,560
They do not bleed across products, tenants, or days.
1704
01:07:19,560 --> 01:07:24,560
If a user flips from Teams to Outlook, the system announces a reset and pauses.
1705
01:07:24,560 --> 01:07:32,560
"Context reset due to surface change. Attach session 9c2a." Hidden system prompts are locked as organization behavior with owner ID and version.
1706
01:07:32,560 --> 01:07:36,560
No unseen autopilot. Constrain attachments and recent files.
1707
01:07:36,560 --> 01:07:39,560
Auto-attach of recent files looks helpful and behaves like uncontrolled state.
1708
01:07:39,560 --> 01:07:43,560
Instead, require explicit attach of allowed sources with version and label.
1709
01:07:43,560 --> 01:07:49,560
The ledger lists them. The evaluator refuses if the answer leans on a file not attached or a label not permitted.
1710
01:07:49,560 --> 01:07:51,560
Your goal is boring predictability.
1711
01:07:51,560 --> 01:07:55,560
Every inclusion is traceable, and nothing sneaks in because a sidebar was helpful.
1712
01:07:55,560 --> 01:07:57,560
Instrument the edge like a production system.
1713
01:07:57,560 --> 01:08:01,560
Capture prompt-response pairs with label context in Activity Explorer.
1714
01:08:01,560 --> 01:08:09,560
Alert on policy-shaped answers with outside citations, cross-mailbox access attempts, and sudden increases in sensitive-label mentions within conversational outputs.
1715
01:08:09,560 --> 01:08:13,560
Track cost per answer for Teams and Outlook separately.
1716
01:08:13,560 --> 01:08:19,560
When it spikes, look for retrieval window bloat, overbroad scopes, or long answers where tables would suffice.
1717
01:08:19,560 --> 01:08:24,560
Cost is a light here as well. Design refusals people accept: friendly, crisp, and prescriptive.
1718
01:08:24,560 --> 01:08:27,560
"I can't include legal-privileged content.
1719
01:08:27,560 --> 01:08:31,560
Attach an extract from the policy registry or restate without legal sources."
1720
01:08:31,560 --> 01:08:36,560
Offer bounded ladders: attach approved canon, adjust the time window, or route a policy gap to the owner.
1721
01:08:36,560 --> 01:08:42,560
Don't offer the web, don't offer best guess. You are training the org that conversation is still a control surface.
1722
01:08:42,560 --> 01:08:44,560
Anti-patterns to remove.
1723
01:08:44,560 --> 01:08:52,560
Org-wide Teams summarization with private channels enabled for convenience, Outlook summarization that reads shared mailboxes by default,
1724
01:08:52,560 --> 01:09:00,560
drafts that send on enter, and the quiet bleed where a tone instruction migrates from a chatbot to email drafts without a ledger entry.
1725
01:09:00,560 --> 01:09:03,560
These aren't UX quirks, they are policy erosion.
1726
01:09:03,560 --> 01:09:07,560
One more explicit line you need to hear. Most Copilot incidents will originate here.
1727
01:09:07,560 --> 01:09:12,560
Not because Teams and Outlook are malicious, but because they're constant, casual and assumed safe.
1728
01:09:12,560 --> 01:09:18,560
Engineer the edge: bound the scopes, label-refuse in Outlook, persona-restrict in Teams, show the ledger.
1729
01:09:18,560 --> 01:09:23,560
And when something drifts, your telemetry will tell you before a forwarded summary does.
1730
01:09:23,560 --> 01:09:27,560
Mandate 10: Power Automate guardrails, where hallucinations become incidents.
1731
01:09:27,560 --> 01:09:33,560
Flows don't hallucinate. People wire hallucinations into flows. That's why this surface is incident-dense.
1732
01:09:33,560 --> 01:09:38,560
Once Copilot's good idea lands in a trigger and a write, you've converted fiction into state.
1733
01:09:38,560 --> 01:09:47,560
Start with approvals as a law, not a habit. Any flow that can update, create, delete or change permissions runs through a mandatory approval step tied to risk tier and data classes.
1734
01:09:47,560 --> 01:09:49,560
The approver is a named role, not the maker.
1735
01:09:49,560 --> 01:09:52,560
The approval card renders the plan JSON you already require:
1736
01:09:52,560 --> 01:09:58,560
goal, steps, preconditions, citations, rollback. No JSON, no button. Block unknown and ambiguous connectors.
1737
01:09:58,560 --> 01:10:00,560
Production environments get allow lists only.
1738
01:10:00,560 --> 01:10:05,560
Generic HTTP, custom APIs without registration, tenant-wide Graph scopes: denied.
1739
01:10:05,560 --> 01:10:10,560
Dev and test can request temporary access with expiry, budget and a gate ticket.
1740
01:10:10,560 --> 01:10:16,560
Entropy loves "just this once." Don't feed it. Require simulated execution, a dry run, before any write.
1741
01:10:16,560 --> 01:10:22,560
The engine evaluates preconditions, enumerates targets, estimates blast radius and previews compensating transactions.
1742
01:10:22,560 --> 01:10:29,560
If the diff looks wrong, the gate refuses with specifics, missing rollback, unlabeled sources, disallowed connector or scope drift.
1743
01:10:29,560 --> 01:10:32,560
Dry-run outputs become the audit seed for execute.
1744
01:10:32,560 --> 01:10:35,560
Rollback isn't a paragraph, it's a step per write with a predicate.
1745
01:10:35,560 --> 01:10:40,560
If you can't describe the inverse and when it's safe, you're proposing an irrecoverable change.
1746
01:10:40,560 --> 01:10:43,560
Gate stops it, period. Run as agent, not owner.
1747
01:10:43,560 --> 01:10:49,560
Every flow executes under a dedicated identity with least privilege, JIT elevation and a short clock.
1748
01:10:49,560 --> 01:10:53,560
Standing secrets go to zero. Managed identities wherever possible.
1749
01:10:53,560 --> 01:11:01,560
Tenant vault for anything else, with rotation on run. Instrument like production: log preconditions, approvals, identity, step outcomes and compensating transactions.
1750
01:11:01,560 --> 01:11:07,560
Alert on writes without gate IDs, connectors outside allow lists, high rollback rates and permission changes clustered by identity.
1751
01:11:07,560 --> 01:11:11,560
Say it plainly: most Copilot incidents will originate here.
1752
01:11:11,560 --> 01:11:16,560
Audit it like your change management system, because architecturally it is one.
1753
01:11:16,560 --> 01:11:21,560
The ten-point Copilot governance checklist. This is the artifact your security team will wish you had.
1754
01:11:21,560 --> 01:11:25,560
It's short, it's enforceable and it maps directly to the mandates you just heard.
1755
01:11:25,560 --> 01:11:29,560
Use it before rollout, during design reviews, and in incident post-mortems.
1756
01:11:29,560 --> 01:11:34,560
Ask each question out loud. If the answer is not provable in configuration or telemetry, it's a no.
1757
01:11:34,560 --> 01:11:40,560
One, can this Copilot read X? X is not the tenant. X is the named default-deny scope with explicit inclusions,
1758
01:11:40,560 --> 01:11:42,560
time windows and label behavior.
1759
01:11:42,560 --> 01:11:45,560
Services, sites, channels, mailboxes: list them.
1760
01:11:45,560 --> 01:11:50,560
If you can't point to the Graph scopes, environment group policies and label-enforced deny and redact rules,
1761
01:11:50,560 --> 01:11:52,560
you don't have a fence, you have a hope.
1762
01:11:52,560 --> 01:11:54,560
Two, can it execute Y?
1763
01:11:54,560 --> 01:11:58,560
Y is the specific class of actions this agent may take, if any:
1764
01:11:58,560 --> 01:12:01,560
Read only, plan only or execute under gate with least privilege and rollback.
1765
01:12:01,560 --> 01:12:07,560
Name the connectors, the environment, the agent identity and the JIT elevation window.
1766
01:12:07,560 --> 01:12:11,560
If "run as owner" shows up anywhere, the answer is no.
1767
01:12:11,560 --> 01:12:15,560
Three, who approves Z? Z is the gatekeeper by risk tier and data class.
1768
01:12:15,560 --> 01:12:18,560
Not a manager, a role. Procurement exceptions go here.
1769
01:12:18,560 --> 01:12:21,560
HR writes go there. Finance changes require counter signature.
1770
01:12:21,560 --> 01:12:27,560
If the approval card doesn't render a machine-readable plan and show the approver's role, not just the name, you can't prove gate.
1771
01:12:27,560 --> 01:12:29,560
Four, where is the refusal state?
1772
01:12:29,560 --> 01:12:36,560
Show me the rules that stop answers without registered citations, stop grounding without authorized scope, and stop execute without preconditions.
1773
01:12:36,560 --> 01:12:38,560
Refusal is not a tone, it's a code path.
1774
01:12:38,560 --> 01:12:45,560
If your evaluator can't emit refusal reason fields and your users don't see clear ladders back to canon, you're training compliance theater.
1775
01:12:45,560 --> 01:12:50,560
Five, how is output structured? For anything that informs a decision or proposes action, show the schema.
1776
01:12:50,560 --> 01:12:52,560
Tables with defined columns.
1777
01:12:52,560 --> 01:12:57,560
JSON with required keys: goal, assumptions, sources, steps, rollback, risk tier, approvals.
1778
01:12:57,560 --> 01:13:04,560
If your gate can't validate shape, resolve citations to the registry and diff plan changes, you're rubber-stamping prose.
1779
01:13:04,560 --> 01:13:06,560
Six, what's the state model?
1780
01:13:06,560 --> 01:13:11,560
Stateless by default, with explicit session contracts when continuity is required.
1781
01:13:11,560 --> 01:13:13,560
The contract names scope, sources, rules, time.
1782
01:13:13,560 --> 01:13:18,560
Every answer presents a visible context ledger. Resets on surface changes, TTLs on embeddings.
1783
01:13:18,560 --> 01:13:22,560
If users can't see what was carried and why, drift is happening silently.
1784
01:13:22,560 --> 01:13:26,560
Seven, what's the budget? Per environment group, per team, per agent.
1785
01:13:26,560 --> 01:13:29,560
Cost per answer for narrative, cost per action for automations.
1786
01:13:29,560 --> 01:13:37,560
Alerts at 80% with anatomy: prompts, retrieval windows, connectors. Hard stops at 100% that route to gate: raise, redesign or retire.
1787
01:13:37,560 --> 01:13:41,560
If cost is a monthly report, not a circuit breaker, you're financing entropy.
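The difference between a report and a circuit breaker is that the breaker returns a decision at run time. A minimal Python sketch follows, using the 80% and 100% thresholds from the checklist; the function name and the state strings are assumptions for this example.

```python
def budget_state(spent, budget, alert_at=0.8):
    """Classify spend against a budget: 'ok', 'alert' at 80%, 'stop' at 100%."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = spent / budget
    if ratio >= 1.0:
        return "stop"   # hard stop: route to gate to raise, redesign or retire
    if ratio >= alert_at:
        return "alert"  # alert with anatomy: prompts, retrieval windows, connectors
    return "ok"


for spent in (50, 85, 120):
    print(spent, budget_state(spent, 100))
```

A caller would check the state before each answer or action and refuse to proceed on `stop`, rather than discovering the overrun at month end.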
1788
01:13:41,560 --> 01:13:48,560
Eight, who owns telemetry? Owners, human names, for every agent identity, environment and registry domain.
1789
01:13:48,560 --> 01:13:52,560
Activity Explorer captures prompt response with label context.
1790
01:13:52,560 --> 01:13:54,560
DSPM lights up sensitive interactions per app.
1791
01:13:54,560 --> 01:14:00,560
Canaries run on policy domains. If nobody is accountable for drift budgets, refusal rates and approval latency,
1792
01:14:00,560 --> 01:14:01,560
your graphs are wallpaper.
1793
01:14:01,560 --> 01:14:04,560
Nine, what's the identity story? Separate humans from agents.
1794
01:14:04,560 --> 01:14:08,560
Dedicated agent identities per role, with scope and time.
1795
01:14:08,560 --> 01:14:11,560
No role union: reason, plan and execute identities are distinct.
1796
01:14:11,560 --> 01:14:14,560
Conditional access based on authentication context.
1797
01:14:14,560 --> 01:14:17,560
Managed identities, no standing secrets, tenant vault if you must.
1798
01:14:17,560 --> 01:14:21,560
If a maker token can run a flow, your blast radius is a person.
1799
01:14:21,560 --> 01:14:24,560
Ten, what breaks the glass? Incidents happen.
1800
01:14:24,560 --> 01:14:26,560
Show the disable switch per agent and environment group.
1801
01:14:26,560 --> 01:14:31,560
Show the path to revoke scopes, rotate secrets and force refusal to reason-only across a portfolio.
1802
01:14:31,560 --> 01:14:34,560
Show the audit linkage from plan to approve to execute.
1803
01:14:34,560 --> 01:14:37,560
If your first move is an email, you don't have an incident path.
1804
01:14:37,560 --> 01:14:40,560
You have a hope chest. Run the checklist before you turn anything on.
1805
01:14:40,560 --> 01:14:46,560
Run it again when cost spikes. Run it after every refusal lull, because refusal drifting to zero means your gates are eroded.
1806
01:14:46,560 --> 01:14:49,560
This isn't a maturity model. It's a go/no-go test.
1807
01:14:49,560 --> 01:14:55,560
When the answers are provable in policy objects, scopes, registries, identities and logs, you're running a control plane.
1808
01:14:55,560 --> 01:15:00,560
When they're aspirational, you're narrating intent while the engine compiles behavior.
1809
01:15:00,560 --> 01:15:07,560
One more practical note. Print this and give it to security, legal, data and engineering with a single instruction: answer in artifacts.
1810
01:15:07,560 --> 01:15:10,560
If they show you a wiki, it's a no. If they show you a prompt, it's a no.
1811
01:15:10,560 --> 01:15:17,560
Policies, allow lists, labels with effect, authentication context, environment groups, evaluator code, Activity Explorer traces: those are yes.
1812
01:15:17,560 --> 01:15:24,560
Copilot entropy grows by default. Determinism has receipts. Reference control pattern: reason, plan, gate, execute.
1813
01:15:24,560 --> 01:15:28,560
You've heard the mandates. Here is the operating system that binds them at runtime.
1814
01:15:28,560 --> 01:15:31,560
Reason, plan, gate, execute. It isn't a slogan.
1815
01:15:31,560 --> 01:15:36,560
It's the sequence that turns a probabilistic assistant into a deterministic change machine.
1816
01:15:36,560 --> 01:15:45,560
Reason is the thinking surface. Read only by design. It gathers facts from bounded scopes, compares options and outputs a structured recommendation with citations that resolve to your registry.
1817
01:15:45,560 --> 01:15:53,560
No actions, no side effects. If the recommendation would inform a decision, it arrives in a table with your defined columns or a JSON object with required keys.
1818
01:15:53,560 --> 01:15:59,560
If a claim sounds like policy without an approved anchor, reason refuses. Truth or silence.
1819
01:15:59,560 --> 01:16:05,560
That refusal isn't UX etiquette. It's a code path, enforced by the evaluator that already checks shape and citations.
1820
01:16:05,560 --> 01:16:11,560
Plan is translation. It takes the structured recommendation and compiles it into a machine readable plan artifact your gate can validate.
1821
01:16:11,560 --> 01:16:18,560
Not prose, a schema. Goal, assumptions, sources with stable anchors and labels, risk tier and steps.
1822
01:16:18,560 --> 01:16:23,560
Each step includes action, target, preconditions, side effects and rollback.
1823
01:16:23,560 --> 01:16:28,560
If you can't name the inverse and the predicate that makes it safe, you're proposing an irreversible change.
1824
01:16:28,560 --> 01:16:33,560
The plan gets a content hash, so humans and systems can talk about the same thing without guessing.
1825
01:16:33,560 --> 01:16:38,560
Because it's a diffable object, you can lint it, compare it against policy and store versions.
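A content hash over a plan is straightforward once the plan is serialized canonically. The sketch below is one possible approach, assuming the plan is a JSON-compatible dictionary; the field values and the registry anchor are invented for illustration.

```python
import hashlib
import json


def plan_hash(plan):
    """SHA-256 over canonical JSON, so key order and whitespace don't matter."""
    canonical = json.dumps(plan, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical plan artifact with the keys named in the transcript.
plan = {
    "goal": "renew vendor contract",
    "assumptions": ["budget approved"],
    "sources": ["registry://procurement/policy-12#v3"],
    "risk_tier": "medium",
    "steps": [{"action": "update_record", "target": "vendor-42",
               "preconditions": ["record exists"], "rollback": "restore_record"}],
}
print(plan_hash(plan))
```

Because the hash changes whenever any field changes, an approval stamped on one hash cannot silently cover an edited plan.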
1826
01:16:38,560 --> 01:16:42,560
Gate is evaluation plus authority. It applies your risk tiers and data classes.
1827
01:16:42,560 --> 01:16:45,560
Low-risk, read-only plans auto-approve under policy.
1828
01:16:45,560 --> 01:16:49,560
Medium-risk writes require a human in the loop by a named role, not a manager.
1829
01:16:49,560 --> 01:16:53,560
High risk domains require separation of duties and counter signatures.
1830
01:16:53,560 --> 01:17:00,560
The gate checks three things in code every time: shape matches schema, citations resolve to approved canon, and rollback is present per write.
1831
01:17:00,560 --> 01:17:04,560
If any fails, it refuses with a reason and bounded ladders back to a fix:
1832
01:17:04,560 --> 01:17:08,560
Attach missing canon, add preconditions, supply rollback.
1833
01:17:08,560 --> 01:17:12,560
Gate is where "silence beats fiction" stops being rhetoric and starts being enforcement.
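The three gate checks can be pictured as a small validator that returns refusal reasons instead of a yes or no. This Python sketch is illustrative only: the key names follow the schema listed in the transcript, while the `writes` flag on a step and the registry entries are assumptions added so the example can run.

```python
REQUIRED_KEYS = {"goal", "assumptions", "sources", "risk_tier", "steps"}


def gate_check(plan, registry):
    """Return a list of refusal reasons; an empty list means the plan may proceed."""
    reasons = []
    # 1. Shape matches schema.
    missing = REQUIRED_KEYS - set(plan)
    if missing:
        reasons.append("missing keys: " + ", ".join(sorted(missing)))
    # 2. Citations resolve to approved canon.
    for source in plan.get("sources", []):
        if source not in registry:
            reasons.append("unregistered citation: " + source)
    # 3. Rollback is present per write.
    for index, step in enumerate(plan.get("steps", [])):
        if step.get("writes") and not step.get("rollback"):
            reasons.append("step %d: write without rollback" % index)
    return reasons


registry = {"registry://procurement/policy-12#v3"}
plan = {
    "goal": "update vendor record",
    "assumptions": [],
    "sources": ["registry://procurement/policy-12#v3", "blog://some-post"],
    "risk_tier": "medium",
    "steps": [{"action": "update_record", "writes": True}],
}
for reason in gate_check(plan, registry):
    print(reason)
```

Each reason maps to one of the ladders back to a fix: attach missing canon, add preconditions, supply rollback.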
1834
01:17:12,560 --> 01:17:17,560
Execute is scoped, short-lived and observable. It runs only after gate stamps the specific plan hash.
1835
01:17:17,560 --> 01:17:25,560
It uses a dedicated executor identity with the smallest permissions for the shortest time in the approved environment under authentication context.
1836
01:17:25,560 --> 01:17:27,560
Preconditions are checked before every action.
1837
01:17:27,560 --> 01:17:30,560
Every write logs the compensating transaction you declared in plan.
1838
01:17:30,560 --> 01:17:41,560
If a precondition fails, execute stops. No best effort. If a step fails mid-run, execute triggers rollbacks and emits a structured failure report to gate with the plan ID, step, error and what was undone.
1839
01:17:41,560 --> 01:17:46,560
That report is not an apology. It's an artifact your telemetry and post-incident review can trust.
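The execute behavior described above can be sketched as a loop that checks preconditions, runs each action, and unwinds completed steps on failure. The step dictionary shape and the report fields below are assumptions for this example. One choice here goes beyond the transcript: a failed precondition stops the run without undoing earlier steps, and a real design would need to decide that explicitly.

```python
def execute(plan_id, steps):
    """Run steps in order; stop on a failed precondition, roll back on a failed action."""
    completed = []
    for index, step in enumerate(steps):
        if not step["precondition"]():
            # No best effort: stop and report.
            return {"plan_id": plan_id, "status": "stopped", "step": index,
                    "error": "precondition failed", "undone": []}
        try:
            step["action"]()
        except Exception as exc:
            undone = []
            # Compensate completed steps in reverse order.
            for done_index, done_step in reversed(completed):
                done_step["rollback"]()
                undone.append(done_index)
            return {"plan_id": plan_id, "status": "rolled_back", "step": index,
                    "error": str(exc), "undone": undone}
        completed.append((index, step))
    return {"plan_id": plan_id, "status": "completed", "step": None,
            "error": None, "undone": []}


state = []


def fail():
    raise RuntimeError("target locked")


steps = [
    {"precondition": lambda: True,
     "action": lambda: state.append("record-updated"),
     "rollback": lambda: state.remove("record-updated")},
    {"precondition": lambda: True, "action": fail, "rollback": lambda: None},
]
report = execute("plan-001", steps)
print(report["status"], report["undone"], state)
```

The returned dictionary is the structured failure report: it names the plan, the failing step, the error and what was undone.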
1840
01:17:46,560 --> 01:17:52,560
Map this pattern to the surfaces. In Teams, reason outputs a recommendation table with alternatives and citations.
1841
01:17:52,560 --> 01:17:59,560
Plan returns JSON. Gate renders an approval card that shows plan hash, failed validations and required roles.
1842
01:17:59,560 --> 01:18:05,560
Execute runs a backend process tied to the plan hash and executor identity, with logs in Activity Explorer.
1843
01:18:05,560 --> 01:18:08,560
In Outlook, reason drafts but never sends.
1844
01:18:08,560 --> 01:18:19,560
Plan produces a send-ready object with recipients, content hash and anchors. Gate routes approvals. Execute sends under agent identity with DKIM/SPF alignment and full traceability.
1845
01:18:19,560 --> 01:18:25,560
In Power Automate, Copilot suggestions can only produce plans. Approvals are mandatory for writes.
1846
01:18:25,560 --> 01:18:31,560
Execute runs flows as agents with dry run available to preview blast radius. Telemetry closes the loop.
1847
01:18:31,560 --> 01:18:41,560
Track refusal rates in reason and gate. Near zero means your gates have eroded. Track gate latency and approval sources. Concentration reveals bottlenecks and SoD risks.
1848
01:18:41,560 --> 01:18:46,560
Track cost per plan and cost per execution. Spikes signal retrieval bloat or environment drift.
1849
01:18:46,560 --> 01:18:53,560
Tie every action back to plan hash and executor identity so you can break glass by plan, by agent or by scope without guesswork.
1850
01:18:53,560 --> 01:19:04,560
Common anti-patterns to kill with this pattern: brainstorming that emits runnable flows, approvals over prose, run-as-owner credentials, no dry run for writes, rollbacks as narratives instead of steps.
1851
01:19:04,560 --> 01:19:11,560
The fix is boring and repeatable. Reason structures, plan compiles, gate enforces, execute logs and rolls back.
1852
01:19:11,560 --> 01:19:19,560
If this sounds heavy, it's because execution is heavy. But once the sequence is real, speed returns in the only place it's safe, reason, while control anchors every action that follows.
1853
01:19:19,560 --> 01:19:26,560
The red flag test you can run tomorrow. Here's the test that exposes risk without a committee or a six week review. Ask one question.
1854
01:19:26,560 --> 01:19:35,560
"If we turn this Copilot on for everyone tomorrow, what would scare us?" Then run six passes, fast, with artifacts on the table, not opinions. Pass one, data scope.
1855
01:19:35,560 --> 01:19:43,560
Name exactly what this agent can read today. Not SharePoint. The named sites, channels, mailboxes, indices with default deny, inclusions, labels and time windows.
1856
01:19:43,560 --> 01:19:48,560
If anyone says it depends on the prompt, that's a red flag. Prompts don't draw fences. Policies do.
1857
01:19:48,560 --> 01:19:53,560
If scopes aren't visible in configuration, stop and fix mandates one and two.
1858
01:19:53,560 --> 01:20:00,560
Pass two, execution scope. Can this agent write? Where? Under what identity? With which connectors? For how long, and with what rollback?
1859
01:20:00,560 --> 01:20:07,560
Pull one real plan and trace it. Does it arrive as JSON with goal, assumptions, sources, steps, preconditions, rollback and risk tier?
1860
01:20:07,560 --> 01:20:13,560
If you see prose approvals or run-as-owner, that's a red flag. Fix mandates three, four, eight and ten before a button exists.
1861
01:20:13,560 --> 01:20:20,560
Pass three, refusal model. Ask for something policy-shaped: "What's the exception process for vendor onboarding?" Watch the answer.
1862
01:20:20,560 --> 01:20:27,560
Does it cite your registry with stable anchors? Can you click through to a labeled, versioned artifact? Or does it hedge with "typically" and "standard practice"?
1863
01:20:27,560 --> 01:20:33,560
No citation, no answer is the rule. If the evaluator can't halt and emit a refusal reason, that's a red flag. Fix mandate five.
1864
01:20:33,560 --> 01:20:40,560
Pass four, structure. Request a recommendation that would inform a decision. Choose between option A and B for renewal.
1865
01:20:40,560 --> 01:20:51,560
Demand the table. Are the columns what you declared (option, prerequisites, citations, predicted impact, residual risk, refusal reason if incomplete), or is it a paragraph?
1866
01:20:51,560 --> 01:20:58,560
Then ask for a plan. Do you get the JSON you lint or bullets you can't validate? If shape shifts between runs, that's a red flag.
1867
01:20:58,560 --> 01:21:04,560
Fix mandate five. Pass five, state. Change channels and ask a follow-up that relies on prior constraints.
1868
01:21:04,560 --> 01:21:14,560
Does the system announce a reset and ask to attach a session contract? Or does it carry phantom state and answer confidently? Does every answer show a context ledger, scope, sources, rules, time?
1869
01:21:14,560 --> 01:21:19,560
If memory appears without a contract, that's a red flag. Fix mandate six. Pass six: telemetry and cost.
1870
01:21:19,560 --> 01:21:27,560
Pull last week's Activity Explorer for this agent. Can you see prompt-response pairs with label context? Can you calculate cost per answer and cost per action?
1871
01:21:27,560 --> 01:21:38,560
Do you have budgets per environment group, team, and agent, with alerts at 80% and hard stops at 100%? Run a canary: same prompt across channels. Variance beyond a tight band means drift.
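The budget check described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a Copilot or Microsoft API: the `Budget` class, the `check_budget` function, and the scope strings are hypothetical names invented for this example. Only the 80% alert and 100% hard stop come from the episode.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class BudgetState(Enum):
    OK = "ok"
    ALERT = "alert"          # owner is notified and must decide
    HARD_STOP = "hard_stop"  # agent is paused until someone acts


@dataclass(frozen=True)
class Budget:
    scope: str                 # hypothetical label, e.g. "agent:renewals"
    limit: float               # spend ceiling for the period
    alert_ratio: float = 0.80  # alert threshold named in the episode
    stop_ratio: float = 1.00   # hard-stop threshold named in the episode


def check_budget(budget: Budget, spent: float) -> BudgetState:
    """Classify current spend against one budget's thresholds."""
    if budget.limit <= 0:
        raise ValueError("budget limit must be positive")
    ratio = spent / budget.limit
    if ratio >= budget.stop_ratio:
        return BudgetState.HARD_STOP
    if ratio >= budget.alert_ratio:
        return BudgetState.ALERT
    return BudgetState.OK


def cost_per_unit(total_cost: float, count: int) -> Optional[float]:
    """Cost per answer or per action; None when nothing was produced."""
    return total_cost / count if count else None
```

The same check would run once per layer (environment, team, agent), and the most severe state wins.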
1872
01:21:38,560 --> 01:21:44,560
If you can't see it, or you see cost spikes with no explanation, that's a red flag.
1873
01:21:44,560 --> 01:21:51,560
Fix mandate seven. End the test with identity. Show me the executor identity, its scope, and its clock. Show me separation of duties in approvals.
1874
01:21:51,560 --> 01:22:03,560
If a human token can run a flow, or a single identity can reason, plan, and execute, that's a red flag. Fix mandate eight. You'll finish this in an hour if the controls exist. You'll spend weeks if all you have are prompts and optimism.
1875
01:22:03,560 --> 01:22:10,560
That's the point. The red flag test forces answers in artifacts: scopes, registries, schemas, evaluators, identities, logs.
1876
01:22:10,560 --> 01:22:18,560
If they're not present, you don't need more training. You need to pause rollout and implement the mandates. Copilot entropy grows by default. This test tells you where it's already winning.
1877
01:22:18,560 --> 01:22:28,560
From failures to controls: closing the loop. Let's close the loop by mapping each anchor failure to the control that neutralizes it. Not advice, not culture: concrete enforcement you can ship.
1878
01:22:28,560 --> 01:22:37,560
Graph overreach, the team summary that casually drags in HR or legal, dies under bounded context. Mandate one defines the system as a control plane component, not a helpful pal.
1879
01:22:37,560 --> 01:22:52,560
Mandate two carves the fence: default-deny scopes, explicit inclusions, label-aware deny and redact. In practice, that means channel-level scopes in Teams, private channels and high-risk labels excluded by policy, not etiquette, and registry-bound citations so tenant-wide never sneaks in.
1880
01:22:52,560 --> 01:23:01,560
If a summary crosses a label it can't carry, the evaluator refuses with a path back to canon. Leakage is prevented by design, not by after-the-fact admonitions.
1881
01:23:01,560 --> 01:23:10,560
Hallucinated authority, the invented policy asserted as fact, stops at authority gating. Mandate five installs a registry with owners, labels, versions, and stable anchors.
1882
01:23:10,560 --> 01:23:22,560
The evaluator enforces "no citation, no answer" whenever a claim looks like policy. Answers that lack anchors refuse with a reason and bounded ladders: search the policy domain, notify the owner, or restate without policy claims.
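A minimal sketch of such an evaluator follows, in Python. Everything named here is an assumption for illustration: the keyword heuristic in `POLICY_MARKERS`, the `Answer` and `Verdict` types, and the anchor format are hypothetical. The episode does not specify how "looks like policy" is detected. Only the rule itself (refuse policy-shaped answers that lack registry anchors, with a reason and bounded next steps) comes from the transcript.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

# Hypothetical heuristic: a real evaluator would likely use a classifier.
POLICY_MARKERS = ("policy", "exception process", "procedure", "required", "must")

# The bounded next steps named in the episode.
NEXT_STEPS = (
    "search the policy domain",
    "notify the owner",
    "restate without policy claims",
)


@dataclass
class Answer:
    text: str
    citations: List[str] = field(default_factory=list)  # registry anchor IDs


@dataclass
class Verdict:
    allowed: bool
    refusal_reason: Optional[str] = None
    next_steps: Tuple[str, ...] = ()


def evaluate(answer: Answer, registry_anchors: Set[str]) -> Verdict:
    """Allow an answer only if policy-shaped claims cite known anchors."""
    text = answer.text.lower()
    if not any(marker in text for marker in POLICY_MARKERS):
        return Verdict(allowed=True)  # not policy-shaped: no gate applies
    cited = answer.citations
    if cited and all(c in registry_anchors for c in cited):
        return Verdict(allowed=True)
    return Verdict(
        allowed=False,
        refusal_reason="policy-shaped claim without a valid registry anchor",
        next_steps=NEXT_STEPS,
    )
```

The point of the sketch is that refusal is a return value in the answer path, so it can be logged and counted as a refusal rate.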
1883
01:23:22,560 --> 01:23:51,560
You're not teaching better prompts. You're removing the option to look official without anchors. Automation without consent, the flow that modifies records without a human brake, can't form under separation and approvals. Mandate four splits reason, plan, gate, execute. Mandate three demands structure so the gate can evaluate. Mandate ten makes approvals a law: dry run mandatory and rollbacks first-class. Approvals render plan JSON with goal, steps, preconditions, citations, rollback, and risk tier.
1884
01:23:51,560 --> 01:24:06,560
Disallowed connectors, missing rollback, or prose instead of schema are rejected in code. Execute runs under an agent identity with least privilege and a clock, not the maker's token. You turned a good idea into governed change or you stopped it. There's no middle.
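As a rough sketch of what "rejected in code" could look like, the Python function below gates a plan before it reaches an approver. The field names, the `ALLOWED_CONNECTORS` set, and the `gate_plan` function are hypothetical choices for this example. The episode lists the plan contents (goal, steps, preconditions, citations, rollback, risk tier) but gives no concrete schema.

```python
import json
from typing import List

# Assumed field names, derived from the plan contents listed in the episode.
REQUIRED_FIELDS = ("goal", "steps", "preconditions", "citations", "rollback", "risk_tier")

# Hypothetical allow-list; a real one would come from policy.
ALLOWED_CONNECTORS = {"sharepoint", "dataverse"}


def gate_plan(raw: str) -> List[str]:
    """Return rejection reasons; an empty list means the plan may go to approval."""
    try:
        plan = json.loads(raw)
    except json.JSONDecodeError:
        return ["plan is prose, not JSON"]
    if not isinstance(plan, dict):
        return ["plan is not a JSON object"]

    reasons = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in plan]
    if "rollback" in plan and not plan["rollback"]:
        reasons.append("rollback is empty")

    steps = plan.get("steps", [])
    if not isinstance(steps, list):
        reasons.append("steps must be a list")
        return reasons
    for step in steps:
        connector = step.get("connector") if isinstance(step, dict) else None
        if connector not in ALLOWED_CONNECTORS:
            reasons.append(f"disallowed connector: {connector}")
    return reasons
```

An approval surface built on this would enable the approve button only when `gate_plan` returns an empty list.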
1885
01:24:06,560 --> 01:24:19,560
Memory illusion, the false continuity that erodes trust, goes away when state is explicit. Mandate six creates session contracts that name scope, sources, rules, and time, and context ledgers that list exactly what an answer used.
1886
01:24:19,560 --> 01:24:37,560
Channel changes and midnight resets are visible and pause the engine. Attach to carry. Refuse when a user asks for continuity without a contract. Embeddings get TTL by data class, with provenance and snapshots bound to contracts. Ghost state has no way to hide because memory is either declared or doesn't exist.
1887
01:24:37,560 --> 01:24:55,560
Cost chaos: token spikes, API bursts, and mystery agents become signal under observability and budgets. Mandate seven instruments prompts, responses, grounding calls, labels, and cost by agent. You track cost per answer and cost per action, alert on deltas, and impose ceilings at the environment, team, and agent layers.
1888
01:24:55,560 --> 01:25:06,560
Drift is defined and enforced. Shape drift, source drift, refusal drift, latency drift, and cost drift push agents into degraded mode until someone fixes scopes, registry, or retrieval windows.
1889
01:25:06,560 --> 01:25:35,560
Budgets become circuit breakers, not retrospectives. One more cross-cutting fix neutralizes blast radius everywhere: identity. Mandate eight separates humans and agents, applies least privilege at the data, action, and identity layers, and enforces separation of duties. Reason identities read. Plan identities compile. Execute identities act for minutes, not days, with managed identities and no standing secrets. Approvers aren't builders. Registry owners aren't approvers. When something goes wrong, you revoke a key, shorten a clock, tighten a scope.
1890
01:25:35,560 --> 01:25:56,560
And the incident ends where it started. Notice the pattern: each failure wasn't a model problem. It was absence of boundaries, absence of anchors, absence of brakes, absence of state, absence of telemetry. The mandates fill those absences with enforcement. You define the system, constrain context, demand structure, split thinking from doing, gate authority, make state explicit, see and budget behavior, pin identity.
1891
01:25:56,560 --> 01:26:24,560
And then you engineer the conversational edge and harden the action frontier. Now fold this back into your operating rhythm. Use the checklist before you ship, during design, and after incidents. Teach the reason, plan, gate, execute sequence until you can say it in your sleep. Run the red flag test monthly. Let the results update registries, scopes, budgets, and identities. If refusal rates fall to zero in policy domains, you tighten the gate.
1892
01:26:24,560 --> 01:26:53,560
[unintelligible] If a plan lands as prose, you don't lecture. You block the approval button. Copilot entropy grows by default. Determinism exists only by design. You don't need a culture change to start. You need to enforce your intent with code paths the engine can't skip. When you do, the firefighting stops, and the control plane you thought you bought is finally the one you're running. Pause before the boardroom. You've heard the failures, the mandates, the checklist, the pattern, and the red flag test. Now we translate this into what your board owns: boundaries, refusal, identity, and observability, with on/off switches and names.
1893
01:26:53,560 --> 01:27:13,560
Executive brief to your board: what you own. Let's make this unambiguous. You don't own prompts. You own the control plane. That means four things with names, artifacts, and on/off switches. First, you own the boundaries, not the features. Define what any Copilot can read and where it can never look. That's default-deny scopes by surface, explicit inclusions with
1894
01:27:13,560 --> 01:27:38,560
expiries, and label-aware deny and redact. If a director asks, "Can this agent see HR or legal?", your answer is not a slide. It's a policy object that says no, with an audit showing it refused last week. When leakage happens, it's never the model. It's a missing fence. Second, you own the refusal model, not optimism. Truth or silence is not a slogan. It's an evaluator in the answer path. No citation, no answer. No schema, no plan. No preconditions, no execution.
1895
01:27:38,560 --> 01:28:07,560
Refusal rates in policy domains should not trend to zero. When they do, your gate's eroded. You don't ask for better behavior. You instruct your team to restore the code path that refuses without canon and routes users back to owners. Third, you own identity separation, not convenience. Humans and agents are different species. Agents get dedicated identities with a role, a scope, and a clock. Reason reads, plan compiles, execute acts briefly under gate. No shared tokens, no run-as-owner, no standing secrets. Separation of duties is not a compliance paragraph.
1896
01:28:07,560 --> 01:28:20,560
It's who can approve, who can build, who can execute: never the same names. When something goes wrong, you should be able to revoke a key, shorten a clock, tighten the scope, and watch the blast radius shrink in real time.
1897
01:28:20,560 --> 01:28:35,560
Fourth, you own observability, budgets, and drift. Cost per answer and cost per action are not finance metrics. They're health signals. Drift is defined and enforced: shape, source, refusal, latency. Budgets are circuit breakers at the environment, team, and agent layers.
1898
01:28:35,560 --> 01:28:56,560
At 80%, an owner gets the anatomy and the decision. At 100%, the agent pauses and the gate decides: raise, redesign, or retire. If your first telemetry is an invoice, you're funding entropy. Why now? Because the estate is already accumulating invisible agent runtimes inside channels faster than policy can catch up. Every week you delay, temporary scopes become the de facto norms,
1899
01:28:56,560 --> 01:29:25,560
helpful defaults become policy, and prose approvals become change management. The firefighting you see, silent leakage, confident fiction, irreversible automations, false continuity, cost spikes: these are not AI surprises. They're architectural omissions with your name on the control register. What do you ask for? Three artifacts in writing, with owners. One: a live registry of canon with labels, versions, and anchors. Two: the evaluators that enforce structure, citations, refusal, and plan gating. Code, not guidelines. Three: the inventory of agent identities with roles,
1900
01:29:25,560 --> 01:29:51,560
scopes, clocks, and the incident path that disables by portfolio, not person. And then impose the operating system: reason, plan, gate, execute. Ask your teams to answer every question in artifacts: scopes, registries, identities, logs. If they show you a wiki, it's a no. If they show you a prompt, it's a no. Control the control plane or it will control you. If you remember one thing, remember this: Copilot entropy grows by default.
1901
01:29:51,560 --> 01:30:10,560
Determinism only exists by design. Put the boundaries in code. Gate authority to truth or silence. Split thinking from doing, and make state and identity explicit. Use the checklist before rollout, enforce reason, plan, gate, execute, and run the red flag test tomorrow. If any control fails, pause. Fix the plane you fly. Then, and only then, turn Copilot back on.