In this episode, "Architectural Drift: Governing Autonomous AI Models in Power BI Fabric," we explore why modern analytics platforms like Microsoft Fabric and Power BI are no longer simply reporting tools but part of a broader architectural ecosystem that must be governed to prevent silent semantic drift. Rather than treating Power BI dashboards as the final destination for insights, the episode reframes them as evidence and validation layers within a data ecosystem whose primary interaction surface has shifted upstream. Fabric collapses the traditional boundaries between storage, compute, semantic models, publishing, and analytics into a unified environment, which accelerates decision making but also amplifies drift in definitions, metrics, and authority boundaries.

As analytic workloads become conversational and AI-enabled, legacy governance assumptions no longer hold. The issue is not technical failure but architectural drift: data semantics that once required explicit definition now evolve at refresh velocity, and AI systems amplify whatever definitions and access paths they encounter. Dashboards remain important for auditability, lineage, and compliance, but they no longer dictate how answers arrive; that role has shifted to semantic models and AI decision logic. The right question for organizations is not whether AI replaces BI, but whether BI semantic contracts serve as the governing input layer for conversational answers, preventing conditional chaos and ensuring deterministic responses based on enterprise-agreed definitions. The episode concludes by outlining the role of semantic governance, owned measures, and defensible models in modern analytics, and why this approach is essential for maintaining trust, accountability, and control at scale.

In this episode, we explore architectural drift in Power BI and Microsoft Fabric and why modern analytics failures rarely come from broken tools, but from unmanaged semantic and architectural decisions.

Power BI dashboards were once the primary interface for business questions. Today, with Microsoft Fabric, AI-assisted analytics, and conversational query surfaces, dashboards are no longer the starting point for decisions — they are the validation layer. This shift fundamentally changes how governance, trust, and accountability must be designed.

Architectural drift occurs when semantic definitions, measures, and business logic slowly diverge across datasets, models, and AI-driven answers. The result is not obvious failure, but quiet inconsistency: multiple correct-looking answers that cannot be defended or audited.

This episode explains why Fabric accelerates this drift, why dashboards alone cannot prevent it, and how organizations must move governance upstream into semantic models and decision pathways rather than UI layers.


Key Topics Covered

  • What architectural drift means in modern analytics platforms

  • Why Power BI dashboards are no longer the primary decision interface

  • How Microsoft Fabric collapses boundaries between data, models, and consumption

  • The difference between data correctness and semantic correctness

  • Why AI and conversational analytics amplify drift instead of revealing it

  • The role of semantic models as governance contracts

  • Why governance must focus on meaning, not visualization


Core Insight

Architectural drift is not a reporting problem.
It is a semantic governance problem.

As AI systems increasingly answer questions directly, any ambiguity in models, measures, or definitions is amplified. If meaning is not governed, AI will still respond — just inconsistently.

Dashboards survive not as decision engines, but as audit artifacts: places to validate, defend, and trace answers back to governed logic.


Why This Matters for Power BI and Fabric Users

Microsoft Fabric enables speed by design. It removes friction between ingestion, modeling, and consumption. While this improves agility, it also removes natural checkpoints that once forced governance decisions.

Without explicit semantic ownership and architectural control, organizations experience:

  • metric duplication

  • conflicting definitions

  • untraceable AI-generated answers

  • erosion of trust in analytics

The solution is not more dashboards, but stronger semantic contracts.


Governance Implications

Effective governance in Power BI and Fabric requires:

  • Treating semantic models as authoritative sources of truth

  • Enforcing ownership of measures and definitions

  • Designing AI and conversational systems to query governed models only

  • Using dashboards as validation layers, not as primary interfaces

Governance shifts from configuration to meaning enforcement.


Who Should Listen

This episode is for:

  • Power BI and Fabric architects

  • Analytics and data governance leaders

  • CIOs, CTOs, and data platform owners

  • Teams preparing for AI-enabled analytics and Copilot adoption

  • Anyone responsible for decision trust at scale


Related Themes

  • Semantic governance in analytics

  • AI-driven decision systems

  • Microsoft Fabric architecture

  • Power BI governance models

  • Managing drift in enterprise data platforms

Transcript

1
00:00:00,000 --> 00:00:04,960
Most teams assume AI agents will standardize their Power BI or Fabric models. They won't.

2
00:00:04,960 --> 00:00:08,900
They'll produce something that looks consistent, compiles and even performs while the meaning

3
00:00:08,900 --> 00:00:10,560
quietly changes underneath you.

4
00:00:10,560 --> 00:00:12,200
That's architectural drift.

5
00:00:12,200 --> 00:00:16,200
The system still answers questions, but it no longer answers them the same way for the

6
00:00:16,200 --> 00:00:18,800
same reasons across teams and time.

7
00:00:18,800 --> 00:00:23,440
In the next few minutes, this will get defined in Fabric terms, then tied to the exact places

8
00:00:23,440 --> 00:00:24,920
drift starts.

9
00:00:24,920 --> 00:00:28,280
Measures, relationships, transformations and report semantics.

10
00:00:28,280 --> 00:00:33,400
Where speed without intent becomes entropy. On to defining architectural drift in Power BI and

11
00:00:33,400 --> 00:00:35,000
Fabric terms.

12
00:00:35,000 --> 00:00:37,760
Architectural drift in Power BI and Fabric is simple.

13
00:00:37,760 --> 00:00:41,600
The semantic meaning of your data changes without explicit intent, without review and

14
00:00:41,600 --> 00:00:44,040
without an accountable owner signing off.

15
00:00:44,040 --> 00:00:45,520
Not "the model broke."

16
00:00:45,520 --> 00:00:47,000
Not "refresh failed."

17
00:00:47,000 --> 00:00:49,000
Not "a report error popped up."

18
00:00:49,000 --> 00:00:50,000
Drift is worse.

19
00:00:50,000 --> 00:00:51,000
Drift keeps working.

20
00:00:51,000 --> 00:00:52,400
A measure returns a number.

21
00:00:52,400 --> 00:00:56,720
A visual renders. A stakeholder exports to Excel and forwards it to finance. Everyone

22
00:00:56,720 --> 00:00:57,720
nods.

23
00:00:57,720 --> 00:01:01,200
The only problem is that the number is now answering a slightly different question than

24
00:01:01,200 --> 00:01:02,800
the one you think you asked.

25
00:01:02,800 --> 00:01:06,840
That distinction matters because Power BI's semantic model is not a dashboard toy.

26
00:01:06,840 --> 00:01:08,800
It is an authorization and meaning engine.

27
00:01:08,800 --> 00:01:13,880
It encodes how the organization defines revenue, headcount, churn, backlog, margin.

28
00:01:13,880 --> 00:01:16,280
Whatever your executives insist is one number.

29
00:01:16,280 --> 00:01:20,800
And drift happens when that meaning changes via small local edits that look harmless in

30
00:01:20,800 --> 00:01:21,800
isolation.

31
00:01:21,800 --> 00:01:23,920
So what does that look like in real model terms?

32
00:01:23,920 --> 00:01:24,920
Start with measures.

33
00:01:24,920 --> 00:01:26,440
A measure is a contract.

34
00:01:26,440 --> 00:01:31,800
It defines a business concept under filter context. When that definition changes, even by optimization,

35
00:01:31,800 --> 00:01:36,280
even by refactoring, even by helpful AI cleanup, you've changed the contract.

36
00:01:36,280 --> 00:01:37,920
If nobody approved it, that's drift.
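
As an illustration (the table and column names here are hypothetical, not from the episode), a contract change can look like an innocent refactor:

```dax
-- Hypothetical "Net Sales" as finance approved it: gross sales minus returns.
Net Sales =
CALCULATE (
    SUM ( Sales[SalesAmount] ),
    Sales[TransactionType] <> "Return"
)

-- A "helpful" cleanup that still compiles, still renders, and runs faster,
-- but returns are now silently included. The contract changed.
Net Sales =
SUM ( Sales[SalesAmount] )
```

Both versions return a number; only one answers the question finance signed off on.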

37
00:01:37,920 --> 00:01:38,920
Then relationships.

38
00:01:38,920 --> 00:01:42,400
Relationships decide filter propagation, which decides what counts as included.

39
00:01:42,400 --> 00:01:46,800
You can keep the same measure text and still get different results if an agent adds a relationship,

40
00:01:46,800 --> 00:01:50,040
flips cross-filter direction, or creates a many-to-many shortcut.

41
00:01:50,040 --> 00:01:51,040
The model still works.

42
00:01:51,040 --> 00:01:53,120
It's just now answering a different question.
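
A sketch of what that looks like, assuming a hypothetical model with a single-direction Customer-to-Sales relationship (all names are illustrative):

```dax
-- The measure text never changes:
Customers Counted = DISTINCTCOUNT ( Customer[CustomerKey] )

-- Under a single-direction relationship, a slicer on Product does not filter
-- Customer, so this counts every customer. If an agent flips the relationship
-- to bidirectional, the equivalent of the following now applies everywhere,
-- and the same visual counts only customers who bought the selected products:
Customers Counted (Both) =
CALCULATE (
    DISTINCTCOUNT ( Customer[CustomerKey] ),
    CROSSFILTER ( Sales[CustomerKey], Customer[CustomerKey], BOTH )
)
```

Same formula text, different filter propagation, different question answered.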

43
00:01:53,120 --> 00:01:54,760
Then Power Query and transformations.

44
00:01:54,760 --> 00:01:59,480
A Power Query step that trims whitespace, replaces nulls, merges two sources, or changes

45
00:01:59,480 --> 00:02:01,240
a data type looks like plumbing.

46
00:02:01,240 --> 00:02:04,000
But it can change cardinality, keys and join behavior.

47
00:02:04,000 --> 00:02:05,000
That's not plumbing.

48
00:02:05,000 --> 00:02:09,520
That's meaning. Drift shows up when the transform changes and nobody records why the business

49
00:02:09,520 --> 00:02:11,400
definition changed with it.

50
00:02:11,400 --> 00:02:12,400
Then calculation groups.

51
00:02:12,400 --> 00:02:16,120
They're powerful because they rewrite measure evaluation at query time.

52
00:02:16,120 --> 00:02:19,240
They're also dangerous because a small change can apply everywhere.

53
00:02:19,240 --> 00:02:24,320
When an agent adjusts time intelligence logic, YTD, MTD, fiscal calendars, it can

54
00:02:24,320 --> 00:02:28,000
globally redefine what "this year" even means across the tenant.
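
As a hedged sketch (the calculation group and the date table name are hypothetical), one calculation item can redefine time for every measure it touches:

```dax
-- Calculation item "YTD" in a hypothetical Time Intelligence calculation group.
-- It rewrites whatever measure it is applied to, at query time.
YTD =
CALCULATE (
    SELECTEDMEASURE (),
    DATESYTD ( 'Date'[Date] )
)

-- An agent "fixing" fiscal logic with a one-argument change:
-- DATESYTD ( 'Date'[Date], "6/30" )
-- now ends the year on June 30 for every measure the item applies to.
```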

55
00:02:28,000 --> 00:02:29,000
Then report semantics.

56
00:02:29,000 --> 00:02:32,560
In PBIR or PBIP, the report isn't a single file anymore.

57
00:02:32,560 --> 00:02:33,720
It's a graph of definitions.

58
00:02:33,720 --> 00:02:37,560
Page JSON, visual JSON, themes, interactions, filters, bookmarks.

59
00:02:37,560 --> 00:02:41,640
So drift can happen purely at the presentation layer: the same model, the same measures, but

60
00:02:41,640 --> 00:02:44,320
a filter moved from page level to visual level.

61
00:02:44,320 --> 00:02:47,400
An interaction disabled, a hidden slicer left behind.

62
00:02:47,400 --> 00:02:48,720
Users see a stable report.

63
00:02:48,720 --> 00:02:49,720
They're wrong.

64
00:02:49,720 --> 00:02:53,880
Now separate drift from defects because governance failures come from confusing the two.

65
00:02:53,880 --> 00:02:55,040
Defects are loud.

66
00:02:55,040 --> 00:02:56,040
Something breaks.

67
00:02:56,040 --> 00:02:57,840
You get errors.

68
00:02:57,840 --> 00:02:58,960
Incidents happen.

69
00:02:58,960 --> 00:03:00,200
People respond.

70
00:03:00,200 --> 00:03:01,200
Drift is quiet.

71
00:03:01,200 --> 00:03:02,360
It passes validation.

72
00:03:02,360 --> 00:03:03,360
It ships.

73
00:03:03,360 --> 00:03:04,840
It becomes truth through repetition.

74
00:03:04,840 --> 00:03:08,920
That's why drift is the default outcome when you delegate modeling to agents without designing

75
00:03:08,920 --> 00:03:10,480
the controls first.

76
00:03:10,480 --> 00:03:11,640
Agents are pattern engines.

77
00:03:11,640 --> 00:03:13,120
They produce plausible structures.

78
00:03:13,120 --> 00:03:17,640
They don't carry your organization's semantic intent unless you force it into the workflow.

79
00:03:17,640 --> 00:03:20,720
And Fabric makes this easier to trigger at scale.

80
00:03:20,720 --> 00:03:26,080
It encourages reuse: shared lakehouses, shared warehouses, shared semantic models,

81
00:03:26,080 --> 00:03:30,560
shared notebooks, shared APIs and multiple teams shipping at the same time.

82
00:03:30,560 --> 00:03:31,560
That's great for velocity.

83
00:03:31,560 --> 00:03:34,160
And it's also perfect conditions for drift.

84
00:03:34,160 --> 00:03:38,320
Because now the model stops being a product with an owner and a roadmap and becomes an

85
00:03:38,320 --> 00:03:41,960
artifact that exists because someone needed a report by Friday.

86
00:03:41,960 --> 00:03:43,200
Agents amplify that behavior.

87
00:03:43,200 --> 00:03:46,880
They make it cheap to create just one more measure, just one more relationship, just one

88
00:03:46,880 --> 00:03:48,480
more version of a KPI.

89
00:03:48,480 --> 00:03:53,360
Over time, those "just one more" choices accumulate and the organization ends up with a semantic

90
00:03:53,360 --> 00:03:55,360
layer that is no longer deterministic.

91
00:03:55,360 --> 00:03:56,560
It's probabilistic.

92
00:03:56,560 --> 00:04:00,600
Your answer depends on which workspace, which model, which measure variant and which

93
00:04:00,600 --> 00:04:02,320
agent last touched the definition.

94
00:04:02,320 --> 00:04:05,760
Next, the foundational misunderstanding that makes this inevitable.

95
00:04:05,760 --> 00:04:07,240
Agents don't understand your business.

96
00:04:07,240 --> 00:04:08,480
They approximate it.

97
00:04:08,480 --> 00:04:10,520
The foundational misunderstanding.

98
00:04:10,520 --> 00:04:12,400
Agents don't understand your business.

99
00:04:12,400 --> 00:04:13,560
Agents don't understand your business.

100
00:04:13,560 --> 00:04:17,040
They understand your prompts, your metadata and whatever patterns they've seen that

101
00:04:17,040 --> 00:04:18,680
look like your situation.

102
00:04:18,680 --> 00:04:19,840
That's not the same thing.

103
00:04:19,840 --> 00:04:24,120
And in semantic modeling, that distinction matters more than anywhere else because close enough

104
00:04:24,120 --> 00:04:25,120
becomes a KPI.

105
00:04:25,120 --> 00:04:26,560
Here's the uncomfortable truth.

106
00:04:26,560 --> 00:04:29,960
Most people evaluate an agent's output like they evaluate autocomplete.

107
00:04:29,960 --> 00:04:30,960
Does it look right?

108
00:04:30,960 --> 00:04:31,960
Does it run?

109
00:04:31,960 --> 00:04:32,960
Did the chart appear?

110
00:04:32,960 --> 00:04:33,960
If yes, they ship it.

111
00:04:33,960 --> 00:04:34,960
But a semantic model isn't text.

112
00:04:34,960 --> 00:04:36,880
It's an executable definition of meaning.

113
00:04:36,880 --> 00:04:41,000
If the definition is slightly wrong, the organization doesn't get a slightly wrong report.

114
00:04:41,000 --> 00:04:43,400
It gets two competing realities that both look official.

115
00:04:43,400 --> 00:04:44,760
The system did not fail.

116
00:04:44,760 --> 00:04:46,320
Your assumptions did.

117
00:04:46,320 --> 00:04:47,960
An LLM is a pattern engine.

118
00:04:47,960 --> 00:04:51,400
It predicts the most plausible next step given the inputs.

119
00:04:51,400 --> 00:04:52,800
Sometimes that looks like reasoning.

120
00:04:52,800 --> 00:04:53,800
It isn't.

121
00:04:53,800 --> 00:04:55,160
It's approximation under uncertainty.

122
00:04:55,160 --> 00:05:00,080
And BI is a domain where uncertainty has to be eliminated, not embraced.

123
00:05:00,080 --> 00:05:04,520
So when an agent generates a measure, it is not deriving the correct business logic.

124
00:05:04,520 --> 00:05:06,600
It is selecting a plausible formula shape.

125
00:05:06,600 --> 00:05:11,000
SUMX patterns, calculate patterns, time intelligence patterns, common KPI templates.

126
00:05:11,000 --> 00:05:14,200
If your organization's definition matches those shapes, great.

127
00:05:14,200 --> 00:05:16,920
If it doesn't, the agent will still produce an answer.

128
00:05:16,920 --> 00:05:20,040
It will just be wrong in the specific way that sounds confident.

129
00:05:20,040 --> 00:05:22,440
Now add non-determinism.

130
00:05:22,440 --> 00:05:26,200
Even if you prompt the same agent the same way, you are not guaranteed the same modeling

131
00:05:26,200 --> 00:05:27,200
choices.

132
00:05:27,200 --> 00:05:31,040
Model temperature, context window changes, updated system prompts, different tool availability

133
00:05:31,040 --> 00:05:35,440
and subtle differences in retrieved context all push the agent toward different outputs.

134
00:05:35,440 --> 00:05:36,680
That's fine for brainstorming.

135
00:05:36,680 --> 00:05:38,200
Governance hates it.

136
00:05:38,200 --> 00:05:40,360
Because governance is built on repeatability.

137
00:05:40,360 --> 00:05:44,160
The same inputs produce the same outcomes and changes are intentional.

138
00:05:44,160 --> 00:05:48,240
Agentic modeling turns that deterministic governance model into a probabilistic one, where

139
00:05:48,240 --> 00:05:51,480
what happened depends on which run you're looking at.

140
00:05:51,480 --> 00:05:53,120
Then there's the close enough trap.

141
00:05:53,120 --> 00:05:59,200
An agent will happily map net sales to a column called sales amount, or infer active customers

142
00:05:59,200 --> 00:06:03,920
as a distinct count of the customer key with a filter on the last 30 days because that's

143
00:06:03,920 --> 00:06:04,920
a common pattern.

144
00:06:04,920 --> 00:06:09,120
The problem is that your enterprise definition might exclude returns, include only invoice

145
00:06:09,120 --> 00:06:13,960
transactions, require posted status, align to fiscal periods and treat churned customers

146
00:06:13,960 --> 00:06:15,760
differently across product lines.

147
00:06:15,760 --> 00:06:19,840
If the agent doesn't have those constraints, it cannot invent them correctly.

148
00:06:19,840 --> 00:06:23,080
So it invents something else, a plausible business definition.
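
A hedged contrast (all table and column names are hypothetical) between the pattern an agent tends to pick and what an enterprise definition might actually require:

```dax
-- Plausible pattern: compiles, renders, sounds right.
Active Customers =
CALCULATE (
    DISTINCTCOUNT ( Sales[CustomerKey] ),
    DATESINPERIOD ( 'Date'[Date], MAX ( 'Date'[Date] ), -30, DAY )
)

-- Governed definition: only posted invoice transactions, returns excluded.
Active Customers =
CALCULATE (
    DISTINCTCOUNT ( Sales[CustomerKey] ),
    Sales[Status] = "Posted",
    Sales[TransactionType] = "Invoice",
    DATESINPERIOD ( 'Date'[Date], MAX ( 'Date'[Date] ), -30, DAY )
)
```

Both return a number; only one answers the enterprise's question.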

149
00:06:23,080 --> 00:06:26,000
That definition then gets reused, copied and referenced.

150
00:06:26,000 --> 00:06:29,040
And because it compiles, it becomes truth by repetition.

151
00:06:29,040 --> 00:06:31,440
This is what hallucination looks like in BI.

152
00:06:31,440 --> 00:06:33,920
Not making up a number out of thin air.

153
00:06:33,920 --> 00:06:36,040
Making up the definition that produces the number.

154
00:06:36,040 --> 00:06:39,280
The weird part is that hallucinated definitions don't throw errors.

155
00:06:39,280 --> 00:06:41,440
They can produce perfectly consistent results.

156
00:06:41,440 --> 00:06:46,200
They can even match expectations for a while until someone runs the same question in a different

157
00:06:46,200 --> 00:06:48,400
model with a slightly different definition.

158
00:06:48,400 --> 00:06:52,640
And the executive meeting turns into a debate about which reality is the real one.

159
00:06:52,640 --> 00:06:54,880
And fabric increases the blast radius.

160
00:06:54,880 --> 00:06:59,400
With shared assets and APIs, an agent doesn't just help one developer in one PBIX.

161
00:06:59,400 --> 00:07:01,520
It can propagate patterns across workspaces.

162
00:07:01,520 --> 00:07:02,880
It can mass edit measures.

163
00:07:02,880 --> 00:07:04,560
It can replicate modeling templates.

164
00:07:04,560 --> 00:07:07,320
The same approximation becomes standardized drift.

165
00:07:07,320 --> 00:07:10,480
Now to be precise, this is not an argument to ban agents.

166
00:07:10,480 --> 00:07:13,800
Agents can be useful, especially when you treat them as accelerators for repetitive work

167
00:07:13,800 --> 00:07:15,280
and documentation.

168
00:07:15,280 --> 00:07:20,120
Even SQLBI has been explicit that effective AI use depends on building blocks like context,

169
00:07:20,120 --> 00:07:22,200
tools and environment, not just prompts.

170
00:07:22,200 --> 00:07:23,200
That's the point.

171
00:07:23,200 --> 00:07:25,960
Without scaffolding, agents will fill the gaps with whatever looks plausible.

172
00:07:25,960 --> 00:07:28,080
So the foundational misunderstanding is simple.

173
00:07:28,080 --> 00:07:30,000
You think you're delegating understanding.

174
00:07:30,000 --> 00:07:33,040
You are delegating decision making under ambiguity.

175
00:07:33,040 --> 00:07:37,240
And unless you constrain that ambiguity, the agent will resolve it for you, quietly,

176
00:07:37,240 --> 00:07:40,440
repeatedly, at scale.

177
00:07:40,440 --> 00:07:44,560
Next, that misunderstanding becomes concrete in the most common drift vector.

178
00:07:44,560 --> 00:07:49,240
Measure generation, where semantic forks multiply faster than teams notice.

179
00:07:49,240 --> 00:07:53,280
Where drift starts: measure generation as a semantic fork bomb.

180
00:07:53,280 --> 00:07:57,880
Measures are where drift becomes scalable because measures are easy to create, easy to copy

181
00:07:57,880 --> 00:07:59,920
and hard to police once they spread.

182
00:07:59,920 --> 00:08:02,840
The agent doesn't need to redesign your schema to change reality.

183
00:08:02,840 --> 00:08:05,560
It just needs to generate one helpful KPI.

184
00:08:05,560 --> 00:08:07,080
And then another and another.

185
00:08:07,080 --> 00:08:11,000
That's why measure generation is the semantic fork bomb of Power BI and Fabric.

186
00:08:11,000 --> 00:08:14,480
It multiplies definitions faster than your organization can notice.

187
00:08:14,480 --> 00:08:15,600
Let alone agree.

188
00:08:15,600 --> 00:08:17,880
The first failure mode is duplicate definitions.

189
00:08:17,880 --> 00:08:20,840
You ask for net sales and the agent produces a measure.

190
00:08:20,840 --> 00:08:24,840
Another team asks for net sales in a different workspace and the agent produces a slightly

191
00:08:24,840 --> 00:08:26,000
different measure.

192
00:08:26,000 --> 00:08:30,360
Same display name, same folder, same tooltip description that looks professional.

193
00:08:30,360 --> 00:08:33,240
Different logic: one version subtracts returns.

194
00:08:33,240 --> 00:08:37,840
Another version subtracts discounts, a third version filters out internal customers.

195
00:08:37,840 --> 00:08:40,000
A fourth version uses a different date column.

196
00:08:40,000 --> 00:08:41,240
None of these are obviously wrong.

197
00:08:41,240 --> 00:08:43,880
They're just different contracts wearing the same label.
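
The four variants just described might look like this (hypothetical names; imagine each living in a different workspace):

```dax
-- All four display as "Net Sales". None is obviously wrong.
Net Sales = SUM ( Sales[SalesAmount] ) - SUM ( Sales[ReturnAmount] )

Net Sales = SUM ( Sales[SalesAmount] ) - SUM ( Sales[DiscountAmount] )

Net Sales =
CALCULATE ( SUM ( Sales[SalesAmount] ), Customer[IsInternal] = FALSE () )

Net Sales =
CALCULATE (
    SUM ( Sales[SalesAmount] ),
    USERELATIONSHIP ( Sales[ShipDate], 'Date'[Date] )
)
```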

198
00:08:43,880 --> 00:08:48,240
And because Fabric makes it trivial to reuse artifacts, those variants don't stay local.

199
00:08:48,240 --> 00:08:49,680
A report references one.

200
00:08:49,680 --> 00:08:51,600
A data agent references another.

201
00:08:51,600 --> 00:08:55,640
Someone exports a table and pastes it into a deck and calls it "the number."

202
00:08:55,640 --> 00:09:00,400
Now the organization has multiple canonical truths and nobody can prove which one was intended.

203
00:09:00,400 --> 00:09:03,520
The second failure mode is filter context traps.

204
00:09:03,520 --> 00:09:05,120
DAX is not hard because it is complicated.

205
00:09:05,120 --> 00:09:06,880
DAX is hard because it is contextual.

206
00:09:06,880 --> 00:09:08,200
A measure isn't a formula.

207
00:09:08,200 --> 00:09:12,200
It's a formula evaluated inside a shape of filters you often don't see.

208
00:09:12,200 --> 00:09:14,360
Agents will produce time intelligence quickly.

209
00:09:14,360 --> 00:09:15,360
That's their favorite trick.

210
00:09:15,360 --> 00:09:18,840
YTD, MTD, rolling 13 months, prior year comparisons.

211
00:09:18,840 --> 00:09:23,320
The problem is that every one of those patterns assumes something about your calendar table,

212
00:09:23,320 --> 00:09:27,280
your relationships, your fiscal logic and which date column is the date.

213
00:09:27,280 --> 00:09:31,120
If your model has a single marked date table, single direction relationships and consistent

214
00:09:31,120 --> 00:09:33,160
date usage, you can survive that.

215
00:09:33,160 --> 00:09:34,160
Most models don't.

216
00:09:34,160 --> 00:09:36,400
The agent guesses. It picks 'Date'[Date].

217
00:09:36,400 --> 00:09:38,160
Or it picks the fact table date.

218
00:09:38,160 --> 00:09:39,560
Or it mixes them across measures.

219
00:09:39,560 --> 00:09:40,960
The measures still return values.

220
00:09:40,960 --> 00:09:43,040
They'll even look correct in a single visual.

221
00:09:43,040 --> 00:09:47,480
But move the slicer, change the grain, add a second date attribute and the measure's meaning

222
00:09:47,480 --> 00:09:48,480
shifts.

223
00:09:48,480 --> 00:09:50,280
That's drift through implied assumptions.
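
A minimal sketch of that guess, assuming a hypothetical [Total Sales] base measure and a marked 'Date' table:

```dax
-- Anchored to the marked date table: respects 'Date' slicers and hierarchies.
Sales YTD = CALCULATE ( [Total Sales], DATESYTD ( 'Date'[Date] ) )

-- Anchored to a fact-table date column instead: still returns values, but the
-- YTD window and 'Date' slicers can quietly disagree at different grains.
Sales YTD = CALCULATE ( [Total Sales], DATESYTD ( Sales[OrderDate] ) )
```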

224
00:09:50,280 --> 00:09:52,400
The third failure mode is naming drift.

225
00:09:52,400 --> 00:09:53,880
Humans already struggle with naming.

226
00:09:53,880 --> 00:09:55,080
Agents do it faster, not better.

227
00:09:55,080 --> 00:09:59,800
You end up with measures called Net Sales, NetSales, Net Sales 2, Sales Net,

228
00:09:59,800 --> 00:10:01,240
and Net Sales Amount.

229
00:10:01,240 --> 00:10:05,800
All logically similar, all discoverable through search, none consistently reusable. And because

230
00:10:05,800 --> 00:10:08,680
people can't find the right measure, they create a new one.

231
00:10:08,680 --> 00:10:11,560
That is not a productivity gain, that is semantic inflation.

232
00:10:11,560 --> 00:10:14,480
The fourth failure mode is optimization drift.

233
00:10:14,480 --> 00:10:16,680
Faster DAX that changes meaning.

234
00:10:16,680 --> 00:10:20,080
Agents will refactor for performance because performance is measurable.

235
00:10:20,080 --> 00:10:21,680
Meaning is not. So they replace iterators.

236
00:10:21,680 --> 00:10:23,320
They rewrite, calculate logic.

237
00:10:23,320 --> 00:10:25,320
They remove filters they think are redundant.

238
00:10:25,320 --> 00:10:26,800
They move logic into variables.

239
00:10:26,800 --> 00:10:28,640
They introduce KEEPFILTERS or remove it.

240
00:10:28,640 --> 00:10:31,440
They swap ALL for ALLSELECTED because it fixes a visual.

241
00:10:31,440 --> 00:10:32,560
The number still returns.

242
00:10:32,560 --> 00:10:34,080
The query plan improves.

243
00:10:34,080 --> 00:10:35,080
Everyone claps.

244
00:10:35,080 --> 00:10:36,400
But the contract changed.
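
The ALL-to-ALLSELECTED swap is worth seeing concretely (hypothetical measure and table names):

```dax
-- Original contract: share of the whole product portfolio.
Margin % of Total =
DIVIDE ( [Margin], CALCULATE ( [Margin], ALL ( Product ) ) )

-- After an "optimization" that fixes one visual: share of the current
-- selection. Same name, same report, different denominator everywhere else.
Margin % of Total =
DIVIDE ( [Margin], CALCULATE ( [Margin], ALLSELECTED ( Product ) ) )
```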

245
00:10:36,400 --> 00:10:41,200
This is how optimization turns into a governance incident six months later when finance asks

246
00:10:41,200 --> 00:10:44,040
why the monthly margin trend doesn't match the ledger.

247
00:10:44,040 --> 00:10:46,480
And the only answer you have is, we optimized it.

248
00:10:46,480 --> 00:10:49,640
The fifth failure mode is silent dependency changes.

249
00:10:49,640 --> 00:10:51,120
DAX measures are a graph.

250
00:10:51,120 --> 00:10:54,480
Measures reference measures. You change one definition and you didn't just change one

251
00:10:54,480 --> 00:10:55,480
KPI.

252
00:10:55,480 --> 00:10:58,080
You changed every downstream KPI that depends on it.
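
A hypothetical dependency chain makes the graph visible (names are illustrative):

```dax
-- Editing [Gross Sales] silently changes every measure below it,
-- with no error and no visible diff downstream.
Gross Sales = SUM ( Sales[SalesAmount] )
Net Sales   = [Gross Sales] - SUM ( Sales[ReturnAmount] )
Margin      = [Net Sales] - SUM ( Sales[CostAmount] )
Margin %    = DIVIDE ( [Margin], [Net Sales] )
```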

253
00:10:58,080 --> 00:11:01,280
Agents don't see that graph the way an accountable owner does.

254
00:11:01,280 --> 00:11:03,240
They see a task, make this measure.

255
00:11:03,240 --> 00:11:08,040
And if you give them tool access through MCP, they can execute that task by modifying multiple

256
00:11:08,040 --> 00:11:10,080
measures until the model validates.

257
00:11:10,080 --> 00:11:11,480
Validation is not correctness.

258
00:11:11,480 --> 00:11:13,800
It is syntax and dependency integrity.

259
00:11:13,800 --> 00:11:17,200
So the agent makes a change that technically works and it passes.

260
00:11:17,200 --> 00:11:19,640
But your report intent is now broken without errors.

261
00:11:19,640 --> 00:11:20,640
The totals moved.

262
00:11:20,640 --> 00:11:21,640
The exception logic changed.

263
00:11:21,640 --> 00:11:23,600
A calculation group now applies differently.

264
00:11:23,600 --> 00:11:24,800
The board pack looks plausible.

265
00:11:24,800 --> 00:11:26,000
It is not trustworthy.

266
00:11:26,000 --> 00:11:27,960
And here's the part most teams miss.

267
00:11:27,960 --> 00:11:31,800
Measures are the easiest place to hide drift because they look like code.

268
00:11:31,800 --> 00:11:32,920
Code feels reviewable.

269
00:11:32,920 --> 00:11:35,040
But most organizations don't review semantics.

270
00:11:35,040 --> 00:11:36,040
They review syntax.

271
00:11:36,040 --> 00:11:37,360
They review whether it compiles.

272
00:11:37,360 --> 00:11:39,280
They review whether the report still loads.

273
00:11:39,280 --> 00:11:40,280
That is not governance.

274
00:11:40,280 --> 00:11:41,880
That is hope with a pull request.

275
00:11:41,880 --> 00:11:45,120
So if you want one mental model to carry forward, it's this.

276
00:11:45,120 --> 00:11:49,360
Every agent-generated measure is a fork of meaning unless you force it to be a reuse

277
00:11:49,360 --> 00:11:50,360
of meaning.

278
00:11:50,360 --> 00:11:52,520
Next, drift doesn't stay in measures.

279
00:11:52,520 --> 00:11:56,120
Once the measure results look off, agents start fixing the model.

280
00:11:56,120 --> 00:11:59,440
And relationships are the first lever they pull.

281
00:11:59,440 --> 00:12:00,440
Relationship drift.

282
00:12:00,440 --> 00:12:01,600
The star schema

283
00:12:01,600 --> 00:12:04,040
you had becomes the graph you fear.

284
00:12:04,040 --> 00:12:06,840
Relationships are where close enough turns into structural damage.

285
00:12:06,840 --> 00:12:10,000
A measure can drift and you can still contain it.

286
00:12:10,000 --> 00:12:14,320
You can replace it, certify one version, deprecate the rest.

287
00:12:14,320 --> 00:12:15,480
Relationships don't work like that.

288
00:12:15,480 --> 00:12:19,120
A relationship change rewires filter propagation across the model.

289
00:12:19,120 --> 00:12:23,800
It changes what counts as included even if every measure definition stays untouched.

290
00:12:23,800 --> 00:12:27,240
And because visuals still render, most teams won't notice until they're arguing about

291
00:12:27,240 --> 00:12:28,240
totals.

292
00:12:28,240 --> 00:12:30,800
This is why agents are dangerous in relationship space.

293
00:12:30,800 --> 00:12:34,320
When an agent sees numbers that don't match, it doesn't have business intent to reconcile.

294
00:12:34,320 --> 00:12:38,600
It has a tool belt, so it tries to make the model behave like the output it expects.

295
00:12:38,600 --> 00:12:40,720
And the easiest knob to turn is relationships.

296
00:12:40,720 --> 00:12:43,640
The first pattern is the helpful relationship addition.

297
00:12:43,640 --> 00:12:47,320
You have a fact table and two dimensions, and there's a snowflake table or a bridge that

298
00:12:47,320 --> 00:12:50,320
wasn't modeled because the owner made a deliberate choice.

299
00:12:50,320 --> 00:12:55,120
Keep the star clean, control ambiguity, force explicit logic in measures.

300
00:12:55,120 --> 00:12:58,720
The agent doesn't see a deliberate choice; it sees a missing connection, so it adds one.

301
00:12:58,720 --> 00:13:01,200
It will pick the most plausible key name match.

302
00:13:01,200 --> 00:13:04,160
CustomerID to CustomerID, ProductKey to ProductKey, Date to Date.

303
00:13:04,160 --> 00:13:06,440
It will do it fast, it will do it confidently.

304
00:13:06,440 --> 00:13:08,360
And it will often be wrong in the only way that matters.

305
00:13:08,360 --> 00:13:10,520
It changes the semantics of filter propagation.

306
00:13:10,520 --> 00:13:14,680
Now the model answers questions through an implicit join path you didn't authorize.

307
00:13:14,680 --> 00:13:17,000
Next comes bidirectional filtering creep.

308
00:13:17,000 --> 00:13:18,880
And humans already do this under pressure.

309
00:13:18,880 --> 00:13:20,480
Just make the slicer work.

310
00:13:20,480 --> 00:13:22,560
Agents do it as a default remediation step.

311
00:13:22,560 --> 00:13:25,920
They detect that a dimension filter doesn't reach a table the visual uses.

312
00:13:25,920 --> 00:13:28,880
Therefore they flip cross-filter direction or set it to Both.

313
00:13:28,880 --> 00:13:31,640
It feels like a fix. It is not.

314
00:13:31,640 --> 00:13:35,160
Bidirectional filtering converts a clean star into a graph.

315
00:13:35,160 --> 00:13:38,160
A graph can still be queried, but it's no longer predictable.

316
00:13:38,160 --> 00:13:43,040
Filters can travel in loops, and the answer can depend on evaluation order, ambiguous paths,

317
00:13:43,040 --> 00:13:45,480
and which relationships the engine chooses to activate.

318
00:13:45,480 --> 00:13:46,960
That's not deterministic semantics.
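
That default remediation can be caught mechanically. A minimal sketch, assuming a simplified model.bim-style relationship list; the property name mirrors TMSL, but the structure here is a trimmed-down illustration, not the full metadata format:

```python
# Flag relationships whose filters flow both directions in a simplified
# model.bim-style relationship list. "crossFilteringBehavior" mirrors the
# TMSL property name; everything else here is illustrative.

def find_bidirectional(relationships):
    """Return names of relationships that filter in both directions."""
    return [
        r["name"]
        for r in relationships
        if r.get("crossFilteringBehavior") == "bothDirections"
    ]

relationships = [
    {"name": "Sales_to_Date",    "crossFilteringBehavior": "oneDirection"},
    {"name": "Sales_to_Product", "crossFilteringBehavior": "bothDirections"},
]

flagged = find_bidirectional(relationships)
print(flagged)  # a CI gate could fail the build when this list is non-empty
```

A check like this turns "the star became a graph" from something you discover in an audit into something a pipeline rejects up front.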

319
00:13:46,960 --> 00:13:50,080
That's conditional chaos. The worst part is that this doesn't crash anything.

320
00:13:50,080 --> 00:13:54,440
It produces numbers, it even produces the numbers people expect in the specific report

321
00:13:54,440 --> 00:13:56,480
the agent was fixing.

322
00:13:56,480 --> 00:14:00,400
But it quietly breaks every other report that assumed single direction flow because the

323
00:14:00,400 --> 00:14:03,720
same slicer now behaves differently across context.

324
00:14:03,720 --> 00:14:05,680
Then there's the many to many shortcut.

325
00:14:05,680 --> 00:14:10,280
When an agent can't reconcile granularity, it reaches for a bridge table or uses a many

326
00:14:10,280 --> 00:14:13,800
to many relationship because it often works in demos.

327
00:14:13,800 --> 00:14:17,080
In small models, sometimes it does, but you don't get correctness for free.

328
00:14:17,080 --> 00:14:18,880
You get ambiguity with a friendly UI.

329
00:14:18,880 --> 00:14:22,920
Many to many relationships force you to be explicit about grain, about distinct counts,

330
00:14:22,920 --> 00:14:25,520
about duplication, about which side should filter which.

331
00:14:25,520 --> 00:14:27,240
An agent does not carry that discipline.

332
00:14:27,240 --> 00:14:30,480
It will implement the relationship so the visual returns a result.

333
00:14:30,480 --> 00:14:35,000
That result can be numerically consistent and still conceptually wrong because it double

334
00:14:35,000 --> 00:14:38,320
counts or under counts depending on slicer combinations.
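
The double counting is easy to demonstrate with toy data. A minimal sketch showing how a naive join across a bridge inflates a total; the table names and values are invented for illustration:

```python
# One sales row relates to two categories through a bridge table.
# Summing after the naive join double counts the fact row.

sales  = [{"order": 1, "product": "P1", "amount": 100}]
bridge = [
    {"product": "P1", "category": "Outdoor"},
    {"product": "P1", "category": "Promo"},
]

# Naive join: each bridge match duplicates the fact row.
joined = [
    {**s, "category": b["category"]}
    for s in sales
    for b in bridge
    if s["product"] == b["product"]
]

naive_total   = sum(row["amount"] for row in joined)  # inflated by duplication
correct_total = sum(s["amount"] for s in sales)       # the true grain
print(naive_total, correct_total)
```

This is the "numerically consistent but conceptually wrong" trap: the joined result is internally coherent, yet the total no longer matches the fact table's grain.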

335
00:14:38,320 --> 00:14:40,920
Next is role-playing dimensions, especially dates.

336
00:14:40,920 --> 00:14:43,360
This is where organizations bleed out slowly.

337
00:14:43,360 --> 00:14:47,160
Most enterprise models have multiple dates, order date, ship date, invoice date, posting

338
00:14:47,160 --> 00:14:48,160
date.

339
00:14:48,160 --> 00:14:50,320
The star schema you had relied on explicit choices.

340
00:14:50,320 --> 00:14:56,280
A marked date table, inactive relationships, USERELATIONSHIP in measures, controlled time intelligence.

341
00:14:56,280 --> 00:14:57,960
It's not elegant, but it's intentional.

342
00:14:57,960 --> 00:15:01,080
An agent sees multiple date columns and sees a problem.

343
00:15:01,080 --> 00:15:05,360
So it activates relationships, or it rewrites measures to use whichever date happens to

344
00:15:05,360 --> 00:15:07,320
produce the output it expects.

345
00:15:07,320 --> 00:15:10,640
Or it creates a second date table because it's seen that pattern before.

346
00:15:10,640 --> 00:15:11,960
All of those are possible.

347
00:15:11,960 --> 00:15:15,440
None of them are guaranteed to match your business definition of this month.
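
The owner's convention, one active date path with the rest inactive and used explicitly via USERELATIONSHIP, can be enforced as a check. A sketch over a simplified relationship list; the field names are illustrative stand-ins for what model metadata actually exposes:

```python
# Enforce the "exactly one active date relationship" convention on a
# simplified relationship list. Real models expose this via TMSL/TOM.

def active_date_relationships(relationships, date_table="Date"):
    """Relationships into the date table that are currently active."""
    return [
        r for r in relationships
        if r["toTable"] == date_table and r.get("isActive", True)
    ]

relationships = [
    {"name": "Sales_OrderDate",   "toTable": "Date", "isActive": True},
    {"name": "Sales_ShipDate",    "toTable": "Date", "isActive": False},
    {"name": "Sales_InvoiceDate", "toTable": "Date", "isActive": False},
]

active = active_date_relationships(relationships)
# Exactly one active path keeps "this month" deterministic; the inactive
# ones are invoked deliberately in measures, not by filter propagation.
print([r["name"] for r in active])
```

If an agent quietly activates a second path, a gate asserting `len(active) == 1` fails loudly instead of letting time itself drift.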

348
00:15:15,440 --> 00:15:18,400
Now you're not drifting one KPI, you're drifting time itself.

349
00:15:18,400 --> 00:15:22,440
And once relationships drift, the model becomes unreviewable by casual inspection.

350
00:15:22,440 --> 00:15:23,760
A star schema is legible.

351
00:15:23,760 --> 00:15:26,040
You can point to it in a design review and explain it.

352
00:15:26,040 --> 00:15:26,800
A graph is not.

353
00:15:26,800 --> 00:15:31,000
A graph requires you to reason about propagation paths, ambiguity resolution, and the side

354
00:15:31,000 --> 00:15:34,320
effects of convenient settings that were added one at a time.

355
00:15:34,320 --> 00:15:37,360
That is why relationship drift is an enterprise multiplier.

356
00:15:37,360 --> 00:15:40,960
Because the moment the schema becomes a graph, every new measure is now evaluated against

357
00:15:40,960 --> 00:15:41,960
a moving target.

358
00:15:41,960 --> 00:15:43,960
The engine still returns results.

359
00:15:43,960 --> 00:15:46,600
But your ability to explain those results collapses.

360
00:15:46,600 --> 00:15:51,240
And when an auditor or a regulator asks why a number changed quarter over quarter, "the

361
00:15:51,240 --> 00:15:54,880
agent adjusted relationships to fix a report" is not an explanation.

362
00:15:54,880 --> 00:15:56,600
It's an admission.

363
00:15:56,600 --> 00:16:02,520
Now add Fabric automation on top: PBIP, PBIR, and agents editing report artifacts as files.

364
00:16:02,520 --> 00:16:05,360
And you get drift in the presentation layer too.

365
00:16:05,360 --> 00:16:10,240
PBIR, PBIP, and the illusion of report as code governance.

366
00:16:10,240 --> 00:16:15,960
PBIR and PBIP get sold as report as code, and teams hear what they want to hear.

367
00:16:15,960 --> 00:16:16,960
Finally, governance.

368
00:16:16,960 --> 00:16:21,760
Git, pull requests, CI/CD, the same discipline they already use for everything else.

369
00:16:21,760 --> 00:16:22,760
Here's the problem.

370
00:16:22,760 --> 00:16:24,640
A Power BI report is not code.

371
00:16:24,640 --> 00:16:28,800
It is configuration state serialized into a pile of files with multiple overlapping places

372
00:16:28,800 --> 00:16:30,280
to define the same outcome.

373
00:16:30,280 --> 00:16:35,080
And when you give agents permission to edit that state, you don't get software engineering.

374
00:16:35,080 --> 00:16:37,440
You get faster entropy with nicer diffs.

375
00:16:37,440 --> 00:16:44,520
PBIR decomposes a report into folders and JSON: pages, visuals, bookmarks, filters, themes,

376
00:16:44,520 --> 00:16:47,760
interactions, layout metadata. That decomposition is useful.

377
00:16:47,760 --> 00:16:48,960
It makes changes trackable.

378
00:16:48,960 --> 00:16:50,280
It makes automation possible.

379
00:16:50,280 --> 00:16:53,200
It also makes what changed harder to understand.

380
00:16:53,200 --> 00:16:57,680
Because in PBIR, the same visual outcome can be produced by settings in multiple layers.

381
00:16:57,680 --> 00:17:02,040
The report theme, the page background, the visual JSON, conditional formatting rules,

382
00:17:02,040 --> 00:17:05,360
and sometimes even model metadata driving formatting defaults.

383
00:17:05,360 --> 00:17:07,720
You can change a color and touch three different files.

384
00:17:07,720 --> 00:17:11,760
You can change a slicer interaction and touch a single line that looks irrelevant.

385
00:17:11,760 --> 00:17:16,320
You can fix a chart by altering a hidden filter container you didn't know existed.

386
00:17:16,320 --> 00:17:20,760
So when people say it's in Git, what they usually mean is we can see that something changed.

387
00:17:20,760 --> 00:17:22,960
They still cannot see whether the meaning changed.
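
One way to close that gap is a diff that reports meaning, not JSON text. A sketch assuming a hypothetical, trimmed-down visual structure; real PBIR files are far richer and spread the same decision across more layers:

```python
# A "semantic" diff over two versions of a visual definition: report what
# the filters now do, instead of which JSON lines moved. The structure is
# a deliberate simplification of PBIR visual JSON.

def filter_delta(old, new):
    """Return (added, removed, changed) filters between two visual defs."""
    old_f = {f["field"]: f["values"] for f in old.get("filters", [])}
    new_f = {f["field"]: f["values"] for f in new.get("filters", [])}
    added   = {k: v for k, v in new_f.items() if k not in old_f}
    removed = {k: v for k, v in old_f.items() if k not in new_f}
    changed = {k: (old_f[k], new_f[k])
               for k in old_f.keys() & new_f.keys() if old_f[k] != new_f[k]}
    return added, removed, changed

v_old = {"filters": [{"field": "Region", "values": ["All"]}]}
v_new = {"filters": [{"field": "Region", "values": ["West"]}]}  # quiet pre-filter

added, removed, changed = filter_delta(v_old, v_new)
print(changed)  # surfaces the shift in meaning a text diff buries
```

The output is reviewable in business language: "Region went from All to West" is something a human can approve or reject, where a raw JSON diff is not.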

388
00:17:22,960 --> 00:17:25,080
A code diff tells you that JSON keys moved.

389
00:17:25,080 --> 00:17:29,640
It doesn't tell you that a report's analytical intent shifted from show trends with optional

390
00:17:29,640 --> 00:17:35,040
segmentation to show only the pre-filtered slice that makes the KPI look stable.

391
00:17:35,040 --> 00:17:36,520
The distinction matters.

392
00:17:36,520 --> 00:17:41,120
Because report semantics are not cosmetic. Filters, interactions, bookmarks, and page navigation

393
00:17:41,120 --> 00:17:42,400
encode business logic.

394
00:17:42,400 --> 00:17:45,000
The report decides what a user can see by default.

395
00:17:45,000 --> 00:17:48,000
It decides which comparisons are easy and which are hidden.

396
00:17:48,000 --> 00:17:51,080
It decides whether a slicer affects one chart or all of them.

397
00:17:51,080 --> 00:17:54,680
Those are semantic decisions just expressed in UI metadata instead of DAX.

398
00:17:54,680 --> 00:17:56,640
Now add agents.

399
00:17:56,640 --> 00:18:01,200
An agent can open PBIR, search for a pattern, and replicate it across pages.

400
00:18:01,200 --> 00:18:02,760
It can standardize formatting.

401
00:18:02,760 --> 00:18:03,760
It can align visuals.

402
00:18:03,760 --> 00:18:04,960
It can rename titles.

403
00:18:04,960 --> 00:18:06,480
It can bulk edit filter panes.

404
00:18:06,480 --> 00:18:10,440
And if you give it tool access, it can do all of that at scale across hundreds of reports

405
00:18:10,440 --> 00:18:14,640
without ever seeing the business conversation that justified the original design.

406
00:18:14,640 --> 00:18:17,080
So you end up with a specific failure mode.

407
00:18:17,080 --> 00:18:19,920
Agents replicate layout without replicating analytic intent.

408
00:18:19,920 --> 00:18:21,240
The report looks consistent.

409
00:18:21,240 --> 00:18:22,960
The story it tells is no longer consistent.

410
00:18:22,960 --> 00:18:25,640
This is where report as code becomes a governance mirage.

411
00:18:25,640 --> 00:18:27,600
The pipeline can validate syntax.

412
00:18:27,600 --> 00:18:28,800
It can validate schema.

413
00:18:28,800 --> 00:18:31,960
It can validate that the JSON is well formed and the artifacts deploy.

414
00:18:31,960 --> 00:18:35,280
It cannot validate that the report still answers the right questions.

415
00:18:35,280 --> 00:18:38,080
And PBIR makes the blast radius bigger than teams expect.

416
00:18:38,080 --> 00:18:41,040
The agent doesn't need to touch the model to change outcomes.

417
00:18:41,040 --> 00:18:44,320
It just needs to move a filter from page scope to visual scope.

418
00:18:44,320 --> 00:18:45,520
Or disable an interaction.

419
00:18:45,520 --> 00:18:47,640
Or apply a bookmark as the default view.

420
00:18:47,640 --> 00:18:52,400
Or leave behind a hidden slicer that pre-filters the page while looking like a neutral report.

421
00:18:52,400 --> 00:18:53,680
Everything still renders.

422
00:18:53,680 --> 00:18:55,280
The numbers still look plausible.

423
00:18:55,280 --> 00:18:56,680
Your audience trusts it anyway.

424
00:18:56,680 --> 00:19:00,240
Now tie this back to the research reality you already have in the community.

425
00:19:00,240 --> 00:19:04,600
PBIR is new, still evolving, and there aren't millions of mature examples for agents to learn

426
00:19:04,600 --> 00:19:05,600
from.

427
00:19:05,600 --> 00:19:09,080
In that agentic report development discussion, the point wasn't that PBIR is bad.

428
00:19:09,080 --> 00:19:11,200
It was that the format is complex.

429
00:19:11,200 --> 00:19:12,400
Interactions are non-obvious.

430
00:19:12,400 --> 00:19:16,040
And without scaffolding agents don't reliably know where a decision is encoded.

431
00:19:16,040 --> 00:19:17,040
So they guess.

432
00:19:17,040 --> 00:19:18,200
And when they guess they change state.

433
00:19:18,200 --> 00:19:22,000
Even when you do provide scaffolding, instructions files, examples, templates.

434
00:19:22,000 --> 00:19:24,600
The agent is still operating on a representation problem.

435
00:19:24,600 --> 00:19:25,600
It sees JSON.

436
00:19:25,600 --> 00:19:27,400
It doesn't see the stakeholder conversation.

437
00:19:27,400 --> 00:19:30,640
It doesn't see why the CFO demanded a specific exception view.

438
00:19:30,640 --> 00:19:35,480
It doesn't see the historical baggage behind a temporary bookmark that became permanent.

439
00:19:35,480 --> 00:19:36,480
It sees patterns.

440
00:19:36,480 --> 00:19:38,680
Therefore it produces pattern-shaped edits.

441
00:19:38,680 --> 00:19:42,160
This is why the phrase "small change" stops meaning anything in PBIR.

442
00:19:42,160 --> 00:19:45,880
A one-line JSON change can rewrite the interactive behavior of an entire page.

443
00:19:45,880 --> 00:19:49,600
A bulk rename can break user trust because familiar fields disappear.

444
00:19:49,600 --> 00:19:54,480
A copied visual can carry hidden filters into a new context and quietly bias the result.

445
00:19:54,480 --> 00:19:57,960
So yes, PBIR and PBIP enable disciplined workflows.

446
00:19:57,960 --> 00:19:59,920
But only if you treat them as what they are.

447
00:19:59,920 --> 00:20:04,800
A massive, multi-file state surface that requires semantic review, not just diff review.

448
00:20:04,800 --> 00:20:09,120
And once you attach agents to that surface, especially through tool protocols and APIs,

449
00:20:09,120 --> 00:20:11,560
you've turned suggestions into state changes.

450
00:20:11,560 --> 00:20:16,400
Next that control plane gets even more dangerous when agents operate through MCP and automation

451
00:20:16,400 --> 00:20:17,400
tooling.

452
00:20:17,400 --> 00:20:19,080
Because now the platform isn't just editable.

453
00:20:19,080 --> 00:20:20,080
It's programmable.

454
00:20:20,080 --> 00:20:21,320
MCP and tooling.

455
00:20:21,320 --> 00:20:23,600
The control plane got faster, not safer.

456
00:20:23,600 --> 00:20:28,080
MCP is where the story stops being theoretical because once an agent has tools, it's no longer

457
00:20:28,080 --> 00:20:29,840
helping you write DAX.

458
00:20:29,840 --> 00:20:34,680
It's operating the control plane and the control plane is where the platform's state changes.

459
00:20:34,680 --> 00:20:39,800
Measures, metadata, translations, relationships, formatting descriptions, even bulk refactors

460
00:20:39,800 --> 00:20:42,360
that touch hundreds of objects in one run.

461
00:20:42,360 --> 00:20:43,480
That distinction matters.

462
00:20:43,480 --> 00:20:48,280
In a traditional workflow, an LLM suggests: you copy, you paste, you run it.

463
00:20:48,280 --> 00:20:50,560
That friction is an implicit safety gate.

464
00:20:50,560 --> 00:20:53,560
Annoying but real. With MCP, that friction disappears.

465
00:20:53,560 --> 00:20:58,160
The agent can read the model, decide what to do next, and execute the change directly

466
00:20:58,160 --> 00:20:59,640
through a tool call.

467
00:20:59,640 --> 00:21:00,800
Faster, yes.

468
00:21:00,800 --> 00:21:01,960
Safer, no.

469
00:21:01,960 --> 00:21:03,680
MCP servers are accelerators.

470
00:21:03,680 --> 00:21:06,400
They're designed to make repetitive operations cheap.

471
00:21:06,400 --> 00:21:12,120
Bulk rename measures, add descriptions, standardize formatting strings, generate SVG measures,

472
00:21:12,120 --> 00:21:17,720
refactor code into UDFs, apply translations, tweak relationships, create tables, run traces,

473
00:21:17,720 --> 00:21:19,400
run validation queries.

474
00:21:19,400 --> 00:21:22,960
The exact things teams hate doing manually, so the agent starts doing them.

475
00:21:22,960 --> 00:21:27,080
The uncomfortable truth is that most organizations do not have mature semantics for what "done"

476
00:21:27,080 --> 00:21:28,720
means in those tasks.

477
00:21:28,720 --> 00:21:32,360
They have technical acceptance criteria: does it deploy, does it refresh, does the report

478
00:21:32,360 --> 00:21:34,200
render, did the error go away?

479
00:21:34,200 --> 00:21:35,960
Agents optimize for that.

480
00:21:35,960 --> 00:21:41,000
If you give an agent the ability to run tool calls, its success metric becomes no errors,

481
00:21:41,000 --> 00:21:47,000
not correct definition, not approved meaning, not aligned to finance policy, no errors.

482
00:21:47,000 --> 00:21:51,080
And you can watch this failure mode in real demos, an agent tries to create something, hits

483
00:21:51,080 --> 00:21:54,480
an error, tries again, hits a different error, tries a different approach, and eventually

484
00:21:54,480 --> 00:21:55,480
gets a green check.

485
00:21:55,480 --> 00:21:57,520
That looks like problem solving. It is.

486
00:21:57,520 --> 00:22:01,240
But it's problem solving against validation constraints, not business truth constraints.
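
One mitigation is to gate write-capable tool calls behind explicit approval, so "no errors" alone cannot ship a semantic change. A sketch; the names here (approved_decisions, apply_change, the SDR-style IDs) are hypothetical, not a real MCP API:

```python
# Gate semantic mutations behind an out-of-band approval list. An agent
# can still run tools, but measure/relationship edits need an approved
# decision ID or they are refused before any state changes.

approved_decisions = {"SDR-042"}  # IDs approved by a human owner

def apply_change(change, decision_id):
    """Apply a change only if semantic edits carry an approved decision."""
    if change["kind"] in {"measure", "relationship"}:
        if decision_id not in approved_decisions:
            raise PermissionError(f"{change['kind']} change needs approval")
    return f"applied {change['kind']}:{change['target']}"

ok = apply_change({"kind": "measure", "target": "Net Revenue"}, "SDR-042")
print(ok)
try:
    apply_change({"kind": "relationship", "target": "Sales_to_Date"}, "none")
except PermissionError as e:
    print("blocked:", e)
```

The design point is that the gate checks intent, not syntax: the refused call would have validated and deployed just fine, which is exactly why validation alone is not a control.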

487
00:22:01,240 --> 00:22:03,760
That means you get a new category of drift.

488
00:22:03,760 --> 00:22:05,360
Iterative state mutation.

489
00:22:05,360 --> 00:22:09,240
The agent doesn't make a careful change, it makes a sequence of changes until the system

490
00:22:09,240 --> 00:22:10,400
stops complaining.

491
00:22:10,400 --> 00:22:14,440
And each attempt can leave residue, a partially created object, a renamed artifact, a new

492
00:22:14,440 --> 00:22:19,600
measure that compiles but isn't used, a relationship that fixed one visual and broke another.

493
00:22:19,600 --> 00:22:23,320
Even when it rolls back, you're trusting that rollback actually restored the previous semantic

494
00:22:23,320 --> 00:22:24,320
state.

495
00:22:24,320 --> 00:22:28,400
Trusting the tool is not governance. And there's another, more subtle failure mode: when agents

496
00:22:28,400 --> 00:22:32,400
lie about what they changed, not maliciously, mechanically.

497
00:22:32,400 --> 00:22:36,080
They summarize what they intended to do, what they think they did, or what would have been

498
00:22:36,080 --> 00:22:37,400
reasonable to do.

499
00:22:37,400 --> 00:22:41,320
If you don't validate against the actual model state, you'll believe the narrative.

500
00:22:41,320 --> 00:22:46,120
You can see this in MCP workflows where the agent says it created functions or updated measures

501
00:22:46,120 --> 00:22:51,400
or applied a refactor, and then you open the model and find errors, missing objects or changes

502
00:22:51,400 --> 00:22:52,840
that didn't actually occur.

503
00:22:52,840 --> 00:22:56,960
The tool calls might have failed, the model might have rejected part of the update, or the

504
00:22:56,960 --> 00:23:00,400
agent might have switched strategies mid-run and lost track of the final state.

505
00:23:00,400 --> 00:23:03,800
The system is deterministic, the agent's narration is not.

506
00:23:03,800 --> 00:23:07,600
So architecturally, you have to treat MCP as a privilege boundary.

507
00:23:07,600 --> 00:23:09,440
It is not co-pilot with plugins.

508
00:23:09,440 --> 00:23:12,320
It is a write-capable automation surface for your semantic layer.

509
00:23:12,320 --> 00:23:16,800
The same way Fabric REST APIs and service principals give you power at scale,

510
00:23:16,800 --> 00:23:20,240
MCP gives you power at scale with natural language as the interface.

511
00:23:20,240 --> 00:23:21,240
That makes it more accessible.

512
00:23:21,240 --> 00:23:26,320
It also makes it easier to misuse, because now "just make it work" becomes a bulk operation.

513
00:23:26,320 --> 00:23:29,800
And when bulk operations touch semantics, drift stops being local.

514
00:23:29,800 --> 00:23:30,800
It becomes systemic.

515
00:23:30,800 --> 00:23:33,240
This is also where identity starts to matter.

516
00:23:33,240 --> 00:23:37,200
If you run MCP through your own user context, the agent has whatever you have.

517
00:23:37,200 --> 00:23:40,840
If you run it through a service principal, it has whatever you granted that identity.

518
00:23:40,840 --> 00:23:43,000
Either way, tool access turns into authority.

519
00:23:43,000 --> 00:23:46,760
The agent becomes an actor in your governance model, whether you admit it or not.

520
00:23:46,760 --> 00:23:49,960
And once an agent is an actor, the real question is no longer "did it work?"

521
00:23:49,960 --> 00:23:53,440
The real question is, who authorized the semantic change that it just made?

522
00:23:53,440 --> 00:23:57,880
Because that's the thing you cannot reconstruct later if you didn't capture it up front.

523
00:23:57,880 --> 00:24:00,840
Next, this becomes an audit problem, not a modeling problem.

524
00:24:00,840 --> 00:24:03,080
After the fact, you can't prove intent.

525
00:24:03,080 --> 00:24:04,400
Auditability collapse.

526
00:24:04,400 --> 00:24:06,440
You can't prove intent after the fact.

527
00:24:06,440 --> 00:24:10,320
Auditability is where most agentic BI conversations die in the real world.

528
00:24:10,320 --> 00:24:11,720
Not because auditors hate AI.

529
00:24:11,720 --> 00:24:13,480
Auditors don't care what tool you use.

530
00:24:13,480 --> 00:24:16,680
They care that you can answer four questions without improvising.

531
00:24:16,680 --> 00:24:17,520
Who changed it?

532
00:24:17,520 --> 00:24:18,600
When did they change it?

533
00:24:18,600 --> 00:24:19,720
Why did they change it?

534
00:24:19,720 --> 00:24:23,160
And what policy or business rule authorized the change?

535
00:24:23,160 --> 00:24:26,400
In Power BI and Fabric, that applies to everything that affects meaning,

536
00:24:26,400 --> 00:24:31,120
measures, relationships, calculation groups, power query steps, role definitions,

537
00:24:31,120 --> 00:24:34,400
certified data sets, report filters, default bookmarks.

538
00:24:34,400 --> 00:24:37,320
If any of those changed, you need to show intent.

539
00:24:37,320 --> 00:24:38,920
Now, here's the architectural problem.

540
00:24:38,920 --> 00:24:42,720
An agent can generate change history, but it cannot generate intent history

541
00:24:42,720 --> 00:24:45,680
unless you force intent into the workflow before the change happens.

542
00:24:45,680 --> 00:24:47,280
Because intent is not a log line.

543
00:24:47,280 --> 00:24:49,120
Intent is a decision.

544
00:24:49,120 --> 00:24:52,640
Most teams think "we have logs" means "we have governance."

545
00:24:52,640 --> 00:24:55,960
They point to activity logs, git commits, fabric item history,

546
00:24:55,960 --> 00:24:59,160
MCP tool call traces, copilot chat transcripts.

547
00:24:59,160 --> 00:25:01,120
None of those answer the auditor's question,

548
00:25:01,120 --> 00:25:04,600
because none of those artifacts explain the business rule that was approved.

549
00:25:04,600 --> 00:25:09,040
A tool call log tells you that the agent edited measure X at 14:03.

550
00:25:09,040 --> 00:25:11,680
It does not tell you that finance approved changing net revenue

551
00:25:11,680 --> 00:25:14,120
to exclude internal transfers starting in Q3

552
00:25:14,120 --> 00:25:17,680
and that the change aligned to the approved finance policy. That distinction matters.

553
00:25:17,680 --> 00:25:19,600
Tool telemetry is not governance evidence.

554
00:25:19,600 --> 00:25:20,360
It is exhaust.

555
00:25:20,360 --> 00:25:23,640
And when agents are involved, the exhaust gets noisier while the evidence gets thinner.

556
00:25:23,640 --> 00:25:25,760
You can end up with hundreds of microchanges,

557
00:25:25,760 --> 00:25:27,440
many of them iterative attempts,

558
00:25:27,440 --> 00:25:30,880
and the only stable thing you can say is that the model validates.

559
00:25:30,880 --> 00:25:31,880
That is not a control.

560
00:25:31,880 --> 00:25:33,400
That is an absence of alarms.

561
00:25:33,400 --> 00:25:35,440
The collapse happens in three steps.

562
00:25:35,440 --> 00:25:40,000
First, the agent makes a semantic change as part of fixing something else.

563
00:25:40,000 --> 00:25:41,760
A measure gets refactored for performance.

564
00:25:41,760 --> 00:25:43,120
Therefore, a filter gets moved.

565
00:25:43,120 --> 00:25:44,880
Therefore, a relationship gets adjusted.

566
00:25:44,880 --> 00:25:47,080
Therefore, a downstream KPI shifts.

567
00:25:47,080 --> 00:25:48,560
It's one task in the agent's mind.

568
00:25:48,560 --> 00:25:51,320
It's four separate control failures in an audit review.

569
00:25:51,320 --> 00:25:55,080
Second, the human reviewer cannot reliably interpret the change surface.

570
00:25:55,080 --> 00:25:56,720
PBIR diffs are structural noise.

571
00:25:56,720 --> 00:25:58,560
Model diffs show metadata churn.

572
00:25:58,560 --> 00:26:03,520
A PR can include dozens of JSON and model edits with no way to see the semantic delta

573
00:26:03,520 --> 00:26:05,680
unless you translate it into business language.

574
00:26:05,680 --> 00:26:07,440
Most teams don't do that translation.

575
00:26:07,440 --> 00:26:10,240
They merge because it looks reasonable.

576
00:26:10,240 --> 00:26:13,960
Third, when the question comes later, why did this KPI change last quarter?

577
00:26:13,960 --> 00:26:16,760
You have no artifact that ties the delta to an approved intent.

578
00:26:16,760 --> 00:26:17,640
You have a commit.

579
00:26:17,640 --> 00:26:18,560
You have a deployment.

580
00:26:18,560 --> 00:26:22,120
You have a chat session where someone asked for an optimized revenue measure.

581
00:26:22,120 --> 00:26:22,880
That is not intent.

582
00:26:22,880 --> 00:26:24,360
That is a request for velocity.

583
00:26:24,360 --> 00:26:27,920
This is where the missing artifact becomes obvious, the semantic decision record.

584
00:26:27,920 --> 00:26:29,400
Call it an SDR if you want.

585
00:26:29,400 --> 00:26:30,400
Call it a decision log.

586
00:26:30,400 --> 00:26:31,400
Call it a data contract.

587
00:26:31,400 --> 00:26:32,400
The name doesn't matter.

588
00:26:32,400 --> 00:26:33,880
The function does.

589
00:26:33,880 --> 00:26:38,800
A semantic decision record is a small explicit statement that says this is the business concept.

590
00:26:38,800 --> 00:26:40,440
This is the approved definition.

591
00:26:40,440 --> 00:26:41,440
These are the constraints.

592
00:26:41,440 --> 00:26:43,920
This is the owner and this is the effective date.

593
00:26:43,920 --> 00:26:46,800
Then it links to the exact change set that implemented it.
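
In code, the artifact can be as small as a record type. A sketch; the field names are illustrative, not a standard, and the change-set link would point at whatever your pipeline uses (a commit, a PR, a deployment ID):

```python
# A minimal semantic decision record (SDR): capture the approved intent
# before the change ships, and link it to the change set that implements it.
from dataclasses import dataclass

@dataclass
class SemanticDecisionRecord:
    concept: str         # business concept, e.g. "Net Revenue"
    definition: str      # the approved definition, in business language
    constraints: list    # e.g. ["excludes internal transfers"]
    owner: str           # who approved it
    effective_date: str  # when the definition takes effect
    change_set: str      # link to the commit/PR that implemented it

sdr = SemanticDecisionRecord(
    concept="Net Revenue",
    definition="Gross revenue minus returns, excluding internal transfers",
    constraints=["excludes internal transfers"],
    owner="finance",
    effective_date="2025-07-01",
    change_set="commit abc123",
)
print(sdr.concept, "->", sdr.change_set)
```

The value is not the format; it is that the record exists before the change, so the delta an auditor finds later ties back to a decision rather than a chat transcript.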

594
00:26:46,800 --> 00:26:50,400
Without that, you cannot prove that the system changed for a legitimate reason.

595
00:26:50,400 --> 00:26:51,920
You can only prove that it changed.

596
00:26:51,920 --> 00:26:56,840
"It changed" is not acceptable in regulated environments: in financial reporting, in healthcare

597
00:26:56,840 --> 00:27:01,880
metrics, in operational KPIs that drive compensation, pricing, staffing or risk.

598
00:27:01,880 --> 00:27:06,040
Even if you're not regulated, you still get the same failure in executive trust.

599
00:27:06,040 --> 00:27:09,320
Once leaders see two different numbers for the same question, they stop trusting the

600
00:27:09,320 --> 00:27:11,840
platform and start building shadow spreadsheets.

601
00:27:11,840 --> 00:27:14,560
Drift becomes compliance debt.

602
00:27:14,560 --> 00:27:18,800
Not because a regulator shows up tomorrow, but because every unknown definition becomes

603
00:27:18,800 --> 00:27:23,080
a future incident, every agent fixed measure becomes a future reconciliation.

604
00:27:23,080 --> 00:27:26,640
Every undocumented change becomes a future argument where nobody can win because nobody

605
00:27:26,640 --> 00:27:28,120
can prove what was intended.

606
00:27:28,120 --> 00:27:29,680
So the reframing is simple.

607
00:27:29,680 --> 00:27:32,280
If you can't reconstruct intent, you don't have governance.

608
00:27:32,280 --> 00:27:33,680
You have telemetry.

609
00:27:33,680 --> 00:27:37,880
And telemetry is what you look at after the system already did the thing you needed to prevent.

610
00:27:37,880 --> 00:27:40,800
Next, the same lack of intent shows up in a different form.

611
00:27:40,800 --> 00:27:45,200
Security and access assumptions erode because agents expand pathways faster than your permission

612
00:27:45,200 --> 00:27:47,240
model can keep up.

613
00:27:47,240 --> 00:27:48,720
Permission drift.

614
00:27:48,720 --> 00:27:51,680
Agents expand access paths without you noticing.

615
00:27:51,680 --> 00:27:53,760
Permission drift is the quiet twin of semantic drift.

616
00:27:53,760 --> 00:27:55,840
You can argue about definitions in a meeting.

617
00:27:55,840 --> 00:27:58,960
You can't argue with an access path you didn't know existed.

618
00:27:58,960 --> 00:28:00,120
Agents need reach.

619
00:28:00,120 --> 00:28:05,640
To help, they require access to semantic models, lakehouses, warehouses, workspaces, gateways,

620
00:28:05,640 --> 00:28:06,640
and APIs.

621
00:28:06,640 --> 00:28:10,440
And the moment an organization treats the agent like a productivity feature instead of an

622
00:28:10,440 --> 00:28:13,160
identity, it starts handing out permissions.

623
00:28:13,160 --> 00:28:15,600
The same way it hands out exception clauses.

624
00:28:15,600 --> 00:28:18,240
Temporarily, broadly, and with no retirement plan.

625
00:28:18,240 --> 00:28:20,800
The first mistake is overscoping by convenience.

626
00:28:20,800 --> 00:28:24,960
Someone wants the agent to bulk update measures across multiple models, therefore they

627
00:28:24,960 --> 00:28:27,160
grant contributor on the workspace.

628
00:28:27,160 --> 00:28:32,080
Someone wants it to read data to validate outputs, therefore they grant access to the lakehouse.

629
00:28:32,080 --> 00:28:36,400
Someone wants it to deploy PBIR changes, therefore they grant write access to the repo and the

630
00:28:36,400 --> 00:28:37,840
pipeline identity.

631
00:28:37,840 --> 00:28:38,840
It works.

632
00:28:38,840 --> 00:28:40,440
The task completes.

633
00:28:40,440 --> 00:28:42,200
Everyone moves on.

634
00:28:42,200 --> 00:28:46,200
That is how an agent becomes a permanent admin-shaped hole in your control plane, because

635
00:28:46,200 --> 00:28:48,320
in Fabric, privileges compose.

636
00:28:48,320 --> 00:28:52,480
Workspace roles, item permissions, data permissions, and external tool permissions all stack into

637
00:28:52,480 --> 00:28:54,160
an effective capability set.

638
00:28:54,160 --> 00:28:58,760
If you hand an agent broad workspace permissions, you didn't just let it edit a measure.

639
00:28:58,760 --> 00:29:04,120
You let it create new items, modify settings, share artifacts, and potentially expose data

640
00:29:04,120 --> 00:29:06,920
through downstream connections you never reviewed.

641
00:29:06,920 --> 00:29:12,120
And when agents operate via tools (MCP servers, REST APIs, CLI tooling), the permission boundary

642
00:29:12,120 --> 00:29:14,440
is no longer the Power BI Desktop UI.

643
00:29:14,440 --> 00:29:16,720
It is whatever that identity can do in the platform.

644
00:29:16,720 --> 00:29:21,360
So the access model has to be treated like code, small grants, explicit intent, and expiration.
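
Treating grants like code can be as simple as giving each one an intent and an expiry. A sketch with illustrative grant records; the scope string and reason field are hypothetical conventions, not a Fabric feature:

```python
# Treat agent access like code: each grant is scoped, justified, and
# expires. Expired grants drop out instead of lingering as standing access.
from datetime import date

grants = [
    {"identity": "agent-sp",
     "scope": "workspace:Finance/measures:write",
     "reason": "SDR-042 rollout",
     "expires": "2025-08-01"},
]

def active_grants(grants, today):
    """Grants still in force on the given day."""
    return [g for g in grants if date.fromisoformat(g["expires"]) > today]

print(active_grants(grants, date(2025, 7, 1)))  # before expiry: grant is live
print(active_grants(grants, date(2025, 9, 1)))  # after expiry: empty list
```

The retirement plan is built in: when the reason for the grant is done, the access is done, which is exactly the opposite of "temporarily, broadly, and forever."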

645
00:29:21,360 --> 00:29:23,680
Next is service principals and managed identities.

646
00:29:23,680 --> 00:29:26,560
They are often presented as safer because they aren't human.

647
00:29:26,560 --> 00:29:30,320
In reality, they are safer only if you manage them like production infrastructure.

648
00:29:30,320 --> 00:29:31,320
Most teams don't.

649
00:29:31,320 --> 00:29:35,600
They create an app registration, grant it broad API permissions, add it to workspace admin,

650
00:29:35,600 --> 00:29:37,000
and call it automation.

651
00:29:37,000 --> 00:29:41,280
Now the agent has a durable identity that never gets tired, never forgets, and never stops

652
00:29:41,280 --> 00:29:42,280
at 5pm.

653
00:29:42,280 --> 00:29:43,600
That's not a security win.

654
00:29:43,600 --> 00:29:47,240
That's a velocity multiplier for mistakes, and it gets worse when the agent starts needing

655
00:29:47,240 --> 00:29:48,920
cross domain access.

656
00:29:48,920 --> 00:29:51,600
One model pulls from HR data, another pulls from finance.

657
00:29:51,600 --> 00:29:53,360
The third pulls from operations.

658
00:29:53,360 --> 00:29:57,720
The agent becomes the glue because it can see everything and help everywhere.

659
00:29:57,720 --> 00:30:00,840
That's exactly how data boundaries collapse in large tenants.

660
00:30:00,840 --> 00:30:03,000
Not through malice, but through convenience.

661
00:30:03,000 --> 00:30:06,400
There's also a new risk surface, context leakage.

662
00:30:06,400 --> 00:30:07,400
Agents don't just act.

663
00:30:07,400 --> 00:30:08,400
They ingest.

664
00:30:08,400 --> 00:30:11,440
They pull metadata, sample data, measure expressions, error messages, even snippets

665
00:30:11,440 --> 00:30:14,240
of query output to reason about what to do next.

666
00:30:14,240 --> 00:30:18,480
If that context goes into prompts, chat logs, or third party tool telemetry, you now have

667
00:30:18,480 --> 00:30:22,240
sensitive information replicated outside the governed data plane.

668
00:30:22,240 --> 00:30:25,520
Even if the raw data never left the tenant, the meaning might have.

669
00:30:25,520 --> 00:30:31,160
Column names, custom identifiers in error text, fragments of PII in debug output, relationships

670
00:30:31,160 --> 00:30:32,640
that reveal business structure.

671
00:30:32,640 --> 00:30:33,640
That is still leakage.

672
00:30:33,640 --> 00:30:35,560
It's just more subtle than a file export.
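
To make that leakage point concrete, here is a minimal sketch of scrubbing identifier-shaped tokens from agent context before it reaches prompts, chat logs, or telemetry. The regex patterns and the CUST- identifier format are illustrative assumptions, not a complete data-loss-prevention approach.

```python
import re

# Illustrative patterns only: real deployments need tenant-specific rules.
PATTERNS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "<email>"),   # email addresses
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "<ssn>"),           # SSN-shaped ids
    (re.compile(r"\bCUST-\d{6}\b"), "<customer-id>"),          # hypothetical internal id format
]

def redact(context: str) -> str:
    """Replace sensitive-looking tokens before context leaves the governed plane."""
    for pattern, placeholder in PATTERNS:
        context = pattern.sub(placeholder, context)
    return context

sample = "Error for CUST-004217: refresh failed, contact ana.lopez@contoso.com"
print(redact(sample))
```

Even a thin filter like this acknowledges that error text and metadata are an exfiltration surface, not just the raw rows.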

673
00:30:35,560 --> 00:30:39,040
And you can't rely on "the agent respects permissions" as a security story.

674
00:30:39,040 --> 00:30:40,520
Of course it respects permissions.

675
00:30:40,520 --> 00:30:42,280
It operates as the identity you gave it.

676
00:30:42,280 --> 00:30:46,520
So if you gave it permission to bypass RLS assumptions: by querying a model as an owner,

677
00:30:46,520 --> 00:30:51,160
by using an API endpoint that returns more than a report viewer would see, by operating

678
00:30:51,160 --> 00:30:55,360
in a workspace where the semantic model is shared too broadly, then your security model

679
00:30:55,360 --> 00:30:56,680
didn't get bypassed.

680
00:30:56,680 --> 00:30:58,280
It got redesigned by omission.

681
00:30:58,280 --> 00:30:59,840
This is the architectural law.

682
00:30:59,840 --> 00:31:02,840
Every new integration pathway is a new access path.

683
00:31:02,840 --> 00:31:04,880
Agents create pathways, tools create pathways.

684
00:31:04,880 --> 00:31:07,080
"Just this one" automations create pathways.

685
00:31:07,080 --> 00:31:10,640
Over time those pathways accumulate and the tenant stops being a set of controlled

686
00:31:10,640 --> 00:31:14,800
workspaces and becomes an authorization graph nobody can accurately reason about.

687
00:31:14,800 --> 00:31:16,480
That is permission drift.

688
00:31:16,480 --> 00:31:18,720
And it lands you in the same place as semantic drift.

689
00:31:18,720 --> 00:31:22,240
You can't prove what should have been allowed because you never encoded intent into the

690
00:31:22,240 --> 00:31:23,240
system.

691
00:31:23,240 --> 00:31:24,680
So the fix is not "ban agents".

692
00:31:24,680 --> 00:31:29,600
The fix is to enforce design gates so that autonomy can't expand faster than governance.

693
00:31:29,600 --> 00:31:34,080
Next, that means a governance model that treats agent output as untrusted until it passes

694
00:31:34,080 --> 00:31:35,880
four explicit gates.

695
00:31:35,880 --> 00:31:36,880
The governance model.

696
00:31:36,880 --> 00:31:40,200
Four gates that stop drift without killing velocity.

697
00:31:40,200 --> 00:31:42,520
So what actually stops drift isn't better prompting.

698
00:31:42,520 --> 00:31:44,080
It isn't more careful agents.

699
00:31:44,080 --> 00:31:47,880
It isn't asking the same model three times and picking the answer that feels right.

700
00:31:47,880 --> 00:31:51,680
Drift stops when the platform refuses to accept semantic change without intent.

701
00:31:51,680 --> 00:31:53,160
That means you need gates.

702
00:31:53,160 --> 00:31:54,680
Not guidelines, not training.

703
00:31:54,680 --> 00:31:55,680
Gates.

704
00:31:55,680 --> 00:31:59,480
Mechanisms that constrain what can happen, where it can happen, and what proof must exist

705
00:31:59,480 --> 00:32:01,560
before it becomes shared truth.

706
00:32:01,560 --> 00:32:05,720
Here's the governance model that works with agents instead of pretending agents will behave.

707
00:32:05,720 --> 00:32:06,720
Four gates.

708
00:32:06,720 --> 00:32:07,720
It adds friction.

709
00:32:07,720 --> 00:32:08,720
That's the point.

710
00:32:08,720 --> 00:32:11,320
The friction is targeted at the only place that matters.

711
00:32:11,320 --> 00:32:12,320
Semantic authority.

712
00:32:12,320 --> 00:32:13,320
Gate one is intent mapping.

713
00:32:13,320 --> 00:32:18,320
Before an agent generates anything, you force the human to state what "correct" means.

714
00:32:18,320 --> 00:32:19,320
Not in prose.

715
00:32:19,320 --> 00:32:20,320
In constraints.

716
00:32:20,320 --> 00:32:23,600
Scope, allowed operations, definitions, and exclusions.

717
00:32:23,600 --> 00:32:27,040
If you cannot write the intent down, you are not ready to automate it because the agent

718
00:32:27,040 --> 00:32:30,480
will otherwise invent the missing constraints and it will invent them differently every

719
00:32:30,480 --> 00:32:31,480
time.

720
00:32:31,480 --> 00:32:36,440
Intent mapping turns "build me a KPI" into: build this KPI under these rules, using these

721
00:32:36,440 --> 00:32:40,560
tables, aligned to this calendar, with these exclusions, and with this owner.

722
00:32:40,560 --> 00:32:43,400
It creates a deterministic contract the agent has to follow.

723
00:32:43,400 --> 00:32:45,080
Gate two is change containment.

724
00:32:45,080 --> 00:32:47,720
Agents don't get to operate on production state ever.

725
00:32:47,720 --> 00:32:50,320
You don't let an agent just update the model.

726
00:32:50,320 --> 00:32:54,480
You let it work in a sandbox, a branch, a separate workspace, a cloned semantic model,

727
00:32:54,480 --> 00:32:56,080
a PBIP project copy.

728
00:32:56,080 --> 00:32:57,400
You bound its blast radius.

729
00:32:57,400 --> 00:33:00,160
You make rollbacks boring, you make failure cheap.

730
00:33:00,160 --> 00:33:04,240
Containment is what prevents iterative tool calling from becoming iterative corruption.

731
00:33:04,240 --> 00:33:08,120
Because agents don't make one change, they make sequences, so you isolate the sequence from

732
00:33:08,120 --> 00:33:09,440
anything people trust.

733
00:33:09,440 --> 00:33:11,280
Gate three is review and verification.

734
00:33:11,280 --> 00:33:14,000
This is where most organizations lie to themselves.

735
00:33:14,000 --> 00:33:16,360
They think code review equals semantic review.

736
00:33:16,360 --> 00:33:17,360
It doesn't.

737
00:33:17,360 --> 00:33:19,640
Verification means you test meaning, not syntax.

738
00:33:19,640 --> 00:33:21,760
You validate outputs against known scenarios.

739
00:33:21,760 --> 00:33:24,680
You test time intelligence against your fiscal boundaries.

740
00:33:24,680 --> 00:33:26,560
You test filters that historically break.

741
00:33:26,560 --> 00:33:28,040
You compare before and after.

742
00:33:28,040 --> 00:33:32,000
You detect new relationships, cross-filter changes, many-to-many additions, calculation

743
00:33:32,000 --> 00:33:33,000
group changes.

744
00:33:33,000 --> 00:33:34,840
And you do it in a way that produces evidence.

745
00:33:34,840 --> 00:33:37,080
That evidence is what makes governance real.

746
00:33:37,080 --> 00:33:40,640
Without it, you're just approving "looks fine" changes at scale.

747
00:33:40,640 --> 00:33:42,560
Gate four is release and attestation.

748
00:33:42,560 --> 00:33:46,640
Nothing becomes certified, promoted or shared until an accountable owner signs off that

749
00:33:46,640 --> 00:33:51,040
the semantic contract is still true, and that sign off needs to be attached to the change

750
00:33:51,040 --> 00:33:52,040
set.

751
00:33:52,040 --> 00:33:56,680
Not a meeting note, not a chat thread: a durable artifact recording who approved it, what definition

752
00:33:56,680 --> 00:34:00,880
changed, why it changed, what policy governs it, and what version is now authoritative.

753
00:34:00,880 --> 00:34:03,480
Attestation is what makes auditability survive.

754
00:34:03,480 --> 00:34:04,840
Now notice what this model does.

755
00:34:04,840 --> 00:34:06,560
It doesn't treat the agent as a developer.

756
00:34:06,560 --> 00:34:10,880
It treats the agent as an untrusted automation engine operating inside a control pipeline,

757
00:34:10,880 --> 00:34:15,520
the same way you treat infrastructure as code: fast execution, strict gates, and a clear

758
00:34:15,520 --> 00:34:17,000
chain of approval.

759
00:34:17,000 --> 00:34:21,040
And this maps cleanly onto Fabric's lifecycle if you stop trying to make Fabric self-service

760
00:34:21,040 --> 00:34:23,040
mean uncontrolled.

761
00:34:23,040 --> 00:34:27,680
Intent mapping aligns to your design stage, the semantic decision record, the data product

762
00:34:27,680 --> 00:34:30,840
definition, the contract for what this model is allowed to mean.

763
00:34:30,840 --> 00:34:35,400
Containment aligns to development environments, separate workspaces, deployment pipelines,

764
00:34:35,400 --> 00:34:41,080
PBIP projects in repos, isolated identities: the agent works where mistakes are survivable.

765
00:34:41,080 --> 00:34:45,400
Verification aligns to CI, automated checks where possible and mandatory semantic review

766
00:34:45,400 --> 00:34:49,440
where automation cannot prove meaning, because "it refreshed" is not a test.

767
00:34:49,440 --> 00:34:54,040
Release and attestation aligns to promotion, the point where you mark a dataset as endorsed,

768
00:34:54,040 --> 00:34:57,320
certified or production ready, and you can defend that claim later.

769
00:34:57,320 --> 00:34:59,440
This is also how you keep velocity.

770
00:34:59,440 --> 00:35:02,440
Agents remain valuable where they should be valuable.

771
00:35:02,440 --> 00:35:06,880
Generating scaffolding, doing repetitive edits, documenting measures, applying formatting

772
00:35:06,880 --> 00:35:11,720
rules, translating metadata, creating drafts of logic that humans can verify. They

773
00:35:11,720 --> 00:35:16,400
stop being valuable where they are most dangerous, silently redefining business meaning through

774
00:35:16,400 --> 00:35:17,760
unchecked autonomy.

775
00:35:17,760 --> 00:35:21,480
So the four gates aren't bureaucracy, they're an architecture. They make autonomy safe

776
00:35:21,480 --> 00:35:27,160
by making intent explicit, blast radius small, verification mandatory, and releases traceable.

777
00:35:27,160 --> 00:35:30,320
And if you're wondering where this fails most often, it's gate one.

778
00:35:30,320 --> 00:35:34,120
Most teams can build pipelines, most teams can do reviews, most teams can deploy. What they

779
00:35:34,120 --> 00:35:37,920
cannot do is write down semantic intent in a form that can be enforced.

780
00:35:37,920 --> 00:35:39,240
That's where drift begins.

781
00:35:39,240 --> 00:35:40,920
So that's where the fix begins.

782
00:35:40,920 --> 00:35:45,320
Gate one: intent mapping as a deterministic contract. Gate one is intent mapping, and it's

783
00:35:45,320 --> 00:35:49,960
where organizations either become capable of safe autonomy or they get drift as a service,

784
00:35:49,960 --> 00:35:52,360
because intent is the only thing agents don't have.

785
00:35:52,360 --> 00:35:56,640
They have patterns, they have tools, they have the ability to mutate state until validation

786
00:35:56,640 --> 00:36:00,760
passes, but they don't have the business decision that makes one definition acceptable and

787
00:36:00,760 --> 00:36:03,040
another definition a governance incident.

788
00:36:03,040 --> 00:36:05,480
So intent mapping isn't "write better prompts".

789
00:36:05,480 --> 00:36:09,720
It is a deterministic contract that the agent is not allowed to improvise around. Start

790
00:36:09,720 --> 00:36:14,880
with the simplest rule: define the allowed operations. Not what you hope the agent will do,

791
00:36:14,880 --> 00:36:16,240
what it is permitted to do.

792
00:36:16,240 --> 00:36:20,920
For example: create measures only. No relationships, no Power Query rewrites, no calculation

793
00:36:20,920 --> 00:36:26,000
group edits, no table additions, no renames outside a defined namespace, no touching certified

794
00:36:26,000 --> 00:36:27,000
assets.

795
00:36:27,000 --> 00:36:30,960
That scope has to be brutally narrow because a narrow scope creates predictable outcomes.

796
00:36:30,960 --> 00:36:35,360
If you can't state the scope in one sentence, you're not delegating a task, you're delegating

797
00:36:35,360 --> 00:36:36,680
ambiguity.
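
A scope that narrow can be enforced mechanically rather than hoped for. Here is a minimal sketch of a pre-flight allow-list check run before any agent tool call; the operation names are hypothetical placeholders for whatever your agent tooling actually exposes.

```python
# Hypothetical operation names; real gates would map to actual tool/API calls.
ALLOWED_OPERATIONS = {"create_measure", "set_measure_description"}

PROHIBITED_ALWAYS = {"edit_relationship", "rewrite_power_query",
                     "edit_calculation_group", "add_table", "touch_certified_asset"}

def preflight(requested_ops):
    """Reject the whole run if any requested operation falls outside the contract."""
    violations = [op for op in requested_ops
                  if op in PROHIBITED_ALWAYS or op not in ALLOWED_OPERATIONS]
    return (len(violations) == 0, violations)

ok, why = preflight(["create_measure", "edit_relationship"])
print(ok, why)  # the run is blocked, and the violation is named
```

The design point is default-deny: anything not explicitly allowed fails the run before the agent touches state.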

798
00:36:36,680 --> 00:36:38,360
Next define the semantic constraints.

799
00:36:38,360 --> 00:36:41,440
This is the part most teams skip because it feels like bureaucracy.

800
00:36:41,440 --> 00:36:46,360
It is not; it is the only way to stop the agent from inventing your business rules.

801
00:36:46,360 --> 00:36:49,960
Semantic constraints are things like which tables are authoritative, which columns are

802
00:36:49,960 --> 00:36:55,000
approved inputs and which existing measures must be reused instead of recreated.

803
00:36:55,000 --> 00:36:59,520
It is also where you encode calendar law, which date table is the only valid date table,

804
00:36:59,520 --> 00:37:03,960
which fiscal year definition applies and which date columns are allowed for time intelligence.

805
00:37:03,960 --> 00:37:08,480
In other words, you're telling the agent what "time" is, what "customer" is, what "sale" is,

806
00:37:08,480 --> 00:37:10,600
and what "exclude" means in your organization.

807
00:37:10,600 --> 00:37:14,720
Because if you don't, it will pick defaults and defaults are never your policy.

808
00:37:14,720 --> 00:37:17,800
Then enforce naming conventions as rules, not style.

809
00:37:17,800 --> 00:37:21,080
Most naming guidance is optional, therefore it erodes.

810
00:37:21,080 --> 00:37:25,840
Agents accelerate that erosion by generating names faster than people can normalize them.

811
00:37:25,840 --> 00:37:30,320
So the contract must include a naming grammar: prefixes, suffixes, display folders, and

812
00:37:30,320 --> 00:37:31,640
prohibited synonyms.

813
00:37:31,640 --> 00:37:36,560
Not because it makes the model pretty, because naming is how users discover and reuse definitions.

814
00:37:36,560 --> 00:37:38,560
If the names drift, the semantics fork.
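
A naming grammar becomes a rule rather than a style the moment a validator can fail a change on it. This is a minimal sketch; the prefix grammar and the banned synonyms are invented for illustration.

```python
import re

# Illustrative grammar: an approved prefix, then a capitalized body.
MEASURE_NAME = re.compile(r"^(Total|Avg|Pct|Count) [A-Z][A-Za-z ]+$")

# Hypothetical prohibited synonyms that fork semantics when they appear.
PROHIBITED_SYNONYMS = {"Revenue2", "SalesAmt", "Cust"}

def name_violations(names):
    """Return the measure names that break the grammar or use a banned synonym."""
    bad = []
    for name in names:
        if not MEASURE_NAME.match(name) or any(s in name for s in PROHIBITED_SYNONYMS):
            bad.append(name)
    return bad

print(name_violations(["Total Net Revenue", "SalesAmt YTD", "Total Cust Count"]))
```

Run in gate three, a check like this blocks the drift before users ever discover two names for one definition.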

815
00:37:38,560 --> 00:37:42,400
A deterministic contract also includes reuse boundaries.

816
00:37:42,400 --> 00:37:47,000
If a measure exists in the certified model, reference it, do not create a variant.

817
00:37:47,000 --> 00:37:52,200
That single rule collapses an entire drift vector because it forces convergence instead of duplication.

818
00:37:52,200 --> 00:37:54,920
Now add the "ask three questions first" pattern.

819
00:37:54,920 --> 00:37:58,800
This matters because agents will start work immediately unless you force clarification.

820
00:37:58,800 --> 00:38:02,000
The three questions are not philosophical, they're mechanical.

821
00:38:02,000 --> 00:38:07,200
First, what is the business definition in plain language, including inclusions and exclusions?

822
00:38:07,200 --> 00:38:11,720
Second, what is the grain and the time logic: daily, monthly, fiscal calendar, posted date,

823
00:38:11,720 --> 00:38:13,800
invoice date, last refresh date?

824
00:38:13,800 --> 00:38:16,400
Third, what is the expected validation example?

825
00:38:16,400 --> 00:38:20,800
A known number, a known slice, a scenario where the result can be checked.

826
00:38:20,800 --> 00:38:24,740
Those three questions turn the task into something testable, which means gate three can later

827
00:38:24,740 --> 00:38:25,740
verify it.

828
00:38:25,740 --> 00:38:29,960
Without them, you will approve "looks right" outputs and drift will survive review.

829
00:38:29,960 --> 00:38:32,880
And yes, the community has already stumbled into this truth.

830
00:38:32,880 --> 00:38:37,040
In the agentic report development discussions around PBIR and context engineering, the most

831
00:38:37,040 --> 00:38:39,320
valuable outcome wasn't faster visuals.

832
00:38:39,320 --> 00:38:43,560
It was being forced to document what you wanted before the tool could do anything useful.

833
00:38:43,560 --> 00:38:48,440
That is not a productivity tax; that is the missing discipline most BI teams never institutionalized.

834
00:38:48,440 --> 00:38:50,600
Now make the output of gate one tangible.

835
00:38:50,600 --> 00:38:55,440
The result is an artifact: a short intent spec, not a five-page requirements document.

836
00:38:55,440 --> 00:39:00,240
A compact contract that includes scope, constraints, required reuse, naming rules and validation

837
00:39:00,240 --> 00:39:01,240
cases.

838
00:39:01,240 --> 00:39:05,320
One page, two pages maximum, stored with the work item, linked to the change set, reusable

839
00:39:05,320 --> 00:39:06,400
as future context.
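
Such a compact intent spec can even be machine-readable. The sketch below invents field names purely to show the shape: scope, authoritative inputs, required reuse, exclusions, and validation cases in one small artifact.

```python
from dataclasses import dataclass

@dataclass
class IntentSpec:
    """One-page intent contract; all field names here are illustrative."""
    kpi: str
    owner: str
    allowed_operations: list
    authoritative_tables: list
    date_table: str        # the only valid calendar
    must_reuse: list       # certified measures to reference, never recreate
    exclusions: list
    validation_cases: list # (description, expected value) pairs

spec = IntentSpec(
    kpi="Net Revenue",
    owner="finance-data-products",
    allowed_operations=["create_measure"],
    authoritative_tables=["FactSales", "DimCustomer"],
    date_table="DimDate",
    must_reuse=["[Gross Revenue]"],
    exclusions=["internal accounts", "test tenants"],
    validation_cases=[("FY24 M03 vs ledger extract", 1204330.17)],
)

def ready_to_automate(s: IntentSpec) -> bool:
    """No owner, no scope, or no validation case means you are delegating ambiguity."""
    return bool(s.owner and s.allowed_operations and s.validation_cases)

print(ready_to_automate(spec))
```

Stored next to the work item, the same object doubles as agent context on the next change.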

840
00:39:06,400 --> 00:39:10,480
This is where you stop treating models as artifacts and start treating them as products because

841
00:39:10,480 --> 00:39:11,760
a product has contracts.

842
00:39:11,760 --> 00:39:15,400
A product has owners, a product has defined semantics that don't change because someone

843
00:39:15,400 --> 00:39:17,720
asked an agent to make it faster.

844
00:39:17,720 --> 00:39:20,560
Finally, intent mapping has to be enforceable.

845
00:39:20,560 --> 00:39:23,720
If it lives only in a word document, it will be ignored at scale.

846
00:39:23,720 --> 00:39:28,200
It has to live where the agent runs: instruction files, policy-as-code checks, tool allow lists

847
00:39:28,200 --> 00:39:31,640
and pre-flight validations that block prohibited operations.

848
00:39:31,640 --> 00:39:35,080
The moment the agent can violate the contract, you are back to probabilistic semantics.

849
00:39:35,080 --> 00:39:36,960
So gate one is not documentation.

850
00:39:36,960 --> 00:39:38,480
It is authorization for meaning.

851
00:39:38,480 --> 00:39:42,880
And once meaning is authorized, the next problem is blast radius because even correct changes

852
00:39:42,880 --> 00:39:46,240
become dangerous when they are applied directly to shared state.

853
00:39:46,240 --> 00:39:48,160
That's why gate two is containment.

854
00:39:48,160 --> 00:39:50,040
Gate two, containment.

855
00:39:50,040 --> 00:39:53,040
Sandboxes, branches and scoped identities.

856
00:39:53,040 --> 00:39:57,560
Gate two is containment because even when your intent is perfect, execution still isn't.

857
00:39:57,560 --> 00:39:59,320
Agents don't apply one surgical change.

858
00:39:59,320 --> 00:40:00,320
They iterate.

859
00:40:00,320 --> 00:40:03,520
They try something, hit a validation error, try a different approach and keep going until

860
00:40:03,520 --> 00:40:05,240
the platform stops complaining.

861
00:40:05,240 --> 00:40:06,440
That's not reckless.

862
00:40:06,440 --> 00:40:08,440
That's literally how tool using agents work.

863
00:40:08,440 --> 00:40:10,080
Action, feedback, action, feedback.

864
00:40:10,080 --> 00:40:12,640
So you don't contain the agent because it's evil.

865
00:40:12,640 --> 00:40:15,280
You contain it because iteration creates blast radius.

866
00:40:15,280 --> 00:40:16,280
The rule is simple.

867
00:40:16,280 --> 00:40:18,040
Agent output is never production state.

868
00:40:18,040 --> 00:40:19,040
It is a proposal.

869
00:40:19,040 --> 00:40:24,120
That means you treat every agent run like a branch, not a commit, not a hot fix, a branch.

870
00:40:24,120 --> 00:40:26,520
Something you can throw away without a post-mortem.

871
00:40:26,520 --> 00:40:30,240
In fabric terms, containment starts with where the agent is allowed to operate.

872
00:40:30,240 --> 00:40:34,400
You don't point an agent at the certified semantic model and say optimize measures.

873
00:40:34,400 --> 00:40:39,920
You point it at a copy, a development workspace, a sandbox semantic model or a PBIP project

874
00:40:39,920 --> 00:40:41,360
checked out in a repo.

875
00:40:41,360 --> 00:40:44,600
You isolate the run from the assets people already trust.

876
00:40:44,600 --> 00:40:47,200
Because when an agent is wrong, it will be wrong quickly.

877
00:40:47,200 --> 00:40:48,880
And it will be wrong everywhere it can reach.

878
00:40:48,880 --> 00:40:52,280
This is where environment design becomes governance, not tooling trivia.

879
00:40:52,280 --> 00:40:56,000
A separate workspace for agent runs isn't process overhead.

880
00:40:56,000 --> 00:41:00,160
It's the difference between a contained failure and a tenant-wide semantic incident.

881
00:41:00,160 --> 00:41:02,800
Then you wire that containment into your life cycle.

882
00:41:02,800 --> 00:41:06,280
If you already use deployment pipelines, treat the agent workspace as the same class of

883
00:41:06,280 --> 00:41:07,280
environment as dev.

884
00:41:07,280 --> 00:41:09,680
Let the agent mutate dev artifacts.

885
00:41:09,680 --> 00:41:12,640
Promote through the pipeline only after gate three verification.

886
00:41:12,640 --> 00:41:14,840
Never let the agent bypass the promotion boundary.

887
00:41:14,840 --> 00:41:18,560
If you are using PBIP and PBIR, containment becomes even more literal.

888
00:41:18,560 --> 00:41:20,280
The PBIP project is the boundary.

889
00:41:20,280 --> 00:41:21,560
The repo is the boundary.

890
00:41:21,560 --> 00:41:25,080
The agent can edit files in a branch, generate a PR and stop.

891
00:41:25,080 --> 00:41:26,320
That's the correct shape.

892
00:41:26,320 --> 00:41:28,840
Changes are visible, reviewable and reversible.

893
00:41:28,840 --> 00:41:33,320
And you don't allow direct edits in production workspaces just because it's faster.

894
00:41:33,320 --> 00:41:36,840
That's how you turn an agent into a release engineer with no accountability.

895
00:41:36,840 --> 00:41:38,960
Containment also means scoped identities.

896
00:41:38,960 --> 00:41:41,000
This is where most teams sabotage themselves.

897
00:41:41,000 --> 00:41:44,880
They run agents under a human user with broad permissions because it's easy.

898
00:41:44,880 --> 00:41:49,160
Or they create a service principal with tenant-wide rights because "automation needs access".

899
00:41:49,160 --> 00:41:50,160
That is not automation.

900
00:41:50,160 --> 00:41:51,600
That is unmanaged authority.

901
00:41:51,600 --> 00:41:54,240
The agent's identity must be minimum viable permission.

902
00:41:54,240 --> 00:41:55,640
Not minimum viable friction.

903
00:41:55,640 --> 00:41:58,480
If the task is "create measure descriptions".

904
00:41:58,480 --> 00:42:00,600
The agent doesn't need to edit relationships.

905
00:42:00,600 --> 00:42:04,440
If the task is "bulk rename measures", the agent doesn't need access to the lakehouse.

906
00:42:04,440 --> 00:42:10,080
If the task is "generator report page" in PBIR, the agent doesn't need workspace admin.

907
00:42:10,080 --> 00:42:13,920
So you scope the identity to the exact surface area you want the agent to touch.

908
00:42:13,920 --> 00:42:15,440
And you scope it in time too.

909
00:42:15,440 --> 00:42:17,200
Standing access is lazy design.

910
00:42:17,200 --> 00:42:18,840
Time bound access is containment.

911
00:42:18,840 --> 00:42:24,480
The clean model is: the agent gets a scoped identity, for a defined window, on a defined workspace,

912
00:42:24,480 --> 00:42:27,040
with a defined allow list of tool operations.

913
00:42:27,040 --> 00:42:31,160
When the run ends, the permissions end. And if you can't enforce that technically, you enforce it

914
00:42:31,160 --> 00:42:34,080
operationally with separate identities and separate environments.
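
A sketch of that clean model, using a hypothetical grant record: real enforcement would live in your identity platform (for example Entra with just-in-time access), not in application code, so treat this purely as the shape of the rule.

```python
from datetime import datetime, timedelta, timezone

class ScopedGrant:
    """Hypothetical time-bound, workspace-bound, operation-bound agent grant."""

    def __init__(self, identity, workspace, allowed_ops, minutes):
        self.identity = identity
        self.workspace = workspace
        self.allowed_ops = set(allowed_ops)
        self.expires_at = datetime.now(timezone.utc) + timedelta(minutes=minutes)

    def permits(self, workspace, op):
        # Access must match the workspace, the allow-list, and the time window.
        return (workspace == self.workspace
                and op in self.allowed_ops
                and datetime.now(timezone.utc) < self.expires_at)

grant = ScopedGrant("agent-run-42", "dev-sandbox", ["create_measure"], minutes=30)
print(grant.permits("dev-sandbox", "create_measure"))  # in scope, in window
print(grant.permits("prod", "create_measure"))         # wrong workspace: denied
```

Standing access disappears by construction; when the window closes, the identity is worthless.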

915
00:42:34,080 --> 00:42:35,560
Now add deterministic backups.

916
00:42:35,560 --> 00:42:38,400
This is the part nobody wants to do until they need it.

917
00:42:38,400 --> 00:42:40,240
Before an agent run, you take a restore point.

918
00:42:40,240 --> 00:42:41,600
For PBIP, that's trivial.

919
00:42:41,600 --> 00:42:43,000
Branch, tag, commit.

920
00:42:43,000 --> 00:42:45,200
For service artifacts, you need an equivalent.

921
00:42:45,200 --> 00:42:46,840
Export the model metadata.

922
00:42:46,840 --> 00:42:50,640
Checkpoint the workspace or at least ensure the previous promoted version is recoverable

923
00:42:50,640 --> 00:42:51,880
in the pipeline.

924
00:42:51,880 --> 00:42:55,400
Because rollback is not a nice to have with agents, it is a design requirement.

925
00:42:55,400 --> 00:42:58,200
You are not protecting yourself from a single bad change.

926
00:42:58,200 --> 00:43:02,040
You are protecting yourself from a sequence of changes that passed validation but corrupted

927
00:43:02,040 --> 00:43:03,040
meaning.

928
00:43:03,040 --> 00:43:06,720
Containment also reduces the temptation to let the agent "fix it live".

929
00:43:06,720 --> 00:43:11,320
Once teams see a demo where MCP can update measures in real time, they start treating production

930
00:43:11,320 --> 00:43:12,840
as an interactive sandbox.

931
00:43:12,840 --> 00:43:13,840
That's entertaining.

932
00:43:13,840 --> 00:43:16,840
It's also how you lose the only thing governance is supposed to preserve.

933
00:43:16,840 --> 00:43:18,080
A stable truth layer.

934
00:43:18,080 --> 00:43:21,520
So gate 2 is how you keep experimentation without sacrificing trust.

935
00:43:21,520 --> 00:43:22,520
Agents can run fast.

936
00:43:22,520 --> 00:43:24,200
They can try 10 variations.

937
00:43:24,200 --> 00:43:25,520
They can refactor aggressively.

938
00:43:25,520 --> 00:43:27,600
They can generate drafts and alternatives.

939
00:43:27,600 --> 00:43:28,840
In containment that's fine.

940
00:43:28,840 --> 00:43:30,680
That's value. Outside containment:

941
00:43:30,680 --> 00:43:32,280
That's drift with better UX.

942
00:43:32,280 --> 00:43:36,400
And once you've contained the blast radius, you still haven't solved the real problem because

943
00:43:36,400 --> 00:43:39,400
a contained wrong change is still a wrong change.

944
00:43:39,400 --> 00:43:41,400
Containment prevents disasters.

945
00:43:41,400 --> 00:43:43,360
Verification prevents subtle corruption.

946
00:43:43,360 --> 00:43:44,520
That's gate 3.

947
00:43:44,520 --> 00:43:47,320
Gate three: verification. "It worked" is not a test.

948
00:43:47,320 --> 00:43:50,600
Gate three is verification, because "it worked" is not a test.

949
00:43:50,600 --> 00:43:51,600
It's a symptom.

950
00:43:51,600 --> 00:43:52,960
The platform didn't throw an exception.

951
00:43:52,960 --> 00:43:56,760
That's all. Power BI and Fabric will happily return a number for almost anything.

952
00:43:56,760 --> 00:44:00,160
Dax will happily evaluate under the current filter context.

953
00:44:00,160 --> 00:44:03,200
Relationships will happily propagate filters down the best available path.

954
00:44:03,200 --> 00:44:07,760
Power BI will happily render a report that encodes a biased default view.

955
00:44:07,760 --> 00:44:11,880
None of that proves the output matches the business definition you think you have.

956
00:44:11,880 --> 00:44:15,080
Verification is where you force the model to prove meaning, not just compile.

957
00:44:15,080 --> 00:44:17,240
The first shift is semantic unit testing.

958
00:44:17,240 --> 00:44:18,240
Not performance testing.

959
00:44:18,240 --> 00:44:20,160
Not "does the visual load".

960
00:44:20,160 --> 00:44:21,680
Effective tests are small.

961
00:44:21,680 --> 00:44:24,560
Explicit scenarios where the expected behavior is known and stable.

962
00:44:24,560 --> 00:44:29,240
For example, net revenue for customer X in fiscal month Y must equal the ledger extract

963
00:44:29,240 --> 00:44:34,960
total for that cohort within a defined tolerance and with explicit inclusion and exclusion rules.

964
00:44:34,960 --> 00:44:39,880
Or active customers must not count internal accounts, must not count test tenants and must

965
00:44:39,880 --> 00:44:41,800
use the posting date, not the order date.

966
00:44:41,800 --> 00:44:44,520
The point is not that every KPI needs a thousand tests.

967
00:44:44,520 --> 00:44:48,960
The point is that every KPI needs at least one scenario that proves its contract.

968
00:44:48,960 --> 00:44:50,640
Otherwise you are approving vibes.

969
00:44:50,640 --> 00:44:55,120
Now, teams hear tests and think they need a full engineering platform before they can start.

970
00:44:55,120 --> 00:44:56,120
They don't.

971
00:44:56,120 --> 00:44:57,280
They need a habit.

972
00:44:57,280 --> 00:45:02,480
A small set of canonical slices and expected outputs that get checked before promotion.

973
00:45:02,480 --> 00:45:06,960
In fabric terms, this can be a set of DAX queries that return known values or a comparison

974
00:45:06,960 --> 00:45:09,840
query that validates deltas against the baseline.

975
00:45:09,840 --> 00:45:10,840
It can be lightweight.

976
00:45:10,840 --> 00:45:12,200
It just cannot be optional.
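
As a sketch of that habit: the query runner below is a stub standing in for real DAX execution against the model (for example via an XMLA endpoint), and the measure names, slices, and numbers are invented.

```python
# Stub standing in for executing a DAX query against the semantic model.
def run_measure(measure, filters):
    sample_results = {
        ("Net Revenue", ("Customer X", "FY24-M03")): 1204330.17,
    }
    return sample_results[(measure, filters)]

def check_contract(measure, filters, expected, tolerance=0.01):
    """A KPI passes only if a known slice matches the agreed source of truth."""
    actual = run_measure(measure, filters)
    return abs(actual - expected) <= tolerance

# One canonical slice, one expected value from the ledger extract.
print(check_contract("Net Revenue", ("Customer X", "FY24-M03"), 1204330.17))
```

One scenario per KPI is the floor, not the ceiling; the point is that promotion is blocked until the check runs.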

977
00:45:12,200 --> 00:45:14,720
Next is separating performance from correctness.

978
00:45:14,720 --> 00:45:20,040
Agents love optimizing because optimization produces measurable feedback: fewer storage engine

979
00:45:20,040 --> 00:45:26,600
scans, shorter query durations, fewer formula engine hits. It looks scientific, it looks disciplined.

980
00:45:26,600 --> 00:45:29,800
But performance improvements are worthless if the definition drifted.

981
00:45:29,800 --> 00:45:32,680
So the rule is prove correctness first, then optimize.

982
00:45:32,680 --> 00:45:37,400
If an agent proposes a refactor, gate three must validate that the refactor produces identical

983
00:45:37,400 --> 00:45:42,440
results across the test scenarios, not roughly similar, identical where it matters, and

984
00:45:42,440 --> 00:45:44,880
within policy defined tolerance where it doesn't.

985
00:45:44,880 --> 00:45:46,880
If you can't prove equivalence, you didn't optimize.

986
00:45:46,880 --> 00:45:47,880
You changed the KPI.
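
Proving equivalence can look like this minimal sketch: both runners are stubs for querying the pre-change and post-change models over the same scenarios, with a policy-defined tolerance, and all values are invented.

```python
# Compare baseline vs refactored measure results over the same test scenarios.
SCENARIOS = ["FY24-M01", "FY24-M02", "FY24-M03"]

def run_baseline(scenario):
    return {"FY24-M01": 100.0, "FY24-M02": 250.5, "FY24-M03": 300.0}[scenario]

def run_refactored(scenario):
    # The refactor introduced a tiny numeric delta in one month.
    return {"FY24-M01": 100.0, "FY24-M02": 250.5, "FY24-M03": 300.004}[scenario]

def equivalent(tolerance=0.0):
    # Identical where it matters; policy-defined tolerance where it doesn't.
    return all(abs(run_baseline(s) - run_refactored(s)) <= tolerance
               for s in SCENARIOS)

print(equivalent(tolerance=0.0))   # strict comparison: the delta fails
print(equivalent(tolerance=0.01))  # within policy tolerance: passes
```

Whether that delta is an optimization or a changed KPI is exactly the decision the tolerance policy has to make explicit.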

987
00:45:47,880 --> 00:45:51,080
That distinction matters when executives build decisions on that KPI.

988
00:45:51,080 --> 00:45:52,480
Then there's diff interpretation.

989
00:45:52,480 --> 00:45:56,440
PBIR and model metadata diffs are too low level for most reviewers.

990
00:45:56,440 --> 00:45:58,960
You can see change, but you can't see intent.

991
00:45:58,960 --> 00:46:01,000
So gate three requires translation.

992
00:46:01,000 --> 00:46:05,120
Turn the raw diffs into a human language change summary that describes semantic impact.

993
00:46:05,120 --> 00:46:07,440
Not updated visual JSON.

994
00:46:07,440 --> 00:46:08,680
That's meaningless.

995
00:46:08,680 --> 00:46:14,720
It needs to say "moved filter status = posted from visual scope to page scope" or "changed

996
00:46:14,720 --> 00:46:19,920
measure net revenue to exclude the returns table" or "added bidirectional filtering on the customer

997
00:46:19,920 --> 00:46:21,280
sales relationship".

998
00:46:21,280 --> 00:46:22,400
Those are semantic changes.

999
00:46:22,400 --> 00:46:23,760
Those are reviewable.

1000
00:46:23,760 --> 00:46:26,840
If you can't translate a diff into meaning, you can't approve it.
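
That translation step can itself be automated. The diff shape below is a deliberate simplification of what TMDL or PBIR diffs contain, with made-up field names, to show the idea of mapping low-level changes to reviewable semantic statements.

```python
# Translate low-level metadata diff entries into reviewable semantic statements.
def describe(change):
    kind = change["kind"]
    if kind == "relationship_added":
        return (f"Added relationship {change['from']} -> {change['to']} "
                f"({change['cross_filter']} cross-filter)")
    if kind == "measure_changed":
        return f"Changed measure '{change['name']}': {change['summary']}"
    if kind == "filter_moved":
        return (f"Moved filter {change['filter']} from "
                f"{change['old_scope']} to {change['new_scope']}")
    # Anything unclassifiable is, by definition, unreviewable.
    return f"Unclassified change: {change}"

diff = [
    {"kind": "measure_changed", "name": "Net Revenue",
     "summary": "now excludes the Returns table"},
    {"kind": "relationship_added", "from": "Customer", "to": "Sales",
     "cross_filter": "bidirectional"},
]
for change in diff:
    print(describe(change))
```

The fallback branch is the governance rule in code: a change you cannot name in business terms is a change you cannot approve.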

1001
00:46:26,840 --> 00:46:29,280
Gate three also needs explicit reject criteria.

1002
00:46:29,280 --> 00:46:32,320
This is where you stop pretending every change is negotiable.

1003
00:46:32,320 --> 00:46:36,240
Some changes are structural drift generators and you reject them by default unless an

1004
00:46:36,240 --> 00:46:39,200
owner explicitly authorizes them in gate one.

1005
00:46:39,200 --> 00:46:41,320
Reject criteria should be boring and absolute.

1006
00:46:41,320 --> 00:46:44,360
New relationships, changes in cross-filter direction.

1007
00:46:44,360 --> 00:46:45,960
New many-to-many relationships.

1008
00:46:45,960 --> 00:46:48,320
Inactive-to-active relationship flips.

1009
00:46:48,320 --> 00:46:53,040
New date tables, calculation group edits and time intelligence that references non-approved

1010
00:46:53,040 --> 00:46:54,120
date columns.

1011
00:46:54,120 --> 00:46:58,520
If the agent introduces any of that, the change fails verification unless the intent contract

1012
00:46:58,520 --> 00:47:00,080
explicitly allowed it.

1013
00:47:00,080 --> 00:47:04,160
Because those changes rewrite the model's behavior in ways most teams cannot reason about

1014
00:47:04,160 --> 00:47:05,160
under pressure.
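
Those default-deny reject criteria can be encoded directly. The change-kind labels below are illustrative stand-ins for whatever your model diff tooling actually emits.

```python
# Structural drift generators: rejected by default, allowed only when the
# gate-one intent contract explicitly authorized them.
REJECT_BY_DEFAULT = {
    "relationship_added", "cross_filter_changed", "many_to_many_added",
    "relationship_activated", "date_table_added", "calculation_group_edited",
}

def verify(changes, intent_allows=frozenset()):
    """Fail verification on any structural change the contract didn't authorize."""
    rejected = [c for c in changes
                if c in REJECT_BY_DEFAULT and c not in intent_allows]
    return (len(rejected) == 0, rejected)

# An agent run that quietly adds a relationship fails unless gate one allowed it.
print(verify(["measure_added", "relationship_added"]))
print(verify(["measure_added", "relationship_added"],
             intent_allows={"relationship_added"}))
```

The asymmetry is deliberate: boring additive changes pass silently, while structural ones need a human who wrote the permission down first.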

1015
00:47:05,160 --> 00:47:06,760
And then there's the human in the loop rule.

1016
00:47:06,760 --> 00:47:08,240
This isn't best practice yet.

1017
00:47:08,240 --> 00:47:09,640
It is design law.

1018
00:47:09,640 --> 00:47:14,160
Semantic changes require a human owner to accept accountability, not to rubber-stamp a PR,

1019
00:47:14,160 --> 00:47:18,040
but to accept that a business definition changed and that the organization will live with

1020
00:47:18,040 --> 00:47:19,040
the consequences.

1021
00:47:19,040 --> 00:47:23,160
The agent can generate, the agent can refactor, the agent can even propose tests.

1022
00:47:23,160 --> 00:47:24,800
But the human approves meaning.

1023
00:47:24,800 --> 00:47:27,200
That is the point where governance becomes real.

1024
00:47:27,200 --> 00:47:30,440
Someone with domain authority signs the contract, not just the code.

1025
00:47:30,440 --> 00:47:34,400
Finally, verification has to be fast enough that teams don't bypass it.

1026
00:47:34,400 --> 00:47:36,840
That's why you don't build a massive test harness first.

1027
00:47:36,840 --> 00:47:41,840
You build a thin one that catches the common failure modes: time logic errors, filter propagation

1028
00:47:41,840 --> 00:47:45,160
changes, duplicate definitions and hidden defaults in reports.

1029
00:47:45,160 --> 00:47:49,320
If you catch those, you catch most drift early and if you don't catch them, you will catch

1030
00:47:49,320 --> 00:47:53,680
them later in production in front of stakeholders with no ability to prove what changed.
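
Two of those failure modes lend themselves to very cheap checks. The sketch below assumes model metadata is available as plain dicts (a simplification, not a real Fabric or Tabular Editor API) and flags duplicate measure definitions and hidden page-level filters:

```python
# A deliberately thin harness sketch: a couple of checks over model and
# report metadata, represented here as plain dicts for illustration.

def check_duplicate_definitions(measures):
    """Two measures with different names but identical expressions are drift bait."""
    seen, dupes = {}, []
    for name, expr in measures.items():
        norm = " ".join(expr.split()).lower()  # collapse whitespace, ignore case
        if norm in seen:
            dupes.append((seen[norm], name))
        else:
            seen[norm] = name
    return dupes

def check_hidden_defaults(report_pages):
    """Page-level filters that stakeholders cannot see on any visual."""
    return [p["name"] for p in report_pages if p.get("page_filters")]

measures = {
    "Net Revenue": "SUM(Sales[Amount]) - SUM(Returns[Amount])",
    "Revenue Net": "SUM(Sales[Amount])  -  SUM(Returns[Amount])",
}
pages = [{"name": "Overview", "page_filters": ["Status = Posted"]},
         {"name": "Detail"}]

print(check_duplicate_definitions(measures))
print(check_hidden_defaults(pages))
```

Each check is a few lines, which is exactly why teams actually run it on every change instead of bypassing it.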

1031
00:47:53,680 --> 00:47:58,400
Gate three prevents subtle corruption, but it still doesn't make change safe by itself.

1032
00:47:58,400 --> 00:48:02,160
Because even correct verified change can become ungoverned truth if you promote it without

1033
00:48:02,160 --> 00:48:03,160
provenance.

1034
00:48:03,160 --> 00:48:05,280
That's why gate four exists.

1035
00:48:05,280 --> 00:48:09,800
Release and attestation, where traceability becomes part of the product.

1036
00:48:09,800 --> 00:48:11,000
Gate four.

1037
00:48:11,000 --> 00:48:12,640
Release and attestation.

1038
00:48:12,640 --> 00:48:14,160
Provenance becomes the product.

1039
00:48:14,160 --> 00:48:18,720
Gate four is where most organizations get uncomfortable because it forces a confession.

1040
00:48:18,720 --> 00:48:21,520
The semantic model isn't just a technical artifact.

1041
00:48:21,520 --> 00:48:24,400
It's a decision surface and decisions require owners.

1042
00:48:24,400 --> 00:48:28,520
Release and attestation is the point where you stop treating deployed as true.

1043
00:48:28,520 --> 00:48:32,280
You only promote what you can defend later under questioning without hand waving.

1044
00:48:32,280 --> 00:48:34,720
That means provenance becomes part of what you ship.

1045
00:48:34,720 --> 00:48:37,960
In practice, gate four starts with required metadata.

1046
00:48:37,960 --> 00:48:43,280
Every promoted semantic change needs a minimal provenance bundle, who approved it, what changed,

1047
00:48:43,280 --> 00:48:48,160
why it changed, what policy or definition it aligns to, and what version is now authoritative.

1048
00:48:48,160 --> 00:48:52,680
If you can't answer those five things from a durable artifact, the release doesn't happen.
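
Those five questions can be enforced as a release precondition. A minimal sketch, with assumed field names for the provenance bundle:

```python
# Minimal provenance-bundle gate, sketched. Field names are assumptions;
# the rule is the point: no complete bundle, no release.

REQUIRED = ("approved_by", "what_changed", "why", "aligns_to", "version")

def can_release(bundle: dict) -> bool:
    """Every required field must be present and non-empty."""
    return all(bundle.get(field) for field in REQUIRED)

bundle = {
    "approved_by": "finance.kpi.owner@contoso.com",
    "what_changed": "Net Revenue now excludes the Returns table",
    "why": "Decision record FIN-2024-031",
    "aligns_to": "Corporate revenue recognition policy v3",
    "version": "2.4.0",
}
print(can_release(bundle))                  # True
print(can_release({**bundle, "why": ""}))   # False: missing rationale blocks release
```

Note that the bundle is a durable artifact in its own right, not a pointer into a chat transcript or an agent log.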

1049
00:48:52,680 --> 00:48:54,960
This is where teams try to substitute process theatre.

1050
00:48:54,960 --> 00:48:56,600
They'll say, "It's in Git."

1051
00:48:56,600 --> 00:48:59,120
Or, "The agent log shows the tool calls."

1052
00:48:59,120 --> 00:49:00,800
Or, "We have the chat transcript."

1053
00:49:00,800 --> 00:49:02,080
None of that is attestation.

1054
00:49:02,080 --> 00:49:04,080
Those are implementation traces.

1055
00:49:04,080 --> 00:49:06,600
Attestation is an explicit statement of responsibility.

1056
00:49:06,600 --> 00:49:09,360
So the release pipeline needs a hard rule.

1057
00:49:09,360 --> 00:49:13,360
Semantic changes require sign-off by a named owner with domain authority.

1058
00:49:13,360 --> 00:49:15,280
Not the developer who merged the PR.

1059
00:49:15,280 --> 00:49:18,480
Not the platform admin, the person who owns the meaning of the KPI.

1060
00:49:18,480 --> 00:49:19,800
That's not bureaucracy.

1061
00:49:19,800 --> 00:49:24,040
That's the only mechanism that prevents "the agent did it" from becoming your organization's

1062
00:49:24,040 --> 00:49:25,520
default excuse.

1063
00:49:25,520 --> 00:49:27,480
Next align attestation to endorsement.

1064
00:49:27,480 --> 00:49:30,960
Fabric already has the concept of promoted and certified data sets.

1065
00:49:30,960 --> 00:49:34,480
Treat those badges as governance boundaries, not decoration.

1066
00:49:34,480 --> 00:49:37,320
Agents should not operate against certified data sets directly.

1067
00:49:37,320 --> 00:49:40,800
They should operate on drafts and candidates, and certification should be the final act

1068
00:49:40,800 --> 00:49:41,800
of gate four.

1069
00:49:41,800 --> 00:49:43,680
That creates a clean operating model.

1070
00:49:43,680 --> 00:49:46,560
Uncertified assets can be experimental and fast.

1071
00:49:46,560 --> 00:49:49,160
Certified assets are contract-bound and slow to change.

1072
00:49:49,160 --> 00:49:53,880
If your tenant doesn't enforce that separation, you've built a semantic commons with no law.
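
One way to make that separation executable is a write-scope check in front of the agent, keyed off the endorsement badge. A sketch, assuming a simplified workspace inventory (the real Fabric inventory is richer than this):

```python
# Sketch: enforcing the certified/draft boundary before an agent runs.
# Endorsement values mirror Fabric's Promoted/Certified badges; the
# inventory shape here is an assumption for illustration.

def agent_may_write(dataset: dict) -> bool:
    """Agents work on drafts and candidates; certified assets are contract-bound."""
    return dataset.get("endorsement") != "Certified"

datasets = [
    {"name": "Sales - candidate", "endorsement": None},
    {"name": "Sales - certified", "endorsement": "Certified"},
]
writable = [d["name"] for d in datasets if agent_may_write(d)]
print(writable)  # only the candidate is in scope for the agent
```

Certification then happens once, at the end of gate four, as a human act, never as an agent side effect.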

1073
00:49:53,880 --> 00:49:57,360
Purview and lineage also matter here, but not as a checkbox.

1074
00:49:57,360 --> 00:50:02,120
Lineage is useful only if it connects to intent.

1075
00:50:02,120 --> 00:50:07,040
What you actually need is "this definition was approved for this purpose," and you need it

1076
00:50:07,040 --> 00:50:08,200
discoverable.

1077
00:50:08,200 --> 00:50:12,320
So gate four requires that promoted artifacts are surfaced in the governance inventory with ownership

1078
00:50:12,320 --> 00:50:17,600
metadata, sensitivity labels where appropriate, and a traceable link to the semantic decision

1079
00:50:17,600 --> 00:50:18,600
record.

1080
00:50:18,600 --> 00:50:21,880
If you can't find the owner in two clicks, you don't have a governed data product.

1081
00:50:21,880 --> 00:50:23,720
You have a shared file with better marketing.

1082
00:50:23,720 --> 00:50:28,640
Now the other half of gate four is operational reality, roll forward and roll back.

1083
00:50:28,640 --> 00:50:31,560
Everyone loves to talk about roll back like it's a comfort blanket.

1084
00:50:31,560 --> 00:50:34,960
But with semantic drift, roll back is harder than it sounds.

1085
00:50:34,960 --> 00:50:37,440
Reports get rebuilt around the new truth.

1086
00:50:37,440 --> 00:50:38,440
Stakeholders adapt.

1087
00:50:38,440 --> 00:50:42,360
A broken KPI becomes the new baseline in someone's forecast.

1088
00:50:42,360 --> 00:50:43,520
So you need both policies.

1089
00:50:43,520 --> 00:50:46,920
When you roll back, and when you roll forward with a fix. That means release needs two things

1090
00:50:46,920 --> 00:50:47,920
every time.

1091
00:50:47,920 --> 00:50:50,760
A recovery plan and an expiry plan.

1092
00:50:50,760 --> 00:50:55,640
Recovery plan: if this change is wrong, how do you restore the previous certified version quickly?

1093
00:50:55,640 --> 00:50:56,640
Expiry plan.

1094
00:50:56,640 --> 00:51:00,800
If this change was temporary, what forces you to remove it later instead of letting it

1095
00:51:00,800 --> 00:51:02,200
become permanent drift?
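
The expiry plan only works if something checks it. A minimal sketch, with an assumed shape for exception records:

```python
# Sketch: exception clauses carry an expiry date, and the release gate
# flags any that outlived it. The record shape is an assumption.

from datetime import date

def expired_exceptions(exceptions, today):
    """Return the ids of exception clauses that should have been retired."""
    return [e["id"] for e in exceptions if e["expires"] < today]

exceptions = [
    {"id": "EXC-7", "reason": "Q4 board ask", "expires": date(2024, 1, 31)},
    {"id": "EXC-9", "reason": "Migration bridge", "expires": date(2025, 6, 30)},
]
print(expired_exceptions(exceptions, today=date(2024, 3, 1)))  # ['EXC-7']
```

Wiring this into the release pipeline turns "temporary" into a promise the platform enforces rather than a hope.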

1096
00:51:02,200 --> 00:51:05,040
This is why exception clauses are entropy generators.

1097
00:51:05,040 --> 00:51:08,520
Gate four is where you either retire them or you formalize them into policy.

1098
00:51:08,520 --> 00:51:10,640
You do not leave them floating in the model.

1099
00:51:10,640 --> 00:51:13,280
And now the KPI nobody tracks but everyone should.

1100
00:51:13,280 --> 00:51:14,280
Drift rate.

1101
00:51:14,280 --> 00:51:17,160
Not how many commits, not how many deployments.

1102
00:51:17,160 --> 00:51:18,680
Drift rate is semantic churn.

1103
00:51:18,680 --> 00:51:22,800
How often the definitions that matter change and how often they change without a corresponding

1104
00:51:22,800 --> 00:51:24,000
business decision record.
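
Drift rate can be computed from any change log that tags semantic changes and links decision records. A sketch, with an assumed log shape:

```python
# Sketch of the drift-rate KPI: semantic definition changes per period,
# split by whether a business decision record backs them. The change-log
# shape here is an assumption for illustration.

def drift_rate(change_log):
    """Return (semantic changes, semantic changes with no decision record)."""
    semantic = [c for c in change_log if c["semantic"]]
    ungoverned = [c for c in semantic if not c.get("decision_record")]
    return len(semantic), len(ungoverned)

log = [
    {"object": "Net Revenue", "semantic": True, "decision_record": "FIN-031"},
    {"object": "Active Customer", "semantic": True},   # no record: drift
    {"object": "report theme", "semantic": False},     # cosmetic, not counted
]
print(drift_rate(log))  # 2 semantic changes, 1 without a decision record
```

The second number is the one to alarm on: it rises exactly when definitions change faster than the business decides.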

1105
00:51:24,000 --> 00:51:28,160
If drift rate goes up, your governance is failing, even if your CI/CD looks pristine, because

1106
00:51:28,160 --> 00:51:32,040
the platform can be perfectly automated and still semantically unstable.

1107
00:51:32,040 --> 00:51:35,520
In fact, automation accelerates instability when you don't control meaning.

1108
00:51:35,520 --> 00:51:38,320
So gate four is the final architectural posture.

1109
00:51:38,320 --> 00:51:39,320
Provenance is the product.

1110
00:51:39,320 --> 00:51:41,040
If you can't prove it, you can't ship it.

1111
00:51:41,040 --> 00:51:45,520
And if you ship it without proof, you've moved from analytics into storytelling with numbers.

1112
00:51:45,520 --> 00:51:46,520
Next.

1113
00:51:46,520 --> 00:51:49,440
A sensible question that always follows once the gates exist.

1114
00:51:49,440 --> 00:51:53,000
When should you use agents at all and when should you keep them away from the semantic

1115
00:51:53,000 --> 00:51:55,640
layer entirely?

1116
00:51:55,640 --> 00:51:57,520
Practical operating model.

1117
00:51:57,520 --> 00:51:59,880
Where agents belong and where they don't.

1118
00:51:59,880 --> 00:52:02,040
So where do agents belong?

1119
00:52:02,040 --> 00:52:05,520
Once the novelty wears off and you're trying to keep a tenant coherent.

1120
00:52:05,520 --> 00:52:09,000
They belong where the work is repetitive, the semantics are stable and the blast radius

1121
00:52:09,000 --> 00:52:10,000
is containable.

1122
00:52:10,000 --> 00:52:12,080
In other words, agents are excellent mechanics.

1123
00:52:12,080 --> 00:52:13,560
They are terrible legislators.

1124
00:52:13,560 --> 00:52:15,760
Safe uses are the boring ones.

1125
00:52:15,760 --> 00:52:19,920
Documentation, descriptions, metadata hygiene, translations, display folders, format strings,

1126
00:52:19,920 --> 00:52:24,560
measure annotations and building out standardized scaffolding from a template you already trust.

1127
00:52:24,560 --> 00:52:28,800
If you already have a certified measure library, an agent can help you apply it consistently,

1128
00:52:28,800 --> 00:52:33,200
rename to match taxonomy, move measures into the right display folders, add consistent

1129
00:52:33,200 --> 00:52:38,080
descriptions and generate a change summary that a human can actually review.

1130
00:52:38,080 --> 00:52:41,840
Agents are also useful for bulk refactors that are syntactic, not semantic, replacing

1131
00:52:41,840 --> 00:52:46,800
a table name after a rename, updating formatting rules, applying a known pattern across many

1132
00:52:46,800 --> 00:52:53,080
measures or generating visual layout changes when the intent is explicitly defined and verified.

1133
00:52:53,080 --> 00:52:58,480
This is where PBIR and PBIP automation pays off, not designing a story but enforcing an existing

1134
00:52:58,480 --> 00:53:00,080
story across reports.

1135
00:53:00,080 --> 00:53:02,240
Conditional uses exist but only under gates.

1136
00:53:02,240 --> 00:53:03,680
Measure scaffolding is a good example.

1137
00:53:03,680 --> 00:53:08,360
If the organization already owns the definition and the calendar rules, an agent can draft

1138
00:53:08,360 --> 00:53:12,200
the DAX quickly, propose variants and even suggest tests.

1139
00:53:12,200 --> 00:53:15,640
But the agent never decides the definition, it never picks the date column, it never

1140
00:53:15,640 --> 00:53:19,080
invents exclusions. It drafts; you approve.

1141
00:53:19,080 --> 00:53:23,560
If you want a simple rule, agents can automate expression, not authority.

1142
00:53:23,560 --> 00:53:28,080
Now the hard line, where agents don't belong, agents don't belong in relationship creation,

1143
00:53:28,080 --> 00:53:32,640
schema redesign, or the definition of business KPIs without an accountable owner driving

1144
00:53:32,640 --> 00:53:33,640
it.

1145
00:53:33,640 --> 00:53:37,360
Relationship drift is too destructive and too non-obvious and schema redesign is not a

1146
00:53:37,360 --> 00:53:42,640
technical improvement, it is a change in meaning because it changes how filters propagate

1147
00:53:42,640 --> 00:53:45,200
and what belongs in a result.

1148
00:53:45,200 --> 00:53:49,240
Agents also don't belong in anything that expands semantic surface area without a retirement

1149
00:53:49,240 --> 00:53:54,560
plan, new date tables, new many too many bridges, new calculation groups, new temporary measures

1150
00:53:54,560 --> 00:53:57,280
meant to satisfy a single executive question.

1151
00:53:57,280 --> 00:54:01,000
Those are entropy generators and agents produce them faster than humans can clean them up.

1152
00:54:01,000 --> 00:54:04,080
This is also where maturity matters and it is not negotiable.

1153
00:54:04,080 --> 00:54:08,360
If a team can't ship a stable semantic model without agents, it will not ship a stable semantic

1154
00:54:08,360 --> 00:54:11,160
model with agents; it will ship instability faster.

1155
00:54:11,160 --> 00:54:13,720
Process fidelity has to exist before autonomy.

1156
00:54:13,720 --> 00:54:16,720
That's the uncomfortable truth behind every glossy agent demo.

1157
00:54:16,720 --> 00:54:20,560
The demo works because someone already had structure, templates, conventions and a review

1158
00:54:20,560 --> 00:54:21,560
process.

1159
00:54:21,560 --> 00:54:23,720
The agent didn't create that, it consumed it.

1160
00:54:23,720 --> 00:54:25,480
So the operating model is straightforward.

1161
00:54:25,480 --> 00:54:28,880
Use agents to accelerate the parts of the life cycle you already control.

1162
00:54:28,880 --> 00:54:32,720
Use them to reduce human toil in the areas where your intent is already encoded.

1163
00:54:32,720 --> 00:54:37,400
Naming, formatting, documentation, replication, refactoring within known boundaries. Do not

1164
00:54:37,400 --> 00:54:41,880
use agents to replace the only scarce resource in the system: semantic ownership. And if you're

1165
00:54:41,880 --> 00:54:46,920
thinking, "but we need speed": good. Speed is not the problem; unbounded speed is.

1166
00:54:46,920 --> 00:54:51,400
The decision rule that holds up in enterprises is simple: automate repetition, not meaning.

1167
00:54:51,400 --> 00:54:55,240
If the work is repeatable and the output can be verified deterministically, an agent

1168
00:54:55,240 --> 00:54:56,440
is a force multiplier.

1169
00:54:56,440 --> 00:55:00,200
If the work requires a business decision, an agent is just a confident guess with tool

1170
00:55:00,200 --> 00:55:01,200
access.
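
That decision rule maps naturally onto an agent-task allow list, in the spirit of the MCP tool allow lists mentioned at the end of the episode. The task names below are illustrative, not a real tool registry:

```python
# Sketch: the "automate repetition, not meaning" rule as a task router.
# Repetitive, deterministically verifiable work goes to the agent;
# anything carrying a business decision routes to a human owner.

AUTOMATE = {"rename_to_taxonomy", "add_descriptions", "apply_format_strings",
            "move_display_folders", "generate_change_summary"}
HUMAN_DECISION = {"define_kpi", "create_relationship", "redesign_schema",
                  "pick_date_column", "add_exclusion"}

def route(task: str) -> str:
    if task in AUTOMATE:
        return "agent"
    if task in HUMAN_DECISION:
        return "owner"
    return "owner"  # unknown work defaults to human review, never to the agent

print(route("apply_format_strings"))  # agent
print(route("create_relationship"))   # owner
```

The important design choice is the default branch: when a task isn't recognized, the safe failure mode is a human, not the agent.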

1171
00:55:01,200 --> 00:55:05,440
Yes, agents can belong in your fabric operating model, but only after you decide what the semantic

1172
00:55:05,440 --> 00:55:08,920
layer is: a product with contracts, owners, and gates.

1173
00:55:08,920 --> 00:55:11,560
Because if you don't, the platform will do what it always does.

1174
00:55:11,560 --> 00:55:13,800
It will drift toward convenience.

1175
00:55:13,800 --> 00:55:16,600
Control the semantics or the semantics control you.

1176
00:55:16,600 --> 00:55:20,820
Agents can accelerate delivery, but only governance gates preserve semantic truth when Power

1177
00:55:20,820 --> 00:55:23,640
BI and Fabric are being changed at machine speed.

1178
00:55:23,640 --> 00:55:27,880
If you want the next step, watch the deep dive on implementing the four gates in Fabric

1179
00:55:27,880 --> 00:55:34,000
and DevOps: branching, sandbox identities, semantic tests, and MCP tool allow lists,

1180
00:55:34,000 --> 00:55:37,720
without turning your tenant into an authorization graph that nobody can explain.

1181
00:55:37,720 --> 00:55:40,680
Subscribe if you want more architecture that survives contact with reality.