Jan. 11, 2026

Fabric Rewrote Data Engineering

This episode explores why Microsoft Fabric and Copilot feel empowering and chaotic at the same time. While Fabric simplifies the experience by unifying storage, compute, and analytics into a single platform, it does not remove the hard parts of data engineering. It removes friction, not responsibility. By making it easier to build pipelines, models, and reports quickly, Fabric accelerates every decision, including the wrong or ambiguous ones. Problems that once took months to surface now appear in days or hours, often as rising costs, performance degradation, or dashboards that quietly disagree rather than obvious system failures.

The episode explains how older data stacks unintentionally enforced governance through friction. Separate tools, environments, and deployment steps forced teams to define ownership, contracts, and boundaries. Fabric collapses those boundaries into shared workspaces and shared capacity, which blurs accountability and expands the blast radius of mistakes. Copilot amplifies this effect by optimizing for fast, plausible completion instead of long-term correctness, cost control, or governance intent. It can generate working solutions that look right and run fine, even when they silently introduce drift or inefficiency.

The core message is that Fabric didn’t break governance; the old stack simply rationed chaos. To succeed, teams must intentionally design contracts, enforce boundaries, and treat cost and correctness as engineered properties rather than afterthoughts.

Microsoft Fabric didn’t make data engineering easier.
It made ambiguity cheaper to ship.

This episode explains why teams feel faster and more out of control at the same time after adopting Fabric and Copilot, and why that isn’t a tooling problem. Fabric removed the ceremony that used to slow bad decisions down. Copilot removed the typing, not the consequences. The result is architectural erosion that shows up first as cost spikes, conflicting dashboards, and audit discomfort, not broken pipelines. If your Fabric estate “works” but feels fragile, this episode explains why.

What You’ll Learn

1. What Fabric Actually Changed (and What It Didn’t)

Fabric didn’t rewrite data engineering because of better UI or nicer tools. It rewrote it by collapsing:

  • Storage
  • Compute
  • Semantics
  • Publishing
  • Identity

into a single SaaS control plane. This removed handoffs that used to force architectural decisions and replaced them with lateral movement inside workspaces. Fabric removed friction, not responsibility.

2. Why Speed Accelerates Drift, Not Simplicity

In older stacks, ambiguity paid a tax:

  • Environment boundaries
  • Tool handoffs
  • Deployment friction
  • Separate billing surfaces

Those boundaries slowed bad decisions down. Fabric removes them. Drift now ships at refresh speed. The result isn’t failure—it’s quiet wrongness:

  • Dashboards refresh on time and disagree
  • Pipelines succeed while semantics fragment
  • Capacity spikes without deployments
  • Audits surface ownership gaps no one noticed forming

3. The New Failure Signal: Cost, Not Outages

Fabric estates don’t usually fail loudly.
They fail expensively. Because all workloads draw from a shared capacity meter:

  • Bad query shapes
  • Unbounded filters
  • Copilot-generated SQL
  • Refresh concurrency

surface first as capacity saturation, not broken jobs. Execution plans, not dashboards, become the only honest artifact.

4. Copilot’s Real Impact: Completion Over Consequence

Copilot optimizes for:

  • Plausible output
  • Fast completion
  • Syntax correctness

It does not optimize for:

  • Deterministic cost
  • Schema contracts
  • Security intent
  • Long-term correctness

Without enforced boundaries, Copilot doesn’t break governance; it accelerates its absence. Teams with enforcement get faster.
Teams without enforcement get faster at shipping entropy.

5. Why Raw Tables Become a Cost and Security Liability

When raw tables are queryable:

  • Cost becomes probabilistic
  • Schema drift becomes accepted behavior
  • Access intent collapses into workspace roles
  • Copilot becomes a blast-radius multiplier

Fabric exposes the uncomfortable truth:
Raw tables are not a consumption API.

6. Case Study: The “Haunted” Capacity Spike

A common Fabric incident pattern:

  • No deployments
  • No pipeline failures
  • Dashboards still load
  • Capacity spikes mid-day

Root cause:

  • Non-sargable predicates
  • Missing time bounds
  • SELECT *
  • Copilot-generated SQL under concurrency

Fix:

  • Views and procedures as the only query surface
  • Execution plans as acceptance criteria
  • Cost treated as an engineered property
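
The root causes above are mechanical enough to check before a query ever reaches shared capacity. What follows is a minimal sketch, not something shown in the episode: the rule names, the `lint_sql` helper, and the default `order_date` time column are all hypothetical, and the regexes are deliberately naive. A real gate would read the execution plan rather than pattern-match SQL text.

```python
import re

# Hypothetical lint rules for the query shapes named in the root cause list.
# These are text heuristics only; they do not inspect an execution plan.
RULES = {
    "select_star": re.compile(r"\bselect\s+\*", re.IGNORECASE),
    # A function wrapped around a column in WHERE, e.g. YEAR(order_date) = 2025
    "non_sargable_predicate": re.compile(
        r"\bwhere\b.*?\b\w+\s*\(\s*[\w.]+\s*\)\s*(=|<|>)",
        re.IGNORECASE | re.DOTALL,
    ),
}


def lint_sql(sql: str, time_column: str = "order_date") -> list[str]:
    """Return the names of the rules a query violates (empty list = clean)."""
    violations = [name for name, pattern in RULES.items() if pattern.search(sql)]
    # Missing time bound: the time column never appears in a range comparison.
    bound = re.compile(
        rf"\b{re.escape(time_column)}\s*(>=|>|<=|<|between\b)", re.IGNORECASE
    )
    if not bound.search(sql):
        violations.append("missing_time_bound")
    return violations


if __name__ == "__main__":
    print(lint_sql("SELECT * FROM sales WHERE YEAR(order_date) = 2025"))
    print(lint_sql(
        "SELECT order_id, amount FROM sales "
        "WHERE order_date >= '2025-01-01' AND order_date < '2025-02-01'"
    ))
```

The first query trips all three rules; the second, with explicit columns and a bounded date range, passes. The point is the shape of the control: the check runs mechanically, and the count of violations is something a pipeline can refuse on.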

7. Lakehouse → Warehouse Contract Collapse

Lakehouses are permissive by design.
Warehouses are expected to enforce structure, but they can’t enforce contracts that never existed. Without explicit schema enforcement:

  • Drift moves downstream
  • Semantic models become patch bays
  • KPIs fork silently
  • “Power BI is wrong” becomes a recurring sentence
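
To make "explicit schema enforcement" concrete, here is a small sketch of a contract check that could run before data is promoted from a Lakehouse table into the Warehouse. It is an illustration under assumptions: the `Column` type, the `check_contract` function, and the sample column names are hypothetical, and the observed schema is passed in as a plain dictionary rather than read from any Fabric API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """One column of a declared contract (hypothetical structure)."""
    name: str
    dtype: str
    nullable: bool = False


def check_contract(
    contract: list[Column], observed: dict[str, tuple[str, bool]]
) -> list[str]:
    """Compare an observed schema {name: (dtype, nullable)} with the contract.

    Returns human-readable violations; an empty list means no drift.
    """
    violations = []
    for col in contract:
        if col.name not in observed:
            violations.append(f"missing column: {col.name}")
            continue
        dtype, nullable = observed[col.name]
        if dtype != col.dtype:
            violations.append(
                f"type drift: {col.name} expected {col.dtype}, got {dtype}"
            )
        if nullable and not col.nullable:
            violations.append(f"nullability drift: {col.name} became nullable")
    # Columns nobody declared are drift too, not a free extension.
    extra = sorted(set(observed) - {c.name for c in contract})
    violations.extend(f"unexpected column: {name}" for name in extra)
    return violations


if __name__ == "__main__":
    contract = [
        Column("order_id", "bigint"),
        Column("amount", "decimal(18,2)"),
        Column("order_date", "date"),
    ]
    observed = {
        "order_id": ("bigint", False),
        "amount": ("string", False),
        "discount": ("double", True),
    }
    for violation in check_contract(contract, observed):
        print(violation)
```

Treating undeclared columns as violations is a design choice in this sketch: it keeps drift from becoming "accepted behavior" simply because nothing failed.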

The Warehouse must be the contract zone, not another reflection layer.

8. Why Workspace-Only Security Creates an Ownership Vacuum

Workspaces are collaboration boundaries, not data security boundaries. When organizations rely on workspace roles:

  • Nobody owns table-level intent
  • Service principals gain broad access
  • Audit questions stall
  • Copilot accelerates unintended exposure

The fix isn’t labels or training.
It’s engine-level enforcement: schemas, roles, views, and deny-by-default access.

9. The Modern Data Engineer’s Job Didn’t Shrink, It Moved

Fabric shrinks visible labor:

  • Pipeline scaffolding
  • Glue code
  • Manual SQL authoring

But it expands responsibility:

  • Contract design
  • Boundary enforcement
  • Cost governance
  • Failure-mode anticipation

The modern data engineer enforces intent, not just movement.

10. The Only Operating Model That Survives Fabric + Copilot

This episode outlines a survivable operating model:

  • AI drafts, humans approve
  • Contracts before convenience
  • Execution plans as cost policy
  • Views over raw tables
  • CI/CD gates for schema and logic
  • Assume decay unless enforced
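
As a rough illustration of what a mechanical gate for these rules might look like, the sketch below encodes three of them as a single pass/fail function. Everything in it is an assumption made for the example: the `Change` record, its field names, and the `gate` function are hypothetical and do not correspond to any Fabric or CI product API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Change:
    """A proposed change to a query, view, or model (hypothetical record)."""
    author: str
    ai_drafted: bool
    approved_by: Optional[str] = None
    targets_raw_tables: bool = False
    violations: list[str] = field(default_factory=list)


def gate(change: Change) -> tuple[bool, list[str]]:
    """Return (passed, reasons). Any reason blocks the change."""
    reasons = []
    # AI drafts, humans approve: the approver must be someone else.
    if change.ai_drafted and change.approved_by in (None, change.author):
        reasons.append(
            "AI-drafted change needs approval from someone other than the author"
        )
    # Views over raw tables.
    if change.targets_raw_tables:
        reasons.append("queries must go through views, not raw tables")
    # Assume decay unless enforced: any recorded violation blocks the merge.
    if change.violations:
        reasons.append(f"{len(change.violations)} contract or plan violation(s)")
    return (not reasons, reasons)


if __name__ == "__main__":
    passed, reasons = gate(Change(author="dev", ai_drafted=True))
    print(passed, reasons)
    passed, reasons = gate(Change(author="dev", ai_drafted=True, approved_by="lead"))
    print(passed, reasons)
```

The `violations` list is where the output of schema or query checks would be fed in; the gate itself only counts them, which keeps the policy separate from the checks that produce the evidence.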

Governance must be mechanical, not social.

Core Takeaway

Fabric is a speed multiplier. It multiplies:

  • Delivery velocity
  • Ambiguity
  • Governance debt

at the same rate. The platform doesn’t break.
Your assumptions do.

Call to Action

Ask yourself one question:
When something feels wrong, which artifact do you trust?

  • Execution plan
  • Capacity metrics
  • Violation count
  • Lineage view

Whatever you answered, that’s what your governance model is actually built on.

Subscribe for the next episode:
“How to Design Fabric Data Contracts That Survive Copilot.”


Transcript

1
00:00:00,000 --> 00:00:02,040
Most organizations believe Fabric plus Copilot

2
00:00:02,040 --> 00:00:03,760
made data engineering easier.

3
00:00:03,760 --> 00:00:04,560
They are wrong.

4
00:00:04,560 --> 00:00:07,200
Fabric removed the ceremony, not the responsibility.

5
00:00:07,200 --> 00:00:09,720
And Copilot removed the typing, not the consequences.

6
00:00:09,720 --> 00:00:13,160
So yes, pipelines run faster, notebooks feel friendlier,

7
00:00:13,160 --> 00:00:16,240
SQL appears out of thin air, and Power BI lights up

8
00:00:16,240 --> 00:00:18,200
before anyone has written a real contract.

9
00:00:18,200 --> 00:00:19,480
But the thing nobody's mentioning is,

10
00:00:19,480 --> 00:00:21,040
what speed does to ambiguity.

11
00:00:21,040 --> 00:00:21,960
It ships it.

12
00:00:21,960 --> 00:00:23,280
And it ships it at machine speed.

13
00:00:23,280 --> 00:00:25,520
Here's the comfortable belief I keep hearing in incident

14
00:00:25,520 --> 00:00:28,520
reviews and "why are we over budget" meetings:

15
00:00:28,520 --> 00:00:30,120
We consolidated the stack.

16
00:00:30,120 --> 00:00:31,480
We finally have one platform.

17
00:00:31,480 --> 00:00:32,720
The teams are moving faster.

18
00:00:32,720 --> 00:00:34,080
This should be simpler.

19
00:00:34,080 --> 00:00:35,720
That's the marketing narrative too.

20
00:00:35,720 --> 00:00:40,000
OneLake, one experience, one workspace, one integrated story.

21
00:00:40,000 --> 00:00:42,000
And if you stop the analysis at the UI layer,

22
00:00:42,000 --> 00:00:43,280
that narrative seems true.

23
00:00:43,280 --> 00:00:45,640
You can create a workspace, land files into OneLake,

24
00:00:45,640 --> 00:00:47,760
click a button, generate a semantic model,

25
00:00:47,760 --> 00:00:49,160
and render a report in minutes.

26
00:00:49,160 --> 00:00:50,000
That demo is real.

27
00:00:50,000 --> 00:00:51,880
The problem is what that demo trained people

28
00:00:51,880 --> 00:00:53,880
to believe: that the platform removed the need

29
00:00:53,880 --> 00:00:55,160
for deliberate architecture.

30
00:00:55,160 --> 00:00:56,040
It did not.

31
00:00:56,040 --> 00:00:58,040
In architectural terms, Fabric is not a tool.

32
00:00:58,040 --> 00:01:00,480
It's platform physics, a single SaaS control plane

33
00:01:00,480 --> 00:01:02,720
that lets every persona touch the same assets

34
00:01:02,720 --> 00:01:06,040
from multiple angles: engineering, warehousing,

35
00:01:06,040 --> 00:01:10,760
BI, real-time AI, inside one capacity envelope.

36
00:01:10,760 --> 00:01:12,040
That's not easier.

37
00:01:12,040 --> 00:01:14,800
That's less friction between intent and impact.

38
00:01:14,800 --> 00:01:17,960
And when your intent is unclear, that reduction in friction

39
00:01:17,960 --> 00:01:19,240
doesn't produce simplicity.

40
00:01:19,240 --> 00:01:20,880
It produces architectural erosion.

41
00:01:20,880 --> 00:01:22,960
You've seen this pattern before, just not this fast.

42
00:01:22,960 --> 00:01:25,360
In older stacks, ambiguity had a tax.

43
00:01:25,360 --> 00:01:28,960
It had to cross boundaries: ADF to Synapse, dbt to warehouse,

44
00:01:28,960 --> 00:01:31,600
warehouse to Power BI, Power BI to the app.

45
00:01:31,600 --> 00:01:33,840
Each handoff forced someone to make a decision,

46
00:01:33,840 --> 00:01:35,120
or at least document one.

47
00:01:35,120 --> 00:01:37,960
It was annoying, slow, and expensive,

48
00:01:37,960 --> 00:01:39,480
but it did something useful.

49
00:01:39,480 --> 00:01:41,320
It slowed bad decisions down.

50
00:01:41,320 --> 00:01:43,280
It also gave you time to notice ownership gaps

51
00:01:43,280 --> 00:01:45,080
because change rate was limited by friction.

52
00:01:45,080 --> 00:01:46,680
Fabric removes that padding.

53
00:01:46,680 --> 00:01:50,000
So the failure modes surface earlier, and they surface louder.

54
00:01:50,000 --> 00:01:52,560
That's why I'm positioning this episode in a very specific way.

55
00:01:52,560 --> 00:01:54,080
These failures happened in Fabric,

56
00:01:54,080 --> 00:01:55,560
but they're not Fabric-specific.

57
00:01:55,560 --> 00:01:58,000
Fabric just removed the padding that used to hide them.

58
00:01:58,000 --> 00:02:00,480
If you've operated Snowflake, Databricks, Synapse,

59
00:02:00,480 --> 00:02:03,320
BigQuery, pick your religion, you've seen the same entropy.

60
00:02:03,320 --> 00:02:05,160
Fabric just makes the drift visible sooner

61
00:02:05,160 --> 00:02:08,400
because the boundaries blur by default, not by exception.

62
00:02:08,400 --> 00:02:10,280
And that distinction matters, especially if you're one

63
00:02:10,280 --> 00:02:13,120
of the people inheriting a Fabric estate right now.

64
00:02:13,120 --> 00:02:17,280
This episode is for senior data engineers, analytics engineers,

65
00:02:17,280 --> 00:02:19,920
data platform owners, architects who just got handed

66
00:02:19,920 --> 00:02:22,520
a bunch of workspaces, and leaders asking the only

67
00:02:22,520 --> 00:02:25,400
honest question in the room, why is this still hard?

68
00:02:25,400 --> 00:02:27,000
Because the sales pitch said it wouldn't be.

69
00:02:27,000 --> 00:02:29,360
It's still hard because data engineering was never hard

70
00:02:29,360 --> 00:02:30,720
due to typing speed.

71
00:02:30,720 --> 00:02:32,960
It was hard because of responsibility, contracts, cost,

72
00:02:32,960 --> 00:02:34,080
and control.

73
00:02:34,080 --> 00:02:37,080
Fabric accelerates delivery; therefore it accelerates drift.

74
00:02:37,080 --> 00:02:39,280
Copilot accelerates authoring; therefore it accelerates

75
00:02:39,280 --> 00:02:40,080
wrongness.

76
00:02:40,080 --> 00:02:42,320
And the first signals you'll see are not always failures

77
00:02:42,320 --> 00:02:43,240
in the classic sense.

78
00:02:43,240 --> 00:02:44,600
You won't get a red pipeline.

79
00:02:44,600 --> 00:02:46,120
You'll get a capacity spike.

80
00:02:46,120 --> 00:02:48,000
You'll get two dashboards that disagree.

81
00:02:48,000 --> 00:02:50,320
You'll get an audit question that makes everyone stare

82
00:02:50,320 --> 00:02:50,960
at the floor.

83
00:02:50,960 --> 00:02:52,040
The platform works.

84
00:02:52,040 --> 00:02:53,720
The system doesn't.

85
00:02:53,720 --> 00:02:55,240
Now here's the part that kills me.

86
00:02:55,240 --> 00:02:57,800
Teams interpret that as "Fabric is immature" or "Copilot

87
00:02:57,800 --> 00:03:00,320
is unreliable" or "we need better training."

88
00:03:00,320 --> 00:03:03,080
And sure, training helps people click the right buttons.

89
00:03:03,080 --> 00:03:04,720
But training doesn't enforce intent.

90
00:03:04,720 --> 00:03:06,040
Labels don't enforce intent.

91
00:03:06,040 --> 00:03:07,600
Documentation doesn't enforce intent.

92
00:03:07,600 --> 00:03:09,520
Workspace RBAC doesn't enforce intent.

93
00:03:09,520 --> 00:03:10,400
Only design does.

94
00:03:10,400 --> 00:03:11,200
Only boundaries do.

95
00:03:11,200 --> 00:03:12,480
Only gates do.

96
00:03:12,480 --> 00:03:14,120
So the promise for this episode is simple.

97
00:03:14,120 --> 00:03:15,040
It's not a tutorial.

98
00:03:15,040 --> 00:03:16,160
It's not a feature tour.

99
00:03:16,160 --> 00:03:18,360
It's an explanation of why teams feel faster and more

100
00:03:18,360 --> 00:03:20,280
out of control at the exact same time.

101
00:03:20,280 --> 00:03:23,320
And what operating rules actually survive this platform.

102
00:03:23,320 --> 00:03:25,200
We're going to walk through the old mental model

103
00:03:25,200 --> 00:03:26,160
that Fabric replaced,

104
00:03:26,160 --> 00:03:29,880
what Fabric actually changed under the hood, and why one platform

105
00:03:29,880 --> 00:03:31,600
quietly converts your governance model

106
00:03:31,600 --> 00:03:35,320
from deterministic to probabilistic unless you fight back.

107
00:03:35,320 --> 00:03:37,320
Then we'll talk about Copilot, not as a novelty,

108
00:03:37,320 --> 00:03:39,520
but as an acceleration layer that optimizes

109
00:03:39,520 --> 00:03:41,080
for completion over consequence.

110
00:03:41,080 --> 00:03:43,280
And I'm not going to ask you to trust vibes.

111
00:03:43,280 --> 00:03:45,240
The only artifacts that matter in this world

112
00:03:45,240 --> 00:03:48,840
are execution plans, capacity metrics, and violation

113
00:03:48,840 --> 00:03:49,440
counts.

114
00:03:49,440 --> 00:03:51,800
If it doesn't show up in a plan, a cost report,

115
00:03:51,800 --> 00:03:54,560
or a violation count, it's not governance, it's hope.

116
00:03:54,560 --> 00:03:56,520
By the end, you'll have a mental model for Fabric

117
00:03:56,520 --> 00:03:59,360
that's honest and an operating model that assumes decay

118
00:03:59,360 --> 00:04:00,520
unless enforced.

119
00:04:00,520 --> 00:04:02,280
Because that's the only model that survives

120
00:04:02,280 --> 00:04:04,360
a platform designed to make everything easy,

121
00:04:04,360 --> 00:04:05,920
right up until it's expensive.

122
00:04:05,920 --> 00:04:08,080
Now let's rewind to the old world for a second

123
00:04:08,080 --> 00:04:10,240
because you need to remember what friction used to do

124
00:04:10,240 --> 00:04:11,200
for you.

125
00:04:11,200 --> 00:04:13,640
The old mental model: pipelines as the product.

126
00:04:13,640 --> 00:04:16,120
Before Fabric, most data teams treated the pipeline

127
00:04:16,120 --> 00:04:18,880
as the product, not the dataset, not the semantic model,

128
00:04:18,880 --> 00:04:21,120
not the KPI definition, the pipeline.

129
00:04:21,120 --> 00:04:23,600
Success meant the job ran, the schedule didn't slip,

130
00:04:23,600 --> 00:04:25,640
and nothing paged you at 2 a.m.

131
00:04:25,640 --> 00:04:28,120
The pipeline graph was the artifact you showed leadership

132
00:04:28,120 --> 00:04:29,040
to prove progress.

133
00:04:29,040 --> 00:04:30,960
Look, the boxes connect, the arrows flow,

134
00:04:30,960 --> 00:04:32,320
the runtime is green.

135
00:04:32,320 --> 00:04:34,280
And honestly, in that era, it made sense

136
00:04:34,280 --> 00:04:36,240
because infrastructure was the bottleneck.

137
00:04:36,240 --> 00:04:39,040
You didn't wake up and casually provision an estate.

138
00:04:39,040 --> 00:04:41,680
You requested clusters, you negotiated gateways,

139
00:04:41,680 --> 00:04:44,760
you waited on networking, you argued over service limits,

140
00:04:44,760 --> 00:04:46,440
you begged for firewall holes,

141
00:04:46,440 --> 00:04:49,000
and because all of that took time, the organization learned

142
00:04:49,000 --> 00:04:52,560
a behavior: treat delivery as a construction project,

143
00:04:52,560 --> 00:04:55,640
engineers as builders, environments as hard boundaries,

144
00:04:55,640 --> 00:04:57,520
releases as events.

145
00:04:57,520 --> 00:04:59,920
SQL was the interface because SQL was the only thing

146
00:04:59,920 --> 00:05:01,320
that survived handoffs.

147
00:05:01,320 --> 00:05:03,840
Data Factory to Synapse, Synapse to a warehouse,

148
00:05:03,840 --> 00:05:06,280
warehouse to Power BI, each layer forced you

149
00:05:06,280 --> 00:05:07,800
to speak a shared language.

150
00:05:07,800 --> 00:05:10,480
That language wasn't always pretty, but it was explicit.

151
00:05:10,480 --> 00:05:13,360
If a transformation mattered, it got written down somewhere.

152
00:05:13,360 --> 00:05:17,240
In SQL, in stored procedures, in dbt models, in views.

153
00:05:17,240 --> 00:05:19,720
And if it didn't get written down, it usually didn't ship

154
00:05:19,720 --> 00:05:22,400
because the next tool in the chain demanded a decision.

155
00:05:22,400 --> 00:05:25,600
Tool sprawl was an expected tax, and it came with a hidden benefit.

156
00:05:25,600 --> 00:05:27,920
But when you had ADF over here, a warehouse over there

157
00:05:27,920 --> 00:05:29,800
and Power BI in a different portal,

158
00:05:29,800 --> 00:05:31,760
you couldn't pretend the boundaries weren't real.

159
00:05:31,760 --> 00:05:33,440
Security lived in multiple places,

160
00:05:33,440 --> 00:05:35,080
compute lived in multiple places,

161
00:05:35,080 --> 00:05:36,560
storage lived in multiple places.

162
00:05:36,560 --> 00:05:38,640
That forced explicit ownership conversations,

163
00:05:38,640 --> 00:05:40,080
even if they were miserable.

164
00:05:40,080 --> 00:05:41,040
Who owns the lake?

165
00:05:41,040 --> 00:05:42,640
Who owns the warehouse schema?

166
00:05:42,640 --> 00:05:44,320
Who owns the semantic model?

167
00:05:44,320 --> 00:05:45,440
Who is allowed to publish?

168
00:05:45,440 --> 00:05:46,920
Who approves refresh schedules?

169
00:05:46,920 --> 00:05:50,000
You didn't always like the answers, but the system demanded you ask.

170
00:05:50,000 --> 00:05:52,200
Deployment friction was also accidental governance.

171
00:05:52,200 --> 00:05:54,600
PRs, approvals, promotion between dev and prod,

172
00:05:54,600 --> 00:05:56,560
long run times, limited concurrency,

173
00:05:56,560 --> 00:05:58,680
none of that made engineers happy.

174
00:05:58,680 --> 00:05:59,840
But it created a throttle.

175
00:05:59,840 --> 00:06:01,880
A bad decision had to survive a review cycle.

176
00:06:01,880 --> 00:06:04,600
A schema change had to survive someone noticing it.

177
00:06:04,600 --> 00:06:07,080
A new pipeline had to survive an environment boundary.

178
00:06:07,080 --> 00:06:09,520
You could still make mistakes, but you couldn't make them

179
00:06:09,520 --> 00:06:11,680
every five minutes across every surface.

180
00:06:11,680 --> 00:06:14,920
And that mattered because failure used to look like an outage.

181
00:06:14,920 --> 00:06:16,800
A pipeline failed, a job didn't run,

182
00:06:16,800 --> 00:06:18,320
a dataset didn't refresh.

183
00:06:18,320 --> 00:06:19,640
Something was visibly broken.

184
00:06:19,640 --> 00:06:21,760
What didn't happen as often was quietly wrong.

185
00:06:21,760 --> 00:06:23,840
You didn't usually get a perfect green check mark attached

186
00:06:23,840 --> 00:06:25,040
to a perfectly wrong dashboard

187
00:06:25,040 --> 00:06:27,440
that refreshed on time and lied consistently.

188
00:06:27,440 --> 00:06:29,440
Not because the old world was morally superior,

189
00:06:29,440 --> 00:06:30,480
because it was slower.

190
00:06:30,480 --> 00:06:32,640
Slow systems leak less ambiguity per hour.

191
00:06:32,640 --> 00:06:34,560
Here's the key insight to keep in your head:

192
00:06:34,560 --> 00:06:36,560
friction did two things at once.

193
00:06:36,560 --> 00:06:38,280
It slowed bad decisions down.

194
00:06:38,280 --> 00:06:41,080
And it hid ownership gaps by reducing the rate of change.

195
00:06:41,080 --> 00:06:44,120
So when leaders say, "Why did Fabric make everything feel out of control?"

196
00:06:44,120 --> 00:06:46,520
The answer isn't that Fabric broke governance.

197
00:06:46,520 --> 00:06:49,320
The old stack just rationed chaos.

198
00:06:49,320 --> 00:06:52,600
Fabric's real rewrite: from pipeline graph to platform physics.

199
00:06:52,600 --> 00:06:55,360
Fabric's real rewrite isn't that it gave you nicer tooling.

200
00:06:55,360 --> 00:06:57,080
It rewired where decisions happen.

201
00:06:57,080 --> 00:07:00,360
In the old model, your pipeline graph sat on top of separate systems.

202
00:07:00,360 --> 00:07:02,440
The graph orchestrated movement between places

203
00:07:02,440 --> 00:07:04,360
that were owned differently, built differently,

204
00:07:04,360 --> 00:07:07,000
secured differently, and optimized differently.

205
00:07:07,000 --> 00:07:08,360
That forced you to think in boundaries

206
00:07:08,360 --> 00:07:11,000
because the platform forced you to pay for boundaries.

207
00:07:11,000 --> 00:07:14,360
Fabric collapses that whole experience into one SaaS control plane

208
00:07:14,360 --> 00:07:16,600
and the first consequence is that the workspace

209
00:07:16,600 --> 00:07:18,200
becomes the unit of reality.

210
00:07:18,200 --> 00:07:20,840
Not the database, not the subscription, not the cluster,

211
00:07:20,840 --> 00:07:21,800
the workspace.

212
00:07:21,800 --> 00:07:24,600
Microsoft markets this as one experience,

213
00:07:24,600 --> 00:07:25,800
and that's accurate,

214
00:07:25,800 --> 00:07:27,880
but the technical meaning is more interesting.

215
00:07:27,880 --> 00:07:30,200
Fabric gives every workload,

216
00:07:30,200 --> 00:07:32,920
lakehouse, warehouse, Data Factory, notebooks,

217
00:07:32,920 --> 00:07:36,520
Power BI, real-time intelligence, data agents,

218
00:07:36,520 --> 00:07:38,520
one shared containment model.

219
00:07:38,520 --> 00:07:41,400
Same surface, same identity plane, same capacity envelope,

220
00:07:41,400 --> 00:07:43,240
same place where people click share.

221
00:07:43,240 --> 00:07:46,120
So the platform stops behaving like a chain of systems.

222
00:07:46,120 --> 00:07:48,520
It behaves like a single authorization graph

223
00:07:48,520 --> 00:07:50,680
attached to a single compute meter.

224
00:07:50,680 --> 00:07:53,080
That distinction matters because once you consolidate

225
00:07:53,080 --> 00:07:55,320
compute, storage, and publishing into one place,

226
00:07:55,320 --> 00:07:57,080
you no longer have handoffs.

227
00:07:57,080 --> 00:07:58,280
You have lateral movement.

228
00:07:58,280 --> 00:08:00,360
Engineers stop handing work off across tools

229
00:08:00,360 --> 00:08:02,360
and start handing it around inside the workspace.

230
00:08:02,360 --> 00:08:04,120
And because the mechanics feel frictionless,

231
00:08:04,120 --> 00:08:06,600
teams interpret that as architectural progress.

232
00:08:06,600 --> 00:08:09,960
But the real story is what got removed: the walls between responsibilities.

233
00:08:09,960 --> 00:08:11,960
Fabric collapses the old stack in four ways

234
00:08:11,960 --> 00:08:14,360
that are convenient in demos and brutal in real estates.

235
00:08:14,360 --> 00:08:17,720
First, storage gets pulled under a single narrative: OneLake.

236
00:08:17,720 --> 00:08:20,040
The "OneDrive for data" analogy is catchy,

237
00:08:20,040 --> 00:08:22,760
and it's not wrong in the sense that you get a unified lake layer

238
00:08:22,760 --> 00:08:23,800
and a unified catalog,

239
00:08:23,800 --> 00:08:26,760
but the actual behavior that matters for engineering

240
00:08:26,760 --> 00:08:27,880
isn't the metaphor.

241
00:08:27,880 --> 00:08:28,680
It's gravity.

242
00:08:28,680 --> 00:08:31,080
OneLake becomes the default landing zone for everything,

243
00:08:31,080 --> 00:08:32,200
and the moment that happens,

244
00:08:32,200 --> 00:08:35,160
people stop asking who owns the data because "it's in the lake."

245
00:08:35,160 --> 00:08:36,840
Ownership turns into location.

246
00:08:36,840 --> 00:08:37,880
That's not ownership.

247
00:08:37,880 --> 00:08:41,000
Second, compute gets abstracted into capacity.

248
00:08:41,000 --> 00:08:43,080
You don't pay per engine the way you used to.

249
00:08:43,080 --> 00:08:45,880
You pay for the shared meter that every engine draws from.

250
00:08:45,880 --> 00:08:48,120
Spark sessions, warehouse queries, model refreshes,

251
00:08:48,120 --> 00:08:49,800
interactive exploration, same pool.

252
00:08:49,800 --> 00:08:51,160
That sounds like simplification

253
00:08:51,160 --> 00:08:53,640
until you realize what it does to accountability.

254
00:08:53,640 --> 00:08:55,640
When the meter is shared, everyone believes

255
00:08:55,640 --> 00:08:57,960
someone else is responsible for the spike.

256
00:08:57,960 --> 00:08:59,160
And because it's a SaaS platform,

257
00:08:59,160 --> 00:09:01,240
the infrastructure isn't yours to tune,

258
00:09:01,240 --> 00:09:03,400
so teams search for comfort in the UI,

259
00:09:03,400 --> 00:09:05,960
instead of certainty in the workload behavior.

260
00:09:05,960 --> 00:09:08,520
Third, security baselines get normalized.

261
00:09:08,520 --> 00:09:10,200
Workspaces come with a simple role model.

262
00:09:10,200 --> 00:09:11,160
Great for adoption.

263
00:09:11,160 --> 00:09:12,440
Dangerous for intent.

264
00:09:12,440 --> 00:09:15,080
Because workspace roles are not a data security model.

265
00:09:15,080 --> 00:09:16,360
They're a collaboration model.

266
00:09:16,360 --> 00:09:17,560
They answer who can work here,

267
00:09:17,560 --> 00:09:18,840
not who can see this table,

268
00:09:18,840 --> 00:09:20,120
not who can query this view,

269
00:09:20,120 --> 00:09:22,360
not who is allowed to infer this sensitive attribute

270
00:09:22,360 --> 00:09:23,240
through a join.

271
00:09:23,240 --> 00:09:25,640
But when everything lives behind the same workspace boundary,

272
00:09:25,640 --> 00:09:28,200
people treat that boundary as if it were a database boundary.

273
00:09:28,200 --> 00:09:31,400
It is not. Fourth, integration paths become internal.

274
00:09:31,400 --> 00:09:34,360
In the old world, a new pipeline meant choosing connectors,

275
00:09:34,360 --> 00:09:36,760
runtimes, service principals, landing zones,

276
00:09:36,760 --> 00:09:39,080
and a destination pattern. That was ceremony.

277
00:09:39,080 --> 00:09:41,640
Fabric collapses that into item creation.

278
00:09:41,640 --> 00:09:44,760
Dataflow, pipeline, notebook, shortcut, mirror,

279
00:09:44,760 --> 00:09:47,000
warehouse table, semantic model, click, click,

280
00:09:47,000 --> 00:09:47,640
done.

281
00:09:47,640 --> 00:09:49,480
And because it's all inside one surface,

282
00:09:49,480 --> 00:09:51,480
every path looks equally endorsed.

283
00:09:51,480 --> 00:09:52,840
Shortcuts look like ownership.

284
00:09:52,840 --> 00:09:54,280
Copies look like lineage.

285
00:09:54,280 --> 00:09:55,960
Semantic reuse looks like governance.

286
00:09:55,960 --> 00:09:58,360
Meanwhile, the actual physics never changed.

287
00:09:58,360 --> 00:10:01,240
Data still has three realities you can't negotiate with.

288
00:10:01,240 --> 00:10:03,640
Cost, contracts, and control.

289
00:10:03,640 --> 00:10:06,520
Cost still exists, but now it's surfaced as capacity behavior.

290
00:10:06,520 --> 00:10:09,160
Contracts still exist, but now they're optional,

291
00:10:09,160 --> 00:10:10,440
unless you enforce them.

292
00:10:10,440 --> 00:10:13,880
Control still exists, but now it's easy to confuse access

293
00:10:13,880 --> 00:10:16,040
with intent.

294
00:10:16,040 --> 00:10:18,120
This is the foundational misunderstanding.

295
00:10:18,120 --> 00:10:20,680
Fabric removed the walls, not the physics.

296
00:10:20,680 --> 00:10:23,560
When Microsoft says workloads become experiences,

297
00:10:23,560 --> 00:10:26,280
what that means in practice is that workloads become modes.

298
00:10:26,280 --> 00:10:28,600
A lakehouse isn't a separate product you provision.

299
00:10:28,600 --> 00:10:30,920
It's an item inside the workspace.

300
00:10:30,920 --> 00:10:33,480
A warehouse isn't a separate environment you protect.

301
00:10:33,480 --> 00:10:36,200
It's an item inside the same permission model.

302
00:10:36,200 --> 00:10:37,640
Power BI isn't downstream.

303
00:10:37,640 --> 00:10:40,040
It's in the same place, pointed at the same assets,

304
00:10:40,040 --> 00:10:42,440
built by the same people, sometimes in the same afternoon.

305
00:10:42,440 --> 00:10:44,840
So the integration story becomes dangerously simple.

306
00:10:44,840 --> 00:10:47,000
If someone can see it, they can use it.

307
00:10:47,000 --> 00:10:48,280
If they can use it, they will.

308
00:10:48,280 --> 00:10:49,960
If they can publish it, they'll publish it.

309
00:10:49,960 --> 00:10:52,120
And once something gets published, it becomes production.

310
00:10:52,120 --> 00:10:54,200
Because executives don't care how it was made.

311
00:10:54,200 --> 00:10:55,400
They care that it exists.

312
00:10:55,400 --> 00:10:58,360
That's why Fabric feels like velocity and chaos at the same time.

313
00:10:58,360 --> 00:11:00,520
Because you're no longer fighting tools.

314
00:11:00,520 --> 00:11:03,640
You're fighting entropy in a single shared control plane.

315
00:11:03,640 --> 00:11:06,040
And now we need to talk about what speed does

316
00:11:06,040 --> 00:11:07,800
when determinism isn't enforced.

317
00:11:07,800 --> 00:11:11,080
Because this is where one platform turns into one blast radius.

318
00:11:11,080 --> 00:11:15,160
Speed without determinism: why faster feels like less safe.

319
00:11:15,160 --> 00:11:18,200
Most teams confuse pipeline speed with decision speed.

320
00:11:18,200 --> 00:11:20,840
Fabric absolutely makes execution faster.

321
00:11:20,840 --> 00:11:21,960
Ingestion is simpler.

322
00:11:21,960 --> 00:11:23,560
Notebooks are one click away.

323
00:11:23,560 --> 00:11:25,880
Direct Lake lights up visuals, and the platform

324
00:11:25,880 --> 00:11:27,640
removes a lot of the old drag.

325
00:11:27,640 --> 00:11:29,160
But the hidden variable is that

326
00:11:29,160 --> 00:11:32,760
the rate of change goes up and with it, the rate of unreviewed decisions.

327
00:11:32,760 --> 00:11:34,040
You didn't just speed up jobs.

328
00:11:34,040 --> 00:11:35,640
You sped up policy mistakes.

329
00:11:35,640 --> 00:11:37,240
And the moment the change rate goes up,

330
00:11:37,240 --> 00:11:39,320
ambiguity stops being a documentation problem.

331
00:11:39,320 --> 00:11:40,840
It becomes a production behavior.

332
00:11:40,840 --> 00:11:43,800
Because ambiguity in data engineering isn't philosophical.

333
00:11:43,800 --> 00:11:46,360
It's concrete drift: schema drift, naming drift,

334
00:11:46,360 --> 00:11:48,360
semantics drift, access drift.

335
00:11:48,360 --> 00:11:50,120
In the old stack, drift still happened,

336
00:11:50,120 --> 00:11:52,120
but it had to crawl through boundaries.

337
00:11:52,120 --> 00:11:54,680
Now it moves at the same speed as a refresh schedule

338
00:11:54,680 --> 00:11:56,680
and the same speed as a Copilot suggestion.

339
00:11:56,680 --> 00:11:58,840
So faster starts to feel like less safe.

340
00:11:58,840 --> 00:12:00,520
Not because Fabric is unsafe,

341
00:12:00,520 --> 00:12:03,320
but because determinism isn't enforced by default.

342
00:12:03,320 --> 00:12:05,000
You're operating a high-speed platform

343
00:12:05,000 --> 00:12:07,800
with low friction publishing and mostly social contracts.

344
00:12:07,800 --> 00:12:09,880
Social contracts do not survive scale.

345
00:12:09,880 --> 00:12:11,320
Here's the real mechanism.

346
00:12:11,320 --> 00:12:13,480
Default behaviors become architecture.

347
00:12:13,480 --> 00:12:15,720
Not in a poetic way, in an operational way.

348
00:12:15,720 --> 00:12:17,480
If a workspace has no naming conventions,

349
00:12:17,480 --> 00:12:20,120
then whatever the first team did becomes the convention.

350
00:12:20,120 --> 00:12:22,520
If the lakehouse lands files with inconsistent types,

351
00:12:22,520 --> 00:12:26,120
then the warehouse consumes that inconsistency unless you reject it.

352
00:12:26,120 --> 00:12:28,920
If the semantic model gets built directly off raw tables

353
00:12:28,920 --> 00:12:30,120
because it's convenient,

354
00:12:30,120 --> 00:12:32,040
that convenience becomes a dependency graph.

355
00:12:32,040 --> 00:12:34,440
And once dashboards exist, nobody wants to touch the source

356
00:12:34,440 --> 00:12:36,200
because now you're breaking the business.

357
00:12:36,200 --> 00:12:39,400
So you end up with workspace conventions replacing contracts

358
00:12:39,400 --> 00:12:41,080
and shortcuts replacing ownership

359
00:12:41,080 --> 00:12:43,320
and "it refreshes" replacing correctness.

360
00:12:43,320 --> 00:12:45,640
That distinction matters because Fabric makes it easy

361
00:12:45,640 --> 00:12:47,400
to create the downstream artifacts

362
00:12:47,400 --> 00:12:49,720
before you've locked the upstream invariants.

363
00:12:49,720 --> 00:12:51,240
And when downstream exists first,

364
00:12:51,240 --> 00:12:53,080
the organization starts managing backwards,

365
00:12:53,080 --> 00:12:54,920
patch the semantic layer, patch the DAX,

366
00:12:54,920 --> 00:12:57,320
patch the report, patch the refresh schedule.

367
00:12:57,320 --> 00:12:59,080
The data layer becomes a landfill of

368
00:12:59,080 --> 00:13:01,400
"we'll clean it later." Later never comes.

369
00:13:01,400 --> 00:13:03,080
Now add capacity pricing to this

370
00:13:03,080 --> 00:13:05,400
because this is where Fabric gets brutally honest.

371
00:13:05,400 --> 00:13:07,080
In the old world, you paid per tool.

372
00:13:07,080 --> 00:13:09,560
You could hide inefficiency inside somebody else's bill,

373
00:13:09,560 --> 00:13:12,360
the warehouse bill, the Spark bill, the BI bill.

374
00:13:12,360 --> 00:13:13,880
Fabric makes the chaos shared.

375
00:13:13,880 --> 00:13:15,400
Everything draws from one meter.

376
00:13:15,400 --> 00:13:18,520
So the cost of ambiguity shows up as compute consumption.

377
00:13:18,520 --> 00:13:21,400
Spikes, throttling, degraded interactivity,

378
00:13:21,400 --> 00:13:22,680
refresh contention,

379
00:13:22,680 --> 00:13:26,040
and a monthly bill that suddenly has a personality.

380
00:13:26,040 --> 00:13:28,440
And leadership doesn't care that the SQL was only used

381
00:13:28,440 --> 00:13:29,720
for a small result set.

382
00:13:29,720 --> 00:13:31,400
The platform still paid for the scan.

383
00:13:31,400 --> 00:13:33,160
This is why Fabric becomes the first place

384
00:13:33,160 --> 00:13:35,240
many organizations experience cost incidents

385
00:13:35,240 --> 00:13:36,440
as their primary signal.

386
00:13:36,440 --> 00:13:39,960
Not outages, not failures. Cost: capacity saturation,

387
00:13:39,960 --> 00:13:41,960
performance degradation. The system functions,

388
00:13:41,960 --> 00:13:44,120
but it does so expensively and unpredictably,

389
00:13:44,120 --> 00:13:45,640
which is just another way of saying

390
00:13:45,640 --> 00:13:47,320
your assumptions are no longer free.

391
00:13:47,320 --> 00:13:49,240
You can't hope your way into stable spend,

392
00:13:49,240 --> 00:13:52,200
you can't train your way into deterministic access boundaries.

393
00:13:52,200 --> 00:13:54,760
You can't label your way into a schema contract.

394
00:13:54,760 --> 00:13:56,840
Fabric doesn't break. Your assumptions do.

395
00:13:56,840 --> 00:13:59,960
And once you see that, you realize why the platform feels

396
00:13:59,960 --> 00:14:01,400
like it's tightening around you

397
00:14:01,400 --> 00:14:03,720
because you're no longer amortizing mistakes over time.

398
00:14:03,720 --> 00:14:05,480
You're paying for them immediately at scale

399
00:14:05,480 --> 00:14:07,960
on a shared meter with downstream artifacts

400
00:14:07,960 --> 00:14:10,440
that create political resistance to fixing the source.

401
00:14:10,440 --> 00:14:11,640
So yes, Fabric is faster.

402
00:14:11,640 --> 00:14:14,440
But speed without gates is just higher frequency failure.

403
00:14:14,440 --> 00:14:17,240
And now Copilot becomes the acceleration layer on top of that,

404
00:14:17,240 --> 00:14:19,160
because it doesn't just speed up execution.

405
00:14:19,160 --> 00:14:21,880
It speeds up the creation of plausible, unbounded decisions.

406
00:14:22,040 --> 00:14:24,760
Copilot's real impact: completion over consequence.

407
00:14:24,760 --> 00:14:27,640
Copilot for Microsoft Fabric is the part everyone wants to talk about

408
00:14:27,640 --> 00:14:29,480
because it looks like free velocity.

409
00:14:29,480 --> 00:14:31,480
You type a sentence and it writes the pipeline,

410
00:14:31,480 --> 00:14:33,800
it writes the notebook cell, it writes the SQL,

411
00:14:33,800 --> 00:14:36,120
it writes the KQL, it suggests the transformation,

412
00:14:36,120 --> 00:14:39,080
it will even explain the error message it helped create.

413
00:14:39,080 --> 00:14:41,480
And the uncomfortable truth is that it does exactly

414
00:14:41,480 --> 00:14:42,680
what it's designed to do.

415
00:14:42,680 --> 00:14:44,280
It helps you complete a task,

416
00:14:44,280 --> 00:14:46,360
but Copilot doesn't live in your incident timeline.

417
00:14:46,360 --> 00:14:47,640
It doesn't live in your budget,

418
00:14:47,640 --> 00:14:49,560
it doesn't live in your audit findings.

419
00:14:49,560 --> 00:14:53,400
So when you ask, what does Copilot actually change in data engineering?

420
00:14:53,400 --> 00:14:55,400
The answer is not "it replaces engineers"

421
00:14:55,400 --> 00:14:57,640
or "it eliminates complexity."

422
00:14:57,640 --> 00:15:00,040
It changes the failure rate of ambiguity.

423
00:15:00,040 --> 00:15:02,280
Because the optimization target is completion,

424
00:15:02,280 --> 00:15:03,880
plausible output fast,

425
00:15:03,880 --> 00:15:05,000
not deterministic cost,

426
00:15:05,000 --> 00:15:06,280
not governance intent,

427
00:15:06,280 --> 00:15:07,800
not long term correctness.

428
00:15:07,800 --> 00:15:09,240
Copilot writes answers.

429
00:15:09,240 --> 00:15:11,320
Data engineering is about consequences.

430
00:15:11,320 --> 00:15:12,680
Here's what most teams miss.

431
00:15:12,680 --> 00:15:14,840
Copilot doesn't generate work in a vacuum.

432
00:15:14,840 --> 00:15:16,680
It generates work inside a platform

433
00:15:16,680 --> 00:15:19,480
where the easiest path is also the widest blast radius.

434
00:15:19,480 --> 00:15:21,560
Fabric gives you immediate downstream surfaces,

435
00:15:21,560 --> 00:15:24,280
Direct Lake, semantic models, reports, data agents.

436
00:15:24,280 --> 00:15:27,480
So whatever Copilot produces can become real fast.

437
00:15:27,480 --> 00:15:29,400
And once it becomes real, it becomes defended.

438
00:15:29,400 --> 00:15:32,440
People build slides on it, they send links, they make decisions,

439
00:15:32,440 --> 00:15:33,800
then you, the inheriting architect,

440
00:15:33,800 --> 00:15:35,480
get asked why the numbers changed.

441
00:15:35,480 --> 00:15:37,640
Copilot didn't break anything.

442
00:15:37,640 --> 00:15:39,480
It just made it easy to ship an assumption

443
00:15:39,480 --> 00:15:41,160
before you encoded it as a contract.

444
00:15:41,160 --> 00:15:42,920
And you can see the pattern across workloads.

445
00:15:42,920 --> 00:15:45,480
In a warehouse, Copilot will happily generate a query

446
00:15:45,480 --> 00:15:47,480
that returns the right-looking columns

447
00:15:47,480 --> 00:15:50,120
with the right-looking joins for the right-looking question.

448
00:15:50,120 --> 00:15:52,920
It will also happily do it without a bounded time predicate,

449
00:15:52,920 --> 00:15:54,600
without respecting partitioning intent,

450
00:15:54,600 --> 00:15:57,720
and without any awareness of what your capacity meter will do

451
00:15:57,720 --> 00:16:00,600
when someone runs it at 9 a.m., alongside three refreshes

452
00:16:00,600 --> 00:16:03,160
and 10 other analysts trying to just check something.

453
00:16:03,160 --> 00:16:05,320
In a notebook, it will generate transformations

454
00:16:05,320 --> 00:16:07,000
that look reasonable and run.

455
00:16:07,000 --> 00:16:09,880
It won't stop and ask you if the lakehouse is schema-on-read

456
00:16:09,880 --> 00:16:11,960
and therefore silently accepting drift.

457
00:16:11,960 --> 00:16:13,720
It won't insist on validation gates.

458
00:16:13,720 --> 00:16:16,680
It won't create quarantine tables unless you tell it to.

459
00:16:16,680 --> 00:16:19,000
And if you don't tell it to, you don't have quality enforcement.

460
00:16:19,000 --> 00:16:21,000
You have optimism with a refresh schedule.

461
00:16:21,000 --> 00:16:22,680
In Data Factory, it will generate pipelines

462
00:16:22,680 --> 00:16:24,520
that connect and move data.

463
00:16:24,520 --> 00:16:26,840
It won't force you to define ownership boundaries

464
00:16:26,840 --> 00:16:28,040
or consumption contracts.

465
00:16:28,040 --> 00:16:31,560
It will create a path, and paths at scale become dependencies.

466
00:16:31,560 --> 00:16:32,920
And then there are data agents.

467
00:16:32,920 --> 00:16:34,680
This is where the illusion gets dangerous.

468
00:16:34,680 --> 00:16:36,680
Because now the natural language layer is pointed

469
00:16:36,680 --> 00:16:37,800
at your models and tables

470
00:16:37,800 --> 00:16:40,680
and people interpret conversational access as governed access.

471
00:16:40,680 --> 00:16:43,800
But the agent can only operate within what the caller can already see.

472
00:16:43,800 --> 00:16:45,880
That means Copilot doesn't inherit your intent.

473
00:16:45,880 --> 00:16:47,560
It inherits your permissions.

474
00:16:47,560 --> 00:16:48,920
If your permissions are broad

475
00:16:48,920 --> 00:16:51,960
because workspace roles became the whole security strategy,

476
00:16:51,960 --> 00:16:53,640
Copilot becomes a really efficient way

477
00:16:53,640 --> 00:16:55,720
to explore things you didn't mean to expose.

478
00:16:55,720 --> 00:16:58,200
This is why "it worked" becomes "it's correct"

479
00:16:58,200 --> 00:17:00,280
in Fabric estates that lean on Copilot.

480
00:17:00,280 --> 00:17:02,040
The result renders, the dashboard refreshes,

481
00:17:02,040 --> 00:17:04,040
the agent answers, nobody gets an error.

482
00:17:04,040 --> 00:17:07,560
And in modern organizations, no error is treated as proof of correctness.

483
00:17:07,560 --> 00:17:10,200
It isn't. Copilot makes two kinds of teams faster.

484
00:17:10,200 --> 00:17:11,960
The teams with enforcement become faster

485
00:17:11,960 --> 00:17:14,680
at implementing decisions they already made deliberately.

486
00:17:14,680 --> 00:17:16,760
Teams without enforcement become faster

487
00:17:16,760 --> 00:17:19,080
at producing artifacts that look like decisions.

488
00:17:19,080 --> 00:17:20,200
Those are not the same thing.

489
00:17:20,200 --> 00:17:21,880
So the real governance limitation

490
00:17:21,880 --> 00:17:23,400
isn't that Copilot hallucinates.

491
00:17:23,400 --> 00:17:27,240
The deeper limitation is that Copilot has no native concept of your invariants

492
00:17:27,240 --> 00:17:29,160
unless you encode them into the system.

493
00:17:29,160 --> 00:17:31,880
Schemas, views, procedures, roles,

494
00:17:31,880 --> 00:17:33,240
CI/CD gates,

495
00:17:33,240 --> 00:17:36,920
and acceptance criteria like execution plans and violation counts.

496
00:17:36,920 --> 00:17:38,760
If you don't enforce those surfaces,

497
00:17:38,760 --> 00:17:40,840
Copilot doesn't accelerate engineering.

498
00:17:40,840 --> 00:17:42,200
It accelerates entropy.

499
00:17:42,200 --> 00:17:45,480
And now we can stop theorizing because the failure modes aren't abstract.

500
00:17:45,480 --> 00:17:48,600
They show up as cost incidents, correctness incidents,

501
00:17:48,600 --> 00:17:51,080
and access incidents often with no obvious break.

502
00:17:51,080 --> 00:17:54,680
Case one, setup: the cost drift that looked like a ghost.

503
00:17:54,680 --> 00:17:58,520
The first Fabric estate failure that shows up for most organizations

504
00:17:58,520 --> 00:18:00,200
isn't a security breach,

505
00:18:00,200 --> 00:18:02,440
and it isn't a data quality scandal.

506
00:18:02,440 --> 00:18:03,480
It's a bill.

507
00:18:03,480 --> 00:18:06,360
Or more accurately, a capacity graph that looks haunted.

508
00:18:06,360 --> 00:18:08,520
Here's the symptom pattern: you're in a steady state.

509
00:18:08,520 --> 00:18:09,800
The business feels good.

510
00:18:09,800 --> 00:18:11,000
Dashboards refresh.

511
00:18:11,000 --> 00:18:12,200
Nobody deployed anything.

512
00:18:12,200 --> 00:18:13,800
There wasn't a release weekend.

513
00:18:13,800 --> 00:18:15,000
No big new data set.

514
00:18:15,000 --> 00:18:17,000
No "we migrated to a new model."

515
00:18:17,000 --> 00:18:19,240
And then the capacity metrics spike hard,

516
00:18:19,240 --> 00:18:21,080
usually right in the middle of the workday,

517
00:18:21,080 --> 00:18:24,440
and usually in bursts that don't line up with your pipeline schedule.

518
00:18:24,440 --> 00:18:26,600
Leadership asks the worst question in the world.

519
00:18:26,600 --> 00:18:27,560
What changed?

520
00:18:27,560 --> 00:18:30,680
And you end up in that familiar meeting where five people say "nothing"

521
00:18:30,680 --> 00:18:32,120
with total sincerity.

522
00:18:32,120 --> 00:18:34,200
Because from their perspective, nothing changed.

523
00:18:34,200 --> 00:18:35,320
The reports still load.

524
00:18:35,320 --> 00:18:36,520
The refreshes still complete.

525
00:18:36,520 --> 00:18:38,600
The pipeline success rate is still green.

526
00:18:38,600 --> 00:18:41,560
So the organization interprets the spike as billing weirdness

527
00:18:41,560 --> 00:18:43,080
or a Fabric capacity glitch,

528
00:18:43,080 --> 00:18:45,320
or Microsoft did something in the service.

529
00:18:45,320 --> 00:18:45,960
Plot twist.

530
00:18:45,960 --> 00:18:47,400
It's almost never a platform ghost.

531
00:18:47,400 --> 00:18:48,920
It's a query-shaped problem.

532
00:18:48,920 --> 00:18:51,160
Fabric just made cost the first incident signal

533
00:18:51,160 --> 00:18:53,400
because cost is the only thing the platform can't hide

534
00:18:53,400 --> 00:18:55,320
when everything shares the same meter.

535
00:18:55,320 --> 00:18:56,840
You can mask correctness for weeks.

536
00:18:56,840 --> 00:18:58,520
You can mask access drift for months.

537
00:18:58,520 --> 00:19:00,200
But you cannot mask capacity pressure

538
00:19:00,200 --> 00:19:01,560
when 10 people run wide,

539
00:19:01,560 --> 00:19:03,400
unbounded queries at the same time

540
00:19:03,400 --> 00:19:05,080
the semantic model is refreshing.

541
00:19:05,080 --> 00:19:06,680
So in Fabric, the first question isn't,

542
00:19:06,680 --> 00:19:07,560
did it fail?

543
00:19:07,560 --> 00:19:09,240
It's what did it do to the meter?

544
00:19:09,240 --> 00:19:10,600
And that's why this case matters.

545
00:19:10,600 --> 00:19:11,960
It forces a new discipline.

546
00:19:11,960 --> 00:19:14,600
You have to treat cost as an engineered property,

547
00:19:14,600 --> 00:19:16,200
not as an after-the-fact report.

548
00:19:16,200 --> 00:19:18,680
If cost is undefined, governance is imaginary.

549
00:19:18,680 --> 00:19:21,080
Because a platform that charges you for runtime

550
00:19:21,080 --> 00:19:23,400
doesn't care whether your result set was small.

551
00:19:23,400 --> 00:19:25,400
It charges you for what you asked the engine to do.

552
00:19:25,400 --> 00:19:27,880
Now, to debug this without lying to yourself,

553
00:19:27,880 --> 00:19:29,720
you need ground-truth artifacts.

554
00:19:29,720 --> 00:19:32,600
Not screenshots, not guesses, not "it feels slower."

555
00:19:32,600 --> 00:19:33,880
Ground truth means three things.

556
00:19:33,880 --> 00:19:35,160
One, execution plans.

557
00:19:35,160 --> 00:19:36,120
Not the query text.

558
00:19:36,120 --> 00:19:36,920
The plan.

559
00:19:36,920 --> 00:19:37,880
The join types.

560
00:19:37,880 --> 00:19:38,520
The scans.

561
00:19:38,520 --> 00:19:39,160
The sorts.

562
00:19:39,160 --> 00:19:39,800
The spills.

563
00:19:39,800 --> 00:19:40,760
The operator costs.

564
00:19:40,760 --> 00:19:42,280
The shape of the work.

565
00:19:42,280 --> 00:19:44,760
Two, scanned rows versus returned rows.

566
00:19:44,760 --> 00:19:47,160
That ratio tells you whether you're paying for a seek

567
00:19:47,160 --> 00:19:48,520
or funding a full-world scan

568
00:19:48,520 --> 00:19:50,600
to retrieve 15 rows for a visual.

569
00:19:50,600 --> 00:19:53,880
Three, Fabric capacity metrics aligned to time windows.

570
00:19:53,880 --> 00:19:55,800
You correlate the spike with the query window

571
00:19:55,800 --> 00:19:57,000
and the refresh window.

572
00:19:57,000 --> 00:19:58,120
You don't start with blame.

573
00:19:58,120 --> 00:19:59,480
You start with correlation.
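The scanned-versus-returned ratio from item two can be sketched in a few lines of Python. This is a minimal illustration, not a Fabric API: the `QueryStats` record, the `flag_wide_scans` helper, the threshold of 1000, and the sample numbers are all hypothetical, and you would populate the row counts from whatever query telemetry your estate actually exposes.

```python
from dataclasses import dataclass


@dataclass
class QueryStats:
    """Hypothetical per-query telemetry record (not a Fabric type)."""
    query_id: str
    scanned_rows: int
    returned_rows: int


def scan_ratio(stats: QueryStats) -> float:
    # Treat zero returned rows as one so the ratio stays finite.
    return stats.scanned_rows / max(stats.returned_rows, 1)


def flag_wide_scans(all_stats, max_ratio=1000.0):
    # max_ratio is an assumed review threshold; tune it per workload.
    return [s.query_id for s in all_stats if scan_ratio(s) > max_ratio]


# Illustrative numbers only.
samples = [
    QueryStats("visual_top_15", scanned_rows=4_000_000, returned_rows=15),
    QueryStats("bounded_lookup", scanned_rows=1_200, returned_rows=40),
]
flagged = flag_wide_scans(samples)
```

The point of the ratio is that it ignores how small the result looks and measures how much work was done to produce it.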

574
00:19:59,480 --> 00:20:00,280
And once you do that,

575
00:20:00,280 --> 00:20:02,360
you'll notice the same pattern over and over.

576
00:20:02,360 --> 00:20:04,760
Interactive bursts plus refresh concurrency

577
00:20:04,760 --> 00:20:06,520
plus one or two helpful queries

578
00:20:06,520 --> 00:20:08,120
that look innocent in a chat window

579
00:20:08,120 --> 00:20:10,680
and behave like a denial of wallet attack in the engine.

580
00:20:10,680 --> 00:20:11,960
That's the setup.

581
00:20:11,960 --> 00:20:13,320
Now we can talk root cause

582
00:20:13,320 --> 00:20:15,160
because the plan will tell on you every time.

583
00:20:15,160 --> 00:20:18,040
Case one, root cause: AI SQL that scans the world.

584
00:20:18,040 --> 00:20:20,440
The root cause usually isn't someone ran a query.

585
00:20:20,440 --> 00:20:22,040
People always ran queries.

586
00:20:22,040 --> 00:20:24,680
The root cause is that Copilot made it socially acceptable

587
00:20:24,680 --> 00:20:27,160
to run warehouse-scale SQL with no discipline

588
00:20:27,160 --> 00:20:29,240
because it looked professional enough to trust

589
00:20:29,240 --> 00:20:31,000
and fast enough to repeat.

590
00:20:31,000 --> 00:20:33,240
So what does the pattern look like in practice?

591
00:20:33,240 --> 00:20:34,600
First, non-sargable predicates.

592
00:20:34,600 --> 00:20:36,440
Copilot loves convenience syntax.

593
00:20:36,440 --> 00:20:39,320
It'll give you things like filtering on a computed expression,

594
00:20:39,320 --> 00:20:41,400
wrapping date columns in functions,

595
00:20:41,400 --> 00:20:43,320
building conditions that read well to humans

596
00:20:43,320 --> 00:20:45,160
and destroy index usage for engines.

597
00:20:45,160 --> 00:20:46,840
The user sees a clean WHERE clause.

598
00:20:46,840 --> 00:20:49,000
The optimizer sees "cool, I can't seek."

599
00:20:49,000 --> 00:20:50,120
And so it scans.

600
00:20:50,120 --> 00:20:52,760
And on a big warehouse table, scan is not a detail.

601
00:20:52,760 --> 00:20:53,480
It's the bill.
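A small, self-contained way to see the seek-versus-scan difference is SQLite's EXPLAIN QUERY PLAN. This is a stand-in, not the Fabric Warehouse engine, and the `orders` table, the `idx_orders_date` index, and the `plan_details` helper are assumptions made up for the illustration. The mechanism is the general one described above: wrap the date column in a function and the engine can no longer use the ordering of that column.

```python
import sqlite3


def plan_details(conn, sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
)
conn.execute("CREATE INDEX idx_orders_date ON orders (order_date)")

# Non-sargable: the date column is wrapped in a function.
wrapped = "SELECT id, amount FROM orders WHERE strftime('%Y', order_date) = '2025'"

# Sargable: the bare column is compared against a bounded range.
bounded = (
    "SELECT id, amount FROM orders "
    "WHERE order_date >= '2025-01-01' AND order_date < '2026-01-01'"
)

wrapped_plan = plan_details(conn, wrapped)   # reports a SCAN of the table
bounded_plan = plan_details(conn, bounded)   # reports a SEARCH using the index

for label, plan in (("wrapped", wrapped_plan), ("bounded", bounded_plan)):
    print(label, plan)
```

Both queries answer the same question for a human reader. Only the plan shows that one of them reads everything.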

602
00:20:53,480 --> 00:20:56,600
Second, missing time filters.

603
00:20:56,600 --> 00:20:58,760
This one is almost comical

604
00:20:58,760 --> 00:21:01,560
because the question people ask is always time bound.

605
00:21:01,560 --> 00:21:03,640
Last month, this quarter,

606
00:21:03,640 --> 00:21:05,240
since the campaign started.

607
00:21:05,240 --> 00:21:07,080
But Copilot tends to generate queries

608
00:21:07,080 --> 00:21:10,520
that are semantically plausible without being operationally bounded.

609
00:21:10,520 --> 00:21:11,800
It'll happily join fact tables

610
00:21:11,800 --> 00:21:14,120
and return the right columns without enforcing a window.

611
00:21:14,120 --> 00:21:15,800
The result set can still be tiny

612
00:21:15,800 --> 00:21:18,280
because the final visual only needs 10 rows.

613
00:21:18,280 --> 00:21:19,880
But the engine had to read everything

614
00:21:19,880 --> 00:21:21,160
to discover those 10 rows.

615
00:21:21,160 --> 00:21:24,120
Third, SELECT * against large tables.

616
00:21:24,120 --> 00:21:26,840
It happens because Copilot optimizes for completion.

617
00:21:26,840 --> 00:21:28,520
SELECT *, return something.

618
00:21:28,520 --> 00:21:30,120
The user doesn't have to think.

619
00:21:30,120 --> 00:21:31,640
And once you SELECT *

620
00:21:31,640 --> 00:21:35,320
on a warehouse table, you've made two decisions you didn't intend to make.

621
00:21:35,320 --> 00:21:37,480
You've committed to a wider IO footprint

622
00:21:37,480 --> 00:21:38,920
and you've committed to schema drift

623
00:21:38,920 --> 00:21:42,040
because now every new column is automatically in scope

624
00:21:42,040 --> 00:21:44,040
for every downstream consumer.

625
00:21:44,040 --> 00:21:45,400
You didn't just write a query,

626
00:21:45,400 --> 00:21:47,320
you wrote a contract you never reviewed.

627
00:21:47,320 --> 00:21:49,000
And here's where people get fooled.

628
00:21:49,000 --> 00:21:50,520
But the report is simple.

629
00:21:50,520 --> 00:21:52,040
But the result is small.

630
00:21:52,040 --> 00:21:53,480
But it finishes quickly.

631
00:21:53,480 --> 00:21:54,600
None of that matters.

632
00:21:54,600 --> 00:21:56,600
What matters is what the plan did.

633
00:21:56,600 --> 00:21:59,240
In execution plans, the tell is consistent.

634
00:21:59,240 --> 00:22:01,400
Large scans feeding hash joins.

635
00:22:01,400 --> 00:22:02,680
Followed by big sorts.

636
00:22:02,680 --> 00:22:03,800
And then you see spills.

637
00:22:03,800 --> 00:22:05,720
Spills are the platform politely admitting

638
00:22:05,720 --> 00:22:09,000
it ran out of memory and decided to rent more time from your capacity.

639
00:22:09,000 --> 00:22:10,360
That's not a performance issue.

640
00:22:10,360 --> 00:22:12,040
That's a cost policy failure.

641
00:22:12,040 --> 00:22:14,760
You can also see it in scanned rows versus returned rows.

642
00:22:14,760 --> 00:22:16,920
When that ratio is absurd,

643
00:22:16,920 --> 00:22:19,160
millions scanned, dozens returned,

644
00:22:19,160 --> 00:22:21,320
you're not looking at someone exploring.

645
00:22:21,320 --> 00:22:23,800
You're looking at a lack of bounded query surfaces.

646
00:22:23,800 --> 00:22:25,800
Now add Fabric's abstraction layer

647
00:22:25,800 --> 00:22:27,480
and you get the second illusion.

648
00:22:27,480 --> 00:22:30,040
Semantic model refreshes still succeed.

649
00:22:30,040 --> 00:22:31,800
Direct Lake still renders visuals.

650
00:22:31,800 --> 00:22:33,640
The user's experience stays smooth enough

651
00:22:33,640 --> 00:22:35,480
that nobody treats it as an incident.

652
00:22:35,480 --> 00:22:38,360
The platform eats the cost, the dashboards keep refreshing

653
00:22:38,360 --> 00:22:40,680
and the only thing that screams is the capacity meter.

654
00:22:40,680 --> 00:22:43,240
So the spikes correlate to interactive bursts

655
00:22:43,240 --> 00:22:45,240
and refresh concurrency, not deployments.

656
00:22:45,240 --> 00:22:46,520
That's why it feels like a ghost.

657
00:22:46,520 --> 00:22:47,800
Nothing in Git changed.

658
00:22:47,800 --> 00:22:50,360
But user behavior changed, and Copilot made that behavior

659
00:22:50,360 --> 00:22:51,640
easier to produce at scale.

660
00:22:51,640 --> 00:22:53,320
This is also why this problem

661
00:22:53,320 --> 00:22:54,920
shows up first in Fabric estates

662
00:22:54,920 --> 00:22:57,240
that democratize access early.

663
00:22:57,240 --> 00:22:58,600
You give broad read permissions

664
00:22:58,600 --> 00:22:59,880
because you want adoption.

665
00:22:59,880 --> 00:23:01,560
You don't lock down query surfaces

666
00:23:01,560 --> 00:23:03,560
because you don't want to slow people down.

667
00:23:03,560 --> 00:23:05,480
You let analysts query raw tables

668
00:23:05,480 --> 00:23:07,160
because it's just read only.

669
00:23:07,160 --> 00:23:08,520
Then Copilot shows up

670
00:23:08,520 --> 00:23:10,360
and now the casual analyst can generate

671
00:23:10,360 --> 00:23:12,200
warehouse-grade SQL in 10 seconds

672
00:23:12,200 --> 00:23:14,840
without understanding sargability, partition elimination,

673
00:23:14,840 --> 00:23:17,240
join strategy, or the difference between works

674
00:23:17,240 --> 00:23:19,960
and works efficiently under concurrency.

675
00:23:19,960 --> 00:23:21,880
And before anyone gets comfortable blaming Copilot,

676
00:23:21,880 --> 00:23:23,800
remember Copilot isn't the villain.

677
00:23:23,800 --> 00:23:25,320
Copilot is the amplifier.

678
00:23:25,320 --> 00:23:27,560
The actual design omission is letting raw tables

679
00:23:27,560 --> 00:23:29,320
become the consumption interface.

680
00:23:29,320 --> 00:23:31,240
If raw tables are queryable by default,

681
00:23:31,240 --> 00:23:32,920
then your cost model is probabilistic.

682
00:23:32,920 --> 00:23:35,880
You're gambling that every consumer will behave

683
00:23:35,880 --> 00:23:38,360
like a senior engineer with an execution plan open.

684
00:23:38,360 --> 00:23:38,920
They won't.

685
00:23:38,920 --> 00:23:40,200
So here's the anchor line that matters

686
00:23:40,200 --> 00:23:41,720
for the rest of the episode.

687
00:23:41,720 --> 00:23:43,400
If it doesn't show up in a plan,

688
00:23:43,400 --> 00:23:45,800
a cost report, or a violation count,

689
00:23:45,800 --> 00:23:47,640
it's not governance, it's hope.

690
00:23:47,640 --> 00:23:49,240
In this case, the plan is the confession.

691
00:23:49,240 --> 00:23:50,680
And it always confesses.

692
00:23:50,680 --> 00:23:51,960
Case one, fix.

693
00:23:51,960 --> 00:23:53,480
Views as query surface.

694
00:23:53,480 --> 00:23:55,400
Plans as acceptance criteria.

695
00:23:55,400 --> 00:23:58,200
So the fix isn't "tell people to write better SQL."

696
00:23:58,200 --> 00:24:00,600
That's education, education decays.

697
00:24:00,600 --> 00:24:02,280
The fix is to change the query surface

698
00:24:02,280 --> 00:24:04,840
so the platform can't be helpfully expensive by default.

699
00:24:04,840 --> 00:24:08,440
In other words, you stop letting raw tables be a public API.

700
00:24:08,440 --> 00:24:11,160
You make the warehouse behave like a system with boundaries,

701
00:24:11,160 --> 00:24:12,840
not a playground with billing.

702
00:24:12,840 --> 00:24:15,560
The first enforcement move is simple and unpopular.

703
00:24:15,560 --> 00:24:17,080
Views and stored procedures

704
00:24:17,080 --> 00:24:19,320
become the only supported consumption interface.

705
00:24:19,320 --> 00:24:21,800
Not preferred, not recommended, the only one.

706
00:24:21,800 --> 00:24:23,640
If analysts, reports, data agents,

707
00:24:23,640 --> 00:24:26,520
and ad hoc explorers can hit raw fact tables directly,

708
00:24:26,520 --> 00:24:29,080
then you've already accepted that cost is a shared gamble.

709
00:24:29,080 --> 00:24:30,360
You didn't implement governance.

710
00:24:30,360 --> 00:24:32,440
You implemented hope with a refresh schedule.

711
00:24:32,440 --> 00:24:33,480
So you lock it down.

712
00:24:33,480 --> 00:24:36,760
You create a serving schema, call it something boring like serving,

713
00:24:36,760 --> 00:24:37,880
or consume.

714
00:24:37,880 --> 00:24:40,440
And that is where every queryable object lives.

715
00:24:40,440 --> 00:24:43,480
Views expose only the columns you intend to support,

716
00:24:43,480 --> 00:24:45,800
and only with filters you intend to pay for.

717
00:24:45,800 --> 00:24:49,320
Stored procedures become the path for parameterized access.

718
00:24:49,320 --> 00:24:51,400
Date windows, entity scopes,

719
00:24:51,400 --> 00:24:53,400
and "give me the last N days"

720
00:24:53,400 --> 00:24:56,440
patterns that don't require every consumer to rediscover

721
00:24:56,440 --> 00:24:58,120
sargability the hard way.

722
00:24:58,120 --> 00:25:00,120
That's also where you enforce naming discipline.

723
00:25:00,120 --> 00:25:03,160
You don't expose fact_sales_total_23_v2_final_final.

724
00:25:03,160 --> 00:25:06,760
You expose sales.fact_sales through a view that hides the underlying mess

725
00:25:06,760 --> 00:25:08,120
until you refactor it.

726
00:25:08,120 --> 00:25:09,720
The consumer gets stability.

727
00:25:09,720 --> 00:25:14,200
You get freedom to change internals without a political incident every time a column gets renamed.
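Here is a minimal sketch of the serving-surface idea, using SQLite in place of a Fabric warehouse. SQLite has no schemas, so the hypothetical view name `serving_sales` stands in for a serving schema, and a plain Python function stands in for a stored procedure. The table, column names, and sample rows are all invented for the example.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_fact_sales "
    "(sale_id INTEGER, sale_date TEXT, region TEXT, amount REAL)"
)
# The view names its columns, so it is the contract consumers see.
conn.execute(
    "CREATE VIEW serving_sales AS "
    "SELECT sale_id, sale_date, amount FROM raw_fact_sales"
)
conn.executemany(
    "INSERT INTO raw_fact_sales VALUES (?, ?, ?, ?)",
    [(1, "2026-01-02", "EU", 10.0),
     (2, "2026-01-09", "US", 20.0),
     (3, "2026-01-10", "EU", 30.0)],
)


def sales_last_n_days(conn, as_of: date, n_days: int):
    """Parameterized access: as_of plus the n_days before it, always bounded."""
    start = (as_of - timedelta(days=n_days)).isoformat()
    end = (as_of + timedelta(days=1)).isoformat()
    return conn.execute(
        "SELECT sale_id, sale_date, amount FROM serving_sales "
        "WHERE sale_date >= ? AND sale_date < ? ORDER BY sale_id",
        (start, end),
    ).fetchall()


def column_names(conn, sql):
    return [d[0] for d in conn.execute(sql).description]


# An internal column appears on the raw table after the view was published.
conn.execute("ALTER TABLE raw_fact_sales ADD COLUMN internal_margin REAL")

raw_cols = column_names(conn, "SELECT * FROM raw_fact_sales")
view_cols = column_names(conn, "SELECT * FROM serving_sales")
recent = sales_last_n_days(conn, date(2026, 1, 10), 3)
```

A consumer reading the raw table with SELECT * picks up `internal_margin` automatically. A consumer reading the view does not, which is the stability the view is there to provide.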

728
00:25:14,200 --> 00:25:16,760
Second, execution plans become acceptance criteria,

729
00:25:16,760 --> 00:25:20,040
not for every query in the estate, don't be theatrical.

730
00:25:20,040 --> 00:25:22,680
For critical paths: semantic model refresh queries,

731
00:25:22,680 --> 00:25:27,560
top interactive report queries, and anything that hits large fact tables or runs under concurrency,

732
00:25:27,560 --> 00:25:29,880
you treat the plan the same way you treat a security review,

733
00:25:29,880 --> 00:25:31,560
it's a gate, not a suggestion.

734
00:25:31,560 --> 00:25:34,440
Here are the non-negotiables you enforce in review.

735
00:25:34,440 --> 00:25:35,560
Bounded predicates.

736
00:25:35,560 --> 00:25:37,880
If the query can read the entire table, it will.

737
00:25:37,880 --> 00:25:41,560
So you require time windows, partition elimination, and parameterization.

738
00:25:41,560 --> 00:25:43,720
You ban non-sargable predicates in those paths.

739
00:25:43,720 --> 00:25:45,160
You also ban SELECT *.

740
00:25:45,160 --> 00:25:47,960
If you don't name the columns, you don't understand the contract you're creating,

741
00:25:47,960 --> 00:25:49,800
and you're also guaranteeing drift.

742
00:25:49,800 --> 00:25:52,600
And you explicitly look for the usual engine taxes.

743
00:25:52,600 --> 00:25:56,920
Scans where you expected seeks, hash joins where you expected a narrow data set,

744
00:25:56,920 --> 00:26:00,200
big sorts, and spills. Spills are treated as a defect.

745
00:26:00,200 --> 00:26:02,360
Not a performance defect, a governance defect,

746
00:26:02,360 --> 00:26:07,080
because a spilled query under concurrency is a cost incident waiting for a calendar invite.
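As a rough illustration of those review rules, here is a minimal Python sketch of a pre-review lint. The function name, the regular expressions, and the time_column parameter are all assumptions made for this example. It is a crude text heuristic that can catch the obvious cases before a human reads the actual execution plan; it does not replace the plan review.

```python
import re

# Hypothetical lint rules for illustration; real review still reads the plan.
SELECT_STAR = re.compile(r"\bselect\s+(?:\w+\.)?\*", re.IGNORECASE)
WHERE_CLAUSE = re.compile(r"\bwhere\b", re.IGNORECASE)
# A function wrapped around the left side of a comparison, for example
# WHERE YEAR(order_date) = 2025, usually defeats seeks and elimination.
FUNC_ON_COLUMN = re.compile(
    r"\b(?:where|and|or)\s+\w+\s*\([^)]*\)\s*(?:=|<|>|<=|>=|like)",
    re.IGNORECASE,
)


def review_query(sql: str, time_column: str) -> list[str]:
    """Return a list of findings; an empty list means the lint found nothing."""
    findings = []
    if SELECT_STAR.search(sql):
        findings.append("select-star")
    where = WHERE_CLAUSE.search(sql)
    predicate = sql[where.end():] if where else ""
    if time_column.lower() not in predicate.lower():
        findings.append("unbounded: no predicate on " + time_column)
    if FUNC_ON_COLUMN.search(sql):
        findings.append("non-sargable")
    return findings


bad = "SELECT * FROM sales.FactSales WHERE YEAR(order_date) = 2025"
good = (
    "SELECT order_id, amount FROM sales.FactSales "
    "WHERE order_date >= @start AND order_date < @end"
)
print(review_query(bad, "order_date"))
print(review_query(good, "order_date"))
```

A check like this could run in a pull request pipeline so that the human reviewer only spends time on plans for queries that already pass the basics.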

747
00:26:07,080 --> 00:26:10,440
Third, you turn capacity metrics into a feedback loop,

748
00:26:10,440 --> 00:26:12,440
not a post-mortem artifact.

749
00:26:12,440 --> 00:26:14,600
You establish a stable cost baseline.

750
00:26:14,600 --> 00:26:17,800
What normal looks like by time of day, by workload, by refresh window,

751
00:26:17,800 --> 00:26:21,400
then you tag and track deviations to query patterns, not to teams.

752
00:26:21,400 --> 00:26:22,520
The goal isn't blame.

753
00:26:22,520 --> 00:26:24,440
The goal is making cost predictable.

754
00:26:24,440 --> 00:26:27,080
When a spike happens, you correlate which query shapes ran,

755
00:26:27,080 --> 00:26:31,160
which objects they hit, which concurrency windows existed, which refreshes overlapped.

756
00:26:31,160 --> 00:26:33,960
You build a library of "this pattern causes that spike."

757
00:26:33,960 --> 00:26:35,400
Then you enforce that library.
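The baseline-and-deviation loop described here can be sketched in a few lines of Python. Everything in this example is an assumption for illustration: the tuple layouts, the "capacity units" numbers, the pattern labels, and the three-sigma threshold. It does not call any real Fabric metrics API; it only shows the shape of the logic, which is to bucket normal usage by hour and workload and then attribute outliers to a query pattern rather than a team.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_baseline(samples):
    """samples: iterable of (hour_of_day, workload, capacity_units)."""
    buckets = defaultdict(list)
    for hour, workload, units in samples:
        buckets[(hour, workload)].append(units)
    # Store mean and population standard deviation per (hour, workload).
    return {key: (mean(vals), pstdev(vals)) for key, vals in buckets.items()}


def flag_deviations(baseline, observations, threshold=3.0):
    """observations: iterable of (hour, workload, query_pattern, units)."""
    flagged = []
    for hour, workload, pattern, units in observations:
        if (hour, workload) not in baseline:
            continue  # no baseline yet for this window, so nothing to compare
        avg, spread = baseline[(hour, workload)]
        limit = avg + threshold * max(spread, 1e-9)
        if units > limit:
            flagged.append((pattern, hour, workload, units))
    return flagged


history = [(9, "refresh", 100), (9, "refresh", 110), (9, "refresh", 90)]
today = [
    (9, "refresh", "unbounded-scan-fact-sales", 400),
    (9, "refresh", "windowed-scan", 105),
]
baseline = build_baseline(history)
print(flag_deviations(baseline, today))
```

Keying the flagged entries by pattern is what lets the library of known causes grow over time without turning into a blame list.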

758
00:26:35,400 --> 00:26:37,240
Over time, something weird happens.

759
00:26:37,240 --> 00:26:39,080
The random alerts stop being random.

760
00:26:39,080 --> 00:26:40,920
The platform becomes legible again.

761
00:26:40,920 --> 00:26:44,520
And once the cost surface becomes predictable, everything else gets easier.

762
00:26:44,520 --> 00:26:48,840
Because you can now talk about governance like an engineered property with artifacts.

763
00:26:48,840 --> 00:26:51,160
Not an aspirational poster in a wiki.

764
00:26:51,960 --> 00:26:55,480
That's the actual outcome you want from this fix, not cheaper queries,

765
00:26:55,480 --> 00:26:56,600
deterministic behavior.

766
00:26:56,600 --> 00:27:01,160
Now we can move to the next failure mode, because once cost is controlled,

767
00:27:01,160 --> 00:27:02,680
the next thing that breaks is truth.

768
00:27:02,680 --> 00:27:04,040
Case 2.

769
00:27:04,040 --> 00:27:07,480
Setup: lakehouse-to-warehouse contract collapse.

770
00:27:07,480 --> 00:27:10,280
Once you stop bleeding money, you notice the next failure mode.

771
00:27:10,280 --> 00:27:11,880
Nobody agrees on what's true.

772
00:27:11,880 --> 00:27:14,360
This one shows up as "Power BI is wrong,"

773
00:27:14,360 --> 00:27:17,080
which is always a useful sentence because it's never specific.

774
00:27:17,080 --> 00:27:21,000
What they mean is two reports disagree, two teams have two official KPIs

775
00:27:21,000 --> 00:27:25,160
and the executive dashboard mysteriously changes when someone improves a model.

776
00:27:25,160 --> 00:27:26,680
The platform keeps refreshing.

777
00:27:26,680 --> 00:27:28,360
Nothing is red. The argument still happens.

778
00:27:28,360 --> 00:27:31,400
And it happens because the contract never existed.

779
00:27:31,400 --> 00:27:34,040
In Fabric estates, this usually starts the same way.

780
00:27:34,040 --> 00:27:36,680
A team lands data into a lakehouse because it's fast,

781
00:27:36,680 --> 00:27:38,040
because notebooks are right there,

782
00:27:38,040 --> 00:27:40,120
because Delta tables feel like progress,

783
00:27:40,120 --> 00:27:42,520
and because Direct Lake makes Power BI light up

784
00:27:42,520 --> 00:27:45,400
without the traditional import and warehouse ceremony.

785
00:27:45,400 --> 00:27:47,320
So the early win is real: data shows up,

786
00:27:47,320 --> 00:27:49,160
visuals render, people feel unblocked.

787
00:27:49,160 --> 00:27:51,400
But the lakehouse mental model is schema-on-read.

788
00:27:51,400 --> 00:27:52,600
It's permissive by design.

789
00:27:52,600 --> 00:27:55,080
It will accept drift unless you force it not to.

790
00:27:55,080 --> 00:27:57,640
Files arrive with slightly different column types.

791
00:27:57,640 --> 00:28:01,000
New columns appear because the source system team just added a flag.

792
00:28:01,000 --> 00:28:03,160
Strings show up where integers used to be.

793
00:28:03,160 --> 00:28:05,400
Timestamps move formats, and everyone shrugs

794
00:28:05,400 --> 00:28:08,280
because the notebook still runs after one more cast.

795
00:28:08,280 --> 00:28:10,520
Then the organization does the next predictable thing.

796
00:28:10,520 --> 00:28:13,000
It mirrors that lakehouse shape into a warehouse

797
00:28:13,000 --> 00:28:14,680
or builds a warehouse on top of it,

798
00:28:14,680 --> 00:28:17,000
expecting the warehouse to provide structure.

799
00:28:17,000 --> 00:28:20,040
But the warehouse can't enforce structure you never defined.

800
00:28:20,040 --> 00:28:23,160
If the ingestion pipeline treats whatever showed up as acceptable,

801
00:28:23,160 --> 00:28:25,480
the warehouse layer becomes a reflection of ambiguity,

802
00:28:25,480 --> 00:28:26,920
not a rejection boundary.

803
00:28:26,920 --> 00:28:28,920
The schema looks cleaner because it's SQL,

804
00:28:28,920 --> 00:28:31,080
therefore leadership assumes it's controlled.

805
00:28:31,080 --> 00:28:33,160
Meanwhile, the real behavior is that the warehouse

806
00:28:33,160 --> 00:28:35,560
is normalizing drift as if it were a feature.

807
00:28:35,560 --> 00:28:38,840
And once that happens, the semantic layer becomes a patch bay.

808
00:28:38,840 --> 00:28:41,000
Analytics engineers start solving data problems

809
00:28:41,000 --> 00:28:42,440
with DAX and Power Query,

810
00:28:42,440 --> 00:28:43,960
because it's the only place they can move

811
00:28:43,960 --> 00:28:45,560
without waiting on an upstream fix.

812
00:28:45,560 --> 00:28:47,000
Measures get more complex.

813
00:28:47,000 --> 00:28:49,560
Calculated columns appear to reconcile type mismatches.

814
00:28:49,560 --> 00:28:50,840
Relationships get weird.

815
00:28:50,840 --> 00:28:52,760
"Fix it in the model" becomes the operating model

816
00:28:52,760 --> 00:28:54,520
because it's fast and politically safe.

817
00:28:54,520 --> 00:28:56,280
But it's also where truth fragments.

818
00:28:56,280 --> 00:28:58,680
You end up with multiple "correct" KPIs.

819
00:28:58,680 --> 00:29:00,680
Finance has one definition in a report.

820
00:29:00,680 --> 00:29:03,080
Sales has another in a different semantic model.

821
00:29:03,080 --> 00:29:05,640
And the same metric name starts meaning different filters

822
00:29:05,640 --> 00:29:06,520
and different grain.

823
00:29:06,520 --> 00:29:08,440
Everyone's dashboard refreshes on time.

824
00:29:08,440 --> 00:29:10,120
Everyone's wrong in a different way.

825
00:29:10,120 --> 00:29:11,640
Copilot accelerates this collapse

826
00:29:11,640 --> 00:29:13,480
because it rewards plausibility.

827
00:29:13,480 --> 00:29:15,240
It will happily generate transformations

828
00:29:15,240 --> 00:29:17,720
and KPI logic that compile and render,

829
00:29:17,720 --> 00:29:19,720
even if they encode a hidden assumption

830
00:29:19,720 --> 00:29:23,400
about grain, keys, deduplication, or late-arriving facts.

831
00:29:23,400 --> 00:29:25,480
It doesn't ask you what invariant you're enforcing.

832
00:29:25,480 --> 00:29:27,160
It asks you what output you want.

833
00:29:27,160 --> 00:29:29,400
So the deeper principle for this case is simple.

834
00:29:29,400 --> 00:29:30,920
The earlier you enforce shape,

835
00:29:30,920 --> 00:29:33,880
the fewer semantic patch jobs you fund downstream.

836
00:29:33,880 --> 00:29:36,600
Task flows and medallion visuals don't create contracts.

837
00:29:36,600 --> 00:29:37,880
They create diagrams.

838
00:29:37,880 --> 00:29:39,560
If your bronze to silver to gold layers

839
00:29:39,560 --> 00:29:42,280
aren't backed by schema enforcement and validation gates,

840
00:29:42,280 --> 00:29:43,560
you don't have architecture.

841
00:29:43,560 --> 00:29:45,480
You have a faster way to ship ambiguity.

842
00:29:45,480 --> 00:29:49,320
Case 2 fix: enforced schemas, validation gates, quarantine tables.

843
00:29:49,320 --> 00:29:51,480
The fix is not "get better at modeling."

844
00:29:51,480 --> 00:29:53,240
The fix is to put an enforcement boundary

845
00:29:53,240 --> 00:29:54,760
where ambiguity enters the estate

846
00:29:54,760 --> 00:29:56,360
and then refuse to negotiate with it.

847
00:29:56,360 --> 00:29:59,320
In Fabric, that boundary is the lakehouse-to-warehouse seam.

848
00:29:59,320 --> 00:30:01,560
Treat the lakehouse as an intake zone.

849
00:30:01,560 --> 00:30:05,080
Fast landing, cheap iteration, messy reality.

850
00:30:05,080 --> 00:30:07,560
Treat the warehouse as the contract zone.

851
00:30:07,560 --> 00:30:11,080
Shaped, typed, keyed, and intentionally consumable.

852
00:30:11,880 --> 00:30:14,040
The warehouse isn't another place to query.

853
00:30:14,040 --> 00:30:15,880
It's the point where the organization

854
00:30:15,880 --> 00:30:17,880
finally commits to what the data is.

855
00:30:17,880 --> 00:30:19,880
So you start with explicit warehouse schemas,

856
00:30:19,880 --> 00:30:21,240
not dbo and vibes.

857
00:30:21,240 --> 00:30:24,680
Real schemas that encode domain ownership and consumption intent:

858
00:30:24,680 --> 00:30:28,520
sales, finance, HR, serving, whatever matches your estate model.

859
00:30:28,520 --> 00:30:30,840
This matters because schema is how you stop location

860
00:30:30,840 --> 00:30:32,680
from masquerading as ownership.

861
00:30:32,680 --> 00:30:35,640
If the table lives in finance, finance owns the contract.

862
00:30:35,640 --> 00:30:37,480
If it lives in serving, the platform team

863
00:30:37,480 --> 00:30:38,920
owns the consumption interface.

864
00:30:38,920 --> 00:30:40,760
The name becomes a control surface.

865
00:30:40,760 --> 00:30:42,360
Then you define shape deliberately.

866
00:30:42,360 --> 00:30:45,320
Columns, types, nullability, keys, and invariants.

867
00:30:45,320 --> 00:30:47,320
And yes, invariants is a real word here.

868
00:30:47,320 --> 00:30:49,720
It means statements the business believes are always true

869
00:30:49,720 --> 00:30:51,560
and therefore the system must enforce.

870
00:30:51,560 --> 00:30:53,000
An order has an order date.

871
00:30:53,000 --> 00:30:54,760
A customer key cannot be empty.

872
00:30:54,760 --> 00:30:56,600
A currency code is from a known set.

873
00:30:56,600 --> 00:30:58,360
A fact table grain is stable.

874
00:30:58,360 --> 00:31:01,160
And late-arriving updates have a declared strategy.

875
00:31:01,160 --> 00:31:02,920
Without those, you don't have a contract.

876
00:31:02,920 --> 00:31:04,920
You have a file that happened to pass today.

877
00:31:04,920 --> 00:31:06,440
Now here's the uncomfortable move.

878
00:31:06,440 --> 00:31:09,000
You stop silently coercing data.

879
00:31:09,000 --> 00:31:12,280
Most lakehouse pipelines "fix" drift by casting strings

880
00:31:12,280 --> 00:31:14,520
to ints, trimming columns, defaulting missing values,

881
00:31:14,520 --> 00:31:15,320
and moving on.

882
00:31:15,320 --> 00:31:16,600
That keeps the pipeline green.

883
00:31:16,600 --> 00:31:19,640
It also turns unknown behavior into accepted behavior.

884
00:31:19,640 --> 00:31:21,720
So instead, you add validation gates at ingestion

885
00:31:21,720 --> 00:31:23,080
and transform boundaries.

886
00:31:23,080 --> 00:31:26,520
And you make failure an operational signal, not an embarrassment.

887
00:31:26,520 --> 00:31:28,520
Validation gates are simple in concept.

888
00:31:28,520 --> 00:31:31,960
Before data crosses into the contract zone, it gets checked.

889
00:31:31,960 --> 00:31:35,480
Schema matches, types match, required fields exist.

890
00:31:35,480 --> 00:31:37,600
Keys behave the way you claim they behave.

891
00:31:37,600 --> 00:31:40,240
Duplicates are handled according to a rule you can explain.

892
00:31:40,240 --> 00:31:41,520
If the checks pass, you load.

893
00:31:41,520 --> 00:31:45,040
If they fail, you do not best-effort the data into production

894
00:31:45,040 --> 00:31:47,360
and hope the semantic layer can reconcile it.

895
00:31:47,360 --> 00:31:48,400
You quarantine it.

896
00:31:48,400 --> 00:31:50,640
Quarantine tables are the part most teams avoid

897
00:31:50,640 --> 00:31:52,920
because it feels like admitting imperfection.

898
00:31:52,920 --> 00:31:55,120
But quarantine is how you keep the estate honest.

899
00:31:55,120 --> 00:31:57,440
When violations occur, you land the bad rows

900
00:31:57,440 --> 00:31:59,600
in a quarantine schema with metadata.

901
00:31:59,600 --> 00:32:03,600
Source system, load time, violation type, offending values.

902
00:32:03,600 --> 00:32:06,880
And you publish the count as a metric, not hidden in a notebook cell.

903
00:32:06,880 --> 00:32:08,640
A real metric that leadership can see

904
00:32:08,640 --> 00:32:11,280
if they insist on shipping data without contracts.
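To make the gate-and-quarantine idea concrete, here is a minimal Python sketch. The column names, the currency set, the QuarantinedRow record, and the gate function are hypothetical and chosen only to mirror the invariants listed earlier (an order has an order date, a customer key cannot be empty, a currency code is from a known set, duplicates follow an explainable rule). Note that a string where a number is expected is rejected, not cast.

```python
from dataclasses import dataclass
from datetime import date, datetime, timezone

KNOWN_CURRENCIES = {"USD", "EUR", "GBP"}  # hypothetical contract value set


@dataclass
class QuarantinedRow:
    source_system: str
    load_time: datetime
    violation_type: str
    offending_value: object
    row: dict


def check_row(row):
    """Return (violation_type, offending_value), or None if the row conforms."""
    if row.get("order_id") is None:
        return ("missing_order_id", None)
    if not row.get("customer_key"):
        return ("missing_customer_key", row.get("customer_key"))
    try:
        date.fromisoformat(row.get("order_date"))
    except (TypeError, ValueError):
        return ("invalid_order_date", row.get("order_date"))
    if row.get("currency_code") not in KNOWN_CURRENCIES:
        return ("unknown_currency", row.get("currency_code"))
    amount = row.get("amount")
    # No silent coercion: the string "12" is a violation, not an int in waiting.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        return ("amount_not_numeric", amount)
    return None


def gate(rows, source_system):
    """Split rows into accepted and quarantined; first order_id seen wins."""
    accepted, quarantined, seen = [], [], set()
    load_time = datetime.now(timezone.utc)
    for row in rows:
        violation = check_row(row)
        if violation is None and row["order_id"] in seen:
            violation = ("duplicate_order_id", row["order_id"])
        if violation:
            quarantined.append(
                QuarantinedRow(source_system, load_time, *violation, row)
            )
        else:
            seen.add(row["order_id"])
            accepted.append(row)
    return accepted, quarantined


base = {"customer_key": "C1", "order_date": "2026-01-05",
        "currency_code": "USD", "amount": 10.0}
rows = [
    {**base, "order_id": 1},
    {**base, "order_id": 2, "customer_key": ""},
    {**base, "order_id": 3, "amount": "12"},
    {**base, "order_id": 1},
]
accepted, quarantined = gate(rows, "erp")
print(len(accepted), [q.violation_type for q in quarantined])
```

The length of the quarantined list is the number you would publish as a metric; in a real pipeline it would be written to the quarantine schema instead of held in memory.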

905
00:32:11,280 --> 00:32:12,960
This changes behavior fast.

906
00:32:12,960 --> 00:32:16,480
Because now schema drift is no longer a silent downstream argument.

907
00:32:16,480 --> 00:32:18,320
It becomes a visible upstream event.

908
00:32:18,320 --> 00:32:20,320
And once it's visible, you can assign responsibility,

909
00:32:20,320 --> 00:32:23,040
either the source system changed, the ingestion logic failed,

910
00:32:23,040 --> 00:32:25,360
or the contract needs an intentional version bump.

911
00:32:25,360 --> 00:32:26,800
Those are the only three truths.

912
00:32:26,800 --> 00:32:28,000
Everything else is denial.

913
00:32:28,000 --> 00:32:29,600
And you'll notice a second order effect

914
00:32:29,600 --> 00:32:31,600
that everyone likes once it happens.

915
00:32:31,600 --> 00:32:33,360
Downstream logic collapses.

916
00:32:33,360 --> 00:32:34,800
In a good way.

917
00:32:34,800 --> 00:32:36,480
When the warehouse enforces shape,

918
00:32:36,480 --> 00:32:38,880
the semantic model stops being a patch bay.

919
00:32:38,880 --> 00:32:41,120
DAX measures get simpler because they don't have

920
00:32:41,120 --> 00:32:42,960
to guard against type chaos.

921
00:32:42,960 --> 00:32:46,160
Relationships stabilize because keys behave consistently.

922
00:32:46,160 --> 00:32:48,240
Two correct KPIs becomes harder to sustain

923
00:32:48,240 --> 00:32:50,960
because the raw truth is no longer malleable per workspace.

924
00:32:50,960 --> 00:32:52,480
You can still build multiple measures,

925
00:32:52,480 --> 00:32:54,160
but you're doing it on top of a stable,

926
00:32:54,160 --> 00:32:56,080
typed contract, not on top of a shifting,

927
00:32:56,080 --> 00:32:57,440
lake of assumptions.

928
00:32:57,440 --> 00:33:00,000
This is also where Copilot becomes useful again.

929
00:33:00,000 --> 00:33:02,080
With enforced schemas and clear contracts,

930
00:33:02,080 --> 00:33:04,320
Copilot can generate transformations and measures

931
00:33:04,320 --> 00:33:05,680
that are constrained by reality.

932
00:33:05,680 --> 00:33:07,040
You've reduced the search space.

933
00:33:07,040 --> 00:33:09,200
You've moved from plausible to bounded.

934
00:33:09,200 --> 00:33:10,240
But that's the whole trick.

935
00:33:10,240 --> 00:33:12,720
Don't ask AI to be your governance model.

936
00:33:12,720 --> 00:33:15,680
Make governance the environment AI operates inside.

937
00:33:15,680 --> 00:33:18,400
And when someone asks why you're slowing things down with gates,

938
00:33:18,400 --> 00:33:19,760
the answer is simple.

939
00:33:19,760 --> 00:33:21,120
You're not slowing delivery.

940
00:33:21,120 --> 00:33:23,120
You're stopping unreviewed ambiguity

941
00:33:23,120 --> 00:33:25,120
from becoming production truth.

942
00:33:25,120 --> 00:33:26,800
Case three setup plus fix.

943
00:33:26,800 --> 00:33:29,600
Workspace-only security and the ownership vacuum.

944
00:33:29,600 --> 00:33:32,160
Once cost is predictable and truth is enforced,

945
00:33:32,160 --> 00:33:34,480
the next failure mode is the one nobody wants to talk about

946
00:33:34,480 --> 00:33:36,000
because it isn't a performance graph.

947
00:33:36,000 --> 00:33:37,680
It's an audit question.

948
00:33:37,680 --> 00:33:39,680
The symptom pattern is always the same.

949
00:33:39,680 --> 00:33:41,680
You find service principals with broad access

950
00:33:41,680 --> 00:33:43,600
because the pipeline needed it.

951
00:33:43,600 --> 00:33:45,440
You find analysts who can see raw tables

952
00:33:45,440 --> 00:33:47,360
because they were helping validate.

953
00:33:47,360 --> 00:33:49,920
You find multiple workspaces with the same data set

954
00:33:49,920 --> 00:33:52,960
copied three different ways because someone needed autonomy.

955
00:33:52,960 --> 00:33:56,000
And when the security team asks who can see this table,

956
00:33:56,000 --> 00:33:58,320
the answer is a long pause followed by,

957
00:33:58,320 --> 00:33:59,600
we think only AI cuts.

958
00:33:59,600 --> 00:34:01,040
That's not a security posture.

959
00:34:01,040 --> 00:34:01,920
That's a rumor.

960
00:34:01,920 --> 00:34:03,200
This is where Fabric's convenience

961
00:34:03,200 --> 00:34:04,560
becomes architectural erosion

962
00:34:04,560 --> 00:34:06,560
because workspaces feel like security boundaries

963
00:34:06,560 --> 00:34:08,000
but they aren't data boundaries.

964
00:34:08,000 --> 00:34:09,360
They are collaboration containers.

965
00:34:09,360 --> 00:34:11,600
A workspace role answers who can contribute here,

966
00:34:11,600 --> 00:34:13,600
not who can access this column,

967
00:34:13,600 --> 00:34:15,280
not who can join these two tables

968
00:34:15,280 --> 00:34:16,560
and infer something sensitive,

969
00:34:16,560 --> 00:34:18,880
not who can run an expensive query surface

970
00:34:18,880 --> 00:34:20,400
that becomes a side channel.

971
00:34:20,400 --> 00:34:23,360
And if you treat workspace roles as your whole strategy,

972
00:34:23,360 --> 00:34:24,800
you create an ownership vacuum

973
00:34:24,800 --> 00:34:26,480
because now nobody owns the access model

974
00:34:26,480 --> 00:34:27,840
at the data engine layer.

975
00:34:27,840 --> 00:34:29,360
Nobody owns the table-level intent.

976
00:34:29,360 --> 00:34:31,280
Nobody owns the deny-by-default posture.

977
00:34:31,280 --> 00:34:33,280
Everyone assumes the workspace boundary is enough,

978
00:34:33,280 --> 00:34:35,360
therefore nobody builds real boundaries inside it.

979
00:34:35,360 --> 00:34:36,480
The platform works.

980
00:34:36,480 --> 00:34:37,360
The system doesn't.

981
00:34:37,360 --> 00:34:41,440
Copilot makes this worse in a very specific way.

982
00:34:41,440 --> 00:34:44,320
It changes how people discover and interact with data.

983
00:34:44,320 --> 00:34:46,640
In the old world, consumers had to know where to look.

984
00:34:46,640 --> 00:34:48,480
A report, a data set,

985
00:34:48,480 --> 00:34:50,480
maybe a documented SQL endpoint.

986
00:34:50,480 --> 00:34:53,440
In Fabric, Copilot and agents make exploration conversational

987
00:34:53,440 --> 00:34:56,800
and cross-surface. They surface tables, they suggest joins.

988
00:34:56,800 --> 00:34:58,640
They help people just query it.

989
00:34:58,640 --> 00:35:01,520
And if you gave broad read access because you wanted adoption,

990
00:35:01,520 --> 00:35:03,360
Copilot becomes the fastest path

991
00:35:03,360 --> 00:35:06,720
to exploring raw assets you never intended as consumption paths.

992
00:35:06,720 --> 00:35:08,480
It doesn't bypass permissions.

993
00:35:08,480 --> 00:35:10,400
It bypasses your assumptions.

994
00:35:10,400 --> 00:35:13,200
So the root cause isn't "Copilot exposed data."

995
00:35:13,200 --> 00:35:16,560
The root cause is that you never encoded your consumption surfaces

996
00:35:16,560 --> 00:35:18,960
and you never encoded your intent at the engine layer.

997
00:35:18,960 --> 00:35:22,160
You left a vacuum and the platform filled it with default behavior,

998
00:35:22,160 --> 00:35:24,880
broad access, raw tables as APIs,

999
00:35:24,880 --> 00:35:27,120
and service principals that look like owners.

1000
00:35:27,120 --> 00:35:28,480
Now here's the uncomfortable truth.

1001
00:35:28,480 --> 00:35:30,480
Most Fabric estates don't get hacked.

1002
00:35:30,480 --> 00:35:31,760
They get drifted.

1003
00:35:31,760 --> 00:35:34,880
Access expands because every exception feels justified at the time.

1004
00:35:34,880 --> 00:35:37,520
Give the SP owner for now, we'll tighten later.

1005
00:35:37,520 --> 00:35:40,000
Add them as member, they need to publish.

1006
00:35:40,000 --> 00:35:41,280
Just let them read the lakehouse.

1007
00:35:41,280 --> 00:35:42,640
It's not that sensitive.

1008
00:35:42,640 --> 00:35:44,240
Then later becomes never,

1009
00:35:44,240 --> 00:35:45,920
and the estate becomes unauditable.

1010
00:35:45,920 --> 00:35:47,280
Not because anyone was malicious,

1011
00:35:47,280 --> 00:35:49,840
but because no one was responsible for the intent model.

1012
00:35:49,840 --> 00:35:52,000
So the fix is not more training.

1013
00:35:52,000 --> 00:35:52,960
It's not more labels.

1014
00:35:52,960 --> 00:35:54,560
It's not put it in a wiki.

1015
00:35:54,560 --> 00:35:56,000
You need enforceable boundaries

1016
00:35:56,000 --> 00:35:57,520
that survive convenience.

1017
00:35:57,520 --> 00:36:00,080
First, define schema-based security boundaries

1018
00:36:00,080 --> 00:36:01,520
in the warehouse as the contract zone.

1019
00:36:01,520 --> 00:36:03,920
If you use the previous section correctly,

1020
00:36:03,920 --> 00:36:06,720
you already have real schemas that encode ownership.

1021
00:36:06,720 --> 00:36:09,600
Domain schemas, serving schemas, quarantine schemas.

1022
00:36:09,600 --> 00:36:11,360
Now you attach security to that structure.

1023
00:36:11,360 --> 00:36:14,000
You create database roles that map to business intent,

1024
00:36:14,000 --> 00:36:17,440
consumer, steward, engineer, automation, whatever matches your org,

1025
00:36:17,440 --> 00:36:19,360
and you start from deny-by-default.

1026
00:36:19,360 --> 00:36:21,840
No implicit access because someone is in the workspace.

1027
00:36:21,840 --> 00:36:24,320
Second, you restrict raw table access.

1028
00:36:24,320 --> 00:36:26,880
Tables become internal implementation details.

1029
00:36:26,880 --> 00:36:29,840
Views and procedures become the controlled access paths.

1030
00:36:29,840 --> 00:36:31,840
This is where performance and security align.

1031
00:36:31,840 --> 00:36:33,920
You expose what you intend people to query,

1032
00:36:33,920 --> 00:36:36,800
and you can enforce both column-level and row-level constraints

1033
00:36:36,800 --> 00:36:37,600
where appropriate.

1034
00:36:37,600 --> 00:36:40,160
More importantly, you can enforce that agents

1035
00:36:40,160 --> 00:36:43,120
and ad hoc queries hit the same surfaces your reports hit.

1036
00:36:43,120 --> 00:36:46,080
One contract, one path, one place to secure and tune.

1037
00:36:46,080 --> 00:36:49,280
Third, you treat service principals as identities with blast radius,

1038
00:36:49,280 --> 00:36:50,880
not as pipeline glue.

1039
00:36:50,880 --> 00:36:52,880
Every SP has an explicit role,

1040
00:36:52,880 --> 00:36:55,440
scoped to the minimum schema and objects it needs.

1041
00:36:55,440 --> 00:36:59,040
You stop granting workspace roles as a substitute for data permissions.

1042
00:36:59,040 --> 00:37:02,720
You also stop mixing "can deploy artifacts" with "can read data."

1043
00:37:02,720 --> 00:37:03,920
Those are different rights.

1044
00:37:03,920 --> 00:37:05,520
When you collapse them into member,

1045
00:37:05,520 --> 00:37:08,320
you create an estate where publishing equals reading

1046
00:37:08,320 --> 00:37:11,200
and reading equals exploring and exploring equals inference.

1047
00:37:11,200 --> 00:37:13,920
Finally, you make auditability a first class artifact,

1048
00:37:13,920 --> 00:37:14,960
not a yearly scramble.

1049
00:37:14,960 --> 00:37:17,600
You should be able to answer three questions without a meeting.

1050
00:37:17,600 --> 00:37:20,560
Who can see it, who can change it, and through what surface.

1051
00:37:20,560 --> 00:37:23,520
If you can't answer those questions from roles, grants,

1052
00:37:23,520 --> 00:37:25,920
and controlled interfaces, you don't have governance.

1053
00:37:25,920 --> 00:37:26,960
You have a story.
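One way to picture "answering from roles and grants" is the following Python sketch. The Grant record, the role names, the principals, and the object names are all hypothetical; in practice the inputs would be exported from the warehouse's own permission metadata, which this example does not attempt to query. The point is only that, with deny by default, the audit answer is a lookup and not a meeting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    role: str
    permission: str  # "SELECT" or "ALTER" in this sketch
    schema: str
    obj: str
    surface: str     # "view", "procedure", or "table"


# Hypothetical role membership and grants for illustration only.
ROLE_MEMBERS = {
    "consumer": ["ana@contoso.example"],
    "engineer": ["eng@contoso.example"],
    "automation": ["sp-ingest"],
}
GRANTS = [
    Grant("consumer", "SELECT", "serving", "vw_sales", "view"),
    Grant("engineer", "ALTER", "sales", "fact_sales", "table"),
    Grant("automation", "SELECT", "sales", "fact_sales", "table"),
    Grant("consumer", "SELECT", "sales", "fact_sales", "table"),
]


def who_can(permission, schema, obj, grants=GRANTS, members=ROLE_MEMBERS):
    """Deny by default: only an explicit grant puts a principal in the answer."""
    principals = set()
    for g in grants:
        if (g.permission, g.schema, g.obj) == (permission, schema, obj):
            principals.update(members.get(g.role, ()))
    return sorted(principals)


def raw_table_exposure(grants, internal_roles):
    """Grants on raw tables held by roles that should only use views."""
    return [
        g for g in grants
        if g.surface == "table" and g.role not in internal_roles
    ]


print(who_can("SELECT", "sales", "fact_sales"))
print(who_can("ALTER", "sales", "fact_sales"))
print(raw_table_exposure(GRANTS, {"engineer", "automation"}))
```

Here the last call surfaces the one consumer grant that points at a raw table, which is exactly the kind of drift the review is meant to catch.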

1054
00:37:26,960 --> 00:37:28,800
And this is the big outcome of this fix.

1055
00:37:28,800 --> 00:37:32,320
You can become audit-defensible without re-architecting the whole estate.

1056
00:37:32,320 --> 00:37:33,760
You don't need a new platform.

1057
00:37:33,760 --> 00:37:35,200
You need a real intent model,

1058
00:37:35,200 --> 00:37:37,120
enforced inside the data engine,

1059
00:37:37,120 --> 00:37:40,000
with workspaces treated as collaboration, not containment.

1060
00:37:40,000 --> 00:37:42,800
Because in Fabric, the workspace is where people click,

1061
00:37:42,800 --> 00:37:44,320
but the warehouse is where you enforce.

1062
00:37:44,320 --> 00:37:47,600
New failure modes in the Fabric plus Copilot era.

1063
00:37:47,600 --> 00:37:50,240
Now, if those three cases felt familiar, good.

1064
00:37:50,240 --> 00:37:51,200
They're not edge cases.

1065
00:37:51,200 --> 00:37:53,200
They're the new baseline failure modes

1066
00:37:53,200 --> 00:37:56,800
in a Fabric estate where speed exists before enforcement.

1067
00:37:56,800 --> 00:37:58,800
And what's different in this era is that failure

1068
00:37:58,800 --> 00:38:00,320
doesn't require a visible break.

1069
00:38:00,320 --> 00:38:01,760
The platform keeps running.

1070
00:38:01,760 --> 00:38:03,360
Refreshes keep finishing.

1071
00:38:03,360 --> 00:38:04,320
Users keep clicking.

1072
00:38:04,320 --> 00:38:05,600
Copilot keeps responding.

1073
00:38:05,600 --> 00:38:08,640
The estate degrades anyway because the failure modes are systemic:

1074
00:38:08,640 --> 00:38:11,360
boundary blur, cost drift, and the ownership vacuum.

1075
00:38:11,360 --> 00:38:13,520
And they don't show up as one big explosion.

1076
00:38:13,520 --> 00:38:16,720
They show up as a slow replacement of intent with convenience.

1077
00:38:16,720 --> 00:38:19,120
Failure mode 1 is boundary blur across layers.

1078
00:38:19,120 --> 00:38:20,880
In Fabric, the lakehouse feeds the warehouse,

1079
00:38:20,880 --> 00:38:22,480
the warehouse feeds the semantic model,

1080
00:38:22,480 --> 00:38:25,440
the semantic model feeds reports and reports feed apps.

1081
00:38:25,440 --> 00:38:28,160
But the boundaries between those layers are mostly social

1082
00:38:28,160 --> 00:38:30,000
unless you make them physical.

1083
00:38:30,000 --> 00:38:33,440
So what happens in real estates is that each layer starts fixing

1084
00:38:33,440 --> 00:38:35,440
what it doesn't like about the layer before it.

1085
00:38:35,440 --> 00:38:37,680
The lakehouse lands messy data.

1086
00:38:37,680 --> 00:38:40,000
The warehouse normalizes it without rejecting it.

1087
00:38:40,000 --> 00:38:43,520
The semantic model compensates with calculated columns and relationships.

1088
00:38:43,520 --> 00:38:45,920
And the report compensates with DAX and filters.

1089
00:38:45,920 --> 00:38:47,680
Every local fix feels rational,

1090
00:38:47,680 --> 00:38:49,920
but every local fix also creates drift

1091
00:38:49,920 --> 00:38:52,480
because the logic now lives in multiple places

1092
00:38:52,480 --> 00:38:56,000
and nobody can describe the system as one coherent contract.

1093
00:38:56,000 --> 00:38:59,360
And because Fabric makes it easy to create a new semantic model

1094
00:38:59,360 --> 00:39:01,040
or a new report in minutes,

1095
00:39:01,040 --> 00:39:03,120
teams fork truth instead of repairing it.

1096
00:39:03,120 --> 00:39:04,720
A temporary model becomes the model.

1097
00:39:04,720 --> 00:39:06,640
A quick report becomes operational.

1098
00:39:06,640 --> 00:39:09,520
Apps become distribution channels for inconsistencies

1099
00:39:09,520 --> 00:39:11,600
and then you get the weirdest kind of outage.

1100
00:39:11,600 --> 00:39:14,240
The business still runs, but nobody trusts the numbers.

1101
00:39:14,240 --> 00:39:15,200
That's boundary blur.

1102
00:39:15,920 --> 00:39:20,080
Failure mode 2 is cost drift driven by AI-shaped interaction patterns.

1103
00:39:20,080 --> 00:39:22,080
This isn't just bad SQL.

1104
00:39:22,080 --> 00:39:25,120
It's the combination of Copilot making query authoring cheap

1105
00:39:25,120 --> 00:39:27,840
and Fabric making query execution shared.

1106
00:39:27,840 --> 00:39:31,280
When a platform charges you for runtime under concurrency,

1107
00:39:31,280 --> 00:39:34,320
every unbounded scan becomes an incident generator.

1108
00:39:34,320 --> 00:39:37,840
But the trick is that cost drift doesn't necessarily correlate to deployments.

1109
00:39:37,840 --> 00:39:40,160
So traditional change management doesn't catch it.

1110
00:39:40,160 --> 00:39:41,680
You can have perfect CI/CD

1111
00:39:41,680 --> 00:39:43,760
and still get wrecked at 9:05 AM

1112
00:39:43,760 --> 00:39:46,080
because a well-meaning analyst asked Copilot a question

1113
00:39:46,080 --> 00:39:47,840
that produced a non-sargable predicate,

1114
00:39:47,840 --> 00:39:50,800
hit a large table and ran alongside refresh concurrency.

1115
00:39:50,800 --> 00:39:52,560
So the incident taxonomy shifts.

1116
00:39:52,560 --> 00:39:54,800
Your first signal becomes capacity pressure,

1117
00:39:54,800 --> 00:39:56,320
not pipeline failure.

1118
00:39:56,320 --> 00:39:58,400
Your debugging artifact becomes the plan,

1119
00:39:58,400 --> 00:39:59,280
not the log.

1120
00:39:59,280 --> 00:40:01,840
Your prevention mechanism becomes query surfaces

1121
00:40:01,840 --> 00:40:04,480
and acceptance criteria, not best practices.

1122
00:40:04,480 --> 00:40:05,760
And if you don't accept that shift,

1123
00:40:05,760 --> 00:40:07,600
you keep treating spend as a billing problem

1124
00:40:07,600 --> 00:40:09,520
when it's actually an architecture problem.

1125
00:40:09,520 --> 00:40:11,680
Failure mode 3 is the ownership vacuum

1126
00:40:11,680 --> 00:40:13,440
that looks like everything is fine.

1127
00:40:13,440 --> 00:40:14,640
This is the most corrosive one

1128
00:40:14,640 --> 00:40:17,200
because it hides behind success metrics: pipelines run,

1129
00:40:17,200 --> 00:40:19,200
dashboards refresh, people deliver,

1130
00:40:19,200 --> 00:40:21,040
but nobody owns semantics end to end.

1131
00:40:21,040 --> 00:40:22,640
Nobody owns the contract boundary,

1132
00:40:22,640 --> 00:40:24,400
nobody owns the access intent model.

1133
00:40:24,400 --> 00:40:26,080
And because Fabric centralizes everything

1134
00:40:26,080 --> 00:40:27,200
into a workspace experience,

1135
00:40:27,200 --> 00:40:28,640
the organization starts confusing

1136
00:40:28,640 --> 00:40:31,440
"someone has access" with "someone has responsibility."

1137
00:40:31,440 --> 00:40:32,480
Those are not the same.

1138
00:40:32,480 --> 00:40:33,920
An ownership vacuum forms

1139
00:40:33,920 --> 00:40:36,640
when the system can function without an explicit owner.

1140
00:40:36,640 --> 00:40:37,600
Fabric lets you do that.

1141
00:40:37,600 --> 00:40:40,240
It's a platform designed for speed and collaboration,

1142
00:40:40,240 --> 00:40:41,520
so it will happily keep working

1143
00:40:41,520 --> 00:40:43,120
while your governance model erodes.

1144
00:40:43,120 --> 00:40:44,800
Copilot accelerates the erosion

1145
00:40:44,800 --> 00:40:46,480
because it makes it easier for more people

1146
00:40:46,480 --> 00:40:49,520
to create more artifacts: pipelines, notebooks,

1147
00:40:49,520 --> 00:40:51,920
transformations, models and answers.

1148
00:40:51,920 --> 00:40:54,000
More artifacts means more implicit contracts.

1149
00:40:54,000 --> 00:40:55,840
More implicit contracts means more drift.

1150
00:40:55,840 --> 00:40:58,400
And drift without an owner is not a technical issue.

1151
00:40:58,400 --> 00:40:59,760
It's an entropy management failure.

1152
00:40:59,760 --> 00:41:03,760
This is why each exception becomes an entropy generator.

1153
00:41:03,760 --> 00:41:05,440
You add one temporary shortcut,

1154
00:41:05,440 --> 00:41:07,200
one "just for now" workspace role.

1155
00:41:07,200 --> 00:41:08,880
One quick semantic model

1156
00:41:08,880 --> 00:41:10,800
built directly on raw tables.

1157
00:41:10,800 --> 00:41:12,880
One fix in DAX because upstream is slow,

1158
00:41:12,880 --> 00:41:15,200
one Copilot-generated query saved as a dataset

1159
00:41:15,200 --> 00:41:16,480
because it worked.

1160
00:41:16,480 --> 00:41:18,640
Each one is defensible in isolation.

1161
00:41:18,640 --> 00:41:21,600
Together, they form a system where intent can't be proven.

1162
00:41:21,600 --> 00:41:24,000
And the real consequence is that you end up with incidents

1163
00:41:24,000 --> 00:41:25,600
that aren't outages,

1164
00:41:25,600 --> 00:41:27,600
cost incidents, correctness incidents,

1165
00:41:27,600 --> 00:41:28,640
access incidents.

1166
00:41:28,640 --> 00:41:30,080
They don't always show up as red lights.

1167
00:41:30,080 --> 00:41:31,920
They show up as executive distrust,

1168
00:41:31,920 --> 00:41:33,280
audit discomfort,

1169
00:41:33,280 --> 00:41:36,480
and a capacity meter that behaves like a random number generator.

1170
00:41:36,480 --> 00:41:39,040
So if you're inheriting a Fabric estate and thinking,

1171
00:41:39,040 --> 00:41:41,040
why does this feel harder than it should be,

1172
00:41:41,040 --> 00:41:41,760
here's the answer.

1173
00:41:41,760 --> 00:41:42,560
The platform works.

1174
00:41:42,560 --> 00:41:43,680
The system doesn't.

1175
00:41:43,680 --> 00:41:45,760
And the only way out is to stop treating governance

1176
00:41:45,760 --> 00:41:48,640
as education and start treating it as enforced design.

1177
00:41:48,640 --> 00:41:50,640
Because Fabric doesn't slow you down anymore,

1178
00:41:50,640 --> 00:41:52,720
which means you have to choose where friction belongs.

1179
00:41:52,720 --> 00:41:55,520
The role collapse: what shrank, what didn't.

1180
00:41:55,520 --> 00:41:57,120
Here's the part people keep getting wrong

1181
00:41:57,120 --> 00:41:58,560
and it's why the hiring conversations

1182
00:41:58,560 --> 00:42:00,560
in Fabric estates are weird right now.

1183
00:42:00,560 --> 00:42:03,360
They see Copilot generating notebooks and SQL

1184
00:42:03,360 --> 00:42:05,680
and they conclude the data engineer role is shrinking.

1185
00:42:05,680 --> 00:42:06,720
It is.

1186
00:42:06,720 --> 00:42:09,600
But only the parts that were never the job in the first place.

1187
00:42:09,600 --> 00:42:11,840
What shrank is the visible labor:

1188
00:42:11,840 --> 00:42:13,840
Handwriting the pipeline scaffolding,

1189
00:42:13,840 --> 00:42:15,200
stitching connectors together,

1190
00:42:15,200 --> 00:42:16,080
doing repetitive

1191
00:42:16,080 --> 00:42:19,280
SQL transforms that look impressive in a commit history,

1192
00:42:19,280 --> 00:42:22,160
and maintaining glue code that exists purely

1193
00:42:22,160 --> 00:42:24,320
because tools used to be separate.

1194
00:42:24,320 --> 00:42:25,760
Fabric collapsed a lot of that work

1195
00:42:25,760 --> 00:42:28,480
into item creation and shared experiences,

1196
00:42:28,480 --> 00:42:30,880
and Copilot collapses even more into auto-complete

1197
00:42:30,880 --> 00:42:32,080
with extra steps.

1198
00:42:32,080 --> 00:42:34,480
So yes, the estate can produce artifacts faster,

1199
00:42:34,480 --> 00:42:37,040
but the trap is that teams confuse artifact velocity

1200
00:42:37,040 --> 00:42:38,080
with system integrity.

1201
00:42:38,080 --> 00:42:41,040
In the old world, you had to earn every artifact

1202
00:42:41,040 --> 00:42:42,960
by suffering through provisioned infrastructure,

1203
00:42:42,960 --> 00:42:44,480
separate portals, long run times,

1204
00:42:44,480 --> 00:42:46,240
and annoying deployment friction.

1205
00:42:46,240 --> 00:42:48,480
That friction acted like a tax on casual changes.

1206
00:42:48,480 --> 00:42:49,760
It slowed bad decisions.

1207
00:42:49,760 --> 00:42:50,960
It also slowed good ones.

1208
00:42:50,960 --> 00:42:54,080
But the net effect was that fewer people touch the system,

1209
00:42:54,080 --> 00:42:55,280
fewer times per day,

1210
00:42:55,280 --> 00:42:58,480
with fewer opportunities to accidentally publish a new truth.

1211
00:42:58,480 --> 00:42:59,760
Fabric removes that tax.

1212
00:42:59,760 --> 00:43:03,120
So if your organization's identity for a data engineer

1213
00:43:03,120 --> 00:43:04,720
was the person who makes pipelines,

1214
00:43:04,720 --> 00:43:06,880
of course it feels like the role is disappearing.

1215
00:43:06,880 --> 00:43:09,040
Pipelines are easier, notebooks are easier,

1216
00:43:09,040 --> 00:43:11,040
even the semantic model path is faster.

1217
00:43:11,040 --> 00:43:12,800
The visible assembly work shrinks,

1218
00:43:12,800 --> 00:43:14,160
and now the uncomfortable part.

1219
00:43:14,160 --> 00:43:15,920
The responsibilities that didn't shrink

1220
00:43:15,920 --> 00:43:18,720
are the ones that actually determine whether the estate survives:

1221
00:43:18,720 --> 00:43:22,320
data contracts, schema enforcement, cost predictability,

1222
00:43:22,320 --> 00:43:25,040
security boundaries, and ownership clarity.

1223
00:43:25,040 --> 00:43:26,320
Those did not get automated.

1224
00:43:26,320 --> 00:43:27,840
They got exposed.

1225
00:43:27,840 --> 00:43:29,920
Because once the tool friction goes away,

1226
00:43:29,920 --> 00:43:32,240
the system will ship whatever you allow it to ship.

1227
00:43:32,240 --> 00:43:33,760
If you didn't encode invariants,

1228
00:43:33,760 --> 00:43:35,520
you don't get flexibility.

1229
00:43:35,520 --> 00:43:37,680
You get drift at a higher refresh frequency.

1230
00:43:38,400 --> 00:43:40,000
If you didn't encode cost intent,

1231
00:43:40,000 --> 00:43:41,520
you don't get self-service,

1232
00:43:41,520 --> 00:43:43,120
you get shared meter contention,

1233
00:43:43,120 --> 00:43:45,280
and a budget that behaves like weather.

1234
00:43:45,280 --> 00:43:46,320
This is the role collapse:

1235
00:43:46,320 --> 00:43:50,000
the platform turned craftsmanship into a smaller percentage of the work,

1236
00:43:50,000 --> 00:43:52,720
and turned governance into the defining bottleneck.

1237
00:43:52,720 --> 00:43:55,440
When tooling gets easier, discipline becomes the bottleneck,

1238
00:43:55,440 --> 00:43:57,840
not because discipline is morally good,

1239
00:43:57,840 --> 00:43:59,360
but because discipline is the only thing

1240
00:43:59,360 --> 00:44:01,280
that produces deterministic outcomes

1241
00:44:01,280 --> 00:44:03,680
in a high-speed low-friction platform.

1242
00:44:03,680 --> 00:44:07,200
This is also why so many teams feel personally attacked by Fabric.

1243
00:44:07,200 --> 00:44:09,120
The old world let you be a hero with effort.

1244
00:44:09,120 --> 00:44:10,960
You could brute force a solution with time

1245
00:44:10,960 --> 00:44:12,480
and the friction of the system

1246
00:44:12,480 --> 00:44:14,400
made that effort feel like engineering.

1247
00:44:14,400 --> 00:44:15,920
Fabric removes the heroism layer,

1248
00:44:15,920 --> 00:44:18,560
it rewards design, it punishes improvisation.

1249
00:44:18,560 --> 00:44:19,920
And if you came up in an environment

1250
00:44:19,920 --> 00:44:22,160
where making it work was the primary skill,

1251
00:44:22,160 --> 00:44:24,160
Fabric makes you faster at making it work,

1252
00:44:24,160 --> 00:44:26,960
and therefore faster at creating the next incident class.

1253
00:44:26,960 --> 00:44:28,160
So the job moved up the stack

1254
00:44:28,160 --> 00:44:29,440
and the modern data engineer

1255
00:44:29,440 --> 00:44:31,360
isn't primarily a tool operator anymore.

1256
00:44:31,360 --> 00:44:32,960
They are boundary enforcers.

1257
00:44:32,960 --> 00:44:36,000
Contracts, schemas, roles, plans, and gates.

1258
00:44:36,000 --> 00:44:37,760
They decide which paths are allowed,

1259
00:44:37,760 --> 00:44:39,600
which behaviors are acceptable,

1260
00:44:39,600 --> 00:44:41,280
and which quick fixes are rejected

1261
00:44:41,280 --> 00:44:42,880
because they generate long-term drift.

1262
00:44:42,880 --> 00:44:46,000
Analytics engineers feel the same squeeze from the other side.

1263
00:44:46,000 --> 00:44:47,440
If logic lives in five tools,

1264
00:44:47,440 --> 00:44:48,960
Lakehouse transforms, warehouse views,

1265
00:44:48,960 --> 00:44:51,200
semantic model measures, report-level filters,

1266
00:44:51,200 --> 00:44:52,400
and app-level business rules,

1267
00:44:52,400 --> 00:44:54,320
then the platform didn't give you agility.

1268
00:44:54,320 --> 00:44:56,720
It gave you five places for truth to fragment.

1269
00:44:56,720 --> 00:44:59,040
Platform owners also get a new kind of accountability.

1270
00:44:59,040 --> 00:45:00,160
You can't hide behind,

1271
00:45:00,160 --> 00:45:03,200
"we sized the cluster wrong," or "we need more nodes."

1272
00:45:03,200 --> 00:45:05,120
In Fabric, if plans are unstable,

1273
00:45:05,120 --> 00:45:06,160
cost is undefined,

1274
00:45:06,160 --> 00:45:08,560
and you will be the person asked why the meter spikes.

1275
00:45:08,560 --> 00:45:09,600
Not because it's your fault,

1276
00:45:09,600 --> 00:45:12,080
but because the platform centralised the blast radius

1277
00:45:12,080 --> 00:45:13,920
into one capacity envelope.

1278
00:45:13,920 --> 00:45:15,760
And architects inheriting Fabric estates

1279
00:45:15,760 --> 00:45:17,520
have the worst version of this.

1280
00:45:17,520 --> 00:45:20,080
They inherit a pile of artefacts that all work

1281
00:45:20,080 --> 00:45:22,400
with no explicit boundaries, no clear owners,

1282
00:45:22,400 --> 00:45:23,680
and a history of exceptions

1283
00:45:23,680 --> 00:45:25,600
that became the actual operating model.

1284
00:45:25,600 --> 00:45:27,040
They're asked to make it reliable

1285
00:45:27,040 --> 00:45:28,560
without slowing the business down,

1286
00:45:28,560 --> 00:45:29,760
which is corporate code for,

1287
00:45:29,760 --> 00:45:31,760
please enforce discipline without creating conflict.

1288
00:45:31,760 --> 00:45:32,560
But that's the shift.

1289
00:45:32,560 --> 00:45:33,680
The role didn't get smaller.

1290
00:45:33,680 --> 00:45:35,760
The role got less visible and more consequential.

1291
00:45:35,760 --> 00:45:37,680
Fabric didn't eliminate data engineering.

1292
00:45:37,680 --> 00:45:39,920
It eliminated the parts that used to distract people

1293
00:45:39,920 --> 00:45:40,960
from the real job,

1294
00:45:40,960 --> 00:45:42,560
enforcing intent at scale.

1295
00:45:42,560 --> 00:45:44,560
What the modern data engineer is now.

1296
00:45:44,560 --> 00:45:47,120
Most organisations still describe the modern data engineer

1297
00:45:47,120 --> 00:45:49,120
as a faster version of the old one.

1298
00:45:49,120 --> 00:45:50,720
Same job, new tooling.

1299
00:45:50,720 --> 00:45:51,920
That belief is comfortable.

1300
00:45:51,920 --> 00:45:52,880
And it's also wrong.

1301
00:45:52,880 --> 00:45:54,960
The modern data engineer in a Fabric estate

1302
00:45:54,960 --> 00:45:56,240
is not a pipeline writer,

1303
00:45:56,240 --> 00:45:57,360
not a tool operator,

1304
00:45:57,360 --> 00:45:58,880
and not a SQL typist.

1305
00:45:58,880 --> 00:46:00,000
Those activities still exist,

1306
00:46:00,000 --> 00:46:01,920
but they are not the centre of gravity anymore.

1307
00:46:02,640 --> 00:46:04,480
Fabric and Copilot shoved the work

1308
00:46:04,480 --> 00:46:07,200
up the stack into a place most teams avoided.

1309
00:46:07,200 --> 00:46:08,480
Explicit intent.

1310
00:46:08,480 --> 00:46:10,480
And this is the part that makes people uncomfortable,

1311
00:46:10,480 --> 00:46:12,880
because intent can't be improvised.

1312
00:46:12,880 --> 00:46:15,040
It has to be designed, enforced, and defended.

1313
00:46:15,040 --> 00:46:16,080
So what is the job now?

1314
00:46:16,080 --> 00:46:17,600
First, contract designer.

1315
00:46:17,600 --> 00:46:18,960
A contract is not a wiki page,

1316
00:46:18,960 --> 00:46:20,240
and it's not a diagram.

1317
00:46:20,240 --> 00:46:21,680
A contract is an enforcement boundary

1318
00:46:21,680 --> 00:46:23,840
that makes ambiguity expensive to ship.

1319
00:46:23,840 --> 00:46:26,000
It defines grain, keys, types,

1320
00:46:26,000 --> 00:46:28,160
nullability, freshness expectations,

1321
00:46:28,160 --> 00:46:29,280
and failure behaviour.

1322
00:46:29,280 --> 00:46:31,360
It encodes what the business thinks is true

1323
00:46:31,360 --> 00:46:33,040
into something the platform can enforce

1324
00:46:33,040 --> 00:46:34,720
without asking permission every time.

1325
00:46:34,720 --> 00:46:37,280
When you skip this, the estate will still move fast.

1326
00:46:37,280 --> 00:46:39,200
It will just move fast toward drift.
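
To make "enforcement boundary" concrete, here is a minimal sketch of a contract expressed as data plus a batch check. It is an illustration only: `Column`, `Contract`, and `violations` are hypothetical names invented for this example, not part of Fabric or any library, and the failure behaviour (reject the batch when the list is non-empty) is an assumption left to the caller.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Column:
    name: str
    type: type
    nullable: bool = False


@dataclass(frozen=True)
class Contract:
    grain: tuple[str, ...]        # columns that identify exactly one row
    columns: tuple[Column, ...]   # names, types, nullability
    max_staleness: timedelta      # freshness expectation


def violations(contract, rows, loaded_at, now):
    """Return a list of human-readable contract violations for one batch."""
    problems = []
    if now - loaded_at > contract.max_staleness:
        problems.append("stale batch")
    seen = set()
    for i, row in enumerate(rows):
        for col in contract.columns:
            value = row.get(col.name)
            if value is None:
                if not col.nullable:
                    problems.append(f"row {i}: {col.name} is null")
            elif not isinstance(value, col.type):
                problems.append(f"row {i}: {col.name} is not {col.type.__name__}")
        key = tuple(row.get(k) for k in contract.grain)
        if key in seen:
            problems.append(f"row {i}: duplicate grain key {key}")
        seen.add(key)
    return problems


# Hypothetical contract for an orders table (names are assumptions).
ORDERS = Contract(
    grain=("order_id",),
    columns=(Column("order_id", int), Column("note", str, nullable=True)),
    max_staleness=timedelta(hours=1),
)
```

A caller would run `violations(ORDERS, batch, loaded_at, datetime.now(timezone.utc))` at the intake-to-contract boundary and refuse to publish the batch if anything comes back.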

1327
00:46:39,200 --> 00:46:40,960
Second, boundary enforcer.

1328
00:46:40,960 --> 00:46:43,280
This is the skill that separates

1329
00:46:43,280 --> 00:46:46,640
"we adopted Fabric" from "we operate Fabric."

1330
00:46:46,640 --> 00:46:48,560
Boundaries in Fabric are not the marketing ones.

1331
00:46:48,560 --> 00:46:50,320
They're not lakehouse versus warehouse

1332
00:46:50,320 --> 00:46:51,360
as a product menu.

1333
00:46:51,360 --> 00:46:53,120
They're architectural seams.

1334
00:46:53,120 --> 00:46:54,320
Intake versus contract,

1335
00:46:54,320 --> 00:46:55,440
contract versus consumption,

1336
00:46:55,440 --> 00:46:57,120
consumption versus presentation.

1337
00:46:57,120 --> 00:46:59,440
The modern data engineer makes those seams physical

1338
00:46:59,440 --> 00:47:02,240
with views, procedures, schemas, and controlled surfaces.

1339
00:47:02,240 --> 00:47:04,480
And they also make those seams social

1340
00:47:04,480 --> 00:47:06,160
by refusing to accept quick fixes

1341
00:47:06,160 --> 00:47:07,440
that bypass the boundary.

1342
00:47:07,440 --> 00:47:11,040
That refusal matters more than any new feature announcement.

1343
00:47:11,040 --> 00:47:12,160
Third, cost governor.

1344
00:47:12,160 --> 00:47:15,680
If you're uncomfortable calling cost a governance problem,

1345
00:47:15,680 --> 00:47:16,960
Fabric will fix that for you.

1346
00:47:16,960 --> 00:47:18,880
In Fabric, cost is not finance's problem.

1347
00:47:18,880 --> 00:47:21,440
Cost is an execution property of your decisions.

1348
00:47:21,440 --> 00:47:23,120
Query surfaces, concurrency,

1349
00:47:23,120 --> 00:47:24,480
refresh design, and the difference

1350
00:47:24,480 --> 00:47:26,960
between bounded access and raw exploration.

1351
00:47:26,960 --> 00:47:29,200
The modern data engineer treats execution plans

1352
00:47:29,200 --> 00:47:30,640
like policy artifacts.

1353
00:47:30,640 --> 00:47:32,640
They enforce sargability expectations,

1354
00:47:32,640 --> 00:47:34,880
bounded predicates, and predictable query shapes

1355
00:47:34,880 --> 00:47:35,760
for critical paths,

1356
00:47:35,760 --> 00:47:37,520
not because they enjoy gatekeeping

1357
00:47:37,520 --> 00:47:40,640
but because deterministic spend requires deterministic behavior.

1358
00:47:40,640 --> 00:47:42,640
Fourth, failure mode anticipator.

1359
00:47:42,640 --> 00:47:46,320
Old school data engineering treated failure as outages.

1360
00:47:46,320 --> 00:47:49,280
A job failed, a pipeline stopped, a connector broke.

1361
00:47:49,280 --> 00:47:52,720
In Fabric, the more dangerous failures are the quiet ones.

1362
00:47:52,720 --> 00:47:54,880
It refreshed, but it drifted.

1363
00:47:54,880 --> 00:47:56,960
It answered, but it exposed, it worked,

1364
00:47:56,960 --> 00:47:58,560
but it burned the capacity meter.

1365
00:47:58,960 --> 00:48:01,360
The modern data engineer thinks in incident classes,

1366
00:48:01,360 --> 00:48:03,200
cost incidents, correctness incidents,

1367
00:48:03,200 --> 00:48:05,600
access incidents, and they design for detection.

1368
00:48:05,600 --> 00:48:07,200
Plans, scans versus returns,

1369
00:48:07,200 --> 00:48:08,800
violations, quarantines, lineage,

1370
00:48:08,800 --> 00:48:11,120
and auditability aren't optional extras.

1371
00:48:11,120 --> 00:48:13,360
They are the sensors that tell you the estate is decaying

1372
00:48:13,360 --> 00:48:15,040
before the business notices.

1373
00:48:15,040 --> 00:48:16,480
That distinction matters

1374
00:48:16,480 --> 00:48:18,640
because you can't manage what you can't observe

1375
00:48:18,640 --> 00:48:21,120
and you can't govern what you can't prove.

1376
00:48:21,120 --> 00:48:22,800
Now tie this back to the T-SQL mindset

1377
00:48:22,800 --> 00:48:25,120
because this is where people underestimate what's happening.

1378
00:48:25,120 --> 00:48:26,800
Most teams treat T-SQL,

1379
00:48:26,800 --> 00:48:30,160
schemas, roles, and plans as implementation details.

1380
00:48:30,160 --> 00:48:32,560
In a Fabric estate, those are control surfaces.

1381
00:48:32,560 --> 00:48:33,840
A schema is not a folder.

1382
00:48:33,840 --> 00:48:34,960
It's an ownership boundary.

1383
00:48:34,960 --> 00:48:36,240
A role is not a convenience.

1384
00:48:36,240 --> 00:48:37,520
It's an intent declaration.

1385
00:48:37,520 --> 00:48:38,720
A view is not a shortcut.

1386
00:48:38,720 --> 00:48:39,760
It's a contract wrapper.

1387
00:48:39,760 --> 00:48:42,720
An execution plan is not a performance troubleshooting tool.

1388
00:48:42,720 --> 00:48:44,640
It's the cost policy reality check.

1389
00:48:44,640 --> 00:48:46,320
And once you see those as control surfaces,

1390
00:48:46,320 --> 00:48:48,000
you stop asking Copilot to be smarter.

1391
00:48:48,000 --> 00:48:49,680
You start making the system stricter.

1392
00:48:49,680 --> 00:48:51,280
That is the only sustainable move.

1393
00:48:51,280 --> 00:48:54,160
Now, because this episode is for multiple audiences,

1394
00:48:54,160 --> 00:48:56,000
the role shift needs to be explicit

1395
00:48:56,000 --> 00:48:59,360
because each group tends to outsource the hard part to someone else.

1396
00:48:59,360 --> 00:49:02,640
Data engineer, if you don't enforce schema and contracts,

1397
00:49:02,640 --> 00:49:04,240
you're not engineering, you're relaying,

1398
00:49:04,240 --> 00:49:07,600
you're moving ambiguity from source to report at higher speed.

1399
00:49:07,600 --> 00:49:10,320
Analytics engineer, if logic lives in five tools,

1400
00:49:10,320 --> 00:49:12,320
you own the drift, not emotionally.

1401
00:49:12,320 --> 00:49:15,040
Operationally, you can't demand a single source of truth

1402
00:49:15,040 --> 00:49:17,920
while implementing five competing sources of logic.

1403
00:49:17,920 --> 00:49:20,880
Platform owner, if plans are unstable, cost is undefined.

1404
00:49:20,880 --> 00:49:23,120
You don't get to argue, it's just usage

1405
00:49:23,120 --> 00:49:24,800
when the query surface allows the scan.

1406
00:49:25,520 --> 00:49:27,760
Architect, if boundaries aren't explicit,

1407
00:49:27,760 --> 00:49:29,360
integration will decide them for you.

1408
00:49:29,360 --> 00:49:32,240
And integration always picks the path of least resistance,

1409
00:49:32,240 --> 00:49:33,520
not the path of least risk.

1410
00:49:33,520 --> 00:49:37,680
Executive, speed without control always bills you later.

1411
00:49:37,680 --> 00:49:40,720
Sometimes as money, sometimes as trust, sometimes as audit pain,

1412
00:49:40,720 --> 00:49:41,760
always as rework.

1413
00:49:41,760 --> 00:49:44,160
This is the modern identity.

1414
00:49:44,160 --> 00:49:46,240
Enforce intent at scale,

1415
00:49:46,240 --> 00:49:49,200
in a platform that will otherwise enforce entropy for you.

1416
00:49:49,200 --> 00:49:52,640
Operating model for Fabric plus Copilot,

1417
00:49:52,640 --> 00:49:54,400
enforcement, not education.

1418
00:49:54,400 --> 00:49:56,800
So here's the operating model and it is not inspiring.

1419
00:49:56,800 --> 00:49:58,080
It is enforceable.

1420
00:49:58,080 --> 00:50:01,120
First rule, AI drafts, humans approve.

1421
00:50:01,120 --> 00:50:03,600
Generation is typing, review is engineering.

1422
00:50:03,600 --> 00:50:06,240
You don't merge Copilot output because it compiled.

1423
00:50:06,240 --> 00:50:08,400
You merge it because it passed acceptance criteria

1424
00:50:08,400 --> 00:50:09,680
you can defend later.

1425
00:50:09,680 --> 00:50:12,720
And that means you define acceptance criteria that aren't vibes.

1426
00:50:12,720 --> 00:50:15,200
Schema checks, constraint checks, execution plans,

1427
00:50:15,200 --> 00:50:16,560
and security intent.

1428
00:50:16,560 --> 00:50:18,320
If it can't be validated mechanically,

1429
00:50:18,320 --> 00:50:20,160
it will be argued socially.

1430
00:50:20,160 --> 00:50:22,640
Social governance collapses under schedule pressure.

1431
00:50:22,640 --> 00:50:24,080
Mechanical governance survives.

1432
00:50:24,720 --> 00:50:26,960
Second rule, contracts before convenience.

1433
00:50:26,960 --> 00:50:28,640
The lakehouse is not your truth layer.

1434
00:50:28,640 --> 00:50:29,760
It's your intake layer.

1435
00:50:29,760 --> 00:50:31,680
The warehouse is where you commit to shape.

1436
00:50:31,680 --> 00:50:34,720
That distinction matters because schema on read is a drift engine.

1437
00:50:34,720 --> 00:50:37,440
If you let bronze drift into silver and call it agile,

1438
00:50:37,440 --> 00:50:38,560
you're not shipping fast.

1439
00:50:38,560 --> 00:50:41,040
You're spreading ambiguity across more downstream surfaces

1440
00:50:41,040 --> 00:50:42,640
at a higher refresh rate.

1441
00:50:42,640 --> 00:50:44,240
So you declare the primary boundary,

1442
00:50:44,240 --> 00:50:45,280
lakehouse to warehouse.

1443
00:50:45,280 --> 00:50:48,240
That boundary gets gates, not documentation, gates.

1444
00:50:48,240 --> 00:50:51,040
Third rule, execution plans are cost policy,

1445
00:50:51,040 --> 00:50:52,320
not an optimization hobby.

1446
00:50:52,320 --> 00:50:55,280
If you want deterministic spend on a shared capacity,

1447
00:50:55,280 --> 00:50:58,000
you treat plan stability like you treat change control.

1448
00:50:58,000 --> 00:51:00,000
Critical paths must have bounded predicates.

1449
00:51:00,000 --> 00:51:01,440
They must have selective filters.

1450
00:51:01,440 --> 00:51:03,360
They must avoid non-sargable predicates.

1451
00:51:03,360 --> 00:51:05,200
They must avoid SELECT *.

1452
00:51:05,200 --> 00:51:07,840
And they must not spill under expected concurrency.

1453
00:51:07,840 --> 00:51:09,840
If they do, you don't monitor it.

1454
00:51:09,840 --> 00:51:11,360
You redesign the surface.
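
As a rough illustration of turning those rules into a mechanical check, the sketch below lints a SQL string for three of them: `SELECT *`, a missing `WHERE` clause, and a function wrapped around a column on the left of a comparison (a common non-sargable shape). It is a regex heuristic, not a SQL parser, and `lint_query` is a hypothetical helper written for this example. A real gate would inspect the actual execution plan.

```python
import re

# "SELECT *" or "SELECT t.*"
SELECT_STAR = re.compile(r"\bselect\s+(?:\w+\.)?\*", re.IGNORECASE)

# A function call immediately left of a comparison, e.g. YEAR(order_date) = 2025.
# Heuristic only: it does not understand nesting, strings, or comments.
FUNC_ON_COLUMN = re.compile(
    r"\b\w+\s*\([^()]*\)\s*(?:=|<>|<|>|\blike\b)", re.IGNORECASE
)


def lint_query(sql: str) -> list[str]:
    """Return a list of findings for one query; an empty list means it passed."""
    findings = []
    if SELECT_STAR.search(sql):
        findings.append("select-star")
    parts = re.split(r"\bwhere\b", sql, maxsplit=1, flags=re.IGNORECASE)
    if len(parts) == 1:
        findings.append("unbounded: no WHERE clause")
    elif FUNC_ON_COLUMN.search(parts[1]):
        findings.append("non-sargable: function applied to column in predicate")
    return findings
```

Rewriting `WHERE YEAR(order_date) = 2025` as a half-open date range (`order_date >= '2025-01-01' AND order_date < '2026-01-01'`) is the kind of change that clears the third finding.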

1455
00:51:11,360 --> 00:51:13,680
Fourth rule, views and procedures over raw tables.

1456
00:51:13,680 --> 00:51:16,080
Always. This is the consumption API.

1457
00:51:16,080 --> 00:51:19,040
It protects cost, correctness, and security in one move.

1458
00:51:19,040 --> 00:51:22,080
Views give you stable contracts even when internals change.

1459
00:51:22,080 --> 00:51:23,840
Procedures give you parametrization,

1460
00:51:23,840 --> 00:51:26,400
so consumers don't invent their own query shapes.

1461
00:51:26,400 --> 00:51:28,640
And because everything points at the same surface,

1462
00:51:28,640 --> 00:51:31,200
you have one place to tune, one place to secure,

1463
00:51:31,200 --> 00:51:32,720
and one place to audit.

1464
00:51:32,720 --> 00:51:34,720
If you let raw tables be queryable by default,

1465
00:51:34,720 --> 00:51:37,440
you have accepted uncontrolled query shapes.

1466
00:51:37,440 --> 00:51:39,680
You've also accepted schema drift as a breaking change

1467
00:51:39,680 --> 00:51:40,960
you'll discover downstream.

1468
00:51:40,960 --> 00:51:41,920
That's not self-service.

1469
00:51:41,920 --> 00:51:43,840
That's unmanaged blast radius.

1470
00:51:43,840 --> 00:51:47,520
Fifth rule, CI/CD gates for schema and logic changes.

1471
00:51:47,520 --> 00:51:49,920
Fabric is fast enough that drift accumulates faster

1472
00:51:49,920 --> 00:51:52,800
than your team's ability to remember why things were designed

1473
00:51:52,800 --> 00:51:53,600
the way they were.

1474
00:51:53,600 --> 00:51:55,120
That's what entropy looks like.

1475
00:51:55,120 --> 00:51:57,360
Yesterday's exception becomes today's dependency.

1476
00:51:57,360 --> 00:52:01,200
So you put friction back where it belongs, in promotion,

1477
00:52:01,200 --> 00:52:02,720
not in exploration.

1478
00:52:02,720 --> 00:52:03,920
Exploration can be fast.

1479
00:52:03,920 --> 00:52:05,920
Production has gates.

1480
00:52:05,920 --> 00:52:07,200
Those gates should be boring.

1481
00:52:07,200 --> 00:52:09,920
Schema diffs require review, contract changes require

1482
00:52:09,920 --> 00:52:12,800
versioning, security changes require explicit intent,

1483
00:52:12,800 --> 00:52:15,520
and critical query paths require plan review.

1484
00:52:15,520 --> 00:52:17,120
Not because you love process,

1485
00:52:17,120 --> 00:52:19,520
because without gates, your estate becomes a pile of

1486
00:52:19,520 --> 00:52:22,400
artifacts that work until the day leadership asks

1487
00:52:22,400 --> 00:52:24,960
why a number changed and nobody can prove anything.
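
A boring gate can be very small. The sketch below compares two schema snapshots (column name to type string) and decides whether a promotion may proceed. The policy encoded here is an assumption made for illustration: any schema change needs a recorded review, and a removed or retyped column additionally needs a major version bump. `diff_schema` and `promotion_allowed` are hypothetical names, not a Fabric or CI vendor API.

```python
def diff_schema(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Classify column-level differences between two schema snapshots."""
    removed = sorted(set(old) - set(new))
    added = sorted(set(new) - set(old))
    retyped = sorted(c for c in set(old) & set(new) if old[c] != new[c])
    return {"removed": removed, "added": added, "retyped": retyped}


def promotion_allowed(
    old: dict[str, str],
    new: dict[str, str],
    old_version: tuple[int, int],   # (major, minor)
    new_version: tuple[int, int],
    reviewed: bool,
) -> bool:
    """Gate for promotion to production (assumed policy, see lead-in)."""
    d = diff_schema(old, new)
    if not any(d.values()):
        return True                      # nothing changed, nothing to gate
    if not reviewed:
        return False                     # schema diffs require review
    breaking = bool(d["removed"] or d["retyped"])
    if breaking and new_version[0] <= old_version[0]:
        return False                     # contract changes require versioning
    return True
```

Run in a pipeline step, a `False` result fails the build, which keeps the friction in promotion while leaving exploration untouched.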

1488
00:52:24,960 --> 00:52:28,080
Sixth rule, every layer assumes decay unless enforced.

1489
00:52:28,080 --> 00:52:30,320
Workspaces drift, permissions drift,

1490
00:52:30,320 --> 00:52:32,720
semantic models drift, definitions drift,

1491
00:52:32,720 --> 00:52:34,560
people leave, context disappears,

1492
00:52:34,560 --> 00:52:36,800
Copilot produces plausible artifacts,

1493
00:52:36,800 --> 00:52:38,880
over time, policies drift away from intent.

1494
00:52:38,880 --> 00:52:40,240
That is not a fabric problem.

1495
00:52:40,240 --> 00:52:42,960
That is the default state of any platform at scale.

1496
00:52:42,960 --> 00:52:44,400
So you design for proof.

1497
00:52:44,400 --> 00:52:46,000
Violation counts, quarantine counts,

1498
00:52:46,000 --> 00:52:48,560
plan baselines, lineage visibility, and audit trails.

1499
00:52:48,560 --> 00:52:49,840
You don't need a perfect system.

1500
00:52:49,840 --> 00:52:52,800
You need a system that tells you when it is becoming less true.
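
One way to picture such a sensor is a "scans versus returns" check against a stored plan baseline. The sketch below assumes you can obtain rows scanned and rows returned per query surface from some telemetry source. That source, the `QueryStats` shape, the surface names, and the 2x tolerance are all assumptions for illustration, not a documented Fabric interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryStats:
    surface: str        # view or procedure the query went through
    rows_scanned: int
    rows_returned: int


def scan_ratio(s: QueryStats) -> float:
    """Rows scanned per row returned; guards against division by zero."""
    return s.rows_scanned / max(s.rows_returned, 1)


def drifted(stats, baseline, tolerance=2.0):
    """Return surfaces with no baseline, or whose ratio exceeds baseline * tolerance."""
    flagged = []
    for s in stats:
        base = baseline.get(s.surface)
        if base is None or scan_ratio(s) > base * tolerance:
            flagged.append(s.surface)
    return flagged
```

A surface with no baseline is flagged on purpose: a query path nobody registered is itself a sign that something bypassed the boundary.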

1501
00:52:52,800 --> 00:52:55,440
And the strongest line to keep in your head when someone tries to

1502
00:52:55,440 --> 00:52:58,720
negotiate governance into guidance is this.

1503
00:52:58,720 --> 00:53:00,480
Fabric moves data fast.

1504
00:53:00,480 --> 00:53:02,800
Governance decides whether it stays true.

1505
00:53:02,800 --> 00:53:05,440
So the final takeaway is simple, and it's not flattering.

1506
00:53:05,440 --> 00:53:07,440
Fabric didn't simplify data engineering.

1507
00:53:07,440 --> 00:53:08,960
It simplified the mechanics.

1508
00:53:08,960 --> 00:53:11,200
It removed the ceremony that used to slow you down,

1509
00:53:11,200 --> 00:53:13,840
and it removed the padding that used to hide design omissions.

1510
00:53:13,840 --> 00:53:16,400
That's why these failures showed up in Fabric first.

1511
00:53:16,400 --> 00:53:17,680
But they aren't Fabric-specific.

1512
00:53:17,680 --> 00:53:19,840
They are the inevitable failure modes of any platform

1513
00:53:19,840 --> 00:53:22,400
that collapses ingestion, transformation, storage,

1514
00:53:22,400 --> 00:53:26,400
semantics, and consumption into a single fast surface.

1515
00:53:26,400 --> 00:53:28,080
When you can move data at machine speed,

1516
00:53:28,080 --> 00:53:30,320
you can also ship ambiguity at machine speed.

1517
00:53:30,320 --> 00:53:33,280
And once ambiguity ships, it becomes somebody's dashboard,

1518
00:53:33,280 --> 00:53:34,560
somebody's executive summary,

1519
00:53:34,560 --> 00:53:36,000
somebody's single source of truth

1520
00:53:36,000 --> 00:53:38,160
that exists in exactly one workspace

1521
00:53:38,160 --> 00:53:41,040
because it was the fastest place to fix the problem.

1522
00:53:41,040 --> 00:53:42,720
If you're a senior data engineer,

1523
00:53:42,720 --> 00:53:45,040
the job now is to stop building artifacts

1524
00:53:45,040 --> 00:53:46,720
and start enforcing invariants.

1525
00:53:46,720 --> 00:53:48,080
If you're an analytics engineer,

1526
00:53:48,080 --> 00:53:50,960
the job is to stop distributing logic across five tools

1527
00:53:50,960 --> 00:53:52,960
and then acting surprised when truth fragments.

1528
00:53:52,960 --> 00:53:56,400
If you own the platform, the job is to stop treating capacity spikes

1529
00:53:56,400 --> 00:53:59,200
like billing weirdness and start treating query surfaces

1530
00:53:59,200 --> 00:54:01,040
and execution plans as policy.

1531
00:54:01,040 --> 00:54:02,720
If you're the inheriting architect,

1532
00:54:02,720 --> 00:54:04,880
the job is to make boundaries explicit

1533
00:54:04,880 --> 00:54:06,880
before integration decides them for you.

1534
00:54:06,880 --> 00:54:09,520
And if you're a leader, the job is to stop buying speed

1535
00:54:09,520 --> 00:54:11,760
and then acting shocked when control erodes.

1536
00:54:11,760 --> 00:54:13,440
Now, if you want a concrete next step,

1537
00:54:13,440 --> 00:54:16,240
here's the real prompt I want you to answer for your own estate.

1538
00:54:16,240 --> 00:54:19,120
What artifact do you trust most when something feels wrong?

1539
00:54:19,120 --> 00:54:21,200
Execution plan, capacity spike trace,

1540
00:54:21,200 --> 00:54:23,360
quarantine count, lineage view, audit log?

1541
00:54:23,360 --> 00:54:24,320
Because whatever you pick,

1542
00:54:24,320 --> 00:54:26,480
that's what your governance model is actually built on,

1543
00:54:26,480 --> 00:54:27,760
whether you admit it or not.

1544
00:54:27,760 --> 00:54:31,440
Next episode is how to design Fabric data contracts

1545
00:54:31,440 --> 00:54:32,960
that survive Copilot.

1546
00:54:32,960 --> 00:54:33,960
Not a diagram.

1547
00:54:33,960 --> 00:54:35,280
A contract that blocks drift,

1548
00:54:35,280 --> 00:54:36,480
survives workforce churn

1549
00:54:36,480 --> 00:54:38,400
and still lets teams move fast

1550
00:54:38,400 --> 00:54:41,840
without turning your capacity into a random number generator.

1551
00:54:41,840 --> 00:54:43,640
Fabric is a speed multiplier

1552
00:54:43,640 --> 00:54:45,440
and it multiplies your governance debt

1553
00:54:45,440 --> 00:54:48,160
with the same enthusiasm it multiplies your delivery.

1554
00:54:48,160 --> 00:54:50,800
Drop a comment with the single artifact you trust most.

1555
00:54:50,800 --> 00:54:53,200
Execution plan, capacity metrics,

1556
00:54:53,200 --> 00:54:55,040
violation count or lineage,

1557
00:54:55,040 --> 00:54:57,280
and tell me which failure mode hit you hardest,

1558
00:54:57,280 --> 00:54:59,760
cost, contracts or security.

1559
00:54:59,760 --> 00:55:01,760
Subscribe for the data contract episode.