The Anatomy of an Auditable ESG Stack

This episode explains why the EU’s VAT in the Digital Age (ViDA) initiative is not a compliance upgrade, but a fundamental shift in how VAT operates—from delayed, periodic reporting to continuous, transaction-level control. Traditional VAT models relied on time gaps between transactions and reporting to absorb errors, corrections, and ambiguity. ViDA removes that buffer by requiring structured e-invoices and near real-time digital reporting, forcing VAT correctness at the moment each transaction occurs.

The discussion reframes ViDA as a control plane imposed on enterprise systems. Instead of inspecting paperwork after the fact, tax authorities now evaluate the behavior of the systems that generate invoices, including tax determination logic, master data quality, integration reliability, and exception handling. Organizations that attempt to treat ViDA as a bolt-on e-invoicing project or a middleware connector risk building brittle solutions that fail under validation, rejection handling, and audit scrutiny.

Using Dynamics 365 Finance and the Power Platform as reference architecture, the episode outlines why VAT must become deterministic rather than probabilistic. Exceptions, once tolerated and resolved later, now generate immediate operational pressure. Master data becomes a regulated input, not an operational convenience. Invoicing, reporting, corrections, refunds, and platform-based deemed supplier rules all converge into a single transaction pipeline that must be provable end-to-end.

VAT in the Digital Age (ViDA) fundamentally changes how VAT works in the EU. This episode explains why ViDA is not an incremental compliance initiative, but a control plane shift that forces enterprises to produce correct, auditable VAT outcomes at transaction speed. We break down what this means for ERP systems, e-invoicing, digital reporting, platform business models, and Microsoft-centric architectures using Dynamics 365 Finance and Power Platform.

What Changed With VAT in the Digital Age (ViDA)

VAT moves from periodic reporting to near real-time transaction controls
Structured e-invoices replace loosely governed documents
Reporting happens per transaction, not as aggregated summaries
Authorities inspect system behavior, not just submitted files

ViDA compresses the gap between transaction and inspection, removing the traditional error buffer enterprises relied on.

Why ViDA Is Not a Compliance Project

Compliance assumes work stays in tax and finance with IT support later
ViDA requires VAT correctness by design, not cleanup after posting
ERP, billing, commerce, and platforms become part of the VAT control surface
Treating ViDA as a bolt-on e-invoicing connector leads to brittle failures

ViDA enforces architectural determinism. Systems must produce compliant transactions automatically or generate visible exceptions immediately.

The Control Plane Shift Explained

Transaction systems become the data plane
ViDA introduces a control plane that enforces semantics, timing, and reconciliation
Every invoice becomes a regulated data packet with deadlines and acknowledgements
Probabilistic VAT models collapse under real-time validation

What used to be “usually right” becomes operationally unacceptable.

Why E-Invoicing Alone Fails

E-invoicing is not a file format change
Invoice correctness depends on tax determination, master data, sequencing, and lifecycle
Validation failures expose weaknesses across ERP, integrations, and data governance
Middleware cannot manufacture missing or incorrect business truth

E-invoicing reveals problems; it does not fix them.

Dynamics 365 Finance Under ViDA

Becomes the source of regulated events, not just accounting truth
Tax determination must be final at posting time
Sloppy posting followed by clean reporting no longer works
Invoice numbering, credit notes, and corrections become regulated behavior

If finance posting relies on later review, the real system of record becomes human inboxes—not the ERP.

The Role of Power Platform

Power Automate becomes the exception and remediation engine
Exceptions must follow governed lifecycles: triage, fix, resubmit, evidence capture
Power BI becomes the daily compliance health monitor, not a management dashboard
Visibility focuses on drift, not just volume

Automation is mandatory because humans cannot resolve high-volume exceptions at transaction speed.

The Three ViDA Pillars Behave as One System

Digital Reporting and E-Invoicing
Transaction-level reporting with structured semantics and deadlines
Platform Deemed Supplier Rules
Platforms become VAT counterparties when sellers lack VAT obligations
Single VAT Registration (OSS Expansion)
Fewer registrations, but stricter evidence and centralized truth

All pillars share the same invariant: calculate correctly, report correctly, reconcile correctly, and prove it later.

Platforms as Tax Counterparties

Platforms must decide VAT liability at transaction time
Seller status, VAT IDs, and location evidence become runtime inputs
Refunds, chargebacks, and cancellations become regulated tax events
Pricing, contracts, payouts, and reversals must align with tax decisions

Under ViDA, platforms cannot hide behind intermediated models when liability shifts to them.

Exceptions Are Not Edge Cases

Exceptions are structural under transaction-level controls
Missing VAT IDs, invalid data, timing failures, and mismatches are inevitable
The real question is whether exception handling is engineered or accidental
Ungoverned exceptions create permanent operational debt

A deterministic exception lifecycle is a prerequisite for scale.
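
A minimal sketch of what a deterministic exception lifecycle could look like, using the triage, fix, resubmit, and evidence-capture stages named under The Role of Power Platform. The state names, the `VatException` class, and the sample invoice ID are illustrative assumptions, not part of any Dynamics 365 or Power Automate API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Allowed transitions between lifecycle states (state names are hypothetical).
TRANSITIONS = {
    "raised": {"triaged"},
    "triaged": {"fixed"},
    "fixed": {"resubmitted"},
    "resubmitted": {"accepted", "triaged"},  # a rejection goes back to triage
    "accepted": set(),                       # terminal state
}


@dataclass
class VatException:
    invoice_id: str
    reason: str
    state: str = "raised"
    evidence: list = field(default_factory=list)

    def advance(self, new_state: str, actor: str, note: str) -> None:
        """Move to new_state if the transition is allowed, and record evidence."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.evidence.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "from": self.state,
            "to": new_state,
            "note": note,
        })
        self.state = new_state


# Usage: every step leaves an evidence entry; skipping a step raises an error.
exc = VatException("INV-001", "missing VAT ID")
exc.advance("triaged", "tax-ops", "customer master record incomplete")
exc.advance("fixed", "master-data", "VAT ID added")
exc.advance("resubmitted", "integration", "invoice resent")
exc.advance("accepted", "integration", "acknowledgement received")
```

The point of the transition table is that an exception cannot be closed by an undocumented shortcut: any path that is not listed fails loudly instead of silently succeeding.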

The Timeline Reality

ViDA is phased, with milestones through 2028, 2030, and beyond
Member states can move earlier than headline dates
Systems must survive years of mixed regimes and partial adoption
The real work is building the pipeline, not hitting a single date

Treat the timeline as an engineering runway, not a compliance calendar.

Key Takeaways

ViDA rewards architecture and punishes patchwork
VAT becomes a transaction-speed systems problem
Master data is a tax control surface
Explainability, traceability, and evidence matter more than tooling
Continuous engineering beats last-minute compliance programs

Final Thought

Under VAT in the Digital Age, delay is replaced by observability.
If your systems cannot explain every transaction as it happens, the authority will—faster than you expect.

Transcript

1
00:00:00,000 --> 00:00:02,880
Most organizations treat ESG reporting like a narrative.

2
00:00:02,880 --> 00:00:04,800
Auditors treat it like evidence.

3
00:00:04,800 --> 00:00:06,680
And evidence has rules: origin, integrity,

4
00:00:06,680 --> 00:00:08,800
repeatability, and access control.

5
00:00:08,800 --> 00:00:12,360
If your number exists because someone edited a spreadsheet,

6
00:00:12,360 --> 00:00:13,800
your stack isn't a stack.

7
00:00:13,800 --> 00:00:14,720
It's a story.

8
00:00:14,720 --> 00:00:17,120
In this episode, this is what gets built.

9
00:00:17,120 --> 00:00:21,560
A minimal, auditable OESG architecture on Microsoft Cloud

10
00:00:21,560 --> 00:00:24,280
that you can replicate: identity, immutability,

11
00:00:24,280 --> 00:00:26,400
governed calculations, lineage,

12
00:00:26,400 --> 00:00:29,280
and a reporting layer that doesn't rewrite history.

13
00:00:29,280 --> 00:00:31,520
And there's one reason dashboards are the fastest path

14
00:00:31,520 --> 00:00:32,480
to audit failure.

15
00:00:32,480 --> 00:00:33,360
Coming up.

16
00:00:33,360 --> 00:00:35,040
The foundational misunderstanding.

17
00:00:35,040 --> 00:00:36,320
ESG isn't a report.

18
00:00:36,320 --> 00:00:37,840
It's a system of record.

19
00:00:37,840 --> 00:00:39,880
The core misconception is comforting.

20
00:00:39,880 --> 00:00:42,760
ESG is a document, a disclosure, a set of charts,

21
00:00:42,760 --> 00:00:45,040
a few paragraphs that say we're improving.

22
00:00:45,040 --> 00:00:47,560
That framing works right up until someone asks for proof.

23
00:00:47,560 --> 00:00:49,800
In architectural terms, ESG is not a report.

24
00:00:49,800 --> 00:00:51,040
It is a system of record.

25
00:00:51,040 --> 00:00:54,160
That distinction matters because a report is an output artifact.

26
00:00:54,160 --> 00:00:56,200
It can be produced by almost any workflow,

27
00:00:56,200 --> 00:00:58,840
including workflows that should never survive contact

28
00:00:58,840 --> 00:00:59,760
with assurance.

29
00:00:59,760 --> 00:01:01,600
A system of record is different.

30
00:01:01,600 --> 00:01:04,640
It is a controlled environment where inputs, transformations

31
00:01:04,640 --> 00:01:07,680
and outputs are all tracked, replayable, and attributable.

32
00:01:07,680 --> 00:01:10,880
And OESG, operational ESG, is simply

33
00:01:10,880 --> 00:01:14,080
the adult version of ESG, measurable, decision-ready,

34
00:01:14,080 --> 00:01:15,000
and auditable.

35
00:01:15,000 --> 00:01:18,200
If ESG is going to be used in corporate disclosures, regulatory

36
00:01:18,200 --> 00:01:20,120
submissions or investor communications,

37
00:01:20,120 --> 00:01:22,880
the underlying system has to behave like financial reporting

38
00:01:22,880 --> 00:01:26,160
systems, not in theme, in mechanics.

39
00:01:26,160 --> 00:01:28,800
So the anatomy starts with a control system model.

40
00:01:28,800 --> 00:01:31,840
Inputs, transformations, outputs, and attestations.

41
00:01:31,840 --> 00:01:33,520
Inputs are the operational facts.

42
00:01:33,520 --> 00:01:36,000
Energy consumption, fuel use, travel, procurement,

43
00:01:36,000 --> 00:01:39,000
line items, workforce counts, water usage.

44
00:01:39,000 --> 00:01:40,840
Transformations are the governed processes

45
00:01:40,840 --> 00:01:43,920
that normalize units, map to organizational structure,

46
00:01:43,920 --> 00:01:46,240
apply emission factors, and compute KPIs.

47
00:01:46,240 --> 00:01:49,440
Outputs are the period-specific KPI tables and disclosures.

48
00:01:49,440 --> 00:01:52,600
Attestations are the approvals, sign-offs, and audit artifacts

49
00:01:52,600 --> 00:01:54,840
that prove the outputs were produced under control.

50
00:01:54,840 --> 00:01:57,360
Most organizations skip straight to outputs.

51
00:01:57,360 --> 00:01:59,480
They build dashboards, they produce slides,

52
00:01:59,480 --> 00:02:02,560
they call it reporting, but they never build the chain of custody.

53
00:02:02,560 --> 00:02:05,280
Chain of custody is the real product, not the pretty chart,

54
00:02:05,280 --> 00:02:07,080
because assurance doesn't audit your chart.

55
00:02:07,080 --> 00:02:10,080
It audits whether the number behind the chart is defensible,

56
00:02:10,080 --> 00:02:12,160
where it came from, who touched it, how it changed,

57
00:02:12,160 --> 00:02:14,240
which logic produced it, and whether anyone could have

58
00:02:14,240 --> 00:02:16,120
quietly altered it after close.

59
00:02:16,120 --> 00:02:19,360
This is where deterministic versus probabilistic ESG shows up.

60
00:02:19,360 --> 00:02:21,760
Deterministic ESG is boring and boring is good.

61
00:02:21,760 --> 00:02:24,280
Given the same raw inputs, the same factor versions,

62
00:02:24,280 --> 00:02:26,560
and the same calculation logic, the system produces

63
00:02:26,560 --> 00:02:28,080
the same outputs every time.

64
00:02:28,080 --> 00:02:31,120
Re-run last year and two years ago, and you get the same result.

65
00:02:31,120 --> 00:02:33,840
That's what auditors expect, even if they don't say the word,

66
00:02:33,840 --> 00:02:35,040
deterministic.

67
00:02:35,040 --> 00:02:37,640
Probabilistic ESG is what you get when human edits

68
00:02:37,640 --> 00:02:39,960
are allowed to masquerade as process.

69
00:02:39,960 --> 00:02:42,360
Numbers drift because someone fixed a file,

70
00:02:42,360 --> 00:02:45,840
optimized a model, or updated a mapping.

71
00:02:45,840 --> 00:02:47,760
The system still produces an output,

72
00:02:47,760 --> 00:02:49,680
but it can't reproduce its own past.

73
00:02:49,680 --> 00:02:50,720
It can't explain itself.

74
00:02:50,720 --> 00:02:52,080
It can't prove integrity.

75
00:02:52,080 --> 00:02:54,440
And once you can't reproduce, you can't assure.

76
00:02:54,440 --> 00:02:55,840
Here's the thing most people miss.

77
00:02:55,840 --> 00:02:57,360
Auditors don't need perfection.

78
00:02:57,360 --> 00:02:58,640
They need controllability.

79
00:02:58,640 --> 00:03:00,840
They need you to show that changes are visible,

80
00:03:00,840 --> 00:03:02,520
bounded, approved, and attributable.

81
00:03:02,520 --> 00:03:05,120
When you can't do that, every number becomes a debate.

82
00:03:05,120 --> 00:03:06,360
And debates are expensive.

83
00:03:06,360 --> 00:03:09,600
So let's talk about the audit questions that break weak stacks.

84
00:03:09,600 --> 00:03:12,320
Not because auditors are evil, because this is their job.

85
00:03:12,320 --> 00:03:13,920
Who changed it? When did they change it?

86
00:03:13,920 --> 00:03:15,080
Why did they change it?

87
00:03:15,080 --> 00:03:16,360
What approval existed?

88
00:03:16,360 --> 00:03:18,240
What version of the factors did you use?

89
00:03:18,240 --> 00:03:20,120
What version of the calculation logic did you use?

90
00:03:20,120 --> 00:03:22,280
What inputs were in scope for the period close?

91
00:03:22,280 --> 00:03:24,560
Who had access to alter raw data, curated data,

92
00:03:24,560 --> 00:03:25,480
and reported outputs?

93
00:03:25,480 --> 00:03:28,240
Can you show lineage from the KPI back to the source record

94
00:03:28,240 --> 00:03:29,840
without reconstructing a PowerPoint?

95
00:03:29,840 --> 00:03:31,640
If your answer to any of these is we think,

96
00:03:31,640 --> 00:03:33,120
you don't have OESG.

97
00:03:33,120 --> 00:03:34,240
You have a narrative.

98
00:03:34,240 --> 00:03:36,080
Now here's where it gets uncomfortable.

99
00:03:36,080 --> 00:03:38,720
Most ESG programs treat the sustainability team

100
00:03:38,720 --> 00:03:40,040
like the owner of truth.

101
00:03:40,040 --> 00:03:41,720
But systems don't care about job titles,

102
00:03:41,720 --> 00:03:44,240
systems care about permissions and pathways.

103
00:03:44,240 --> 00:03:47,160
If a single person can both submit data and adjust

104
00:03:47,160 --> 00:03:48,920
the calculation and publish the dashboard,

105
00:03:48,920 --> 00:03:50,000
you don't have governance.

106
00:03:50,000 --> 00:03:52,160
You have conditional chaos.

107
00:03:52,160 --> 00:03:53,680
And it accumulates.

108
00:03:53,680 --> 00:03:56,000
Every exception becomes an entropy generator.

109
00:03:56,000 --> 00:03:57,840
One more undocumented pathway for numbers

110
00:03:57,840 --> 00:03:59,920
to change without leaving a clean trail.

111
00:03:59,920 --> 00:04:01,880
This clicked for a lot of teams when

112
00:04:01,880 --> 00:04:03,960
assurance started asking for evidence packs,

113
00:04:03,960 --> 00:04:06,960
not just the final KPI, but the supporting documents,

114
00:04:06,960 --> 00:04:09,720
the factor library provenance, the ingestion logs,

115
00:04:09,720 --> 00:04:12,200
the validation results, and the approvals.

116
00:04:12,200 --> 00:04:14,360
Suddenly the ESG report wasn't the deliverable.

117
00:04:14,360 --> 00:04:17,000
The deliverable was the ability to prove the report.

118
00:04:17,000 --> 00:04:18,760
So the architecture has a simple objective

119
00:04:18,760 --> 00:04:20,720
enforce chain of custody at scale.

120
00:04:20,720 --> 00:04:23,080
That means every ESG number has to be traceable

121
00:04:23,080 --> 00:04:24,720
through four properties.

122
00:04:24,720 --> 00:04:25,560
Origin?

123
00:04:25,560 --> 00:04:27,560
The system can identify the source record

124
00:04:27,560 --> 00:04:28,800
and the source system.

125
00:04:28,800 --> 00:04:29,640
Transformation?

126
00:04:29,640 --> 00:04:31,880
The system can show which pipeline and which logic

127
00:04:31,880 --> 00:04:33,560
produced the derived record.

128
00:04:33,560 --> 00:04:34,400
Integrity?

129
00:04:34,400 --> 00:04:37,560
The system can show the data wasn't overwritten post-close.

130
00:04:37,560 --> 00:04:40,600
And changes are recorded as new versions or adjustments,

131
00:04:40,600 --> 00:04:42,280
not silent edits.

132
00:04:42,280 --> 00:04:45,280
Access? The system can show who could touch what and when.

133
00:04:45,280 --> 00:04:47,440
Once you accept ESG as a system of record,

134
00:04:47,440 --> 00:04:48,920
everything else becomes obvious.

135
00:04:48,920 --> 00:04:51,600
Dashboards become presentation, not computation.

136
00:04:51,600 --> 00:04:53,520
Spreadsheets become controlled submissions,

137
00:04:53,520 --> 00:04:55,040
not a source of truth.

138
00:04:55,040 --> 00:04:57,840
One-off fixes become formal adjustments with approvals.

139
00:04:57,840 --> 00:05:00,000
And every component you choose in Microsoft Cloud

140
00:05:00,000 --> 00:05:02,760
starts mapping to a property the auditor will eventually

141
00:05:02,760 --> 00:05:03,280
demand.

142
00:05:03,280 --> 00:05:05,560
Now before we go further, you need a working definition

143
00:05:05,560 --> 00:05:09,320
of auditable in system terms, because it's not a checkbox

144
00:05:09,320 --> 00:05:11,440
and it's definitely not a screenshot.

145
00:05:11,440 --> 00:05:12,400
That comes next.

146
00:05:12,400 --> 00:05:15,040
The audit-grade requirements: immutability, reproducibility,

147
00:05:15,040 --> 00:05:16,800
lineage, separation of duties.

148
00:05:16,800 --> 00:05:19,400
So what does auditable mean when it stops being a vibe

149
00:05:19,400 --> 00:05:21,400
and starts being a system property?

150
00:05:21,400 --> 00:05:22,960
It collapses into four requirements,

151
00:05:22,960 --> 00:05:25,600
not because Microsoft says so, because auditors behave

152
00:05:25,600 --> 00:05:28,280
predictably and systems either withstand that pressure

153
00:05:28,280 --> 00:05:29,360
or they don't.

154
00:05:29,360 --> 00:05:32,120
Immutability, reproducibility, lineage,

155
00:05:32,120 --> 00:05:33,840
and separation of duties.

156
00:05:33,840 --> 00:05:35,960
First, immutability.

157
00:05:35,960 --> 00:05:38,840
Immutability is not, we promise we won't change it.

158
00:05:38,840 --> 00:05:41,640
Immutability is, the platform will not let you change it.

159
00:05:41,640 --> 00:05:42,960
That's the entire point.

160
00:05:42,960 --> 00:05:45,640
On Microsoft Cloud, that shows up as write-once,

161
00:05:45,640 --> 00:05:47,800
read-many behavior on Azure Blob Storage

162
00:05:47,800 --> 00:05:51,240
or ADLS Gen 2 through immutable storage policies.

163
00:05:51,240 --> 00:05:54,080
After period close, your raw evidence and your period outputs

164
00:05:54,080 --> 00:05:56,960
have to stop being mutable objects and become records.

165
00:05:56,960 --> 00:05:59,200
That distinction matters because most ESG programs

166
00:05:59,200 --> 00:06:01,240
close a period socially, not technically.

167
00:06:01,240 --> 00:06:03,320
People agree it's closed, but the storage layer

168
00:06:03,320 --> 00:06:04,520
still allows overwrites.

169
00:06:04,520 --> 00:06:06,800
So someone fixes a typo, reruns a pipeline,

170
00:06:06,800 --> 00:06:09,040
uploads a corrected file, and now the evidence

171
00:06:09,040 --> 00:06:10,840
for the closed period silently changes.

172
00:06:10,840 --> 00:06:12,560
Your process still feels controlled,

173
00:06:12,560 --> 00:06:14,120
but the system behavior is not.

174
00:06:14,120 --> 00:06:15,560
Auditors don't audit feelings.

175
00:06:15,560 --> 00:06:17,560
They audit whether changes were possible.

176
00:06:17,560 --> 00:06:20,080
Time-based retention is the operational version

177
00:06:20,080 --> 00:06:21,440
of immutability.

178
00:06:21,440 --> 00:06:23,320
You lock data for a defined interval,

179
00:06:23,320 --> 00:06:25,320
so it can't be modified or deleted.

180
00:06:25,320 --> 00:06:27,400
Legal hold is the litigation version.

181
00:06:27,400 --> 00:06:30,120
It stays locked until someone with authority clears it.

182
00:06:30,120 --> 00:06:31,760
The consequence is the same either way.

183
00:06:31,760 --> 00:06:33,760
Overwrites become illegal.

184
00:06:33,760 --> 00:06:37,080
Which means your pipeline design has to evolve from replace

185
00:06:37,080 --> 00:06:38,640
to publish a new version.
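
A minimal sketch of the difference between "replace" and "publish a new version". It uses an in-memory stand-in for a write-once container; `WriteOnceStore`, `publish`, and the path layout are hypothetical names for illustration and are not the Azure Storage SDK.

```python
import hashlib


class WriteOnceStore:
    """In-memory stand-in for a WORM container: existing paths cannot be replaced."""

    def __init__(self):
        self._blobs = {}

    def put(self, path: str, data: bytes) -> str:
        # Refuse to overwrite, mimicking what an immutability policy enforces.
        if path in self._blobs:
            raise PermissionError(f"{path} is immutable; publish a new version")
        self._blobs[path] = data
        return hashlib.sha256(data).hexdigest()

    def get(self, path: str) -> bytes:
        return self._blobs[path]

    def versions(self, prefix: str) -> list:
        return sorted(p for p in self._blobs if p.startswith(prefix))


def publish(store: WriteOnceStore, period: str, name: str, data: bytes) -> str:
    """Write data as the next version under period/name instead of replacing it."""
    prefix = f"{period}/{name}/v"
    version = len(store.versions(prefix)) + 1
    path = f"{prefix}{version:04d}"
    store.put(path, data)
    return path


# Usage: a correction becomes version 2; version 1 stays readable as evidence.
store = WriteOnceStore()
first = publish(store, "FY2024", "energy.csv", b"site,kwh\nA,100\n")
second = publish(store, "FY2024", "energy.csv", b"site,kwh\nA,110\n")
```

A pipeline written this way never needs overwrite permission on the evidence zone, which is what lets the storage policy be strict.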

186
00:06:38,640 --> 00:06:40,640
Second, reproducibility.

187
00:06:40,640 --> 00:06:44,280
Reproducibility is the ability to rerun FY-1 and FY-2

188
00:06:44,280 --> 00:06:45,480
and get the same result.

189
00:06:45,480 --> 00:06:47,560
Not similar, not close.

190
00:06:47,560 --> 00:06:48,640
The same.

191
00:06:48,640 --> 00:06:51,200
That means three things must be frozen per period.

192
00:06:51,200 --> 00:06:53,560
Inputs, factors, and logic.

193
00:06:53,560 --> 00:06:55,040
Most people only freeze inputs,

194
00:06:55,040 --> 00:06:56,680
and even that is usually wishful thinking.

195
00:06:56,680 --> 00:06:58,920
The system needs to freeze the factor library versions

196
00:06:58,920 --> 00:07:01,360
used for that period and freeze the calculation artifacts

197
00:07:01,360 --> 00:07:02,200
that reference them.

198
00:07:02,200 --> 00:07:04,880
If you rerun with latest factors or latest code,

199
00:07:04,880 --> 00:07:06,320
you're not reproducing history.

200
00:07:06,320 --> 00:07:07,040
You're rewriting it.

201
00:07:07,040 --> 00:07:09,880
Reproducibility is why dashboard math is an audit trap.

202
00:07:09,880 --> 00:07:12,320
You can't prove what logic produced last year's number

203
00:07:12,320 --> 00:07:15,080
if the logic lives in a constantly edited semantic model.

204
00:07:15,080 --> 00:07:17,280
Even if the code is technically visible,

205
00:07:17,280 --> 00:07:19,200
it's not governed like a calculation engine.

206
00:07:19,200 --> 00:07:21,120
There's no concept of a period-bound release,

207
00:07:21,120 --> 00:07:23,120
an approved version and a locked output.

208
00:07:23,120 --> 00:07:25,040
Auditors don't need your DAX to be clever.

209
00:07:25,040 --> 00:07:26,560
They need it to stop moving.
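
A minimal sketch of freezing inputs, factors, and logic per period, assuming a simple "activity times factor" calculation. The function names, the record fields, and the factor values are made up for illustration; real emission factors come from a governed factor library.

```python
import hashlib
import json


def fingerprint(obj) -> str:
    """Stable SHA-256 over a canonical JSON encoding of obj."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_period(inputs, factors, logic_version):
    """Compute emissions per row and return the total plus a run manifest."""
    rows = [
        {"site": r["site"], "kg_co2e": r["kwh"] * factors[r["source"]]}
        for r in inputs
    ]
    total = sum(row["kg_co2e"] for row in rows)
    manifest = {
        "inputs_sha256": fingerprint(inputs),
        "factors_sha256": fingerprint(factors),
        "logic_version": logic_version,
        "output_sha256": fingerprint(rows),
    }
    return total, manifest


# Usage: the same frozen inputs, factors, and logic give the same manifest.
inputs = [
    {"site": "A", "source": "grid", "kwh": 1000},
    {"site": "B", "source": "grid", "kwh": 2000},
]
frozen_factors = {"grid": 0.5}   # illustrative value only
total, manifest = run_period(inputs, frozen_factors, "calc-v1")
rerun_total, rerun_manifest = run_period(inputs, frozen_factors, "calc-v1")

# Rerunning with "latest" factors produces a different manifest, which makes
# the rewrite visible instead of silent.
_, drifted_manifest = run_period(inputs, {"grid": 0.25}, "calc-v1")
```

Storing the manifest next to the period output gives a reviewer something concrete to compare when a rerun is requested.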

210
00:07:26,560 --> 00:07:28,040
Third, lineage.

211
00:07:28,040 --> 00:07:31,280
Lineage is the answer to a single question

212
00:07:31,280 --> 00:07:32,880
that destroys weak stacks.

213
00:07:32,880 --> 00:07:34,440
Where did this number come from?

214
00:07:34,440 --> 00:07:36,840
Not philosophically, mechanically.

215
00:07:36,840 --> 00:07:39,680
Lineage is origin to transformation, to consumption,

216
00:07:39,680 --> 00:07:42,640
source system record, to ingested file or table

217
00:07:42,640 --> 00:07:45,080
through transformations, into curated models,

218
00:07:45,080 --> 00:07:48,680
into reported outputs, into the data set that Power BI reads.

219
00:07:48,680 --> 00:07:51,200
If you can't trace it quickly, you will trace it slowly.

220
00:07:51,200 --> 00:07:53,800
And slowly means meetings, screenshots,

221
00:07:53,800 --> 00:07:55,400
and spreadsheet archaeology.

222
00:07:55,400 --> 00:07:58,040
That is not assurance. That is theater.

223
00:07:58,040 --> 00:08:01,520
Microsoft Purview exists because human memory does not scale.

224
00:08:01,520 --> 00:08:03,280
It's the metadata system that turns

225
00:08:03,280 --> 00:08:06,200
"we think this is how it flows" into "here is the graph."

226
00:08:06,200 --> 00:08:08,360
It also becomes your change management weapon.

227
00:08:08,360 --> 00:08:10,520
Before you change a pipeline or a calculation,

228
00:08:10,520 --> 00:08:12,080
you can see downstream impact.

229
00:08:12,080 --> 00:08:14,440
Without lineage, every change is a blind deployment

230
00:08:14,440 --> 00:08:16,280
into your own reporting boundary.
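
A minimal sketch of lineage as a graph that can be walked in both directions: downstream for change impact, upstream to answer "where did this number come from?". The asset names and the `LINEAGE` dictionary are hypothetical, and this is plain Python, not the Microsoft Purview API.

```python
from collections import deque

# Edges point from an asset to the assets derived from it (names are made up).
LINEAGE = {
    "erp.energy_invoices": ["raw.energy"],
    "raw.energy": ["curated.energy_normalized"],
    "curated.energy_normalized": ["reported.scope2_kpi"],
    "reported.scope2_kpi": ["powerbi.esg_dataset"],
}


def walk(graph, start):
    """Breadth-first list of every node reachable from start."""
    seen, queue = [], deque([start])
    while queue:
        for nxt in graph.get(queue.popleft(), []):
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen


def downstream(asset):
    """Everything that would be affected by changing asset."""
    return walk(LINEAGE, asset)


def upstream(asset):
    """Everything asset was derived from, nearest source first."""
    reverse = {}
    for parent, children in LINEAGE.items():
        for child in children:
            reverse.setdefault(child, []).append(parent)
    return walk(reverse, asset)


impact = downstream("raw.energy")
origin = upstream("powerbi.esg_dataset")
```

Checking `downstream` before a pipeline change and `upstream` during an audit request are the two questions the transcript describes.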

231
00:08:16,280 --> 00:08:18,040
And yes, product capabilities evolve.

232
00:08:18,040 --> 00:08:18,880
That's normal.

233
00:08:18,880 --> 00:08:20,040
Your requirement does not evolve.

234
00:08:20,040 --> 00:08:22,200
The requirement is explainability under pressure.

235
00:08:22,200 --> 00:08:24,360
Fourth, separation of duties.

236
00:08:24,360 --> 00:08:27,280
This one is where most ESG programs quietly fail

237
00:08:27,280 --> 00:08:28,640
because it's inconvenient.

238
00:08:28,640 --> 00:08:29,960
But the logic is simple.

239
00:08:29,960 --> 00:08:32,880
The person who submits data cannot be the person who approves it,

240
00:08:32,880 --> 00:08:34,400
and the person who changes logic

241
00:08:34,400 --> 00:08:37,120
cannot be the person who publishes the reported outputs.

242
00:08:37,120 --> 00:08:39,080
You need role separation across data entry,

243
00:08:39,080 --> 00:08:41,440
validation, calculation, approval, and reporting.
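
A minimal sketch of checking separation of duties from a role-assignment export. The role names, the `CONFLICTS` pairs, and the sample identities are assumptions for illustration; in practice the assignments would come from the identity platform's own reports.

```python
# Pairs of roles that one identity must not hold together (hypothetical names).
CONFLICTS = [
    ("submitter", "approver"),
    ("logic_editor", "publisher"),
]


def sod_violations(assignments):
    """Return (identity, role_a, role_b) for every conflicting pair held."""
    violations = []
    for identity, roles in sorted(assignments.items()):
        for role_a, role_b in CONFLICTS:
            if role_a in roles and role_b in roles:
                violations.append((identity, role_a, role_b))
    return violations


# Usage: carol can both change logic and publish, so she is flagged.
assignments = {
    "alice": {"submitter"},
    "bob": {"approver", "publisher"},
    "carol": {"logic_editor", "publisher"},
}
violations = sod_violations(assignments)
```

Running a check like this per reporting period, and keeping the result, turns "only the right people had access" into something that can be shown.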

244
00:08:41,440 --> 00:08:42,320
In Microsoft terms,

245
00:08:42,320 --> 00:08:43,960
Entra ID is not a diagram.

246
00:08:43,960 --> 00:08:45,600
It is the enforcement mechanism.

247
00:08:45,600 --> 00:08:48,040
Group membership, role assignments, access reviews,

248
00:08:48,040 --> 00:08:49,640
audit logs: these are evidence.

249
00:08:49,640 --> 00:08:51,120
And you don't get evidence by saying

250
00:08:51,120 --> 00:08:53,320
only the sustainability team has access.

251
00:08:53,320 --> 00:08:55,520
You get evidence by proving which identities

252
00:08:55,520 --> 00:08:57,240
had which permissions during the period

253
00:08:57,240 --> 00:09:00,320
and showing that privileged access was bounded and reviewable.

254
00:09:00,320 --> 00:09:02,520
Most organizations end up with a hero admin

255
00:09:02,520 --> 00:09:03,560
because it's faster.

256
00:09:03,560 --> 00:09:04,560
It is not governance.

257
00:09:04,560 --> 00:09:06,640
It is a single point of audit failure.

258
00:09:06,640 --> 00:09:08,800
So those four properties define your architecture.

259
00:09:08,800 --> 00:09:10,640
Immutability prevents silent edits,

260
00:09:10,640 --> 00:09:14,360
reproducibility prevents drift, lineage prevents archaeology.

261
00:09:14,360 --> 00:09:16,720
Separation of duties prevents conflict of interest

262
00:09:16,720 --> 00:09:17,640
and invisible power.

263
00:09:17,640 --> 00:09:18,480
And here's the payoff.

264
00:09:18,480 --> 00:09:21,120
Once these exist, your ESG stack stops

265
00:09:21,120 --> 00:09:23,880
being a collection of tools and becomes a control plane.

266
00:09:23,880 --> 00:09:25,280
Now the uncomfortable part.

267
00:09:25,280 --> 00:09:29,000
These requirements map directly to specific Microsoft services.

268
00:09:29,000 --> 00:09:30,360
Some are non-negotiable.

269
00:09:30,360 --> 00:09:32,680
The rest are optional until scale and regulation

270
00:09:32,680 --> 00:09:34,240
make them mandatory.

271
00:09:34,240 --> 00:09:37,120
Microsoft stack map: non-negotiable versus optional.

272
00:09:37,120 --> 00:09:39,160
Now we map those four audit grade requirements

273
00:09:39,160 --> 00:09:41,360
to Microsoft services, not as a shopping list,

274
00:09:41,360 --> 00:09:42,680
as a chain of enforcement.

275
00:09:42,680 --> 00:09:44,640
Because the system doesn't become auditable

276
00:09:44,640 --> 00:09:45,520
when you buy tools.

277
00:09:45,520 --> 00:09:47,760
It becomes auditable when every requirement

278
00:09:47,760 --> 00:09:50,360
has an implementation that removes human discretion.

279
00:09:50,360 --> 00:09:52,160
Start with the non-negotiables.

280
00:09:52,160 --> 00:09:54,760
These are the components auditors implicitly expect

281
00:09:54,760 --> 00:09:56,840
even if they never say Microsoft out loud.

282
00:09:56,840 --> 00:09:58,120
First, identity and access.

283
00:09:58,120 --> 00:10:01,040
Microsoft Entra ID. Entra is not single sign-on.

284
00:10:01,040 --> 00:10:03,360
Architecturally, it's the distributed decision engine

285
00:10:03,360 --> 00:10:05,640
that decides who can submit, who can transform,

286
00:10:05,640 --> 00:10:07,440
who can approve and who can publish.

287
00:10:07,440 --> 00:10:09,640
And it produces logs. Logs are evidence.

288
00:10:09,640 --> 00:10:12,000
If role separation is one of your requirements,

289
00:10:12,000 --> 00:10:15,120
Entra is where it either happens or it doesn't.

290
00:10:15,120 --> 00:10:17,240
Second, storage with immutability.

291
00:10:17,240 --> 00:10:19,000
Azure Data Lake Storage Gen 2,

292
00:10:19,000 --> 00:10:20,840
with immutable storage policies for the zones

293
00:10:20,840 --> 00:10:23,920
that become evidence: raw and period-close reported outputs,

294
00:10:23,920 --> 00:10:26,800
plus any evidence vault you keep for supporting documents.

295
00:10:26,800 --> 00:10:29,160
This is the part everyone tries to negotiate away

296
00:10:29,160 --> 00:10:31,320
because it forces pipeline discipline.

297
00:10:31,320 --> 00:10:33,680
But WORM isn't a feature. It's a behavior change.

298
00:10:33,680 --> 00:10:37,400
Once you enable immutability, overwrites are no longer a quick fix.

299
00:10:37,400 --> 00:10:39,160
They are an audit event you can't perform.

300
00:10:39,160 --> 00:10:40,720
That constraint is the entire point.

301
00:10:40,720 --> 00:10:42,360
Third, a governed calculation zone.

302
00:10:42,360 --> 00:10:44,680
Fabric Lakehouse or Azure Synapse Analytics.

303
00:10:44,680 --> 00:10:47,120
Pick one and treat it like an accounting engine.

304
00:10:47,120 --> 00:10:50,600
Versioned artifacts, controlled deployments, and period-bound releases.

305
00:10:50,600 --> 00:10:51,800
Your calculations need to live

306
00:10:51,800 --> 00:10:53,400
where they can be tested, reviewed

307
00:10:53,400 --> 00:10:55,760
and rerun against frozen inputs and frozen factors.

308
00:10:55,760 --> 00:10:58,360
If your KPI logic lives in Power BI measures,

309
00:10:58,360 --> 00:10:59,840
you didn't build a calculation zone.

310
00:10:59,840 --> 00:11:02,200
You built a dashboard that quietly rewrites history.

311
00:11:02,200 --> 00:11:03,640
Fourth, governance

312
00:11:03,640 --> 00:11:05,640
and lineage: Microsoft Purview.

313
00:11:05,640 --> 00:11:08,520
Purview is the difference between

314
00:11:08,520 --> 00:11:12,080
"we can probably explain this" and "here is the lineage graph,

315
00:11:12,080 --> 00:11:14,760
here are the owners, here are the transformations."

316
00:11:14,760 --> 00:11:17,920
Under assurance pressure, that difference becomes the whole game.

317
00:11:17,920 --> 00:11:20,840
Purview is also how you scale governance beyond tribal knowledge.

318
00:11:20,840 --> 00:11:22,720
People leave, your metadata can't.

319
00:11:22,720 --> 00:11:24,640
Fifth, reporting as a thin layer.

320
00:11:24,640 --> 00:11:27,520
Power BI. Power BI is allowed. Power BI is useful.

321
00:11:27,520 --> 00:11:30,480
Power BI is also where most teams destroy auditability

322
00:11:30,480 --> 00:11:33,160
by turning the semantic model into the calculation engine.

323
00:11:33,160 --> 00:11:34,360
So the rule is brutal.

324
00:11:34,360 --> 00:11:37,200
Power BI consumes reported period-close tables.

325
00:11:37,200 --> 00:11:39,240
Measures are presentation and aggregation,

326
00:11:39,240 --> 00:11:40,560
not emissions accounting.

327
00:11:40,560 --> 00:11:42,800
You want auditors to argue about your visuals,

328
00:11:42,800 --> 00:11:43,600
not your logic.

329
00:11:43,600 --> 00:11:45,280
So that's the non-negotiable baseline.

330
00:11:45,280 --> 00:11:47,840
Entra, ADLS Gen 2 with immutability,

331
00:11:47,840 --> 00:11:51,080
Fabric or Synapse for calculations, Purview for lineage,

332
00:11:51,080 --> 00:11:54,440
and Power BI as the last-mile presentation layer.

333
00:11:54,440 --> 00:11:57,640
Now the optional components. Optional does not mean irrelevant.

334
00:11:57,640 --> 00:12:01,600
It means not required until scale, regulation and complexity corner you.

335
00:12:02,680 --> 00:12:06,000
Microsoft Sustainability Manager sits in that category.

336
00:12:06,000 --> 00:12:08,680
It's optional when you already have mature emissions logic,

337
00:12:08,680 --> 00:12:11,000
a controlled factor library, and the willpower

338
00:12:11,000 --> 00:12:13,640
to build transparent pipelines and models yourself.

339
00:12:13,640 --> 00:12:16,720
It becomes valuable when you need faster onboarding to frameworks,

340
00:12:16,720 --> 00:12:18,400
faster Scope 3 workflows,

341
00:12:18,400 --> 00:12:21,200
or you simply don't have internal emissions domain depth.

342
00:12:21,200 --> 00:12:23,320
The platform has audit trail capabilities

343
00:12:23,320 --> 00:12:25,040
and data trail reporting features,

344
00:12:25,040 --> 00:12:26,800
but it doesn't absolve you from architecture.

345
00:12:26,800 --> 00:12:29,160
If you treat it as a black box that spits out numbers,

346
00:12:29,160 --> 00:12:30,840
you're just outsourcing your audit risk

347
00:12:30,840 --> 00:12:32,400
to a product configuration.

348
00:12:32,400 --> 00:12:34,480
Azure Data Factory is also optional,

349
00:12:34,480 --> 00:12:37,120
but only if your ingestion needs stay simple.

350
00:12:37,120 --> 00:12:40,000
If Fabric-native ingestion covers your source systems, fine.

351
00:12:40,000 --> 00:12:42,120
But when you have real ERP integration,

352
00:12:42,120 --> 00:12:45,160
IoT telemetry coordination, multi-step API dependencies

353
00:12:45,160 --> 00:12:46,760
and cross-system timing constraints,

354
00:12:46,760 --> 00:12:49,120
Data Factory becomes the orchestration layer

355
00:12:49,120 --> 00:12:51,240
that keeps ingestion deterministic.

356
00:12:51,240 --> 00:12:53,320
Just remember the immutability constraint.

357
00:12:53,320 --> 00:12:56,320
Data Factory pipelines that overwrite files collide

358
00:12:56,320 --> 00:12:59,120
with WORM policies and fail. That failure isn't a bug.

359
00:12:59,120 --> 00:13:00,840
It's your architecture revealing itself.

360
00:13:00,840 --> 00:13:03,680
Azure Machine Learning is optional in the purest sense.

361
00:13:03,680 --> 00:13:05,960
Use it for forecasting, anomaly detection

362
00:13:05,960 --> 00:13:07,200
and scenario modeling.

363
00:13:07,200 --> 00:13:08,520
Never for baseline numbers.

364
00:13:08,520 --> 00:13:11,000
Model outputs are estimates and estimates

365
00:13:11,000 --> 00:13:12,640
need labeling, provenance and governance

366
00:13:12,640 --> 00:13:13,680
like any other input.

367
00:13:13,680 --> 00:13:16,600
Otherwise, your AI insights become untraceable logic changes

368
00:13:16,600 --> 00:13:17,920
with a brand name.

369
00:13:17,920 --> 00:13:19,120
Here's the short warning.

370
00:13:19,120 --> 00:13:21,720
Every optional tool becomes mandatory

371
00:13:21,720 --> 00:13:24,400
the moment you use it to create or modify numbers

372
00:13:24,400 --> 00:13:25,960
inside your reporting boundary.

373
00:13:25,960 --> 00:13:27,840
The system doesn't care that it was a pilot.

374
00:13:27,840 --> 00:13:30,240
If it touched the number, it's part of the evidence chain.

375
00:13:30,240 --> 00:13:31,640
So you now have the map.

376
00:13:31,640 --> 00:13:33,880
Non-negotiables enforce the four requirements.

377
00:13:33,880 --> 00:13:37,520
Optional tools add capability, but also add pathways for entropy.

378
00:13:37,520 --> 00:13:38,920
Next, you start at the edge,

379
00:13:38,920 --> 00:13:40,880
because the first place truth gets corrupted

380
00:13:40,880 --> 00:13:43,400
is always the first place data enters your system.

381
00:13:43,400 --> 00:13:46,800
Operational data sources: where ESG actually comes from.

382
00:13:46,800 --> 00:13:49,000
ESG doesn't come from your sustainability team.

383
00:13:49,000 --> 00:13:51,800
That team usually collects it, begs for it, cleans it up,

384
00:13:51,800 --> 00:13:52,840
and tries to defend it.

385
00:13:52,840 --> 00:13:55,720
But the data originates somewhere else:

386
00:13:55,720 --> 00:13:58,240
operational systems that were never designed

387
00:13:58,240 --> 00:13:59,920
to be audited for carbon math.

388
00:13:59,920 --> 00:14:01,400
That's the first architectural truth.

389
00:14:01,400 --> 00:14:03,880
If you don't treat the source systems as part of the reporting

390
00:14:03,880 --> 00:14:06,440
boundary, you'll spend your life explaining downstream numbers

391
00:14:06,440 --> 00:14:08,480
while upstream inputs keep changing.

392
00:14:08,480 --> 00:14:11,480
So let's name the real sources and the real damage they can do.

393
00:14:11,480 --> 00:14:14,680
Start with ERP: SAP, Dynamics, whatever you've standardized

394
00:14:14,680 --> 00:14:18,080
on. ERP is where activity data becomes audit-friendly

395
00:14:18,080 --> 00:14:19,800
because it already has controls.

396
00:14:19,800 --> 00:14:22,240
Transactions, approvals, posting periods,

397
00:14:22,240 --> 00:14:24,360
master data, organizational structure.

398
00:14:24,360 --> 00:14:27,480
But the trap is that teams try to use the ERP outputs

399
00:14:27,480 --> 00:14:30,040
as already reported sustainability data.

400
00:14:30,040 --> 00:14:30,560
They shouldn't.

401
00:14:30,560 --> 00:14:33,600
You want the activity data: fuel purchases, freight costs,

402
00:14:33,600 --> 00:14:36,400
inventory movement, utility invoices, travel expenses,

403
00:14:36,400 --> 00:14:37,800
procurement line items.

404
00:14:37,800 --> 00:14:40,160
The thing most people miss is that ERP is better

405
00:14:40,160 --> 00:14:42,480
as an evidence source than as a calculation engine.

406
00:14:42,480 --> 00:14:44,200
It's good at capturing business events

407
00:14:44,200 --> 00:14:45,680
with identity and timestamps.

408
00:14:45,680 --> 00:14:48,240
It is not good at emissions factors, allocation logic

409
00:14:48,240 --> 00:14:50,600
or multi-scope reconciliation unless you deliberately

410
00:14:50,600 --> 00:14:51,280
build it that way.

411
00:14:51,280 --> 00:14:54,440
So ERP is a source of facts, not a source of finished ESG

412
00:14:54,440 --> 00:14:56,960
truth. Next, energy meters and IoT telemetry.

413
00:14:56,960 --> 00:14:59,680
This is where teams get excited about granularity

414
00:14:59,680 --> 00:15:01,760
and then quietly drown.

415
00:15:01,760 --> 00:15:05,000
Telemetry is high volume, high frequency and low forgiveness.

416
00:15:05,000 --> 00:15:07,600
You can collect a million readings and still fail assurance

417
00:15:07,600 --> 00:15:10,080
because you can't explain context: which facility,

418
00:15:10,080 --> 00:15:12,240
which meter, what unit, which time zone,

419
00:15:12,240 --> 00:15:15,080
what calibration assumptions, what mapping from meter

420
00:15:15,080 --> 00:15:16,840
to asset to business unit.

421
00:15:16,840 --> 00:15:18,560
Telemetry without context is not data.

422
00:15:18,560 --> 00:15:20,360
It's noise with audit liability.

423
00:15:20,360 --> 00:15:22,920
And because IoT pipelines often involve gateways,

424
00:15:22,920 --> 00:15:25,360
edge buffering, retries, and late-arriving events,

425
00:15:25,360 --> 00:15:26,840
you need to design for time:

426
00:15:26,840 --> 00:15:28,760
event time versus ingestion time,

427
00:15:28,760 --> 00:15:30,960
and what happens when the real reading shows up

428
00:15:30,960 --> 00:15:32,360
after period close.

429
00:15:32,360 --> 00:15:34,920
If you don't decide that early, your close process becomes

430
00:15:34,920 --> 00:15:36,960
a permanent argument with your own sensors.
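
One way to make that decision explicit is a small classification rule that compares event time, ingestion time, and the close timestamp. The sketch below is only an illustration: the function name, the three outcome labels, and the example dates are assumptions, not something the episode prescribes.

```python
from datetime import datetime, timezone

UTC = timezone.utc


def classify_reading(event_time, ingested_at, period_start, period_end, closed_at):
    """Classify a meter reading against a reporting period (hypothetical policy)."""
    # Event time decides which period the reading belongs to.
    if not (period_start <= event_time < period_end):
        return "out_of_period"
    # Ingestion time decides whether it made it in before close.
    if ingested_at <= closed_at:
        return "in_close"
    # Belongs to the period but arrived after close: it must not rewrite
    # the closed figure, so it is routed to an adjustment instead.
    return "late_adjustment"


period_start = datetime(2024, 1, 1, tzinfo=UTC)
period_end = datetime(2024, 4, 1, tzinfo=UTC)
closed_at = datetime(2024, 4, 10, tzinfo=UTC)
event_time = datetime(2024, 3, 31, 23, 0, tzinfo=UTC)

on_time = classify_reading(event_time, datetime(2024, 4, 1, 0, 5, tzinfo=UTC),
                           period_start, period_end, closed_at)
late = classify_reading(event_time, datetime(2024, 4, 15, tzinfo=UTC),
                        period_start, period_end, closed_at)
print(on_time, late)  # in_close late_adjustment
```

The point of the sketch is that the same reading gets a different treatment depending on when it arrived, and that treatment is decided by a rule rather than by whoever is on call at close.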

431
00:15:36,960 --> 00:15:39,600
Now HR systems.

432
00:15:39,600 --> 00:15:42,160
Workforce metrics sound simple until you try

433
00:15:42,160 --> 00:15:43,800
to define them consistently.

434
00:15:43,800 --> 00:15:46,960
Headcount, turnover, diversity, health and safety incidents,

435
00:15:46,960 --> 00:15:49,000
training hours. These are all HR-managed,

436
00:15:49,000 --> 00:15:50,160
and they are sensitive.

437
00:15:50,160 --> 00:15:53,320
That creates two constraints: access control and aggregation.

438
00:15:53,320 --> 00:15:54,840
You don't want raw employee records

439
00:15:54,840 --> 00:15:56,600
wandering into analytics workspaces

440
00:15:56,600 --> 00:15:58,080
because someone wanted a dashboard.

441
00:15:58,080 --> 00:16:01,320
For ESG, HR systems should feed controlled aggregates

442
00:16:01,320 --> 00:16:03,280
with documented definitions and a stable

443
00:16:03,280 --> 00:16:05,360
organizational hierarchy mapping.

444
00:16:05,360 --> 00:16:07,600
Otherwise you get denominator drift.

445
00:16:07,600 --> 00:16:09,200
The metric stays the same name,

446
00:16:09,200 --> 00:16:11,160
but the population changes silently.

447
00:16:11,160 --> 00:16:13,000
Auditors don't need to see personal data.

448
00:16:13,000 --> 00:16:15,720
They need to see that the metric definition didn't mutate.

449
00:16:15,720 --> 00:16:18,040
Then procurement and suppliers, which is where Scope 3

450
00:16:18,040 --> 00:16:20,680
stops being theory and becomes operational humiliation.

451
00:16:20,680 --> 00:16:23,680
Supplier data comes through surveys, portals, partner feeds,

452
00:16:23,680 --> 00:16:25,840
invoices and sometimes email attachments

453
00:16:25,840 --> 00:16:28,480
that should never be admitted into an evidence chain.

454
00:16:28,480 --> 00:16:29,840
The variability is the point.

455
00:16:29,840 --> 00:16:32,040
Suppliers don't share the same systems,

456
00:16:32,040 --> 00:16:34,160
the same data quality or the same incentives.

457
00:16:34,160 --> 00:16:36,400
So you need to capture two things from day one:

458
00:16:36,400 --> 00:16:37,880
coverage and confidence.

459
00:16:37,880 --> 00:16:39,760
What percentage of spend or categories

460
00:16:39,760 --> 00:16:42,760
have supplier provided data and what percentage is estimated?

461
00:16:42,760 --> 00:16:44,160
Those flags aren't nice to have.

462
00:16:44,160 --> 00:16:46,120
They're the only honest way to survive questions

463
00:16:46,120 --> 00:16:47,120
about completeness.
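
A minimal sketch of the coverage flag, assuming each procurement line carries a spend amount and a `basis` label. The field names, the two basis values, and the figures are hypothetical, chosen only to show the arithmetic.

```python
def coverage_by_spend(lines):
    """Return the share of total spend per data basis, in percent."""
    total = sum(line["spend"] for line in lines)
    if total == 0:
        return {}
    spend_by_basis = {}
    for line in lines:
        basis = line["basis"]
        spend_by_basis[basis] = spend_by_basis.get(basis, 0.0) + line["spend"]
    return {basis: round(100 * spend / total, 1)
            for basis, spend in spend_by_basis.items()}


# Hypothetical procurement lines: one supplier reported data, two were estimated.
lines = [
    {"supplier": "A", "spend": 600.0, "basis": "supplier_provided"},
    {"supplier": "B", "spend": 300.0, "basis": "estimated"},
    {"supplier": "C", "spend": 100.0, "basis": "estimated"},
]

coverage = coverage_by_spend(lines)
print(coverage)  # {'supplier_provided': 60.0, 'estimated': 40.0}
```

Storing the basis on every line, rather than computing it later from memory, is what lets the coverage figure be reproduced for any closed period.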

464
00:16:47,120 --> 00:16:49,200
And if you don't store supplier submissions

465
00:16:49,200 --> 00:16:51,920
as evidence artifacts with identity, timestamps

466
00:16:51,920 --> 00:16:54,360
and versioning, you will not be able to prove what was known

467
00:16:54,360 --> 00:16:55,800
at the time of reporting.

468
00:16:55,800 --> 00:16:59,480
Now the radioactive source: spreadsheets and CSV files.

469
00:16:59,480 --> 00:17:00,320
They're allowed.

470
00:17:00,320 --> 00:17:03,400
They're also the birthplace of final_v7.csv,

471
00:17:03,400 --> 00:17:06,600
which is the universal symbol of uncontrolled modification.

472
00:17:06,600 --> 00:17:07,920
Spreadsheets are not evil.

473
00:17:07,920 --> 00:17:09,160
They're just not control systems.

474
00:17:09,160 --> 00:17:11,520
They don't preserve chain of custody by default

475
00:17:11,520 --> 00:17:13,400
and they make it trivial to change history

476
00:17:13,400 --> 00:17:14,800
without leaving a durable trail.

477
00:17:14,800 --> 00:17:17,800
So in an audit-grade stack, spreadsheets are treated

478
00:17:17,800 --> 00:17:19,160
as controlled submissions:

479
00:17:19,160 --> 00:17:21,360
metadata captured, schema validated,

480
00:17:21,360 --> 00:17:24,240
approvals recorded, and then the content gets ingested

481
00:17:24,240 --> 00:17:26,640
into the raw zone as append-only evidence.

482
00:17:26,640 --> 00:17:29,160
The spreadsheet itself becomes supporting documentation

483
00:17:29,160 --> 00:17:31,720
in the evidence vault, not the system of record.

484
00:17:31,720 --> 00:17:32,680
Here's the checkpoint.

485
00:17:32,680 --> 00:17:35,880
Every source system has its own native controls, gaps

486
00:17:35,880 --> 00:17:37,320
and failure modes.

487
00:17:37,320 --> 00:17:40,480
ERP brings structure, but tempts you to treat its outputs as reported.

488
00:17:40,480 --> 00:17:42,880
IoT brings volume, but lacks business context.

489
00:17:42,880 --> 00:17:45,720
HR brings sensitivity and definition drift.

490
00:17:45,720 --> 00:17:48,400
Suppliers bring variability and partial coverage.

491
00:17:48,400 --> 00:17:50,120
Spreadsheets bring speed and entropy.

492
00:17:50,120 --> 00:17:53,360
Once you accept that, the next step becomes obvious.

493
00:17:53,360 --> 00:17:55,120
Ingestion is where truth gets corrupted

494
00:17:55,120 --> 00:17:56,960
because ingestion is where humans still believe

495
00:17:56,960 --> 00:17:58,320
overwriting is a feature.

496
00:17:58,320 --> 00:18:00,040
Ingestion patterns: controlled pipelines

497
00:18:00,040 --> 00:18:02,120
versus human-driven upload rituals.

498
00:18:02,120 --> 00:18:04,840
Ingestion is where ESG dies in real life

499
00:18:04,840 --> 00:18:07,160
because ingestion is where teams still confuse

500
00:18:07,160 --> 00:18:09,480
"getting data in" with "getting evidence in."

501
00:18:09,480 --> 00:18:10,640
Those are not the same.

502
00:18:10,640 --> 00:18:12,120
The design rule is simple.

503
00:18:12,120 --> 00:18:14,120
Ingestion must be append-first.

504
00:18:14,120 --> 00:18:15,800
Overwrites are audit poison.

505
00:18:15,800 --> 00:18:18,560
The moment your process allows "replace the file,"

506
00:18:18,560 --> 00:18:20,600
you've created an invisible edit pathway

507
00:18:20,600 --> 00:18:22,120
inside your reporting boundary.

508
00:18:22,120 --> 00:18:24,240
And auditors don't need to prove you used it.

509
00:18:24,240 --> 00:18:25,640
They only need to prove you could.

510
00:18:25,640 --> 00:18:28,520
Append-first means every load becomes a new object

511
00:18:28,520 --> 00:18:31,400
or a new version with a load identifier that never repeats.

512
00:18:31,400 --> 00:18:33,800
If you want to correct something, you don't edit history.

513
00:18:33,800 --> 00:18:36,400
You publish an adjustment with rationale and approval

514
00:18:36,400 --> 00:18:38,760
and you keep the original as evidence.
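
The append-first rule can be sketched as an in-memory ledger that only ever grows. Everything here is an assumption made for illustration: the `Entry` fields, the `AppendOnlyLedger` class, and the use of random UUIDs as load identifiers. A real stack would persist this in storage, not in a Python list.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    load_id: str                      # never reused
    kind: str                         # "load" or "adjustment"
    payload: dict
    adjusts: Optional[str] = None     # load_id of the entry being corrected
    rationale: Optional[str] = None
    approved_by: Optional[str] = None


class AppendOnlyLedger:
    """Hypothetical ledger with no update and no delete method."""

    def __init__(self):
        self._entries = []

    def append_load(self, payload):
        entry = Entry(uuid.uuid4().hex, "load", dict(payload))
        self._entries.append(entry)
        return entry

    def append_adjustment(self, adjusts, payload, rationale, approved_by):
        if all(e.load_id != adjusts for e in self._entries):
            raise KeyError(f"unknown load_id: {adjusts}")
        if not rationale or not approved_by:
            raise ValueError("an adjustment needs a rationale and an approver")
        entry = Entry(uuid.uuid4().hex, "adjustment", dict(payload),
                      adjusts, rationale, approved_by)
        self._entries.append(entry)
        return entry

    def entries(self):
        return tuple(self._entries)


ledger = AppendOnlyLedger()
original = ledger.append_load({"site": "plant-01", "kwh": 1200})
fix = ledger.append_adjustment(original.load_id, {"kwh": 1250},
                               rationale="meter re-read",
                               approved_by="approver@example.com")
```

After the correction, the ledger holds two entries: the original load, untouched, and an adjustment that points back to it with a rationale and an approver.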

515
00:18:38,760 --> 00:18:40,160
Now here's where most people mess up.

516
00:18:40,160 --> 00:18:42,280
They treat ingestion as a user interface problem.

517
00:18:42,280 --> 00:18:43,520
They build an upload folder.

518
00:18:43,520 --> 00:18:45,520
They write "drop files here" instructions.

519
00:18:45,520 --> 00:18:46,880
They call it a pipeline.

520
00:18:46,880 --> 00:18:49,320
Then the first late file arrives and someone

521
00:18:49,320 --> 00:18:50,960
overwrites the last one,

522
00:18:50,960 --> 00:18:53,880
because the business wanted the dashboard to be right.

523
00:18:53,880 --> 00:18:55,200
That's not ingestion.

524
00:18:55,200 --> 00:18:56,480
That's ritual.

525
00:18:56,480 --> 00:18:59,800
A controlled ingestion pattern has three non-negotiable behaviors:

526
00:18:59,800 --> 00:19:02,560
orchestrated movement, validation gates, and telemetry

527
00:19:02,560 --> 00:19:05,040
that can be handed to an auditor without translation.

528
00:19:05,040 --> 00:19:07,680
Let's talk tooling, because Microsoft gives you multiple ways

529
00:19:07,680 --> 00:19:10,480
to ingest and none of them magically make you auditable.

530
00:19:10,480 --> 00:19:12,480
Fabric-native ingestion is convenient

531
00:19:12,480 --> 00:19:14,400
when your sources are straightforward:

532
00:19:14,400 --> 00:19:17,640
files, tables, common connectors, predictable schedules,

533
00:19:17,640 --> 00:19:19,840
and you can keep the orchestration simple.

534
00:19:19,840 --> 00:19:21,400
The benefit is proximity.

535
00:19:21,400 --> 00:19:23,240
You're already in the lakehouse world,

536
00:19:23,240 --> 00:19:26,160
and you can land data close to where it will be processed.

537
00:19:26,160 --> 00:19:28,080
The failure mode is also proximity.

538
00:19:28,080 --> 00:19:30,920
Teams let convenience become a substitute for control

539
00:19:30,920 --> 00:19:33,800
and they stop capturing the metadata that proves what happened.

540
00:19:33,800 --> 00:19:36,400
Azure Data Factory exists for the unglamorous reality:

541
00:19:36,400 --> 00:19:39,120
complex ERP integration, IoT coordination,

542
00:19:39,120 --> 00:19:42,200
multi-step API polls, dependencies, retries, and sequencing

543
00:19:42,200 --> 00:19:44,320
that can't be trusted to "just run."

544
00:19:44,320 --> 00:19:47,800
It also has the operational surface area for governance:

545
00:19:47,800 --> 00:19:49,720
parameterized pipelines, run history,

546
00:19:49,720 --> 00:19:51,320
integration runtime behavior,

547
00:19:51,320 --> 00:19:53,640
and consistent patterns across many sources.

548
00:19:53,640 --> 00:19:55,240
But the constraint is brutal.

549
00:19:55,240 --> 00:19:58,400
Immutability will punish sloppy ADF designs.

550
00:19:58,400 --> 00:20:01,440
When ADF tries to overwrite a file in an immutable container,

551
00:20:01,440 --> 00:20:02,440
it fails.

552
00:20:02,440 --> 00:20:03,440
That's not Microsoft being difficult.

553
00:20:03,440 --> 00:20:05,480
That's your system proving that it was designed

554
00:20:05,480 --> 00:20:07,000
to rewrite evidence.

555
00:20:07,000 --> 00:20:09,960
So the pattern becomes: write to a mutable staging area,

556
00:20:09,960 --> 00:20:13,120
validate, then publish into the immutable raw archive.

557
00:20:13,120 --> 00:20:17,240
Mutable staging, immutable archive. Two zones, two behaviors.
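
The two-zone behavior can be imitated on a local filesystem. This is a sketch under loud assumptions: the directory names, the `publish` helper, and the `load-0001` identifier are invented, and Python's exclusive-create mode (`"x"`) only stands in for what a storage-level WORM policy would enforce. The validation step that sits between staging and publish is left out here.

```python
import shutil
import tempfile
from pathlib import Path


def publish(staged: Path, archive_dir: Path, load_id: str) -> Path:
    """Copy a staged file into the archive under a name that cannot be reused."""
    target = archive_dir / f"{load_id}__{staged.name}"
    # Mode "xb" raises FileExistsError if the target already exists,
    # so a second publish to the same path fails instead of overwriting.
    with open(staged, "rb") as src, open(target, "xb") as dst:
        shutil.copyfileobj(src, dst)
    return target


root = Path(tempfile.mkdtemp())
staging, archive = root / "staging", root / "raw"
staging.mkdir()
archive.mkdir()

# Staging is mutable: files here can be rewritten freely before publish.
staged_file = staging / "meters.csv"
staged_file.write_text("site,kwh\nplant-01,1200\n")
published = publish(staged_file, archive, load_id="load-0001")

# A "corrected" file staged under the same load ID must not replace the original.
staged_file.write_text("site,kwh\nplant-01,9999\n")
try:
    publish(staged_file, archive, load_id="load-0001")
    second_attempt_blocked = False
except FileExistsError:
    second_attempt_blocked = True
```

The second publish fails loudly, which is the behavior the transcript describes: a correction has to arrive as a new load, not as a replacement.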

558
00:20:17,240 --> 00:20:19,520
Now validation gates.

559
00:20:19,520 --> 00:20:22,480
This is the part that separates ingestion from data dumping.

560
00:20:22,480 --> 00:20:24,240
Every load needs to be validated before it

561
00:20:24,240 --> 00:20:26,720
becomes evidence inside your system of record.

562
00:20:26,720 --> 00:20:29,360
And validation isn't just "did the pipeline run?"

563
00:20:29,360 --> 00:20:31,680
It's "did the data meet minimum standards

564
00:20:31,680 --> 00:20:33,080
to be considered in scope?"

565
00:20:33,080 --> 00:20:36,400
The practical gates are boring and therefore effective.

566
00:20:36,400 --> 00:20:40,440
Schema checks: column names, types, required fields,

567
00:20:40,440 --> 00:20:42,080
and allowed null behavior.

568
00:20:42,080 --> 00:20:45,440
Unit normalization: kWh versus MWh,

569
00:20:45,440 --> 00:20:47,960
liters versus cubic meters, distance units,

570
00:20:47,960 --> 00:20:52,720
currency, time zones. Required dimensions: site, region, period,

571
00:20:52,720 --> 00:20:55,680
scope category, source system identifier.

572
00:20:55,680 --> 00:20:57,840
And the system must record the outcome:

573
00:20:57,840 --> 00:21:01,320
pass, fail, quarantine, or accepted with exceptions.
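
The gates above can be sketched as one function that returns an outcome, the reasons, and a normalized copy of the record. The field names, the unit table, and the rule that an unknown unit is quarantined rather than failed are all assumptions for illustration. The fourth outcome, accepted with exceptions, is omitted because it depends on an approval step this sketch does not model.

```python
REQUIRED_FIELDS = ("site", "region", "period", "scope_category",
                   "source_system", "value", "unit")

# Hypothetical conversion table to one canonical energy unit.
TO_KWH = {"kWh": 1.0, "MWh": 1000.0}


def validate(record):
    """Return (outcome, reasons, normalized_record_or_None)."""
    missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
    if missing:
        return "fail", ["missing: " + ", ".join(missing)], None

    value = record["value"]
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return "fail", ["value is not numeric"], None

    factor = TO_KWH.get(record["unit"])
    if factor is None:
        # Structurally valid but not interpretable yet: hold it, do not drop it.
        return "quarantine", ["unknown unit: " + str(record["unit"])], None

    # The original value and unit stay on the record; the normalized value is added.
    return "pass", [], dict(record, value_kwh=value * factor)


record = {"site": "plant-01", "region": "EU", "period": "2024-03",
          "scope_category": "scope2", "source_system": "meter-gateway",
          "value": 2.5, "unit": "MWh"}

outcome, reasons, normalized = validate(record)
print(outcome, normalized["value_kwh"])  # pass 2500.0
```

The same function would run for an API feed and for a CSV submission, so neither path gets a softer gate than the other.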

574
00:21:01,320 --> 00:21:03,800
This is where you stop pretending spreadsheets are harmless.

575
00:21:03,800 --> 00:21:06,200
A controlled CSV submission is allowed only

576
00:21:06,200 --> 00:21:09,240
if it passes the same validation gates as an API feed.

577
00:21:09,240 --> 00:21:11,000
Otherwise, you've created a privileged path

578
00:21:11,000 --> 00:21:13,200
for human-supplied nonsense to enter the raw zone.

579
00:21:13,200 --> 00:21:15,640
Now, log everything. Not for observability dashboards,

580
00:21:15,640 --> 00:21:16,720
but for chain of custody.

581
00:21:16,720 --> 00:21:18,800
Every load should produce a durable record

582
00:21:18,800 --> 00:21:22,480
that includes a load ID, source system, extract window,

583
00:21:22,480 --> 00:21:24,640
ingestion timestamp, submitter identity

584
00:21:24,640 --> 00:21:28,200
where applicable, file name or object path, validation results,

585
00:21:28,200 --> 00:21:30,440
and the pipeline version that performed the load.
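
Those fields map naturally onto an immutable record type. The sketch below assumes a frozen dataclass serialized as one JSON line per load; the class name, field names, and sample values are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class LoadRecord:
    load_id: str
    source_system: str
    extract_window_start: str     # ISO 8601, UTC
    extract_window_end: str
    ingested_at: str
    submitter_id: Optional[str]   # authenticated identity; None for system feeds
    object_path: str
    validation_result: str
    pipeline_version: str


record = LoadRecord(
    load_id="load-0001",
    source_system="erp",
    extract_window_start="2024-03-01T00:00:00Z",
    extract_window_end="2024-04-01T00:00:00Z",
    ingested_at="2024-04-02T06:15:00Z",
    submitter_id=None,
    object_path="raw/energy/erp/period=2024-03/load-0001.csv",
    validation_result="pass",
    pipeline_version="1.4.2",
)

# One JSON line per load, with stable key order, appended to a durable log.
ledger_line = json.dumps(asdict(record), sort_keys=True)
print(ledger_line)
```

Freezing the dataclass means the record cannot be changed in memory after it is created; durability still depends on where the line is written.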

586
00:21:30,440 --> 00:21:32,200
This is where Entra shows up again.

587
00:21:32,200 --> 00:21:34,240
Submitter identity is not a name in an email.

588
00:21:34,240 --> 00:21:37,200
It's an authenticated identity tied to the submission event.

589
00:21:37,200 --> 00:21:39,800
If you can't attribute a submission to a real identity,

590
00:21:39,800 --> 00:21:41,840
you can't prove separation of duties.

591
00:21:41,840 --> 00:21:44,000
And if you can't prove separation of duties,

592
00:21:44,000 --> 00:21:47,080
you will eventually be asked why you believe your own data.

593
00:21:47,080 --> 00:21:49,640
There's also a subtle requirement most teams miss.

594
00:21:49,640 --> 00:21:51,520
Ingestion needs to be replayable.

595
00:21:51,520 --> 00:21:54,120
When someone asks, "What did we know on March 31st?",

596
00:21:54,120 --> 00:21:56,400
you can't respond with the current state of the lake.

597
00:21:56,400 --> 00:21:58,960
You need to be able to point to the exact load artifacts

598
00:21:58,960 --> 00:22:00,360
that were in scope at close.
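
Answering the March 31st question becomes a filter over the load ledger rather than a query against current state. In this sketch the ledger is a list of dictionaries, and the rule for which validation outcomes count as in scope is an assumption.

```python
from datetime import datetime, timezone

UTC = timezone.utc

# Hypothetical rule: only these validation outcomes count as in scope.
IN_SCOPE_OUTCOMES = {"pass", "accepted_with_exceptions"}


def loads_known_at(ledger, as_of):
    """Return the load IDs that were ingested and in scope at a point in time."""
    return [entry["load_id"] for entry in ledger
            if entry["ingested_at"] <= as_of
            and entry["outcome"] in IN_SCOPE_OUTCOMES]


ledger = [
    {"load_id": "load-0001", "outcome": "pass",
     "ingested_at": datetime(2024, 3, 30, 8, 0, tzinfo=UTC)},
    {"load_id": "load-0002", "outcome": "quarantine",
     "ingested_at": datetime(2024, 3, 31, 9, 0, tzinfo=UTC)},
    {"load_id": "load-0003", "outcome": "pass",
     "ingested_at": datetime(2024, 4, 5, 9, 0, tzinfo=UTC)},
]

as_of = datetime(2024, 3, 31, 23, 59, 59, tzinfo=UTC)
known = loads_known_at(ledger, as_of)
print(known)  # ['load-0001']
```

The quarantined load and the load that arrived in April are both excluded, so the answer does not change when later data lands.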

599
00:22:00,360 --> 00:22:01,640
So ingestion isn't a pipe.

600
00:22:01,640 --> 00:22:02,640
It's a ledger.

601
00:22:02,640 --> 00:22:04,920
And once you build it that way, the downstream system

602
00:22:04,920 --> 00:22:05,800
gets easier.

603
00:22:05,800 --> 00:22:08,400
Curated models get cleaner inputs, calculations

604
00:22:08,400 --> 00:22:11,520
become stable and period close becomes an actual event,

605
00:22:11,520 --> 00:22:13,200
not a calendar reminder.

606
00:22:13,200 --> 00:22:16,080
Next, the data has to be stored like it will be subpoenaed

607
00:22:16,080 --> 00:22:17,440
because it can be.

608
00:22:17,440 --> 00:22:21,200
Storage anatomy: raw, curated, reported,

609
00:22:21,200 --> 00:22:22,640
plus an evidence vault.

610
00:22:22,640 --> 00:22:25,040
Storage is where good intentions go to die,

611
00:22:25,040 --> 00:22:27,640
because most teams store ESG data the same way

612
00:22:27,640 --> 00:22:29,280
they store project files:

613
00:22:29,280 --> 00:22:31,920
whatever folder exists, whatever naming convention

614
00:22:31,920 --> 00:22:35,680
someone remembers, and whatever overwrites still work.

615
00:22:35,680 --> 00:22:37,160
That's not storage architecture.

616
00:22:37,160 --> 00:22:39,240
That's entropy management without the management.

617
00:22:39,240 --> 00:22:42,040
An auditable ESG stack needs storage anatomy:

618
00:22:42,040 --> 00:22:43,800
distinct layers with distinct rules,

619
00:22:43,800 --> 00:22:46,360
because different data states have different liabilities.

620
00:22:46,360 --> 00:22:48,280
You're not organizing data for convenience.

621
00:22:48,280 --> 00:22:50,840
You're organizing it so the system can prove what happened.

622
00:22:50,840 --> 00:22:53,760
So the baseline pattern is three zones: raw, curated,

623
00:22:53,760 --> 00:22:54,760
and reported.

624
00:22:54,760 --> 00:22:58,080
And then a fourth thing most stacks forget: an evidence vault.

625
00:22:58,080 --> 00:23:00,640
Raw is the closest-to-source landing zone.

626
00:23:00,640 --> 00:23:01,600
Append-only.

627
00:23:01,600 --> 00:23:02,640
Minimal transformation.

628
00:23:02,640 --> 00:23:04,280
The goal is not usability.

629
00:23:04,280 --> 00:23:05,720
The goal is preservation.

630
00:23:05,720 --> 00:23:07,360
Raw data answers one question.

631
00:23:07,360 --> 00:23:09,680
"What did we receive, from where, and when?"

632
00:23:09,680 --> 00:23:12,320
That means raw objects need stable identifiers

633
00:23:12,320 --> 00:23:14,480
and immutable behavior after close.

634
00:23:14,480 --> 00:23:17,960
If you normalize units in raw, you've already destroyed provenance

635
00:23:17,960 --> 00:23:19,840
unless you also store the original.

636
00:23:19,840 --> 00:23:22,240
So raw keeps the original representation:

637
00:23:22,240 --> 00:23:24,600
the meter reading payload, the invoice extract,

638
00:23:24,600 --> 00:23:27,320
the supplier submission file, the export from ERP.

639
00:23:27,320 --> 00:23:29,040
You can add metadata alongside it.

640
00:23:29,040 --> 00:23:30,920
You do not fix it in place.

641
00:23:30,920 --> 00:23:32,920
Curated is where the data becomes usable.

642
00:23:32,920 --> 00:23:36,000
This is where you standardize, conform, and model.

643
00:23:36,000 --> 00:23:38,080
Curated data answers a different question.

644
00:23:38,080 --> 00:23:40,040
"What does this mean in our organization?"

645
00:23:40,040 --> 00:23:42,480
This is where you map source specific fields

646
00:23:42,480 --> 00:23:43,640
into a common schema,

647
00:23:43,640 --> 00:23:45,760
standardize units, apply reference data

648
00:23:45,760 --> 00:23:48,800
like organizational hierarchies, and attach quality flags.

649
00:23:48,800 --> 00:23:51,000
The curated zone is where you deal with the reality

650
00:23:51,000 --> 00:23:54,280
that one system calls it plant, another calls it site,

651
00:23:54,280 --> 00:23:56,120
and a third calls it location,

652
00:23:56,120 --> 00:23:58,160
and none of them agree on identifiers.

653
00:23:58,160 --> 00:24:01,360
You resolve that here, explicitly, in versioned transformations

654
00:24:01,360 --> 00:24:02,160
you can explain.

655
00:24:02,160 --> 00:24:05,200
Curated is also where you keep truth with scars.

656
00:24:05,200 --> 00:24:06,920
You don't hide data quality issues.

657
00:24:06,920 --> 00:24:08,040
You mark them.

658
00:24:08,040 --> 00:24:12,000
Late-arriving data, missing dimensions, suspect values:

659
00:24:12,000 --> 00:24:13,840
all of that becomes flags and exceptions,

660
00:24:13,840 --> 00:24:16,360
because clean data with no record of cleaning

661
00:24:16,360 --> 00:24:18,200
is just manipulated data.

662
00:24:18,200 --> 00:24:20,480
Reported is the period-closed output zone.

663
00:24:20,480 --> 00:24:22,320
This is where KPIs become records.

664
00:24:22,320 --> 00:24:24,600
Reported answers the only question assurance really

665
00:24:24,600 --> 00:24:27,120
cares about: what did you report for this period,

666
00:24:27,120 --> 00:24:29,480
under which logic, using which inputs and factors?

667
00:24:29,480 --> 00:24:30,960
Reported data must be stable.

668
00:24:30,960 --> 00:24:33,920
Once the period closes, reported outputs do not change.

669
00:24:33,920 --> 00:24:35,160
If something needs correction,

670
00:24:35,160 --> 00:24:37,200
you don't overwrite reported tables.

671
00:24:37,200 --> 00:24:39,320
You publish an adjustment entry with references:

672
00:24:39,320 --> 00:24:42,120
what changed, why, and which approval allowed it.

673
00:24:42,120 --> 00:24:43,600
That's how financial systems work.

674
00:24:43,600 --> 00:24:45,360
And ESG doesn't get a special exemption

675
00:24:45,360 --> 00:24:46,880
just because it feels newer.

676
00:24:46,880 --> 00:24:48,280
Now here's the thing most people miss.

677
00:24:48,280 --> 00:24:50,600
These three zones are not only about data shape.

678
00:24:50,600 --> 00:24:52,200
They're about access boundaries.

679
00:24:52,200 --> 00:24:54,840
Raw is restricted because it contains direct extracts

680
00:24:54,840 --> 00:24:56,520
and sometimes sensitive fields.

681
00:24:56,520 --> 00:24:57,960
Curated is restricted differently

682
00:24:57,960 --> 00:25:00,040
because it represents standardized enterprise data

683
00:25:00,040 --> 00:25:01,600
that can be widely misused.

684
00:25:01,600 --> 00:25:04,120
Reported is restricted because it's the official record.

685
00:25:04,120 --> 00:25:06,040
Different audiences, different permissions,

686
00:25:06,040 --> 00:25:07,760
same Entra enforcement model.

687
00:25:07,760 --> 00:25:09,200
And then there's the evidence vault.

688
00:25:09,200 --> 00:25:12,280
The evidence vault is not a folder called "supporting docs."

689
00:25:12,280 --> 00:25:14,040
It's a controlled repository for everything

690
00:25:14,040 --> 00:25:15,160
that proves the numbers:

691
00:25:15,160 --> 00:25:16,960
supplier submissions, invoices,

692
00:25:16,960 --> 00:25:19,480
meter calibration records, calculation approvals,

693
00:25:19,480 --> 00:25:21,760
factor library provenance, mapping decisions,

694
00:25:21,760 --> 00:25:23,840
and period-close attestations.

695
00:25:23,840 --> 00:25:26,800
This vault matters because ESG is not purely quantitative.

696
00:25:26,800 --> 00:25:28,480
Even when the KPIs are numbers,

697
00:25:28,480 --> 00:25:30,880
the justification often involves documents.

698
00:25:30,880 --> 00:25:32,840
The vault is where you store those artifacts

699
00:25:32,840 --> 00:25:35,960
with the same chain of custody expectations as raw data:

700
00:25:35,960 --> 00:25:39,960
who submitted it, when, which KPI or period it supports,

701
00:25:39,960 --> 00:25:42,040
and whether it was locked after close.

702
00:25:42,040 --> 00:25:44,040
If the supporting evidence lives in Teams chats

703
00:25:44,040 --> 00:25:46,920
and someone's mailbox, it doesn't exist in audit terms.

704
00:25:46,920 --> 00:25:48,400
It exists as a future argument.

705
00:25:48,400 --> 00:25:51,080
Now naming and versioning, because mystery tables

706
00:25:51,080 --> 00:25:53,960
are an architectural failure, not a documentation failure.

707
00:25:53,960 --> 00:25:56,880
Every object needs a predictable name that encodes

708
00:25:56,880 --> 00:25:59,720
domain, source, period, and version.

709
00:25:59,720 --> 00:26:01,560
Not because auditors love naming conventions,

710
00:26:01,560 --> 00:26:04,120
but because humans do. You want an engineer to look at a path

711
00:26:04,120 --> 00:26:06,160
and understand whether it's raw or reported,

712
00:26:06,160 --> 00:26:07,760
whether it's preliminary or closed,

713
00:26:07,760 --> 00:26:09,600
and which period it belongs to.

714
00:26:09,600 --> 00:26:11,200
Versioning needs to be explicit.

715
00:26:11,200 --> 00:26:12,480
"Latest" is not a version.

716
00:26:12,480 --> 00:26:13,920
"Final" is not a version.

717
00:26:13,920 --> 00:26:15,560
"Final_final_2" is a confession.
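
A naming rule like this is easy to enforce in code rather than in a wiki page. The path layout below is purely an assumption (the episode names the elements to encode, not their order or syntax), as are the `object_path` helper and the allowed zone and status values.

```python
import re

ZONES = {"raw", "curated", "reported"}
STATUSES = {"preliminary", "closed"}


def object_path(zone, domain, source, period, status, version):
    """Build a path encoding zone, domain, source, period, status, and version."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone}")
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status}")
    if not re.fullmatch(r"\d{4}-(0[1-9]|1[0-2])", period):
        raise ValueError(f"period must look like YYYY-MM: {period}")
    # Versions are positive integers, so labels such as "latest" or "final"
    # are rejected here instead of ending up in a path.
    if isinstance(version, bool) or not isinstance(version, int) or version < 1:
        raise ValueError(f"version must be a positive integer: {version!r}")
    return f"{zone}/{domain}/{source}/period={period}/{status}/v{version:03d}"


path = object_path("reported", "energy", "erp", "2024-03", "closed", 2)
print(path)  # reported/energy/erp/period=2024-03/closed/v002
```

Because the helper is the only way to produce a path, a label like "latest" never reaches storage; it fails at the call site.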

718
00:26:15,560 --> 00:26:17,960
So the storage anatomy creates a set of invariants.

719
00:26:17,960 --> 00:26:19,880
Raw preserves, curated standardizes,

720
00:26:19,880 --> 00:26:22,160
reported freezes, and the evidence vault proves.

721
00:26:22,160 --> 00:26:25,280
Once you have that, immutability stops being a storage checkbox

722
00:26:25,280 --> 00:26:27,520
and becomes a design constraint that your pipelines

723
00:26:27,520 --> 00:26:28,600
can actually survive.

724
00:26:28,600 --> 00:26:29,680
That's next.

725
00:26:29,680 --> 00:26:31,400
Immutability (WORM):

726
00:26:31,400 --> 00:26:34,280
how to lock evidence without breaking your pipelines.

727
00:26:34,280 --> 00:26:36,080
Immutability is where good architectures

728
00:26:36,080 --> 00:26:38,400
stop being aspirational and start being inconvenient,

729
00:26:38,400 --> 00:26:39,360
which is why it works.

730
00:26:39,360 --> 00:26:41,200
In Azure terms, this is write once,

731
00:26:41,200 --> 00:26:43,120
read many: immutable storage policies

732
00:26:43,120 --> 00:26:45,520
on Blob Storage or ADLS Gen2

733
00:26:45,520 --> 00:26:48,440
that prevent modification or deletion for a defined period.

734
00:26:48,440 --> 00:26:51,040
Time-based retention locks data for a set interval.

735
00:26:51,040 --> 00:26:54,760
Legal hold locks it until someone explicitly clears it.

736
00:26:54,760 --> 00:26:55,760
Different intent,

737
00:26:55,760 --> 00:26:56,600
same effect.

738
00:26:56,600 --> 00:26:59,040
You can create and read, but you can't rewrite the past.

739
00:26:59,040 --> 00:27:00,040
That's the point.

740
00:27:00,040 --> 00:27:02,280
The mistake teams make is treating immutability

741
00:27:02,280 --> 00:27:05,000
as a storage toggle you enable later.

742
00:27:05,000 --> 00:27:07,240
But immutability isn't a feature you add.

743
00:27:07,240 --> 00:27:09,840
It's a constraint that changes pipeline behavior,

744
00:27:09,840 --> 00:27:13,280
deployment patterns, and how humans negotiate fixes.

745
00:27:13,280 --> 00:27:15,720
So let's be explicit about what changes operationally

746
00:27:15,720 --> 00:27:17,920
the moment you lock a container.

747
00:27:17,920 --> 00:27:19,920
Overwrites become illegal.

748
00:27:19,920 --> 00:27:23,160
"Re-run the job" becomes "publish a new version."

749
00:27:23,160 --> 00:27:25,600
"Re-run the job" becomes "post an adjustment."

750
00:27:25,600 --> 00:27:27,720
And any pipeline that assumes it can land

751
00:27:27,720 --> 00:27:30,000
to the same path twice will fail loudly.

752
00:27:30,000 --> 00:27:31,880
Azure Storage will enforce the policy

753
00:27:31,880 --> 00:27:34,880
and your orchestration will surface it as an error.

754
00:27:34,880 --> 00:27:36,360
In Data Factory, you'll see failures

755
00:27:36,360 --> 00:27:37,800
like "path immutable due to policy"

756
00:27:37,800 --> 00:27:39,680
when a copy activity attempts to overwrite

757
00:27:39,680 --> 00:27:41,160
or modify a protected path.

758
00:27:41,160 --> 00:27:42,880
That error is not a platform defect.

759
00:27:42,880 --> 00:27:45,640
It is the system preventing evidence tampering, accidental

760
00:27:45,640 --> 00:27:46,480
or otherwise.

761
00:27:46,480 --> 00:27:48,160
This is the foundational misunderstanding:

762
00:27:48,160 --> 00:27:50,600
people think immutability is about security.

763
00:27:50,600 --> 00:27:51,440
It's not.

764
00:27:51,440 --> 00:27:52,280
It's about time.

765
00:27:52,280 --> 00:27:54,520
Making period close real in the storage layer

766
00:27:54,520 --> 00:27:56,240
not just in a calendar invite.

767
00:27:56,240 --> 00:27:57,760
Now, there are two workable patterns

768
00:27:57,760 --> 00:28:00,040
that don't destroy your operations.

769
00:28:00,040 --> 00:28:02,320
The first pattern is immutable by design.

770
00:28:02,320 --> 00:28:06,120
Every ingestion writes to a unique, never-reused object name,

771
00:28:06,120 --> 00:28:07,880
and you never need to overwrite anything.

772
00:28:07,880 --> 00:28:10,400
That means a path that includes a load identifier

773
00:28:10,400 --> 00:28:13,440
plus a deterministic partitioning scheme: source system,

774
00:28:13,440 --> 00:28:16,320
date, and maybe hour if you're dealing with telemetry.

775
00:28:16,320 --> 00:28:18,520
Each run produces a new object set.

776
00:28:18,520 --> 00:28:20,840
If the data arrives late, it lands as a new object set

777
00:28:20,840 --> 00:28:22,440
with a later load ID.

778
00:28:22,440 --> 00:28:24,560
You can still compute the same reported outputs

779
00:28:24,560 --> 00:28:27,800
because your close process selects which load IDs are in scope.
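As a rough sketch of that first pattern, assuming a hypothetical path layout and load ID format (the `source=`, `date=`, `hour=`, and `load_id=` segment names are illustrative, not something the episode prescribes), the naming and the close-time selection might look like this:

```python
from datetime import datetime, timezone
from typing import Optional
from uuid import uuid4


def evidence_path(source_system: str, event_time: datetime,
                  load_id: str, hourly: bool = False) -> str:
    """Build a never-reused object prefix: partitions plus load ID."""
    parts = [
        f"source={source_system}",
        f"date={event_time:%Y-%m-%d}",
    ]
    if hourly:  # finer partition, e.g. for telemetry-style feeds
        parts.append(f"hour={event_time:%H}")
    parts.append(f"load_id={load_id}")
    return "/".join(parts)


def new_load_id(now: Optional[datetime] = None) -> str:
    """Sortable timestamp plus a random suffix so reruns do not collide."""
    now = now or datetime.now(timezone.utc)
    return f"{now:%Y%m%dT%H%M%SZ}-{uuid4().hex[:8]}"


def in_scope(paths, closed_load_ids):
    """Period close selects load IDs; nothing is overwritten or deleted."""
    return [p for p in paths if p.rsplit("load_id=", 1)[-1] in closed_load_ids]
```

A late arrival simply gets a later load ID and a new path; whether it counts for the period is decided by the set passed to `in_scope`, not by replacing files.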

780
00:28:27,800 --> 00:28:30,360
The second pattern is the one most organizations actually

781
00:28:30,360 --> 00:28:30,680
need.

782
00:28:30,680 --> 00:28:34,080
Mutable staging, validated publish, immutable archive.

783
00:28:34,080 --> 00:28:35,120
Here's how it works.

784
00:28:35,120 --> 00:28:38,160
You ingest into a staging area that is intentionally mutable.

785
00:28:38,160 --> 00:28:40,440
You can rerun pipelines there, fix mapping bugs,

786
00:28:40,440 --> 00:28:42,320
and iterate without fighting WORM.

787
00:28:42,320 --> 00:28:43,560
Then you run validation gates.

788
00:28:43,560 --> 00:28:45,480
And only after validation succeeds

789
00:28:45,480 --> 00:28:48,480
do you publish to the raw evidence zone, which is immutable.

790
00:28:48,480 --> 00:28:50,280
Publish is not copy and delete.

791
00:28:50,280 --> 00:28:52,160
Publish is a one-way promotion.

792
00:28:52,160 --> 00:28:55,080
New immutable objects written with a stable naming convention

793
00:28:55,080 --> 00:28:57,880
plus a metadata record that binds them to a load ID,

794
00:28:57,880 --> 00:29:01,360
pipeline version, submitter identity, and validation results.
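A minimal sketch of that one-way promotion, using an in-memory stand-in for the immutable container. The class and field names here are hypothetical, and a real implementation would rely on the storage policy itself rather than a Python check:

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishRecord:
    """Metadata that binds a published object to how it was produced."""
    object_name: str
    sha256: str
    load_id: str
    pipeline_version: str
    submitted_by: str
    validation_passed: bool


class EvidenceZone:
    """In-memory stand-in for a WORM container: create and read only."""

    def __init__(self):
        self._objects = {}
        self.records = []

    def publish(self, object_name, payload: bytes, *, load_id,
                pipeline_version, submitted_by, validation_passed):
        if not validation_passed:
            raise ValueError("validation gate failed; nothing is promoted")
        if object_name in self._objects:
            # Mirrors what an immutability policy would enforce on real storage.
            raise FileExistsError(f"{object_name} is already published")
        self._objects[object_name] = payload
        record = PublishRecord(object_name,
                               hashlib.sha256(payload).hexdigest(),
                               load_id, pipeline_version, submitted_by, True)
        self.records.append(record)
        return record

    def read(self, object_name) -> bytes:
        return self._objects[object_name]
```

The content hash in the record is an addition for illustration: it lets a reviewer confirm that the bytes in the archive are the bytes that passed validation.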

795
00:29:01,360 --> 00:29:02,880
If you're using Azure Data Factory,

796
00:29:02,880 --> 00:29:05,720
this pattern becomes mandatory in some transformation

797
00:29:05,720 --> 00:29:07,680
scenarios because certain activities rely

798
00:29:07,680 --> 00:29:10,360
on temporary files during processing.

799
00:29:10,360 --> 00:29:12,480
Immutability policies prevent those temporary writes

800
00:29:12,480 --> 00:29:13,480
and cleanup operations.

801
00:29:13,480 --> 00:29:15,600
So you write to non-immutable storage first,

802
00:29:15,600 --> 00:29:18,080
then use a copy activity to move the finalized outputs

803
00:29:18,080 --> 00:29:19,600
into the immutable container.

804
00:29:19,600 --> 00:29:23,200
Again, inconvenient, predictable, correct.

805
00:29:23,200 --> 00:29:26,760
Now the subtle part: immutability doesn't just apply to raw;

806
00:29:26,760 --> 00:29:28,960
it applies to anything you will later claim as evidence.

807
00:29:28,960 --> 00:29:31,240
That includes factor libraries for a closed period,

808
00:29:31,240 --> 00:29:33,680
the period-close reported KPI outputs,

809
00:29:33,680 --> 00:29:36,280
and the evidence vault documents that support disclosures.

810
00:29:36,280 --> 00:29:38,080
If those objects remain mutable,

811
00:29:38,080 --> 00:29:39,880
you can't prove historical integrity.

812
00:29:39,880 --> 00:29:41,080
You can only promise it.

813
00:29:41,080 --> 00:29:43,000
Auditors don't accept promises as controls.

814
00:29:43,000 --> 00:29:46,080
So you need a close process that includes a storage lock step.

815
00:29:46,080 --> 00:29:48,200
At period close, you freeze the selection of inputs,

816
00:29:48,200 --> 00:29:49,880
which load IDs are included.

817
00:29:49,880 --> 00:29:51,200
You freeze factor versions,

818
00:29:51,200 --> 00:29:53,360
you freeze the calculation logic reference,

819
00:29:53,360 --> 00:29:55,160
then you publish the reported outputs

820
00:29:55,160 --> 00:29:58,280
and apply immutability to the reported zone for that period.

821
00:29:58,280 --> 00:30:00,320
You're not locking the entire lake forever.

822
00:30:00,320 --> 00:30:02,000
You're locking the slices that represent

823
00:30:02,000 --> 00:30:03,480
what we knew and reported.

824
00:30:03,480 --> 00:30:05,200
That distinction matters because you still need

825
00:30:05,200 --> 00:30:06,640
to operate next month.

826
00:30:06,640 --> 00:30:09,120
One more uncomfortable truth: immutability forces you

827
00:30:09,120 --> 00:30:10,280
to design for corrections.

828
00:30:10,280 --> 00:30:11,960
Corrections can't be overwrites,

829
00:30:11,960 --> 00:30:13,880
so they become adjustment entries.

830
00:30:13,880 --> 00:30:16,200
Additive records that reference the original,

831
00:30:16,200 --> 00:30:18,480
carry a rationale and require approval.

832
00:30:18,480 --> 00:30:19,640
If you do this well,

833
00:30:19,640 --> 00:30:22,480
you end up with something auditors understand immediately.

834
00:30:22,480 --> 00:30:24,400
The original evidence remains intact

835
00:30:24,400 --> 00:30:26,880
and the adjustment trail is visible and attributable.
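One way to picture adjustment entries, as a sketch with hypothetical record shapes (the field names and the tCO2e unit are assumptions for illustration):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EmissionRecord:
    record_id: str
    tco2e: float


@dataclass(frozen=True)
class Adjustment:
    adjustment_id: str
    original_record_id: str           # reference to the untouched original
    delta_tco2e: float                # additive, never a replacement value
    rationale: str
    approved_by: Optional[str] = None  # None means not yet approved


def effective_value(record, adjustments):
    """Original plus approved adjustments; the original row is never edited."""
    approved = [a for a in adjustments
                if a.original_record_id == record.record_id and a.approved_by]
    return record.tco2e + sum(a.delta_tco2e for a in approved)
```

Because unapproved adjustments are ignored rather than deleted, the trail shows both what was proposed and what was accepted.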

836
00:30:26,880 --> 00:30:27,880
If you do this poorly,

837
00:30:27,880 --> 00:30:30,360
people will attempt to bypass the system.

838
00:30:30,360 --> 00:30:31,680
They'll hunt for a mutable folder.

839
00:30:31,680 --> 00:30:33,120
They'll ask for exceptions.

840
00:30:33,120 --> 00:30:34,440
They'll demand admin rights.

841
00:30:34,440 --> 00:30:35,680
That's not a people problem.

842
00:30:35,680 --> 00:30:37,440
That's you failing to design the only thing

843
00:30:37,440 --> 00:30:38,800
that survives entropy,

844
00:30:38,800 --> 00:30:41,080
an architecture that makes the right behavior easier

845
00:30:41,080 --> 00:30:42,280
than the wrong one.

846
00:30:42,280 --> 00:30:43,920
Next up is the hard part.

847
00:30:43,920 --> 00:30:45,600
Calculations that don't drift,

848
00:30:45,600 --> 00:30:48,720
even when everyone keeps improving the logic.

849
00:30:48,720 --> 00:30:50,360
The governed calculation zone,

850
00:30:50,360 --> 00:30:51,960
Fabric Lakehouse or Synapse,

851
00:30:51,960 --> 00:30:53,280
not dashboard math.

852
00:30:53,280 --> 00:30:55,600
This is where most ESG stacks quietly rot,

853
00:30:55,600 --> 00:30:56,680
the calculation layer,

854
00:30:56,680 --> 00:30:58,000
not because people can't do math

855
00:30:58,000 --> 00:30:59,400
but because they put math in places

856
00:30:59,400 --> 00:31:01,800
that can't be governed like an accounting system.

857
00:31:01,800 --> 00:31:03,520
Power BI is a presentation tool.

858
00:31:03,520 --> 00:31:06,120
It is not an audit-grade calculation engine.

859
00:31:06,120 --> 00:31:07,360
The moment your emissions logic

860
00:31:07,360 --> 00:31:09,320
lives primarily in DAX measures,

861
00:31:09,320 --> 00:31:11,080
you've made your numbers dependent on a file

862
00:31:11,080 --> 00:31:13,040
that changes whenever someone wants a new visual.

863
00:31:13,040 --> 00:31:14,120
That's not control.

864
00:31:14,120 --> 00:31:16,280
That's drift with a user interface.

865
00:31:16,280 --> 00:31:17,640
So the rule is blunt.

866
00:31:17,640 --> 00:31:19,640
Calculations live in a governed zone,

867
00:31:19,640 --> 00:31:22,640
Fabric Lakehouse or Azure Synapse Analytics.

868
00:31:22,640 --> 00:31:24,040
Pick one.

869
00:31:24,040 --> 00:31:26,440
Then treat it like a finance system.

870
00:31:26,440 --> 00:31:29,680
Versioned logic, controlled releases, testability,

871
00:31:29,680 --> 00:31:31,320
and reproducibility per period.

872
00:31:31,320 --> 00:31:34,080
Why this matters shows up the first time a stakeholder asks,

873
00:31:34,080 --> 00:31:36,120
why did last year's number change?

874
00:31:36,120 --> 00:31:38,920
And you discover the answer is someone edited a measure.

875
00:31:38,920 --> 00:31:40,680
That answer will not survive assurance.

876
00:31:40,680 --> 00:31:42,120
In a governed calculation zone,

877
00:31:42,120 --> 00:31:44,880
the primary artifacts are explicit and inspectable:

878
00:31:44,880 --> 00:31:47,320
SQL views, stored procedures, notebooks

879
00:31:47,320 --> 00:31:49,400
and tables that represent outputs.

880
00:31:49,400 --> 00:31:51,960
You choose the artifact type based on what you can

881
00:31:51,960 --> 00:31:54,280
govern consistently, not on what your favorite

882
00:31:54,280 --> 00:31:55,680
engineer likes this week.

883
00:31:55,680 --> 00:31:58,080
SQL views can be clean for transparency.

884
00:31:58,080 --> 00:32:01,320
The logic is readable, diffable and can be reviewed.

885
00:32:01,320 --> 00:32:03,840
Stored procedures can enforce parameterization

886
00:32:03,840 --> 00:32:06,120
and encapsulate controlled transformations,

887
00:32:06,120 --> 00:32:08,080
but they can also become opaque

888
00:32:08,080 --> 00:32:11,640
if people start hiding business logic inside procedural code.

889
00:32:11,640 --> 00:32:14,240
Notebooks are powerful for complex transformations

890
00:32:14,240 --> 00:32:17,800
and factor application, but they demand discipline, source

891
00:32:17,800 --> 00:32:21,280
control, approved releases and consistent execution

892
00:32:21,280 --> 00:32:22,360
environments.

893
00:32:22,360 --> 00:32:24,880
Choose one dominant pattern for KPI computation

894
00:32:24,880 --> 00:32:26,920
and enforce it, because mixed paradigms

895
00:32:26,920 --> 00:32:28,720
are how you lose reproducibility.

896
00:32:28,720 --> 00:32:30,920
And this is the checkpoint most people ignore.

897
00:32:30,920 --> 00:32:33,240
Unit consistency and dimensionality.

898
00:32:33,240 --> 00:32:36,120
Emissions calculations are not just multiplication.

899
00:32:36,120 --> 00:32:38,080
They are multiplication under constraints.

900
00:32:38,080 --> 00:32:40,880
Site, region, period, source system, scope category,

901
00:32:40,880 --> 00:32:43,080
activity type, unit and factor version.

902
00:32:43,080 --> 00:32:45,480
If any of those dimensions are missing or ambiguous,

903
00:32:45,480 --> 00:32:47,440
you will produce numbers that look plausible

904
00:32:47,440 --> 00:32:48,840
and fail under interrogation.

905
00:32:48,840 --> 00:32:51,320
So the governed zone has a job beyond computing.

906
00:32:51,320 --> 00:32:53,080
It enforces dimensional completeness.

907
00:32:53,080 --> 00:32:56,120
Every record must be joinable to organizational structure.

908
00:32:56,120 --> 00:32:59,400
Every activity record must carry a unit that can be normalized.

909
00:32:59,400 --> 00:33:02,200
Every computed record must carry the factor version key used,

910
00:33:02,200 --> 00:33:04,320
not DEFRA, not EPA.

911
00:33:04,320 --> 00:33:07,920
A version key that binds the output to a specific factor

912
00:33:07,920 --> 00:33:09,280
library snapshot.
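A small sketch of what enforcing dimensional completeness could look like before any math runs. The dimension list follows the one named above; the snake_case field names and the dictionary record shape are assumptions:

```python
REQUIRED_DIMENSIONS = (
    "site", "region", "period", "source_system", "scope_category",
    "activity_type", "unit", "factor_version_key",
)


def missing_dimensions(record: dict) -> list:
    """Return the required dimensions that are absent or empty."""
    return [d for d in REQUIRED_DIMENSIONS if record.get(d) in (None, "")]


def enforce_completeness(records):
    """Split records into accepted rows and rejected rows with their gaps."""
    accepted, rejected = [], []
    for r in records:
        gaps = missing_dimensions(r)
        if gaps:
            rejected.append((r, gaps))
        else:
            accepted.append(r)
    return accepted, rejected
```

Rejected rows keep their list of gaps, so the failure is explainable rather than a silent drop.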

913
00:33:09,280 --> 00:33:12,000
Now here's where period close mechanics stop being a meeting

914
00:33:12,000 --> 00:33:13,720
and become an implementation.

915
00:33:13,720 --> 00:33:16,960
For a close to be auditable, three things must freeze together.

916
00:33:16,960 --> 00:33:18,960
Inputs, factors and logic reference.

917
00:33:18,960 --> 00:33:21,640
"Freeze inputs" means the system records which load IDs

918
00:33:21,640 --> 00:33:24,120
or partitions are in scope for the period.

919
00:33:24,120 --> 00:33:26,080
You don't just have March data.

920
00:33:26,080 --> 00:33:28,920
You have these ingestion runs validated, approved,

921
00:33:28,920 --> 00:33:29,880
included.

922
00:33:29,880 --> 00:33:32,200
Late arrivals don't overwrite anything.

923
00:33:32,200 --> 00:33:33,880
They become late arrivals.

924
00:33:33,880 --> 00:33:36,520
Explicitly excluded or treated as adjustments.

925
00:33:36,520 --> 00:33:38,160
Freeze factors means the factor library

926
00:33:38,160 --> 00:33:40,720
used for that period is published with a version key

927
00:33:40,720 --> 00:33:42,120
and then locked as evidence.

928
00:33:42,120 --> 00:33:44,880
If your calculation queries join to latest,

929
00:33:44,880 --> 00:33:45,880
you've already failed.

930
00:33:45,880 --> 00:33:47,360
You're not calculating a period.

931
00:33:47,360 --> 00:33:49,600
You're calculating today's opinion of the past.

932
00:33:49,600 --> 00:33:52,080
Freeze logic reference means the exact calculation

933
00:33:52,080 --> 00:33:54,880
artifacts used are versioned and identifiable.

934
00:33:54,880 --> 00:33:56,880
A Git commit, a released notebook package,

935
00:33:56,880 --> 00:34:00,120
a view definition version, something durable.

936
00:34:00,120 --> 00:34:01,760
The current notebook is not a version.

937
00:34:01,760 --> 00:34:03,040
It's a moving target.

938
00:34:03,040 --> 00:34:05,440
Once those three are frozen, the reported outputs

939
00:34:05,440 --> 00:34:07,040
can be generated deterministically

940
00:34:07,040 --> 00:34:08,760
and published into the reported zone.
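The three-way freeze can be sketched as a single close manifest. This is an illustration under assumptions: the `CloseManifest` name, its fields, and the idea of fingerprinting it are hypothetical, not a feature of Fabric or Synapse:

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CloseManifest:
    """Inputs, factors, and logic reference frozen together for one period."""
    period: str
    load_ids: tuple                # which ingestion runs are in scope
    factor_library_version: str    # published factor snapshot key
    logic_ref: str                 # e.g. a Git commit hash

    def fingerprint(self) -> str:
        payload = asdict(self)
        # Sort load IDs so the same set always yields the same fingerprint.
        payload["load_ids"] = sorted(self.load_ids)
        canonical = json.dumps(payload, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Stamping the fingerprint on every reported output row gives a compact way to show that two runs used the same inputs, factors, and logic.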

941
00:34:08,760 --> 00:34:10,920
And Power BI consumes those outputs.

942
00:34:10,920 --> 00:34:11,720
That's the boundary.

943
00:34:11,720 --> 00:34:15,160
Power BI doesn't get to help by recomputing core emissions

944
00:34:15,160 --> 00:34:15,680
logic.

945
00:34:15,680 --> 00:34:18,160
Now, a common objection is, but we need flexibility.

946
00:34:18,160 --> 00:34:19,440
No, you need controlled change.

947
00:34:19,440 --> 00:34:20,320
You can change the model.

948
00:34:20,320 --> 00:34:21,400
You can improve mapping.

949
00:34:21,400 --> 00:34:22,520
You can add new factors.

950
00:34:22,520 --> 00:34:24,480
You can refine Scope 3 categories.

951
00:34:24,480 --> 00:34:27,360
But every change becomes a new version with a new effective date

952
00:34:27,360 --> 00:34:29,640
and a clear statement of what periods it impacts.

953
00:34:29,640 --> 00:34:30,720
That's not bureaucracy.

954
00:34:30,720 --> 00:34:32,920
That's how you stop rewriting history by accident.

955
00:34:32,920 --> 00:34:35,320
If you remember nothing else, the governed calculation zone

956
00:34:35,320 --> 00:34:37,400
is where ESG becomes deterministic.

957
00:34:37,400 --> 00:34:39,920
Dashboards are where ESG becomes arguable.

958
00:34:39,920 --> 00:34:42,360
And once you enforce deterministic computation,

959
00:34:42,360 --> 00:34:44,320
the next dependency becomes obvious.

960
00:34:44,320 --> 00:34:47,040
Emissions logic lives and dies on factor management.

961
00:34:47,040 --> 00:34:49,960
Emissions factors: versioning, or you're rewriting history.

962
00:34:49,960 --> 00:34:52,080
Emissions factors are where most ESG stacks

963
00:34:52,080 --> 00:34:53,560
commit their quietest fraud.

964
00:34:53,560 --> 00:34:56,200
They treat reference data like a convenience file,

965
00:34:56,200 --> 00:34:58,400
a spreadsheet attachment, something you update

966
00:34:58,400 --> 00:34:59,880
when the new one comes out.

967
00:34:59,880 --> 00:35:02,280
That behavior rewrites history.

968
00:35:02,280 --> 00:35:04,440
Because an emission factor is not a number.

969
00:35:04,440 --> 00:35:07,080
It's a controlled assumption that converts activity

970
00:35:07,080 --> 00:35:08,120
into emissions.

971
00:35:08,120 --> 00:35:10,800
Change the assumption and you change the outcome.

972
00:35:10,800 --> 00:35:13,440
Which means if you can't prove which factor set applied

973
00:35:13,440 --> 00:35:16,160
to FY1, your FY1 numbers aren't evidence.

974
00:35:16,160 --> 00:35:18,200
They're a current interpretation of the past.

975
00:35:18,200 --> 00:35:20,360
Auditors don't assure interpretations.

976
00:35:20,360 --> 00:35:22,200
They assure records.

977
00:35:22,200 --> 00:35:23,480
So the rule is simple.

978
00:35:23,480 --> 00:35:25,640
Emissions factors are controlled reference data

979
00:35:25,640 --> 00:35:28,160
with versioning, provenance and effective dates.

980
00:35:28,160 --> 00:35:30,920
And once a period closes, the specific factor set

981
00:35:30,920 --> 00:35:33,520
used for that period becomes immutable evidence.

982
00:35:33,520 --> 00:35:35,400
This is the part everyone tries to shortcut with,

983
00:35:35,400 --> 00:35:37,400
we use DEFRA or we use EPA.

984
00:35:37,400 --> 00:35:38,160
That's not evidence.

985
00:35:38,160 --> 00:35:39,200
That's a brand label.

986
00:35:39,200 --> 00:35:41,400
What matters is the specific library version,

987
00:35:41,400 --> 00:35:43,440
the effective date range, the geography mapping

988
00:35:43,440 --> 00:35:45,240
and the category mapping you applied.

989
00:35:45,240 --> 00:35:46,600
And here's the thing most people miss.

990
00:35:46,600 --> 00:35:48,560
Factor management isn't one table.

991
00:35:48,560 --> 00:35:49,600
It's a small system.

992
00:35:49,600 --> 00:35:51,760
You need at least four concepts in your model.

993
00:35:51,760 --> 00:35:53,480
One, a factor library entity.

994
00:35:53,480 --> 00:35:56,360
This represents a published set you can refer to as a unit.

995
00:35:56,360 --> 00:35:58,840
It has a name, a publisher, a published date,

996
00:35:58,840 --> 00:36:01,600
and a status like draft, approved, or archived.

997
00:36:01,600 --> 00:36:05,160
Two, the factor records themselves, the actual conversion values

998
00:36:05,160 --> 00:36:07,360
with units, gas type where applicable,

999
00:36:07,360 --> 00:36:10,200
and any classification fields you rely on in joins.

1000
00:36:10,200 --> 00:36:14,760
Three, applicability metadata, geography, sector,

1001
00:36:14,760 --> 00:36:17,960
activity type mapping, and effective date range.

1002
00:36:17,960 --> 00:36:20,120
If a factor is only valid for a country

1003
00:36:20,120 --> 00:36:22,040
or only valid from a certain date,

1004
00:36:22,040 --> 00:36:24,200
the model needs to carry that explicitly.

1005
00:36:24,200 --> 00:36:26,880
Four, provenance artifacts, where it came from

1006
00:36:26,880 --> 00:36:28,400
and how it entered your system.

1007
00:36:28,400 --> 00:36:30,400
That can be a link to an evidence document

1008
00:36:30,400 --> 00:36:33,880
in your evidence vault or at minimum, a stored reference

1009
00:36:33,880 --> 00:36:35,640
that can be produced during assurance.
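The four concepts above could be modeled roughly as follows. All class and field names are assumptions chosen for illustration; in practice these would be tables in the governed zone:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class FactorLibrary:
    library_version_id: str
    name: str
    publisher: str
    published_on: date
    status: str  # "draft", "approved", or "archived"


@dataclass(frozen=True)
class FactorRecord:
    library_version_id: str
    activity_type: str
    value: float
    unit: str                       # e.g. "kgCO2e/kWh"
    gas_type: Optional[str] = None  # only where applicable


@dataclass(frozen=True)
class Applicability:
    library_version_id: str
    activity_type: str
    geography: str
    valid_from: date
    valid_to: Optional[date] = None  # None means open-ended
    sector: Optional[str] = None

    def applies(self, geography: str, on: date) -> bool:
        if geography != self.geography or on < self.valid_from:
            return False
        return self.valid_to is None or on <= self.valid_to


@dataclass(frozen=True)
class Provenance:
    library_version_id: str
    evidence_ref: str  # pointer to a document in the evidence vault
    imported_by: str
```

Every entity carries `library_version_id`, so a factor value can never be read without knowing which published set it belongs to.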

1010
00:36:35,640 --> 00:36:37,280
Now the failure mode is predictable.

1011
00:36:37,280 --> 00:36:40,160
Teams store factors in a table called emission factors

1012
00:36:40,160 --> 00:36:42,800
with a column called factor value and no version key.

1013
00:36:42,800 --> 00:36:45,080
Then calculations join to it using a natural key

1014
00:36:45,080 --> 00:36:48,840
like activity type and country, and they default to latest.

1015
00:36:48,840 --> 00:36:50,720
And it works until the factor table updates,

1016
00:36:50,720 --> 00:36:53,120
then rerunning last year produces different results.

1017
00:36:53,120 --> 00:36:54,600
And the team calls it an update.

1018
00:36:54,600 --> 00:36:55,680
It is not an update.

1019
00:36:55,680 --> 00:36:57,480
It is a restatement without governance.

1020
00:36:57,480 --> 00:36:59,680
So the enforcement pattern is also predictable.

1021
00:36:59,680 --> 00:37:01,440
Factor to period binding.

1022
00:37:01,440 --> 00:37:04,440
Every computed emission record must carry a factor version key,

1023
00:37:04,440 --> 00:37:06,480
not a textual label, a key that ties back

1024
00:37:06,480 --> 00:37:08,880
to a specific published library snapshot.

1025
00:37:08,880 --> 00:37:10,960
And your calculation logic must require it.

1026
00:37:10,960 --> 00:37:13,040
If the pipeline can run without specifying

1027
00:37:13,040 --> 00:37:14,800
the factor version, you've built a machine

1028
00:37:14,800 --> 00:37:16,440
that can rewrite its own past.

1029
00:37:16,440 --> 00:37:17,960
This is where systems beat policy.

1030
00:37:17,960 --> 00:37:19,960
Don't tell people, don't use latest.

1031
00:37:19,960 --> 00:37:22,880
Make latest unusable in period close processing.

1032
00:37:22,880 --> 00:37:24,520
Use it only in exploratory analysis

1033
00:37:24,520 --> 00:37:27,640
where you explicitly label the output as non-reportable.

1034
00:37:27,640 --> 00:37:29,200
Then you build the publish workflow.

1035
00:37:29,200 --> 00:37:32,440
Factors do not appear in production tables as ad hoc edits.

1036
00:37:32,440 --> 00:37:34,000
They move through a life cycle.

1037
00:37:34,000 --> 00:37:36,240
Draft factors exist in a working area.

1038
00:37:36,240 --> 00:37:38,360
Someone reviews them, someone approves them,

1039
00:37:38,360 --> 00:37:40,040
and then you publish a new library version

1040
00:37:40,040 --> 00:37:41,440
and after publishing, you lock it.

1041
00:37:41,440 --> 00:37:43,240
That's where immutability enters again.

1042
00:37:43,240 --> 00:37:46,000
The published factor library for a period becomes evidence.

1043
00:37:46,000 --> 00:37:47,640
So it becomes WORM-protected.

1044
00:37:47,640 --> 00:37:49,400
And yes, you can still correct factor data.

1045
00:37:49,400 --> 00:37:51,440
You just can't pretend it was always that way.

1046
00:37:51,440 --> 00:37:52,920
Corrections become a new version

1047
00:37:52,920 --> 00:37:56,600
with a clear statement of impact, which future periods use it,

1048
00:37:56,600 --> 00:37:59,480
and whether prior periods require an adjustment entry.

1049
00:37:59,480 --> 00:38:01,600
Now, how does this land in a Microsoft stack

1050
00:38:01,600 --> 00:38:03,880
without turning into another governance slide deck?

1051
00:38:03,880 --> 00:38:06,400
In Fabric or Synapse, you implement factor libraries

1052
00:38:06,400 --> 00:38:09,480
as tables with explicit version keys and effective dating.

1053
00:38:09,480 --> 00:38:11,680
In your calculation views or notebooks,

1054
00:38:11,680 --> 00:38:13,680
you join activity data to factors

1055
00:38:13,680 --> 00:38:17,080
using activity classification, geography, and period date.

1056
00:38:17,080 --> 00:38:18,600
But you don't let the join float.

1057
00:38:18,600 --> 00:38:22,640
You require an input parameter for factor library version ID

1058
00:38:22,640 --> 00:38:24,400
when producing reported outputs.

1059
00:38:24,400 --> 00:38:26,680
Or you bind the version through a period

1060
00:38:26,680 --> 00:38:29,200
close configuration table that is itself locked

1061
00:38:29,200 --> 00:38:30,280
after close.

1062
00:38:30,280 --> 00:38:32,720
Either way, the output row carries the version ID.
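To make the pinned join concrete, here is a sketch in plain Python rather than SQL or a notebook. The function name, the dictionary record shape, and the kgCO2e unit are assumptions; the point is that the version is a required argument and is stamped on every output row:

```python
def compute_reported(activities, factors, *, factor_library_version_id):
    """Join activity rows to factors from exactly one library version."""
    if not factor_library_version_id:
        raise ValueError("reported outputs require an explicit factor library version")
    # Only factors from the pinned version are visible; there is no 'latest'.
    pinned = {
        (f["activity_type"], f["country"]): f
        for f in factors
        if f["library_version_id"] == factor_library_version_id
    }
    outputs = []
    for a in activities:
        key = (a["activity_type"], a["country"])
        factor = pinned.get(key)
        if factor is None:
            raise LookupError(f"no factor for {key} in {factor_library_version_id}")
        outputs.append({
            **a,
            "emissions_kgco2e": a["quantity"] * factor["value"],
            "factor_library_version_id": factor_library_version_id,
        })
    return outputs
```

Rerunning a closed period with the same version ID gives the same numbers even after a newer library is loaded into the factor table.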

1063
00:38:32,720 --> 00:38:35,680
And you treat the factor library publish as a formal release.

1064
00:38:35,680 --> 00:38:37,200
It's an artifact with approvals.

1065
00:38:37,200 --> 00:38:39,920
It's registered in Purview, and it can be traced.

1066
00:38:39,920 --> 00:38:42,200
That's how you answer the audit question in one sentence.

1067
00:38:42,200 --> 00:38:45,600
FY1 used factor library version X published on Y,

1068
00:38:45,600 --> 00:38:47,520
approved by Z, and locked on close.

1069
00:38:47,520 --> 00:38:50,120
Without that, you'll end up in the classic assurance failure.

1070
00:38:50,120 --> 00:38:52,360
Someone asks why FY1 changed.

1071
00:38:52,360 --> 00:38:55,520
And you respond with, because the factors were updated.

1072
00:38:55,520 --> 00:38:58,000
That response admits you don't have reproducibility.

1073
00:38:58,000 --> 00:38:59,480
And if you don't have reproducibility,

1074
00:38:59,480 --> 00:39:01,080
you don't have audit-grade ESG.

1075
00:39:01,080 --> 00:39:02,600
Once factor versioning is real,

1076
00:39:02,600 --> 00:39:04,440
KPI modeling stops being guesswork

1077
00:39:04,440 --> 00:39:06,240
and starts being constrained engineering.

1078
00:39:06,240 --> 00:39:07,080
That's next.

1079
00:39:07,080 --> 00:39:09,880
KPI modeling: Scope 1 to 3, energy, water,

1080
00:39:09,880 --> 00:39:12,120
workforce metrics, supplier coverage.

1081
00:39:12,120 --> 00:39:14,240
Once factors are versioned, KPI modeling stops

1082
00:39:14,240 --> 00:39:15,720
being a creative writing exercise

1083
00:39:15,720 --> 00:39:17,760
and becomes what it always should have been.

1084
00:39:17,760 --> 00:39:19,680
Constraints encoded as data.

1085
00:39:19,680 --> 00:39:21,960
Most ESG teams model KPIs like labels:

1086
00:39:21,960 --> 00:39:26,280
Scope 1, Scope 2, Scope 3, water, diversity, supplier coverage.

1087
00:39:26,280 --> 00:39:28,720
Then they build a dashboard and assume the definitions

1088
00:39:28,720 --> 00:39:31,040
will stay stable because everyone agreed.

1089
00:39:31,040 --> 00:39:31,720
They won't.

1090
00:39:31,720 --> 00:39:33,680
So KPI modeling has one job.

1091
00:39:33,680 --> 00:39:36,280
Make the definition enforceable and make drift visible

1092
00:39:36,280 --> 00:39:37,680
when someone tries to change it.

1093
00:39:37,680 --> 00:39:39,280
Start with Scope 1, 2, and 3.

1094
00:39:39,280 --> 00:39:41,200
These aren't tags you slap on at the end.

1095
00:39:41,200 --> 00:39:42,320
They're structural constraints

1096
00:39:42,320 --> 00:39:45,520
that determine what data qualifies, which factors are valid,

1097
00:39:45,520 --> 00:39:46,880
and what boundaries apply.

1098
00:39:46,880 --> 00:39:50,520
Scope 1 is direct emissions from owned or controlled sources.

1099
00:39:50,520 --> 00:39:52,800
In system terms, Scope 1 activity records

1100
00:39:52,800 --> 00:39:54,520
must bind to assets you control.

1101
00:39:54,520 --> 00:39:57,480
Boilers, generators, company vehicles, refrigerants.

1102
00:39:57,480 --> 00:39:59,960
That means your data model needs an asset dimension

1103
00:39:59,960 --> 00:40:03,000
or at least an owned-or-controlled attribute you can prove,

1104
00:40:03,000 --> 00:40:04,040
not infer later.

1105
00:40:04,040 --> 00:40:06,760
If you can't tie the activity record to the controlled

1106
00:40:06,760 --> 00:40:08,880
asset set that existed during the period,

1107
00:40:08,880 --> 00:40:10,200
you're back to narrative.

1108
00:40:10,200 --> 00:40:14,080
Scope 2 is purchased electricity, heat, steam, and cooling.

1109
00:40:14,080 --> 00:40:17,000
In modeling terms, Scope 2 requires a clean separation

1110
00:40:17,000 --> 00:40:18,760
between consumption and factor application

1111
00:40:18,760 --> 00:40:22,000
because electricity data can arrive as meter readings,

1112
00:40:22,000 --> 00:40:23,960
invoices, or allocations.

1113
00:40:23,960 --> 00:40:26,480
Your model must preserve the original consumption units

1114
00:40:26,480 --> 00:40:28,000
and the conversion path.

1115
00:40:28,000 --> 00:40:31,440
And the output must carry which factor version applied,

1116
00:40:31,440 --> 00:40:33,720
plus the geography and supplier mapping

1117
00:40:33,720 --> 00:40:34,600
that justified it.

1118
00:40:34,600 --> 00:40:37,240
Otherwise, you'll end up with global average factors

1119
00:40:37,240 --> 00:40:39,800
quietly covering gaps and then spend months pretending

1120
00:40:39,800 --> 00:40:40,960
it was intentional.

1121
00:40:40,960 --> 00:40:43,960
Scope 3 is where the system either becomes honest or collapses.

1122
00:40:43,960 --> 00:40:45,760
Scope 3 is a value chain problem, which

1123
00:40:45,760 --> 00:40:48,400
means the model needs to handle mixed evidence quality.

1124
00:40:48,400 --> 00:40:51,080
Supplier-provided data, spend-based estimates,

1125
00:40:51,080 --> 00:40:53,640
activity-based proxies, and hybrid methods.

1126
00:40:53,640 --> 00:40:55,880
The common failure is forcing all of that into one column

1127
00:40:55,880 --> 00:40:58,480
called emissions and calling it complete.

1128
00:40:58,480 --> 00:41:02,800
So the rule is: every Scope 3 KPI must carry two flags:

1129
00:41:02,800 --> 00:41:06,600
measured versus estimated, and coverage scope. Measured means

1130
00:41:06,600 --> 00:41:09,200
supplier-provided or directly sourced activity

1131
00:41:09,200 --> 00:41:10,760
with traceable factors.

1132
00:41:10,760 --> 00:41:14,160
Estimated means proxy logic with estimation factors

1133
00:41:14,160 --> 00:41:16,960
treated as controlled inputs just like emission factors.

1134
00:41:16,960 --> 00:41:19,920
Coverage scope means what part of the category this KPI

1135
00:41:19,920 --> 00:41:21,080
represents.

1136
00:41:21,080 --> 00:41:23,640
Percent of spend covered, percent of suppliers covered,

1137
00:41:23,640 --> 00:41:25,080
percent of sites covered.
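A sketch of how those two flags might be carried on each Scope 3 KPI row. The enum, the field names, and the three coverage bases are assumptions drawn from the description above:

```python
from dataclasses import dataclass
from enum import Enum


class Method(Enum):
    MEASURED = "measured"
    ESTIMATED = "estimated"


COVERAGE_BASES = {"spend", "suppliers", "sites"}


@dataclass(frozen=True)
class Scope3Kpi:
    category: str
    tco2e: float
    method: Method
    coverage_basis: str   # what the percentage is a percentage of
    coverage_pct: float   # 0 to 100

    def __post_init__(self):
        if self.coverage_basis not in COVERAGE_BASES:
            raise ValueError(f"unknown coverage basis: {self.coverage_basis}")
        if not 0.0 <= self.coverage_pct <= 100.0:
            raise ValueError("coverage_pct must be between 0 and 100")


def totals_by_method(kpis):
    """Keep measured and estimated totals separate instead of one column."""
    totals = {m.value: 0.0 for m in Method}
    for k in kpis:
        totals[k.method.value] += k.tco2e
    return totals
```

Reporting the two totals side by side, each with its coverage, is what keeps the headline number from implying more certainty than the evidence supports.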

1138
00:41:25,080 --> 00:41:27,480
Without those, your Scope 3 number is just a confidence

1139
00:41:27,480 --> 00:41:28,040
trick.

1140
00:41:28,040 --> 00:41:30,320
Now energy and water, because these KPIs

1141
00:41:30,320 --> 00:41:32,640
attract the most casual denominator abuse.

1142
00:41:32,640 --> 00:41:35,680
Consumption is easy. Intensity is where you get audited.

1143
00:41:35,680 --> 00:41:38,480
Energy intensity metrics require a denominator.

1144
00:41:38,480 --> 00:41:41,600
Revenue, production volume, floor area, headcount,

1145
00:41:41,600 --> 00:41:42,640
output units.

1146
00:41:42,640 --> 00:41:44,640
Denominators drift because someone changes

1147
00:41:44,640 --> 00:41:46,480
the definition of revenue or switches

1148
00:41:46,480 --> 00:41:48,600
the production metric mid-year or updates

1149
00:41:48,600 --> 00:41:50,440
organizational structure mappings.

1150
00:41:50,440 --> 00:41:53,640
So your KPI model needs to treat denominators as governed data,

1151
00:41:53,640 --> 00:41:55,200
not as a measure in a dashboard.

1152
00:41:55,200 --> 00:41:58,040
That means store denominators as tables with source, period,

1153
00:41:58,040 --> 00:42:00,160
unit, and definition version.

1154
00:42:00,160 --> 00:42:02,600
Then compute intensity in the governed calculation zone

1155
00:42:02,600 --> 00:42:04,320
and publish it like any other KPI.

1156
00:42:04,320 --> 00:42:06,280
If someone wants a new denominator, fine,

1157
00:42:06,280 --> 00:42:09,000
they get a new KPI variant with a new definition,

1158
00:42:09,000 --> 00:42:11,000
not a silent rewrite of the old one.
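As an illustration of governed denominators, here is a sketch where the denominator is a versioned record and the definition version becomes part of the KPI identity. The names and the `kpi@version` convention are hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Denominator:
    name: str                # e.g. "floor_area"
    period: str
    value: float
    unit: str
    source: str
    definition_version: str


def energy_intensity(consumption_kwh: float, denominator: Denominator,
                     kpi_code: str) -> dict:
    """Compute intensity and record which denominator definition was used."""
    if denominator.value <= 0:
        raise ValueError("denominator must be positive")
    return {
        # A new definition yields a new KPI variant, not a rewrite of the old.
        "kpi": f"{kpi_code}@{denominator.definition_version}",
        "period": denominator.period,
        "value": consumption_kwh / denominator.value,
        "unit": f"kWh/{denominator.unit}",
        "denominator_source": denominator.source,
    }
```

Two definitions of floor area then produce two distinct KPI variants that can be compared openly, instead of one series that changes meaning partway through.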

1159
00:42:11,000 --> 00:42:14,400
Water works the same way, but with more traps, local units,

1160
00:42:14,400 --> 00:42:17,680
local reporting boundaries, and data that often arrives late.

1161
00:42:17,680 --> 00:42:20,960
So the model needs quality flags: estimated, missing context,

1162
00:42:20,960 --> 00:42:23,080
suspects, bikes, later arriving.

1163
00:42:23,080 --> 00:42:24,120
Don't hide those.

1164
00:42:24,120 --> 00:42:25,960
Put them in the data set so the dashboard

1165
00:42:25,960 --> 00:42:28,520
can surface confidence, not just totals.

1166
00:42:28,520 --> 00:42:30,760
Workforce metrics are the quiet governance test.

1167
00:42:30,760 --> 00:42:33,840
Headcount, turnover, training hours, safety incident rates,

1168
00:42:33,840 --> 00:42:35,560
these are definition landmines.

1169
00:42:35,560 --> 00:42:38,080
The calculation often depends on what counts as an employee,

1170
00:42:38,080 --> 00:42:40,800
what counts as a contractor, which geographies are in scope,

1171
00:42:40,800 --> 00:42:43,800
and how organizational units map to legal entities.

1172
00:42:43,800 --> 00:42:46,240
If the HR team changes the underlying definition,

1173
00:42:46,240 --> 00:42:48,280
your KPI changes without a code change.

1174
00:42:48,280 --> 00:42:50,280
So KPI modeling for workforce metrics

1175
00:42:50,280 --> 00:42:52,280
must include definition binding.

1176
00:42:52,280 --> 00:42:55,560
A versioned definition record that states the inclusion rules,

1177
00:42:55,560 --> 00:42:57,880
the denominator, and the aggregation level.

1178
00:42:57,880 --> 00:43:00,480
Then the outputs carry that definition version key, again,

1179
00:43:00,480 --> 00:43:01,960
not a label, a key.

1180
00:43:01,960 --> 00:43:05,680
And supplier coverage needs to be modeled explicitly

1181
00:43:05,680 --> 00:43:08,120
because it's the only way to prevent vanity percentages.

1182
00:43:08,120 --> 00:43:09,520
"Covered" must have a definition:

1183
00:43:09,520 --> 00:43:11,120
covered by survey response,

1184
00:43:11,120 --> 00:43:14,000
covered by verified activity, covered by modeled estimates.

1185
00:43:14,000 --> 00:43:15,440
Each is a different confidence level.

1186
00:43:15,440 --> 00:43:17,920
So store coverage as its own KPI family

1187
00:43:17,920 --> 00:43:20,240
with numerator and denominator definitions

1188
00:43:20,240 --> 00:43:22,320
and treat it like a first-class metric.

1189
00:43:22,320 --> 00:43:24,160
Otherwise, you'll end up with a green dashboard

1190
00:43:24,160 --> 00:43:25,960
that can't explain its own scope.

1191
00:43:25,960 --> 00:43:28,920
Once your KPI model encodes scope, definitions,

1192
00:43:28,920 --> 00:43:31,320
estimation flags, and denominator governance,

1193
00:43:31,320 --> 00:43:32,880
the architecture can produce numbers

1194
00:43:32,880 --> 00:43:34,520
that survive interrogation.

1195
00:43:34,520 --> 00:43:36,280
And now we can talk about the failure modes

1196
00:43:36,280 --> 00:43:37,680
because they're not random.

1197
00:43:37,680 --> 00:43:40,080
They're designed in. Failure mode one:

1198
00:43:40,080 --> 00:43:42,040
manual CSV overrides.

1199
00:43:42,040 --> 00:43:43,360
If there's a single failure mode

1200
00:43:43,360 --> 00:43:46,160
that shows up in almost every ESG program, it's this.

1201
00:43:46,160 --> 00:43:48,040
Someone fixes the number with a file.

1202
00:43:48,040 --> 00:43:49,840
It always sounds reasonable in the moment,

1203
00:43:49,840 --> 00:43:51,360
the meter export was wrong.

1204
00:43:51,360 --> 00:43:52,920
The facility sent a correction late,

1205
00:43:52,920 --> 00:43:54,800
the supplier portal didn't respond.

1206
00:43:54,800 --> 00:43:56,200
The CFO wants the dashboard

1207
00:43:56,200 --> 00:43:58,200
to match what finance believes is true.

1208
00:43:58,200 --> 00:44:00,200
So a spreadsheet appears, then a CSV,

1209
00:44:00,200 --> 00:44:01,600
then a folder called uploads,

1210
00:44:01,600 --> 00:44:03,760
then a file name that admits the whole control model

1211
00:44:03,760 --> 00:44:06,320
is imaginary: final_v7.csv.

1212
00:44:06,320 --> 00:44:08,520
Here's what goes wrong mechanically.

1213
00:44:08,520 --> 00:44:11,320
A manual override has no inherent chain of custody.

1214
00:44:11,320 --> 00:44:13,160
It doesn't preserve who changed the value,

1215
00:44:13,160 --> 00:44:15,600
what the previous value was, what justification existed,

1216
00:44:15,600 --> 00:44:16,800
which approval covered it,

1217
00:44:16,800 --> 00:44:18,880
and whether the period was already closed.

1218
00:44:18,880 --> 00:44:20,480
A CSV is a blob of claims.

1219
00:44:20,480 --> 00:44:23,400
Unless the system forces metadata capture and approval,

1220
00:44:23,400 --> 00:44:25,840
the file becomes a silent rewrite of evidence.

1221
00:44:25,840 --> 00:44:28,680
And once that pathway exists, it gets used for everything.

1222
00:44:28,680 --> 00:44:31,040
First, it's just this one facility.

1223
00:44:31,040 --> 00:44:32,440
Then it's just this one month.

1224
00:44:32,440 --> 00:44:34,240
Then it's just this supplier.

1225
00:44:34,240 --> 00:44:36,080
Then it becomes the default operating model

1226
00:44:36,080 --> 00:44:38,760
because it's faster than fixing ingestion, modeling,

1227
00:44:38,760 --> 00:44:39,800
or validation.

1228
00:44:39,800 --> 00:44:42,120
Entropy loves convenience.

1229
00:44:42,120 --> 00:44:43,680
Auditors hate it for one reason.

1230
00:44:43,680 --> 00:44:46,160
It creates an uncontrolled modification pathway

1231
00:44:46,160 --> 00:44:47,600
inside the reporting boundary.

1232
00:44:47,600 --> 00:44:49,320
That phrase matters because it doesn't matter

1233
00:44:49,320 --> 00:44:51,200
whether the sustainability team is honest.

1234
00:44:51,200 --> 00:44:54,160
It matters whether the system allows undetectable change.

1235
00:44:54,160 --> 00:44:57,160
When a CSV can be uploaded and override prior data,

1236
00:44:57,160 --> 00:44:59,640
you have created a pathway where numbers can change

1237
00:44:59,640 --> 00:45:00,960
without a durable trail.

1238
00:45:00,960 --> 00:45:02,440
That's the definition of weak control.

1239
00:45:02,440 --> 00:45:04,280
The classic symptoms are always the same.

1240
00:45:04,280 --> 00:45:05,480
No reviewer trace.

1241
00:45:05,480 --> 00:45:07,960
One person edits, one person uploads,

1242
00:45:07,960 --> 00:45:10,720
and the approval is a Teams message.

1243
00:45:10,720 --> 00:45:11,600
No locked period.

1244
00:45:11,600 --> 00:45:13,280
The organization says the month is closed,

1245
00:45:13,280 --> 00:45:15,200
but the storage and tables are still writable,

1246
00:45:15,200 --> 00:45:17,480
so the month is closed in conversation only.

1247
00:45:17,480 --> 00:45:18,400
No checksum.

1248
00:45:18,400 --> 00:45:19,640
No content fingerprint.

1249
00:45:19,640 --> 00:45:22,000
You can't even prove the file you showed the auditor

1250
00:45:22,000 --> 00:45:23,760
is the file that produced the KPI.

1251
00:45:23,760 --> 00:45:25,680
No binding to calculation logic.

1252
00:45:25,680 --> 00:45:28,680
The override becomes the truth by brute force.

1253
00:45:28,680 --> 00:45:31,120
Not because it passed through governed computation

1254
00:45:31,120 --> 00:45:33,240
with known factors and known code.

1255
00:45:33,240 --> 00:45:35,680
Now, the countermeasure is not "ban spreadsheets."

1256
00:45:35,680 --> 00:45:37,360
That's how you force shadow processes.

1257
00:45:37,360 --> 00:45:39,240
The countermeasure is controlled submissions.

1258
00:45:39,240 --> 00:45:40,960
If the business needs a manual pathway,

1259
00:45:40,960 --> 00:45:42,960
you give them one that behaves like a ledger.

1260
00:45:42,960 --> 00:45:45,560
Authenticated submitter identity, required schema,

1261
00:45:45,560 --> 00:45:47,720
validation gates, and an approval workflow

1262
00:45:47,720 --> 00:45:50,080
that results in an immutable publish.

1263
00:45:50,080 --> 00:45:53,440
The CSV becomes a source artifact, not a rewrite tool,

1264
00:45:53,440 --> 00:45:55,000
so the pattern looks like this.

1265
00:45:55,000 --> 00:45:57,400
A submission is ingested into a staging area

1266
00:45:57,400 --> 00:45:59,680
and tagged with metadata, who submitted it,

1267
00:45:59,680 --> 00:46:04,120
when, for which site, for which period, and under which submission type.

1268
00:46:04,120 --> 00:46:07,880
Then the system runs validation, schema checks, unit checks,

1269
00:46:07,880 --> 00:46:10,680
required dimensions, and basic sanity thresholds.

1270
00:46:10,680 --> 00:46:13,200
If it fails, it gets rejected or quarantined,

1271
00:46:13,200 --> 00:46:16,480
not fixed later, quarantined with a recorded reason.

1272
00:46:16,480 --> 00:46:19,000
If it passes, it does not override anything.

1273
00:46:19,000 --> 00:46:22,040
It gets published as a new versioned object into the raw zone,

1274
00:46:22,040 --> 00:46:24,440
append first, always.
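A minimal sketch of the submission pattern just described, assuming an in-memory log stands in for staging and the raw zone. The `SubmissionLog` class, the required column set, and the field names are hypothetical, and the only validation shown is a header check; unit checks and sanity thresholds would follow the same shape.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional

REQUIRED_COLUMNS = {"site", "period", "value", "unit"}  # hypothetical schema


@dataclass(frozen=True)
class Load:
    load_id: int
    submitter: str
    site: str
    period: str
    sha256: str   # content fingerprint of the submitted bytes
    status: str   # "published" or "quarantined"
    reason: str   # recorded reason when quarantined, else ""


@dataclass
class SubmissionLog:
    loads: List[Load] = field(default_factory=list)

    def submit(self, submitter: str, site: str, period: str, csv_bytes: bytes) -> Load:
        """Validate the header, then append a new versioned record."""
        lines = csv_bytes.decode("utf-8").splitlines()
        header = {h.strip() for h in lines[0].split(",")} if lines else set()
        missing = REQUIRED_COLUMNS - header
        status = "quarantined" if missing else "published"
        reason = f"missing columns: {sorted(missing)}" if missing else ""
        load = Load(len(self.loads) + 1, submitter, site, period,
                    hashlib.sha256(csv_bytes).hexdigest(), status, reason)
        self.loads.append(load)  # append only: nothing is updated or deleted
        return load

    def latest_accepted(self, site: str, period: str) -> Optional[Load]:
        """Return the newest published load; older loads remain as evidence."""
        accepted = [l for l in self.loads
                    if l.site == site and l.period == period and l.status == "published"]
        return accepted[-1] if accepted else None
```

A second good file for the same site and period becomes the latest accepted load, while the first one stays in the log with its own fingerprint. A file that fails the header check is recorded as quarantined with its reason and never becomes the accepted load.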

1275
00:46:24,440 --> 00:46:27,120
Then the important part, adjustments, not edits.

1276
00:46:27,120 --> 00:46:29,760
If the period is still open, you can allow the new submission

1277
00:46:29,760 --> 00:46:32,160
to become the latest accepted load for that period,

1278
00:46:32,160 --> 00:46:34,720
but you still keep the older load as evidence.

1279
00:46:34,720 --> 00:46:38,480
If the period is closed, the submission cannot mutate the closed outputs.

1280
00:46:38,480 --> 00:46:40,240
It can only create an adjustment entry

1281
00:46:40,240 --> 00:46:43,600
that is explicitly labeled as post-close, includes a rationale,

1282
00:46:43,600 --> 00:46:46,640
and requires approval from a different role than the submitter.

1283
00:46:46,640 --> 00:46:49,920
Separation of duties stops being an ideal and becomes enforced behavior,

1284
00:46:49,920 --> 00:46:52,240
and yes, you keep the supporting evidence.

1285
00:46:52,240 --> 00:46:54,480
The CSV file itself goes into the evidence vault

1286
00:46:54,480 --> 00:46:57,280
with its metadata and, ideally, a stored fingerprint

1287
00:46:57,280 --> 00:46:58,760
so you can prove integrity.

1288
00:46:58,760 --> 00:47:00,320
The approval record goes into the vault,

1289
00:47:00,320 --> 00:47:02,120
the validation report goes into the vault.

1290
00:47:02,120 --> 00:47:03,760
This is how you build an evidence pack

1291
00:47:03,760 --> 00:47:05,320
without rebuilding your memory.
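The post-close rule described above can be sketched as follows, assuming a tiny in-memory ledger. `ClosedPeriodLedger`, `Adjustment`, and the role strings are hypothetical names, and a real system would check roles against an identity provider rather than compare two strings.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Adjustment:
    period: str
    delta: float
    rationale: str
    submitter_role: str
    approver_role: str
    label: str = "post-close"


class ClosedPeriodLedger:
    def __init__(self, closed_totals: Dict[str, float]) -> None:
        self._closed = dict(closed_totals)   # value at close; never written again
        self._adjustments: List[Adjustment] = []

    def adjust(self, period: str, delta: float, rationale: str,
               submitter_role: str, approver_role: str) -> Adjustment:
        """Record a labeled adjustment instead of mutating the closed output."""
        if period not in self._closed:
            raise KeyError(f"{period} is not a closed period")
        if not rationale.strip():
            raise ValueError("a post-close adjustment needs a rationale")
        if submitter_role == approver_role:
            raise PermissionError("approver role must differ from submitter role")
        adj = Adjustment(period, delta, rationale, submitter_role, approver_role)
        self._adjustments.append(adj)
        return adj

    def as_closed(self, period: str) -> float:
        """The number exactly as it stood at close."""
        return self._closed[period]

    def as_adjusted(self, period: str) -> float:
        """The closed number plus every recorded adjustment for that period."""
        return self._closed[period] + sum(
            a.delta for a in self._adjustments if a.period == period)
```

The closed value and the adjusted value are both queryable, so a reader can always see what was reported at close and what changed afterwards, with the rationale and both roles attached.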

1292
00:47:05,320 --> 00:47:07,240
Now, here's the uncomfortable truth.

1293
00:47:07,240 --> 00:47:10,840
The business will still demand "just change the number," though.

1294
00:47:10,840 --> 00:47:12,080
They always do.

1295
00:47:12,080 --> 00:47:14,480
Your job is to make the only available change pathway

1296
00:47:14,480 --> 00:47:17,600
one that leaves scars: a new version, a recorded justification,

1297
00:47:17,600 --> 00:47:19,280
and a visible approval trail.

1298
00:47:19,280 --> 00:47:22,160
When people complain that it's slower, that's the control working.

1299
00:47:22,160 --> 00:47:24,280
And once you solve final_v7.csv,

1300
00:47:24,280 --> 00:47:26,600
you'll notice the next failure mode is subtler.

1301
00:47:26,600 --> 00:47:29,360
The dashboard itself starts rewriting history.

1302
00:47:29,360 --> 00:47:30,600
Failure mode 2.

1303
00:47:30,600 --> 00:47:32,360
Calculation drift in Power BI.

1304
00:47:32,360 --> 00:47:35,160
Calculation drift is the audit failure that feels like progress.

1305
00:47:35,160 --> 00:47:38,600
Someone opens the Power BI model, sees a measure that looks inefficient,

1306
00:47:38,600 --> 00:47:40,480
rewrites it, the visuals load faster,

1307
00:47:40,480 --> 00:47:42,080
and everyone calls it an improvement.

1308
00:47:42,080 --> 00:47:45,240
But the system didn't just get faster, it got less accountable.

1309
00:47:45,240 --> 00:47:47,920
Because Power BI is designed for interactive analysis,

1310
00:47:47,920 --> 00:47:49,440
not period-bound computation.

1311
00:47:49,440 --> 00:47:54,400
Its superpower is flexibility, and flexibility is the enemy of reproducibility

1312
00:47:54,400 --> 00:47:56,040
when you're inside a reporting boundary.

1313
00:47:56,040 --> 00:47:57,880
Here's what goes wrong.

1314
00:47:57,880 --> 00:48:00,880
The ESG team builds core logic in DAX.

1315
00:48:00,880 --> 00:48:04,560
Emission conversions, allocations, scope categorization,

1316
00:48:04,560 --> 00:48:07,880
intensity denominators, it starts small.

1317
00:48:07,880 --> 00:48:09,800
Then new requirements arrive.

1318
00:48:09,800 --> 00:48:13,840
New sites, a new supplier category, a new framework question,

1319
00:48:13,840 --> 00:48:16,600
a minor change to how renewable energy is treated.

1320
00:48:16,600 --> 00:48:17,760
So they update the model.

1321
00:48:17,760 --> 00:48:19,760
And historic values silently recompute.

1322
00:48:19,760 --> 00:48:22,560
That's the difference between a calculation engine and a dashboard.

1323
00:48:22,560 --> 00:48:25,520
A calculation engine produces outputs that become records.

1324
00:48:25,520 --> 00:48:27,720
A dashboard recomputes every time it refreshes.

1325
00:48:27,720 --> 00:48:30,320
When you change the logic, you don't just change future numbers.

1326
00:48:30,320 --> 00:48:31,800
You change last year's numbers.

1327
00:48:31,800 --> 00:48:34,360
This clicked for a lot of architects the first time someone asked

1328
00:48:34,360 --> 00:48:36,280
for a prior year reconciliation,

1329
00:48:36,280 --> 00:48:38,280
and the answer was the model changed.

1330
00:48:38,280 --> 00:48:39,920
Not because anyone acted maliciously,

1331
00:48:39,920 --> 00:48:42,040
because the platform makes logic edits trivial

1332
00:48:42,040 --> 00:48:43,960
and makes version binding optional.

1333
00:48:43,960 --> 00:48:45,840
Optional controls aren't controls.

1334
00:48:45,840 --> 00:48:48,600
Now the deeper problem is that Power BI doesn't naturally behave

1335
00:48:48,600 --> 00:48:50,560
like a governed release artifact.

1336
00:48:50,560 --> 00:48:52,000
Yes, you can manage workspaces.

1337
00:48:52,000 --> 00:48:53,920
Yes, you can use deployment pipelines.

1338
00:48:53,920 --> 00:48:55,760
Yes, you can restrict who can publish.

1339
00:48:55,760 --> 00:48:58,400
But the model itself remains a moving object

1340
00:48:58,400 --> 00:49:02,120
unless you design a release process that treats it like code

1341
00:49:02,120 --> 00:49:03,760
that impacts financial statements.

1342
00:49:03,760 --> 00:49:04,920
Most organizations don't.

1343
00:49:04,920 --> 00:49:06,160
They treat it like a report.

1344
00:49:06,160 --> 00:49:07,840
So you end up with the classic drift pattern,

1345
00:49:07,840 --> 00:49:10,640
a developer optimizes a measure for performance.

1346
00:49:10,640 --> 00:49:12,680
The measure changes, the outputs change.

1347
00:49:12,680 --> 00:49:15,240
Nobody notices until a regulator, an auditor,

1348
00:49:15,240 --> 00:49:17,160
or finance compares this year's report

1349
00:49:17,160 --> 00:49:19,120
to a saved PDF from last year.

1350
00:49:19,120 --> 00:49:20,800
And now you're in restatement territory

1351
00:49:20,800 --> 00:49:22,920
except you don't have restatement mechanics.

1352
00:49:22,920 --> 00:49:24,720
You have a dashboard that rewrote history

1353
00:49:24,720 --> 00:49:26,160
without leaving an obvious scar.

1354
00:49:26,160 --> 00:49:29,560
That's why auditors increasingly flag logic-heavy DAX models.

1355
00:49:29,560 --> 00:49:30,840
Not because DAX is wrong,

1356
00:49:30,840 --> 00:49:32,560
because DAX is too easy to change

1357
00:49:32,560 --> 00:49:33,880
without the control ceremony

1358
00:49:33,880 --> 00:49:36,320
that should accompany changes to reported numbers.

1359
00:49:36,320 --> 00:49:38,680
The architecture rule that stops this is brutal.

1360
00:49:38,680 --> 00:49:42,280
Power BI is a thin semantic layer over reported tables only.

1361
00:49:42,280 --> 00:49:44,920
That means the calculation zone produces the KPI tables.

1362
00:49:44,920 --> 00:49:46,960
Those KPI tables are period-closed outputs.

1363
00:49:46,960 --> 00:49:50,040
Power BI reads them. Power BI can aggregate and format them.

1364
00:49:50,040 --> 00:49:51,160
It can create visuals.

1365
00:49:51,160 --> 00:49:52,680
It can create drill paths.

1366
00:49:52,680 --> 00:49:54,400
It can even create convenience measures

1367
00:49:54,400 --> 00:49:57,400
that don't affect the underlying accounting of emissions.

1368
00:49:57,400 --> 00:49:59,800
But it cannot be the place where emissions accounting lives.

1369
00:49:59,800 --> 00:50:02,400
If you remember nothing else, DAX measures should be formatting,

1370
00:50:02,400 --> 00:50:03,360
not accounting.

1371
00:50:03,360 --> 00:50:06,440
Now, the system level countermeasure is equally blunt.

1372
00:50:06,440 --> 00:50:09,920
KPI outputs are tables, not measures.

1373
00:50:09,920 --> 00:50:11,400
Instead of total Scope 2 emissions

1374
00:50:11,400 --> 00:50:13,920
being a measure that depends on five other measures,

1375
00:50:13,920 --> 00:50:16,080
it becomes a column in a reported KPI table

1376
00:50:16,080 --> 00:50:17,840
produced by Fabric or Synapse

1377
00:50:17,840 --> 00:50:22,640
with keys for period, org unit, scope category, method, and factor version.

1378
00:50:22,640 --> 00:50:24,200
Power BI then displays it.

1379
00:50:24,200 --> 00:50:26,280
When someone asks, why did it change?

1380
00:50:26,280 --> 00:50:28,640
You have an answer rooted in artifacts,

1381
00:50:28,640 --> 00:50:32,720
input load IDs, factor versions and calculation release version.

1382
00:50:32,720 --> 00:50:35,120
Not "because someone updated the report."

1383
00:50:35,120 --> 00:50:37,640
A practical control pattern looks like this.

1384
00:50:37,640 --> 00:50:39,960
You publish a KPI data set that is certified

1385
00:50:39,960 --> 00:50:42,440
and only refreshes from the reported zone.

1386
00:50:42,440 --> 00:50:45,320
You separate two dashboard classes, assurance dashboards

1387
00:50:45,320 --> 00:50:47,440
that only read period-closed outputs

1388
00:50:47,440 --> 00:50:51,440
and management dashboards that can read operational or provisional data.

1389
00:50:51,440 --> 00:50:53,160
That distinction matters because management

1390
00:50:53,160 --> 00:50:57,160
wants speed and iteration, assurance wants stability and traceability,

1391
00:50:57,160 --> 00:50:59,600
mixing them guarantees you'll optimize for convenience

1392
00:50:59,600 --> 00:51:01,280
and later pretend it was governance.

1393
00:51:01,280 --> 00:51:02,640
And you still keep snapshots.

1394
00:51:02,640 --> 00:51:06,200
At close, you export and store the period close report outputs

1395
00:51:06,200 --> 00:51:08,320
as artifacts in your evidence vault.

1396
00:51:08,320 --> 00:51:10,240
PDF for human readable records

1397
00:51:10,240 --> 00:51:14,040
and CSV or data set extracts for machine traceability.

1398
00:51:14,040 --> 00:51:16,320
Those snapshots don't replace the reported tables.

1399
00:51:16,320 --> 00:51:17,320
They complement them.

1400
00:51:17,320 --> 00:51:19,120
They prove what was presented at the time.

1401
00:51:19,120 --> 00:51:20,400
Now here's the cynical truth.

1402
00:51:20,400 --> 00:51:23,720
People will try to sneak logic back into Power BI because it's faster.

1403
00:51:23,720 --> 00:51:25,560
They'll say, it's just a small adjustment.

1404
00:51:25,560 --> 00:51:27,320
They'll say, we can do it as a measure.

1405
00:51:27,320 --> 00:51:29,360
They'll say, it's only for this visual.

1406
00:51:29,360 --> 00:51:32,840
And then six months later, you discover the visual became the source of truth

1407
00:51:32,840 --> 00:51:34,680
because it was the thing executives looked at.

1408
00:51:34,680 --> 00:51:35,760
That's how drift wins.

1409
00:51:35,760 --> 00:51:38,840
So, enforce the boundary: calculations in the governed zone,

1410
00:51:38,840 --> 00:51:41,960
outputs in reported tables, Power BI as presentation.

1411
00:51:41,960 --> 00:51:44,000
And once you move logic out of the dashboard,

1412
00:51:44,000 --> 00:51:45,440
you'll hit the next failure mode.

1413
00:51:45,440 --> 00:51:47,040
Even with SQL-based logic,

1414
00:51:47,040 --> 00:51:48,440
you still can't reproduce history

1415
00:51:48,440 --> 00:51:49,880
if your emission factors float.

1416
00:51:49,880 --> 00:51:52,680
Failure mode three, missing factor versioning.

1417
00:51:52,680 --> 00:51:54,600
Missing factor versioning is the failure mode

1418
00:51:54,600 --> 00:51:57,000
that makes every other control look decorative.

1419
00:51:57,000 --> 00:51:58,560
You can have immutable storage.

1420
00:51:58,560 --> 00:51:59,800
You can have lineage.

1421
00:51:59,800 --> 00:52:01,720
You can even have a governed calculation zone.

1422
00:52:01,720 --> 00:52:04,560
But if your emission factors behave like current truth,

1423
00:52:04,560 --> 00:52:07,120
you've built a system that recalculates the past

1424
00:52:07,120 --> 00:52:08,080
with today's assumptions.

1425
00:52:08,080 --> 00:52:08,920
That's not reporting.

1426
00:52:08,920 --> 00:52:10,640
That's revisionism with a SQL engine.

1427
00:52:10,640 --> 00:52:12,600
Here's how it happens in real architectures.

1428
00:52:12,600 --> 00:52:15,080
A team centralizes emission factors in a table.

1429
00:52:15,080 --> 00:52:17,880
They add a column for source and maybe year.

1430
00:52:17,880 --> 00:52:21,360
They join activity data to factors by geography and activity type.

1431
00:52:21,360 --> 00:52:24,400
Then, because nobody wants to pass parameters around,

1432
00:52:24,400 --> 00:52:27,800
they add a view called something like vw_emission_factor_latest.

1433
00:52:27,800 --> 00:52:29,760
And every calculation joins to latest.

1434
00:52:29,760 --> 00:52:31,800
It works until the factor set updates.

1435
00:52:31,800 --> 00:52:34,800
Then you rerun FY-1 and FY-2 and the numbers change.

1436
00:52:34,800 --> 00:52:36,560
Not because the activity changed.

1437
00:52:36,560 --> 00:52:37,880
Not because the logic changed.

1438
00:52:37,880 --> 00:52:40,320
Because the factor table did what tables do,

1439
00:52:40,320 --> 00:52:42,080
it reflected the current state.

1440
00:52:42,080 --> 00:52:44,480
And your system quietly rewrote history.

1441
00:52:44,480 --> 00:52:47,000
This is why "we use DEFRA" isn't evidence.

1442
00:52:47,000 --> 00:52:48,720
It's a marketing label for a library.

1443
00:52:48,720 --> 00:52:49,360
Which version?

1444
00:52:49,360 --> 00:52:50,200
Which published date?

1445
00:52:50,200 --> 00:52:51,160
Which effective dates?

1446
00:52:51,160 --> 00:52:52,160
Which geography mapping?

1447
00:52:52,160 --> 00:52:53,080
Which category mapping?

1448
00:52:53,080 --> 00:52:55,040
If your answer is "the one in the table,"

1449
00:52:55,040 --> 00:52:57,320
you are admitting you can't reproduce a prior close.

1450
00:52:57,320 --> 00:53:00,400
And reproducibility is one of the audit-grade requirements

1451
00:53:00,400 --> 00:53:01,760
you claimed you met.

1452
00:53:01,760 --> 00:53:03,960
Now, what does assurance actually do with this?

1453
00:53:03,960 --> 00:53:05,880
They ask the simplest question on earth.

1454
00:53:05,880 --> 00:53:06,920
Prove the number.

1455
00:53:06,920 --> 00:53:07,880
Not tell a story.

1456
00:53:07,880 --> 00:53:09,000
Prove it.

1457
00:53:09,000 --> 00:53:10,040
They want to see the chain.

1458
00:53:10,040 --> 00:53:12,320
Activity record, factor record, and the logic

1459
00:53:12,320 --> 00:53:13,360
that multiplied them.

1460
00:53:13,360 --> 00:53:15,560
If the factor record is not pinned to the period,

1461
00:53:15,560 --> 00:53:17,680
you can't prove what it was at the time.

1462
00:53:17,680 --> 00:53:19,240
You can only show what it is now.

1463
00:53:19,240 --> 00:53:20,960
That becomes ambiguity and ambiguity

1464
00:53:20,960 --> 00:53:23,960
becomes qualifications, restatements, or a scope limitation

1465
00:53:23,960 --> 00:53:26,600
in an assurance report depending on how bad it is.

1466
00:53:26,600 --> 00:53:29,640
The most common real world trigger is annual factor updates.

1467
00:53:29,640 --> 00:53:31,040
A new factor library comes in.

1468
00:53:31,040 --> 00:53:32,040
Someone imports it.

1469
00:53:32,040 --> 00:53:35,000
They overwrite last year's rows because it's an update.

1470
00:53:35,000 --> 00:53:37,800
Then they rerun calculations to validate the new year.

1471
00:53:37,800 --> 00:53:39,440
And suddenly last year changes.

1472
00:53:39,440 --> 00:53:40,960
The dashboard still looks reasonable,

1473
00:53:40,960 --> 00:53:43,400
so nobody notices until finance compares numbers

1474
00:53:43,400 --> 00:53:46,120
to last year's submission or an auditor requests

1475
00:53:46,120 --> 00:53:48,280
a re-performance for the prior period.

1476
00:53:48,280 --> 00:53:50,360
And then the organization learns a painful lesson.

1477
00:53:50,360 --> 00:53:52,840
Factor drift is indistinguishable from manipulation

1478
00:53:52,840 --> 00:53:54,560
unless you can prove version binding.

1479
00:53:54,560 --> 00:53:57,360
So the architecture countermeasure has to be enforcement,

1480
00:53:57,360 --> 00:53:58,200
not guidance.

1481
00:53:58,200 --> 00:54:00,760
First, you build factor libraries as versioned sets.

1482
00:54:00,760 --> 00:54:02,520
A library has a unique version key.

1483
00:54:02,520 --> 00:54:04,080
The factor records carry that key.

1484
00:54:04,080 --> 00:54:05,880
And you never update an existing version.

1485
00:54:05,880 --> 00:54:07,640
You publish a new version, always.

1486
00:54:07,640 --> 00:54:10,400
Second, you bind factors to periods explicitly.

1487
00:54:10,400 --> 00:54:13,120
That can be done through a period close configuration table

1488
00:54:13,120 --> 00:54:15,200
that records, per reporting period,

1489
00:54:15,200 --> 00:54:18,000
the factor library version IDs used for each domain:

1490
00:54:18,000 --> 00:54:21,400
Electricity, fuel, travel, freight, whatever you model.

1491
00:54:21,400 --> 00:54:24,000
That configuration becomes part of the close package

1492
00:54:24,000 --> 00:54:26,600
and gets locked after close because it's the switchboard

1493
00:54:26,600 --> 00:54:28,360
that defines reproducibility.

1494
00:54:28,360 --> 00:54:31,400
Third, you make the pipeline fail without a factor version key.

1495
00:54:31,400 --> 00:54:34,480
This is the part teams avoid because it feels strict.

1496
00:54:34,480 --> 00:54:35,640
Good, it should.

1497
00:54:35,640 --> 00:54:39,000
In your SQL views, stored procedures or notebooks,

1498
00:54:39,000 --> 00:54:41,480
the join to factors must include the version key.

1499
00:54:41,480 --> 00:54:43,480
If the caller doesn't supply it, the job fails.

1500
00:54:43,480 --> 00:54:45,560
If the configuration table doesn't have a version

1501
00:54:45,560 --> 00:54:47,160
for that period, the job fails.

1502
00:54:47,160 --> 00:54:48,480
No silent fallbacks.

1503
00:54:48,480 --> 00:54:50,560
No latest, no convenience.

1504
00:54:50,560 --> 00:54:53,640
Because latest is an entropy generator disguised as a default.
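The three rules above can be sketched with an in-memory SQLite database, assuming a deliberately simplified schema. The table names, column names, version labels, and factor values are all invented for illustration; the point is only that the join to factors carries the version key from the period close configuration and that a missing binding stops the job.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create hypothetical factor, config, and activity tables with toy data."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE factor (
            library_version TEXT, activity_type TEXT, geography TEXT,
            kg_co2e_per_unit REAL,
            PRIMARY KEY (library_version, activity_type, geography));
        CREATE TABLE period_close_config (
            period TEXT, domain TEXT, library_version TEXT NOT NULL,
            PRIMARY KEY (period, domain));
        CREATE TABLE activity (
            period TEXT, activity_type TEXT, geography TEXT, quantity REAL);
        INSERT INTO factor VALUES ('lib-2023', 'electricity', 'GB', 0.25);
        INSERT INTO factor VALUES ('lib-2024', 'electricity', 'GB', 0.5);
        INSERT INTO period_close_config VALUES ('FY2023', 'electricity', 'lib-2023');
        INSERT INTO activity VALUES ('FY2023', 'electricity', 'GB', 1000.0);
    """)
    return con


def emissions_for(con: sqlite3.Connection, period: str, domain: str):
    """Compute emissions using only the factor library bound to the period."""
    row = con.execute(
        "SELECT library_version FROM period_close_config "
        "WHERE period = ? AND domain = ?", (period, domain)).fetchone()
    if row is None:
        # No silent fallback to a newer library: the job fails instead.
        raise LookupError(f"no factor library bound for {period}/{domain}")
    version = row[0]
    (total,) = con.execute(
        "SELECT SUM(a.quantity * f.kg_co2e_per_unit) "
        "FROM activity a JOIN factor f "
        "  ON f.activity_type = a.activity_type "
        " AND f.geography = a.geography "
        " AND f.library_version = ? "
        "WHERE a.period = ? AND a.activity_type = ?",
        (version, period, domain)).fetchone()
    if total is None:
        raise LookupError(f"no activity/factor rows matched for {period}/{domain}")
    return version, total
```

Publishing a newer library adds rows under a new version key and leaves the FY2023 binding untouched, so a rerun of FY2023 picks up the same factor it used at close. A period with no binding raises an error rather than borrowing whichever library happens to be newest.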

1505
00:54:53,640 --> 00:54:56,240
Once you enforce that, you can do an audit-ready rerun.

1506
00:54:56,240 --> 00:54:58,240
You can take FY-1 activity data.

1507
00:54:58,240 --> 00:55:01,000
You can select the load IDs that were in scope at close.

1508
00:55:01,000 --> 00:55:03,920
You can select the factor library versions recorded for that close.

1509
00:55:03,920 --> 00:55:07,600
You can run the calculation artifacts tied to the released logic version.

1510
00:55:07,600 --> 00:55:08,800
And you get the same result.

1511
00:55:08,800 --> 00:55:10,520
That's what reproducibility means.

1512
00:55:10,520 --> 00:55:11,520
Not close enough.

1513
00:55:11,520 --> 00:55:13,120
The same.

1514
00:55:13,120 --> 00:55:14,320
Now the subtle trap.

1515
00:55:14,320 --> 00:55:16,440
Effective dates and geography mapping.

1516
00:55:16,440 --> 00:55:19,200
Even with version keys, teams still mess up by storing factors

1517
00:55:19,200 --> 00:55:20,720
without applicability constraints

1518
00:55:20,720 --> 00:55:22,520
then joining based on best match.

1519
00:55:22,520 --> 00:55:24,920
That produces probabilistic factor selection.

1520
00:55:24,920 --> 00:55:26,840
So the factor model needs effective dates,

1521
00:55:26,840 --> 00:55:28,840
geography codes and classification keys

1522
00:55:28,840 --> 00:55:30,600
that make selection deterministic.

1523
00:55:30,600 --> 00:55:33,960
If multiple factors match, the pipeline should fail and force resolution,

1524
00:55:33,960 --> 00:55:36,000
not pick one and pretend it was intentional.
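A minimal sketch of deterministic factor selection, assuming factors are held as plain records with inclusive effective dates. The `Factor` fields and the `select_factor` helper are hypothetical; the behavior that matters is that zero matches and multiple matches both raise instead of resolving quietly.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable


@dataclass(frozen=True)
class Factor:
    library_version: str
    activity_type: str
    geography: str
    valid_from: date
    valid_to: date    # inclusive end of applicability
    value: float


def select_factor(factors: Iterable[Factor], library_version: str,
                  activity_type: str, geography: str, on: date) -> Factor:
    """Return the single applicable factor, or fail and force resolution."""
    matches = [
        f for f in factors
        if f.library_version == library_version
        and f.activity_type == activity_type
        and f.geography == geography
        and f.valid_from <= on <= f.valid_to
    ]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one factor for {library_version}/{activity_type}/"
            f"{geography} on {on.isoformat()}, found {len(matches)}")
    return matches[0]
```

There is no ranking, no nearest geography, and no "best match" tie-break in this sketch. Overlapping effective date ranges are treated as a data defect to fix in the library, not something the join papers over.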

1525
00:55:36,000 --> 00:55:37,560
And the final piece is evidence.

1526
00:55:37,560 --> 00:55:40,600
When you publish a factor library version, store provenance.

1527
00:55:40,600 --> 00:55:42,680
Where it came from and when it was approved.

1528
00:55:42,680 --> 00:55:46,280
In Microsoft Sustainability Manager, factors can live in factor libraries.

1529
00:55:46,280 --> 00:55:49,240
In Fabric or Synapse, you'll model them as tables.

1530
00:55:49,240 --> 00:55:51,880
Either way, the published version used for a close period

1531
00:55:51,880 --> 00:55:53,720
becomes part of the evidence chain.

1532
00:55:53,720 --> 00:55:56,480
So it gets locked and registered in Purview for lineage.

1533
00:55:56,480 --> 00:55:59,240
Because the moment you can't prove which factors were used,

1534
00:55:59,240 --> 00:56:01,000
you're no longer defending a number,

1535
00:56:01,000 --> 00:56:02,800
you're defending a belief about a number

1536
00:56:02,800 --> 00:56:05,200
and auditors don't assure beliefs.

1537
00:56:05,200 --> 00:56:08,880
Purview: lineage as your only defense against "prove it" moments.

1538
00:56:08,880 --> 00:56:10,640
At some point, someone will ask the question

1539
00:56:10,640 --> 00:56:12,360
that ends the fun part of ESG.

1540
00:56:12,360 --> 00:56:13,160
Prove it.

1541
00:56:13,160 --> 00:56:14,760
Not explain it, not summarize it.

1542
00:56:14,760 --> 00:56:16,840
Prove that this KPI came from these sources

1543
00:56:16,840 --> 00:56:19,480
went through these transformations and landed in this report

1544
00:56:19,480 --> 00:56:23,160
without being casually rewritten by whoever had access on a Tuesday.

1545
00:56:23,160 --> 00:56:25,080
This is where most ESG stacks collapse

1546
00:56:25,080 --> 00:56:27,280
because they rely on human memory and slide decks.

1547
00:56:27,280 --> 00:56:30,000
That's not governance, that's folklore.

1548
00:56:30,000 --> 00:56:33,080
Microsoft Purview is the mechanism that turns your folklore

1549
00:56:33,080 --> 00:56:34,480
into queryable metadata.

1550
00:56:34,480 --> 00:56:37,200
And that distinction matters because lineage isn't a diagram

1551
00:56:37,200 --> 00:56:37,960
you draw once.

1552
00:56:37,960 --> 00:56:40,760
It's an operational record of how data moved, changed shape,

1553
00:56:40,760 --> 00:56:43,560
and became something the business now claims is true.

1554
00:56:43,560 --> 00:56:45,560
Lineage in plain system terms is origin

1555
00:56:45,560 --> 00:56:47,880
to transformation to consumption.

1556
00:56:47,880 --> 00:56:49,800
Origin is where the data came from.

1557
00:56:49,800 --> 00:56:53,640
ERP extracts, meter feeds, supplier submissions, HR aggregates.

1558
00:56:53,640 --> 00:56:55,280
Transformation is what you did to it.

1559
00:56:55,280 --> 00:56:58,240
Validation, standardization, mapping, factor application, KPI

1560
00:56:58,240 --> 00:57:00,560
computation. Consumption is where it shows up.

1561
00:57:00,560 --> 00:57:04,360
Reported tables, semantic models, Power BI reports, exports.

1562
00:57:04,360 --> 00:57:07,800
If any link in that chain is "someone knows," you don't have lineage.

1563
00:57:07,800 --> 00:57:09,200
You have a future incident.

1564
00:57:09,200 --> 00:57:11,080
Here's what you actually register in Purview

1565
00:57:11,080 --> 00:57:13,520
if you want it to be useful under audit pressure.

1566
00:57:13,520 --> 00:57:18,200
You register the storage assets: lakehouses, ADLS paths,

1567
00:57:18,200 --> 00:57:21,320
containers that correspond to raw, curated, reported,

1568
00:57:21,320 --> 00:57:22,720
and the evidence vault.

1569
00:57:22,720 --> 00:57:24,720
You register the processing assets.

1570
00:57:24,720 --> 00:57:28,640
Pipelines, notebooks, SQL endpoints, whatever actually

1571
00:57:28,640 --> 00:57:31,160
performs transformations and publishes outputs.

1572
00:57:31,160 --> 00:57:33,120
And you register the analytics assets.

1573
00:57:33,120 --> 00:57:34,920
The data sets and reports people use,

1574
00:57:34,920 --> 00:57:37,160
including the certified data sets that represent

1575
00:57:37,160 --> 00:57:38,200
the assurance layer.

1576
00:57:38,200 --> 00:57:41,000
And you assign ownership. Real ownership: not "the team,"

1577
00:57:41,000 --> 00:57:42,520
a named role with accountability.

1578
00:57:42,520 --> 00:57:45,520
The thing most people miss is that auditors don't only ask

1579
00:57:45,520 --> 00:57:46,760
where did the number come from.

1580
00:57:46,760 --> 00:57:49,280
They ask who is responsible for this asset.

1581
00:57:49,280 --> 00:57:52,440
Purview is where you make that answer deterministic instead of social.

1582
00:57:52,440 --> 00:57:54,160
Now, what does this look like in practice

1583
00:57:54,160 --> 00:57:56,040
in the moment you're under scrutiny?

1584
00:57:56,040 --> 00:57:58,280
A stakeholder points at Scope 2 for a region

1585
00:57:58,280 --> 00:58:00,040
and says, this seems high.

1586
00:58:00,040 --> 00:58:02,840
If you have lineage, you can trace from the Power BI visual

1587
00:58:02,840 --> 00:58:04,600
back to the reported KPI table,

1588
00:58:04,600 --> 00:58:06,680
back to the calculation view or notebook,

1589
00:58:06,680 --> 00:58:08,560
back to the curated consumption table,

1590
00:58:08,560 --> 00:58:11,200
back to the raw invoice extract or meter feed.

1591
00:58:11,200 --> 00:58:15,240
And you can identify the load IDs and factor library version key used.

1592
00:58:15,240 --> 00:58:16,600
You can do it in minutes.
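
A minimal sketch of that trace, assuming a hypothetical in-memory lineage graph where each asset records the one asset it was built from. The asset names are illustrative and this is not the Purview API; a real catalog would supply the edges.

```python
# Hypothetical lineage edges: each asset maps to the asset it was built from.
# Names are illustrative only; a governance catalog would supply this graph.
UPSTREAM = {
    "powerbi:scope2_by_region": "reported.kpi_scope2",
    "reported.kpi_scope2": "notebook:calc_scope2",
    "notebook:calc_scope2": "curated.energy_consumption",
    "curated.energy_consumption": "raw.meter_feed",
}


def trace_to_origin(asset: str) -> list[str]:
    """Walk upstream from a consumption asset until an asset with no parent."""
    path = [asset]
    seen = {asset}
    while path[-1] in UPSTREAM:
        parent = UPSTREAM[path[-1]]
        if parent in seen:  # guard against an accidental cycle in the graph
            raise ValueError(f"cycle in lineage at {parent}")
        path.append(parent)
        seen.add(parent)
    return path


if __name__ == "__main__":
    for hop in trace_to_origin("powerbi:scope2_by_region"):
        print(hop)
```

If any hop is missing from the graph, the walk stops early, which is the mechanical version of a link in the chain that only someone knows.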

1593
00:58:16,600 --> 00:58:19,240
Without lineage, you do PowerPoint archaeology.

1594
00:58:19,240 --> 00:58:20,240
You open old emails.

1595
00:58:20,240 --> 00:58:21,720
You ask someone who left the company.

1596
00:58:21,720 --> 00:58:24,600
You rebuild the path from memory and hope it matches reality.

1597
00:58:24,600 --> 00:58:25,360
It won't.

1598
00:58:25,360 --> 00:58:26,760
Lineage isn't only for auditors.

1599
00:58:26,760 --> 00:58:31,200
It's also the fastest way to find where data quality issues actually entered the system.

1600
00:58:31,200 --> 00:58:34,040
When a KPI looks wrong, teams usually blame the calculation.

1601
00:58:34,040 --> 00:58:37,080
Half the time the calculation is fine and the input mapping is wrong

1602
00:58:37,080 --> 00:58:39,040
or the organizational hierarchy changed.

1603
00:58:39,040 --> 00:58:40,200
Or a unit got misread.

1604
00:58:40,200 --> 00:58:41,640
Lineage gives you a breadcrumb trail,

1605
00:58:41,640 --> 00:58:43,880
so root cause analysis becomes mechanical.

1606
00:58:43,880 --> 00:58:46,520
Find the upstream change point, not the downstream symptom.

1607
00:58:46,520 --> 00:58:48,840
And here's the other use case nobody budgets for.

1608
00:58:48,840 --> 00:58:49,880
Impact analysis.

1609
00:58:49,880 --> 00:58:52,840
Every time you change a pipeline, a mapping or a factor library,

1610
00:58:52,840 --> 00:58:55,000
you are changing a graph of dependencies.

1611
00:58:55,000 --> 00:58:58,600
Without lineage, you don't know what you'll break until something breaks.

1612
00:58:58,600 --> 00:59:02,240
With lineage, you can see downstream consumers before you ship the change.

1613
00:59:02,240 --> 00:59:05,680
That's how you stop small improvements from becoming multi-year restatements.
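
Impact analysis is the same graph walked in the other direction. A rough sketch, assuming a hypothetical dependency map from each asset to the assets that read from it; the names are made up for illustration.

```python
from collections import deque

# Hypothetical dependency edges: asset -> assets that consume it.
DOWNSTREAM = {
    "factor_library": ["notebook:calc_scope2", "notebook:calc_scope3"],
    "notebook:calc_scope2": ["reported.kpi_scope2"],
    "notebook:calc_scope3": ["reported.kpi_scope3"],
    "reported.kpi_scope2": ["dataset:esg_certified"],
    "reported.kpi_scope3": ["dataset:esg_certified"],
    "dataset:esg_certified": ["report:regulatory"],
}


def downstream_consumers(changed: str) -> list[str]:
    """Breadth-first walk listing every asset affected by a change, once each."""
    affected: list[str] = []
    seen = {changed}
    queue = deque([changed])
    while queue:
        for consumer in DOWNSTREAM.get(queue.popleft(), []):
            if consumer not in seen:
                seen.add(consumer)
                affected.append(consumer)
                queue.append(consumer)
    return affected


if __name__ == "__main__":
    print(downstream_consumers("factor_library"))
```

Run before shipping a factor library change, a list like this is what a release review would look at.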

1614
00:59:05,680 --> 00:59:06,880
Now there's a reality check.

1615
00:59:06,880 --> 00:59:10,160
Purview capabilities evolve, integrations change,

1616
00:59:10,160 --> 00:59:13,560
some sustainability-specific solutions in the Microsoft ecosystem

1617
00:59:13,560 --> 00:59:15,760
show up in preview states and then move.

1618
00:59:15,760 --> 00:59:17,520
That is not a reason to avoid governance.

1619
00:59:17,520 --> 00:59:21,240
It's the reason to avoid hard coding your governance into documentation.

1620
00:59:21,240 --> 00:59:23,360
Your architecture has to tolerate product drift

1621
00:59:23,360 --> 00:59:26,480
by treating lineage as a first-class system behavior.

1622
00:59:26,480 --> 00:59:29,680
Register assets consistently, enforce naming conventions,

1623
00:59:29,680 --> 00:59:33,840
keep ownership current and make lineage review part of release management.

1624
00:59:33,840 --> 00:59:34,960
And yes, there's setup.

1625
00:59:34,960 --> 00:59:37,800
You will configure connectors, you will manage identities,

1626
00:59:37,800 --> 00:59:40,640
you will decide which assets get scanned and how often.

1627
00:59:40,640 --> 00:59:44,240
You will deal with the fact that not everything stitches perfectly on day one.

1628
00:59:44,240 --> 00:59:45,760
But you're not doing this for aesthetics.

1629
00:59:45,760 --> 00:59:48,880
You're doing it because "prove it" moments don't arrive on your schedule.

1630
00:59:48,880 --> 00:59:50,320
They arrive when the board is watching.

1631
00:59:50,320 --> 00:59:53,080
So Purview becomes your only defensible posture.

1632
00:59:53,080 --> 00:59:57,920
A way to demonstrate end to end that your ESG numbers are products of controlled systems,

1633
00:59:57,920 --> 01:00:00,240
not a collection of best effort narratives.

1634
01:00:00,240 --> 01:00:03,320
And once you accept that, the next dependency becomes obvious.

1635
01:00:03,320 --> 01:00:05,040
Governance without identity is theater.

1636
01:00:05,040 --> 01:00:07,720
Identity is what turns metadata into enforcement.

1637
01:00:07,720 --> 01:00:09,560
Entra ID plus role separation.

1638
01:00:09,560 --> 01:00:11,520
Stop letting everyone be everyone.

1639
01:00:11,520 --> 01:00:13,920
Most organizations say they have governance.

1640
01:00:13,920 --> 01:00:16,800
Then you look at their permissions and realize they have hope.

1641
01:00:16,800 --> 01:00:19,800
They treat Microsoft Entra ID like a login system.

1642
01:00:19,800 --> 01:00:21,120
Not what it actually is.

1643
01:00:21,120 --> 01:00:23,440
The control plane for who can touch evidence,

1644
01:00:23,440 --> 01:00:28,280
who can change logic and who can publish numbers that will later be defended in an assurance room.

1645
01:00:28,280 --> 01:00:32,240
That distinction matters because ESG fails when identity becomes optional.

1646
01:00:32,240 --> 01:00:33,920
Role separation is not bureaucracy.

1647
01:00:33,920 --> 01:00:39,040
It is the only reason an auditor believes your system didn't quietly rewrite itself under deadline pressure.

1648
01:00:39,040 --> 01:00:41,920
So the model is simple and it's intentionally boring.

1649
01:00:41,920 --> 01:00:43,600
Submitter, validator,

1650
01:00:43,600 --> 01:00:46,320
calculator, approver, report publisher.

1651
01:00:46,320 --> 01:00:48,000
A submitter can provide data.

1652
01:00:48,000 --> 01:00:49,560
They can't edit raw archives.

1653
01:00:49,560 --> 01:00:50,800
They can't change mappings.

1654
01:00:50,800 --> 01:00:52,480
They can't adjust reported outputs.

1655
01:00:52,480 --> 01:00:56,360
A validator can review ingestion results and data quality exceptions.

1656
01:00:56,360 --> 01:00:58,560
They can quarantine or accept with exception.

1657
01:00:58,560 --> 01:00:59,680
They can't publish factors.

1658
01:00:59,680 --> 01:01:01,560
They can't deploy calculation code.

1659
01:01:01,560 --> 01:01:03,920
A calculator can run the governed compute process.

1660
01:01:03,920 --> 01:01:05,400
They can't alter raw evidence.

1661
01:01:05,400 --> 01:01:07,120
They can't approve their own changes.

1662
01:01:07,120 --> 01:01:09,000
They can't publish the final report.

1663
01:01:09,000 --> 01:01:11,440
An approver can sign off on period close,

1664
01:01:11,440 --> 01:01:14,200
factor library versions and post-close adjustments.

1665
01:01:14,200 --> 01:01:16,320
They don't need broad data engineering access.

1666
01:01:16,320 --> 01:01:17,960
They need explicit rights to approve

1667
01:01:17,960 --> 01:01:20,400
and their approvals need to be recorded as evidence.

1668
01:01:20,400 --> 01:01:24,240
A report publisher can publish certified data sets and reports

1669
01:01:24,240 --> 01:01:26,040
that consume reported outputs.

1670
01:01:26,040 --> 01:01:30,960
They cannot modify the calculation logic or the data inputs that produce those outputs.

1671
01:01:30,960 --> 01:01:33,680
If you collapse those roles into the sustainability team,

1672
01:01:33,680 --> 01:01:38,080
you've built a system where the same identity can create data, change data,

1673
01:01:38,080 --> 01:01:40,080
compute results and approve results.

1674
01:01:40,080 --> 01:01:41,400
That is not control.

1675
01:01:41,400 --> 01:01:43,160
That is conditional chaos.
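
The five roles can be written down as data and checked mechanically. A minimal sketch: the permission strings are hypothetical labels, not Entra ID role names, and the rule that one identity holds at most one role is a simplifying assumption.

```python
# Permissions per role, following the model described above.
# Permission strings are hypothetical labels, not Entra ID role names.
ROLE_PERMISSIONS = {
    "submitter": {"submit_data"},
    "validator": {"review_ingestion", "quarantine", "accept_with_exception"},
    "calculator": {"run_compute"},
    "approver": {"approve_close", "approve_factor_version", "approve_adjustment"},
    "report_publisher": {"publish_certified_dataset", "publish_report"},
}


def is_allowed(roles: set[str], permission: str) -> bool:
    """True if any of the identity's roles grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


def separation_violations(assignments: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return identities holding more than one governed role (assumed rule)."""
    governed = set(ROLE_PERMISSIONS)
    return {
        identity: roles & governed
        for identity, roles in assignments.items()
        if len(roles & governed) > 1
    }


if __name__ == "__main__":
    assignments = {"alice": {"submitter"}, "bob": {"calculator", "approver"}}
    print(separation_violations(assignments))
```

A check like this could run against exported group memberships on a schedule, so a collapsed role shows up as a finding rather than a surprise.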

1676
01:01:43,160 --> 01:01:48,240
Now here's the part everyone tries to dodge.

1677
01:01:48,240 --> 01:01:49,920
Access boundaries by zone.

1678
01:01:49,920 --> 01:01:53,160
Raw, curated and reported are not just storage partitions.

1679
01:01:53,160 --> 01:01:54,560
They are permission boundaries.

1680
01:01:54,560 --> 01:01:57,400
Raw should be readable by the people who need to trace provenance

1681
01:01:57,400 --> 01:01:58,720
and resolve ingestion issues,

1682
01:01:58,720 --> 01:02:04,360
but writable only by controlled ingestion identities, typically service principals executing pipelines.

1683
01:02:04,360 --> 01:02:06,400
Humans don't get write access to raw evidence.

1684
01:02:06,400 --> 01:02:09,760
They get a submission mechanism that produces new immutable artifacts.

1685
01:02:09,760 --> 01:02:11,000
That's a different thing.

1686
01:02:11,000 --> 01:02:14,840
Curated should be writable only by the transformation process identities

1687
01:02:14,840 --> 01:02:16,960
and the engineers responsible for the model.

1688
01:02:16,960 --> 01:02:21,040
Broader read access is fine, but write access is a scalpel, not a group membership.

1689
01:02:21,040 --> 01:02:22,760
Reported should be locked down hardest.

1690
01:02:22,760 --> 01:02:27,200
Write access only for the close process and controlled adjustment workflows.

1691
01:02:27,200 --> 01:02:31,520
Read access for reporting, finance, internal audit and whoever consumes the KPIs.

1692
01:02:31,520 --> 01:02:35,760
But nobody should be casually updating reported tables because "it's just a fix."

1693
01:02:35,760 --> 01:02:36,880
Fixes are adjustments.

1694
01:02:36,880 --> 01:02:38,240
Adjustments have approvals.

1695
01:02:38,240 --> 01:02:40,440
Approvals have identity separation.
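
The zone boundaries reduce to a deny-by-default write policy. A small sketch, assuming hypothetical identity names; in a real deployment these would be service principals, managed identities, and groups with role assignments on storage, not a Python dictionary.

```python
# Who may write to each zone. Identity names are hypothetical placeholders.
ZONE_WRITERS = {
    "raw": {"sp-ingestion"},
    "curated": {"sp-transform", "grp-model-engineers"},
    "reported": {"sp-period-close", "sp-adjustment-workflow"},
}


def may_write(identity: str, zone: str) -> bool:
    """Deny by default: only identities listed for the zone may write to it."""
    return identity in ZONE_WRITERS.get(zone, set())


if __name__ == "__main__":
    print(may_write("sp-ingestion", "raw"))        # pipeline identity
    print(may_write("grp-model-engineers", "raw"))  # humans stay out of raw
```

The point of writing it down this way is that the policy becomes something you can diff and review, not something you remember.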

1696
01:02:40,440 --> 01:02:44,920
And yes, all of this is enforced with Entra groups, service principals, managed identities

1697
01:02:44,920 --> 01:02:49,200
and RBAC assignments at the storage and compute layers. Not in Visio. In the system.

1698
01:02:49,200 --> 01:02:51,960
Now, evidence of control matters as much as control itself.

1699
01:02:51,960 --> 01:02:53,360
You don't just need permissions.

1700
01:02:53,360 --> 01:02:54,600
You need proof of permissions.

1701
01:02:54,600 --> 01:02:57,760
So you treat Entra assignments, role memberships and privilege changes

1702
01:02:57,760 --> 01:02:59,560
as part of the assurance package.

1703
01:02:59,560 --> 01:03:02,200
When the auditor asks, who could have changed this?

1704
01:03:02,200 --> 01:03:03,640
You don't answer with a meeting.

1705
01:03:03,640 --> 01:03:08,240
You answer with access history, role definitions and audit logs.

1706
01:03:08,240 --> 01:03:12,920
Which brings us to the most common ESG security anti-pattern, the hero admin.

1707
01:03:12,920 --> 01:03:15,920
The hero admin shows up when the pipeline breaks, the close date is near

1708
01:03:15,920 --> 01:03:18,520
and somebody says, "just give me Contributor for a minute."

1709
01:03:18,520 --> 01:03:20,840
Temporary elevation becomes permanent.

1710
01:03:20,840 --> 01:03:22,320
Exceptions become normal.

1711
01:03:22,320 --> 01:03:25,640
And then months later you discover your separation of duties is a myth

1712
01:03:25,640 --> 01:03:28,280
because everyone has been operating as everyone.

1713
01:03:28,280 --> 01:03:30,480
The countermeasure isn't "trust people more."

1714
01:03:30,480 --> 01:03:34,440
It's making elevation visible and costly. If you must allow privileged access,

1715
01:03:34,440 --> 01:03:37,400
you make it time-bound, explicitly approved, and logged.

1716
01:03:37,400 --> 01:03:40,320
You treat it as an incident artifact, not a convenience.

1717
01:03:40,320 --> 01:03:43,680
Because every exception is an entropy generator that will be reused.
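
One way to picture time-bound, approved, logged elevation is as a record that expires on its own. This is a sketch under assumptions: the `Elevation` class and its fields are hypothetical, and a real setup would lean on the identity platform's privileged access features, not application code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Elevation:
    """A hypothetical record of one privileged-access grant."""

    identity: str
    role: str
    approved_by: str
    granted_at: datetime
    duration: timedelta

    def __post_init__(self) -> None:
        # Approval must come from a different identity than the requester.
        if self.approved_by == self.identity:
            raise ValueError("self-approved elevation is not allowed")

    def is_active(self, now: datetime) -> bool:
        """The grant applies only inside its window; it lapses without cleanup."""
        return self.granted_at <= now < self.granted_at + self.duration


if __name__ == "__main__":
    start = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
    grant = Elevation("eng-1", "Contributor", "mgr-1", start, timedelta(hours=1))
    print(grant.is_active(start + timedelta(minutes=30)))
    print(grant.is_active(start + timedelta(hours=2)))
```

Because the record is immutable and carries the approver, it can be filed as the incident artifact it is.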

1718
01:03:43,680 --> 01:03:45,000
And here is the awkward truth.

1719
01:03:45,000 --> 01:03:47,640
The sustainability organization will fight you on this.

1720
01:03:47,640 --> 01:03:48,760
They'll say it slows them down.

1721
01:03:48,760 --> 01:03:49,320
They're correct.

1722
01:03:49,320 --> 01:03:51,240
Controls always slow down change.

1723
01:03:51,240 --> 01:03:52,160
That's the trade.

1724
01:03:52,160 --> 01:03:55,680
If you want audit grade reporting, you don't optimize for speed of edits.

1725
01:03:55,680 --> 01:03:57,600
You optimize for survivability under scrutiny.

1726
01:03:57,600 --> 01:04:00,680
So you design the workflow so the compliant path is the easy path.

1727
01:04:00,680 --> 01:04:03,000
Controlled submissions instead of shared folders,

1728
01:04:03,000 --> 01:04:06,440
approved factor publishing instead of spreadsheet swaps, period-close gates

1729
01:04:06,440 --> 01:04:10,520
that lock storage, and reports that only read from reported outputs.

1730
01:04:10,520 --> 01:04:12,320
Entra is what makes all of that enforceable.

1731
01:04:12,320 --> 01:04:16,040
Without it, Purview shows you lineage of data that anyone could have altered.

1732
01:04:16,040 --> 01:04:17,040
That's not governance.

1733
01:04:17,040 --> 01:04:19,320
That's cataloging your own uncertainty.

1734
01:04:19,320 --> 01:04:23,480
Next, we talk about where organizations reintroduce entropy for convenience.

1735
01:04:23,480 --> 01:04:24,840
The reporting layer.

1736
01:04:24,840 --> 01:04:28,000
Reporting layer: Power BI as presentation, not truth.

1737
01:04:28,000 --> 01:04:31,120
Reporting is where most teams undo everything they just built

1738
01:04:31,120 --> 01:04:33,480
because Power BI makes it easy to be helpful.

1739
01:04:33,480 --> 01:04:35,280
Helpful is not a control objective.

1740
01:04:35,280 --> 01:04:38,520
In an auditable ESG stack, Power BI is a presentation layer.

1741
01:04:38,520 --> 01:04:41,080
A thin semantic layer over reported outputs.

1742
01:04:41,080 --> 01:04:42,120
It does not own the math.

1743
01:04:42,120 --> 01:04:43,720
It does not fix missing data.

1744
01:04:43,720 --> 01:04:47,760
And it does not quietly restate history because someone wanted a cleaner chart.

1745
01:04:47,760 --> 01:04:49,840
The system behavior you want is simple.

1746
01:04:49,840 --> 01:04:53,520
Fabric or Synapse produces period-closed KPI tables in the reported zone.

1747
01:04:53,520 --> 01:04:56,520
Power BI reads those tables through certified data sets.

1748
01:04:56,520 --> 01:04:58,480
The report is a window, not a calculator.

1749
01:04:58,480 --> 01:05:00,920
That distinction matters because the report is the thing

1750
01:05:00,920 --> 01:05:06,360
executives screenshot, regulators request, and auditors reconcile against the evidence package.

1751
01:05:06,360 --> 01:05:09,720
If the report can change without a corresponding change in the reported tables,

1752
01:05:09,720 --> 01:05:11,360
you've created a second truth.

1753
01:05:11,360 --> 01:05:13,640
And you will spend the next year arguing with yourself.

1754
01:05:13,640 --> 01:05:16,000
So you build two classes of dashboards on purpose.

1755
01:05:16,000 --> 01:05:18,280
The first class is regulatory and assurance reporting.

1756
01:05:18,280 --> 01:05:20,160
It reads only from the reported zone.

1757
01:05:20,160 --> 01:05:21,840
It refreshes on controlled schedules.

1758
01:05:21,840 --> 01:05:23,400
It uses certified data sets.

1759
01:05:23,400 --> 01:05:27,360
It has locked definitions and a release process that looks boring on purpose.

1760
01:05:27,360 --> 01:05:29,880
The second class is management and operations reporting.

1761
01:05:29,880 --> 01:05:32,680
It can read curated and even operational data.

1762
01:05:32,680 --> 01:05:33,600
It can move quickly.

1763
01:05:33,600 --> 01:05:35,720
It can support "what's happening right now"

1764
01:05:35,720 --> 01:05:36,720
questions.

1765
01:05:36,720 --> 01:05:39,440
But it is explicitly labeled as operational, not reportable.

1766
01:05:39,440 --> 01:05:42,640
Different audience, different expectations, different tolerance for drift.

1767
01:05:42,640 --> 01:05:46,920
If you collapse those into one dashboard, you'll optimize for executive convenience and

1768
01:05:46,920 --> 01:05:48,720
accidentally publish it as evidence.

1769
01:05:48,720 --> 01:05:52,440
Now, even in a thin semantic layer, you still need governance because semantics are where

1770
01:05:52,440 --> 01:05:53,440
definitions drift.

1771
01:05:53,440 --> 01:05:59,680
Use certified data sets, not personal workspaces and not an analyst's final pbix.

1772
01:05:59,680 --> 01:06:03,560
Certification is the signal that the data set is backed by controlled sources, has an owner,

1773
01:06:03,560 --> 01:06:05,520
and is part of the assurance boundary.

1774
01:06:05,520 --> 01:06:06,520
Promoted is not enough.

1775
01:06:06,520 --> 01:06:09,520
Promoted is a social tag, certified is a control decision.

1776
01:06:09,520 --> 01:06:10,520
Control the refresh path.

1777
01:06:10,520 --> 01:06:15,240
If the data set refreshes from curated tables, someone will eventually change a curated

1778
01:06:15,240 --> 01:06:20,280
transformation and unintentionally shift a number that was treated as stable.

1779
01:06:20,280 --> 01:06:22,760
Assurance data sets refresh from reported tables only.

1780
01:06:22,760 --> 01:06:23,760
That's the rule.
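
That rule is easy to check automatically if table names carry their zone. A sketch, assuming a hypothetical naming convention where reported tables start with `reported.`; the data set and table names are invented for illustration.

```python
def non_reported_sources(dataset_sources: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each data set to the source tables that sit outside the reported zone."""
    offenders: dict[str, list[str]] = {}
    for dataset, tables in dataset_sources.items():
        # Assumed convention: zone is the prefix before the first dot.
        outside = [table for table in tables if not table.startswith("reported.")]
        if outside:
            offenders[dataset] = outside
    return offenders


if __name__ == "__main__":
    sources = {
        "esg_certified": ["reported.kpi_scope2", "reported.kpi_scope3"],
        "esg_draft": ["reported.kpi_scope2", "curated.energy_consumption"],
    }
    print(non_reported_sources(sources))
```

An empty result is what you want to see for every data set inside the assurance boundary.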

1781
01:06:23,760 --> 01:06:27,240
Then you design the visuals that actually survive audit scrutiny.

1782
01:06:27,240 --> 01:06:29,160
Auditors don't care about your color palette.

1783
01:06:29,160 --> 01:06:30,920
They care about your ability to explain.

1784
01:06:30,920 --> 01:06:35,200
So the mandatory visuals are the ones that surface control-relevant context: targets versus

1785
01:06:35,200 --> 01:06:37,880
actuals, yes, but also confidence indicators.

1786
01:06:37,880 --> 01:06:39,760
Measured versus estimated split.

1787
01:06:39,760 --> 01:06:41,360
Coverage metrics alongside totals.

1788
01:06:41,360 --> 01:06:44,040
And explicit period labels tied to close status.

1789
01:06:44,040 --> 01:06:48,000
A Scope 3 total without a coverage indicator is not a KPI.

1790
01:06:48,000 --> 01:06:49,720
It's a mood.

1791
01:06:49,720 --> 01:06:52,040
You also enforce drill path discipline.

1792
01:06:52,040 --> 01:06:55,960
The drill path needs to be deterministic: group to region, to site, to source record

1793
01:06:55,960 --> 01:06:59,360
identifiers.

1794
01:06:59,360 --> 01:07:03,400
When a number gets challenged, the report must let you drill to the reported record grain,

1795
01:07:03,400 --> 01:07:07,280
then provide the keys that let an engineer trace lineage back through Purview: period,

1796
01:07:07,280 --> 01:07:10,480
org unit, load ID, and factor library version key.

1797
01:07:10,480 --> 01:07:16,080
If the drill stops at an aggregated chart, your report is a poster, not an audit artifact.

1798
01:07:16,080 --> 01:07:17,960
Now the part everyone ignores.

1799
01:07:17,960 --> 01:07:18,960
Export strategy.

1800
01:07:18,960 --> 01:07:21,120
At period close, you snapshot the outputs.

1801
01:07:21,120 --> 01:07:24,280
Not because Power BI is unreliable, but because people are.

1802
01:07:24,280 --> 01:07:28,560
Auditors often want exactly what was presented at close, and they want it reproducible even

1803
01:07:28,560 --> 01:07:31,840
if someone changes a report later for internal reasons.

1804
01:07:31,840 --> 01:07:37,640
So you export close packages: a PDF snapshot for human-readable continuity, plus a data extract

1805
01:07:37,640 --> 01:07:41,040
that matches the reported KPI tables for machine comparison.

1806
01:07:41,040 --> 01:07:45,680
Store both in the evidence vault with the close metadata: period, data set version,

1807
01:07:45,680 --> 01:07:47,280
report version, and approval reference.
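
A close package can carry a small manifest that binds the artifacts to the close metadata. The sketch below is one possible shape, not a prescribed format: the field names are assumptions, and SHA-256 hashes are added here as an illustrative way to detect later changes.

```python
import hashlib
import json


def build_close_manifest(
    period: str,
    dataset_version: str,
    report_version: str,
    approval_ref: str,
    artifacts: dict[str, bytes],
) -> str:
    """Return a JSON manifest with close metadata and a hash per artifact."""
    manifest = {
        "period": period,
        "dataset_version": dataset_version,
        "report_version": report_version,
        "approval_ref": approval_ref,
        # Hashing each file lets a later reader confirm nothing was swapped.
        "artifacts": {
            name: hashlib.sha256(content).hexdigest()
            for name, content in artifacts.items()
        },
    }
    return json.dumps(manifest, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(
        build_close_manifest(
            "2024-Q4", "ds-12", "rpt-7", "APR-0042",
            {"snapshot.pdf": b"%PDF-", "kpi_extract.csv": b"kpi,value\n"},
        )
    )
```

Stored next to the PDF and the extract, the manifest is what lets someone compare what was filed against what exists now.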

1808
01:07:47,280 --> 01:07:48,680
This is not redundant.

1809
01:07:48,680 --> 01:07:53,720
It is defense against "we changed the report layout" becoming "we can't reproduce what

1810
01:07:53,720 --> 01:07:54,720
we filed."

1811
01:07:54,720 --> 01:07:56,200
A practical warning.

1812
01:07:56,200 --> 01:08:01,000
The easiest way to reintroduce calculation drift is to allow "just one measure" to creep in.

1813
01:08:01,000 --> 01:08:04,840
Someone will say the reported tables don't include a ratio they want, or the business

1814
01:08:04,840 --> 01:08:09,280
wants a different intensity denominator in a visual, or they want to adjust a mapping

1815
01:08:09,280 --> 01:08:12,880
in the report because it's faster than waiting for the next pipeline run.

1816
01:08:12,880 --> 01:08:17,640
If you allow that, Power BI becomes the calculation engine again, slowly, one convenience

1817
01:08:17,640 --> 01:08:18,640
at a time.

1818
01:08:18,640 --> 01:08:22,880
So the rule stays harsh: the only math allowed in Power BI is presentation math.

1819
01:08:22,880 --> 01:08:26,920
Formatting, simple aggregations that don't change accounting semantics, and convenience measures

1820
01:08:26,920 --> 01:08:28,800
that do not become the source of truth.

1821
01:08:28,800 --> 01:08:32,200
Anything that changes the meaning of a KPI belongs in the governed calculation zone, gets

1822
01:08:32,200 --> 01:08:35,040
versioned, and gets published into the reported tables.

1823
01:08:35,040 --> 01:08:38,200
Because the only thing worse than having no ESG story is having two.

1824
01:08:38,200 --> 01:08:39,200
Optional components.

1825
01:08:39,200 --> 01:08:42,520
Sustainability Manager, ADF, Azure ML: where they fit.

1826
01:08:42,520 --> 01:08:43,680
Optional doesn't mean irrelevant.

1827
01:08:43,680 --> 01:08:47,880
It means the component is not part of the minimum control surface required to survive assurance.

1828
01:08:47,880 --> 01:08:51,880
You add it when it reduces audit risk or operational friction, not when it makes the demo

1829
01:08:51,880 --> 01:08:52,880
prettier.

1830
01:08:52,880 --> 01:08:55,040
Start with Microsoft Sustainability Manager.

1831
01:08:55,040 --> 01:08:59,160
Microsoft Sustainability Manager is useful when the organization needs structured workflows

1832
01:08:59,160 --> 01:09:03,600
and a sustainability-focused data model without inventing everything from scratch.

1833
01:09:03,600 --> 01:09:07,120
It positions itself around record, report, and reduce.

1834
01:09:07,120 --> 01:09:08,120
That's not marketing fluff.

1835
01:09:08,120 --> 01:09:09,320
It's a workflow boundary.

1836
01:09:09,320 --> 01:09:14,680
It can unify siloed data, run emissions calculations, and support reporting modules.

1837
01:09:14,680 --> 01:09:17,320
But the architectural question isn't, is it good?

1838
01:09:17,320 --> 01:09:20,000
The question is does it help you enforce intent?

1839
01:09:20,000 --> 01:09:23,760
Where it helps is governed data collection and auditability inside its domain.

1840
01:09:23,760 --> 01:09:25,520
The platform can track data changes.

1841
01:09:25,520 --> 01:09:28,840
It can enable auditing for sustainability tables in Dataverse.

1842
01:09:28,840 --> 01:09:33,840
It also has data trail report capabilities described as preview, producing traceability

1843
01:09:33,840 --> 01:09:36,960
across inputs, calculation models, logs, and outputs.

1844
01:09:36,960 --> 01:09:41,680
That can be valuable when your current state is uncontrolled spreadsheets and tribal knowledge.

1845
01:09:41,680 --> 01:09:44,880
Because it gives you a default control story you can actually show.

1846
01:09:44,880 --> 01:09:48,040
Where it doesn't help is when you treat it as the system of record for everything and

1847
01:09:48,040 --> 01:09:51,560
stop caring about reproducibility at the platform boundary.

1848
01:09:51,560 --> 01:09:55,720
If you already have mature emissions logic, strong factor governance, and a deterministic

1849
01:09:55,720 --> 01:09:59,520
lakehouse model, Sustainability Manager becomes optional.

1850
01:09:59,520 --> 01:10:03,040
You might still use it for workflow and data collection features, but you don't outsource

1851
01:10:03,040 --> 01:10:05,080
your assurance posture to an app.

1852
01:10:05,080 --> 01:10:08,080
And if you do adopt it, be honest about constraints.

1853
01:10:08,080 --> 01:10:11,120
Auditing configuration is blunt through the standard interface.

1854
01:10:11,120 --> 01:10:15,600
It's all or nothing for sustainability tables unless you use the Power Platform web API

1855
01:10:15,600 --> 01:10:17,000
for more granular control.

1856
01:10:17,000 --> 01:10:18,000
That's manageable.

1857
01:10:18,000 --> 01:10:21,940
It's also a reminder that audit ready still requires engineering.

1858
01:10:21,940 --> 01:10:23,880
Next, Azure Data Factory.

1859
01:10:23,880 --> 01:10:28,160
If you're all in on Fabric-native ingestion and your source landscape is simple, ADF is

1860
01:10:28,160 --> 01:10:29,160
optional.

1861
01:10:29,160 --> 01:10:30,520
Fabric can ingest.

1862
01:10:30,520 --> 01:10:31,920
Fabric can orchestrate.

1863
01:10:31,920 --> 01:10:33,680
And for many organizations, that's enough.

1864
01:10:33,680 --> 01:10:36,720
But ADF remains valuable when reality shows up.

1865
01:10:36,720 --> 01:10:42,520
Complex ERP extraction, IoT fan in, API rate limits, cross system dependencies, and multi-step

1866
01:10:42,520 --> 01:10:46,280
orchestration that spans networks and security boundaries.

1867
01:10:46,280 --> 01:10:50,240
ADF is the thing you use when you need the pipeline to behave like an integration system,

1868
01:10:50,240 --> 01:10:52,000
not like a notebook with optimism.

1869
01:10:52,000 --> 01:10:54,920
Just remember the immutability constraint you already accepted.

1870
01:10:54,920 --> 01:10:57,880
ADF will fail when you try to overwrite immutable paths.

1871
01:10:57,880 --> 01:11:01,240
You'll see errors like path immutable due to policy because the storage layer is doing

1872
01:11:01,240 --> 01:11:02,240
its job.

1873
01:11:02,240 --> 01:11:06,160
And for certain transformation patterns, ADF data flows can't write directly to immutable

1874
01:11:06,160 --> 01:11:08,680
containers because they rely on temporary files.

1875
01:11:08,680 --> 01:11:10,360
The pattern stays the same.

1876
01:11:10,360 --> 01:11:14,720
Write to a mutable staging destination, then copy finalized outputs into the immutable

1877
01:11:14,720 --> 01:11:15,720
evidence zone.
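
The stage-then-copy pattern can be illustrated on a local file system. This is only an analogy under stated assumptions: the `publish_write_once` helper is hypothetical, and exclusive-create mode stands in for a storage-level immutability policy, which it does not replace.

```python
import shutil
import tempfile
from pathlib import Path


def publish_write_once(staged: Path, evidence_dir: Path) -> Path:
    """Copy a finalized staged file into the evidence directory, never overwriting."""
    target = evidence_dir / staged.name
    # Mode "xb" raises FileExistsError if the target already exists,
    # mimicking a write-once destination.
    with staged.open("rb") as src, target.open("xb") as dst:
        shutil.copyfileobj(src, dst)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        staging = Path(root) / "staging"
        evidence = Path(root) / "evidence"
        staging.mkdir()
        evidence.mkdir()
        # Temporary and intermediate writes happen in staging, where they are allowed.
        staged_file = staging / "kpi_2024Q4.csv"
        staged_file.write_text("kpi,value\n")
        print(publish_write_once(staged_file, evidence))
```

A second publish of the same file name fails loudly, which is the behavior you want from an evidence zone.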

1878
01:11:15,720 --> 01:11:17,280
ADF isn't more enterprise.

1879
01:11:17,280 --> 01:11:18,760
It's more orchestration.

1880
01:11:18,760 --> 01:11:19,760
That's different.

1881
01:11:19,760 --> 01:11:23,680
Now, Azure Machine Learning. It's optional because it doesn't produce baseline numbers.

1882
01:11:23,680 --> 01:11:26,520
If it does, you're building a probabilistic accounting system.

1883
01:11:26,520 --> 01:11:28,800
You can't audit a model's intuition.

1884
01:11:28,800 --> 01:11:31,440
Use Azure ML for three things only.

1885
01:11:31,440 --> 01:11:35,160
Forecasting, anomaly detection, and scenario modeling.

1886
01:11:35,160 --> 01:11:36,160
Forecasting helps planning.

1887
01:11:36,160 --> 01:11:38,280
Anomaly detection helps data quality.

1888
01:11:38,280 --> 01:11:40,120
Scenario modeling helps reduction strategy.

1889
01:11:40,120 --> 01:11:42,280
None of those are the reported KPI baseline.

1890
01:11:42,280 --> 01:11:43,560
They are overlays.

1891
01:11:43,560 --> 01:11:47,720
And they must be labeled as overlays in the data model and in the reports.

1892
01:11:47,720 --> 01:11:53,800
Model outputs should carry model version, training data window, run timestamp, and clear classification

1893
01:11:53,800 --> 01:11:55,760
as estimated or forecast.

1894
01:11:55,760 --> 01:11:59,960
Otherwise, you'll inevitably promote a forecast to a fact because it looks clean on

1895
01:11:59,960 --> 01:12:00,960
a slide.
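
A sketch of what such an overlay record might look like, assuming hypothetical field names. The idea is that a model output cannot exist without its version, training window, run timestamp, and classification attached.

```python
from dataclasses import dataclass
from datetime import date, datetime, timezone
from enum import Enum


class Classification(Enum):
    ESTIMATED = "estimated"
    FORECAST = "forecast"


@dataclass(frozen=True)
class ModelOverlay:
    """A model output that travels with its provenance (field names assumed)."""

    kpi: str
    value: float
    model_version: str
    training_window: tuple[date, date]
    run_timestamp: datetime
    classification: Classification

    def display_label(self) -> str:
        # The label always carries the classification, so it cannot pass as a fact.
        return f"{self.kpi} ({self.classification.value}, model {self.model_version})"


if __name__ == "__main__":
    overlay = ModelOverlay(
        kpi="scope2_tco2e",
        value=1234.5,
        model_version="v3",
        training_window=(date(2022, 1, 1), date(2024, 12, 31)),
        run_timestamp=datetime(2025, 1, 15, 8, 0, tzinfo=timezone.utc),
        classification=Classification.FORECAST,
    )
    print(overlay.display_label())
```

Keeping overlays in their own type, separate from reported KPI rows, makes it harder to join them into the accounting path by accident.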

1896
01:12:00,960 --> 01:12:02,960
So the decision rule is harsh.

1897
01:12:02,960 --> 01:12:08,800
Add optional tooling only when it reduces audit risk, not when it reduces effort.

1898
01:12:08,800 --> 01:12:11,920
Sustainability Manager reduces chaos when you need structured collection and built-in

1899
01:12:11,920 --> 01:12:13,800
sustainability workflows.

1900
01:12:13,800 --> 01:12:17,880
ADF reduces fragility when integration complexity exceeds what Fabric orchestration can

1901
01:12:17,880 --> 01:12:19,080
realistically manage.

1902
01:12:19,080 --> 01:12:22,840
Azure ML adds intelligence, but only if you keep it out of the accounting path.

1903
01:12:22,840 --> 01:12:25,040
Optional components don't replace the fundamentals.

1904
01:12:25,040 --> 01:12:29,720
They either reinforce them or they accelerate your failure in a more expensive way.

1905
01:12:29,720 --> 01:12:32,760
Next, a short comparison, because every stack can calculate emissions.

1906
01:12:32,760 --> 01:12:35,160
Very few can prove them end to end.

1907
01:12:35,160 --> 01:12:36,840
The short comparison.

1908
01:12:36,840 --> 01:12:39,560
Microsoft versus Snowflake, Databricks, GCP.

1909
01:12:39,560 --> 01:12:43,800
At this point, someone always asks the same question usually with a budget spreadsheet open.

1910
01:12:43,800 --> 01:12:44,800
Why Microsoft?

1911
01:12:44,800 --> 01:12:45,800
Why not Snowflake?

1912
01:12:45,800 --> 01:12:46,800
Why not Databricks?

1913
01:12:46,800 --> 01:12:48,160
Why not just do this on GCP?

1914
01:12:48,160 --> 01:12:51,280
And the answer is not that those stacks can't calculate emissions.

1915
01:12:51,280 --> 01:12:53,280
They can.

1916
01:12:53,280 --> 01:12:58,120
Any competent data platform can ingest activity data, join it to factor tables and output

1917
01:12:58,120 --> 01:12:59,840
a number labeled Scope 2.

1918
01:12:59,840 --> 01:13:00,920
That part is not rare.

1919
01:13:00,920 --> 01:13:02,240
It's table stakes.

1920
01:13:02,240 --> 01:13:05,120
The problem is that assurance doesn't reward computation.

1921
01:13:05,120 --> 01:13:06,560
Assurance rewards proof.

1922
01:13:06,560 --> 01:13:09,160
So the comparison only matters on three axes.

1923
01:13:09,160 --> 01:13:13,000
Identity and access control, lineage and governance, and audit evidence as a first-class

1924
01:13:13,000 --> 01:13:14,000
output.

1925
01:13:14,000 --> 01:13:15,000
Everything else is noise.

1926
01:13:15,000 --> 01:13:16,560
Start with identity and access.

1927
01:13:16,560 --> 01:13:18,520
Microsoft's advantage is not that Entra exists.

1928
01:13:18,520 --> 01:13:19,520
Every cloud has IAM.

1929
01:13:19,520 --> 01:13:23,720
The advantage is that the identity plane is already the enterprise default for most organizations

1930
01:13:23,720 --> 01:13:28,600
running Microsoft 365, Azure, and Power Platform, and it reaches into the services you're

1931
01:13:28,600 --> 01:13:33,080
using for ESG, storage, compute, BI and workflow.

1932
01:13:33,080 --> 01:13:35,480
That matters because role separation isn't a concept.

1933
01:13:35,480 --> 01:13:38,400
It's a continuous enforcement problem across the entire stack.

1934
01:13:38,400 --> 01:13:42,960
In Microsoft land, you can actually make submitter, validator, calculator, approver, publisher

1935
01:13:42,960 --> 01:13:47,760
map to real groups and real permissions that propagate into the services doing the work.
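To make the role separation point concrete, here is a minimal sketch of a separation-of-duties check over group membership. The five role names come from the episode; the conflicting pairs, the user names, and the `sod_violations` helper are illustrative assumptions, and the membership dictionary stands in for whatever your directory export looks like. It does not call any Entra API.

```python
# Assumed policy: pairs of roles that one identity must not hold together.
CONFLICTS = {
    frozenset({"submitter", "approver"}),
    frozenset({"calculator", "approver"}),
    frozenset({"approver", "publisher"}),
}


def sod_violations(membership: dict[str, set[str]]) -> list[tuple[str, str, str]]:
    """Return (user, role_a, role_b) for every user holding a conflicting pair."""
    violations = []
    for role_a, role_b in sorted(tuple(sorted(pair)) for pair in CONFLICTS):
        both = membership.get(role_a, set()) & membership.get(role_b, set())
        for user in sorted(both):
            violations.append((user, role_a, role_b))
    return violations


# Hypothetical group membership, keyed by role.
membership = {
    "submitter": {"alice", "bob"},
    "validator": {"carol"},
    "calculator": {"svc-calc"},
    "approver": {"dave", "bob"},
    "publisher": {"erin"},
}

violations = sod_violations(membership)
for user, role_a, role_b in violations:
    print(f"{user} holds both {role_a} and {role_b}")
```

Run on a schedule, a check like this turns "temporary access became permanent" into a finding instead of a surprise.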

1936
01:13:47,760 --> 01:13:53,120
In many non-Microsoft stacks, identity becomes an assembly task, not impossible, just assembled.

1937
01:13:53,120 --> 01:13:57,080
And assembled identity is where temporary access becomes permanent, where service accounts

1938
01:13:57,080 --> 01:14:02,000
become shared and where your separation of duties quietly turns into conditional chaos.

1939
01:14:02,000 --> 01:14:03,480
The platform didn't fail you.

1940
01:14:03,480 --> 01:14:04,760
Your architecture did.

1941
01:14:04,760 --> 01:14:06,840
But the platform determines how hard it is to fail.

1942
01:14:06,840 --> 01:14:08,520
Now lineage and governance.

1943
01:14:08,520 --> 01:14:13,640
This is the axis where most ESG teams discover the difference between we have data and we can

1944
01:14:13,640 --> 01:14:15,240
explain data.

1945
01:14:15,240 --> 01:14:16,560
Microsoft Purview is not magic.

1946
01:14:16,560 --> 01:14:19,520
It's just a governance plane that is designed to be a governance plane.

1947
01:14:19,520 --> 01:14:23,800
You register assets, you scan, you capture lineage, you assign owners, you query metadata,

1948
01:14:23,800 --> 01:14:28,000
you walk into an audit room with something more defensible than a diagram in confluence.

1949
01:14:28,000 --> 01:14:32,520
And because Purview integrates into common Microsoft data services, you can build lineage

1950
01:14:32,520 --> 01:14:37,520
that spans ingestion artifacts, lakehouse or warehouse objects, and Power BI consumption

1951
01:14:37,520 --> 01:14:40,160
in a way that is operationally achievable.

1952
01:14:40,160 --> 01:14:43,560
In other ecosystems, governance is usually a product stack you bolt on.

1953
01:14:43,560 --> 01:14:46,640
Databricks has Unity Catalog and lineage capabilities in its ecosystem.

1954
01:14:46,640 --> 01:14:48,640
Snowflake has governance features and partners.

1955
01:14:48,640 --> 01:14:51,360
GCP has Data Catalog and governance tooling.

1956
01:14:51,360 --> 01:14:52,360
All of that can work.

1957
01:14:52,360 --> 01:14:56,440
But the single pane of glass story becomes single pane of glass after integration work,

1958
01:14:56,440 --> 01:15:00,480
which means it competes with everything else for time, budget and political attention.

1959
01:15:00,480 --> 01:15:04,280
Over time, governance loses those fights, not because people are lazy, but because they get

1960
01:15:04,280 --> 01:15:06,560
measured on delivery, not survivability.

1961
01:15:06,560 --> 01:15:10,480
So Microsoft's practical advantage is not perfection, it's friction.

1962
01:15:10,480 --> 01:15:14,400
Less friction to do governance well means it's more likely to happen and more likely to

1963
01:15:14,400 --> 01:15:16,320
stay current when the system evolves.

1964
01:15:16,320 --> 01:15:17,880
That is what auditors actually experience.

1965
01:15:17,880 --> 01:15:22,040
Now the third axis: audit evidence. This is the point that makes most platform comparisons

1966
01:15:22,040 --> 01:15:23,040
meaningless.

1967
01:15:23,040 --> 01:15:28,840
Audit-grade ESG is evidence management: immutable raw inputs, versioned factors, versioned logic,

1968
01:15:28,840 --> 01:15:30,440
period close configuration.

1969
01:15:30,440 --> 01:15:35,760
All adjustments, snapshots, access logs, approval trails, reproducible reruns.

1970
01:15:35,760 --> 01:15:36,760
That's the system.

1971
01:15:36,760 --> 01:15:38,760
The report is just an output.

1972
01:15:38,760 --> 01:15:42,160
Microsoft doesn't automatically give you this either, but the architecture aligns with

1973
01:15:42,160 --> 01:15:45,920
it because the components map cleanly to the evidence life cycle.

1974
01:15:45,920 --> 01:15:48,480
Entra gives you the enforcement surface for role separation.

1975
01:15:48,480 --> 01:15:52,520
ADLS Gen2 with immutability gives you the evidence vault behavior.

1976
01:15:52,520 --> 01:15:56,360
Fabric or Synapse gives you the governed compute surface, where you can implement deterministic

1977
01:15:56,360 --> 01:15:57,960
calculation artifacts.

1978
01:15:57,960 --> 01:16:01,240
Purview gives you lineage as queryable metadata instead of oral tradition.

1979
01:16:01,240 --> 01:16:06,760
Power BI gives you presentation without forcing you to mix computation and visualization.

1980
01:16:06,760 --> 01:16:10,960
That collection forms an integrated control plane story, not a vendor story, a control

1981
01:16:10,960 --> 01:16:14,600
story. In other stacks, you can absolutely build the same control story.

1982
01:16:14,600 --> 01:16:18,480
But you build it. You assemble the identity controls across tools, you assemble lineage

1983
01:16:18,480 --> 01:16:23,120
across transformation engines and BI, you assemble immutability patterns across storage

1984
01:16:23,120 --> 01:16:27,920
and pipeline behavior, you assemble evidence packs as a discipline, not a platform feature.

1985
01:16:27,920 --> 01:16:32,200
And every assembly point becomes a place where policy erodes because policy always erodes

1986
01:16:32,200 --> 01:16:34,680
when intent isn't enforced by design.

1987
01:16:34,680 --> 01:16:36,400
That's the uncomfortable truth.

1988
01:16:36,400 --> 01:16:39,440
Architecture is what remains after your governance committee stops meeting.

1989
01:16:39,440 --> 01:16:44,120
Now, to be fair, there are reasons teams choose Snowflake, Databricks, or GCP for ESG.

1990
01:16:44,120 --> 01:16:46,880
They might already run their entire analytics estate there.

1991
01:16:46,880 --> 01:16:49,400
They might have stronger internal skills on that stack.

1992
01:16:49,400 --> 01:16:53,000
They might have vendor constraints or data gravity that makes Microsoft the wrong place

1993
01:16:53,000 --> 01:16:54,000
to compute.

1994
01:16:54,000 --> 01:16:57,720
None of that is invalid, but if they choose those platforms, they still have to answer

1995
01:16:57,720 --> 01:17:02,320
the same assurance questions, and they still have to build the same non-negotiables: immutability,

1996
01:17:02,320 --> 01:17:04,960
reproducibility, lineage, separation of duties.

1997
01:17:04,960 --> 01:17:07,160
The stack changes, the physics don't.

1998
01:17:07,160 --> 01:17:10,200
So the short verdict is this: all stacks can calculate emissions.

1999
01:17:10,200 --> 01:17:14,480
Very few stacks can prove them end-to-end without deliberate architecture that prioritizes

2000
01:17:14,480 --> 01:17:16,320
evidence over convenience.

2001
01:17:16,320 --> 01:17:20,120
Microsoft's advantage is that it gives you a coherent set of primitives that align

2002
01:17:20,120 --> 01:17:24,560
with audit survivability, especially in organizations already living inside Entra,

2003
01:17:24,560 --> 01:17:26,920
Microsoft 365, and Azure.

2004
01:17:26,920 --> 01:17:29,000
And this episode was never about vendor fandom.

2005
01:17:29,000 --> 01:17:32,920
It was about building something that survives contact with assurance, which is why the next

2006
01:17:32,920 --> 01:17:35,240
section matters more than the comparison.

2007
01:17:35,240 --> 01:17:40,200
The minimal viable auditable ESG architecture is the part you can actually replicate.

2008
01:17:40,200 --> 01:17:43,800
Minimal viable auditable ESG architecture, the replicable blueprint.

2009
01:17:43,800 --> 01:17:47,600
Here's the part people pretend they want until it forces decisions.

2010
01:17:47,600 --> 01:17:52,320
A minimal viable auditable ESG architecture isn't minimal because it's cheap or quick.

2011
01:17:52,320 --> 01:17:56,660
It's minimal because it contains the smallest set of components and artifacts that can

2012
01:17:56,660 --> 01:18:00,500
survive assurance without turning your team into full-time historians.

2013
01:18:00,500 --> 01:18:04,340
So define the invariants clearly: boundaries, components, and produced evidence.

2014
01:18:04,340 --> 01:18:09,220
The boundaries are four zones: raw, curated, reported, and an evidence vault.

2015
01:18:09,220 --> 01:18:13,380
Not because medallion architecture is fashionable, but because it's the cleanest way to separate

2016
01:18:13,380 --> 01:18:16,540
what happened from what you did to it from what you claim.

2017
01:18:16,540 --> 01:18:18,140
The components are five.

2018
01:18:18,140 --> 01:18:24,060
Entra ID, ADLS Gen2 with immutability, Fabric or Synapse for governed compute, Purview

2019
01:18:24,060 --> 01:18:28,020
for lineage, and Power BI as a thin presentation layer.

2020
01:18:28,020 --> 01:18:32,060
Everything else is optional, and optional means it's allowed to be absent without collapsing

2021
01:18:32,060 --> 01:18:33,860
audit survivability.

2022
01:18:33,860 --> 01:18:36,220
The artifacts are what make it auditable.

2023
01:18:36,220 --> 01:18:40,900
Load IDs and ingestion logs, immutable raw objects, governed calculation artifacts with

2024
01:18:40,900 --> 01:18:47,260
version control, factor library versions with approval records, period close configuration,

2025
01:18:47,260 --> 01:18:51,020
reported KPI tables, and a close package snapshot.

2026
01:18:51,020 --> 01:18:54,180
If you don't produce those artifacts, you didn't build an auditable system.

2027
01:18:54,180 --> 01:18:55,180
You built a dashboard.

2028
01:18:55,180 --> 01:19:00,380
Now walk through one KPI end-to-end because architecture without a trace is just a diagram.

2029
01:19:00,380 --> 01:19:01,940
Pick Scope 2 emissions.

2030
01:19:01,940 --> 01:19:06,620
It's common enough, and it's where drift and factor ambiguity show up fast.

2031
01:19:06,620 --> 01:19:12,260
The source is activity data, electricity consumption by site and period from invoices or meters.

2032
01:19:12,260 --> 01:19:15,540
Ingestion lands it in the raw zone as append only objects.

2033
01:19:15,540 --> 01:19:19,820
Each load gets a load ID, timestamp, source identifier, and submitter identity.

2034
01:19:19,820 --> 01:19:24,620
If the ingestion came from a human submission, it still lands as a new versioned object.

2035
01:19:24,620 --> 01:19:26,660
No overwrite, ever.
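The landing rule above can be sketched roughly like this. `RawZone` is an in-memory stand-in invented for illustration, not an ADLS client, and the path layout and field names are assumptions. The point is only the behavior: every submission becomes a new object with its own load ID, and writing to an existing path is refused.

```python
import hashlib
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LoadRecord:
    load_id: str
    landed_at: str
    source_id: str
    submitter: str
    sha256: str
    path: str


class RawZone:
    """In-memory stand-in for an append-only raw zone (hypothetical, not ADLS)."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}
        self.ingestion_log: list[LoadRecord] = []

    def _write_once(self, path: str, payload: bytes) -> None:
        # Append-only: an existing object is never replaced.
        if path in self._objects:
            raise FileExistsError(f"refusing to overwrite {path}")
        self._objects[path] = payload

    def land(self, payload: bytes, source_id: str, submitter: str) -> LoadRecord:
        load_id = uuid.uuid4().hex
        path = f"raw/{source_id}/{load_id}.csv"  # assumed path convention
        self._write_once(path, payload)
        record = LoadRecord(
            load_id=load_id,
            landed_at=datetime.now(timezone.utc).isoformat(),
            source_id=source_id,
            submitter=submitter,
            sha256=hashlib.sha256(payload).hexdigest(),
            path=path,
        )
        self.ingestion_log.append(record)
        return record


zone = RawZone()
payload = b"site,period,kwh\nPLANT-01,2024-01,1000\n"
first = zone.land(payload, "utility-invoices", "alice@example.com")
# A corrected resubmission lands as a second object; the first one stays.
second = zone.land(payload.replace(b"1000", b"1100"), "utility-invoices", "alice@example.com")
```

The content hash in the log is there so a later reviewer can confirm that the object in storage is the object that was submitted.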

2036
01:19:26,660 --> 01:19:29,500
Validation runs before anything becomes curated.

2037
01:19:29,500 --> 01:19:34,380
Schema checks, unit checks, required dimensions like site, period, and measurement type.

2038
01:19:34,380 --> 01:19:37,500
The validation output is not a log line in the pipeline run.

2039
01:19:37,500 --> 01:19:39,700
It's an artifact you can retrieve later.

2040
01:19:39,700 --> 01:19:43,980
Pass, fail, warnings, and what was corrected or normalized.
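As a rough illustration of validation output as a retrievable artifact, the sketch below checks required fields and units and returns a structured result instead of printing log lines. The field names, the `TO_KWH` table, and the artifact shape are assumptions made for the example.

```python
import json

REQUIRED = ("site", "period", "measurement_type", "value", "unit")
TO_KWH = {"kWh": 1.0, "MWh": 1000.0}  # assumed unit table for electricity


def validate_load(load_id: str, rows: list[dict]) -> dict:
    """Return a validation artifact: status, errors, warnings, normalizations."""
    errors, warnings, normalized = [], [], []
    for i, row in enumerate(rows):
        missing = [f for f in REQUIRED if row.get(f) in (None, "")]
        if missing:
            errors.append({"row": i, "issue": "missing fields", "fields": missing})
            continue
        if row["unit"] not in TO_KWH:
            errors.append({"row": i, "issue": "unknown unit", "unit": row["unit"]})
            continue
        if row["unit"] != "kWh":
            # Record what was normalized so the change is visible later.
            normalized.append({"row": i, "from_unit": row["unit"], "to_unit": "kWh",
                               "factor": TO_KWH[row["unit"]]})
        if float(row["value"]) == 0:
            warnings.append({"row": i, "issue": "zero consumption"})
    return {
        "load_id": load_id,
        "status": "fail" if errors else "pass",
        "errors": errors,
        "warnings": warnings,
        "normalized": normalized,
    }


rows = [
    {"site": "PLANT-01", "period": "2024-01", "measurement_type": "electricity",
     "value": 1.2, "unit": "MWh"},
    {"site": "", "period": "2024-01", "measurement_type": "electricity",
     "value": 900, "unit": "kWh"},
]
artifact = validate_load("load-0001", rows)
artifact_json = json.dumps(artifact, indent=2)  # store this next to the raw object
```

Persisting `artifact_json` alongside the load is what lets you answer, months later, which rows failed and what was normalized.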

2041
01:19:43,980 --> 01:19:46,060
Then curated, you standardize the shape.

2042
01:19:46,060 --> 01:19:47,460
Sites map to org units.

2043
01:19:47,460 --> 01:19:51,220
Units normalize. Missing dimensions get flagged, not silently filled.

2044
01:19:51,220 --> 01:19:54,220
This is also where you enforce controlled vocab.

2045
01:19:54,220 --> 01:19:57,380
Electricity type, supply identifiers, region codes.

2046
01:19:57,380 --> 01:20:02,340
The curated tables carry quality flags forward because clean data that hides uncertainty is

2047
01:20:02,340 --> 01:20:03,500
a liability.
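A small sketch of the curated step, assuming a hypothetical site-to-org-unit mapping and a made-up controlled vocabulary for electricity type. What matters is that gaps produce flags on the row rather than defaults in the data.

```python
# Hypothetical reference data; real mappings would come from governed tables.
SITE_TO_ORG_UNIT = {"PLANT-01": "EU-Manufacturing"}
ALLOWED_ELECTRICITY_TYPES = {"grid", "renewable_ppa"}


def curate(row: dict) -> dict:
    """Standardize one row and carry quality flags forward instead of filling gaps."""
    flags = []
    org_unit = SITE_TO_ORG_UNIT.get(row.get("site"))
    if org_unit is None:
        flags.append("unmapped_site")
    if row.get("electricity_type") not in ALLOWED_ELECTRICITY_TYPES:
        flags.append("electricity_type_not_in_vocab")
    if row.get("estimated"):
        flags.append("estimated")
    return {**row, "org_unit": org_unit, "quality_flags": flags}


clean = curate({"site": "PLANT-01", "electricity_type": "grid", "kwh": 1000.0})
flagged = curate({"site": "PLANT-99", "electricity_type": "Grid ", "kwh": 500.0,
                  "estimated": True})
```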

2048
01:20:03,500 --> 01:20:06,780
Then the calculation zone: Fabric lakehouse or Synapse.

2049
01:20:06,780 --> 01:20:11,340
This is where Scope 2 emissions get computed using versioned logic, not DAX.

2050
01:20:11,340 --> 01:20:13,820
Not the report, a governed artifact.

2051
01:20:13,820 --> 01:20:16,340
The computation binds to two things explicitly.

2052
01:20:16,340 --> 01:20:19,940
The activity load IDs and the factor library version key.

2053
01:20:19,940 --> 01:20:23,180
If the job doesn't have a factor version key, it fails.

2054
01:20:23,180 --> 01:20:26,780
If multiple factors match due to sloppy mappings, it fails.

2055
01:20:26,780 --> 01:20:28,140
Deterministic selection or no selection.
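Here is a minimal sketch of that fail-closed factor selection. It uses a simplified location-based calculation (activity in kWh times a grid factor); the factor values, the version key format, and the function names are invented for illustration and are not real emission factors.

```python
class FactorSelectionError(Exception):
    """Raised when factor selection is missing a version key or is ambiguous."""


def select_factor(factor_library, version_key, region, year):
    if not version_key:
        raise FactorSelectionError("no factor library version key supplied")
    matches = [
        f for f in factor_library
        if f["version_key"] == version_key and f["region"] == region and f["year"] == year
    ]
    if len(matches) != 1:
        # Zero matches and multiple matches are both failures, never a guess.
        raise FactorSelectionError(f"expected exactly 1 factor, found {len(matches)}")
    return matches[0]


def scope2_location_based(activity_rows, factor_library, version_key):
    results = []
    for row in activity_rows:
        factor = select_factor(factor_library, version_key, row["region"], row["year"])
        results.append({
            "site": row["site"],
            "load_id": row["load_id"],            # binds the output to its input load
            "factor_version_key": version_key,    # binds the output to the factor version
            "kg_co2e": row["kwh"] * factor["kg_co2e_per_kwh"],
        })
    return results


# Illustrative numbers only.
library = [{"version_key": "FL-2024.1", "region": "DE", "year": 2024,
            "kg_co2e_per_kwh": 0.5}]
activity = [{"site": "PLANT-01", "load_id": "load-0001", "region": "DE",
             "year": 2024, "kwh": 1000.0}]

results = scope2_location_based(activity, library, "FL-2024.1")

# A duplicate mapping makes selection ambiguous, so the job fails.
ambiguous_library = library + [dict(library[0])]
try:
    scope2_location_based(activity, ambiguous_library, "FL-2024.1")
    ambiguous_rejected = False
except FactorSelectionError:
    ambiguous_rejected = True
```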

2056
01:20:28,140 --> 01:20:29,460
Now period close.

2057
01:20:29,460 --> 01:20:31,540
Period close is not a calendar event.

2058
01:20:31,540 --> 01:20:34,860
It's a state change that freezes your ability to rewrite the past.

2059
01:20:34,860 --> 01:20:38,580
You freeze the inputs by selecting the accepted load IDs for the period.

2060
01:20:38,580 --> 01:20:43,220
You freeze the factors by binding the approved factor library version IDs to that period.

2061
01:20:43,220 --> 01:20:47,260
You freeze the logic by referencing the released calculation artifact version.

2062
01:20:47,260 --> 01:20:51,460
Then you publish the reported outputs: the KPI tables for that period, with keys for period,

2063
01:20:51,460 --> 01:20:55,740
org unit, method, measured-versus-estimated flags, and the factor version key used.
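One way to picture the close configuration is as a frozen record with a fingerprint. This is a sketch under assumptions: the `CloseConfig` fields and the version strings are hypothetical, and a real close would also persist this record into the evidence vault.

```python
import hashlib
import json
from dataclasses import FrozenInstanceError, asdict, dataclass


@dataclass(frozen=True)
class CloseConfig:
    period: str
    accepted_load_ids: tuple[str, ...]
    factor_version_key: str
    calc_artifact_version: str

    def fingerprint(self) -> str:
        """Stable hash of the configuration, suitable for stamping on outputs."""
        canonical = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


close = CloseConfig(
    period="2024-Q1",
    accepted_load_ids=("load-0001", "load-0002"),
    factor_version_key="FL-2024.1",
    calc_artifact_version="scope2-calc@1.2.0",
)

# After close, the record cannot be edited in place.
try:
    close.factor_version_key = "FL-2024.2"
    mutated = True
except FrozenInstanceError:
    mutated = False

# A restatement is a new configuration with a different fingerprint.
restated = CloseConfig(close.period, close.accepted_load_ids,
                       "FL-2024.2", close.calc_artifact_version)
```

Stamping `close.fingerprint()` on every reported row gives a rerun something exact to reproduce against.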

2064
01:20:55,740 --> 01:21:00,780
Then you lock. ADLS immutability applies to the raw evidence and to the published close

2065
01:21:00,780 --> 01:21:02,140
artifacts.

2066
01:21:02,140 --> 01:21:06,220
Factor library snapshot, close configuration, and reported outputs as needed for your

2067
01:21:06,220 --> 01:21:07,540
evidence strategy.

2068
01:21:07,540 --> 01:21:10,220
You don't have to make every table immutable forever.

2069
01:21:10,220 --> 01:21:12,380
You do have to make the close package immutable.

2070
01:21:12,380 --> 01:21:13,380
That's the point.
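Storage immutability is configured on the platform, not in code, so the sketch below shows a complementary piece instead: a hash manifest for the close package that lets anyone verify later that nothing changed. The artifact names and contents are placeholders.

```python
import hashlib


def build_manifest(artifacts: dict[str, bytes]) -> dict[str, str]:
    """Map each artifact name to the SHA-256 of its content."""
    return {name: hashlib.sha256(data).hexdigest()
            for name, data in sorted(artifacts.items())}


def verify(artifacts: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return the names in the manifest whose content is missing or different."""
    current = build_manifest(artifacts)
    return sorted(name for name in manifest if current.get(name) != manifest[name])


# Placeholder close package contents.
close_package = {
    "factor_library_snapshot.json": b'{"version_key": "FL-2024.1"}',
    "close_config.json": b'{"period": "2024-Q1"}',
    "kpi_scope2.csv": b"period,org_unit,kg_co2e\n2024-Q1,EU-Manufacturing,500.0\n",
}
manifest = build_manifest(close_package)

tampered = dict(close_package)
tampered["kpi_scope2.csv"] = b"period,org_unit,kg_co2e\n2024-Q1,EU-Manufacturing,450.0\n"

unchanged_diff = verify(close_package, manifest)
tampered_diff = verify(tampered, manifest)
```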

2071
01:21:13,380 --> 01:21:14,380
Then Power BI.

2072
01:21:14,380 --> 01:21:16,260
Power BI reads reported tables only.

2073
01:21:16,260 --> 01:21:17,540
The dataset is certified.

2074
01:21:17,540 --> 01:21:18,860
The refresh is controlled.

2075
01:21:18,860 --> 01:21:23,500
The visuals include the confidence context, measured versus estimated, coverage indicators,

2076
01:21:23,500 --> 01:21:26,180
and the drill path down to record identifiers.

2077
01:21:26,180 --> 01:21:29,300
When someone challenges the KPI, you don't debate, you drill.

2078
01:21:29,300 --> 01:21:30,980
And Purview stitches the whole path.

2079
01:21:30,980 --> 01:21:33,940
Power BI report to dataset, dataset to reported tables.

2080
01:21:33,940 --> 01:21:36,100
Reported tables to calculation artifacts.

2081
01:21:36,100 --> 01:21:39,900
Calculation artifacts to curated inputs, curated inputs to raw loads, raw loads to sources

2082
01:21:39,900 --> 01:21:41,220
and submission identities.
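The path just described is a graph walk. The sketch below models it with a plain dictionary of upstream edges; the asset names are invented and this is not the Purview API, only the shape of the question you want lineage to answer.

```python
# Hypothetical lineage edges: each asset points to what it was built from.
UPSTREAM = {
    "powerbi:report/scope2": ["powerbi:dataset/esg"],
    "powerbi:dataset/esg": ["reported.kpi_scope2"],
    "reported.kpi_scope2": ["calc:scope2-calc@1.2.0"],
    "calc:scope2-calc@1.2.0": ["curated.electricity"],
    "curated.electricity": ["raw:load-0001"],
    "raw:load-0001": ["source:utility-invoices"],
}


def trace(asset: str, upstream: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk from an asset back to its sources, visiting each node once."""
    path, seen, frontier = [], set(), [asset]
    while frontier:
        node = frontier.pop(0)
        if node in seen:
            continue
        seen.add(node)
        path.append(node)
        frontier.extend(upstream.get(node, []))
    return path


lineage = trace("powerbi:report/scope2", UPSTREAM)
for hop in lineage:
    print(hop)
```

If a hop in that list cannot be produced from metadata, that hop is the part of the answer still living in someone's head.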

2083
01:21:41,220 --> 01:21:43,220
That lineage is not for beauty. It's for

2084
01:21:43,220 --> 01:21:46,980
the day someone asks, "prove it," while your calendar is already on fire.

2085
01:21:46,980 --> 01:21:48,460
Finally, sequencing.

2086
01:21:48,460 --> 01:21:51,860
Because this is where most teams implode by trying to boil the ocean.

2087
01:21:51,860 --> 01:21:54,540
Week one, pick one KPI and one data source.

2088
01:21:54,540 --> 01:21:57,860
Build ingestion into raw with load IDs and validation artifacts.

2089
01:21:57,860 --> 01:22:04,180
Week two, build the curated model for that KPI, including quality flags and controlled dimensions.

2090
01:22:04,180 --> 01:22:08,740
Week three, implement the governed calculation zone with factor version binding and a reported

2091
01:22:08,740 --> 01:22:09,740
output table.

2092
01:22:09,740 --> 01:22:14,620
Week four, register assets in Purview and build a thin Power BI report that drills to record

2093
01:22:14,620 --> 01:22:16,660
IDs and shows factor version keys.

2094
01:22:16,660 --> 01:22:17,860
That's done.

2095
01:22:17,860 --> 01:22:20,340
Not perfect, done in the only sense that matters.

2096
01:22:20,340 --> 01:22:24,660
The auditor's questions are answerable from systems, not from meetings.

2097
01:22:24,660 --> 01:22:29,660
Auditable ESG in Microsoft isn't about dashboards, it's about immutable data, versioned calculations

2098
01:22:29,660 --> 01:22:32,740
and lineage you can explain to an auditor without PowerPoint.

2099
01:22:32,740 --> 01:22:37,380
If you want the next layer, the ESG data model itself, raw versus curated versus reported

2100
01:22:37,380 --> 01:22:40,660
and how to enforce period close, watch the next episode and subscribe.