Your Microsoft 365 automation environment is probably running on borrowed identity. In this episode of the M365FM Podcast, we expose one of the biggest hidden risks inside modern cloud architecture: enterprise workflows tethered to personal user accounts. It starts innocently enough. An engineer builds a Power Automate flow, connects a Logic App, configures a Power BI refresh, or deploys a SharePoint integration using their own credentials because it is fast and convenient. But the moment that person changes roles, resets a password, triggers Conditional Access, loses MFA access, or leaves the company entirely, the entire automation chain collapses. This is identity rot. Organizations across the world are unknowingly building mission-critical infrastructure on top of human dependencies instead of infrastructure identities. The result is brittle automation, failed workflows, silent outages, security gaps, and operational chaos that often goes unnoticed until production systems fail. As Microsoft moves toward the 2026 identity model, the era of service-principal-less automation is ending. Legacy authentication patterns are being deprecated, old Azure AD Graph integrations are disappearing, and modern workloads are being forced toward identity-first architecture. This episode breaks down why Service Principals, Managed Identities, Federated Credentials, and Zero-Secret authentication are no longer optional modernization projects. They are now foundational requirements for operational survival. If your automation breaks when an employee resigns, your architecture is already unstable.
THE SHADOW ACCOUNT TRAP
Most identity problems begin with convenience. An engineer connects a workflow using their own Microsoft 365 account because the permissions already exist and the deployment is faster. The automation works immediately, the project launches successfully, and nobody realizes they just embedded a hidden human dependency into critical infrastructure. Until the password changes. Until Conditional Access blocks the sign-in. Until MFA expires. Until the employee leaves the company. This episode explores why modern enterprises are trapped in what we call the Shadow Account Model:
- Personal accounts acting as infrastructure identities
- MFA incompatibility with headless automation
- Authentication rot across Power Automate and Logic Apps
- Offboarding failures causing workflow collapse
- Service accounts operating as unsecured ghost users
WHY MICROSOFT IS FORCING THE SHIFT
Microsoft has officially recognized the structural flaw of user-based automation. As we move toward 2026:
- Legacy SharePoint 2013 workflows are being retired
- Azure AD Graph is being deprecated
- Service-principal-less authentication is disappearing
- App-only modern authentication is becoming mandatory
Automation must have its own identity. This episode explains why organizations are no longer fighting technical debt alone. They are now fighting the direction of the platform itself.
The old model asked: “Which person is running this automation?”
The new model asks: “Which workload is authorized to perform this action?”
That architectural shift changes everything.
IDENTITY AS INFRASTRUCTURE
Modern identity is no longer a human construct. It is infrastructure. In this episode, we explore how Service Principals function as non-interactive runtime identities that represent workloads instead of employees. We break down:
- The Decoupling Principle in enterprise security
- Why workloads need independent identity boundaries
- The shift from human-centric to resource-centric authorization
- Why identity must become a deployment artifact
- How infrastructure-native authentication improves resilience
MANAGED IDENTITIES AND ZERO-SECRET AUTHENTICATION
The strongest credential is the one nobody ever handles. Managed Identities fundamentally change how enterprise authentication works because Azure manages the entire lifecycle automatically:
- Credential generation
- Rotation
- Storage
- Expiration
- Trust enforcement
This episode explains:
- Why Managed Identities eliminate secret sprawl
- How Zero-Secret authentication reduces breach risk
- Why workload-bound identity changes operational security
- How Azure ties identity directly to resource lifecycle
- The security benefits of infrastructure-native trust
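As a rough illustration of zero-secret authentication, the sketch below shows how code running on an Azure virtual machine might request a token from the Azure Instance Metadata Service (IMDS) without holding any credential. The function names are hypothetical. The example assumes a VM host (App Service, Functions, and Logic Apps expose a different local endpoint), and production code would more likely use an Azure SDK credential class than raw HTTP.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Link-local IMDS endpoint, reachable only from inside an Azure VM.
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_imds_request(resource: str,
                       client_id: Optional[str] = None) -> urllib.request.Request:
    """Build the token request. Note that no secret or certificate is involved."""
    params = {"api-version": "2018-02-01", "resource": resource}
    if client_id:
        # Only needed to select a specific user-assigned identity.
        params["client_id"] = client_id
    url = f"{IMDS_TOKEN_URL}?{urllib.parse.urlencode(params)}"
    # IMDS rejects requests that lack the Metadata header.
    return urllib.request.Request(url, headers={"Metadata": "true"})


def get_access_token(resource: str = "https://graph.microsoft.com",
                     client_id: Optional[str] = None) -> str:
    """Send the request and return the short-lived access token."""
    request = build_imds_request(resource, client_id)
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)["access_token"]
```

Calling `get_access_token()` only succeeds on a VM that has a managed identity enabled. `build_imds_request` can be inspected anywhere, which makes it easy to see that the request carries nothing an attacker could steal from source control.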
FEDERATED CREDENTIALS AND THE END OF STATIC SECRETS
Static secrets are one of the largest liabilities in enterprise automation. This episode explores how Federated Credentials and OpenID Connect (OIDC) are replacing long-lived secrets inside GitHub Actions, CI/CD pipelines, and multi-cloud integrations. You’ll learn:
- Why client secrets become long-term attack surfaces
- How OIDC token exchange works with Entra ID
- Why workload federation eliminates stored credentials
- How temporary trust outperforms permanent passwords
- Why federated identity is the future of automation security
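The token exchange described above can be sketched as a single HTTP request. This is a minimal example, assuming an app registration that already has a federated credential trusting the external issuer and subject. The tenant ID, client ID, and function names are placeholders, and `external_token` stands for the OIDC token issued by the CI platform (for example, a GitHub Actions job).

```python
import json
import urllib.parse
import urllib.request

# Standard OAuth 2.0 identifier for presenting a JWT as a client assertion.
JWT_BEARER = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def build_token_exchange_request(tenant_id: str,
                                 client_id: str,
                                 external_token: str,
                                 scope: str = "https://graph.microsoft.com/.default",
                                 ) -> urllib.request.Request:
    """Build the client-credentials request that presents the external
    OIDC token in place of a stored client secret."""
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "scope": scope,
        "client_assertion_type": JWT_BEARER,
        "client_assertion": external_token,  # short-lived, issued per run
    }
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    data = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(url, data=data, method="POST")


def exchange_for_access_token(tenant_id: str, client_id: str,
                              external_token: str) -> str:
    """Send the request and return the short-lived Entra ID access token."""
    request = build_token_exchange_request(tenant_id, client_id, external_token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["access_token"]
```

Nothing in the request is long-lived: the external token expires shortly after the pipeline run, and the returned access token does too, so there is no stored value to rotate or leak.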
THE PERMISSION CREEP CRISIS
A resilient identity with excessive permissions becomes a high-speed weapon. One of the biggest architectural failures in Microsoft 365 automation is permission creep. Engineers frequently assign massive Graph API scopes like Application.ReadWrite.All or Directory.ReadWrite.All simply to eliminate deployment friction. The result: overprivileged Service Principals operating silently across the tenant.
This episode explores:
- Why app-only permissions are extremely dangerous
- The hidden blast radius of over-scoped principals
- How attackers target machine identities for persistence
- Why compromised tokens move faster than compromised humans
- How broad Graph permissions enable tenant-wide takeover
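One way to make the blast-radius review concrete is a small audit pass over an exported inventory of service principals. The sketch below is illustrative only: the inventory shape (`display_name`, `app_roles`) is a hypothetical export format, not a Microsoft Graph response, and the high-risk set contains just the two scopes named above, so a real review would extend it.

```python
from typing import Dict, FrozenSet, Iterable, List

# The two tenant-wide application permissions called out in the episode.
HIGH_RISK_APP_ROLES: FrozenSet[str] = frozenset({
    "Application.ReadWrite.All",
    "Directory.ReadWrite.All",
})


def find_overprivileged(principals: Iterable[dict],
                        high_risk: FrozenSet[str] = HIGH_RISK_APP_ROLES,
                        ) -> Dict[str, List[str]]:
    """Map each principal's display name to the high-risk roles it holds.
    Principals holding none of the flagged roles are omitted."""
    findings: Dict[str, List[str]] = {}
    for principal in principals:
        risky = sorted(set(principal.get("app_roles", [])) & high_risk)
        if risky:
            findings[principal["display_name"]] = risky
    return findings


# Hypothetical inventory, e.g. produced by a scheduled export job.
inventory = [
    {"display_name": "hr-onboarding-flow",
     "app_roles": ["User.Read.All", "Directory.ReadWrite.All"]},
    {"display_name": "nightly-security-report",
     "app_roles": ["Sites.Selected"]},
]

if __name__ == "__main__":
    for name, roles in find_overprivileged(inventory).items():
        print(f"{name}: {', '.join(roles)}")
```

Running a check like this on a schedule turns permission creep into a reviewable finding: each flagged principal either gets a written justification or a narrower scope.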
1
00:00:00,000 --> 00:00:03,340
You just off-boarded a senior engineer and everything seemed fine until Monday morning,
2
00:00:03,340 --> 00:00:07,820
but that's when you realize every critical automation in your Microsoft 365 environment is dead.
3
00:00:07,820 --> 00:00:11,880
The HR onboarding flow has stalled and the nightly security report was never sent,
4
00:00:11,880 --> 00:00:13,980
which leaves your team scrambling to find out why.
5
00:00:13,980 --> 00:00:18,420
This is the identity rot caused by tethering enterprise workflows to personal credentials,
6
00:00:18,420 --> 00:00:22,320
and it means we have built our most important systems on the shadow of individual users.
7
00:00:22,320 --> 00:00:24,060
It is a model built on quicksand.
8
00:00:24,060 --> 00:00:27,260
Transitioning to service principals is no longer a nice-to-have cleanup project
9
00:00:27,260 --> 00:00:31,520
because it has become a mandate for survival as we head toward 2026.
10
00:00:31,520 --> 00:00:36,500
Today, we stop treating identity as a human attribute and we start treating it as infrastructure.
11
00:00:36,500 --> 00:00:40,460
If you want an environment that scales without breaking every time someone resigns,
12
00:00:40,460 --> 00:00:43,580
you need to master the service principal, but to fix the system,
13
00:00:43,580 --> 00:00:46,680
we first have to understand why it is currently breaking.
14
00:00:46,680 --> 00:00:50,620
The shadow account trap. We fall into the user account fallacy because it feels convenient
15
00:00:50,620 --> 00:00:56,380
when you are building a Power Automate flow and need to move files from SharePoint to a legacy database.
16
00:00:56,380 --> 00:00:58,900
The system asks for a connection and you click sign in,
17
00:00:58,900 --> 00:01:03,420
but you use your own account without realizing you just planted a time bomb in your infrastructure.
18
00:01:03,420 --> 00:01:07,140
We default to personal credentials because they already have the licenses and permissions,
19
00:01:07,140 --> 00:01:09,260
but we pay for that convenience in resilience.
20
00:01:09,260 --> 00:01:11,180
This is where authentication rot begins.
21
00:01:11,180 --> 00:01:14,980
Think about how Microsoft 365 manages security today
22
00:01:14,980 --> 00:01:20,260
as the entire system is designed for humans and expects MFA prompts or password resets every 90 days.
23
00:01:20,260 --> 00:01:24,340
It applies Conditional Access policies based on where you are and what device you are using,
24
00:01:24,340 --> 00:01:27,300
and this creates a massive problem for headless flows.
25
00:01:27,300 --> 00:01:31,780
These automated processes cannot answer an Authenticator prompt or navigate a password-expired screen.
26
00:01:31,780 --> 00:01:34,500
When you tether an automated process to a human identity,
27
00:01:34,500 --> 00:01:36,780
you are forcing a machine to act like a person
28
00:01:36,780 --> 00:01:40,140
and the moment that person changes their password, the flow dies.
29
00:01:40,140 --> 00:01:43,780
The moment that person travels to a new country and triggers a risky sign-in block,
30
00:01:43,780 --> 00:01:48,340
the flow dies again and the result is a silent strangulation of your headless workloads,
31
00:01:48,340 --> 00:01:50,540
but the real disaster happens during offboarding.
32
00:01:50,540 --> 00:01:55,500
Recent research shows that one in three enterprises experience a critical system failure during staff turnover
33
00:01:55,500 --> 00:01:57,140
because of identity tethering.
34
00:01:57,140 --> 00:02:01,340
When an admin leaves, IT disables their account as part of the standard procedure,
35
00:02:01,340 --> 00:02:06,380
but if that admin was the identity anchor for 20 different Power BI refreshes and five Logic Apps,
36
00:02:06,380 --> 00:02:09,980
those services go dark the second that account is flipped to disabled.
37
00:02:09,980 --> 00:02:12,540
We try to bypass this by creating service accounts,
38
00:02:12,540 --> 00:02:16,700
but in most organizations, a service account is just a personal account in disguise.
39
00:02:16,700 --> 00:02:23,660
It is usually a licensed user named SVC_automation that has a username and a password sitting in a spreadsheet or a shared vault.
40
00:02:23,660 --> 00:02:27,180
This creates a massive security hole where there is zero accountability,
41
00:02:27,180 --> 00:02:30,540
because if that password leaks, you have no way of knowing who used it.
42
00:02:30,540 --> 00:02:34,220
You do not have the protection of MFA because MFA breaks the automation,
43
00:02:34,220 --> 00:02:39,900
so you exclude the account from your security policies and create a high privilege account with no modern protection.
44
00:02:39,900 --> 00:02:42,540
It is the weakest link in your entire tenant.
45
00:02:42,540 --> 00:02:45,100
Microsoft has recognized this structural flaw
46
00:02:45,100 --> 00:02:51,340
and the reality check officially arrived when they announced that as of March 31st, 2026, things are changing.
47
00:02:51,340 --> 00:02:54,540
Microsoft retired service-principal-less authentication,
48
00:02:54,540 --> 00:02:58,940
which means the era of just signing in with a user account to run a background process is over.
49
00:02:58,940 --> 00:03:02,780
Legacy engines like SharePoint 2013 workflows are being fully removed,
50
00:03:02,780 --> 00:03:06,620
and new apps are being blocked from using the older Azure AD Graph API.
51
00:03:06,620 --> 00:03:10,140
The message is clear: if your automation depends on a human shadow,
52
00:03:10,140 --> 00:03:12,140
it is no longer supported by the platform.
53
00:03:12,140 --> 00:03:15,580
You are not just fighting technical debt anymore, you are fighting the platform itself.
54
00:03:15,580 --> 00:03:17,740
The old model was about who was running the script,
55
00:03:17,740 --> 00:03:20,700
but the new model is about the script having its own identity.
56
00:03:20,700 --> 00:03:24,140
We have to move away from the idea that an automation is an extension of a person.
57
00:03:24,140 --> 00:03:28,780
It is a standalone entity that needs its own life cycle and its own security boundary.
58
00:03:28,780 --> 00:03:32,220
If user accounts are the trap, then service principals are the escape hatch,
59
00:03:32,220 --> 00:03:35,580
but they only work if you understand how to manage them as infrastructure.
60
00:03:35,580 --> 00:03:39,100
You cannot just create them and forget them because you have to orchestrate them.
61
00:03:39,100 --> 00:03:42,540
You have to shift your mindset from managing users to managing runtimes.
62
00:03:42,540 --> 00:03:47,100
That shift starts with a fundamental change in how we define what an identity actually is.
63
00:03:47,100 --> 00:03:49,980
It is no longer an employee record, it is a deployment artifact,
64
00:03:49,980 --> 00:03:54,700
and once you realize that, the way you build in M365 changes forever.
65
00:03:54,700 --> 00:03:56,540
Identity as infrastructure.
66
00:03:56,540 --> 00:04:00,380
We need to stop thinking of identity as just a set of login credentials.
67
00:04:00,380 --> 00:04:04,700
In the modern Microsoft 365 ecosystem, identity is a structural component
68
00:04:04,700 --> 00:04:08,060
that functions more like a runtime instance of an application than a person.
69
00:04:08,060 --> 00:04:11,820
When you create a service principal, you aren't just making a user that doesn't have a desk,
70
00:04:11,820 --> 00:04:14,940
but rather you are creating a non-interactive security principal.
71
00:04:14,940 --> 00:04:19,180
It lives in the directory, it has no mailbox, it has no password in the traditional sense,
72
00:04:19,180 --> 00:04:21,660
it is a service-side representation of your code.
73
00:04:21,660 --> 00:04:24,220
The core of this shift is the decoupling principle.
74
00:04:24,220 --> 00:04:26,380
In the old world, we asked who was running a process,
75
00:04:26,380 --> 00:04:30,060
but in the new world, we ask what resource is authorized to do the work.
76
00:04:30,060 --> 00:04:33,260
This is the move from human-centric to resource-centric security.
77
00:04:33,260 --> 00:04:37,660
You aren't delegating your power to a script, but instead, you are granting that script its own authority.
78
00:04:37,660 --> 00:04:41,020
But here's the problem: not all service principals are built the same.
79
00:04:41,020 --> 00:04:44,540
To build a resilient core, you have to follow the hierarchy of truth.
80
00:04:44,540 --> 00:04:47,900
At the very top of that hierarchy is the managed identity.
81
00:04:47,900 --> 00:04:50,220
If your workload is running inside Azure,
82
00:04:50,220 --> 00:04:53,100
managed identities are the gold standard because they represent
83
00:04:53,100 --> 00:04:56,140
the ultimate realization of identity as infrastructure.
84
00:04:56,140 --> 00:04:59,500
The reason they work so well is that there are no credentials for you to manage.
85
00:04:59,500 --> 00:05:03,500
Azure handles the creation, Azure handles the storage, Azure handles the rotation.
86
00:05:03,500 --> 00:05:06,380
You never see a client secret and you never touch a certificate,
87
00:05:06,380 --> 00:05:09,660
because the identity is literally tied to the life cycle of the resource.
88
00:05:09,660 --> 00:05:11,900
If you delete the virtual machine or the logic app,
89
00:05:11,900 --> 00:05:16,620
the identity vanishes with it, which means you've achieved zero secret authentication.
90
00:05:16,620 --> 00:05:18,780
This eliminates the biggest risk in your environment,
91
00:05:18,780 --> 00:05:21,340
which is human error during credential handling.
92
00:05:21,340 --> 00:05:25,020
But we know that M365 doesn't always live entirely inside an Azure resource.
93
00:05:25,020 --> 00:05:27,020
You have CI/CD pipelines in GitHub,
94
00:05:27,020 --> 00:05:30,220
you have on-premises scripts, you have multi-cloud integrations.
95
00:05:30,220 --> 00:05:32,700
This is where federated credentials change the game.
96
00:05:32,700 --> 00:05:36,220
We used to rely on long-lived secrets for these external connections.
97
00:05:36,220 --> 00:05:39,980
Which meant you'd generate a client secret and paste it into a GitHub secret
98
00:05:39,980 --> 00:05:41,740
while hoping nobody ever saw it.
99
00:05:41,740 --> 00:05:43,260
That secret was a liability.
100
00:05:43,260 --> 00:05:46,700
It sat there for a year or maybe forever, just waiting to be leaked.
101
00:05:46,700 --> 00:05:51,340
Federated credentials use OpenID Connect or OIDC to perform a token exchange.
102
00:05:51,340 --> 00:05:56,060
Your GitHub action tells Entra ID that it has a token from GitHub proving its identity
103
00:05:56,060 --> 00:05:59,100
and then Entra ID looks at the trust relationship you've configured.
104
00:05:59,100 --> 00:06:01,580
It validates the issuer, it validates the subject,
105
00:06:01,580 --> 00:06:03,740
and then it hands back a short-lived access token.
106
00:06:03,740 --> 00:06:07,260
No secrets are stored in your pipeline and no secrets are stored in your code
107
00:06:07,260 --> 00:06:10,140
because you've replaced a permanent password with a temporary trust.
108
00:06:10,140 --> 00:06:12,540
This is what it means to treat identity as infrastructure.
109
00:06:12,540 --> 00:06:13,980
It becomes a deployment artifact.
110
00:06:13,980 --> 00:06:16,940
When you define your automation, the identity is part of the template.
111
00:06:16,940 --> 00:06:20,300
So it exists right there in the Bicep file or the Terraform code.
112
00:06:20,300 --> 00:06:22,220
You aren't setting up an account after the fact
113
00:06:22,220 --> 00:06:24,860
because the identity is provisioned as part of the system's birth.
114
00:06:24,860 --> 00:06:26,940
It's just like a virtual network or storage account.
115
00:06:26,940 --> 00:06:29,340
If you need a process to read from a SharePoint list,
116
00:06:29,340 --> 00:06:31,260
you don't find a person with access,
117
00:06:31,260 --> 00:06:34,700
but instead you define a principal with that specific scope.
118
00:06:34,700 --> 00:06:37,740
You treat that identity with the same rigor you treat your production services:
119
00:06:37,740 --> 00:06:40,060
you monitor its sign-ins, you audit its permissions,
120
00:06:40,060 --> 00:06:41,580
you version its configuration.
121
00:06:41,580 --> 00:06:45,980
This mindset shift is what separates a lab environment from an enterprise environment.
122
00:06:45,980 --> 00:06:47,500
In a lab you use what's easy,
123
00:06:47,500 --> 00:06:49,740
but in an enterprise you use what's durable.
124
00:06:49,740 --> 00:06:51,500
Service principals provide that durability.
125
00:06:51,500 --> 00:06:53,180
They don't get married and change their names.
126
00:06:53,180 --> 00:06:55,260
They don't go on vacation, they don't get fired.
127
00:06:55,260 --> 00:06:57,900
They are the stable anchors of your automation strategy.
128
00:06:57,900 --> 00:06:59,980
But having the right identity is only step one.
129
00:06:59,980 --> 00:07:02,780
The real danger, the thing that actually kills your security,
130
00:07:02,780 --> 00:07:05,500
is how much power you give that identity once it exists.
131
00:07:05,500 --> 00:07:09,900
Because an infrastructure identity with too much power is just a high speed vulnerability.
132
00:07:09,900 --> 00:07:11,420
The permission creep crisis.
133
00:07:11,420 --> 00:07:13,740
We've established that identity is infrastructure,
134
00:07:13,740 --> 00:07:14,940
but here is the problem.
135
00:07:14,940 --> 00:07:16,540
Infrastructure needs boundaries.
136
00:07:16,540 --> 00:07:20,940
Without them, your resilient service principal becomes a silent weapon for an attacker.
137
00:07:20,940 --> 00:07:23,820
The biggest mistake architects make is falling into the
138
00:07:23,820 --> 00:07:26,540
Application.ReadWrite.All trap.
139
00:07:26,540 --> 00:07:28,220
It is the path of least resistance.
140
00:07:28,220 --> 00:07:31,340
You're trying to get a script to work and you're tired of seeing 403
141
00:07:31,340 --> 00:07:32,620
forbidden errors in your logs,
142
00:07:32,620 --> 00:07:35,580
so you go into the Entra portal to find the API permissions.
143
00:07:35,580 --> 00:07:37,660
You check the box for the broadest scope available,
144
00:07:37,660 --> 00:07:39,580
and just like that you've created a monster.
145
00:07:39,580 --> 00:07:43,900
Research shows that 25% of service principals in the wild are massively over-privileged,
146
00:07:43,900 --> 00:07:47,020
which means they have permission to read every email in the tenant.
147
00:07:47,020 --> 00:07:48,380
They can modify every group.
148
00:07:48,380 --> 00:07:50,220
They can delete entire SharePoint sites.
149
00:07:50,220 --> 00:07:53,500
This happens because we treat these identities like global admins for apps.
150
00:07:53,500 --> 00:07:55,580
We assume that because it's a machine it's safe.
151
00:07:55,580 --> 00:07:57,020
But in reality it's the opposite.
152
00:07:57,020 --> 00:07:59,660
A compromised user account is limited by human speed,
153
00:07:59,660 --> 00:08:01,740
where an attacker has to click and browse,
154
00:08:01,740 --> 00:08:05,740
but a compromised service principal token is limited only by the speed of the API.
155
00:08:05,740 --> 00:08:08,060
If an attacker steals a token with
156
00:08:08,060 --> 00:08:08,700
Directory.ReadWrite.All,
157
00:08:08,700 --> 00:08:12,540
they can create 10 new accounts and grant them administrative rights in seconds.
158
00:08:12,540 --> 00:08:15,260
They can automate the exfiltration of your entire document library
159
00:08:15,260 --> 00:08:17,580
before your security team even gets the first alert.
160
00:08:17,580 --> 00:08:19,180
This is the blast radius problem.
161
00:08:19,180 --> 00:08:21,180
When you give a principal broad directory roles,
162
00:08:21,180 --> 00:08:23,580
you are removing the walls between your data silos.
163
00:08:23,580 --> 00:08:26,380
One single leaked secret in a misconfigured key vault
164
00:08:26,380 --> 00:08:29,020
can escalate into a full tenant takeover.
165
00:08:29,020 --> 00:08:31,420
Audit findings from 2025 are terrifying.
166
00:08:31,420 --> 00:08:35,020
They show that 40% of persistence vectors in modern breaches involve
167
00:08:35,020 --> 00:08:36,700
unmonitored service principals.
168
00:08:36,700 --> 00:08:39,020
Attackers aren't just looking for your password anymore.
169
00:08:39,020 --> 00:08:40,780
They are looking for your ghost identities.
170
00:08:40,780 --> 00:08:43,500
They want the principals that have app-only Exchange access
171
00:08:43,500 --> 00:08:45,420
and they want the ones that can bypass MFA
172
00:08:45,420 --> 00:08:47,580
because they are trusted background processes.
173
00:08:47,580 --> 00:08:49,100
These are the crown jewels.
174
00:08:49,100 --> 00:08:51,980
To stop this, we have to move toward RBAC for applications.
175
00:08:51,980 --> 00:08:55,580
You need to stop using broad directory-level roles for specific tasks.
176
00:08:55,580 --> 00:08:59,580
If your application only needs to read files from one specific SharePoint site,
177
00:08:59,580 --> 00:09:00,940
don't give it
178
00:09:00,940 --> 00:09:03,580
Sites.Read.All, but instead use the Graph API
179
00:09:03,580 --> 00:09:06,060
to grant it permission to that specific site ID.
180
00:09:06,060 --> 00:09:09,260
If your bot needs to send emails from a specific shared mailbox,
181
00:09:09,260 --> 00:09:11,820
use an application access policy in Exchange Online.
182
00:09:11,820 --> 00:09:14,540
Limit that principal so it can only see that one mailbox.
183
00:09:14,540 --> 00:09:16,300
This is how you scope the blast radius.
184
00:09:16,300 --> 00:09:19,900
You have to move from the mindset of enablement to the mindset of restriction.
185
00:09:19,900 --> 00:09:22,780
Every permission you grant is a potential bridge for an intruder.
186
00:09:22,780 --> 00:09:25,980
If you can't justify why a principal needs a specific scope,
187
00:09:25,980 --> 00:09:26,940
it shouldn't have it.
188
00:09:26,940 --> 00:09:29,740
We also have to stop ignoring these identities in our audits.
189
00:09:29,740 --> 00:09:31,580
Most companies spend all their time reviewing
190
00:09:31,580 --> 00:09:33,340
who is in the global admin group.
191
00:09:33,340 --> 00:09:36,780
But they never look at who has app-only permissions to their financial records.
192
00:09:36,780 --> 00:09:38,460
Those app-only scopes are more dangerous
193
00:09:38,460 --> 00:09:40,860
because they don't require a user to be present.
194
00:09:40,860 --> 00:09:41,900
They run in the dark.
195
00:09:41,900 --> 00:09:43,660
If you want to secure your infrastructure,
196
00:09:43,660 --> 00:09:45,500
you have to treat service-principal permissions
197
00:09:45,500 --> 00:09:48,220
with the same fear you treat root access on a server
198
00:09:48,220 --> 00:09:50,620
because at the end of the day, that's exactly what they are.
199
00:09:50,620 --> 00:09:53,580
To manage this risk at scale, we have to stop doing it by hand.
200
00:09:53,580 --> 00:09:55,580
We need a system for the life cycle.
201
00:09:55,580 --> 00:09:57,180
Orchestrating the life cycle.
202
00:09:57,180 --> 00:09:59,420
You cannot secure what you don't track.
203
00:09:59,420 --> 00:10:01,740
The real danger of moving to a service-principal model
204
00:10:01,740 --> 00:10:03,500
isn't actually the technology itself,
205
00:10:03,500 --> 00:10:05,500
but rather the sprawl that comes with it.
206
00:10:05,500 --> 00:10:07,820
Because these identities don't have faces,
207
00:10:07,820 --> 00:10:10,780
they don't have managers and they never show up in the company directory,
208
00:10:10,780 --> 00:10:13,820
they tend to linger long after their purpose has vanished.
209
00:10:13,820 --> 00:10:16,140
This is the orphaned identity problem.
210
00:10:16,140 --> 00:10:17,740
In a typical enterprise tenant,
211
00:10:17,740 --> 00:10:21,820
between 20 and 50% of your current service principals are likely ghosts
212
00:10:21,820 --> 00:10:24,780
that represent remnants of a project that ended three years ago,
213
00:10:24,780 --> 00:10:27,420
or perhaps they are the leftovers of a proof of concept
214
00:10:27,420 --> 00:10:30,060
that a consultant built and then abandoned.
215
00:10:30,060 --> 00:10:31,100
But here is the catch.
216
00:10:31,100 --> 00:10:33,180
Those ghosts still have active permissions
217
00:10:33,180 --> 00:10:35,500
and they still have secrets that might be valid,
218
00:10:35,500 --> 00:10:37,500
which means they are essentially open doors
219
00:10:37,500 --> 00:10:38,940
in a house where nobody lives.
220
00:10:38,940 --> 00:10:42,860
To solve this, we have to stop treating identity creation as a one-way street.
221
00:10:42,860 --> 00:10:44,780
You need an automated cleanup policy
222
00:10:44,780 --> 00:10:46,780
that treats your directory like a living garden
223
00:10:46,780 --> 00:10:49,180
because if something isn't growing, it needs to be pruned.
224
00:10:49,180 --> 00:10:52,540
We do this by using the Azure Resource Graph and PowerShell.
225
00:10:52,540 --> 00:10:54,460
You shouldn't be hunting for these in the portal,
226
00:10:54,460 --> 00:10:56,380
but instead you should be running queries
227
00:10:56,380 --> 00:10:59,420
that cross-reference your principals against their actual usage.
228
00:10:59,420 --> 00:11:01,980
If a service principal hasn't signed in for 90 days,
229
00:11:01,980 --> 00:11:03,260
why does it still exist?
230
00:11:03,260 --> 00:11:05,260
You can use the service principal sign-in logs
231
00:11:05,260 --> 00:11:07,020
to find these dormant entities,
232
00:11:07,020 --> 00:11:08,460
but you have to be careful.
233
00:11:08,460 --> 00:11:10,780
Some identities are "Identity not found" remnants
234
00:11:10,780 --> 00:11:12,700
that happen when you delete an app registration
235
00:11:12,700 --> 00:11:14,140
while the service principal object
236
00:11:14,140 --> 00:11:16,860
or the role assignment stays stuck in the IAM blade.
237
00:11:16,860 --> 00:11:18,700
These remnants consume your quota.
238
00:11:18,700 --> 00:11:21,980
Azure has a limit of 4,000 role assignments per subscription,
239
00:11:21,980 --> 00:11:23,660
so if you let these orphans pile up,
240
00:11:23,660 --> 00:11:24,940
you will eventually hit a wall
241
00:11:24,940 --> 00:11:27,260
where you can't provision new legitimate infrastructure
242
00:11:27,260 --> 00:11:29,580
because your identity graveyard is full.
243
00:11:29,580 --> 00:11:31,180
You also need to understand the mechanics
244
00:11:31,180 --> 00:11:33,740
of how Entra ID purges these objects.
245
00:11:33,740 --> 00:11:36,460
Managed identities have a 30-day soft delete window,
246
00:11:36,460 --> 00:11:38,300
meaning if you delete the owning resource,
247
00:11:38,300 --> 00:11:39,980
the identity enters a recycle bin.
248
00:11:39,980 --> 00:11:41,980
You have 30 days to realize you made a mistake
249
00:11:41,980 --> 00:11:45,340
before that identity and all its associated permissions are gone forever.
250
00:11:45,340 --> 00:11:46,860
But for standard service principals,
251
00:11:46,860 --> 00:11:48,460
the life cycle is often manual
252
00:11:48,460 --> 00:11:51,580
and this is where re-certification workflows become mandatory.
253
00:11:51,580 --> 00:11:53,100
IT should not be the one deciding
254
00:11:53,100 --> 00:11:54,620
if an application still needs access
255
00:11:54,620 --> 00:11:55,980
to the finance SharePoint site.
256
00:11:55,980 --> 00:11:58,380
IT doesn't know, but the site owner knows.
257
00:11:58,380 --> 00:12:01,020
You need to set up a system where every six months
258
00:12:01,020 --> 00:12:03,340
the business owner must attest to the continued need
259
00:12:03,340 --> 00:12:05,020
for that application's access.
260
00:12:05,020 --> 00:12:07,820
If they don't click approve, the identity is disabled,
261
00:12:07,820 --> 00:12:09,260
not deleted, disabled.
262
00:12:09,260 --> 00:12:10,620
We call this the scream test.
263
00:12:10,620 --> 00:12:13,900
If you disable an identity and nobody screams within 30 days,
264
00:12:13,900 --> 00:12:15,260
it was safe to remove.
265
00:12:15,260 --> 00:12:16,860
But before you hit that delete key,
266
00:12:16,860 --> 00:12:18,460
you have to map your dependencies,
267
00:12:18,460 --> 00:12:20,460
which is the most critical step in orchestration.
268
00:12:20,460 --> 00:12:22,060
You need to know exactly which flows,
269
00:12:22,060 --> 00:12:23,580
which scripts and which Logic Apps
270
00:12:23,580 --> 00:12:25,420
are calling that specific client ID.
271
00:12:25,420 --> 00:12:27,420
If you delete a principal that is hard-coded
272
00:12:27,420 --> 00:12:29,420
into a legacy production system without a backup,
273
00:12:29,420 --> 00:12:31,820
you've just caused the very outage we are trying to prevent.
274
00:12:31,820 --> 00:12:34,060
Orchestration is about moving identity management
275
00:12:34,060 --> 00:12:35,740
from a manual ticket-based process
276
00:12:35,740 --> 00:12:37,500
to a policy-based life cycle,
277
00:12:37,500 --> 00:12:40,300
ensuring that every identity has an expiration date,
278
00:12:40,300 --> 00:12:43,100
a clear owner, and a documented purpose.
279
00:12:43,100 --> 00:12:45,100
When you treat identity as infrastructure,
280
00:12:45,100 --> 00:12:47,980
you accept the responsibility of maintaining that infrastructure.
281
00:12:47,980 --> 00:12:51,100
You don't just build it, but you manage it from birth to retirement.
282
00:12:51,100 --> 00:12:53,580
And even with clean permissions and a managed life cycle,
283
00:12:53,580 --> 00:12:55,340
a static secret is still a liability,
284
00:12:55,340 --> 00:12:56,940
which is why we have to automate the rotation.
285
00:12:56,940 --> 00:13:00,060
The zero-downtime rotation model.
286
00:13:00,060 --> 00:13:01,740
Even the most perfectly scoped identity
287
00:13:01,740 --> 00:13:03,580
eventually hits the 90-day expiry wall.
288
00:13:03,580 --> 00:13:06,380
This is the moment where your security policy
289
00:13:06,380 --> 00:13:08,380
becomes your operational enemy.
290
00:13:08,380 --> 00:13:10,300
If you are managing secrets manually,
291
00:13:10,300 --> 00:13:12,860
you are essentially scheduling your own future outages.
292
00:13:12,860 --> 00:13:15,100
You are waiting for a calendar alert that you might miss,
293
00:13:15,100 --> 00:13:18,060
or perhaps a notification email that goes to a service desk inbox
294
00:13:18,060 --> 00:13:19,020
that nobody monitors.
295
00:13:19,020 --> 00:13:21,660
When that secret expires, your automation stops,
296
00:13:21,660 --> 00:13:23,420
and this isn't just a security failure,
297
00:13:23,420 --> 00:13:25,740
but a failure of basic system maintenance.
298
00:13:25,740 --> 00:13:28,380
To solve this, we have to adopt the dual-credential pattern,
299
00:13:28,380 --> 00:13:30,700
which is the secret to zero-downtime operations.
300
00:13:30,700 --> 00:13:32,460
Most people think of rotation as a single event
301
00:13:32,460 --> 00:13:34,780
where you delete the old secret and paste in the new one,
302
00:13:34,780 --> 00:13:36,620
but that is exactly how you break things.
303
00:13:36,620 --> 00:13:39,420
In a high-availability environment, you always want an overlap.
304
00:13:39,420 --> 00:13:42,540
Microsoft Entra ID allows you to have multiple valid credentials
305
00:13:42,540 --> 00:13:44,700
for a single service principal simultaneously,
306
00:13:44,700 --> 00:13:46,540
making the workflow simple but powerful.
307
00:13:46,540 --> 00:13:50,380
First, you generate a new secret without touching the old one yet.
308
00:13:50,380 --> 00:13:52,220
You add the new secret to your configuration,
309
00:13:52,220 --> 00:13:54,140
so your application has two ways to get in,
310
00:13:54,140 --> 00:13:55,660
and then you test the connection.
311
00:13:55,660 --> 00:13:58,780
Once you've verified that the new credential is being used successfully,
312
00:13:58,780 --> 00:14:01,260
only then do you go back and prune the old one.
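The add, test, then prune sequence can be sketched with an in-memory stand-in for a service principal. Everything here is hypothetical and for illustration only: the `ServicePrincipal` class, the `rotate` helper, and the `config` dictionary are not a real Entra ID API; they only model the fact that several credentials can be valid at once.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ServicePrincipal:
    """In-memory stand-in: a principal that holds several valid secrets."""
    credentials: Dict[str, str] = field(default_factory=dict)  # key_id -> secret

    def add_secret(self) -> Tuple[str, str]:
        key_id = secrets.token_hex(8)
        value = secrets.token_urlsafe(32)
        self.credentials[key_id] = value
        return key_id, value

    def remove_secret(self, key_id: str) -> None:
        self.credentials.pop(key_id, None)

    def authenticate(self, secret: str) -> bool:
        return secret in self.credentials.values()


def rotate(principal: ServicePrincipal, config: Dict[str, str], old_key_id: str) -> str:
    """Add-before-delete rotation: the old secret stays valid until the new one is proven."""
    new_key_id, new_secret = principal.add_secret()   # 1. add, old one untouched
    previous = config["client_secret"]
    config["client_secret"] = new_secret              # 2. point the config at the new secret
    if not principal.authenticate(config["client_secret"]):  # 3. test the connection
        config["client_secret"] = previous            # fall back; nothing was deleted
        raise RuntimeError("new credential failed; old credential left in place")
    principal.remove_secret(old_key_id)               # 4. only now prune the old one
    return new_key_id
```

Because the prune is the last step, a failure at any earlier point leaves the original credential working.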
313
00:14:01,260 --> 00:14:04,860
This add-before-delete strategy ensures that there is never a millisecond
314
00:14:04,860 --> 00:14:06,940
where your script is locked out of the tenant,
315
00:14:06,940 --> 00:14:08,860
but doing this by hand is still too risky.
316
00:14:08,860 --> 00:14:10,140
Human fingers are clumsy,
317
00:14:10,140 --> 00:14:11,980
and we often copy and paste the wrong string,
318
00:14:11,980 --> 00:14:13,580
or simply forget to save the change.
319
00:14:13,580 --> 00:14:15,900
This is why you must integrate with Azure Key Vault,
320
00:14:15,900 --> 00:14:18,140
because Key Vault isn't just a digital safe
321
00:14:18,140 --> 00:14:20,140
but a dynamic configuration provider.
322
00:14:20,140 --> 00:14:22,540
Your Power Automate flows or your Azure Functions
323
00:14:22,540 --> 00:14:24,780
should never have a hard-coded password.
324
00:14:24,780 --> 00:14:27,340
Instead, they should have a managed identity
325
00:14:27,340 --> 00:14:31,260
that gives them permission to read a specific secret from the vault at runtime.
326
00:14:31,260 --> 00:14:33,020
When the secret rotates in Entra ID,
327
00:14:33,020 --> 00:14:35,740
your automation runbook updates the value in Key Vault,
328
00:14:35,740 --> 00:14:37,180
and the next time your flow runs,
329
00:14:37,180 --> 00:14:39,420
it asks the vault for the latest password
330
00:14:39,420 --> 00:14:41,260
and gets the new one automatically.
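The idea of asking the vault at runtime, instead of storing the value, can be shown with a small sketch. The assumptions are loud ones: the vault is faked with a dictionary, and the secret name `automation-client-secret`, the client ID, and `build_token_request` are hypothetical. In a real deployment the `get_secret` callable might wrap `SecretClient.get_secret(name).value` from the `azure-keyvault-secrets` package, authenticated with a managed identity.

```python
from typing import Callable, Dict

SECRET_NAME = "automation-client-secret"  # hypothetical secret name


def build_token_request(get_secret: Callable[[str], str], client_id: str) -> Dict[str, str]:
    """Build client-credential parameters, reading the secret at call time."""
    return {
        "grant_type": "client_credentials",
        "client_id": client_id,
        # Looked up on every run, so a rotated value is picked up automatically.
        "client_secret": get_secret(SECRET_NAME),
    }


# In-memory stand-in for the vault (assumption: replaces a real Key Vault lookup).
vault = {SECRET_NAME: "old-value"}

first = build_token_request(vault.__getitem__, "hypothetical-client-id")
vault[SECRET_NAME] = "new-value"  # the rotation runbook updates the vault
second = build_token_request(vault.__getitem__, "hypothetical-client-id")
```

The calling code is identical before and after rotation; only the value stored in the vault changed.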
331
00:14:41,260 --> 00:14:44,220
Your code doesn't change, your environment variables don't change,
332
00:14:44,220 --> 00:14:47,340
and the plumbing of your identity is handled entirely behind the scenes
333
00:14:47,340 --> 00:14:49,100
by Microsoft Graph APIs.
334
00:14:49,100 --> 00:14:50,540
For high-security environments,
335
00:14:50,540 --> 00:14:52,460
you should move away from secrets entirely
336
00:14:52,460 --> 00:14:54,620
and use certificate-based authentication.
337
00:14:54,620 --> 00:14:56,620
Secrets are just strings that can be written down
338
00:14:56,620 --> 00:14:58,540
or accidentally logged in plain text.
339
00:14:58,540 --> 00:15:00,780
Certificates, however, require a private key
340
00:15:00,780 --> 00:15:02,940
that never needs to leave your secure storage.
341
00:15:02,940 --> 00:15:06,220
By using certificates, you eliminate the risk of credential harvesting
342
00:15:06,220 --> 00:15:07,980
from your logs or your source code.
343
00:15:07,980 --> 00:15:12,060
This automation is what turns identity from a chore into a capability.
344
00:15:12,060 --> 00:15:15,660
You can set your secrets to expire every 30 days instead of every year,
345
00:15:15,660 --> 00:15:18,060
and because the rotation is handled by a script,
346
00:15:18,060 --> 00:15:20,220
the frequency doesn't increase your workload.
347
00:15:20,220 --> 00:15:22,380
It only increases your security posture.
348
00:15:22,380 --> 00:15:24,700
You are shrinking the window of opportunity for an attacker
349
00:15:24,700 --> 00:15:27,420
without adding a single minute of manual labor to your week.
350
00:15:27,420 --> 00:15:29,180
This isn't just about protecting your data,
351
00:15:29,180 --> 00:15:32,300
but also about the bottom line of your entire automation strategy.
352
00:15:32,300 --> 00:15:34,620
If your flows are brittle, they are expensive,
353
00:15:34,620 --> 00:15:37,100
and every time a developer has to stop what they are doing
354
00:15:37,100 --> 00:15:39,900
to fix a broken connection, you are losing money.
355
00:15:39,900 --> 00:15:43,340
By automating the rotation, you are making your infrastructure invisible.
356
00:15:43,340 --> 00:15:44,460
It just works.
357
00:15:44,460 --> 00:15:47,340
And when it just works, you can finally see the true economic impact
358
00:15:47,340 --> 00:15:49,020
of doing identity the right way.
359
00:15:49,020 --> 00:15:51,900
High-speed automation requires high-speed identity management,
360
00:15:51,900 --> 00:15:53,740
and anything less is just a bottleneck.
361
00:15:53,740 --> 00:15:56,780
The economic reality of identity.
362
00:15:56,780 --> 00:15:59,340
Security usually drives the conversation around identity,
363
00:15:59,340 --> 00:16:01,820
but the budget is almost always the ultimate filter.
364
00:16:01,820 --> 00:16:04,380
We need to look at the financial mechanics of this transition,
365
00:16:04,380 --> 00:16:07,180
because identity has its own economic reality.
366
00:16:07,180 --> 00:16:10,460
Moving to service principals is actually a licensing arbitrage play.
367
00:16:10,460 --> 00:16:12,700
Think about the traditional shadow account model.
368
00:16:12,700 --> 00:16:16,460
You buy a $15-a-month Power Automate Premium license for an individual
369
00:16:16,460 --> 00:16:17,820
so they can build a few flows.
370
00:16:17,820 --> 00:16:19,100
On paper, that looks affordable,
371
00:16:19,100 --> 00:16:21,980
but you are tethering the value of those flows to a human seat.
372
00:16:21,980 --> 00:16:23,740
If that person leaves the company,
373
00:16:23,740 --> 00:16:26,300
or if you need to scale that flow to a thousand users,
374
00:16:26,300 --> 00:16:29,340
the individual licensing model collapses under its own weight.
375
00:16:29,340 --> 00:16:32,940
The enterprise model moves this cost to the process license.
376
00:16:32,940 --> 00:16:35,980
It sits at roughly $150 a month per flow.
377
00:16:35,980 --> 00:16:38,700
At first glance, that looks like a 10 times price increase.
378
00:16:38,700 --> 00:16:40,060
But look at the capacity.
379
00:16:40,060 --> 00:16:43,820
A single process license supports two and a half million actions every single day.
380
00:16:43,820 --> 00:16:45,100
It can be stacked,
381
00:16:45,100 --> 00:16:47,180
it can be moved between environments,
382
00:16:47,180 --> 00:16:48,540
and most importantly,
383
00:16:48,540 --> 00:16:51,900
it decouples your operational costs from your employee headcount.
384
00:16:51,900 --> 00:16:53,340
You are paying for the work being done,
385
00:16:53,340 --> 00:16:55,340
not the person who happened to configure it.
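The price comparison can be worked through with the two figures quoted in the episode. This is back-of-the-envelope arithmetic only: it assumes flat list prices of $15 per user and roughly $150 per process, and it ignores licensing eligibility rules, discounts, and capacity limits.

```python
import math

PER_USER_MONTHLY = 15   # Power Automate Premium, figure quoted in the episode
PROCESS_MONTHLY = 150   # process license, "roughly", as quoted in the episode


def per_user_cost(users: int) -> int:
    """Monthly cost when every user of the flow needs their own seat."""
    return users * PER_USER_MONTHLY


def break_even_users() -> int:
    """Smallest user count at which per-user seats cost as much as one process license."""
    return math.ceil(PROCESS_MONTHLY / PER_USER_MONTHLY)


print(break_even_users())    # 10
print(per_user_cost(1000))   # 15000, against 150 for a single process license
```

Under these assumptions the process license costs more for a handful of users and far less once the flow is scaled across the organization.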
386
00:16:55,340 --> 00:16:58,300
The real return on investment isn't found in the license fee, though.
387
00:16:58,300 --> 00:17:00,700
It is found in the reduction of operational friction.
388
00:17:00,700 --> 00:17:04,220
Research indicates that organizations adopting a full service principal model
389
00:17:04,220 --> 00:17:07,820
see a 70% drop in identity-related support tickets.
390
00:17:07,820 --> 00:17:11,260
You are no longer paying a tier two admin to manually reset passwords
391
00:17:11,260 --> 00:17:13,500
for a service automation account every quarter.
392
00:17:13,500 --> 00:17:17,820
You aren't paying your most expensive developers to spend four hours on a Monday morning
393
00:17:17,820 --> 00:17:20,780
fixing a connection error that paralyzed the shipping department.
394
00:17:20,780 --> 00:17:22,780
Then we have the Entra ID tiering logic.
395
00:17:22,780 --> 00:17:25,980
Many architects try to stay on the P1 tier to save $3 per user,
396
00:17:25,980 --> 00:17:27,340
but P1 is a manual trap.
397
00:17:27,340 --> 00:17:30,220
It lacks the automation required for high-scale governance.
398
00:17:30,220 --> 00:17:31,420
When you move to P2,
399
00:17:31,420 --> 00:17:34,460
you gain access to Privileged Identity Management for workloads.
400
00:17:34,460 --> 00:17:37,340
You can finally move toward a model of zero standing access.
401
00:17:37,340 --> 00:17:40,620
You can grant a service principal the ability to become a Global Administrator only
402
00:17:40,620 --> 00:17:42,940
for the 10 minutes it takes to run a deployment script.
403
00:17:42,940 --> 00:17:45,260
This is how you optimize your risk-adjusted cost.
404
00:17:45,260 --> 00:17:47,820
You aren't paying to maintain a permanent security hole.
405
00:17:47,820 --> 00:17:50,300
You are paying for a governed, auditable gateway.
406
00:17:50,300 --> 00:17:53,180
Finally, consider the hidden cost of SIEM noise.
407
00:17:53,180 --> 00:17:56,380
Every orphaned principal attempting to sign in with an expired secret
408
00:17:56,380 --> 00:17:57,420
generates a log.
409
00:17:57,420 --> 00:17:59,660
If your directory is full of ghost identities,
410
00:17:59,660 --> 00:18:02,940
you are literally paying your security provider to ingest and store
411
00:18:02,940 --> 00:18:04,940
thousands of useless failure events.
412
00:18:04,940 --> 00:18:07,900
Pruning your identity graveyard doesn't just improve security.
413
00:18:07,900 --> 00:18:10,620
It directly lowers your data ingestion and storage costs.
414
00:18:10,620 --> 00:18:13,420
It makes your security operations center more efficient
415
00:18:13,420 --> 00:18:15,820
because they aren't wasting time chasing false positives
416
00:18:15,820 --> 00:18:18,540
from a pilot project that was abandoned three years ago.
417
00:18:18,540 --> 00:18:20,860
Your identity model is now a stable foundation.
418
00:18:20,860 --> 00:18:23,420
It is no longer a series of fragile human dependencies.
419
00:18:23,420 --> 00:18:26,300
You have successfully moved from a world of shadow accounts to a world
420
00:18:26,300 --> 00:18:28,380
where identity is treated as infrastructure.
421
00:18:28,380 --> 00:18:31,500
Your immediate homework is to run an inventory script today
422
00:18:31,500 --> 00:18:32,460
to find the rot.
423
00:18:32,460 --> 00:18:35,020
Find every personal account running a production flow.
424
00:18:35,020 --> 00:18:39,500
Identify the ones that are one resignation away from a total system blackout.
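A first pass at that inventory could look like the sketch below. It assumes the flow inventory has already been exported into records. The `FlowRecord` fields, the sample rows, and the `find_rot` helper are hypothetical and do not reflect any real Power Platform export schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FlowRecord:
    """Hypothetical row from an exported flow inventory."""
    name: str
    environment: str          # e.g. "production" or "dev"
    connection_identity: str  # UPN or client ID used by the connection
    identity_type: str        # "user" or "service_principal"


def find_rot(flows: List[FlowRecord]) -> List[FlowRecord]:
    """Return production flows that still run under a personal user account."""
    return [
        f for f in flows
        if f.environment == "production" and f.identity_type == "user"
    ]


inventory = [
    FlowRecord("invoice-sync", "production", "jane@contoso.example", "user"),
    FlowRecord("ship-labels", "production", "hypothetical-client-id", "service_principal"),
    FlowRecord("sandbox-test", "dev", "sam@contoso.example", "user"),
]

for flow in find_rot(inventory):
    print(f"{flow.name} depends on {flow.connection_identity}")
```

With the sample data above, only `invoice-sync` is flagged, because it is the one production flow tied to a human account.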
425
00:18:39,500 --> 00:18:43,180
If you want to automate this at scale, check out our episode on Entra ID governance.
426
00:18:43,180 --> 00:18:45,420
We show you how to set up the recertification cycles
427
00:18:45,420 --> 00:18:48,140
that keep your directory clean without human intervention.
428
00:18:48,140 --> 00:18:51,260
If this changed how you view your infrastructure, leave a review.
429
00:18:51,260 --> 00:18:54,300
It helps more people find this and helps the architect community grow.







