The episode challenges the common misconception that “HR automation” is just a chatbot connected to a PDF repository. Instead, it presents a production-grade architectural approach for scaling HR operations using Microsoft Copilot Studio, Logic Apps, and structured evidence capture to create governed, deterministic HR agents — not just conversational bots.


Scaling HR operations effectively is essential in today’s dynamic business landscape. Microsoft Copilot Studio plays a pivotal role in this transformation, giving HR professionals advanced automation that scales with the organization. By implementing Copilot, you gain task automation, productivity enhancement, and operational efficiency.

Consider the following advantages:

  • Task Automation: AI agents manage employee inquiries and monitor routine processes without human intervention.
  • Productivity Enhancement: Continuous operation means HR processes can run 24/7.
  • Operational Efficiency: Automating routine tasks streamlines operations and reduces manual follow-ups.

Embracing this transformation empowers you to focus on strategic initiatives, ultimately leading to a high-performance HR environment that effectively scales HR operations.

Key Takeaways

  • Automation in HR does not replace jobs; it transforms them, allowing professionals to focus on strategic initiatives.
  • Microsoft Copilot Studio is accessible for all organizations, not just large companies, enabling streamlined HR processes.
  • Data-driven insights from AI can enhance decision-making and improve HR outcomes significantly.
  • Standardizing HR workflows reduces errors and bias, promoting fairness and transparency in processes.
  • Integrating AI with human insight creates a powerful synergy, balancing efficiency with the necessary human touch.
  • Continuous learning and adaptation are essential for HR professionals to stay relevant in a rapidly changing environment.
  • Establishing feedback loops helps refine HR automation workflows based on real user input and performance data.
  • Governance and compliance are crucial for maintaining trust and accountability in automated HR systems.

HR Automation Myths

Common Misunderstandings

Automation Replaces Jobs

Many people believe that automation will replace jobs, especially in HR. This myth often stems from fear of technology. However, automation does not eliminate jobs; it transforms them. Instead of performing repetitive tasks, you can focus on strategic initiatives that require human insight. Automation tools like Microsoft Copilot Studio enhance your role by taking over mundane tasks, allowing you to engage in more meaningful work.

Only for Large Companies

Another common misconception is that automation is only beneficial for large companies. In reality, small and medium-sized enterprises can also reap the rewards of automation. Tools like Copilot Studio are designed to be scalable and accessible. They can streamline processes such as leave requests, HR policy inquiries, and onboarding, regardless of your organization's size.

The Reality of Automation

Enhancing Human Roles

Automation enhances human roles rather than replacing them. By automating repetitive tasks, you free up time to focus on strategic planning and employee engagement. For instance, Copilot Studio can automate leave requests and onboarding processes, allowing you to provide a more personalized experience for employees. This shift leads to improved internal support and faster responses to HR inquiries.

Accessibility for All Organizations

Automation is not just for large enterprises; it is accessible to all organizations. With tools like Copilot Studio, even smaller businesses can implement automation effectively. This technology streamlines daily processes, improves customer interactions, and helps identify qualified candidates. By adopting automation, you can create a more efficient HR environment that benefits everyone involved.

Remember, the effectiveness of automation tools depends on the quality of the data they access. Messy or incomplete data can lead to poor responses. Therefore, ensure your data is clean and well-organized to maximize the benefits of automation.

Deterministic HR Decisions

Data-Driven Insights

Utilizing AI for Analysis

You can transform HR operations by turning passive data into actionable insights using AI. AI tools analyze large volumes of HR data quickly and accurately. They provide you with real-time information that helps you make better decisions. For example, AI assistants can simulate different hiring strategies and predict their outcomes before you implement them. This capability allows you to test ideas and choose the best path forward.

Many organizations have seen impressive results by applying AI to HR analysis. IBM’s AskHR, an enterprise-scale HR agent, automates about 80 tasks. It achieved a 94% containment rate for common questions and reduced support tickets by 75% since 2016. This automation also cut HR operational costs by 40% over four years. Such examples show how AI improves efficiency and reduces workload.

Here is a summary of how data-driven insights improve HR decision-making:

  • Revenue Growth: Companies with strong candidate screening saw 3.5 times higher revenue growth.
  • Competitive Advantage: These companies were 2.6 times more likely to outperform competitors.
  • Time-to-Fill Reduction: Data-driven recruitment cut time-to-fill by 35%.
  • Quality of Hires: Quality of hires increased by 70%.

AI-driven tools also identify trends, predict turnover risks, and offer insights to improve workplace culture. These insights help you act proactively rather than reactively.

Predictive Analytics

Predictive analytics uses historical data and AI models to forecast future HR trends. You can anticipate challenges before they arise. For example, predictive models help you identify employees at risk of leaving. This foresight allows you to take steps to retain valuable talent.

You can also use predictive analytics to spot skill gaps in your workforce. This insight helps you plan training or hiring to meet future needs. For instance, if your business needs more software engineers soon, predictive analytics will alert you early.

Here is a summary of key benefits of predictive analytics in HR:

  • Identify Turnover Risks: Helps you foresee and reduce employee turnover.
  • Anticipate Skill Gaps: Enables planning for necessary workforce skills.
  • Plan Proactive Hiring: Supports strategic hiring to meet future demands.

Predictive analytics supports workforce planning by evaluating your business goals and matching them with talent needs. This approach ensures you maintain a high-performance HR environment that adapts to change.
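As a concrete illustration, turnover-risk identification can start as a simple, auditable scoring function before any machine-learned model is introduced. The feature names, weights, and threshold below are hypothetical placeholders; a production model would be fitted to your own workforce data.

```python
# Illustrative turnover-risk scoring. Feature names and weights are
# hypothetical; a real model would be trained on your organization's data.
RISK_WEIGHTS = {
    "months_since_promotion": 0.02,    # longer stagnation -> higher risk
    "engagement_score_deficit": 0.10,  # e.g. (5 - survey score on a 1-5 scale)
    "manager_changes_last_year": 0.15,
}

def turnover_risk(employee: dict) -> float:
    """Return a risk score clamped to [0, 1] from a flat feature dict."""
    score = sum(RISK_WEIGHTS[k] * employee.get(k, 0) for k in RISK_WEIGHTS)
    return min(1.0, max(0.0, score))

def flag_at_risk(employees: list, threshold: float = 0.5) -> list:
    """Names of employees whose score meets the retention-review threshold."""
    return [e["name"] for e in employees if turnover_risk(e) >= threshold]
```

Because the weights and threshold are explicit, every flag the system raises can be explained, audited, and overridden, which matters for the fairness and governance points below.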

Implementing Deterministic Processes

Standardizing Workflows

You can improve consistency and fairness in HR by standardizing workflows. Standardization means every HR task follows a clear, defined process. This approach reduces errors and bias. It also makes your operations more transparent and trustworthy.

Consider these benefits of standardized workflows:

  • They create a shared framework for decision-making.
  • They ensure all employees receive equal treatment.
  • They reduce legal risks by maintaining compliance.
  • They improve employee trust and satisfaction.

Inconsistent HR practices can cause confusion and unfair treatment. For example, only 40% of candidates feel hiring processes are fair when workflows lack standardization. By contrast, a standardized hiring process defines evaluation criteria clearly, promoting fairness.

Reducing Bias in Decision-Making

AI helps you reduce bias in HR decisions by applying objective criteria consistently. Automation evaluates candidates based on qualifications, not personal opinions. This method minimizes human prejudices and supports fairer outcomes.

Research shows AI can improve the candidate experience by ensuring impartial evaluations. When you use AI-driven tools, you create a more equitable hiring process. This transformation builds trust and diversity within your workforce.

To implement deterministic HR processes effectively, follow these best practices:

  • Break workflows into smaller, reusable modules for easier management.
  • Integrate workflows with existing tools using stable APIs.
  • Define clear roles and permissions to protect workflow integrity.
  • Perform data quality checks to catch errors early.

By combining AI-powered insights with standardized, bias-free workflows, you can transform your HR operations into a high-performance system. This transformation helps you make decisions that are accurate, fair, and aligned with your organization’s goals.

Redefining HR Agents

New Roles for HR Professionals

Strategic Partners

With the integration of AI tools like Microsoft Copilot Studio, you can redefine your role as an HR professional. You transition from administrative tasks to strategic partnerships within your organization. Automation allows you to focus on high-value activities that drive business growth. For example, you can collaborate with leadership on employee engagement strategies and training programs. This shift enhances productivity and enables you to concentrate on culture building and effective communication.

Change Agents

As a change agent, you lead initiatives that foster a positive workplace culture. AI-powered tools streamline HR operations, allowing you to act quickly on employee feedback. You can use insights from Copilot to improve processes like hiring and onboarding. This proactive approach not only enhances employee satisfaction but also strengthens your organization’s overall performance.

Integrating AI with Human Insight

Collaboration Between Humans and AI

Integrating AI with human insight creates a powerful synergy in HR. AI automates repetitive tasks, freeing you to focus on strategic initiatives. For instance, Copilot can handle scheduling and data analysis, allowing you to prioritize personal interactions. This balance between technology and human touch is essential for effective HR management.

To foster collaboration, consider these strategies:

  • Develop user-friendly interfaces for better interaction with AI.
  • Establish feedback loops for continuous improvement between AI systems and human teams.
  • Encourage collaborative decision-making that combines AI insights with your judgment.

Continuous Learning and Adaptation

In a rapidly changing environment, continuous learning is vital. You must adapt to new technologies and processes. AI tools like Copilot provide valuable data insights that help you make informed decisions. By embracing a culture of adaptability, you can enhance your skills and stay relevant in your role.

To support this learning journey, consider:

  • Participating in workshops and training sessions focused on AI integration.
  • Building AI fluency within your HR team to reduce resistance to new technologies.
  • Prioritizing data quality and governance to ensure effective AI utilization.

By redefining your role and integrating AI with human insight, you can transform HR operations into high-performance systems. This transformation empowers you to make strategic decisions that enhance employee outcomes and drive organizational success.

Target Architecture for HR Solutions

Essential Components

Technology Stack

To build a scalable HR solution, you need a robust technology stack. This stack should include various components that work together seamlessly. Here are some essential elements:

  • Employee Self-Service Agent: Centralizes self-service tasks, making it easier for employees to manage their HR needs.
  • Integration Capabilities: Utilizes prebuilt and custom connectors for efficient data exchange with various enterprise systems.
  • Core Features: Supports employee and manager scenarios through conversational interfaces, enhancing user experience.
  • Customization and Scalability: Offers extensive options for modifying agent behavior and scaling across regions.
  • Security Measures: Adheres to enterprise-grade security standards, ensuring compliance with regulations.

How common stack components compare:

  • Microsoft Dynamics 365: unified data and real-time accuracy, but complex with high licensing costs; best for large enterprises.
  • SAP ERP: enterprise-grade process automation, but a technical barrier with performance issues; best for manufacturing and logistics companies.
  • Power Automate Desktop (PAD): integrates with various systems; best for organizations looking for desktop automation.
  • Custom Connectors: flexible integration, but require maintenance; best for organizations needing specific integrations.
  • Cloud Services (Dataverse, SharePoint): seamless data access and integration with Microsoft tools; best for organizations in the Microsoft ecosystem.

Integration with Logic Apps

Integrating Microsoft Copilot with Logic Apps enhances your HR operations. This integration allows you to automate workflows and connect various systems. You can leverage Microsoft ecosystem tools such as Power Automate flows and Azure OpenAI for seamless data flow. Authentication models typically involve delegated access or service principals, ensuring secure connections with third-party systems.
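As a sketch of that hand-off, an agent can post structured case context to a Logic Apps HTTP (Request) trigger. The URL here is a placeholder and the payload schema is an assumption; a real Logic App issues its own signed trigger URL and validates requests against the JSON schema you define on the trigger.

```python
import json
import urllib.request

# Placeholder: a real Logic App publishes its own trigger URL with an
# SAS signature; never hard-code that secret in source.
LOGIC_APP_URL = "https://example.invalid/workflows/hr-triage/triggers/manual/paths/invoke"

def build_payload(intent: str, employee_id: str, context: dict) -> bytes:
    """Serialize the structured evidence the workflow needs to act deterministically."""
    return json.dumps({
        "intent": intent,           # what the user is trying to do
        "employeeId": employee_id,  # who the action is on behalf of
        "context": context,         # case facts captured in the conversation
    }).encode("utf-8")

def submit(intent: str, employee_id: str, context: dict) -> int:
    """POST the case to the Logic App and return the HTTP status code."""
    req = urllib.request.Request(
        LOGIC_APP_URL,
        data=build_payload(intent, employee_id, context),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # raises on network/HTTP errors
        return resp.status
```

The design point is that the agent hands over a structured, typed payload, not free text, so the downstream workflow executes deterministically and the payload itself becomes part of the evidence trail.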

Designing for Scalability

Modular Approaches

Adopting a modular approach in your HR architecture supports future scalability and adaptability. Each module can evolve independently, allowing you to respond quickly to changing business needs. Here are some key benefits of modular design:

  • Seamless Integration: Ensures operational harmony by allowing modules to connect smoothly.
  • Future-Proofing: Prepares each module to evolve with new technologies or changing business needs.
  • Agile HR Policies: Provides flexible frameworks for various work models, enhancing adaptability.
  • Reskilling Programs: Equips talent to thrive in evolving modular ecosystems.
  • Ownership Culture: Encourages accountability and innovation as teams manage their modules.

Future-Proofing HR Operations

To future-proof your HR operations, you must adopt proactive strategies. Here are some effective approaches:

  • Prepare for crises through proactive risk management and crisis response planning.
  • Promote diversity and inclusion to enhance decision-making and reduce risks.
  • Offer targeted learning and development training to improve skills and compliance.
  • Ensure regulatory compliance with ongoing monitoring and clear procedures.
  • Leverage HR technology to centralize compliance and payroll for global operations.

Future-proofing global operations isn’t a one-time project or a checklist. It’s an ongoing practice.

By focusing on these strategies, you can create a resilient HR architecture that adapts to future challenges while maintaining high performance.

Practical Workflows to Scale HR Operations

Step-by-Step Implementation

Initial Assessment

To effectively implement Microsoft Copilot Studio in your HR workflows, start with an initial assessment. This step helps you understand your current processes and identify areas for improvement. Follow these best practices:

  1. Access Copilot Studio: Log in to Microsoft Copilot Studio, part of the Microsoft Power Platform.
  2. Define Your Agent’s Purpose: Clearly outline what your AI agent will do. This clarity sets the foundation for your automation efforts.
  3. Map Conversation Flows: Structure how interactions will occur. This mapping ensures that the AI-driven web agent can handle inquiries efficiently.
  4. Integrate with Existing Systems: Use Power Automate and APIs to connect with other platforms. This integration enhances the overall efficiency of your HR operations.
  5. Test and Optimize: Regularly evaluate the AI agent's performance and make necessary adjustments. Continuous testing helps you refine workflows and improve outcomes.

In addition to these steps, consider offering clear, context-rich prompts to improve AI responses. Always review AI outputs for accuracy and compliance. Train your team on privacy and responsible AI use to ensure a smooth transition to automation.
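The "map conversation flows" step can be prototyped as plain data before any topics are built in the designer. The topic names and trigger phrases below are illustrative, and the substring matching is a deliberate simplification of Copilot Studio's actual trigger-phrase matching.

```python
# Sketch of conversation-flow mapping as data. Topic names and trigger
# phrases are illustrative; in Copilot Studio these correspond to topics,
# each with its own trigger phrases and node flow.
TOPICS = {
    "leave_request": ["request leave", "time off", "vacation days"],
    "policy_question": ["hr policy", "handbook", "what is the rule"],
    "onboarding": ["new hire", "first day", "onboarding checklist"],
}

def route(utterance: str) -> str:
    """Pick the topic whose trigger phrase appears in the utterance, else escalate."""
    text = utterance.lower()
    for topic, phrases in TOPICS.items():
        if any(p in text for p in phrases):
            return topic
    return "escalate_to_human"  # no confident match -> a person handles it
```

Writing the map down like this forces you to decide, per topic, what counts as a match and what the fallback is, before any AI behavior is layered on top.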

Pilot Programs

Launching pilot programs allows you to test the effectiveness of Copilot Studio in real-world scenarios. These programs provide valuable insights into how automation impacts your HR processes. Here are some outcomes you might observe:

  • Increased retention rate: AI's analytical capabilities provide a consistent view of merit-based aspects, aiding in effective reward strategies.
  • Greater efficiency in training and onboarding: Generative AI tools assist in creating training materials, enhancing the onboarding experience for new hires.
  • Reduced average hiring time: Automation of interview and evaluation processes speeds up hiring while lowering costs.
  • Higher quality and relevance of candidates: Copilot analyzes candidate profiles to identify the best matches for open positions.

By implementing pilot programs, you can gather data to support broader adoption of automation across your HR department.

Scaling Successful Practices

Feedback Loops

Feedback loops play a crucial role in the continuous improvement of your HR automation workflows. They enable you to review and adapt processes based on input from HR teams and stakeholders. Establishing monitoring systems and clear accountability measures creates effective feedback loops. For example, tracking task completion and team sentiment can help you identify areas for improvement.

Communicate how user feedback leads to tangible changes. This transparency ensures that workflows remain flexible and aligned with evolving business objectives. By incorporating real user input and performance data, you can refine your HR automation workflows continuously.

Continuous Improvement

To maintain high performance, focus on continuous improvement. Start by scaling successful practices across multiple departments. Here are some strategies to consider:

  • Focus on employee experience to ensure smooth processes that enhance productivity.
  • Avoid over-automation by starting with clear, repeatable workflows.
  • Build for long-term scalability by choosing adaptable tools and documenting workflows.
  • Enable change management by communicating changes clearly to all stakeholders.
  • Utilize AI to enhance automation and reduce friction in HR operations.

By adopting these strategies, you can create a resilient HR architecture that adapts to future challenges while maintaining high performance.

Importance of Governance and Compliance

In the realm of HR automation, governance and compliance are crucial. You must ensure that automated systems operate transparently and ethically. This oversight fosters trust and accountability within your organization.

Balancing Automation and Human Touch

Ethical Considerations

When implementing AI-driven HR solutions, you must address several ethical considerations. Here are key points to keep in mind:

  • Bias and Discrimination: AI can perpetuate existing biases, leading to unfair outcomes. Conduct data audits and use diverse datasets to mitigate this risk.
  • Transparency and Explainability: Lack of clarity in AI decisions can erode trust. Use explainable AI and communicate clearly with employees about how decisions are made.
  • Privacy and Data Security: Handling sensitive employee data raises privacy concerns. Implement data protection measures and conduct regular audits to safeguard information.
  • Job Security and Automation: AI can displace jobs, causing anxiety among employees. Offer reskilling programs and career transition support to ease these concerns.
  • Dependency on AI: Over-reliance on AI may undermine human judgment. Ensure human oversight remains a priority in your HR processes.

Maintaining Employee Engagement

Balancing automation with a human touch is essential for maintaining employee engagement. Many HR tasks involve sensitive matters that require emotional intelligence and soft skills. For example, an automated response to a bereavement leave request may feel cold and impersonal. Such interactions can negatively impact an employee's view of the organization.

Beyond individual requests, activities like culture setting, performance reviews, and exit interviews necessitate human intervention and are not suitable for automation.

Integrating AI with human capabilities allows you to balance efficiency and empathy. As an HR leader, you play a crucial role in making strategic decisions that consider factors beyond data, such as company culture and long-term business strategy.
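One way to enforce that balance in code is an explicit allow-list of intents that always route to a person, so the hand-off is a deterministic rule rather than a model judgment. The intent names below are illustrative.

```python
# Sensitive intents that must always reach a human. The names are
# illustrative; the point is that routing is an explicit allow-list,
# not something the model decides case by case.
HUMAN_ONLY_INTENTS = {"bereavement_leave", "grievance", "exit_interview", "accommodation"}

def handler_for(intent: str) -> str:
    """Automate routine intents; hand sensitive ones straight to an HR partner."""
    return "hr_partner" if intent in HUMAN_ONLY_INTENTS else "automated_agent"
```

Because the list is data, expanding or auditing which topics bypass automation is a configuration change, not a retraining exercise.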

Monitoring and Evaluation

Key Performance Indicators

To ensure the success of automated HR operations, you must monitor key performance indicators (KPIs). These metrics help you evaluate the effectiveness of your automation efforts. Here are some important KPIs to track:

  • Conversation success rate (HR intents): measures how effectively HR queries are resolved.
  • Assisted support rate for HR topics: indicates reliance on human support for HR issues.
  • Repeated queries or retries on the same HR topic: suggests potential knowledge gaps or unclear responses.
  • Eval pass/fail trends for HR scenarios: tracks consistency in HR responses.

How to read these signals together:

  • High usage + high success: effective deflection of HR tickets.
  • High usage + low success: knowledge gaps or clarity issues.
  • Repeated retries: policy ambiguity or lack of personalization.
  • Eval regressions: risk of inconsistent answers.
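The first two KPIs can be computed directly from raw conversation logs. The log record shape below is an assumption; adapt the field names to whatever your analytics export produces.

```python
# Computing success rate and assisted-support rate from conversation logs.
# The record shape (resolved / escalated_to_human flags) is an assumption.
def kpi_summary(conversations: list) -> dict:
    """Success rate and assisted-support rate over HR conversations."""
    total = len(conversations)
    resolved = sum(1 for c in conversations if c["resolved"])
    escalated = sum(1 for c in conversations if c["escalated_to_human"])
    return {
        "success_rate": resolved / total if total else 0.0,
        "assisted_support_rate": escalated / total if total else 0.0,
    }
```

Tracked over time, these two numbers tell you whether the agent is genuinely deflecting tickets or merely generating conversations that still end with a human.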

Adjusting Strategies Based on Feedback

Listening to employee feedback is essential for successful HR automation. If employees express concerns about new software, investigate the reasons behind their resistance. Adjust your strategies accordingly to address these concerns.

Key steps in adjusting HR automation:

  • Engage HR Teams and Stakeholders: Involve HR staff throughout the automation process to ensure their needs are met.
  • Continuous Monitoring: Regularly assess automated workflows based on feedback to improve efficiency.
  • Align Technology with Employee Needs: Ensure that the technology used meets the expectations of employees to reduce resistance.

By prioritizing governance and compliance, you can create a robust framework for your HR automation efforts. This approach not only enhances accountability but also fosters a culture of trust and engagement within your organization.


Transforming Microsoft Copilot Studio into a high-performance HR agent is crucial for scaling your HR operations. Here are the main takeaways:

  • Focus on creating deterministic systems to avoid unmanaged decision pathways.
  • Establish clear governance to ensure traceability and accountability.
  • Capture intent and context effectively to enhance HR operations.

To begin this transformation, you can take these actionable steps:

  1. Get quick wins: Deploy assistive agents to demonstrate immediate value.
  2. Create a Center of Excellence (COE): Manage cross-team needs and promote agent adoption.
  3. Measure and reward adoption: Track effectiveness and incentivize usage.

By implementing these strategies, you can foster a high-performance HR environment that empowers employees and streamlines processes. The long-term benefits include improved onboarding, personalized learning experiences, and a more connected workforce aligned with your organization’s vision.

FAQ

What is Microsoft Copilot Studio?

Microsoft Copilot Studio is a low-code platform for building AI agents. Applied to HR, it helps you scale operations by transforming traditional processes into governed, high-performance agents that enhance efficiency and decision-making.

How can I scale HR operations effectively?

You can scale HR operations by implementing automation tools like Copilot Studio. These tools streamline workflows, improve the employee experience, and enhance self-service capabilities, allowing you to focus on strategic initiatives.

What are the benefits of using AI in HR?

Using AI in HR improves decision-making and enhances customer satisfaction. AI-driven agents can analyze data quickly, automate repetitive tasks, and provide valuable insights, leading to better employee engagement and operational efficiency.

How does automation impact employee roles?

Automation transforms employee roles by allowing you to focus on strategic tasks rather than mundane activities. With tools like Copilot Studio, you can become a strategic partner and change agent within your organization.

What is the importance of governance in HR automation?

Governance ensures that automated HR systems operate transparently and ethically. It helps maintain compliance, reduces bias, and fosters trust among employees, ultimately enhancing the overall effectiveness of HR operations.

How can I encourage agent adoption in my organization?

To encourage agent adoption, provide training and support for employees. Highlight the benefits of automation, such as improved efficiency and enhanced customer experience, to motivate staff to embrace new tools.

What role does integration play in HR automation?

Integration connects various systems and tools, enabling seamless data flow. By integrating Microsoft Copilot Studio with existing platforms, you can enhance the functionality of your HR operations and improve overall efficiency.

🚀 Want to be part of m365.fm?

Then stop just listening… and start showing up.

👉 Connect with me on LinkedIn and let’s make something happen:

  • 🎙️ Be a podcast guest and share your story
  • 🎧 Host your own episode (yes, seriously)
  • 💡 Pitch topics the community actually wants to hear
  • 🌍 Build your personal brand in the Microsoft 365 space

This isn’t just a podcast — it’s a platform for people who take action.

🔥 Most people wait. The best ones don’t.

👉 Connect with me on LinkedIn and send me a message:
"I want in"

Let’s build something awesome 👊

1
00:00:00,000 --> 00:00:05,200
Most organizations think HR automation means a chatbot glued to a SharePoint folder full of PDFs.

2
00:00:05,200 --> 00:00:10,000
They are wrong. That setup doesn't automate HR, it just speeds up the production of confident nonsense,

3
00:00:10,000 --> 00:00:14,700
and it does it without evidence, without controls, and without a defensible decision trail.

4
00:00:14,700 --> 00:00:17,100
Meanwhile, the real costs pile up quietly.

5
00:00:17,100 --> 00:00:22,200
Screening bias you can't explain, ticket backlogs that never shrink, onboarding that drags for weeks

6
00:00:22,200 --> 00:00:24,100
and audits that turn into archaeology.

7
00:00:24,100 --> 00:00:29,600
This episode is about moving from passive HR data to deterministic HR decisions.

8
00:00:29,600 --> 00:00:33,300
Not magical thinking, governed use cases, a reproducible stack,

9
00:00:33,300 --> 00:00:37,000
and a design that survives contact with compliance, security, and scale.

10
00:00:37,000 --> 00:00:40,800
We're going to build three workflows, screening, triage, onboarding,

11
00:00:40,800 --> 00:00:46,100
using Copilot Studio as the brain and Logic Apps as muscle, with evidence captured by default.

12
00:00:46,100 --> 00:00:50,800
If you're trying to scale HR agents without turning your tenant into a policy crime scene,

13
00:00:50,800 --> 00:00:56,100
subscribe to M365FM. That's the contract here, practical architecture that holds up in production.

14
00:00:56,100 --> 00:00:57,500
This episode is a blueprint.

15
00:00:57,500 --> 00:01:00,900
Control plane thinking, repeatable patterns you can apply in your own environment.

16
00:01:00,900 --> 00:01:06,900
It is not a feature tour, it is not legal advice, and it is definitely not just "prompt better" dressed up as engineering.

17
00:01:06,900 --> 00:01:10,200
We'll walk three governed use cases end to end.

18
00:01:10,200 --> 00:01:13,500
A candidate screening agent with bias and escalation controls,

19
00:01:13,500 --> 00:01:17,200
an HR ticket triage agent designed for measurable deflection,

20
00:01:17,200 --> 00:01:21,700
and an onboarding orchestrator that survives long running state and retries.

21
00:01:21,700 --> 00:01:27,200
Now we need to define what an HR agent actually is in system terms.

22
00:01:27,200 --> 00:01:31,200
The foundational misunderstanding, HR agents aren't chatbots.

23
00:01:31,200 --> 00:01:34,900
The foundational mistake is treating an HR agent like a conversation experience.

24
00:01:34,900 --> 00:01:39,900
A chatbot is a user interface, it answers questions, it might search a knowledge base,

25
00:01:39,900 --> 00:01:45,900
it might summarize a document, and in most organizations it lives in the same conceptual bucket as self-service.

26
00:01:45,900 --> 00:01:48,200
Helpful, optional, low stakes.

27
00:01:48,200 --> 00:01:52,800
An HR agent is not that, an HR agent is a distributed decision engine with tool access.

28
00:01:52,800 --> 00:01:56,200
That distinction matters because HR work isn't mainly about talking.

29
00:01:56,200 --> 00:01:58,000
It's about decisions and actions.

30
00:01:58,000 --> 00:02:02,800
Screen or escalate, root or resolve, approve or reject, provision or pause.

31
00:02:02,800 --> 00:02:07,600
And the moment an LLM touches those decisions without a controlled action space and an evidence trail,

32
00:02:07,600 --> 00:02:11,200
you have converted your HR operations into conditional chaos.

33
00:02:11,200 --> 00:02:14,200
Most people try to make the model smarter, but that's the wrong lever.

34
00:02:14,200 --> 00:02:17,800
Smarter isn't safer. Smarter just fails in more creative ways.

35
00:02:17,800 --> 00:02:20,900
The lever that matters is determinism, what actions are allowed,

36
00:02:20,900 --> 00:02:25,000
under which identities, with which inputs, with which guardrails and with which logs.

37
00:02:25,000 --> 00:02:30,600
If the system can't prove what it did and why it did it, it didn't do HR work, it generated text.

38
00:02:30,600 --> 00:02:34,500
Here's the uncomfortable truth, HR is where entropy wins by default.

39
00:02:34,500 --> 00:02:37,000
Why? Because HR policy is never a single policy.

40
00:02:37,000 --> 00:02:40,900
It's a patchwork, regional rules, union constraints, job families,

41
00:02:40,900 --> 00:02:44,100
exceptions for executives, accommodations, immigration timelines,

42
00:02:44,100 --> 00:02:47,900
and the one-off "we did it this way last time" that never got documented.

43
00:02:47,900 --> 00:02:52,900
Every exception is an entropy generator. Every undocumented exception becomes folklore.

44
00:02:52,900 --> 00:02:57,200
Over time, policy drifts away from intent. A passive chatbot amplifies that drift.

45
00:02:57,200 --> 00:03:01,900
It learns the wrong thing from the wrong artifact, and it presents it with a confident tone.

46
00:03:01,900 --> 00:03:06,500
And then humans defer to it because it sounds consistent and consistency feels like authority.

47
00:03:06,500 --> 00:03:11,300
That's how bias creeps in, too. Not from bad prompts, but from system behavior.

48
00:03:11,300 --> 00:03:17,600
A screening flow that uses unstructured criteria, vague scoring and inconsistent overrides will converge on proxies.

49
00:03:17,600 --> 00:03:21,600
It will favor the attributes that correlate with historical outcomes, not job relevance.

50
00:03:21,600 --> 00:03:25,000
If you can't show the rubric, the rationale, and the override reason codes,

51
00:03:25,000 --> 00:03:29,000
you can't defend the process. You also can't improve it because you have no instrumentation.

52
00:03:29,000 --> 00:03:32,500
So in architectural terms, HR agents are not probabilistic assistants.

53
00:03:32,500 --> 00:03:36,600
They are controlled workflow compilers. Copilot Studio should capture intent.

54
00:03:36,600 --> 00:03:40,000
What the user is trying to do, what the case context is, what the constraints are,

55
00:03:40,000 --> 00:03:44,100
and what the next safe action is. Then it should call tools that execute deterministically.

56
00:03:44,100 --> 00:03:47,200
The model can reason, but the tools must enforce.

57
00:03:47,200 --> 00:03:49,500
The system must separate decide from do.

58
00:03:49,500 --> 00:03:52,100
If you collapse those layers, you get the worst outcome.

59
00:03:52,100 --> 00:03:57,200
A probabilistic system taking irreversible actions, and then you retroactively try to explain it.

60
00:03:57,200 --> 00:03:59,600
That's not governance, that's storytelling after the fact.

61
00:03:59,600 --> 00:04:02,200
This is why the action space matters more than the prompt.

62
00:04:02,200 --> 00:04:06,000
Action space is the total set of operations the agent can perform.

63
00:04:06,000 --> 00:04:10,500
Read these sources, write these records, send these messages, schedule these meetings,

64
00:04:10,500 --> 00:04:12,600
trigger these workflows, and nothing else.

65
00:04:12,600 --> 00:04:17,200
And every single action in that space needs three things, identity, policy and telemetry.

66
00:04:17,200 --> 00:04:20,400
Identity means the action happens under a real boundary.

67
00:04:20,400 --> 00:04:24,500
A managed identity, a service principal, or an on-behalf-of user context,

68
00:04:24,500 --> 00:04:27,500
something you can constrain with Entra and Conditional Access.

69
00:04:27,500 --> 00:04:30,900
If the agent can act as whoever, then nobody owns the outcome.

70
00:04:30,900 --> 00:04:33,300
Policy means the action has preconditions.

71
00:04:33,300 --> 00:04:38,800
Confidence thresholds, required approvals, allowed fields, and explicit deny rules.

72
00:04:38,800 --> 00:04:41,200
Not "please be careful." Real gates.

73
00:04:41,200 --> 00:04:45,500
Telemetry means you capture the evidence, what prompt came in, what sources were retrieved,

74
00:04:45,500 --> 00:04:51,500
what tool calls were executed, what data changed, who approved overrides, and correlation IDs across the chain.

75
00:04:51,500 --> 00:04:55,300
Without that, your audit posture is hope, and HR doesn't get to run on hope.

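The identity-policy-telemetry requirement above can be sketched as a guard wrapped around every tool action. This is a minimal conceptual sketch, not a real Copilot Studio or Logic Apps API; names like `mi-hr-onboarding` are illustrative assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable

@dataclass
class GuardedAction:
    """Every action carries an identity, a policy gate, and telemetry."""
    name: str
    identity: str                            # managed identity / service principal (assumed name)
    precondition: Callable[[dict], bool]     # policy gate: deny unless explicitly allowed
    audit_log: list = field(default_factory=list)

    def invoke(self, payload: dict) -> str:
        stamp = datetime.now(timezone.utc).isoformat()
        if not self.precondition(payload):
            # denied calls are evidence too
            self.audit_log.append({"action": self.name, "allowed": False, "at": stamp})
            raise PermissionError(f"{self.name}: policy gate denied call")
        self.audit_log.append({"action": self.name, "allowed": True,
                               "identity": self.identity,
                               "payload_keys": sorted(payload), "at": stamp})
        return "executed"

# Hypothetical usage: an onboarding write that requires a prior approval flag.
update_milestone = GuardedAction(
    name="update_onboarding_milestone",
    identity="mi-hr-onboarding",
    precondition=lambda p: p.get("approved") is True,
)
```

The point of the sketch is the shape: the gate and the log live in the tool layer, not in the prompt.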
76
00:04:55,300 --> 00:04:59,000
Now you might be thinking, "But we just want to reduce tickets and speed up onboarding."

77
00:04:59,000 --> 00:05:00,800
Good, those are exactly the right goals.

78
00:05:00,800 --> 00:05:04,800
But to get there, you have to stop building chatbots and start building governed agents,

79
00:05:04,800 --> 00:05:08,100
systems where the conversational layer is the control surface,

80
00:05:08,100 --> 00:05:11,300
and the automation layer is the enforcement mechanism.

81
00:05:11,300 --> 00:05:14,200
Once you internalize that, the architecture becomes obvious.

82
00:05:14,200 --> 00:05:19,000
Copilot Studio orchestrates, Logic Apps executes, Dataverse holds state, Monitor holds truth,

83
00:05:19,000 --> 00:05:22,300
and now we can build something that scales without lying to you.

84
00:05:22,300 --> 00:05:24,800
The Target Architecture: Copilot Studio

85
00:05:24,800 --> 00:05:26,700
as the Brain, Logic Apps as the Muscle.

86
00:05:26,700 --> 00:05:29,400
Here's the Target Architecture stripped of marketing language.

87
00:05:29,400 --> 00:05:32,400
Copilot Studio is the Brain. Logic App Standard is the Muscle.

88
00:05:32,400 --> 00:05:38,000
The MCP server is the contract that stops the Brain from hallucinating capabilities it shouldn't have.

89
00:05:38,000 --> 00:05:39,700
Dataverse is memory with structure.

90
00:05:39,700 --> 00:05:44,400
Azure Monitor and Log Analytics are the part nobody wants to pay for right until the first incident.

91
00:05:44,400 --> 00:05:45,900
Start with Copilot Studio.

92
00:05:45,900 --> 00:05:47,900
Its job isn't answering questions.

93
00:05:47,900 --> 00:05:50,700
Its job is intent capture and orchestration.

94
00:05:50,700 --> 00:05:55,900
It runs the conversation, asks for missing parameters and decides which tool call is allowed next based on context.

95
00:05:55,900 --> 00:05:58,700
Think of it like a control surface for a workflow compiler.

96
00:05:58,700 --> 00:06:00,200
The user describes what they want.

97
00:06:00,200 --> 00:06:03,200
The agent translates that into a constrained plan,

98
00:06:03,200 --> 00:06:06,500
and then it delegates execution to deterministic tools.

99
00:06:06,500 --> 00:06:07,600
That last part matters.

100
00:06:07,600 --> 00:06:10,500
Copilot Studio should not be the place where you do HR.

101
00:06:10,500 --> 00:06:14,800
It should be the place where you decide what to do next and where you enforce the conversational policy,

102
00:06:14,800 --> 00:06:19,900
what information you're allowed to request, what you must disclose, and when you must stop and escalate.

103
00:06:19,900 --> 00:06:23,000
This is also where you group actions into governed tool sets.

104
00:06:23,000 --> 00:06:27,500
Candidate screening actions are not the same risk profile as ticket triage actions.

105
00:06:27,500 --> 00:06:31,900
If you blend them into one giant HR super agent, you've built an entropy engine.

106
00:06:31,900 --> 00:06:33,900
Then Logic Apps Standard.

107
00:06:33,900 --> 00:06:35,800
Logic Apps is where actions happen.

108
00:06:35,800 --> 00:06:43,000
Calling Microsoft Graph, writing to Dataverse, sending approval emails, routing tickets, generating documents, and triggering downstream systems.

109
00:06:43,000 --> 00:06:44,300
And the keyword is Standard.

110
00:06:44,300 --> 00:06:48,000
For HR, you want the single tenant posture and enterprise controls,

111
00:06:48,000 --> 00:06:52,100
managed identities, network isolation, and predictable monitoring.

112
00:06:52,100 --> 00:06:54,400
Not because the multi-tenant model is bad,

113
00:06:54,400 --> 00:06:58,100
but because HR data makes you pay for shared assumptions.

114
00:06:58,100 --> 00:07:02,600
Logic Apps Standard also forces a discipline most HR automations never have.

115
00:07:02,600 --> 00:07:04,600
Explicit workflow boundaries.

116
00:07:04,600 --> 00:07:06,200
Each workflow becomes a tool.

117
00:07:06,200 --> 00:07:07,700
Each tool has an input schema.

118
00:07:07,700 --> 00:07:09,200
Each tool has a permission model.

119
00:07:09,200 --> 00:07:13,100
That's exactly what you need if you want a system that behaves deterministically under pressure.

120
00:07:13,100 --> 00:07:15,300
Now, drop the MCP server into the middle.

121
00:07:15,300 --> 00:07:19,500
In the demo, the MCP server is effectively Logic Apps exposing tools.

122
00:07:19,500 --> 00:07:20,700
That's the right mental model.

123
00:07:20,700 --> 00:07:22,500
MCP is your tool interface contract.

124
00:07:22,500 --> 00:07:26,700
It's how you describe in plain terms what a tool does and what payload it expects.

125
00:07:26,700 --> 00:07:30,900
So the model can map natural language into structured calls, but the trick is not the convenience.

126
00:07:30,900 --> 00:07:33,200
The trick is that MCP gives you a choke point.

127
00:07:33,200 --> 00:07:36,800
You decide which tools exist, how they're named, what parameters they accept,

128
00:07:36,800 --> 00:07:38,600
and which identity can invoke them.

129
00:07:38,600 --> 00:07:40,600
You stop letting the model invent actions.

130
00:07:40,600 --> 00:07:43,000
You force it to choose from a menu you own.

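One way to picture the "menu you own" is a tool manifest with declared input schemas, where anything outside the menu is rejected before execution. The tool names and schema shapes below are hypothetical, a sketch of the idea rather than an actual MCP server implementation.

```python
# Hypothetical MCP-style tool manifest: the model chooses from this menu,
# and payloads are validated against the declared schema before anything runs.
TOOL_MANIFEST = {
    "get_candidate_profile": {
        "description": "Read-only: fetch a candidate record by ID.",
        "input_schema": {"candidate_id": str},
        "identity": "mi-hr-read",          # assumed identity binding
    },
    "write_scoring_artifact": {
        "description": "Write a structured score with rubric version.",
        "input_schema": {"candidate_id": str, "rubric_version": str, "score": int},
        "identity": "mi-hr-score",
    },
}

def validate_call(tool: str, payload: dict) -> bool:
    """Reject tools the manifest doesn't define and payloads that don't match."""
    spec = TOOL_MANIFEST.get(tool)
    if spec is None:
        return False                       # the model cannot invent actions
    schema = spec["input_schema"]
    return set(payload) == set(schema) and all(
        isinstance(payload[key], typ) for key, typ in schema.items()
    )
```

The choke point is the manifest itself: adding a tool is a governance event, not a prompt edit.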
131
00:07:43,000 --> 00:07:46,600
Dataverse is next, and it's not optional if you want audit-grade operations.

132
00:07:46,600 --> 00:07:51,300
You need structured state, candidate records, screening rubrics, score artifacts, ticket cases,

133
00:07:51,300 --> 00:07:54,600
onboarding milestones, approval objects, override reason codes.

134
00:07:54,600 --> 00:07:58,600
If the state lives in chat transcripts and email threads, you don't have state.

135
00:07:58,600 --> 00:08:00,500
You have vibes.

136
00:08:00,500 --> 00:08:04,000
Dataverse also gives you something critical for long-running workflows.

137
00:08:04,000 --> 00:08:05,000
Durability.

138
00:08:05,000 --> 00:08:07,600
Onboarding is not a single request-response.

139
00:08:07,600 --> 00:08:10,300
It's days of waiting on people, systems, and failures.

140
00:08:10,300 --> 00:08:14,900
You need a system of record that survives retries and replays without duplicating actions.

141
00:08:14,900 --> 00:08:17,800
Then observability: Azure Monitor and Log Analytics.

142
00:08:17,800 --> 00:08:20,900
This is where you capture operational truth across the chain.

143
00:08:20,900 --> 00:08:25,100
Not just "the agent responded." You need the conversation session ID,

144
00:08:25,100 --> 00:08:27,700
the MCP tool call, the Logic App Run ID,

145
00:08:27,700 --> 00:08:30,000
the Dataverse transaction, and the outcome.

146
00:08:30,000 --> 00:08:33,000
Correlate them, store them, decide retention deliberately.

147
00:08:33,000 --> 00:08:37,800
Because without correlation IDs, you cannot answer the only question auditors ask that matters.

148
00:08:37,800 --> 00:08:39,400
Show me exactly what happened.

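The correlation requirement can be sketched as one ID minted at the conversation layer and attached to every downstream hop, so the chain can be stitched end to end later. Layer names and reference formats here are illustrative assumptions, not actual product log schemas.

```python
import uuid

def new_chain() -> dict:
    """Mint one correlation ID at the start of a case."""
    return {"correlation_id": str(uuid.uuid4()), "hops": []}

def record_hop(chain: dict, layer: str, ref: str) -> None:
    """Attach the same correlation ID to each layer's native identifier."""
    chain["hops"].append({
        "correlation_id": chain["correlation_id"],  # survives across layers
        "layer": layer,       # e.g. "copilot_session", "mcp_tool_call", "logic_app_run"
        "ref": ref,           # the layer's own run/session/transaction ID
    })
```

With this shape, "show me exactly what happened" becomes a single query on one ID instead of an archaeology project across four log stores.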
149
00:08:39,400 --> 00:08:41,600
Finally, Entra sits across everything as enforcement.

150
00:08:41,600 --> 00:08:44,100
The agent is not a person, it's an identity boundary.

151
00:08:44,100 --> 00:08:47,400
Managed identity for Logic Apps, service principals for tool access,

152
00:08:47,400 --> 00:08:50,700
Conditional Access where appropriate, least privilege on Graph scopes.

153
00:08:50,700 --> 00:08:55,200
You don't secure the agent by writing stern instructions in the system prompt.

154
00:08:55,200 --> 00:08:58,100
You secure it by making unauthorized actions impossible.

155
00:08:58,100 --> 00:08:59,300
That's the architecture.

156
00:08:59,300 --> 00:09:01,500
Copilot Studio reasons and routes.

157
00:09:01,500 --> 00:09:03,300
MCP defines capability.

158
00:09:03,300 --> 00:09:05,700
Logic Apps executes with identity and isolation.

159
00:09:05,700 --> 00:09:08,200
Dataverse holds state, Monitor proves behavior,

160
00:09:08,200 --> 00:09:09,700
Entra enforces boundaries.

161
00:09:09,700 --> 00:09:13,700
Now we can talk about why Logic App Standard is the default for HR.

162
00:09:13,700 --> 00:09:15,500
Why Logic App Standard for HR?

163
00:09:15,500 --> 00:09:17,800
Isolation, identity and auditability.

164
00:09:17,800 --> 00:09:19,900
Logic App Standard is not a premium option.

165
00:09:19,900 --> 00:09:24,200
For HR, it's the baseline if you intend to keep PII inside controllable boundaries

166
00:09:24,200 --> 00:09:25,600
and still sleep at night.

167
00:09:25,600 --> 00:09:29,100
Most people pick consumption because it's easy, serverless, and it demos well.

168
00:09:29,100 --> 00:09:31,200
But HR workflows don't fail in the demo.

169
00:09:31,200 --> 00:09:35,200
They fail in production, under load, during audits after someone forwards a URL

170
00:09:35,200 --> 00:09:38,300
or when a maker copies a workflow into the wrong environment

171
00:09:38,300 --> 00:09:40,300
and nobody notices for six months.

172
00:09:40,300 --> 00:09:43,400
Standard changes the operating model in three ways that matter.

173
00:09:43,400 --> 00:09:46,900
Isolation, identity, and auditability. Start with isolation.

174
00:09:46,900 --> 00:09:50,300
HR data has a habit of escaping through convenience.

175
00:09:50,300 --> 00:09:53,000
Consumption runs in a shared multi-tenant model.

176
00:09:53,000 --> 00:09:55,100
That doesn't mean it's insecure by definition.

177
00:09:55,100 --> 00:09:57,800
It means you are accepting shared infrastructure assumptions

178
00:09:57,800 --> 00:10:01,200
at the exact moment you're processing resumes, employee records,

179
00:10:01,200 --> 00:10:04,900
and hiring decisions that regulators now classify as high risk.

180
00:10:04,900 --> 00:10:08,300
You are choosing probabilistic comfort over deterministic control.

181
00:10:08,300 --> 00:10:09,700
Standard is single-tenant.

182
00:10:09,700 --> 00:10:11,100
That changes the blast radius.

183
00:10:11,100 --> 00:10:13,100
You get a dedicated runtime you can lock down

184
00:10:13,100 --> 00:10:15,500
and you can move the workflow behind private network boundaries.

185
00:10:15,500 --> 00:10:16,500
And yes, that matters.

186
00:10:16,500 --> 00:10:18,500
Even if your users sit in Teams all day,

187
00:10:18,500 --> 00:10:20,700
your agent doesn't care about your user experience,

188
00:10:20,700 --> 00:10:22,500
it cares about its callable endpoints.

189
00:10:22,500 --> 00:10:25,900
HR workflows should not be public endpoints with hopes attached.

190
00:10:25,900 --> 00:10:28,100
Private endpoints and VNet integration

191
00:10:28,100 --> 00:10:31,000
let you make the MCP tool surface non-public by design.

192
00:10:31,000 --> 00:10:32,800
You can front it with controlled ingress.

193
00:10:32,800 --> 00:10:35,600
You can restrict who can call it and you can stop pretending

194
00:10:35,600 --> 00:10:39,300
that a secret URL parameter is an authentication strategy.

195
00:10:39,300 --> 00:10:40,800
Consumption gives you fewer options here.

196
00:10:40,800 --> 00:10:43,600
Standard gives you an architecture that can actually be constrained.

197
00:10:43,600 --> 00:10:44,800
Now identity.

198
00:10:44,800 --> 00:10:48,100
HR automation fails when everything runs under one shared connection

199
00:10:48,100 --> 00:10:49,300
that nobody owns.

200
00:10:49,300 --> 00:10:51,600
Standard plays well with managed identities.

201
00:10:51,600 --> 00:10:53,500
That means your logic app can authenticate

202
00:10:53,500 --> 00:10:56,900
to downstream services without embedding secrets in connection strings

203
00:10:56,900 --> 00:11:00,200
and without spreading credentials across environments like confetti.

204
00:11:00,200 --> 00:11:01,900
Managed identity isn't a nice to have.

205
00:11:01,900 --> 00:11:04,800
It's the only honest answer to who performed this action

206
00:11:04,800 --> 00:11:07,500
because in HR, "the system did it" is not an explanation.

207
00:11:07,500 --> 00:11:09,300
It's an evasion.

208
00:11:09,300 --> 00:11:11,900
With standard, you can run each workflow as a tool

209
00:11:11,900 --> 00:11:13,600
with its own identity posture.

210
00:11:13,600 --> 00:11:15,800
You can separate read candidate profile

211
00:11:15,800 --> 00:11:18,600
from write scoring artifact from send offer letter

212
00:11:18,600 --> 00:11:20,100
and assign different permissions.

213
00:11:20,100 --> 00:11:22,400
That's least privilege in practice, not a slide deck.

214
00:11:22,400 --> 00:11:23,800
And once you have identity,

215
00:11:23,800 --> 00:11:26,800
entra becomes an enforcement layer instead of a directory.

216
00:11:26,800 --> 00:11:29,600
Conditional access can constrain who can invoke what

217
00:11:29,600 --> 00:11:31,800
from where and under what risk conditions.

218
00:11:31,800 --> 00:11:35,800
But it only works if your tool calls have real identities to bind to.

219
00:11:35,800 --> 00:11:38,000
Then there's auditability, which is where most

220
00:11:38,000 --> 00:11:40,000
agentic HR projects die quietly.

221
00:11:40,000 --> 00:11:43,800
If your run history captures PII and you don't configure guardrails,

222
00:11:43,800 --> 00:11:45,800
you've built a compliance incident logger.

223
00:11:45,800 --> 00:11:48,000
Logic apps will happily store inputs and outputs.

224
00:11:48,000 --> 00:11:49,600
That's useful when you're debugging.

225
00:11:49,600 --> 00:11:53,000
It's catastrophic when those inputs include candidate resumes,

226
00:11:53,000 --> 00:11:55,800
medical accommodation notes or anything that triggers retention

227
00:11:55,800 --> 00:11:56,800
requirements.

228
00:11:56,800 --> 00:11:58,800
Standard doesn't magically fix that,

229
00:11:58,800 --> 00:12:01,800
but it gives you the enterprise posture to handle it properly.

230
00:12:01,800 --> 00:12:04,600
Secure inputs and outputs, controlled storage,

231
00:12:04,600 --> 00:12:07,600
and predictable diagnostic routing into Log Analytics.

232
00:12:07,600 --> 00:12:11,200
You choose what evidence you keep, where it lives and how long it survives.

233
00:12:11,200 --> 00:12:14,600
You don't discover your retention policy during an incident review.

234
00:12:14,600 --> 00:12:16,200
And here's the unpleasant part.

235
00:12:16,200 --> 00:12:18,200
Logic apps doesn't give you native DLP.

236
00:12:18,200 --> 00:12:20,200
That's not a bug. It's the design reality.

237
00:12:20,200 --> 00:12:21,600
So you compensate with architecture.

238
00:12:21,600 --> 00:12:23,800
You enforce data minimization by design.

239
00:12:23,800 --> 00:12:25,600
You avoid putting raw PII into prompts.

240
00:12:25,600 --> 00:12:27,600
You pass identifiers, not payloads.

241
00:12:27,600 --> 00:12:31,400
You retrieve the sensitive data inside the tool boundary under identity

242
00:12:31,400 --> 00:12:35,400
and you store structured artifacts in Dataverse with explicit access controls.

243
00:12:35,400 --> 00:12:37,400
You treat the chat transcript as a control surface,

244
00:12:37,400 --> 00:12:38,600
not as a data lake.

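The "pass identifiers, not payloads" pattern above can be sketched in a few lines. The store, the identity check, and the field names are all illustrative assumptions; in a real system the fetch happens inside the Logic Apps boundary under a managed identity, not in Python.

```python
# Stand-in for Dataverse with access controls: sensitive fields live here only.
SENSITIVE_STORE = {
    "c-1001": {"name": "A. Example",
               "resume_text": "...",
               "accommodation_notes": "..."},
}

def prompt_payload(candidate_id: str) -> dict:
    """What the agent/prompt layer is allowed to carry: an identifier, never PII."""
    return {"candidate_id": candidate_id}

def fetch_candidate(candidate_id: str, identity: str) -> dict:
    """Runs inside the tool boundary; crude stand-in for Entra enforcement."""
    if not identity.startswith("mi-"):       # assumed managed-identity naming
        raise PermissionError("no managed identity bound to this call")
    return SENSITIVE_STORE[candidate_id]
```

Because the chat transcript only ever sees `{"candidate_id": ...}`, run history and conversation logs stay PII-free by construction.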
245
00:12:38,600 --> 00:12:42,200
You also put Purview boundaries where they actually apply: on the data sources

246
00:12:42,200 --> 00:12:46,000
and repositories that Copilot can retrieve from, and on the storage locations

247
00:12:46,000 --> 00:12:47,600
where outputs land.

248
00:12:47,600 --> 00:12:49,200
The agent inherits your data hygiene.

249
00:12:49,200 --> 00:12:51,000
If your SharePoint permissions are a mess,

250
00:12:51,000 --> 00:12:53,400
Copilot will simply surface your mess faster.

251
00:12:53,400 --> 00:12:56,400
Standard is also where you get predictable operational telemetry.

252
00:12:56,400 --> 00:12:59,000
Consumption will scale, but you're still living inside a model

253
00:12:59,000 --> 00:13:03,400
where runs are scattered and visibility can degrade into we think it happened.

254
00:13:03,400 --> 00:13:07,600
Standard makes it easier to treat this like any other production integration workload.

255
00:13:07,600 --> 00:13:12,800
Baselines, alerts, diagnostics, correlation IDs, and incident response.

256
00:13:12,800 --> 00:13:14,000
So the rule is simple.

257
00:13:14,000 --> 00:13:18,400
If the workflow touches candidate screening, onboarding, provisioning, or employee case data,

258
00:13:18,400 --> 00:13:21,000
standard is the default, not because it's fancy,

259
00:13:21,000 --> 00:13:24,800
but because it's the only model that lets you enforce isolation, bind identity,

260
00:13:24,800 --> 00:13:27,400
and produce audit grade evidence without duct tape.

261
00:13:27,400 --> 00:13:30,600
And once you accept that, the governance layer stops being optional.

262
00:13:30,600 --> 00:13:34,400
It becomes the thing that keeps the agent from becoming a shadow admin with good grammar.

263
00:13:34,400 --> 00:13:38,400
Governance model, boundaries, roles, and the action space.

264
00:13:38,400 --> 00:13:39,800
Governance is not a committee.

265
00:13:39,800 --> 00:13:40,600
It's not a PDF.

266
00:13:40,600 --> 00:13:44,400
It's not an annual training module everyone clicks through while eating lunch.

267
00:13:44,400 --> 00:13:47,800
Governance is the set of constraints that make bad behavior impossible at scale.

268
00:13:47,800 --> 00:13:51,600
And in an agentic HR system, the governance unit is not the agent.

269
00:13:51,600 --> 00:13:53,400
It's the action space.

270
00:13:53,400 --> 00:13:57,200
Action space means the exact set of operations the agent can perform,

271
00:13:57,200 --> 00:13:59,800
which tools exist, what parameters they accept,

272
00:13:59,800 --> 00:14:02,400
what systems they can touch, and what they can change.

273
00:14:02,400 --> 00:14:05,800
Not what it can say, what it can do, because in HR words are cheap,

274
00:14:05,800 --> 00:14:08,400
actions create liability.

275
00:14:08,400 --> 00:14:10,600
This is where most deployments decay.

276
00:14:10,600 --> 00:14:15,400
They start with intent, help employees, reduce tickets, speed up hiring,

277
00:14:15,400 --> 00:14:17,400
and they end with capability sprawl.

278
00:14:17,400 --> 00:14:21,000
Somebody adds one more connector, one more action group, one more exception.

279
00:14:21,000 --> 00:14:24,200
Then six months later, you have an HR agent that can read too much,

280
00:14:24,200 --> 00:14:26,200
write too much, and nobody remembers why.

281
00:14:26,200 --> 00:14:29,600
So the governance model starts with boundaries. First boundary, environments.

282
00:14:29,600 --> 00:14:33,200
You don't test HR agents in production because you're just trying something.

283
00:14:33,200 --> 00:14:36,800
You build in dev, validate in test, and promote to prod with policy parity.

284
00:14:36,800 --> 00:14:40,000
Otherwise, you're not shipping an agent, you're shipping drift.

285
00:14:40,000 --> 00:14:42,000
Second boundary, tool groups.

286
00:14:42,000 --> 00:14:45,600
Candidate screening tools don't live next to onboarding provisioning tools.

287
00:14:45,600 --> 00:14:48,400
Ticket triage tools don't live next to offer letter generation.

288
00:14:48,400 --> 00:14:50,000
Separation is not bureaucracy.

289
00:14:50,000 --> 00:14:52,000
Separation is blast radius management.

290
00:14:52,000 --> 00:14:54,400
Third boundary, data surfaces.

291
00:14:54,400 --> 00:14:58,400
SharePoint folders full of resumes with random naming and no metadata discipline

292
00:14:58,400 --> 00:14:59,400
are not a data source.

293
00:14:59,400 --> 00:15:00,400
They're an entropy source.

294
00:15:00,400 --> 00:15:04,400
If the agent retrieves from it, it will amplify whatever chaos you stored there.

295
00:15:04,400 --> 00:15:10,000
Governance means you curate sources, enforce permissions, and decide what grounding actually means for HR.

296
00:15:10,000 --> 00:15:11,000
Now roles.

297
00:15:11,000 --> 00:15:15,600
In most companies, HR automation fails because everyone shares superpowers.

298
00:15:15,600 --> 00:15:18,600
The recruiter can change workflows, the maker can publish agents,

299
00:15:18,600 --> 00:15:21,800
the platform admin can see everything, then everyone can do everything.

300
00:15:21,800 --> 00:15:23,000
And nobody owns anything.

301
00:15:23,000 --> 00:15:26,200
So define roles like the system will punish you if you get them wrong.

302
00:15:26,200 --> 00:15:29,200
Recruiters and hiring teams, they should initiate screening workflows,

303
00:15:29,200 --> 00:15:31,000
request summaries and trigger approvals.

304
00:15:31,000 --> 00:15:33,800
They should not edit the scoring rubric logic in production.

305
00:15:33,800 --> 00:15:35,200
They should not add new tools.

306
00:15:35,200 --> 00:15:36,800
They should not bypass gates.

307
00:15:36,800 --> 00:15:41,000
HR operations, they own case processes, ticket routing, onboarding milestones

308
00:15:41,000 --> 00:15:42,600
and service delivery metrics.

309
00:15:42,600 --> 00:15:45,000
They can tune routing rules and knowledge boundaries.

310
00:15:45,000 --> 00:15:48,000
They do not get to expand the action space without review

311
00:15:48,000 --> 00:15:53,200
because just one more connector is how you end up with payroll data in the wrong place.

312
00:15:53,200 --> 00:15:59,000
Platform admins, they own environments, identity plumbing, connector governance and logging.

313
00:15:59,000 --> 00:16:03,600
They should not be the people deciding hiring criteria or writing screening rubrics.

314
00:16:03,600 --> 00:16:08,200
When the same group controls policy and implementation, exceptions become invisible.

315
00:16:08,200 --> 00:16:12,400
And invisible exceptions are how bias becomes a systems feature. Security and compliance.

316
00:16:12,400 --> 00:16:14,000
They don't approve the agent here.

317
00:16:14,000 --> 00:16:16,600
They approve the action space and the evidence plan.

318
00:16:16,600 --> 00:16:17,600
What gets logged?

319
00:16:17,600 --> 00:16:19,800
Where it's retained, how correlation works,

320
00:16:19,800 --> 00:16:23,000
and how fast you can disable pathways when something goes wrong.

321
00:16:23,000 --> 00:16:26,000
Now enforce those roles with Entra, not good intentions.

322
00:16:26,000 --> 00:16:29,000
Every tool call should bind to an identity boundary.

323
00:16:29,000 --> 00:16:31,400
For logic apps, that means managed identities.

324
00:16:31,400 --> 00:16:35,800
For agent access, that means explicit permissions and conditional access where it matters.

325
00:16:35,800 --> 00:16:40,600
For Copilot Studio agents, it means no "everyone can use this agent" default.

326
00:16:40,600 --> 00:16:45,000
You publish to audiences deliberately and you treat broad access as a privileged event.

327
00:16:45,000 --> 00:16:47,200
And you build least privilege at the tool level.

328
00:16:47,200 --> 00:16:50,000
Read tools and write tools are not the same.

329
00:16:50,000 --> 00:16:53,400
Get candidate profiles is not the same as create candidate record.

330
00:16:53,400 --> 00:16:56,600
Draft offer letter is not the same as send offer letter.

331
00:16:56,600 --> 00:17:01,200
If your action space doesn't separate these, you are relying on the model to self-regulate.

332
00:17:01,200 --> 00:17:05,600
It won't, it can't, it doesn't have incentives, it has probabilities.

333
00:17:05,600 --> 00:17:09,200
Then there's the part nobody wants but everyone needs: the kill switch.

334
00:17:09,200 --> 00:17:12,600
You design every agent workflow with the assumption that you will disable it,

335
00:17:12,600 --> 00:17:16,400
not because you expect failure, but because entropy always finds a path.

336
00:17:16,400 --> 00:17:19,800
A kill switch is the ability to shut down tool invocation fast.

337
00:17:19,800 --> 00:17:23,400
Disable the logic app workflow, revoke the managed identity permissions,

338
00:17:23,400 --> 00:17:27,000
unpublish the agent, or block access via conditional access.

339
00:17:27,000 --> 00:17:31,000
Pick at least two layers because relying on one control is how outages become incidents.

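The "pick at least two layers" rule can be modeled as independent flags that must all be on before any tool invocation proceeds. This is a conceptual sketch, not a mapping to specific Azure controls; the layer names are illustrative.

```python
class KillSwitch:
    """Two independent disable layers: either one off stops invocation."""

    def __init__(self) -> None:
        self.agent_published = True    # layer 1: e.g. agent publish state
        self.tools_enabled = True      # layer 2: e.g. workflow/identity enablement

    def disable(self) -> None:
        # Flip both layers at once; restoring either should require a
        # post-incident review, never a quick fix-forward in production.
        self.agent_published = False
        self.tools_enabled = False

    def can_invoke(self) -> bool:
        return self.agent_published and self.tools_enabled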
340
00:17:31,000 --> 00:17:34,200
And when you hit the kill switch, you don't fix forward in production.

341
00:17:34,200 --> 00:17:37,000
You restore capability only after review.

342
00:17:37,000 --> 00:17:38,000
What happened?

343
00:17:38,000 --> 00:17:39,400
What evidence exists?

344
00:17:39,400 --> 00:17:40,600
What control failed?

345
00:17:40,600 --> 00:17:42,400
And what boundary needs tightening?

346
00:17:42,400 --> 00:17:44,000
This is the core governance truth.

347
00:17:44,000 --> 00:17:48,800
Makers will build, teams will copy, exceptions will accumulate, policies will drift.

348
00:17:48,800 --> 00:17:51,800
Your job is to make drift visible, constrainable and reversible.

349
00:17:51,800 --> 00:17:53,800
That's what action space governance actually is.

350
00:17:53,800 --> 00:17:55,800
Compliance hooks without legal theater.

351
00:17:55,800 --> 00:17:57,600
Oversight, notice, evidence.

352
00:17:57,600 --> 00:18:01,200
Most teams hear "AI compliance" and immediately do one of two things.

353
00:18:01,200 --> 00:18:06,400
They either panic and stop, or they paste a disclaimer into the chat window and call it governance.

354
00:18:06,400 --> 00:18:08,200
Neither works.

355
00:18:08,200 --> 00:18:11,200
Compliance in an agentic HR system is not a statement.

356
00:18:11,200 --> 00:18:12,800
It's an operating capability.

357
00:18:12,800 --> 00:18:17,400
Oversight that can intervene, notice that is consistent, and evidence that survives scrutiny.

358
00:18:17,400 --> 00:18:19,400
Start with oversight.

359
00:18:19,400 --> 00:18:23,000
The laws people reference, NYC Local Law 144 in hiring,

361
00:18:23,000 --> 00:18:26,000
Colorado's AI Act for high-risk systems,

362
00:18:26,000 --> 00:18:27,800
are not asking you to become a lawyer.

362
00:18:27,800 --> 00:18:31,200
They are forcing you to behave like an operator of a decision system.

363
00:18:31,200 --> 00:18:33,000
Which means you must be able to show

364
00:18:33,000 --> 00:18:37,800
who reviewed the outcome, when humans took over, and how the system prevented harm when it was uncertain.

365
00:18:37,800 --> 00:18:40,000
That's why human in the loop isn't just a safety net.

366
00:18:40,000 --> 00:18:41,200
It's the governance primitive.

367
00:18:41,200 --> 00:18:45,800
If your candidate screening agent can score a resume and shortlist a person without a gate,

368
00:18:45,800 --> 00:18:50,400
you've effectively built an automated employment decision tool with no supervision story.

369
00:18:50,400 --> 00:18:53,000
And the moment a candidate challenges the outcome,

370
00:18:53,000 --> 00:18:56,000
your "but the recruiter had final say" argument collapses,

371
00:18:56,000 --> 00:18:59,200
unless you can prove the recruiter actually did something meaningful.

372
00:18:59,200 --> 00:19:00,600
So oversight has to be real.

373
00:19:00,600 --> 00:19:01,400
It has to be measurable.

374
00:19:01,400 --> 00:19:03,000
It has to leave artifacts.

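A human-in-the-loop gate that "leaves artifacts" could look like the sketch below: an influencing action below a confidence threshold cannot complete without a recorded approval, and even the auto path returns a record. The threshold value and field names are illustrative assumptions.

```python
from typing import Optional

def gate(confidence: float, approval: Optional[dict],
         threshold: float = 0.9) -> dict:
    """Return an auditable outcome record; raise if HITL is required but missing."""
    if confidence >= threshold and approval is None:
        # auto path still produces an artifact showing the gate was evaluated
        return {"outcome": "auto", "approver": None}
    if approval is None:
        raise RuntimeError("HITL required: no approval artifact recorded")
    return {"outcome": "approved",
            "approver": approval["user"],
            "reason_code": approval.get("reason_code")}
```

The design choice worth noting: the function never silently proceeds, so "the recruiter had final say" is provable from the returned artifacts rather than asserted after the fact.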
375
00:19:03,000 --> 00:19:05,600
Notice is next and this is where most teams get weird.

376
00:19:05,600 --> 00:19:09,600
They either over-lawyer it and turn the employee experience into a compliance pop-up parade

377
00:19:09,600 --> 00:19:12,200
or they avoid it entirely and hope nobody asks.

378
00:19:12,200 --> 00:19:15,400
But notice is not optional in the direction the world is moving.

379
00:19:15,400 --> 00:19:17,400
The practical approach is simple.

380
00:19:17,400 --> 00:19:21,600
Define when the system is assisting versus influencing.

381
00:19:21,600 --> 00:19:26,400
Assisting means it's drafting, summarizing, retrieving policy or preparing an information package.

382
00:19:26,400 --> 00:19:32,800
Influencing means it's scoring, ranking, recommending, routing, or changing access and records.

383
00:19:32,800 --> 00:19:34,200
Those are different risk profiles.

384
00:19:34,200 --> 00:19:35,400
Treat them differently.

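The assisting-versus-influencing split can be enforced mechanically rather than by convention, for instance by classifying tool calls by verb. The verb list below is an illustrative assumption, not a legal taxonomy.

```python
# Actions whose verbs mark them as influencing (disclosure attaches to these).
INFLUENCING_VERBS = {"score", "rank", "recommend", "route", "change"}

def requires_notice(action: str) -> bool:
    """Influencing actions trigger the notice mechanism; assisting ones don't."""
    verb = action.split("_", 1)[0]
    return verb in INFLUENCING_VERBS
```

Tying disclosure to the tool name like this means notice follows the workflow, not the personality of whoever wrote the prompt.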
385
00:19:35,400 --> 00:19:39,200
When the system influences, you implement consistent user-facing disclosure.

386
00:19:39,200 --> 00:19:42,800
In screening, that might be candidate-facing in regulated jurisdictions.

387
00:19:42,800 --> 00:19:47,400
In internal HR ops, it might be employee-facing when a ticket is classified and auto-resolved.

388
00:19:47,400 --> 00:19:48,600
The point isn't the exact wording.

389
00:19:48,600 --> 00:19:53,000
The point is that you can demonstrate you had a repeatable notice mechanism tied to the workflow,

390
00:19:53,000 --> 00:19:55,200
not the personality of whoever wrote the prompt.

391
00:19:55,200 --> 00:19:56,200
Now evidence.

392
00:19:56,200 --> 00:19:59,000
This is the part everyone promises and nobody actually builds.

393
00:19:59,000 --> 00:20:02,600
An auditor does not care that your agent usually does the right thing.

394
00:20:02,600 --> 00:20:04,000
They ask for a single case.

395
00:20:04,000 --> 00:20:08,400
Show me what happened end to end for this candidate on this date under this policy version

396
00:20:08,400 --> 00:20:10,200
with this approver and this final outcome.

397
00:20:10,200 --> 00:20:12,200
So your evidence model has to be explicit.

398
00:20:12,200 --> 00:20:15,200
At minimum, you need five things captured and correlated.

399
00:20:15,200 --> 00:20:19,400
One, the user request and the system prompt context that drove the decision,

400
00:20:19,400 --> 00:20:22,400
not for performance theater, but for accountability.

401
00:20:22,400 --> 00:20:24,400
Two, the retrieved sources.

402
00:20:24,400 --> 00:20:26,600
If the agent grounded on a SharePoint document,

403
00:20:26,600 --> 00:20:29,000
store the document ID, version, and metadata.

404
00:20:29,000 --> 00:20:33,000
If it used people data, store what directory attributes it referenced.

405
00:20:33,000 --> 00:20:36,000
Otherwise, you can't answer the basic challenge: based on what?

406
00:20:36,000 --> 00:20:43,200
Three, the tool calls, MCP tool name, input payload, output payload, and the logic app run ID.

407
00:20:43,200 --> 00:20:46,000
The system did not just "decide". It called a tool.

408
00:20:46,000 --> 00:20:48,400
Tools leave traces, capture them.

409
00:20:48,400 --> 00:20:55,200
Four, the state transitions in Dataverse: candidate created, score artifact written, HITL request issued,

410
00:20:55,200 --> 00:20:59,000
approval received, override reason code recorded, shortlist updated.

411
00:20:59,000 --> 00:21:00,000
These are not chat events.

412
00:21:00,000 --> 00:21:01,200
They are business events.

413
00:21:01,200 --> 00:21:05,400
Five, the human actions: who approved, who rejected, who overrode.

414
00:21:05,400 --> 00:21:08,200
What reason they gave and whether they had the right role to do it.

415
00:21:08,200 --> 00:21:13,800
Then you do all of this with correlation IDs that survive across Copilot Studio, MCP, Logic Apps, and Dataverse.

416
00:21:13,800 --> 00:21:16,200
If you can't stitch it together, you don't have evidence.

417
00:21:16,200 --> 00:21:17,200
You have scattered logs.
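
The five captured-and-correlated items above can be sketched as a single record. This is an illustrative shape, not a real Dataverse schema — all field names are assumptions; the point is that evidence only counts when every layer's identifier is present and stitched together:

```python
from dataclasses import dataclass, field

# Hypothetical evidence record correlating one agent decision across layers.
# Field names are illustrative, not an actual Dataverse table definition.
@dataclass
class EvidenceRecord:
    case_id: str                 # the Dataverse case that anchors everything
    copilot_session_id: str      # conversation where the request originated
    mcp_invocation_id: str       # MCP tool call the model issued
    logic_app_run_id: str        # Logic Apps run that did the work
    prompt_context_ref: str      # pointer to stored prompt/context, not raw text
    retrieved_sources: list = field(default_factory=list)  # doc IDs + versions
    state_transitions: list = field(default_factory=list)  # business events
    human_actions: list = field(default_factory=list)      # approvals/overrides

def stitched(record: EvidenceRecord) -> bool:
    """Evidence only counts if every layer's ID is present and correlated."""
    return all([record.case_id, record.copilot_session_id,
                record.mcp_invocation_id, record.logic_app_run_id])
```

If `stitched` returns False for a case, you have scattered logs, not a chain.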

418
00:21:17,200 --> 00:21:19,400
A lot of teams worry they're logging too much.

419
00:21:19,400 --> 00:21:21,400
Good. That means they finally understand the risk.

420
00:21:21,400 --> 00:21:22,400
So be deliberate.

421
00:21:22,400 --> 00:21:25,000
Don't dump raw resumes and medical notes into logs.

422
00:21:25,000 --> 00:21:27,200
Store references and structured artifacts.

423
00:21:27,200 --> 00:21:29,200
Use secure inputs and outputs in workflows,

424
00:21:29,200 --> 00:21:31,200
so run history doesn't become a data leak.

425
00:21:31,200 --> 00:21:34,800
Define retention as a policy decision, not an accident of defaults.

426
00:21:34,800 --> 00:21:38,000
Evidence that can't be retained safely is just future liability.

427
00:21:38,000 --> 00:21:39,400
And here's the final constraint.

428
00:21:39,400 --> 00:21:41,800
Never claim compliance; claim alignment.

429
00:21:41,800 --> 00:21:46,800
You are implementing oversight, notice, and evidence patterns that map to emerging requirements.

430
00:21:46,800 --> 00:21:51,000
You're building a system that can be audited, challenged and improved without rewriting history.

431
00:21:51,000 --> 00:21:54,000
That's what "compliance hooks" actually means in architecture terms.

432
00:21:54,000 --> 00:21:57,000
Now we can implement the first concrete control pattern.

433
00:21:57,000 --> 00:21:59,800
HITL confidence gates as deterministic stops.

434
00:21:59,800 --> 00:22:03,200
HITL pattern: confidence gates as deterministic stops.

435
00:22:03,200 --> 00:22:05,400
Human in the loop is not a moral preference.

436
00:22:05,400 --> 00:22:06,800
It's an architectural circuit breaker.

437
00:22:06,800 --> 00:22:09,200
The point of HITL isn't to make the agent humble.

438
00:22:09,200 --> 00:22:14,600
The point is to make irreversible actions impossible when the system certainty drops below a defined bar.

439
00:22:14,600 --> 00:22:18,800
That bar can be a confidence threshold, a risk category, a policy condition or all three.

440
00:22:18,800 --> 00:22:21,200
But it has to be explicit, enforced and logged.

441
00:22:21,200 --> 00:22:23,000
So here's the pattern. Confidence gates.

442
00:22:23,000 --> 00:22:26,600
A confidence gate is a deterministic stop in an otherwise autonomous workflow.

443
00:22:26,600 --> 00:22:28,800
The agent can reason, retrieve, draft, and propose.

444
00:22:28,800 --> 00:22:32,400
But if the gate conditions are met, it must pause and request a human decision

445
00:22:32,400 --> 00:22:35,000
before any further tool calls that change state.

446
00:22:35,000 --> 00:22:40,400
Notice what that does. It converts a probabilistic model output into a controlled workflow step with ownership.

447
00:22:40,400 --> 00:22:44,200
Now, teams usually implement this backward. They do HITL "when it feels risky."

448
00:22:44,200 --> 00:22:47,600
That translates to HITL never, until after the incident.

449
00:22:47,600 --> 00:22:49,200
You need mechanical triggers.

450
00:22:49,200 --> 00:22:51,600
The most obvious trigger is a numeric confidence score.

451
00:22:51,600 --> 00:22:54,600
For example, anything below 75 escalates.

452
00:22:54,600 --> 00:22:56,200
That doesn't mean the model is measuring truth.

453
00:22:56,200 --> 00:23:00,600
It means you're defining a threshold where the system must stop pretending it can decide.

454
00:23:00,600 --> 00:23:05,600
And you pick that number by testing, run historical cases, measure how often the agent gets it wrong,

455
00:23:05,600 --> 00:23:09,800
and set the gate where the cost of a mistake exceeds the cost of a review.

456
00:23:09,800 --> 00:23:13,200
But confidence scores alone are fragile, because models can be confidently wrong.

457
00:23:13,200 --> 00:23:14,800
So you add policy-based gates.

458
00:23:14,800 --> 00:23:17,200
In candidate screening, you gate on:

459
00:23:17,200 --> 00:23:21,600
incomplete evidence, missing rubric fields, conflicting criteria, or detected proxy risk.

460
00:23:21,600 --> 00:23:27,600
In ticket triage, you gate on: requests that involve pay, leave disputes, medical accommodations,

461
00:23:27,600 --> 00:23:30,800
or anything that triggers an employee relations workflow.

462
00:23:30,800 --> 00:23:35,200
In onboarding, you gate on license assignment, group membership, access to sensitive systems,

463
00:23:35,200 --> 00:23:38,000
or any action that can create access you can't easily unwind.

464
00:23:38,000 --> 00:23:41,200
That's how you build a risk-tiered HITL model without pretending to be a lawyer.
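
Combining the numeric threshold and the policy-based gates, a deterministic gate check might look like this. The threshold value and the flag names are illustrative assumptions — in practice you tune them against historical cases, as described above:

```python
# Sketch of a deterministic confidence gate. Threshold and policy flag names
# are illustrative; set them by testing against historical cases.
CONFIDENCE_THRESHOLD = 0.75
POLICY_GATES = {"incomplete_evidence", "missing_rubric_fields",
                "conflicting_criteria", "proxy_risk"}

def must_escalate(confidence: float, flags: set) -> bool:
    """Pause for human review when certainty drops or a policy condition fires."""
    return confidence < CONFIDENCE_THRESHOLD or bool(flags & POLICY_GATES)
```

Either trigger alone is enough to stop the workflow; the gate is an OR, never a weighted average.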

465
00:23:41,200 --> 00:23:42,800
Next is the approval object.

466
00:23:42,800 --> 00:23:46,400
A HITL pause is useless if it's just "ask a human in chat."

467
00:23:46,400 --> 00:23:49,600
You need a structured approval artifact that records who approved,

468
00:23:49,600 --> 00:23:52,400
what they saw, what decision they made, and why.

469
00:23:52,400 --> 00:23:55,200
That artifact lives in Dataverse, not in a Teams message.

470
00:23:55,200 --> 00:23:57,000
Teams is the notification channel.

471
00:23:57,000 --> 00:23:58,000
Dataverse is the record.

472
00:23:58,000 --> 00:24:03,400
So define an approval object with fields that force accountability, case ID, decision type,

473
00:24:03,400 --> 00:24:09,600
proposed action, evidence links, confidence score, gate reason, approver identity, time stamp,

474
00:24:09,600 --> 00:24:13,600
outcome, override reason code, and a free text justification field,

475
00:24:13,600 --> 00:24:19,200
but only after the structured reason code, because otherwise humans will write "looks good" and call it oversight.
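
The approval object described above can be modeled so the structured reason code is enforced before any free text. This is a sketch under assumed field names, not a real Dataverse table; the reason-code set is hypothetical:

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative reason codes; a real deployment defines these per decision type.
REASON_CODES = {"rubric_met", "compensating_skills", "policy_exception",
                "evidence_insufficient"}

@dataclass
class ApprovalRecord:
    case_id: str
    decision_type: str
    proposed_action: str
    evidence_links: list
    confidence_score: float
    gate_reason: str
    approver_id: str
    timestamp: str
    outcome: str                          # "approved" | "rejected" | "overridden"
    reason_code: str                      # structured code is mandatory...
    justification: Optional[str] = None   # ...free text is allowed only after it

    def __post_init__(self):
        # Refuse "looks good"-style approvals: the code must come from the set.
        if self.reason_code not in REASON_CODES:
            raise ValueError("structured reason code required before free text")
```

Making the record refuse construction without a valid code is what turns oversight from a habit into a constraint.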

476
00:24:19,200 --> 00:24:20,600
What does the approver see?

477
00:24:20,600 --> 00:24:22,000
Not the entire chat transcript.

478
00:24:22,000 --> 00:24:24,200
That's how you leak data and drown the reviewer.

479
00:24:24,200 --> 00:24:29,000
The reviewer gets a context package, the structured rubric, the retrieved sources with links and versions,

480
00:24:29,000 --> 00:24:31,600
and the specific recommendation the agent proposes.

481
00:24:31,600 --> 00:24:34,400
If the agent can't produce that package, it hasn't earned autonomy.

482
00:24:34,400 --> 00:24:36,000
It escalates by default.

483
00:24:36,000 --> 00:24:38,400
Now define borderline case like you mean it.

484
00:24:38,400 --> 00:24:40,600
A borderline case is not "I'm not sure".

485
00:24:40,600 --> 00:24:42,800
It's a set of conditions you can test.

486
00:24:42,800 --> 00:24:44,000
Example for screening.

487
00:24:44,000 --> 00:24:49,000
Candidate meets 70 to 85% of required rubric points, or fails one must-have,

488
00:24:49,000 --> 00:24:53,600
but exceeds in compensating skills or the job requirement text maps to multiple interpretations.

489
00:24:53,600 --> 00:24:56,400
You don't need perfection, you need repeatability.
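
Those borderline conditions are testable, which is the whole point. A minimal sketch, assuming a scored rubric result (the 0.70–0.85 band mirrors the example above and should be tuned with data):

```python
# Deterministic borderline detection over a scored rubric result.
# Band edges and the compensating-skills rule are illustrative assumptions.
def is_borderline(score_pct: float, missed_must_haves: int,
                  strong_should_haves: int) -> bool:
    in_gray_band = 0.70 <= score_pct <= 0.85          # small differences flip outcome
    compensated_miss = missed_must_haves == 1 and strong_should_haves >= 2
    return in_gray_band or compensated_miss
```

Repeatability is the target: the same candidate data always produces the same escalation decision.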

490
00:24:56,400 --> 00:25:01,000
And when a human overrides the agent, approve a borderline candidate, reject a recommended one,

491
00:25:01,000 --> 00:25:02,800
that override becomes a signal.

492
00:25:02,800 --> 00:25:05,600
It's evidence of drift, bias, or rubric failure.

493
00:25:05,600 --> 00:25:08,200
You capture it, aggregate it, and review it monthly.

494
00:25:08,200 --> 00:25:09,600
Overrides are not noise.

495
00:25:09,600 --> 00:25:12,600
Overrides are where the system tells you it's misaligned with intent.

496
00:25:12,600 --> 00:25:15,200
Now here's where most people accidentally kill the value.

497
00:25:15,200 --> 00:25:16,400
Rubber stamping.

498
00:25:16,400 --> 00:25:21,400
If the approval UI makes approve the easy button and reject the annoying one,

499
00:25:21,400 --> 00:25:25,200
humans will approve everything and your HITL becomes compliance theatre.

500
00:25:25,200 --> 00:25:28,600
So you enforce friction, require reason codes on approval and override,

501
00:25:28,600 --> 00:25:31,600
require selection of which rubric criterion justified the decision,

502
00:25:31,600 --> 00:25:34,800
require a note when the decision contradicts the recommendation,

503
00:25:34,800 --> 00:25:36,600
and limit who can approve which actions,

504
00:25:36,600 --> 00:25:38,600
hiring managers approve shortlist additions,

505
00:25:38,600 --> 00:25:40,800
HR-Ops approves policy exceptions,

506
00:25:40,800 --> 00:25:42,800
platform admins approve tool expansion,

507
00:25:42,800 --> 00:25:44,200
nobody approves everything.

508
00:25:44,200 --> 00:25:46,600
Finally, treat HITL as a stop, not a detour.

509
00:25:46,600 --> 00:25:49,600
When the system pauses, it should freeze the workflow state and wait.

510
00:25:49,600 --> 00:25:52,600
No parallel tool calls, no keep going and will review later.

511
00:25:52,600 --> 00:25:55,200
If you let actions proceed, you didn't implement HITL.

512
00:25:55,200 --> 00:25:57,800
You implemented a notification, so the payoff is simple.

513
00:25:57,800 --> 00:26:01,400
Autonomy where mistakes are reversible, friction where mistakes are irreversible,

514
00:26:01,400 --> 00:26:05,400
and a decision trail that survives scrutiny because it's made of records, not stories.

515
00:26:05,400 --> 00:26:06,400
Now we prove it.

516
00:26:06,400 --> 00:26:11,200
Observability and audit evidence across prompts, actions, overrides and state.

517
00:26:11,200 --> 00:26:15,200
Observability and audit evidence: prompts, actions, overrides, and state.

518
00:26:15,200 --> 00:26:18,000
If you remember nothing else from this episode, remember this.

519
00:26:18,000 --> 00:26:22,600
An HR agent without observability is just a liability generator with a chat window

520
00:26:22,600 --> 00:26:25,600
because the failure mode isn't the agent answered wrong.

521
00:26:25,600 --> 00:26:29,400
The failure mode is the agent acted and you can't reconstruct why.

522
00:26:29,400 --> 00:26:32,200
And in HR, reconstruction is not a technical hobby.

523
00:26:32,200 --> 00:26:33,600
It's your defensibility story.

524
00:26:33,600 --> 00:26:37,400
So what does audit-grade observability actually mean in an agentic workflow?

525
00:26:37,400 --> 00:26:40,600
It means you can produce a complete chain of evidence for any case,

526
00:26:40,600 --> 00:26:44,600
from the original user intent to the sources retrieved, to the tools invoked,

527
00:26:44,600 --> 00:26:49,000
to the data that changed, to the human approvals and overrides that allowed it to proceed.

528
00:26:49,000 --> 00:26:50,400
Not we have some logs.

529
00:26:50,400 --> 00:26:51,400
A chain.

530
00:26:51,400 --> 00:26:53,000
Start with prompts and context.

531
00:26:53,000 --> 00:26:56,400
You need to capture the user's input, plus the relevant system instructions

532
00:26:56,400 --> 00:26:58,600
that shape the agent's behavior at that moment.

533
00:26:58,600 --> 00:27:00,800
Not because you love storing chat transcripts,

534
00:27:00,800 --> 00:27:04,400
because when someone challenges an outcome, the first question is always,

535
00:27:04,400 --> 00:27:07,200
what did the user ask and what did the system assume?

536
00:27:07,200 --> 00:27:08,400
But don't get sloppy.

537
00:27:08,400 --> 00:27:11,600
Prompts can contain PII, so you don't default to raw dumps.

538
00:27:11,600 --> 00:27:16,000
You store what's necessary for reconstruction and you redact what creates unnecessary exposure.

539
00:27:16,000 --> 00:27:20,400
This is where secure inputs and outputs in Logic Apps stop being a checkbox

540
00:27:20,400 --> 00:27:22,400
and start being a survival feature.

541
00:27:22,400 --> 00:27:24,400
Next, retrieval evidence.

542
00:27:24,400 --> 00:27:28,000
If the agent grounded its response or decision on SharePoint resumes,

543
00:27:28,000 --> 00:27:30,800
policy documents or people directory data,

544
00:27:30,800 --> 00:27:34,200
you must store the references, not just the generated text.

545
00:27:34,200 --> 00:27:37,800
That means document IDs, version IDs, and metadata

546
00:27:37,800 --> 00:27:41,600
that let you prove what the agent saw. Otherwise, you get into the worst kind of argument:

547
00:27:41,600 --> 00:27:44,600
The policy changed after the decision and you can't prove it didn't.

548
00:27:44,600 --> 00:27:46,400
Now tool calls.

549
00:27:46,400 --> 00:27:48,600
Your model doesn't do onboarding.

550
00:27:48,600 --> 00:27:51,400
It calls a tool that provisions accounts.

551
00:27:51,400 --> 00:27:53,000
Your model doesn't route tickets.

552
00:27:53,000 --> 00:27:54,800
It calls a tool that assigns a case.

553
00:27:54,800 --> 00:27:57,200
Those tool calls are where reality happens.

554
00:27:57,200 --> 00:28:02,000
So log each tool call with tool name, input payload shape, output summary,

555
00:28:02,000 --> 00:28:05,000
and the execution identifiers from your downstream platform.

556
00:28:05,000 --> 00:28:08,800
In our stack, that means correlating Copilot Studio session identifiers

557
00:28:08,800 --> 00:28:13,600
to MCP tool invocations, to Logic Apps run IDs, to Dataverse transaction IDs.

558
00:28:13,600 --> 00:28:15,600
Correlation IDs aren't nice to have.

559
00:28:15,600 --> 00:28:18,400
They are the only way you can stitch a conversation to an action.

560
00:28:18,400 --> 00:28:21,400
Now let's talk about overrides because overrides are where most teams

561
00:28:21,400 --> 00:28:23,200
accidentally delete their own credibility.

562
00:28:23,200 --> 00:28:26,600
A human override must be treated as a first class event, not a footnote.

563
00:28:26,600 --> 00:28:30,600
Store who overrode what they overrode, the reason code, and the justification.

564
00:28:30,600 --> 00:28:34,000
And crucially, store whether the override contradicted the agent recommendation

565
00:28:34,000 --> 00:28:35,200
or simply confirmed it.

566
00:28:35,200 --> 00:28:37,800
Why? Because override rates are how you detect drift.

567
00:28:37,800 --> 00:28:41,400
A rising override rate means the rubric is failing, the knowledge sources are stale,

568
00:28:41,400 --> 00:28:43,400
or the agent is routing edge cases badly.

569
00:28:43,400 --> 00:28:46,400
A near zero override rate might mean the agent is perfect.

570
00:28:46,400 --> 00:28:48,200
Or it might mean humans are rubber stamping.
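
Treating overrides as first-class events makes this measurable. A sketch of the monthly aggregation, assuming each logged override event carries a `contradicted_agent` flag and a `reason_code` (both illustrative names):

```python
from collections import Counter

# Monthly override review sketch: overrides aggregated as a drift signal.
# Event shape is an assumption: {"contradicted_agent": bool, "reason_code": str}.
def override_report(events: list) -> dict:
    contradictions = [e for e in events if e["contradicted_agent"]]
    rate = len(contradictions) / len(events) if events else 0.0
    return {
        "override_rate": rate,   # rising rate => rubric, sources, or routing drift
        "top_reasons": Counter(e["reason_code"]
                               for e in contradictions).most_common(3),
    }
```

A rate near zero is ambiguous on its own, which is why the contradicted/confirmed distinction must be stored with each event.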

571
00:28:48,200 --> 00:28:51,400
You don't get to assume which one; you instrument it. Then, state.

572
00:28:51,400 --> 00:28:54,600
The system needs a durable case record that captures progression over time,

573
00:28:54,600 --> 00:28:57,800
candidate screening from intake to shortlist, ticket triage

574
00:28:57,800 --> 00:29:02,800
from submission to resolution, onboarding from offer accepted to day-30 milestones.

575
00:29:02,800 --> 00:29:06,800
Chat transcripts don't model state, they model conversation.

576
00:29:06,800 --> 00:29:11,400
Dataverse models state: case objects, milestones, approvals, and artifacts

577
00:29:11,400 --> 00:29:14,200
that survive retries, restarts, and long-running waits.

578
00:29:14,200 --> 00:29:17,000
That's what makes your workflows deterministic at scale because your agent

579
00:29:17,000 --> 00:29:18,600
doesn't need to remember.

580
00:29:18,600 --> 00:29:21,000
It can read current state and act accordingly.

581
00:29:21,000 --> 00:29:23,200
Now, retention. Logs aren't free.

582
00:29:23,200 --> 00:29:26,400
They aren't free operationally and they definitely aren't free legally.

583
00:29:26,400 --> 00:29:28,200
Decide retention deliberately.

584
00:29:28,200 --> 00:29:31,400
What evidence must be retained, where it lives, and for how long?

585
00:29:31,400 --> 00:29:33,600
Tie it to your HR and compliance policies.

586
00:29:33,600 --> 00:29:37,800
Don't accidentally retain sensitive prompts forever because nobody set a policy

587
00:29:37,800 --> 00:29:39,600
and the default was keep everything.

588
00:29:39,600 --> 00:29:41,400
Then the metrics that expose failure.

589
00:29:41,400 --> 00:29:44,800
Track escalation rate, HITL trigger rate, override frequency,

590
00:29:44,800 --> 00:29:47,400
unresolved sessions and reopen rates for tickets.

591
00:29:47,400 --> 00:29:51,400
In screening, track borderline volume and decision variance across reviewers.

592
00:29:51,400 --> 00:29:54,800
In onboarding, track retry counts and idempotency violations,

593
00:29:54,800 --> 00:29:58,400
because "provisioned twice" is not an error you want to discover in an audit.

594
00:29:58,400 --> 00:30:02,000
And yes, track latency because slow agents cause humans to bypass them

595
00:30:02,000 --> 00:30:03,400
and bypass is the real enemy.

596
00:30:03,400 --> 00:30:06,800
The point of all of this is audit readiness as an operational posture.

597
00:30:06,800 --> 00:30:08,800
When someone asks what happened, you don't go digging.

598
00:30:08,800 --> 00:30:10,400
You produce the record on demand.

599
00:30:10,400 --> 00:30:14,200
And once you can do that, you can scale the agent without losing control

600
00:30:14,200 --> 00:30:16,800
because the system tells you what it's doing, where it's drifting

601
00:30:16,800 --> 00:30:19,600
and where humans are compensating for design omissions.

602
00:30:19,600 --> 00:30:22,600
Now we can apply this control plane to the three use cases

603
00:30:22,600 --> 00:30:24,800
without changing the architecture every time.

604
00:30:24,800 --> 00:30:28,000
Use case map: three workflows, one control plane.

605
00:30:28,000 --> 00:30:32,000
Now the three use cases, not because they're the only HR automations worth doing

606
00:30:32,000 --> 00:30:35,000
but because they represent three different failure modes.

607
00:30:35,000 --> 00:30:39,400
High risk decisions, high volume operations and long running orchestration.

608
00:30:39,400 --> 00:30:42,000
And the point is that the control plane stays the same.

609
00:30:42,000 --> 00:30:43,600
Candidate screening is the volatile one.

610
00:30:43,600 --> 00:30:45,600
It's bias-sensitive, it's legally sensitive,

611
00:30:45,600 --> 00:30:48,600
and it's where people most want to let the model just decide

612
00:30:48,600 --> 00:30:50,000
because the volume is painful.

613
00:30:50,000 --> 00:30:51,600
That's exactly why it fails first.

614
00:30:51,600 --> 00:30:54,400
So the control plane here is strict, structured rubric,

615
00:30:54,400 --> 00:30:55,800
constrained tool calls,

616
00:30:55,800 --> 00:30:58,600
HITL gates on borderline scores, and an evidence trail

617
00:30:58,600 --> 00:31:02,400
that links the score artifact to the exact resume source version.

618
00:31:02,400 --> 00:31:06,000
The agent can summarize and propose, it can't silently rank and move on.

619
00:31:06,000 --> 00:31:10,000
Screening is where you prove your agent isn't an automated discrimination machine

620
00:31:10,000 --> 00:31:11,000
with good UX.

621
00:31:11,000 --> 00:31:13,400
Ticket triage is the safer, higher ROI starter.

622
00:31:13,400 --> 00:31:15,800
It's operationally ugly but it's not usually irreversible.

623
00:31:15,800 --> 00:31:18,000
If the agent misroutes a ticket, you can correct it.

624
00:31:18,000 --> 00:31:21,000
If it auto-resolves a tier-one request with the wrong answer,

625
00:31:21,000 --> 00:31:25,000
you can reopen it, fix the response, and improve the knowledge boundary.

626
00:31:25,000 --> 00:31:27,400
That makes it ideal for scaling early

627
00:31:27,400 --> 00:31:31,800
because you can instrument deflection without gambling on high stakes decisions.

628
00:31:31,800 --> 00:31:35,400
And yes, this is where measurable reductions like the ticket volume drop happen,

629
00:31:35,400 --> 00:31:39,000
not from smarter language but from deterministic classification,

630
00:31:39,000 --> 00:31:44,400
routing and scripted resolution paths that don't require three humans to touch the same case.

631
00:31:44,400 --> 00:31:46,800
Onboarding orchestration is the one that exposes

632
00:31:46,800 --> 00:31:50,000
whether you actually built an agentic workflow system

633
00:31:50,000 --> 00:31:53,400
or just a chat layer with some connectors because onboarding isn't one flow.

634
00:31:53,400 --> 00:31:57,000
It's a sequence of dependent actions across days, accounts, groups, licenses,

635
00:31:57,000 --> 00:32:00,000
equipment requests, training assignments, manager check-ins, and exceptions

636
00:32:00,000 --> 00:32:02,200
that arrive late and break assumptions.

637
00:32:02,200 --> 00:32:05,800
This is where state, idempotency, and retries stop being theoretical.

638
00:32:05,800 --> 00:32:09,800
If you can't survive a connector timeout without double provisioning a new hire,

639
00:32:09,800 --> 00:32:11,800
you haven't automated onboarding.

640
00:32:11,800 --> 00:32:15,600
You've created a denial of service against your own identity platform.
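Surviving a retry without double provisioning comes down to an idempotency key derived from the case. A minimal sketch, where the in-memory set stands in for a durable Dataverse record and all names are illustrative:

```python
import hashlib

_completed: set = set()  # stand-in for a durable Dataverse completion record

def provision_once(case_id: str, step: str, do_provision) -> bool:
    """Run a provisioning step at most once per case; returns True if it ran."""
    # Stable key: the same case + step always hashes to the same identifier,
    # so a retry after a connector timeout is recognized and skipped.
    key = hashlib.sha256(f"{case_id}:{step}".encode()).hexdigest()
    if key in _completed:
        return False          # already provisioned; retry becomes a no-op
    do_provision()
    _completed.add(key)
    return True
```

The completion record must be written in the same transaction boundary as the provisioning action itself, otherwise a crash between the two reintroduces the double-provision window.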

641
00:32:15,600 --> 00:32:16,600
So those are the three.

642
00:32:16,600 --> 00:32:18,400
Now here's what ties them together.

643
00:32:18,400 --> 00:32:22,400
The event, reasoning, orchestration, evidence pattern.

644
00:32:22,400 --> 00:32:24,200
Every workflow starts with an event.

645
00:32:24,200 --> 00:32:27,800
In screening, it's a resume intake or a recruiter request to shortlist.

646
00:32:27,800 --> 00:32:31,600
In triage, it's a ticket submission through Teams, email or a portal.

647
00:32:31,600 --> 00:32:35,000
In onboarding, it's an offer-accepted signal from your ATS or HRIS.

648
00:32:35,000 --> 00:32:37,000
The event creates a case record in Dataverse.

649
00:32:37,000 --> 00:32:38,600
That case id becomes your spine.

650
00:32:38,600 --> 00:32:39,600
No case, no control.
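
That spine can be sketched in a few lines. The field names are assumptions, and the dict stands in for a Dataverse case row; what matters is that the case ID is created first and carried through every later tool call:

```python
import uuid

# Minimal case-spine sketch: every inbound event opens a case record before
# any reasoning or tool call happens. Names are illustrative.
def open_case(event_type: str, payload_ref: str) -> dict:
    return {
        "case_id": str(uuid.uuid4()),  # the spine: no case, no control
        "event_type": event_type,      # "resume_intake" | "ticket" | "offer_accepted"
        "payload_ref": payload_ref,    # reference to the source, not the raw payload
        "status": "opened",
    }
```

Storing a reference rather than the payload keeps resumes and ticket bodies out of the case spine itself.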

651
00:32:39,600 --> 00:32:40,600
Then reasoning.

652
00:32:40,600 --> 00:32:43,200
Copilot Studio takes the event, asks for missing parameters,

653
00:32:43,200 --> 00:32:47,000
applies the conversational policy and chooses the next allowed tool.

654
00:32:47,000 --> 00:32:49,600
Reasoning does not mean freeform thinking.

655
00:32:49,600 --> 00:32:51,600
It means selecting from constrained options.

656
00:32:51,600 --> 00:32:55,000
If the agent can't bind the request to a known rubric, a known routing map

657
00:32:55,000 --> 00:32:57,200
or a known onboarding stage, it escalates.

658
00:32:57,200 --> 00:32:59,200
Because an unknown state is where models improvise.

659
00:32:59,200 --> 00:33:01,400
Improvisation is not an HR strategy.

660
00:33:01,400 --> 00:33:04,200
Then orchestration: Logic Apps executes the tool paths,

661
00:33:04,200 --> 00:33:06,800
classify, route, provision, notify, request approval.

662
00:33:06,800 --> 00:33:08,000
Each workflow is a tool.

663
00:33:08,000 --> 00:33:09,200
Each tool has a schema.

664
00:33:09,200 --> 00:33:11,600
Each tool runs under an identity boundary.

665
00:33:11,600 --> 00:33:14,800
This is where you enforce the difference between read and write operations

666
00:33:14,800 --> 00:33:20,200
and where you front load safety, validate inputs, enforce preconditions and stop on gates.

667
00:33:20,200 --> 00:33:22,200
The model doesn't get to try things.

668
00:33:22,200 --> 00:33:26,400
It gets to invoke permitted operations, or it gets blocked. And finally, evidence.

669
00:33:26,400 --> 00:33:31,400
Every tool call, state transition, approval and override, writes an audit trail.

670
00:33:31,400 --> 00:33:38,400
Correlation IDs across Copilot session, MCP invocation, Logic Apps run, and Dataverse records.

671
00:33:38,400 --> 00:33:42,400
That's how you survive audits and incident reviews without inventing a narrative.

672
00:33:42,400 --> 00:33:45,600
The common misunderstanding is thinking these are three separate builds.

673
00:33:45,600 --> 00:33:46,200
They're not.

674
00:33:46,200 --> 00:33:50,400
They are three applications of the same control plane with different risk tolerances

675
00:33:50,400 --> 00:33:55,200
and we start with candidate screening because it's the one that punishes weak governance immediately.

676
00:33:55,200 --> 00:34:00,400
Use case one: candidate screening, intake to shortlist. Candidate screening is where people

677
00:34:00,400 --> 00:34:04,200
quietly give the model too much authority because the volume feels unbearable

678
00:34:04,200 --> 00:34:07,400
and then they act surprised when they can't defend outcomes.

679
00:34:07,400 --> 00:34:09,800
So the first rule in this workflow is simple.

680
00:34:09,800 --> 00:34:11,400
Resumes are not documents.

681
00:34:11,400 --> 00:34:13,400
They are regulated data objects.

682
00:34:13,400 --> 00:34:14,600
Treat them like objects.

683
00:34:14,600 --> 00:34:20,000
Resume intake starts in SharePoint but not in a folder named Resumes Final Final 2.

684
00:34:20,000 --> 00:34:27,200
You need metadata discipline: requisition ID, role family, region, submission date, source channel,

685
00:34:27,200 --> 00:34:29,400
and a stable candidate identifier.

686
00:34:29,400 --> 00:34:30,900
The reason is not organization.

687
00:34:30,900 --> 00:34:32,400
The reason is retrieval control.

688
00:34:32,400 --> 00:34:40,400
If Copilot can retrieve from everything, it will, and you've just turned your whole tenant into an unreviewed training set for your screening logic.

689
00:34:40,400 --> 00:34:43,600
So the tool boundary is: the agent never pulls all resumes.

690
00:34:43,600 --> 00:34:49,600
It calls a Logic Apps tool that queries SharePoint by metadata filters and returns a bounded set of candidates and references.

691
00:34:49,600 --> 00:34:50,600
References, not payloads.

692
00:34:50,600 --> 00:34:53,200
You don't pipe raw resumes through the chat layer.

693
00:34:53,200 --> 00:34:59,400
You return candidate IDs, file links, and extracted structured fields if you've done parsing inside the tool boundary.

694
00:34:59,400 --> 00:35:03,400
Next is criteria parsing, and this is where most teams sabotage themselves.

695
00:35:03,400 --> 00:35:07,000
They take a job description, throw it into the model and ask for a score out of 10.

696
00:35:07,000 --> 00:35:08,000
That's not screening.

697
00:35:08,000 --> 00:35:09,800
That's astrology with better typography.

698
00:35:09,800 --> 00:35:13,600
Instead, you translate job requirements into a structured rubric.

699
00:35:13,600 --> 00:35:16,200
Must haves, should haves and disqualifiers.

700
00:35:16,200 --> 00:35:19,600
Each rubric item has a label, a weight and a required evidence type.

701
00:35:19,600 --> 00:35:23,000
Evidence type matters because it forces specificity.

702
00:35:23,000 --> 00:35:32,000
Years of experience in X, certification Y, experience with tool Z, portfolio link, clearance level, whatever is job related.

703
00:35:32,000 --> 00:35:36,000
If it can't be expressed as a rubric item, it doesn't belong in automated scoring.
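
A rubric item as described above is just a small, typed object, which is what makes scoring explainable and repeatable. A sketch with illustrative field names and evidence types:

```python
from dataclasses import dataclass

# Illustrative rubric model: label, weight, tier, and a required evidence type.
# Anything that can't be expressed this way stays out of automated scoring.
@dataclass(frozen=True)
class RubricItem:
    label: str
    weight: float
    tier: str            # "must_have" | "should_have" | "disqualifier"
    evidence_type: str   # e.g. "years_experience", "certification", "portfolio"

def weighted_score(scores: dict, rubric: list) -> float:
    """Weighted fraction of rubric points earned; scores maps label -> 0..1."""
    total = sum(item.weight for item in rubric)
    earned = sum(item.weight * scores.get(item.label, 0.0) for item in rubric)
    return earned / total if total else 0.0
```

Versioning the rubric alongside the score artifact is what lets you later prove which criteria were in force for a given candidate.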

704
00:35:36,000 --> 00:35:39,800
That's how you keep the system from drifting into proxies. Now, bias-filtered scoring.

705
00:35:39,800 --> 00:35:42,000
This is not about pretending the system is moral.

706
00:35:42,000 --> 00:35:45,200
It's about preventing the system from using irrelevant correlates.

707
00:35:45,200 --> 00:35:47,400
Resumes contain proxy landmines.

708
00:35:47,400 --> 00:35:51,000
Names, graduation years, addresses, gaps, extracurriculars.

709
00:35:51,000 --> 00:35:54,000
Some of these are legitimate signals in specific contexts.

710
00:35:54,000 --> 00:35:57,200
Many are not, so you do a two-stage scoring pattern. Stage one:

711
00:35:57,200 --> 00:36:00,000
Redact or ignore obvious proxy fields before scoring.

712
00:36:00,000 --> 00:36:02,000
The agent doesn't need names to assess skills.

713
00:36:02,000 --> 00:36:04,400
It doesn't need addresses to assess qualifications.

714
00:36:04,400 --> 00:36:07,200
It doesn't need a graduation year to assess role fit.

715
00:36:07,200 --> 00:36:11,200
If you want location for relocation eligibility, that's a separate explicit field,

716
00:36:11,200 --> 00:36:13,000
not whatever the resume implies.
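
Stage-one redaction can be as simple as dropping proxy fields from the scoring view before the model ever sees them. The field names here are assumptions; real extraction happens inside the tool boundary:

```python
# Stage-one proxy redaction sketch: strip fields irrelevant to skills
# before scoring. Field names are illustrative, not a parsing spec.
PROXY_FIELDS = {"name", "address", "graduation_year", "date_of_birth", "photo"}

def redact_proxies(candidate: dict) -> dict:
    """Return a scoring view of the candidate without proxy landmines."""
    return {k: v for k, v in candidate.items() if k not in PROXY_FIELDS}
```

If a field like location is genuinely job-related (relocation eligibility), it enters as its own explicit field rather than being inferred from the redacted resume.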

717
00:36:13,000 --> 00:36:17,600
Stage two: require rationale per rubric item. Not a narrative essay, but a structured statement of

718
00:36:17,600 --> 00:36:21,400
which excerpt or evidence supported the score, linked back to the source reference.

719
00:36:21,400 --> 00:36:25,000
That gives you explainability without turning the workflow into a philosophy seminar.

720
00:36:25,000 --> 00:36:28,600
And because you are not trusting vibes, you store the scoring artifact in dataverse.

721
00:36:28,600 --> 00:36:32,600
Candidate ID, requisition ID, rubric version, per-item scores,

722
00:36:32,600 --> 00:36:35,000
rationale pointers, and a final recommendation status.

723
00:36:35,000 --> 00:36:36,400
That artifact is the output.

724
00:36:36,400 --> 00:36:38,000
The chat response is just the UI.

725
00:36:38,000 --> 00:36:40,400
Now here's where most people mess up: borderline cases.

726
00:36:40,400 --> 00:36:44,600
They either escalate everything which kills throughput or they escalate nothing,

727
00:36:44,600 --> 00:36:46,000
which kills defensibility.

728
00:36:46,000 --> 00:36:47,800
So define borderline explicitly.

729
00:36:47,800 --> 00:36:51,600
A practical definition is candidates that land in a scoring band

730
00:36:51,600 --> 00:36:55,400
where small rubric interpretation differences change the outcome.

731
00:36:55,400 --> 00:37:00,000
Or candidates missing one must-have, but exceeding strongly in multiple should-haves.

732
00:37:00,000 --> 00:37:01,800
Your rubric can label these conditions.

733
00:37:01,800 --> 00:37:04,000
The workflow can detect them deterministically.

734
00:37:04,000 --> 00:37:05,600
Then you apply the confidence gate.

735
00:37:05,600 --> 00:37:09,800
If the workflow meets borderline conditions or the model confidence falls below your threshold,

736
00:37:09,800 --> 00:37:13,800
the agent stops and generates a review package, not the chat transcript, a package.

737
00:37:13,800 --> 00:37:16,800
Rubric results, evidence links and the proposed decision.

738
00:37:16,800 --> 00:37:18,200
Then HITL triggers.

739
00:37:18,200 --> 00:37:21,600
In the demo scenario, the hiring manager approval happens via email.

740
00:37:21,600 --> 00:37:24,800
That's fine as a channel as long as the approval becomes a record.

741
00:37:24,800 --> 00:37:27,800
The approval object in Dataverse captures approver identity,

742
00:37:27,800 --> 00:37:30,000
decision, timestamp and reason code.

743
00:37:30,000 --> 00:37:33,200
If the manager overrides the recommendation, they must choose why.

744
00:37:33,200 --> 00:37:34,600
Not because you want to punish them,

745
00:37:34,600 --> 00:37:37,200
because you want to measure drift and prevent rubber stamping.

746
00:37:37,200 --> 00:37:40,200
Once approved, the agent can trigger the next actions.

747
00:37:40,200 --> 00:37:42,800
Schedule an interview, generate interview questions,

748
00:37:42,800 --> 00:37:45,000
or move the candidate to the next stage.

749
00:37:45,000 --> 00:37:47,000
Again, those actions should be tools.

750
00:37:47,000 --> 00:37:48,800
Invite candidate is a tool.

751
00:37:48,800 --> 00:37:50,000
Create questions is a tool.

752
00:37:50,000 --> 00:37:51,600
Update status is a tool.

753
00:37:51,600 --> 00:37:54,000
Each one has an input schema and a permission boundary.

754
00:37:54,000 --> 00:37:55,800
And the audit entry is automatic.

755
00:37:55,800 --> 00:37:59,400
Every time a candidate gets scored, escalated, approved or rejected,

756
00:37:59,400 --> 00:38:01,600
you write an evidence record.

757
00:38:01,600 --> 00:38:05,400
Correlation IDs from Copilot session to MCP tool call to Logic Apps

758
00:38:05,400 --> 00:38:06,800
run to Dataverse transaction.

759
00:38:06,800 --> 00:38:09,800
This is the difference between explainability and storytelling.

760
00:38:09,800 --> 00:38:12,000
If someone asks, why did we reject this candidate?

761
00:38:12,000 --> 00:38:13,600
You don't answer with a paragraph.

762
00:38:13,600 --> 00:38:14,800
You answer with a record.

763
00:38:14,800 --> 00:38:17,600
So the outcome of this workflow is not a short list.

764
00:38:17,600 --> 00:38:20,000
The outcome is a short list you can defend.

765
00:38:20,000 --> 00:38:22,600
Bias, fairness and explainability.

766
00:38:22,600 --> 00:38:24,400
What you can actually defend.

767
00:38:24,400 --> 00:38:26,800
Bias doesn't show up as an evil line of code.

768
00:38:26,800 --> 00:38:29,200
It shows up as a system choosing shortcuts.

769
00:38:29,200 --> 00:38:32,000
And hiring systems love shortcuts because the data is messy

770
00:38:32,000 --> 00:38:35,000
and the pressure is high and everyone wants the queue to disappear.

771
00:38:35,000 --> 00:38:38,000
The thing most people miss is that bias is usually a proxy problem,

772
00:38:38,000 --> 00:38:39,400
not a prompt problem.

773
00:38:39,400 --> 00:38:43,200
The model rarely uses race; it uses variables that correlate with race.

774
00:38:43,200 --> 00:38:47,000
Zip codes, school names, employment gaps, graduation years,

775
00:38:47,000 --> 00:38:50,600
even certain role titles that are historically skewed by demographic patterns.

776
00:38:50,600 --> 00:38:53,000
If you let unstructured text drive scoring,

777
00:38:53,000 --> 00:38:57,200
the system will learn these correlations faster than you can write a policy memo about them.

778
00:38:57,200 --> 00:38:59,000
And yes, it also shows up as drift.

779
00:38:59,000 --> 00:39:00,000
Your rubric starts clean.

780
00:39:00,000 --> 00:39:02,000
Then a recruiter adds an exception.

781
00:39:02,000 --> 00:39:04,400
Then a hiring manager insists we need culture fit.

782
00:39:04,400 --> 00:39:06,800
Then someone adds a field for communication style.

783
00:39:06,800 --> 00:39:09,000
Then a new region uses different signals.

784
00:39:09,000 --> 00:39:12,200
Over time, the scoring criteria stops being job related

785
00:39:12,200 --> 00:39:15,600
and starts being an organizational mirror of whatever bias already exists.

786
00:39:15,600 --> 00:39:17,000
So what can you actually defend?

787
00:39:17,000 --> 00:39:19,400
You can defend repeatable criteria tied to the job.

788
00:39:19,400 --> 00:39:22,000
You can defend consistent treatment across candidates.

789
00:39:22,000 --> 00:39:24,600
You can defend oversight with recorded rationale.

790
00:39:24,600 --> 00:39:29,800
And you can defend monitoring that detects when the system starts behaving differently than intended.

791
00:39:29,800 --> 00:39:31,200
You cannot defend vibes.

792
00:39:31,200 --> 00:39:32,800
Start with structured scoring.

793
00:39:32,800 --> 00:39:35,200
The rubric is your defensibility artifact.

794
00:39:35,200 --> 00:39:36,400
It doesn't need to be perfect.

795
00:39:36,400 --> 00:39:37,800
It needs to be explicit.

796
00:39:37,800 --> 00:39:40,600
Must haves, should haves, disqualifiers and weights.

797
00:39:40,600 --> 00:39:42,400
And every score has to tie to evidence.

798
00:39:42,400 --> 00:39:43,600
Not the model thinks.

799
00:39:43,600 --> 00:39:44,400
Evidence.

800
00:39:44,400 --> 00:39:45,600
An excerpt reference.

801
00:39:45,600 --> 00:39:48,600
A portfolio link, a certification, a documented project.

802
00:39:48,600 --> 00:39:52,600
Something you can point at later without rerunning the model and getting a different answer.

803
00:39:52,600 --> 00:39:59,800
That alone knocks out a large class of bias because it forces the system to operate on job related signals instead of socially correlated noise.

804
00:39:59,800 --> 00:40:01,400
Now test outcomes, not intentions.

805
00:40:01,400 --> 00:40:04,200
A practical method here is disparate impact screening.

806
00:40:04,200 --> 00:40:08,200
The simplest lens people use is the 80% rule as a quick check.

807
00:40:08,200 --> 00:40:13,800
Compare selection rates across groups and flag when one group's rate drops below 80% of the highest group's rate.

808
00:40:13,800 --> 00:40:15,000
It's not a legal conclusion.

809
00:40:15,000 --> 00:40:16,000
It's a smoke alarm.

810
00:40:16,000 --> 00:40:19,800
If it triggers you investigate the rubric, the data, the workflow and the human overrides.

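The 80% rule check just described is simple enough to sketch directly. Group names and rates here are toy data:

```python
def adverse_impact_flags(selection_rates: dict[str, float],
                         ratio: float = 0.8) -> dict[str, bool]:
    # 80% rule: flag any group whose selection rate falls below
    # 80% of the highest group's rate. A smoke alarm, not a legal conclusion.
    highest = max(selection_rates.values())
    return {group: rate < ratio * highest
            for group, rate in selection_rates.items()}

# Toy data: group_b selects at 0.35 / 0.50 = 70% of group_a's rate -> flagged.
flags = adverse_impact_flags({"group_a": 0.50, "group_b": 0.35})
```

When a flag trips, you investigate the rubric, the data, the workflow and the overrides.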
811
00:40:19,800 --> 00:40:21,400
You also run counterfactual checks.

812
00:40:21,400 --> 00:40:27,400
This clicked for a lot of teams when they realized you can change one attribute that shouldn't matter and see if the score shifts.

813
00:40:27,400 --> 00:40:31,800
Swap a name, remove a graduation year, replace an address with a neutral placeholder.

814
00:40:31,800 --> 00:40:35,000
If the scoring moves materially, you just found a proxy pathway.

815
00:40:35,000 --> 00:40:37,400
That doesn't mean the system is racist.

816
00:40:37,400 --> 00:40:41,000
It means your design allowed correlated inputs to influence outcomes.

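A counterfactual check can be sketched as a score-twice-and-compare helper. The toy scorer below deliberately contains a proxy pathway (a zip code bonus) so the check has something to find; everything here is illustrative:

```python
def counterfactual_shift(score_fn, record: dict, attribute: str,
                         neutral_value, tolerance: float = 0.02):
    # Score once, swap one attribute that shouldn't matter, score again.
    baseline = score_fn(record)
    variant = {**record, attribute: neutral_value}
    shift = abs(score_fn(variant) - baseline)
    # A material shift means correlated input is influencing outcomes.
    return shift, shift > tolerance

# Toy scorer that (wrongly) rewards a particular zip code -- the proxy pathway.
def toy_score(r: dict) -> float:
    return 0.8 + (0.1 if r.get("zip") == "10001" else 0.0)

shift, proxy_found = counterfactual_shift(
    toy_score, {"zip": "10001", "skills": 5}, "zip", "REDACTED")
```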
817
00:40:41,000 --> 00:40:43,000
Then consistency verification.

818
00:40:43,000 --> 00:40:47,800
Take a sample of resumes and score them across rubric versions and across reviewers.

819
00:40:47,800 --> 00:40:53,400
If small wording changes in the job description cause large ranking shifts, your rubric is underspecified.

820
00:40:53,400 --> 00:40:59,800
If different reviewers override in opposite directions with no consistent reason codes, your human oversight is theater.

821
00:40:59,800 --> 00:41:03,400
And if override rate is near zero, assume deference, not perfection.

822
00:41:03,400 --> 00:41:05,000
Because the human factor is brutal.

823
00:41:05,000 --> 00:41:09,400
When you put an AI recommendation next to a human in a busy workflow, you create authority bias.

824
00:41:09,400 --> 00:41:11,400
People defer. They rubber-stamp.

825
00:41:11,400 --> 00:41:16,200
That's why you require reason codes and justification on overrides and approvals.

826
00:41:16,200 --> 00:41:20,600
Not because it's fun. Because it forces cognitive engagement and it gives you signals you can measure.

827
00:41:20,600 --> 00:41:23,800
Now explainability. Explainability is not the model's chain of thought.

828
00:41:23,800 --> 00:41:27,800
No, don't store it. Don't rely on it and don't pretend it's the rationale.

829
00:41:27,800 --> 00:41:31,000
Explainability is what rubric items drove the outcome.

830
00:41:31,000 --> 00:41:32,600
What evidence supported each item.

831
00:41:32,600 --> 00:41:35,800
What sources were used and what human interventions changed the path.

832
00:41:35,800 --> 00:41:40,600
That's it. If the agent can produce that, you can defend the process as a controlled decision workflow.

833
00:41:40,600 --> 00:41:44,200
If it can't, it's just generating text and hoping the audience trusts it.

834
00:41:44,200 --> 00:41:47,400
And vendor reality: responsibility stays with the deployer.

835
00:41:47,400 --> 00:41:52,200
Even if the resume parsing comes from a third party, even if the scoring model is industry standard,

836
00:41:52,200 --> 00:41:54,600
you own the outcome because you operationalized it.

837
00:41:54,600 --> 00:41:59,000
Contracts don't absorb liability. They just spread blame and blame is not a control.

838
00:41:59,000 --> 00:42:01,200
So the defensible posture is a cycle.

839
00:42:01,200 --> 00:42:06,200
Structured rubric, proxy minimization, outcome monitoring and periodic sampling reviews,

840
00:42:06,200 --> 00:42:09,000
not annual panic, a rhythm.

841
00:42:09,000 --> 00:42:12,600
And once you have that rhythm, you can safely move to the lower risk,

842
00:42:12,600 --> 00:42:16,600
higher ROI deployment pattern, HR ticket triage.

843
00:42:16,600 --> 00:42:20,200
Use case two, HR ticket triage, deflection without chaos.

844
00:42:20,200 --> 00:42:23,400
Ticket triage is where most HR teams actually bleed time.

845
00:42:23,400 --> 00:42:26,200
Not because the questions are hard, because the intake is messy,

846
00:42:26,200 --> 00:42:29,400
the routing is emotional and the same request gets touched by three people

847
00:42:29,400 --> 00:42:31,400
before anyone decides who owns it.

848
00:42:31,400 --> 00:42:34,600
So the goal here isn't build a helpful HR chatbot.

849
00:42:34,600 --> 00:42:36,800
The goal is deterministic deflection.

850
00:42:36,800 --> 00:42:40,000
The system takes a request, classifies it into a lane you control,

851
00:42:40,000 --> 00:42:44,200
resolves what it can safely resolve and escalates the rest with an evidence package.

852
00:42:44,200 --> 00:42:48,400
Start with intake. You need one entry point, even if you accept many channels.

853
00:42:48,400 --> 00:42:53,600
Teams message, form, portal, email, fine, but every pathway must normalize into a case object.

854
00:42:53,600 --> 00:42:57,600
A single schema: ticket ID, requester identity, category candidates,

855
00:42:57,600 --> 00:43:00,600
region, urgency and whatever minimal context you allow.

856
00:43:00,600 --> 00:43:03,600
If you don't normalize, you can't measure. If you can't measure, you can't scale.

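Normalizing every channel into one case object might look like this sketch. The schema fields and allowed categories are assumptions, not a real product schema:

```python
from dataclasses import dataclass

# The category set you own; the agent never invents new ones.
ALLOWED_CATEGORIES = {"benefits", "time_off", "payroll", "employee_relations"}

@dataclass
class HRCase:
    ticket_id: str
    requester_id: str
    category: str      # "unclassified" until the agent picks from the owned set
    region: str
    urgency: str

def normalize(channel: str, payload: dict, next_id) -> HRCase:
    # Every channel (Teams, form, portal, email) collapses into one schema.
    category = payload.get("category", "unclassified")
    if category not in ALLOWED_CATEGORIES:
        category = "unclassified"
    return HRCase(
        ticket_id=next_id(),
        requester_id=payload["requester_id"],
        category=category,
        region=payload.get("region", "unknown"),
        urgency=payload.get("urgency", "normal"),
    )

case = normalize("email", {"requester_id": "emp-9", "category": "benefits"},
                 lambda: "HR-0001")
```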
857
00:43:03,600 --> 00:43:06,000
Copilot Studio then does what it's actually good at.

858
00:43:06,000 --> 00:43:09,600
It runs the conversation to fill missing parameters and prevent garbage in.

859
00:43:09,600 --> 00:43:13,600
If the user writes benefits, the agent asks one clarifying question, not six.

860
00:43:13,600 --> 00:43:15,800
If it's time off, it asks jurisdiction.

861
00:43:15,800 --> 00:43:18,600
If it's payroll, it asks pay period.

862
00:43:18,600 --> 00:43:21,200
This is where you reduce entropy before you automate.

863
00:43:21,200 --> 00:43:22,400
Then classification.

864
00:43:22,400 --> 00:43:26,000
Most teams create a general HR bucket because it feels flexible.

865
00:43:26,000 --> 00:43:28,200
It isn't. It's where triage goes to die.

866
00:43:28,200 --> 00:43:31,200
Instead you map topics to tool groups and lanes.

867
00:43:31,200 --> 00:43:35,000
Benefits questions go to benefits knowledge and benefits workflows.

868
00:43:35,000 --> 00:43:37,800
Time off requests go to leave policies and leave tools.

869
00:43:37,800 --> 00:43:41,200
Payroll issues go to payroll routing and a higher default HITL profile.

870
00:43:41,200 --> 00:43:46,800
Employee relations items get flagged as high sensitivity and routed to humans by design.

871
00:43:46,800 --> 00:43:49,200
The agent shouldn't be allowed to improvise categories.

872
00:43:49,200 --> 00:43:54,400
It chooses from the set you own and it must attach a confidence score and a rationale for the classification.

873
00:43:54,400 --> 00:43:56,400
Once you nail that, everything else clicks.

874
00:43:56,400 --> 00:43:58,200
Routing becomes deterministic.

875
00:43:58,200 --> 00:44:01,200
Logic Apps takes the classified case and applies routing rules.

876
00:44:01,200 --> 00:44:03,200
Queue assignment by category and region.

877
00:44:03,200 --> 00:44:05,800
Priority rules for specific keywords or metadata.

878
00:44:05,800 --> 00:44:09,000
SLA timers, escalation targets and here's the key.

879
00:44:09,000 --> 00:44:11,000
The routing is not AI magic.

880
00:44:11,000 --> 00:44:12,600
It's a mapping table you can review.

881
00:44:12,600 --> 00:44:16,000
If the agent predicts category A, the workflow routes to queue A.

882
00:44:16,000 --> 00:44:18,800
If category is uncertain or the case hits a risk trigger,

883
00:44:18,800 --> 00:44:22,200
the workflow routes to a human triage queue with a clear reason code.

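The reviewable mapping table plus confidence gate can be sketched in a few lines. Queue names, the confidence threshold and the reason codes are illustrative:

```python
# The routing is not AI magic: it's a mapping table you can review.
ROUTING = {
    "benefits": "queue_benefits",
    "time_off": "queue_leave",
    "payroll": "queue_payroll_hitl",
    "employee_relations": "queue_human_only",
}

def route(category: str, confidence: float,
          min_conf: float = 0.75) -> tuple[str, str]:
    # Unknown category or low confidence -> human triage with a reason code.
    if category not in ROUTING or confidence < min_conf:
        return "queue_human_triage", "low_confidence_or_unknown_category"
    return ROUTING[category], "rule_match"
```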
884
00:44:22,200 --> 00:44:27,400
Tier one auto resolution is where you get the ROI and it's also where people accidentally create chaos.

885
00:44:27,400 --> 00:44:31,600
Auto resolution doesn't mean the model writes a confident paragraph and closes the ticket.

886
00:44:31,600 --> 00:44:35,800
It means use grounded knowledge plus deterministic actions where allowed.

887
00:44:35,800 --> 00:44:40,200
Reset a password? That's an IT flow, not HR, but you get the idea. Update an address,

888
00:44:40,200 --> 00:44:43,000
that might be a controlled HRIS write with approvals.

889
00:44:43,000 --> 00:44:44,400
Where do I find the policy?

890
00:44:44,400 --> 00:44:47,000
That's a link plus the policy version reference.

891
00:44:47,000 --> 00:44:48,600
How many leave days do I have?

892
00:44:48,600 --> 00:44:52,200
That's a read operation that returns a number, not a hallucinated answer.

893
00:44:52,200 --> 00:44:53,200
So the play is simple.

894
00:44:53,200 --> 00:44:56,000
For tier one categories, your agent can do two things.

895
00:44:56,000 --> 00:44:59,000
Provide an answer grounded in approved knowledge sources.

896
00:44:59,000 --> 00:45:03,200
Or execute a bounded action through a tool when the action is low risk and reversible.

897
00:45:03,200 --> 00:45:04,400
Anything else pauses.

898
00:45:04,400 --> 00:45:07,800
This is where the HITL pattern becomes operational, not philosophical.

899
00:45:07,800 --> 00:45:11,400
Complex cases don't get escalated by dumping chat history into an email.

900
00:45:11,400 --> 00:45:15,800
They get escalated with a context package: the normalized case object, classification confidence,

901
00:45:15,800 --> 00:45:18,800
the knowledge sources consulted, the proposed next action,

902
00:45:18,800 --> 00:45:20,800
and the exact reason it couldn't auto resolve.

903
00:45:20,800 --> 00:45:24,000
That package gets written to Dataverse and routed to HR ops,

904
00:45:24,000 --> 00:45:26,200
so the human starts with structure, not archaeology.

905
00:45:26,200 --> 00:45:27,800
Now, the uncomfortable truth.

906
00:45:27,800 --> 00:45:30,600
Triage will expose your knowledge boundaries fast.

907
00:45:30,600 --> 00:45:33,000
If your knowledge base contains outdated policies,

908
00:45:33,000 --> 00:45:35,000
the agent will confidently cite them.

909
00:45:35,000 --> 00:45:37,000
If your SharePoint permissions are sloppy,

910
00:45:37,000 --> 00:45:40,200
the agent will surface sensitive content to the wrong people.

911
00:45:40,200 --> 00:45:43,400
If your categories are vague, the agent will route unpredictably.

912
00:45:43,400 --> 00:45:45,800
And if your workflows can't log safely,

913
00:45:45,800 --> 00:45:49,400
you'll end up with PII sitting in run history like a self-inflicted breach,

914
00:45:49,400 --> 00:45:52,200
so you treat ticket triage as a controlled system.

915
00:45:52,200 --> 00:45:56,200
Bounded inputs, bounded tool calls, bounded outputs, and full evidence.

916
00:45:56,200 --> 00:45:58,600
You measure it with metrics that matter.

917
00:45:58,600 --> 00:46:01,400
Deflection rate versus resolution rate, escalation rate,

918
00:46:01,400 --> 00:46:03,800
re-open rate, and human touches per case.

919
00:46:03,800 --> 00:46:06,400
And your review overrides when humans reclassify,

920
00:46:06,400 --> 00:46:08,400
when they reopen, when they change the lane.

921
00:46:08,400 --> 00:46:11,400
Those are drift signals, fix the mapping, tighten the knowledge scope,

922
00:46:11,400 --> 00:46:12,600
or add a gate.

923
00:46:12,600 --> 00:46:15,000
That's how you get deflection without chaos,

924
00:46:15,000 --> 00:46:16,400
not by trusting the model.

925
00:46:16,400 --> 00:46:19,600
By enforcing the workflow. Measuring ticket reduction:

926
00:46:19,600 --> 00:46:22,000
what actually drives the 44%.

927
00:46:22,000 --> 00:46:29,200
When someone claims we reduced HR tickets by 44%, the first job is to ask what they mean by reduced.

928
00:46:29,200 --> 00:46:31,800
Because there are two numbers that get confused on purpose.

929
00:46:31,800 --> 00:46:33,400
Deflection and resolution.

930
00:46:33,400 --> 00:46:36,200
Deflection means the employee never created a ticket,

931
00:46:36,200 --> 00:46:38,400
or the ticket never reached a human queue.

932
00:46:38,400 --> 00:46:41,800
Resolution means a ticket existed and eventually got closed.

933
00:46:41,800 --> 00:46:44,400
You can inflate one while the other quietly gets worse.

934
00:46:44,400 --> 00:46:47,400
If the agent answers in chat and employees still open tickets

935
00:46:47,400 --> 00:46:50,000
because they don't trust it, you didn't reduce anything.

936
00:46:50,000 --> 00:46:52,400
You just added a new front door to the same backlog.

937
00:46:52,400 --> 00:46:55,400
So the measurement model needs to start with a strict funnel.

938
00:46:55,400 --> 00:46:56,600
Intake volume.

939
00:46:56,600 --> 00:47:01,000
Normalized cases created, cases routed to a human lane, cases auto resolved,

940
00:47:01,000 --> 00:47:03,200
cases reopened, cases escalated,

941
00:47:03,200 --> 00:47:07,200
and then the one metric that exposes operational truth, human touches per case.

942
00:47:07,200 --> 00:47:12,400
If your AI triage still requires two humans to read, interpret, and forward the same request,

943
00:47:12,400 --> 00:47:14,800
you just replaced manual work with slower manual work.

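The strict funnel above reduces to a handful of outcome metrics you can compute from case records. This sketch assumes each case carries `closed`, `reopened` and `human_touches` fields (invented names):

```python
def funnel_metrics(cases: list[dict]) -> dict:
    # Outcome metrics that punish bad automation, computed per case record.
    total = len(cases)
    auto_resolved = sum(1 for c in cases
                        if c["closed"] and c["human_touches"] == 0)
    reopened = sum(1 for c in cases if c["reopened"])
    touches = sum(c["human_touches"] for c in cases)
    return {
        "auto_resolve_rate": auto_resolved / total,
        "reopen_rate": reopened / total,
        # The metric that exposes operational truth:
        "human_touches_per_case": touches / total,
    }

m = funnel_metrics([
    {"closed": True, "human_touches": 0, "reopened": False},
    {"closed": True, "human_touches": 2, "reopened": True},
])
```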
944
00:47:14,800 --> 00:47:18,600
Now, what actually drives a ticket reduction claim in a system like this?

945
00:47:18,600 --> 00:47:20,800
It's not the language model being smarter.

946
00:47:20,800 --> 00:47:22,000
It's three levers.

947
00:47:22,000 --> 00:47:26,000
Classification quality, knowledge boundaries, and action capability.

948
00:47:26,000 --> 00:47:30,000
Classification quality comes first because it determines everything downstream.

949
00:47:30,000 --> 00:47:34,000
If the agent can reliably map requests into a small set of well-owned lanes

950
00:47:34,000 --> 00:47:35,800
you stop burning time on triage ping pong.

951
00:47:35,800 --> 00:47:39,400
You also reduce rework because the right team gets the case the first time.

952
00:47:39,400 --> 00:47:41,400
This is where you measure precision, not vibes.

953
00:47:41,400 --> 00:47:45,400
How often did the initial category match the final category after human review?

954
00:47:45,400 --> 00:47:47,000
How often did it hit the right queue?

955
00:47:47,000 --> 00:47:53,000
How often did the agent ask for one clarifying parameter instead of dumping the request into "Other"?

956
00:47:53,000 --> 00:47:54,800
Knowledge boundaries come next.

957
00:47:54,800 --> 00:47:57,400
People love to talk about RAG like it's a strategy.

958
00:47:57,400 --> 00:47:59,400
It isn't. RAG is a retrieval mechanism.

959
00:47:59,400 --> 00:48:03,600
The strategy is what sources are allowed, how they're filtered, and how they're versioned.

960
00:48:03,600 --> 00:48:07,800
If the agent can only ground answers in curated HR policy sources,

961
00:48:07,800 --> 00:48:11,000
by region, by employee type, by current effective date,

962
00:48:11,000 --> 00:48:16,000
you reduce the number of tickets created for "where is the policy" and "what's the process" questions.

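Scoping grounding sources by region, employee type and effective date is a plain filter, not model magic. A sketch with invented field names and ISO date strings:

```python
def allowed_sources(sources: list[dict], region: str,
                    employee_type: str, today: str) -> list[dict]:
    # RAG is the retrieval mechanism; this is the strategy part:
    # which sources are allowed, filtered by region and employee type,
    # restricted to the currently effective version.
    return [s for s in sources
            if s["region"] == region
            and employee_type in s["employee_types"]
            and s["effective_from"] <= today < s.get("effective_to",
                                                     "9999-12-31")]

policies = [
    {"name": "leave-policy-v1", "region": "EU",
     "employee_types": {"employee"},
     "effective_from": "2024-01-01", "effective_to": "2025-01-01"},
    {"name": "leave-policy-v2", "region": "EU",
     "employee_types": {"employee"},
     "effective_from": "2025-01-01"},
]
current = allowed_sources(policies, "EU", "employee", "2025-06-01")
```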
963
00:48:16,000 --> 00:48:20,000
But if it retrieves from random SharePoint sites, you increase tickets

964
00:48:20,000 --> 00:48:23,400
because employees will immediately escalate when they see conflicting answers,

965
00:48:23,400 --> 00:48:24,600
that's not an AI problem.

966
00:48:24,600 --> 00:48:27,200
That's your information architecture showing up in public.

967
00:48:27,200 --> 00:48:31,600
Action capability is the third lever, and it's the one that creates real deflection.

968
00:48:31,600 --> 00:48:33,600
Not answers. Actions.

969
00:48:33,600 --> 00:48:38,000
When the agent can complete a bounded low-risk task, submit a request, update a field,

970
00:48:38,000 --> 00:48:41,000
create a case with the right metadata, route it correctly,

971
00:48:41,000 --> 00:48:43,400
employees stop opening tickets as a workaround.

972
00:48:43,400 --> 00:48:48,400
This is also why logic apps and MCP tools matter. They turn chat into completion.

973
00:48:48,400 --> 00:48:50,400
Now here's where most people mess up the measurement.

974
00:48:50,400 --> 00:48:55,400
They report automation rate using the agent's activity logs, which mostly measure sessions.

975
00:48:55,400 --> 00:48:59,400
Sessions don't equal outcomes. You want outcome metrics that punish bad automation.

976
00:48:59,400 --> 00:49:04,400
So track deflection rate as percentage of intents that end without a human ticket touch.

977
00:49:04,400 --> 00:49:08,800
Track auto-resolve rate as percentage of cases closed with no human intervention.

978
00:49:08,800 --> 00:49:13,600
Track escalation rate as percentage that hit HITL gates or high-risk categories.

979
00:49:13,600 --> 00:49:17,800
Track reopen rate as percentage reopened within a defined window.

980
00:49:17,800 --> 00:49:23,000
Track satisfaction separately because low satisfaction with high deflection is just silent failure.

981
00:49:23,000 --> 00:49:25,600
And then track the failure signals that predict collapse.

982
00:49:25,600 --> 00:49:31,000
If escalations rise week over week, classification is drifting or knowledge is stale.

983
00:49:31,000 --> 00:49:36,200
If satisfaction drops while deflection rises, employees are getting wrong answers but giving up.

984
00:49:36,200 --> 00:49:41,200
If retries and repeat questions increase, the agent is not completing tasks. It's looping.

985
00:49:41,200 --> 00:49:45,200
And if override frequency is near zero, you probably trained humans to defer,

986
00:49:45,200 --> 00:49:48,200
which means you're accumulating risk not reducing workload.

987
00:49:48,200 --> 00:49:52,200
When you connect those metrics back to design choices, the story becomes boring.

988
00:49:52,200 --> 00:49:54,200
Good, boring means deterministic.

989
00:49:54,200 --> 00:49:58,200
Confidence gates reduce wasted human time by routing only edge cases.

990
00:49:58,200 --> 00:50:03,200
Deterministic routing rules reduce triage ping-pong; knowledge scoping reduces conflicting guidance.

991
00:50:03,200 --> 00:50:08,200
Scripted tier one actions eliminate "please forward this to..." workflows.

992
00:50:08,200 --> 00:50:12,800
The 44% is not magic. It's what happens when you stop treating HR support as a conversation

993
00:50:12,800 --> 00:50:14,800
and start treating it as a controlled system.

994
00:50:14,800 --> 00:50:17,800
And if you can't measure it with those levers, you don't have a reduction.

995
00:50:17,800 --> 00:50:21,200
You have marketing. Use case three: intelligent onboarding.

996
00:50:21,200 --> 00:50:23,200
Offer accepted to day 30.

997
00:50:23,200 --> 00:50:27,200
Onboarding is where most agents go to die because onboarding is not a question and answer problem.

998
00:50:27,200 --> 00:50:31,800
It's a long running orchestration problem with dependencies, delays and irreversible side effects.

999
00:50:31,800 --> 00:50:36,800
And the trigger matters if your onboarding starts because someone posted new hire starting Monday in teams.

1000
00:50:36,800 --> 00:50:38,800
You don't have an onboarding workflow.

1001
00:50:38,800 --> 00:50:41,800
You have a ritual. The trigger needs to be an actual event.

1002
00:50:41,800 --> 00:50:46,800
Offer accepted in your ATS, a row created in your HRIS, a status change in a hiring system.

1003
00:50:46,800 --> 00:50:49,800
Something durable, something you can replay, something you can audit.

1004
00:50:49,800 --> 00:50:55,800
That event creates a case record in Dataverse with a datetime stamp, owner, region, role family and a stage checklist.

1005
00:50:55,800 --> 00:50:58,800
From there, Copilot Studio does the conversational work.

1006
00:50:58,800 --> 00:51:02,800
It fills in the missing parameters without turning onboarding into a form.

1007
00:51:02,800 --> 00:51:07,800
Start date, manager, location, equipment needs, any conditional parts like contractor versus employee.

1008
00:51:07,800 --> 00:51:11,800
And any regional policy flags that change what standard onboarding even means.

1009
00:51:11,800 --> 00:51:14,800
Then it hands execution to logic apps through MCP tools.

1010
00:51:14,800 --> 00:51:17,800
And this is where you stop being cute and start being reliable.

1011
00:51:17,800 --> 00:51:25,800
First, provisioning via Microsoft Graph, account creation, group membership, license assignment, mailbox provisioning, teams enablement, whatever your org does.

1012
00:51:25,800 --> 00:51:30,800
The architectural point is not that Graph exists. The point is that these are write operations with blast radius.

1013
00:51:30,800 --> 00:51:34,800
So they are gated, idempotent and least privileged.

1014
00:51:34,800 --> 00:51:36,800
Provision identity is not one tool.

1015
00:51:36,800 --> 00:51:45,800
It's a tool group with separation, one tool to create the account, one tool to assign baseline groups, one tool to request elevated access, one tool to validate the final state.

1016
00:51:45,800 --> 00:51:49,800
Each tool runs under a managed identity with scoped permissions.

1017
00:51:49,800 --> 00:51:57,800
No shared connections, no HR automation service account that quietly becomes the most privileged identity in the tenant.

1018
00:51:57,800 --> 00:52:03,800
And yes, you put approvals where the damage is expensive, assigning a standard M365 license might be auto.

1019
00:52:03,800 --> 00:52:06,800
Assigning access to finance systems or HR systems should not be.

1020
00:52:06,800 --> 00:52:11,800
That's a HITL gate by risk category, not by feelings. Second, training assignments.

1021
00:52:11,800 --> 00:52:14,800
Most organizations handle this with an email and hope.

1022
00:52:14,800 --> 00:52:17,800
That's not onboarding, that's outsourcing accountability to Outlook.

1023
00:52:17,800 --> 00:52:24,800
So the agent creates structured training tasks, role-based learning paths, compliance modules and deadlines.

1024
00:52:24,800 --> 00:52:29,800
It stores each assignment as a stateful milestone in Dataverse: assignee, due date, completion status, evidence link.

1025
00:52:29,800 --> 00:52:37,800
The completion signal can come from your LMS connector, a file submission or a simple acknowledgement flow, but it must update state, not just send reminders.

1026
00:52:37,800 --> 00:52:44,800
Because if you can't query who is behind on onboarding milestones, you don't have an onboarding system, you have noise.

1027
00:52:44,800 --> 00:52:49,800
Third, scheduling check-ins. This is where agents look impressive in demos and disappointing in reality.

1028
00:52:49,800 --> 00:52:52,800
Scheduling isn't hard, scheduling with constraints is hard.

1029
00:52:52,800 --> 00:52:55,800
So you treat check-ins as tasks, not calendar magic.

1030
00:52:55,800 --> 00:53:02,800
The agent creates a manager task, schedule day 7 check-in, schedule day 30 review, confirm equipment received, confirm access works.

1031
00:53:02,800 --> 00:53:06,800
If you can automate calendar invites through Graph, fine.

1032
00:53:06,800 --> 00:53:11,800
But the system still tracks completion as state, because calendar invites do not equal completed onboarding.

1033
00:53:11,800 --> 00:53:17,800
And now the part people get wrong, adaptive nudges. This is not surveillance, it is not monitor their team's messages.

1034
00:53:17,800 --> 00:53:24,800
You don't need psychometrics, you need basic workflow signals: training incomplete, no check-in scheduled, provisioning failed, equipment ticket unresolved.

1035
00:53:24,800 --> 00:53:27,800
Those are objective indicators that onboarding is drifting.

1036
00:53:27,800 --> 00:53:31,800
So the agent watches milestone state. If a milestone is late, it nudges the owner.

1037
00:53:31,800 --> 00:53:35,800
The manager, HR ops, IT onboarding, whoever owns that step.

1038
00:53:35,800 --> 00:53:39,800
And it nudges with context, what's missing, what tool call failed, what the next action is.

1039
00:53:39,800 --> 00:53:44,800
Not hey, just checking in. That's how you create notification fatigue and then everyone turns it off.

1040
00:53:44,800 --> 00:53:48,800
Now the reason this use case matters is that it exposes orchestration reality.

1041
00:53:48,800 --> 00:53:51,800
Provisioning actions can't run twice without causing damage.

1042
00:53:51,800 --> 00:53:55,800
Training assignments can't get duplicated because the employee changes managers.

1043
00:53:55,800 --> 00:53:58,800
Check-ins can't be best effort if you want consistent time to productivity.

1044
00:53:58,800 --> 00:54:02,800
So the workflow must be durable across days. It must survive connector timeouts.

1045
00:54:02,800 --> 00:54:06,800
It must resume after human approvals. It must log every state transition.

1046
00:54:06,800 --> 00:54:11,800
And it must be able to answer at any time. What stage is this person in? What has been done?

1047
00:54:11,800 --> 00:54:15,800
What is blocked and who owns the block? That's what offer accepted to day 30 actually means.

1048
00:54:15,800 --> 00:54:23,800
Not a chatbot that says welcome to the company. A governed system that turns onboarding into a deterministic sequence of actions with evidence.

1049
00:54:23,800 --> 00:54:27,800
And if you can do that, you've proven the platform can handle the worst kind of HR automation.

1050
00:54:27,800 --> 00:54:30,800
Long running, cross system and full of edge cases.

1051
00:54:30,800 --> 00:54:33,800
Now we deal with the uncomfortable mechanics that make it survive.

1052
00:54:33,800 --> 00:54:36,800
State, retries, identity and drift.

1053
00:54:36,800 --> 00:54:40,800
Orchestration reality: reliability patterns for agentic HR.

1054
00:54:40,800 --> 00:54:42,800
Now it gets unglamorous.

1055
00:54:42,800 --> 00:54:47,800
Onboarding, ticket routing and screening workflows don't fail because the model hallucinates.

1056
00:54:47,800 --> 00:54:50,800
They fail because distributed systems behave like distributed systems.

1057
00:54:50,800 --> 00:54:54,800
Partial failure, duplicate events, timeouts, retries and human delays.

1058
00:54:54,800 --> 00:54:58,800
And the moment you let an LLM sit on top of that without hard reliability patterns,

1059
00:54:58,800 --> 00:55:02,800
you've built a probabilistic control plane for deterministic work.

1060
00:55:02,800 --> 00:55:06,800
So start with idempotency because HR workflows love doing the same damage twice.

1061
00:55:06,800 --> 00:55:08,800
Provisioning is the obvious example.

1062
00:55:08,800 --> 00:55:12,800
If create user runs twice, you don't get two users. You get a mess.

1063
00:55:12,800 --> 00:55:17,800
Conflicting objects, partial licenses, inconsistent group membership and cleanup work that becomes a shadow project.

1064
00:55:17,800 --> 00:55:23,800
The fix is not "be careful." The fix is to design every write tool to be idempotent by default.

1065
00:55:23,800 --> 00:55:27,800
Accept a stable key. Check current state first and only apply deltas.

1066
00:55:27,800 --> 00:55:32,800
That stable key should be something like a candidate id, employee id or a composite key you control.

1067
00:55:32,800 --> 00:55:37,800
Not whatever name came from the resume. Names are not identifiers; they're noise.

1068
00:55:37,800 --> 00:55:42,800
So each tool call needs a precondition check. Does the record already exist? Is it in the expected state?

1069
00:55:42,800 --> 00:55:45,800
And is this transition allowed? If the answer is no, you stop and escalate.

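The precondition-checked, idempotent write tool described here can be sketched roughly as follows. This is an illustrative outline only; the store, the employee IDs, the state names and the allowed transitions are all hypothetical, not part of any real HRIS API.

```python
# Hypothetical sketch of an idempotent "create user" write tool.
# Allowed state transitions are defined explicitly; anything else escalates.
ALLOWED_TRANSITIONS = {None: "provisioned", "provisioned": "licensed"}

class UserStore:
    def __init__(self):
        self._users = {}  # stable key (employee_id) -> current state

    def get_state(self, employee_id):
        return self._users.get(employee_id)

    def apply(self, employee_id, target_state):
        """Check current state first, only apply deltas, never improvise."""
        current = self.get_state(employee_id)
        if current == target_state:
            return "no-op"        # already done: running twice is safe
        if ALLOWED_TRANSITIONS.get(current) != target_state:
            return "escalate"     # disallowed transition: stop for a human
        self._users[employee_id] = target_state
        return "applied"
```

Because the tool keys on a stable identifier and checks state before writing, running "create user" twice yields one user and one no-op instead of conflicting objects.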
1070
00:55:45,800 --> 00:55:49,800
The agent doesn't improvise "close enough" state transitions. It can't see the consequences.

1071
00:55:49,800 --> 00:55:52,800
Next is retries and dead lettering. Retries are not reliability.

1072
00:55:52,800 --> 00:55:56,800
Retries are how you turn transient failures into duplicate side effects.

1073
00:55:56,800 --> 00:56:02,800
That's why you pair retries with idempotency, and why you distinguish transient errors from business rule failures.

1074
00:56:02,800 --> 00:56:08,800
A connector timeout? Retry with backoff. A 429 throttle? Retry with backoff and jitter.

1075
00:56:08,800 --> 00:56:12,800
A validation failure like manager id missing or region policy mismatch.

1076
00:56:12,800 --> 00:56:17,800
Don't retry. Create a deterministic failure state, route to HITL (human-in-the-loop) and wait for human input.

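The retry split described here, backoff with jitter for transient errors, immediate routing for business rule failures, can be sketched like this. The error classes and delay values are illustrative assumptions, not a real connector's behavior.

```python
import random
import time

class TransientError(Exception): pass     # timeouts, 429 throttles: safe to retry
class BusinessRuleError(Exception): pass  # validation failures: never retry

def call_with_retry(tool, max_attempts=4, base_delay=0.01):
    """Retry transient failures with exponential backoff plus jitter;
    surface business-rule failures immediately for human routing."""
    for attempt in range(max_attempts):
        try:
            return tool()
        except BusinessRuleError:
            raise                      # deterministic failure: route to HITL
        except TransientError:
            if attempt == max_attempts - 1:
                raise                  # retries exhausted: dead-letter next
            delay = base_delay * (2 ** attempt)
            time.sleep(delay + random.uniform(0, delay))  # jitter spreads load
```

The key design point is that the except clauses encode policy: only errors known to be transient are ever retried, so retries can no longer turn a validation failure into duplicate side effects.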
1077
00:56:17,800 --> 00:56:20,800
And when retries finally give up, you need a dead letter path.

1078
00:56:20,800 --> 00:56:32,800
Not send an email to the builder. An actual queue or case state that says, "This workflow instance is blocked. Here's the error. Here's the context package. And here's who owns the fix. HR doesn't need heroics. It needs predictable recovery."

1079
00:56:32,800 --> 00:56:36,800
Now here's where most people mess up the architecture. They couple protocol to processing.

1080
00:56:36,800 --> 00:56:44,800
They build HTTP request response flows that do heavy work in line because the demo looks clean. But HR work isn't always fast and it's rarely reliable end to end.

1081
00:56:44,800 --> 00:56:52,800
So split it. Use request response only to acknowledge receipt and create the durable case record. Then push the heavy work into asynchronous processing.

1082
00:56:52,800 --> 00:57:02,800
Queues, long-running orchestrations and stepwise state transitions. That's how you avoid timeouts, how you absorb spikes and how you stop the client from becoming the reliability boundary. And yes, HR has spikes.

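The split described here, acknowledge receipt synchronously, push heavy work onto a queue, can be sketched as below. The in-memory queue and dict stand in for a durable queue and case store; names are illustrative.

```python
import queue
import uuid

work_queue = queue.Queue()   # stand-in for a durable queue

def handle_request(payload, case_store):
    """Request/response only acknowledges receipt and creates the durable
    case record; heavy work is pushed to asynchronous processing."""
    case_id = str(uuid.uuid4())
    case_store[case_id] = {"status": "received", "payload": payload}
    work_queue.put(case_id)                  # a worker picks it up later
    return {"case_id": case_id, "status": "accepted"}  # fast 202-style ack

def worker_step(case_store):
    """One asynchronous processing step; timeouts no longer hit the client."""
    case_id = work_queue.get()
    case_store[case_id]["status"] = "processed"
    return case_id
```

The client gets an immediate acknowledgement with a case ID; everything slow or failure-prone happens behind the queue, where retries and delays are invisible to the caller.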
1083
00:57:02,800 --> 00:57:11,800
Monday mornings. Open enrollment. New hire waves, policy changes. The system doesn't care that your workload is seasonal. It will fail anyway. Design for it.

1084
00:57:11,800 --> 00:57:21,800
Tool throttles and connector limits are the other silent killer. Every connector has quotas, latency variation and failure modes that show up only when you scale. Your demo will work.

1085
00:57:21,800 --> 00:57:30,800
Your production run will hit rate limits and stall mid-process, leaving partially completed onboarding or half-routed tickets. So you need explicit throttling strategy.

1086
00:57:30,800 --> 00:57:40,800
Limit concurrency, batch where appropriate and prefer event-driven triggers over constant polling when the source supports it. Where it doesn't, you still design for duplicate events, because polling will deliver them.

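Designing for duplicate events usually reduces to deduplicating on a stable event ID before doing any work. A minimal sketch, with a hypothetical event-ID scheme and an in-memory set standing in for durable dedupe storage:

```python
class EventDeduper:
    """Polling delivers duplicates; drop events whose ID was already processed.
    The in-memory set is illustrative; production would use durable storage."""
    def __init__(self):
        self._seen = set()

    def accept(self, event_id):
        if event_id in self._seen:
            return False   # duplicate delivery: safe to ignore
        self._seen.add(event_id)
        return True
```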
1087
00:57:40,800 --> 00:57:48,800
Now, state ownership. Copilot Studio is not a system of record. The chat history is not a system of record. Logic Apps run history is not a system of record.

1088
00:57:48,800 --> 00:58:03,800
Dataverse is your durable workflow state. The case, the milestones, the approval objects, the artifacts, the correlation IDs. That means every long running workflow reads state before acting, writes state after acting and treats state transitions as the source of truth.

1089
00:58:03,800 --> 00:58:14,800
When the agent restarts, it doesn't remember; it rehydrates from Dataverse and continues. That's what makes it survivable. And finally, drift. The workflow you build today is not the workflow you'll run in six months.

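Rehydration amounts to reading durable state and resuming at the first incomplete milestone. A minimal sketch, assuming a hypothetical milestone schema (the real Dataverse tables would differ):

```python
def rehydrate(case_store, case_id):
    """After a restart the agent doesn't remember; it reads durable state
    and resumes from the next incomplete milestone. Schema is illustrative."""
    case = case_store[case_id]
    for milestone in case["milestones"]:
        if not milestone["done"]:
            return milestone["name"]   # resume here
    return "complete"
```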
1090
00:58:14,800 --> 00:58:22,800
Someone will change a rubric, someone will add a ticket category, someone will update onboarding steps. Entropy accumulates. So make drift explicit.

1091
00:58:22,800 --> 00:58:33,800
Version your rubrics, version your tool schemas and store the version used on every case record. If you can't say "this decision used rubric v3," you can't explain behavior changes without rewriting history.

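Stamping the rubric version onto each case record can be as simple as the sketch below. The rubric contents, weights and field names are invented for illustration.

```python
# Hypothetical rubric registry; weights and version names are illustrative.
RUBRIC_VERSIONS = {"v3": {"skills_weight": 0.6, "experience_weight": 0.4}}

def score_candidate(case_record, fields, rubric_version="v3"):
    """Stamp the rubric version on the case so every decision can later be
    explained as 'this decision used rubric v3'."""
    rubric = RUBRIC_VERSIONS[rubric_version]
    score = (fields["skills"] * rubric["skills_weight"]
             + fields["experience"] * rubric["experience_weight"])
    case_record["rubric_version"] = rubric_version
    case_record["score"] = round(score, 2)
    return case_record
```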
1092
00:58:33,800 --> 00:58:44,800
Reliability in agentic HR is not uptime. It's controlled repetition: safe retries, safe duplicates, safe delays and recoverable failure. Without that, your agent doesn't scale; it just fails faster.

1093
00:58:44,800 --> 00:59:00,800
Reproducible blueprint: build order and guardrails. Now the build plan. Not a journey, but a sequence that produces working outcomes, with guardrails that stop you from creating a clever prototype that collapses the first time HR hits a busy Monday. Start with build order and accept the risk gradient.

1094
00:59:00,800 --> 00:59:14,800
Ticket triage goes first. It's high volume, mostly reversible, and it forces you to build the control plane: case normalization, deterministic routing, knowledge boundaries and HITL for sensitive categories. If your stack can't survive triage, it won't survive hiring.

1095
00:59:14,800 --> 00:59:24,800
Onboarding goes second. It teaches durability: long-running state, retries without double provisioning, approval waits and the unpleasant reality that connectors fail more often in production than in demos.

1096
00:59:24,800 --> 00:59:38,800
If you can't do idempotency and state ownership here, you're not ready to let the agent touch identity. Candidate screening goes last, not because it's hard technically but because it's hard operationally. It's where bias and explainability get externalized into audits, complaints and litigation risk.

1097
00:59:38,800 --> 00:59:49,800
You build it after you already know your logging, gating and identity boundaries work. Now, the environment model. If you build in one environment and publish to prod by exporting a zip file, you don't have an architecture; you have a hobby.

1098
00:59:49,800 --> 00:59:59,800
You need dev, test and prod with policy parity, same connectors, same network constraints, same key vault patterns, same logging settings, different data, same controls.

1099
00:59:59,800 --> 01:00:08,800
That's the only way you stop "it worked in my tenant" from becoming your post-incident slogan. And yes, that includes Copilot Studio environments and Power Platform governance, not just Azure.

1100
01:00:08,800 --> 01:00:17,800
Agents drift when environments drift; you want drift to be intentional, versioned and reviewed. The identity model comes next, because this is where HR agents quietly become security debt.

1101
01:00:17,800 --> 01:00:28,800
Every workflow tool gets its own managed identity. Not one identity for HR automation, not one shared connection that accumulates privileges over time: per tool, per workflow, scoped permissions.

1102
01:00:28,800 --> 01:00:36,800
And where you need user context, you use on-behalf-of patterns deliberately, not accidentally. Then you separate roles across the human side.

1103
01:00:36,800 --> 01:00:48,800
Recruiters, HR ops and platform admins: recruiters can initiate and review, HR ops can approve policy exceptions, platform admins can expand tool access. Nobody gets the combined power to change the tool set and approve the outcomes.

1104
01:00:48,800 --> 01:00:56,800
That's not governance; that's a future incident with a single name on it. Now, the data model. This is where teams sabotage themselves by stuffing PII into prompts because it's convenient.

1105
01:00:56,800 --> 01:01:07,800
Don't. Keep minimal PII in prompts; use IDs and retrieval with permissions. The agent should reason over structured fields and reference documents, not raw resumes and medical notes pasted into a chat window.

1106
01:01:07,800 --> 01:01:18,800
In practice, that means Dataverse holds case state and structured artifacts: rubric scores, ticket categories, onboarding milestones, approval objects and evidence links. SharePoint holds documents with metadata.

1107
01:01:18,800 --> 01:01:30,800
Logic Apps tools retrieve what's needed, when it's needed, under an identity boundary. The chat layer stays thin. Guardrails come next: the non-negotiables that prevent capability sprawl. One: define the action space upfront.

1108
01:01:30,800 --> 01:01:38,800
List the allowed tools, the allowed operations and the prohibited actions. "Can read policies" is not the same as "can update HRIS records."

1109
01:01:38,800 --> 01:01:50,800
"Can schedule interviews" is not the same as "can generate and send offer letters without review." If you don't define the action space, the agent will expand it through requests, exceptions and maker creativity. Entropy always wins.

1110
01:01:50,800 --> 01:02:01,800
Two: implement confidence gates and risk gates as defaults. Low-confidence classifications route to humans, high-risk categories route to humans, and any write to identity or access routes through approvals.

1111
01:02:01,800 --> 01:02:07,800
You don't negotiate this in production; you implement it as the system's behavior. Three: observability is mandatory before scale.

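The confidence and risk gates described here are just default routing logic, not a negotiation. A minimal sketch, with hypothetical category names and an assumed confidence threshold:

```python
# Illustrative high-risk categories; a real deployment defines its own list.
HIGH_RISK_CATEGORIES = {"payroll", "medical", "termination"}

def route(classification, confidence, writes_identity=False, threshold=0.8):
    """Confidence and risk gates as the system's default behavior."""
    if confidence < threshold:
        return "human_review"          # low confidence goes to a person
    if classification in HIGH_RISK_CATEGORIES:
        return "human_review"          # high risk goes to a person
    if writes_identity:
        return "approval_required"     # identity/access writes need approval
    return "auto_handle"
```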
1112
01:02:07,800 --> 01:02:18,800
Correlation IDs across Copilot, MCP, Logic Apps and Dataverse. Secure inputs and outputs where PII exists, retention as a policy decision. If you can't reconstruct the case end to end, you don't scale; you stop.

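The correlation-ID pattern behind "reconstruct the case end to end" can be sketched as follows. The layer names and log shape are illustrative; in practice each layer writes to centralized logging under the same ID.

```python
import uuid

def new_correlation_id():
    return str(uuid.uuid4())

def log_event(log, correlation_id, layer, message):
    """Every layer logs under the same correlation ID so one case can be
    reconstructed end to end. Layer names are illustrative."""
    log.append({"correlation_id": correlation_id,
                "layer": layer, "message": message})

def reconstruct(log, correlation_id):
    """Pull every event for one case, in order, across all layers."""
    return [e for e in log if e["correlation_id"] == correlation_id]
```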
1113
01:02:18,800 --> 01:02:22,800
And now governance rhythm because governance is not a document, it's a calendar.

1114
01:02:22,800 --> 01:02:28,800
Monthly: review KPIs that expose failure, not vanity adoption metrics.

1115
01:02:28,800 --> 01:02:38,800
Escalation rates, override frequency, reopen rates, unresolved sessions and latency. Quarterly: sample decisions for fairness and consistency in screening, and run outcome checks where you have the data to do it.

1116
01:02:38,800 --> 01:02:47,800
And always: incident playbooks that include a kill switch. Disable tools fast; restore after review. That's how you prevent a bad change from becoming a month-long cleanup.

1117
01:02:47,800 --> 01:02:56,800
If you implement this blueprint, you get something rare: an HR agent system that behaves like an enterprise system, deterministic where it must be, probabilistic only where humans can correct it.

1118
01:02:56,800 --> 01:03:09,800
Governed by design, not by hope. ROI and KPIs: the only numbers that matter. Now the numbers, not because spreadsheets are fun, but because agentic HR without measurement becomes a budget line item with no defensibility.

1119
01:03:09,800 --> 01:03:16,800
And the numbers that matter aren't the ones vendors put on slides, they're the ones that tie directly to throughput, risk and human time.

1120
01:03:16,800 --> 01:03:30,800
Start with the ROI components. In this architecture, value comes from three places: fewer human touches per case, faster cycle times and fewer exception loops. Ticket triage drives the first one, onboarding drives the second, candidate screening drives the third.

1121
01:03:30,800 --> 01:03:41,800
But only if you treat it as structured scoring plus review, not an automated decision engine. Then the cost components: licensing is the obvious line, but it's never the dominant cost in production. Build effort matters, but it's one-time.

1122
01:03:41,800 --> 01:03:53,800
Governance and monitoring are the ongoing costs people forget to price in. Log analytics retention, audit evidence storage, quarterly sampling, bias checks and incident response practice.

1123
01:03:53,800 --> 01:04:09,800
If you don't fund those, you aren't saving money; you're deferring it into a compliance event. So the model is simple. Value is reclaimed hours times loaded cost, plus any measurable reduction in external spend (like agency costs or contractor triage support), plus any cycle-time improvement that has a business impact.

1124
01:04:09,800 --> 01:04:25,800
Costs are licenses, build, run and governance overhead. Nothing else counts. Here's the math framing you can actually use: pick one workflow, measure the baseline, then measure after. For ticket triage: average handle time per ticket, total ticket volume, percent auto-resolved and reopen rate.

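The ROI model described here, value from reclaimed hours and reduced external spend versus license, build, run and governance costs, reduces to a few lines. All figures in the example are illustrative placeholders, not benchmarks.

```python
def roi(reclaimed_hours, loaded_hourly_cost, external_spend_reduction,
        license_cost, build_cost, run_cost, governance_cost):
    """Value = reclaimed hours x loaded cost + measurable external-spend
    reduction; costs = licenses + build + run + governance. Returns
    (net value, value/cost ratio). Inputs are illustrative placeholders."""
    value = reclaimed_hours * loaded_hourly_cost + external_spend_reduction
    cost = license_cost + build_cost + run_cost + governance_cost
    return round(value - cost, 2), round(value / cost, 2)
```

The point of writing it down is that governance overhead sits inside the cost term; leaving it out doesn't improve ROI, it just hides the compliance bill.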
1125
01:04:25,800 --> 01:04:36,800
If you reduce human touches, you reclaim hours. If you reduce reopen rate, you reduce second pass work. If you reduce time to first response, you reduce escalations and executive noise. That's ROI without fantasy.

1126
01:04:36,800 --> 01:04:46,800
For onboarding, measure time from offer accepted to day one ready plus the number of provisioning failures per hire plus the number of manual follow ups required to complete milestones.

1127
01:04:46,800 --> 01:04:57,800
If you cut missed access and reduce rework, new hires become productive sooner. That's the only onboarding ROI that matters. Time to productivity, not number of welcome emails sent.

1128
01:04:57,800 --> 01:05:10,800
For candidate screening, measure screening time per candidate, percentage of borderline escalations and review variance. If your structured rubric cuts screening time while increasing consistency, you get measurable savings and a defensibility upgrade.

1129
01:05:10,800 --> 01:05:25,800
If it just moves work into "review the AI score," you didn't automate anything; you added a new step. Now KPIs: keep them ruthless. For ticket triage: deflection rate, auto-resolved percent, time to first response, reopen rate, escalation rate and human touches per case.

1130
01:05:25,800 --> 01:05:41,800
Track override frequency too, because it tells you when classification is drifting or when the knowledge boundary is wrong. For onboarding: percent of hires day-one ready, provisioning retry counts, idempotency violations, milestone completion times, and number of late milestones per hire.

1131
01:05:41,800 --> 01:05:47,800
Late milestones are the real indicator of systemic failure, because they correlate with manager frustration and new-hire churn.

1132
01:05:47,800 --> 01:06:05,800
For screening: rubric completion rate, borderline rate, approval turnaround time, override rate by reviewer and consistency across rubric versions. And you need at least one fairness signal. Not a legal conclusion; a monitoring indicator that tells you "selection rates changed materially when we changed rubric v2 to v3."

1133
01:06:05,800 --> 01:06:14,800
If you can't detect that, you can't claim you're managing bias. Finally, the executive narrative: scale doesn't come from smarter AI. Scale comes from fewer exceptions.

1134
01:06:14,800 --> 01:06:26,800
Every exception is an entropy generator, a one-off rule, a manual override, a hidden permission, a side channel. Your agent gets better when your system gets tighter. That's the uncomfortable truth behind every successful deployment.

1135
01:06:26,800 --> 01:06:35,800
So when someone asks, "What's the ROI?" the answer is not "the model is amazing." The answer is: we reduced human touches, shortened cycle times and made high-risk decisions auditable by design.

1136
01:06:35,800 --> 01:06:37,800
So that's the transformation.

1137
01:06:37,800 --> 01:06:49,800
Now, governed, agentic workflows turn HR from a reactive queue into a controlled system that can actually scale. Here's the challenge: next week, pick one workflow you already have pain around. Ticket triage is usually the best start.

1138
01:06:49,800 --> 01:06:58,800
Add one confidence gate that forces a human decision when risk crosses your line. And add one evidence record in Dataverse that lets you reconstruct what happened without digging through chats.

1139
01:06:58,800 --> 01:07:08,800
For more blueprints like this (Copilot Studio as the brain, Logic Apps Standard as the muscle, governance as the control plane), subscribe to the M365 FM podcast.

1140
01:07:08,800 --> 01:07:13,800
And if you've got a real HR agent failure mode you want dissected, connect with Mirko Peters on LinkedIn and send it.


Founder of m365.fm, m365.show and m365con.net

Mirko Peters is a Microsoft 365 expert, content creator, and founder of m365.fm, a platform dedicated to sharing practical insights on modern workplace technologies. His work focuses on Microsoft 365 governance, security, collaboration, and real-world implementation strategies.

Through his podcast and written content, Mirko provides hands-on guidance for IT professionals, architects, and business leaders navigating the complexities of Microsoft 365. He is known for translating complex topics into clear, actionable advice, often highlighting common mistakes and overlooked risks in real-world environments.

With a strong emphasis on community contribution and knowledge sharing, Mirko is actively building a platform that connects experts, shares experiences, and helps organizations get the most out of their Microsoft 365 investments.