June 5, 2026

Connect Microsoft Copilot to Predictive Power BI

This episode explores how organizations can connect Microsoft Copilot with the predictive capabilities of Power BI to move beyond simple reporting and toward proactive, data-driven decision-making. The discussion highlights that while Copilot excels at natural language interactions and summarizing information, its real business value increases when it can access trusted analytical models, forecasts, and governed business data from Power BI.

The episode explains that many organizations still use Power BI primarily as a dashboard destination. However, the future lies in treating Power BI as a semantic and analytical layer that feeds AI-powered experiences. Instead of navigating reports manually, users can ask questions in natural language through Copilot and receive contextual answers backed by governed Power BI models.

A key theme is predictive analytics. By combining Copilot with Power BI datasets, organizations can surface forecasts, trends, risk indicators, and business predictions directly within conversational workflows. This allows business users to move from asking “What happened?” to “What is likely to happen next?” and “What actions should we take?”

The conversation also emphasizes the importance of governance, data quality, and semantic modeling. Predictive insights are only trustworthy when Copilot is connected to well-structured Power BI models with clear lineage, security controls, and business definitions. Without these foundations, AI-generated answers can become inconsistent or misleading.

Listeners will gain practical insights into building an architecture where Copilot acts as the conversational interface while Power BI provides the analytical intelligence underneath. The episode demonstrates how this combination can improve decision-making, increase accessibility to business insights, and help organizations create a more intelligent and proactive data culture.

You bridge the gap between Copilot and predictive Power BI by enabling intuitive, AI-powered analytics. Copilot turns data exploration into a conversation, making complex insights accessible in real time. AI-driven predictive analytics lets you act proactively, improving forecasting accuracy and operational efficiency. Business users can ask questions in plain language and get answers quickly instead of waiting on an analyst. Microsoft Copilot connects your team to predictive models, moving you from traditional reporting toward intelligent decision-making.

Key Takeaways

Enable Copilot in Power BI by checking tenant settings and ensuring your organization has the right capacity.
Use Power BI Premium or eligible Pro licenses to access advanced AI features and ensure all users can utilize Copilot tools.
Connect to high-quality data sources like Excel and SQL Server for accurate AI-driven insights and dashboards.
Assign role-based access to protect sensitive data and ensure only trained users can access Copilot features.
Ask questions in plain language using Copilot to get instant, actionable insights without needing complex queries.
Automate report generation with Copilot to save time and keep your team informed about key trends and insights.
Follow best practices for data governance to maintain accuracy, security, and compliance in your BI environment.
Regularly review and update your integration settings to keep your AI and BI features secure and effective.

Bridge the Gap: Prerequisites for Copilot and Power BI

Before you connect Copilot to Power BI, you need to prepare your environment. This preparation ensures that you unlock the full value of AI-powered dashboards and predictive analytics. Let’s break down the essential tools, access, and security steps you need to take.

Essential Tools and Access

Microsoft Copilot Requirements

To start, you must enable Copilot for Power BI in Microsoft Fabric. Make sure your organization has access to a supported capacity, such as F64 or P1, and that your region supports Copilot features. You also need to turn on the tenant switch for Copilot. If your business operates outside the US or France, review compliance and geographical settings to meet local requirements.

Tip: Always check your tenant settings and capacity before you begin. This step prevents delays during integration.

Power BI Licensing

You need the right Power BI licensing to use advanced AI features. Copilot works best with Power BI Premium or certain Pro license tiers. Review your current licenses and upgrade if necessary. This step ensures that all users can access Copilot tools and dashboards without interruption.

Requirement Type	Details
Tenant Settings	Enable Copilot and Azure OpenAI features for users.
Access to Capacity	Use a dedicated Fabric Copilot capacity or a workspace linked to supported capacity.
Power BI Licensing	Power BI Premium or eligible Pro license required.
Admin Configuration	Set up security policies, data governance, and user permissions.
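
The requirements in the table can be expressed as a small pre-flight check. The sketch below is illustrative only: the `TenantState` fields and the `SUPPORTED_SKUS` set are assumptions based on the examples named in this article (F64 and P1), not values read from any Microsoft API, so confirm the current list in Microsoft's documentation before relying on it.

```python
from dataclasses import dataclass

# Assumption: only the SKUs this article names as examples; extend as needed.
SUPPORTED_SKUS = {"F64", "P1"}


@dataclass
class TenantState:
    copilot_switch_on: bool   # tenant setting for Copilot / Azure OpenAI features
    capacity_sku: str         # SKU of the capacity backing the workspace
    region_supported: bool    # whether Copilot is available in the capacity's region
    cross_geo_allowed: bool   # only matters when the region is not supported


def readiness_gaps(state: TenantState) -> list[str]:
    """Return human-readable gaps; an empty list means the checks passed."""
    gaps = []
    if not state.copilot_switch_on:
        gaps.append("Enable the Copilot tenant setting")
    if state.capacity_sku.upper() not in SUPPORTED_SKUS:
        gaps.append(f"Capacity {state.capacity_sku} is not in the supported list")
    if not state.region_supported and not state.cross_geo_allowed:
        gaps.append("Allow cross-geo processing or move to a supported region")
    return gaps
```

Calling `readiness_gaps(TenantState(True, "F64", True, False))` returns an empty list, while a tenant with the switch off, an unsupported SKU, and no cross-geo setting returns three gaps to work through.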

Data and Security Readiness

Supported Data Sources

Copilot and Power BI support a wide range of data sources. You can connect to Microsoft Excel, Azure, Salesforce, SQL Server, SharePoint, and many third-party SaaS platforms. For the best AI-driven insights, use well-structured, high-quality data. Organize your tables with clear relationships, logical groupings, and consistent data types. Define your KPIs and document your data model to help Copilot deliver accurate dashboards and reports.

User Roles and Permissions

Security and governance play a key role in successful integration. Assign role-based access control to protect sensitive data. Encrypt data at rest and in transit. Make sure only trained users have access to Copilot features. Set up security groups and grant permissions based on user roles. This approach supports compliance with standards like GDPR and HIPAA, and it helps you monitor for vulnerabilities.
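
To picture the access rule described above, here is a minimal sketch of a role gate. The role names mirror Power BI workspace roles, and this article later states that Copilot needs admin, member, or contributor access to the workspace. The `completed_training` flag is a hypothetical field added only to illustrate the "trained users" condition; Power BI itself does not track it.

```python
from enum import Enum


class Role(Enum):
    ADMIN = "Admin"
    MEMBER = "Member"
    CONTRIBUTOR = "Contributor"
    VIEWER = "Viewer"


# Roles this article lists as sufficient for using Copilot in a workspace.
COPILOT_ROLES = {Role.ADMIN, Role.MEMBER, Role.CONTRIBUTOR}


def can_use_copilot(role: Role, completed_training: bool) -> bool:
    """Allow Copilot only for an eligible role AND a completed training flag."""
    return role in COPILOT_ROLES and completed_training
```

In practice you would enforce the role part through workspace membership and security groups, and keep the training condition in whatever process you use to add people to those groups.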

Note: Good data governance ensures that your dashboards remain accurate, secure, and compliant. It also helps users validate AI-generated outputs and maintain trust in your BI environment.

By following these prerequisites, you set the stage for a seamless Copilot and Power BI integration. You empower users to create powerful dashboards, unlock predictive AI, and drive business value from your data.

Power BI Copilot Setup

Setting up Copilot in your Power BI environment unlocks the full value of AI-powered dashboards. You need to follow a few important steps to make sure your users can access Copilot features and connect them to the right workspaces. This setup ensures your business gets the most from predictive analytics and conversational BI.

Enable Copilot Integration

Admin Center Configuration

You start by configuring Copilot in the Admin Center. This process gives your organization access to AI features and prepares your data environment for advanced dashboards.

Go to the Fabric Admin portal at https://app.fabric.microsoft.com/admin-portal or select the Settings gear icon in Power BI and choose Admin portal.
In the Admin portal, select Tenant settings.
Find the setting labeled "Users can use Copilot and other features powered by Azure OpenAI" and toggle it on.
In the Applies to section, choose which users or groups can use Copilot, then select Apply to save your changes.
If your data needs to be processed outside your geographic region, enable the appropriate setting and apply the changes.
Make sure your workspace uses the correct capacity, such as Power BI Premium or a paid Fabric capacity.

Tip: Review your tenant settings regularly. This helps you keep your AI and BI features up to date and secure.

Assign Permissions

Assigning permissions is a key step. You need to decide which users or groups can access Copilot. Use role-based access to control who can create, edit, or view dashboards powered by AI. This approach protects sensitive data and supports compliance with business policies.

Connect Copilot to Power BI Workspace

Workspace Selection

After enabling Copilot, you connect it to a specific Power BI workspace. You must have admin, member, or contributor access to a workspace assigned to a paid Fabric or Power BI Premium capacity with Copilot enabled.

Open a Power BI report.
Select Copilot in the ribbon. The Copilot pane will appear.
When prompted, select the workspace you want to associate with Copilot.
Use the Settings gear in the bottom-right corner of Power BI to switch the workspace or check which workspace your report uses.

Note: Choosing the right workspace ensures your dashboards use the correct data and AI features.

Integration Check

Check your integration to confirm everything works. Open a report in your selected workspace and use Copilot to ask questions or generate insights. If you see AI-powered suggestions and dashboards, your setup is complete. This step helps you verify that your business can access predictive analytics and conversational BI without issues.

By following these steps, you set up Power BI Copilot to deliver real-time value from your data. You empower users to create smarter dashboards, use AI for deeper insights, and drive better business outcomes.

AI-Driven Predictive Analytics Integration

Activate Predictive Features

AI Visuals and Forecasting

You can unlock the full power of predictive analytics by activating AI features in your dashboards. Start by making sure you meet the requirements for Copilot. These requirements depend on whether you use Copilot as a standalone experience or within the report pane. After the initial setup, install any necessary plugins and set your preferences to match your business needs.

Power BI Copilot offers a range of AI-driven visuals and forecasting tools that help you find patterns and trends in your data. Here are some of the most valuable features you can use:

Anomaly detection highlights outliers on line charts and explains unusual data points.
Forecasting tools in the Analytics pane let you predict future trends directly from your dashboards.
Clustering automatically groups similar data points in scatter or table visuals.
Key Influencers and Decomposition Tree visuals help you understand what drives changes in your data.
Smart Narrative and Q&A visuals generate automated explanations and answer questions about your dashboards.
AutoML in Fabric Dataflows brings predictive scoring into your BI pipeline, supporting churn scores, demand forecasts, and risk models.

These AI features make it easier for users to spot trends, explain changes, and act on predictive insights without needing advanced technical skills.
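
To make the idea of forecasting concrete, here is a minimal sketch of simple exponential smoothing in plain Python. It is not Power BI's implementation; it only shows the kind of calculation a forecasting tool performs on a time series. The `alpha` value and the sample numbers are made up for illustration.

```python
def exponential_smoothing_forecast(series, alpha=0.5, horizon=3):
    """Simple exponential smoothing: a flat forecast at the last smoothed level."""
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for value in series[1:]:
        # Blend the newest observation with the running level.
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon


monthly_sales = [100.0, 110.0, 105.0, 120.0]  # made-up sample data
forecast = exponential_smoothing_forecast(monthly_sales, alpha=0.5, horizon=2)
```

A higher `alpha` reacts faster to recent changes, while a lower one smooths out noise. Production forecasts usually add trend and seasonality terms, which this sketch leaves out.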

Model Configuration

To get the most value from your dashboards, configure your predictive models to match your business goals. Use built-in tools to set up forecasting, anomaly detection, and clustering. You can also integrate Azure Cognitive Services for text analytics or use R and Python scripts for custom models. Adjust your model settings to focus on the metrics that matter most to your team. This approach ensures your dashboards deliver accurate, actionable insights every time.
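
As one example of a custom model you might write in Python, the sketch below flags anomalies with a z-score rule. The threshold of 2.0 and the sample data are assumptions for illustration. In a Power BI Python script the input would arrive as a pandas DataFrame; plain lists are used here to keep the example self-contained.

```python
from statistics import mean, pstdev


def flag_anomalies(values, threshold=2.0):
    """Return the indices whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


daily_orders = [50, 52, 49, 51, 50, 48, 120]  # made-up sample with one spike
anomalies = flag_anomalies(daily_orders)
```

A z-score rule is easy to explain to business users, but a single extreme value inflates the standard deviation, so consider a median-based rule if your data has frequent spikes.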

Use Copilot for Insights

Natural Language Queries

Copilot changes the way you interact with your data. You can ask questions in plain language, such as "Why did sales drop yesterday?" and receive instant, AI-generated analyses. This conversational approach lets you chat with your dashboards and get answers without needing to write complex queries. Users can quickly find information, explore trends, and understand the story behind the numbers.

Tip: Use natural language queries to save time and reduce the need for follow-up questions to analysts.
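
A governed answer ultimately rests on a query that the semantic model evaluates. The sketch below shows how a DAX query can be packaged for the Power BI REST API `executeQueries` endpoint. It is not a description of how Copilot works internally, and it only builds the request without sending it. The dataset ID, access token, table, column, and measure names are all placeholders.

```python
import json

API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def build_execute_queries_request(dataset_id: str, dax: str, token: str):
    """Build (url, headers, body) for executeQueries without sending anything."""
    url = f"{API_ROOT}/datasets/{dataset_id}/executeQueries"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "queries": [{"query": dax}],
        "serializerSettings": {"includeNulls": True},
    })
    return url, headers, body


# Hypothetical table, column, and measure names.
dax = "EVALUATE SUMMARIZECOLUMNS('Date'[Month], \"Sales\", [Total Sales])"
url, headers, body = build_execute_queries_request("<dataset-id>", dax, "<access-token>")
```

Because the query references a measure defined in the model, the business definition of sales lives in one governed place rather than in each question a user types.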

Automated Reports

Copilot also helps you create automated reports that summarize key insights from your dashboards. You can generate summaries, highlight important trends, and share findings with your team. This automation reduces manual work and ensures everyone stays informed. By using Copilot and AI features together, you empower your business to make smarter decisions and unlock the full power of your data.

Integration Steps: Power BI and Copilot

Connecting Power BI and Copilot gives you a seamless AI experience that transforms how you interact with dashboards. You can follow a clear process to ensure your integration works smoothly and delivers value to your business.

Setup Checklist

Start with a checklist to prepare your environment for integration. This helps you avoid common mistakes and ensures a smooth experience for all users.

Launch Power BI Desktop on your device.
Go to the File menu, select Options and settings, then select Options.
Open the Preview Features section.
Turn on Quick Measure Suggestions to enable AI-powered recommendations.
Confirm that your workspace uses the correct capacity and licensing.
Check that your data sources are connected and up to date.
Review user roles and permissions for security.
Document your data model and relationships for easy troubleshooting.

Tip: Completing this checklist before you start will save you time and prevent integration issues later.

Step-by-Step Workflow

You can follow these steps to connect Copilot and Power BI. Each step builds on the last, so move through them in order for the best results.

Access Integration Settings

Open Power BI Desktop and open the Options dialog from the File menu. In the Preview Features section, make sure Quick Measure Suggestions is enabled. This feature allows Copilot to suggest AI-driven calculations and insights for your dashboards. You should also review your workspace settings to confirm that Copilot features are active and available for your team.

Link Accounts

Sign in with the Microsoft account that has the correct permissions for both Power BI and Copilot. Link your accounts by following the prompts in the integration settings. This step ensures that AI features work across your dashboards and that your data stays secure. You may need to verify your identity or accept updated terms of use.

Test Connection

After linking your accounts, test the connection by opening a report and using Copilot to generate insights. Ask a question in natural language or request a summary of your data. If you see AI-powered suggestions and automated reports, your integration works. If not, review your checklist and settings to find any missing steps.

Note: Testing the connection helps you catch problems early and ensures a smooth AI experience for all users.

Best Practices

You can follow best practices to get the most from your integration and maintain a reliable BI environment.

Establish clear relationships between tables in your data model. This step ensures that Copilot delivers accurate insights in your dashboards.
Standardize calculation logic and naming conventions for all measures. Consistency improves report generation and makes your dashboards easier to understand.
Separate fact tables from dimension tables. Use descriptive names for columns and group related data logically.
Check that all data types are correct and values are consistent. This prevents errors and improves the quality of AI recommendations.
Define key performance indicators (KPIs) that matter to your business. Set up role-level access to protect sensitive information.
Keep your data model well documented. Good documentation supports ongoing management and helps new users learn the system.

Tip: Following these best practices will help you avoid common challenges, such as incorrect code, governance risks, and operational blind spots. You will also strengthen your team's ability to manage and evolve your BI system over time.
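
Some of the practices above can be checked automatically. The sketch below assumes a hand-written description of your model as a Python dictionary (a hypothetical format, not something Power BI exports) and looks for two problems: fact tables with no relationships and columns whose sample values mix data types.

```python
def model_issues(tables: dict) -> list[str]:
    """Check a model description for missing relationships and mixed column types.

    `tables` maps a table name to a dict with:
      "kind": "fact" or "dimension"
      "columns": {column name: list of sample values}
      "related_to": list of related table names
    """
    issues = []
    for name, spec in tables.items():
        if spec["kind"] == "fact" and not spec.get("related_to"):
            issues.append(f"{name}: fact table has no relationships")
        for column, samples in spec["columns"].items():
            # Ignore missing values when comparing types.
            types = {type(v).__name__ for v in samples if v is not None}
            if len(types) > 1:
                issues.append(f"{name}.{column}: mixed types {sorted(types)}")
    return issues


model = {  # made-up example model
    "Sales": {"kind": "fact", "columns": {"Amount": [10.0, "12", None]}, "related_to": []},
    "Date": {"kind": "dimension", "columns": {"Year": [2024, 2025]}, "related_to": []},
}
issues = model_issues(model)
```

Running a check like this before each release gives you an early warning when a new column or table drifts away from the conventions your team agreed on.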

By following these steps and best practices, you create a strong foundation for integrating Power BI and Copilot. You empower your team to use AI for deeper insights, create smarter dashboards, and deliver real value from your data.

Troubleshooting and Optimization

Common Integration Issues

Authentication and Permissions

You may encounter authentication or permission errors when connecting Copilot to Power BI. These issues often arise if users lack the correct access or if tenant settings are not configured. To resolve this, check that each user has the right role in your workspace. Review admin settings to confirm Copilot features are enabled for your team. If you see login failures, verify that your Microsoft account matches the required permissions for both BI and Copilot. Always update your credentials and review access controls regularly. This practice protects your dashboards and ensures a smooth AI experience.

Data Sync Problems

Data sync problems can disrupt your dashboards and reduce the value of your BI environment. You should monitor scheduled refreshes and test them under load. Validate that your data sources connect properly and that refresh times meet your business needs. If you notice missing or outdated data, check your workspace capacity and review your data model documentation. Automated visualizations and narrative summaries help highlight key metrics and trends, making it easier to spot sync issues. Collaborative editing and integration with Microsoft Fabric improve your experience and keep your dashboards up to date.
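
Monitoring refreshes can start with a simple rule: alert when the latest refresh failed or when the last success is too old. The sketch below assumes you already have a refresh history as a list of records with `status` and `endTime` fields, newest first. The sample records and the 24-hour limit are made up, so adapt the field names to whatever your monitoring source returns.

```python
from datetime import datetime, timedelta, timezone


def refresh_alerts(history, now, max_age=timedelta(hours=24)):
    """Flag a failed latest refresh and a stale or missing last success.

    `history` is a list of dicts with "status" and ISO 8601 "endTime" keys,
    ordered newest first.
    """
    alerts = []
    if history and history[0]["status"] == "Failed":
        alerts.append("Most recent refresh failed")
    successes = [r for r in history if r["status"] == "Completed"]
    if not successes:
        alerts.append("No successful refresh on record")
    else:
        last_ok = datetime.fromisoformat(successes[0]["endTime"])
        if now - last_ok > max_age:
            alerts.append("Last successful refresh is older than the allowed age")
    return alerts


sample_history = [  # made-up records
    {"status": "Failed", "endTime": "2026-06-04T06:00:00+00:00"},
    {"status": "Completed", "endTime": "2026-06-02T06:00:00+00:00"},
]
now = datetime(2026, 6, 4, 12, 0, tzinfo=timezone.utc)
alerts = refresh_alerts(sample_history, now)
```

With the sample data, both alerts fire: the latest run failed, and the last success is more than a day old.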

Performance and Security

Optimizing performance and maintaining security are essential for delivering reliable dashboards and protecting business data. You can use several techniques to enhance your BI environment:

Track KPIs such as time-to-triage, implementation time, and refresh time improvements.
Extend governance to AI usage by defining who can run Copilot scans and logging all sessions.
Train your team by using Copilot as a training scaffold and investing in model hygiene training.
Validate Copilot's recommended DAX for correctness and test scheduled refreshes under load.
Ensure tenant data handling policies manage data indexing and privacy.

Comprehensive AI analytics require integration with enterprise systems, such as HR and financial systems, to provide a holistic view of AI adoption. Ongoing maintenance tasks include regular API updates, performance optimization, and security updates to ensure the dashboard's effectiveness and compliance with evolving standards.

Security measures protect your data and support compliance. You should:

Evaluate external sharing policies to control how sensitive data is shared outside your organization.
Develop a data classification system to categorize data based on sensitivity levels.
Review and implement access controls to protect sensitive data effectively.

You can also use sensitivity labels and data loss prevention policies to classify and protect information. Continuous monitoring of user activity on sensitive data helps you respond to risks in real time. These steps ensure your dashboards deliver power and value while keeping your BI experience secure.

Maximize Value with Power BI Copilot and AI

Business Use Cases

Predictive Forecasting

You can use predictive forecasting to transform your business planning. Power BI Copilot helps you predict future outcomes by analyzing historical data and market trends. This approach supports better decisions in sales, finance, and operations. For example, you can forecast sales for each region, plan inventory, or predict equipment failures before they happen. The table below shows how different industries use predictive forecasting with Power BI:

Use Case	Business Scenario	Power BI Solution
Sales and Revenue Forecasting	Predict quarterly sales and adjust strategies by region.	Copilot generates forecasts using historical data and market variables.
Inventory Optimization	Balance stock levels across warehouses.	Power BI analyzes sales and demand patterns in real time, reducing stockouts and excess inventory.
Financial Planning	Project revenue and expenses for reporting.	Predictive models forecast trends, with smart narratives explaining deviations.
Resource Allocation in Healthcare	Predict patient admissions for staffing.	Predictive models analyze trends, and Copilot provides summaries for administrators.

You can see real results from predictive forecasting. A food processing company used Power BI to predict equipment failures two weeks in advance. This action prevented costly breakdowns and saved $68,000. An electronics manufacturer improved forecast accuracy and reduced stockouts, justifying their investment in three months.

Trend Analysis

Trend analysis helps you spot patterns and changes in your data. Power BI Copilot makes this process simple and fast. You can generate summaries of your dashboards in seconds. These summaries highlight key trends, insights, and possible issues. Copilot also suggests relevant report pages based on your data model. You can select these suggestions or ask for a custom report to answer a specific question.

With AI, you do not need to spend hours searching for trends. Power BI Copilot automatically finds insights, outliers, and important metrics. This feature streamlines your decision-making and helps you act quickly. You can use advanced visuals and smart suggestions to enhance your dashboards and improve your BI experience.

Adoption and Impact

User Training

You can drive adoption by building a strong training program. Start with a pilot group to test Power BI Copilot in your organization. Choose users who are ready to learn and can share their experience with others. Build an internal champions network to support new users. Use contextual training tools that provide real-time help as users explore dashboards.

Tip: Embed AI governance in your training to ensure compliance and security.

Measuring Success

You should measure the impact of Power BI Copilot and AI on your business. Track key metrics to understand how users interact with dashboards and how much value you gain. Common metrics include:

Daily Active Users (DAU) and Weekly Active Users (WAU)
Feature Adoption Rate
Time to First Value
Productivity Impact (time saved, tasks automated)

You can visualize these metrics with KPI cards, line charts, and gauge charts. Collect feedback from users to learn how Copilot improves collaboration and efficiency. Highlight wins from pilot phases, such as a 30% reduction in reporting time. These steps help you show the power of your BI investment and guide future improvements.
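
The adoption metrics listed above are straightforward to compute from a usage log. The sketch below assumes a list of `(user, date, feature)` tuples and a feature label of `"copilot"`; both are hypothetical, as are the sample users and dates.

```python
from datetime import date, timedelta


def active_users(events, start, end):
    """Distinct users with at least one event between start and end, inclusive."""
    return {user for user, day, _ in events if start <= day <= end}


def adoption_metrics(events, today, licensed_users):
    """Compute DAU, WAU (trailing 7 days), and Copilot adoption rate.

    `events` is a list of (user, date, feature) tuples.
    """
    dau = len(active_users(events, today, today))
    wau = len(active_users(events, today - timedelta(days=6), today))
    copilot_users = {user for user, _, feature in events if feature == "copilot"}
    rate = len(copilot_users) / licensed_users if licensed_users else 0.0
    return {"dau": dau, "wau": wau, "copilot_adoption_rate": rate}


today = date(2026, 6, 5)
events = [  # made-up usage log
    ("ana", date(2026, 6, 5), "copilot"),
    ("ben", date(2026, 6, 5), "report_view"),
    ("ana", date(2026, 6, 3), "report_view"),
    ("cho", date(2026, 6, 1), "copilot"),
    ("dee", date(2026, 5, 20), "report_view"),
]
metrics = adoption_metrics(events, today, licensed_users=4)
```

Here the adoption rate counts anyone who has ever used Copilot against the number of licensed users. You may prefer a rolling window so the number reflects current habits rather than one-time trials.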

Integrating Copilot with Predictive Power BI transforms your analytics experience. You gain faster insights, improved teamwork, and a more efficient business workflow. AI lets you ask questions in plain language and receive instant, actionable answers. To get started, open Power BI, use Copilot to explore your data, and review the reports it creates. Keep learning by joining community courses and sharing your experience with your team.

Stay curious and keep building your skills—your next breakthrough could be one question away.

FAQ

What do I need to enable Copilot in Power BI?

You need Power BI Premium or a supported Fabric capacity. Make sure your admin enables Copilot in the Admin portal. Assign the right permissions to users who need access.

Can I use Copilot with any data source?

Copilot works best with structured data from supported sources like Excel, Azure, and SQL Server. You should organize your data model and check compatibility before starting.

How do I ask Copilot questions in Power BI?

You type your question in plain language in the Copilot pane. For example, ask, "Show sales trends for last quarter." Copilot will generate insights or visuals based on your request.

Is my data secure when using Copilot?

Microsoft uses enterprise-grade security. You control user access and data sharing. Always review permissions and use sensitivity labels to protect sensitive information.

What should I do if Copilot features are missing?

Check your licensing and workspace capacity. Make sure your admin has enabled Copilot. Refresh your browser or restart Power BI Desktop if you still do not see the features.

Can Copilot help with predictive analytics?

Yes! Copilot can generate forecasts, highlight trends, and explain anomalies. You can use AI visuals and natural language queries to unlock predictive insights from your dashboards.

How do I train my team to use Copilot?

Start with a pilot group.
Use built-in tutorials and tooltips.
Encourage users to ask questions in plain language.
Share best practices and success stories.

Training helps your team get comfortable and confident with Copilot.


1
00:00:00,000 --> 00:00:02,320
Copilot is everywhere in Microsoft 365.

2
00:00:02,320 --> 00:00:04,800
It sits in Word, Excel, Teams, and Outlook,

3
00:00:04,800 --> 00:00:06,840
and it's likely running in a dozen other places

4
00:00:06,840 --> 00:00:09,080
your organization uses every single day.

5
00:00:09,080 --> 00:00:10,240
But here is the problem.

6
00:00:10,240 --> 00:00:12,360
Most of the time, your teams are using it wrong.

7
00:00:12,360 --> 00:00:13,680
They are specifically using it wrong

8
00:00:13,680 --> 00:00:15,680
when they ask it to predict the future.

9
00:00:15,680 --> 00:00:17,480
The assumption people make is that Copilot

10
00:00:17,480 --> 00:00:19,760
is a reasoning engine that can read data

11
00:00:19,760 --> 00:00:21,680
and figure out what comes next.

12
00:00:21,680 --> 00:00:24,360
You type a prompt like, "What will our Q2 sales be?"

13
00:00:24,360 --> 00:00:27,200
And you expect Copilot to digest your historical numbers,

14
00:00:27,200 --> 00:00:29,800
understand your business context, and return a forecast.

15
00:00:29,800 --> 00:00:32,520
It feels simple and intuitive, but it is completely wrong.

16
00:00:32,520 --> 00:00:34,600
In reality, Copilot is a reasoning engine

17
00:00:34,600 --> 00:00:37,120
that works over language patterns rather than structure.

18
00:00:37,120 --> 00:00:39,480
It sees your tables, but it doesn't understand

19
00:00:39,480 --> 00:00:41,120
the relationships between them.

20
00:00:41,120 --> 00:00:43,360
It reads your numbers, but it has no idea

21
00:00:43,360 --> 00:00:46,280
what those digits actually mean for your specific business.

22
00:00:46,280 --> 00:00:48,680
The result is a tool that is confident, eloquent,

23
00:00:48,680 --> 00:00:50,360
and frequently hallucinating.

24
00:00:50,360 --> 00:00:52,280
This episode is about closing that gap.

25
00:00:52,280 --> 00:00:53,920
We aren't trying to make Copilot smarter.

26
00:00:53,920 --> 00:00:55,920
Instead, we are building architecture around it

27
00:00:55,920 --> 00:00:58,160
so that Copilot becomes what it's actually good at.

28
00:00:58,160 --> 00:00:59,920
We want it to orchestrate structured logic

29
00:00:59,920 --> 00:01:02,160
instead of just pretending to be a logic engine itself.

30
00:01:02,160 --> 00:01:04,880
We are moving away from the idea of chatting with spreadsheets

31
00:01:04,880 --> 00:01:06,200
and moving toward building systems

32
00:01:06,200 --> 00:01:08,920
where Copilot interprets what a person is asking for.

33
00:01:08,920 --> 00:01:10,680
Then, it delegates that request

34
00:01:10,680 --> 00:01:14,160
to a real predictive engine grounded in your semantic layer.

35
00:01:14,160 --> 00:01:16,320
We will expose exactly what is broken,

36
00:01:16,320 --> 00:01:19,000
show you how to fix it, and walk through the architecture

37
00:01:19,000 --> 00:01:20,000
that makes it work.

38
00:01:20,000 --> 00:01:22,880
There is no marketing language here, just structural clarity.

39
00:01:22,880 --> 00:01:24,640
The Copilot prediction problem.

40
00:01:24,640 --> 00:01:26,560
Every week, organizations ask Copilot

41
00:01:26,560 --> 00:01:27,680
the same kinds of questions.

42
00:01:27,680 --> 00:01:30,360
They want to know what customer churn will look like next quarter

43
00:01:30,360 --> 00:01:33,480
or they ask for a revenue forecast for the next fiscal year.

44
00:01:33,480 --> 00:01:35,080
They might even try to find out which products

45
00:01:35,080 --> 00:01:37,080
are most likely to see demand spikes.

46
00:01:37,080 --> 00:01:38,760
These are legitimate business questions,

47
00:01:38,760 --> 00:01:41,960
but the problem is what happens the moment you hit Enter.

48
00:01:41,960 --> 00:01:44,320
Copilot generates an answer that sounds plausible.

49
00:01:44,320 --> 00:01:46,920
The numbers look reasonable, the confidence level seems high,

50
00:01:46,920 --> 00:01:48,640
and you decide to act on that information.

51
00:01:48,640 --> 00:01:50,640
Then reality hits, and the actual results

52
00:01:50,640 --> 00:01:52,280
don't match what Copilot said at all.

53
00:01:52,280 --> 00:01:53,720
This isn't some random fluke.

54
00:01:53,720 --> 00:01:56,080
It is exactly what happens when you ask a language model

55
00:01:56,080 --> 00:01:59,560
to predict the future without giving it the structure it needs to function.

56
00:01:59,560 --> 00:02:00,640
The reason is simple.

57
00:02:00,640 --> 00:02:02,760
Copilot doesn't understand what revenue means

58
00:02:02,760 --> 00:02:04,560
in the context of your specific business.

59
00:02:04,560 --> 00:02:07,480
It doesn't know if that number includes refunds or excludes them,

60
00:02:07,480 --> 00:02:09,200
and it has no idea if you count revenue

61
00:02:09,200 --> 00:02:11,080
by the invoice date or the delivery date.

62
00:02:11,080 --> 00:02:13,000
Because it hasn't ingested your actual data,

63
00:02:13,000 --> 00:02:15,040
it doesn't know your seasonality patterns.

64
00:02:15,040 --> 00:02:17,640
It is working from pattern recognition at the token level

65
00:02:17,640 --> 00:02:19,800
rather than using actual business logic.

66
00:02:19,800 --> 00:02:21,760
And here is the thing: it can't do math.

67
00:02:21,760 --> 00:02:23,080
It cannot do the kind of real math

68
00:02:23,080 --> 00:02:24,680
that matters for a serious prediction.

69
00:02:24,680 --> 00:02:26,960
What it actually does is approximate language patterns.

70
00:02:26,960 --> 00:02:28,960
If the training data showed thousands of examples

71
00:02:28,960 --> 00:02:31,280
where Q1 sales followed certain patterns,

72
00:02:31,280 --> 00:02:33,680
Copilot can guess at something that sounds plausible.

73
00:02:33,680 --> 00:02:34,880
But it is still just a guess.

74
00:02:34,880 --> 00:02:36,680
It is a sophisticated, eloquent guess,

75
00:02:36,680 --> 00:02:38,000
but it is a guess nonetheless.

76
00:02:38,000 --> 00:02:39,680
The confidence is the real killer here.

77
00:02:39,680 --> 00:02:41,800
Copilot never whispers its uncertainty.

78
00:02:41,800 --> 00:02:44,720
It states its predictions like they are absolute facts.

79
00:02:44,720 --> 00:02:46,480
A user reads a sentence like,

80
00:02:46,480 --> 00:02:48,640
"Based on historical trends, your Q2 sales

81
00:02:48,640 --> 00:02:51,000
will likely reach $3.2 million,"

82
00:02:51,000 --> 00:02:53,000
and they assume they are looking at real analysis.

83
00:02:53,000 --> 00:02:54,760
They think Copilot has actually done the math

84
00:02:54,760 --> 00:02:56,640
when it hasn't done anything of the sort.

85
00:02:56,640 --> 00:02:58,200
There is another layer to this problem.

86
00:02:58,200 --> 00:02:59,400
Copilot doesn't know the difference

87
00:02:59,400 --> 00:03:00,800
between a forecast and an actual.

88
00:03:00,800 --> 00:03:02,800
It doesn't know which tables in your data warehouse

89
00:03:02,800 --> 00:03:05,520
contain real numbers and which ones are just projections.

90
00:03:05,520 --> 00:03:07,200
It doesn't understand your business rules

91
00:03:07,200 --> 00:03:09,760
and it cannot execute filters based on your company's

92
00:03:09,760 --> 00:03:10,600
specific logic.

93
00:03:10,600 --> 00:03:12,800
It just sees data and tries to match a pattern.

94
00:03:12,800 --> 00:03:14,800
When the data is messy, which is the case

95
00:03:14,800 --> 00:03:16,760
in almost every real organization,

96
00:03:16,760 --> 00:03:18,360
Copilot doesn't flag the mess,

97
00:03:18,360 --> 00:03:20,200
it just absorbs the mess and passes it along.

98
00:03:20,200 --> 00:03:22,800
If your historical sales data has inconsistent definitions

99
00:03:22,800 --> 00:03:24,240
across different regions,

100
00:03:24,240 --> 00:03:25,680
Copilot won't catch that error.

101
00:03:25,680 --> 00:03:27,480
It just predicts using all of it equally,

102
00:03:27,480 --> 00:03:29,400
which only serves to amplify the noise.

103
00:03:29,400 --> 00:03:31,480
Most organizations think this is a Copilot problem

104
00:03:31,480 --> 00:03:33,360
but it's actually an architecture problem.

105
00:03:33,360 --> 00:03:35,720
Copilot is doing exactly what it was built to do

106
00:03:35,720 --> 00:03:38,760
by reasoning over language and generating plausible outputs.

107
00:03:38,760 --> 00:03:40,200
But you are asking it to do something

108
00:03:40,200 --> 00:03:42,920
that requires structured computation, business logic

109
00:03:42,920 --> 00:03:44,120
and data governance.

110
00:03:44,120 --> 00:03:45,880
You're asking it to be a data analysis engine

111
00:03:45,880 --> 00:03:47,480
when it is actually a language model.

112
00:03:47,480 --> 00:03:50,000
The gap isn't Copilot's fault, it's ours.

113
00:03:50,000 --> 00:03:52,200
Why natural language queries fail at prediction.

114
00:03:52,200 --> 00:03:53,640
Let's get specific about what happens

115
00:03:53,640 --> 00:03:55,480
when you type a question into Copilot.

116
00:03:55,480 --> 00:03:58,680
Imagine a user opens Power BI to look at a dashboard

117
00:03:58,680 --> 00:03:59,840
full of sales data.

118
00:03:59,840 --> 00:04:02,560
They click the Copilot button and type a simple question.

119
00:04:02,560 --> 00:04:04,600
What will our Q2 sales be?

120
00:04:04,600 --> 00:04:06,640
The question is clear, it's reasonable

121
00:04:06,640 --> 00:04:08,200
and it is exactly the kind of thing

122
00:04:08,200 --> 00:04:10,400
any human analyst should be able to answer.

123
00:04:10,400 --> 00:04:11,840
But from Copilot's perspective,

124
00:04:11,840 --> 00:04:13,040
it sees something very different

125
00:04:13,040 --> 00:04:14,800
from what the user thinks it sees.

126
00:04:14,800 --> 00:04:17,320
Copilot sees a collection of tables, columns with numbers

127
00:04:17,320 --> 00:04:19,320
and maybe some dates or product categories.

128
00:04:19,320 --> 00:04:22,000
But here's what it doesn't see: it doesn't see relationships.

129
00:04:22,000 --> 00:04:25,040
It doesn't understand that the sales amount column in table A

130
00:04:25,040 --> 00:04:26,680
needs to be joined with product ID

131
00:04:26,680 --> 00:04:28,920
in table B through a specific business rule.

132
00:04:28,920 --> 00:04:30,480
It sees tables instead of a graph

133
00:04:30,480 --> 00:04:32,360
and it sees numbers instead of measures.

134
00:04:32,360 --> 00:04:35,120
More critically, Copilot doesn't see business logic.

135
00:04:35,120 --> 00:04:37,360
It doesn't know that revenue in your organization

136
00:04:37,360 --> 00:04:40,200
means invoiced sales, excluding returns and credits,

137
00:04:40,200 --> 00:04:43,280
calculated on the invoice date rather than the delivery date.

138
00:04:43,280 --> 00:04:44,800
It doesn't know that your fiscal year

139
00:04:44,800 --> 00:04:46,720
runs on a calendar different from the calendar year

140
00:04:46,720 --> 00:04:49,520
and it has no idea which regions use which currencies

141
00:04:49,520 --> 00:04:52,120
or whether certain product lines are being phased out.

142
00:04:52,120 --> 00:04:53,720
Those things live in your semantic model

143
00:04:53,720 --> 00:04:56,720
but they are not in the raw tables Copilot can access.

144
00:04:56,720 --> 00:04:58,800
When Copilot reasons about this data,

145
00:04:58,800 --> 00:05:00,960
it is doing something very specific.

146
00:05:00,960 --> 00:05:03,120
It is predicting the next token in a sequence

147
00:05:03,120 --> 00:05:05,200
based on patterns in its training data.

148
00:05:05,200 --> 00:05:07,120
That is literally what language models do.

149
00:05:07,120 --> 00:05:09,760
They see a pattern and they predict what comes next.

150
00:05:09,760 --> 00:05:11,800
If you feed in "Q1 sales were $2 million,

151
00:05:11,800 --> 00:05:15,520
Q2 sales were $2.3 million, Q3 sales were...",

152
00:05:15,520 --> 00:05:17,840
Copilot will generate a plausible next number.

153
00:05:17,840 --> 00:05:20,000
It is sophisticated statistical pattern matching

154
00:05:20,000 --> 00:05:21,480
but it isn't prediction.

155
00:05:21,480 --> 00:05:23,480
It is just extrapolation with a lot of confidence.

156
00:05:23,480 --> 00:05:26,760
The problem is that real prediction requires something else entirely.

157
00:05:26,760 --> 00:05:29,240
Real prediction requires understanding causation

158
00:05:29,240 --> 00:05:30,560
rather than just correlation.

159
00:05:30,560 --> 00:05:32,440
It requires knowing that a sale happens

160
00:05:32,440 --> 00:05:34,280
because a customer decided to buy

161
00:05:34,280 --> 00:05:36,440
not just because the calendar page turned over.

162
00:05:36,440 --> 00:05:38,360
It requires understanding seasonality

163
00:05:38,360 --> 00:05:40,280
as a structural fact about your business

164
00:05:40,280 --> 00:05:42,520
rather than a pattern in historical data.

165
00:05:42,520 --> 00:05:44,400
If demand drops 20% every summer

166
00:05:44,400 --> 00:05:47,480
because your core market goes on vacation, that is knowledge

167
00:05:47,480 --> 00:05:49,080
and Copilot simply doesn't have it.
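
To make the contrast concrete, here is a minimal sketch in Python. The two quarterly figures and the 20% summer drop come from the examples above. Treating Q3 as the summer quarter, the `seasonal_index` table, and the function names are assumptions made purely for illustration. The sketch compares a trend-only extrapolation with one that applies seasonality as an explicit rule.

```python
def trend_only_forecast(history):
    """Extrapolate the next value from the last observed step (no business knowledge)."""
    return history[-1] + (history[-1] - history[-2])


def seasonal_forecast(history, next_quarter, seasonal_index):
    """Apply an explicit seasonal index on top of the trend estimate."""
    return trend_only_forecast(history) * seasonal_index[next_quarter]


# Sales in millions of dollars for Q1 and Q2, as in the spoken example.
history = [2.0, 2.3]

# Assumed structural rule: demand drops 20% in the summer quarter (taken to be Q3 here).
seasonal_index = {"Q1": 1.0, "Q2": 1.0, "Q3": 0.8, "Q4": 1.0}

naive = trend_only_forecast(history)
informed = seasonal_forecast(history, "Q3", seasonal_index)
print(f"trend only: {naive:.2f}M, with seasonality: {informed:.2f}M")
```

The trend-only number looks reasonable on its face, which is exactly the problem: nothing in the sequence of past values reveals that the rule exists.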

168
00:05:49,080 --> 00:05:50,600
And this is where it gets worse.

169
00:05:50,600 --> 00:05:53,600
Copilot has no mechanism to know what it doesn't know.

170
00:05:53,600 --> 00:05:55,320
It doesn't have a way to signal uncertainty

171
00:05:55,320 --> 00:05:57,520
or say that it's working with incomplete information

172
00:05:57,520 --> 00:05:58,680
about your business model.

173
00:05:58,680 --> 00:06:01,480
It just generates an answer and assigns a confidence level.

174
00:06:01,480 --> 00:06:03,720
And the user interprets that confidence as justified

175
00:06:03,720 --> 00:06:05,880
because Copilot sounds so authoritative.

176
00:06:05,880 --> 00:06:07,280
This is the fundamental mismatch.

177
00:06:07,280 --> 00:06:09,800
Language models work by finding patterns in sequences

178
00:06:09,800 --> 00:06:11,400
but prediction requires structure

179
00:06:11,400 --> 00:06:14,680
like causal models, business rules, and validated assumptions.

180
00:06:14,680 --> 00:06:17,280
One is about language, while the other is about logic.

181
00:06:17,280 --> 00:06:19,040
You could try to fix this with better prompts

182
00:06:19,040 --> 00:06:21,400
by telling Copilot to consider seasonality,

183
00:06:21,400 --> 00:06:24,080
regional variations, and product mix changes.

184
00:06:24,080 --> 00:06:26,600
Copilot will incorporate those words into its output

185
00:06:26,600 --> 00:06:29,360
and it will sound much smarter, but it will still be guessing.

186
00:06:29,360 --> 00:06:31,320
Because Copilot cannot execute logic,

187
00:06:31,320 --> 00:06:33,720
it can only reflect language patterns back at you.

188
00:06:33,720 --> 00:06:35,200
The solution isn't better prompts

189
00:06:35,200 --> 00:06:36,960
or training Copilot harder.

190
00:06:36,960 --> 00:06:38,600
It's architecture. You need to give Copilot

191
00:06:38,600 --> 00:06:40,360
a structured layer to reason over,

192
00:06:40,360 --> 00:06:41,880
which means using a semantic model

193
00:06:41,880 --> 00:06:44,880
that encodes your business meaning instead of raw tables.

194
00:06:44,880 --> 00:06:47,240
You need to build actual predictive models in code

195
00:06:47,240 --> 00:06:49,400
that are grounded in real data science.

196
00:06:49,400 --> 00:06:51,680
Then, Copilot's job becomes orchestration,

197
00:06:51,680 --> 00:06:53,960
where it interprets what the user is asking for

198
00:06:53,960 --> 00:06:56,600
and routes that request to the right structured system

199
00:06:56,600 --> 00:06:57,840
to actually answer it.

200
00:06:57,840 --> 00:07:00,360
That shift from Copilot as a calculator to Copilot

201
00:07:00,360 --> 00:07:03,320
as an orchestrator is what this entire episode is about.
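
The orchestrator idea can be sketched in a few lines of Python. Everything here is hypothetical: the handler names, the keyword list, and the toy intent check stand in for what a real assistant would do with a language model. The point is only the shape: interpret the question, then hand the computation to a structured system.

```python
from typing import Callable, Dict


def actuals_handler(question: str) -> str:
    # Stand-in for a governed semantic model query (for example, a measure evaluation).
    return "routed to semantic model measure"


def forecast_handler(question: str) -> str:
    # Stand-in for a call to a trained predictive model.
    return "routed to predictive model"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "actuals": actuals_handler,
    "forecast": forecast_handler,
}

# Assumed hint words; a real system would not rely on keyword matching.
FORECAST_HINTS = ("will", "forecast", "predict", "next quarter")


def classify_intent(question: str) -> str:
    """Toy intent detection based on hint words in the question."""
    text = question.lower()
    return "forecast" if any(hint in text for hint in FORECAST_HINTS) else "actuals"


def orchestrate(question: str) -> str:
    """Interpret the question, then delegate the computation to a structured system."""
    return HANDLERS[classify_intent(question)](question)


print(orchestrate("What will our Q2 sales be?"))
print(orchestrate("What were sales last month?"))
```

Note that neither handler asks the language layer to produce a number. The number always comes from the system behind the handler.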

202
00:07:03,320 --> 00:07:05,960
What Power BI semantic models actually are.

203
00:07:05,960 --> 00:07:08,120
A semantic model is not a database view.

204
00:07:08,120 --> 00:07:09,240
If that is what you think it is,

205
00:07:09,240 --> 00:07:11,000
you need to stop now and recalibrate.

206
00:07:11,000 --> 00:07:14,760
A view is just a filtered slice of raw data organized by SQL.

207
00:07:14,760 --> 00:07:17,280
But a semantic model is something fundamentally different

208
00:07:17,280 --> 00:07:19,320
because it is a layer of meaning.

209
00:07:19,320 --> 00:07:20,280
Think of it this way.

210
00:07:20,280 --> 00:07:23,400
A semantic model is a machine-readable contract.

211
00:07:23,400 --> 00:07:26,120
It is an agreement about what your data represents

212
00:07:26,120 --> 00:07:27,360
and how it should be used.

213
00:07:27,360 --> 00:07:30,440
It sits between the raw data and the tools that consume it,

214
00:07:30,440 --> 00:07:32,720
like reports, dashboards, and now Copilot.

215
00:07:32,720 --> 00:07:34,080
Everything that lives in that contract

216
00:07:34,080 --> 00:07:36,480
is explicitly defined rather than inferred.

217
00:07:36,480 --> 00:07:37,880
Start with the tables.

218
00:07:37,880 --> 00:07:40,240
In a semantic model, tables don't just exist.

219
00:07:40,240 --> 00:07:41,680
They are documented.

220
00:07:41,680 --> 00:07:43,840
A table called sales has a description

221
00:07:43,840 --> 00:07:45,600
that says exactly what it contains,

222
00:07:45,600 --> 00:07:47,840
like completed transactions with revenue

223
00:07:47,840 --> 00:07:49,560
recognized on the invoice date.

224
00:07:49,560 --> 00:07:52,280
That description isn't just narrative flavor for the humans.

225
00:07:52,280 --> 00:07:54,120
It is operational because Copilot reads it,

226
00:07:54,120 --> 00:07:56,160
your analyst reads it, and the tools read it.

227
00:07:56,160 --> 00:07:57,760
The same rule applies to columns.

228
00:07:57,760 --> 00:08:00,480
A column called amount comes with a decimal type rather

229
00:08:00,480 --> 00:08:01,360
than a string.

230
00:08:01,360 --> 00:08:03,920
And it is formatted specifically as USD currency.

231
00:08:03,920 --> 00:08:05,520
It includes a description that defines

232
00:08:05,520 --> 00:08:07,680
what that amount represents in business terms,

233
00:08:07,680 --> 00:08:09,400
such as whether it is gross or net,

234
00:08:09,400 --> 00:08:10,800
or if it includes shipping costs.

235
00:08:10,800 --> 00:08:12,040
All of that is explicit.

236
00:08:12,040 --> 00:08:13,480
Then come the relationships.

237
00:08:13,480 --> 00:08:15,400
This is where semantic models diverge sharply

238
00:08:15,400 --> 00:08:16,600
from raw databases.

239
00:08:16,600 --> 00:08:18,760
In your data warehouse, you might have a junction table

240
00:08:18,760 --> 00:08:21,240
that links customers to orders through a foreign key,

241
00:08:21,240 --> 00:08:24,160
but in your semantic model, that relationship is labeled.

242
00:08:24,160 --> 00:08:25,920
It has a direction, it has a cardinality,

243
00:08:25,920 --> 00:08:27,440
and it has business context.

244
00:08:27,440 --> 00:08:29,880
The semantic model says this is a one-to-many relationship

245
00:08:29,880 --> 00:08:31,160
from customer to order.

246
00:08:31,160 --> 00:08:33,680
And because it is active, Copilot knows this is the path

247
00:08:33,680 --> 00:08:36,160
to use when correlating customer attributes with order

248
00:08:36,160 --> 00:08:36,880
facts.

249
00:08:36,880 --> 00:08:39,440
The raw database just has keys, but the semantic model

250
00:08:39,440 --> 00:08:40,240
has meaning.

251
00:08:40,240 --> 00:08:42,080
Hierarchies live here, too.

252
00:08:42,080 --> 00:08:44,040
Your date dimension probably has relationships

253
00:08:44,040 --> 00:08:46,880
where years contain quarters, quarters contain months,

254
00:08:46,880 --> 00:08:48,440
and months contain dates.

255
00:08:48,440 --> 00:08:50,520
In a database, that is just normalization,

256
00:08:50,520 --> 00:08:52,600
but in a semantic model, you explicitly

257
00:08:52,600 --> 00:08:54,800
declare it as a hierarchy so that Copilot

258
00:08:54,800 --> 00:08:57,080
and your analysts can navigate time intelligently.

259
00:08:57,080 --> 00:08:59,400
Asking to see sales by month while drilling into weeks

260
00:08:59,400 --> 00:09:02,680
for this quarter makes sense because that hierarchy is encoded.

261
00:09:02,680 --> 00:09:04,320
Now we get to the core, the measures.

262
00:09:04,320 --> 00:09:06,400
Measures are where business logic crystallizes

263
00:09:06,400 --> 00:09:08,240
into an executable form.

264
00:09:08,240 --> 00:09:10,240
A measure is a calculation expressed in DAX

265
00:09:10,240 --> 00:09:11,560
that defines a metric.

266
00:09:11,560 --> 00:09:14,280
Total sales isn't just a sum of the amount column.

267
00:09:14,280 --> 00:09:16,440
It is a sum of the amount for all rows

268
00:09:16,440 --> 00:09:18,520
where the order status is marked as completed,

269
00:09:18,520 --> 00:09:20,440
and no return has been processed.

270
00:09:20,440 --> 00:09:21,800
That logic is baked in.

271
00:09:21,800 --> 00:09:24,760
When Copilot asks for sales, it doesn't have to guess

272
00:09:24,760 --> 00:09:26,360
because the semantic model has already

273
00:09:26,360 --> 00:09:27,640
defined what that word means.
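
As a rough illustration of why the definition matters, the following Python sketch applies the rule just described (completed orders with no processed return) to a handful of made-up rows and compares it with a plain sum. In a real model this logic would live in a DAX measure; the row layout and column names below are hypothetical.

```python
from decimal import Decimal

# Hypothetical order rows; column names are illustrative only.
orders = [
    {"amount": Decimal("100.00"), "status": "completed", "returned": False},
    {"amount": Decimal("250.00"), "status": "completed", "returned": True},
    {"amount": Decimal("75.50"), "status": "pending", "returned": False},
    {"amount": Decimal("40.00"), "status": "completed", "returned": False},
]


def naive_sum(rows):
    """What a guess at 'sales' looks like: add up every amount."""
    return sum((r["amount"] for r in rows), Decimal("0"))


def total_sales(rows):
    """The governed definition: completed orders with no processed return."""
    return sum(
        (r["amount"] for r in rows if r["status"] == "completed" and not r["returned"]),
        Decimal("0"),
    )


print(naive_sum(orders), total_sales(orders))
```

Both functions are a "sum of amount", yet they disagree on the same four rows. Encoding the rule once in the measure is what removes that ambiguity.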

274
00:09:27,640 --> 00:09:29,600
This is the revelation that changes everything.

275
00:09:29,600 --> 00:09:31,880
In a raw table, the word sales is ambiguous.

276
00:09:31,880 --> 00:09:33,360
Are you summing all amounts or are you

277
00:09:33,360 --> 00:09:35,080
filtering for specific statuses?

278
00:09:35,080 --> 00:09:37,000
Are you including returns or excluding them?

279
00:09:37,000 --> 00:09:38,360
Copilot has to guess.

280
00:09:38,360 --> 00:09:41,160
In a semantic model, sales is the measure,

281
00:09:41,160 --> 00:09:42,680
which makes it unambiguous.

282
00:09:42,680 --> 00:09:45,040
So Copilot can reason over it with confidence.

283
00:09:45,040 --> 00:09:48,200
Row-level security and object-level security sit here as well.

284
00:09:48,200 --> 00:09:50,600
Row-level security means a specific user can only

285
00:09:50,600 --> 00:09:52,800
see sales for their own region or product line.

286
00:09:52,800 --> 00:09:55,240
Object-level security means a user can't even

287
00:09:55,240 --> 00:09:57,840
see that a column exists if they don't have authorization.

288
00:09:57,840 --> 00:09:59,360
These are not report-level settings.

289
00:09:59,360 --> 00:10:01,000
They are semantic model-level rules

290
00:10:01,000 --> 00:10:02,960
that apply everywhere the data is consumed,

291
00:10:02,960 --> 00:10:05,880
whether that is in a report or a Copilot query.

292
00:10:05,880 --> 00:10:08,040
This structure of tables with definitions,

293
00:10:08,040 --> 00:10:10,080
relationships with direction, and measures

294
00:10:10,080 --> 00:10:11,760
with calculation is a contract.

295
00:10:11,760 --> 00:10:14,000
It is an agreement that this is what the data actually

296
00:10:14,000 --> 00:10:14,760
means.

297
00:10:14,760 --> 00:10:17,520
And because it is machine-readable, Copilot can parse it

298
00:10:17,520 --> 00:10:19,760
and use it to generate trustworthy outputs instead

299
00:10:19,760 --> 00:10:20,920
of hallucinations.

300
00:10:20,920 --> 00:10:23,640
That is why semantic models matter for prediction.

301
00:10:23,640 --> 00:10:25,360
They are not just a way to organize things.

302
00:10:25,360 --> 00:10:27,160
They are meaning made explicit.

303
00:10:27,160 --> 00:10:29,560
The role of metadata in Copilot reliability.

304
00:10:29,560 --> 00:10:31,480
You can build the most sophisticated semantic model

305
00:10:31,480 --> 00:10:33,280
in the world, but Copilot will still

306
00:10:33,280 --> 00:10:36,240
fail if the metadata inside it is sparse or vague.

307
00:10:36,240 --> 00:10:38,280
This is the underrated lever in the entire system.

308
00:10:38,280 --> 00:10:39,800
Here is the specific mechanism.

309
00:10:39,800 --> 00:10:41,160
When you ask Copilot a question,

310
00:10:41,160 --> 00:10:43,600
it doesn't directly query your semantic model.

311
00:10:43,600 --> 00:10:45,440
It first tries to understand what you're asking for,

312
00:10:45,440 --> 00:10:47,840
which means it has to identify the right table,

313
00:10:47,840 --> 00:10:49,880
the right measure, and the right filters.

314
00:10:49,880 --> 00:10:52,200
It does this by reading the descriptions you've written,

315
00:10:52,200 --> 00:10:54,600
the column names, the measure definitions, and the synonyms.

316
00:10:54,600 --> 00:10:57,320
All of that textual context is how Copilot figures out

317
00:10:57,320 --> 00:10:58,560
what you actually mean.

318
00:10:58,560 --> 00:11:01,160
If those descriptions are weak, Copilot guesses,

319
00:11:01,160 --> 00:11:03,200
and it guesses with total confidence.

320
00:11:03,200 --> 00:11:04,720
Imagine you have a measure called revenue

321
00:11:04,720 --> 00:11:06,280
with no description at all.

322
00:11:06,280 --> 00:11:08,160
Copilot sees the name, but it doesn't know

323
00:11:08,160 --> 00:11:11,280
if you mean total revenue, net revenue, or revenue by customer.

324
00:11:11,280 --> 00:11:12,880
It doesn't know if returns are included

325
00:11:12,880 --> 00:11:15,000
or what the time period is, so it makes an assumption.

326
00:11:15,000 --> 00:11:16,760
Sometimes it's right, but often it's not.

327
00:11:16,760 --> 00:11:19,680
The user never finds out until they check the underlying numbers

328
00:11:19,680 --> 00:11:21,200
and realize Copilot was working

329
00:11:21,200 --> 00:11:23,120
with a different definition than they thought.

330
00:11:23,120 --> 00:11:26,280
Contrast that with a measure that has a real, detailed description.

331
00:11:26,280 --> 00:11:28,840
If the metadata says total invoiced sales in USD,

332
00:11:28,840 --> 00:11:30,600
excluding returns and credits processed

333
00:11:30,600 --> 00:11:33,880
within 30 days of invoice date, Copilot finally has context.

334
00:11:33,880 --> 00:11:36,480
When a user asks what last month's revenue was,

335
00:11:36,480 --> 00:11:38,960
Copilot understands exactly which measure to use

336
00:11:38,960 --> 00:11:40,160
and how to filter it.

337
00:11:40,160 --> 00:11:43,400
The same logic applies to your column descriptions.

338
00:11:43,400 --> 00:11:46,600
A column called status could mean order status, payment status,

339
00:11:46,600 --> 00:11:48,280
or even customer status.

340
00:11:48,280 --> 00:11:51,400
Without a description, Copilot has to infer the meaning.

341
00:11:51,400 --> 00:11:53,200
A description that explains that order status

342
00:11:53,200 --> 00:11:54,640
is the fulfillment state of the order

343
00:11:54,640 --> 00:11:57,040
at the time of query removes all ambiguity.

344
00:11:57,040 --> 00:11:58,960
Synonyms matter enormously here.

345
00:11:58,960 --> 00:12:00,680
Your semantic model might define a measure

346
00:12:00,680 --> 00:12:03,480
called total revenue USD, but your business team

347
00:12:03,480 --> 00:12:06,280
calls it sales or the top line depending on the context.

348
00:12:06,280 --> 00:12:08,640
If you don't register those synonyms in the metadata,

349
00:12:08,640 --> 00:12:11,240
Copilot won't map the user's language to the right measure.

350
00:12:11,240 --> 00:12:13,120
They'll ask what the sales were last quarter

351
00:12:13,120 --> 00:12:15,520
and Copilot will either say it can't find that measure

352
00:12:15,520 --> 00:12:17,840
or it will return something completely unrelated.

353
00:12:17,840 --> 00:12:20,560
Business glossaries and term definitions layer on top of this.

354
00:12:20,560 --> 00:12:23,080
A term like customer churn means something very specific

355
00:12:23,080 --> 00:12:24,360
in your organization.

356
00:12:24,360 --> 00:12:27,440
Your definition might state that a customer is considered churned

357
00:12:27,440 --> 00:12:30,080
if they have made no purchases in the past 90 days

358
00:12:30,080 --> 00:12:34,080
after having made at least one purchase in the prior 180 days.

359
00:12:34,080 --> 00:12:35,560
That definition is operational.

360
00:12:35,560 --> 00:12:38,600
Copilot uses it to understand not just what you're asking for,

361
00:12:38,600 --> 00:12:40,360
but exactly how to calculate it.
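
A definition like this is precise enough to write down as code. The Python sketch below is one possible reading of it: the "prior 180 days" is taken to mean the 180 days immediately before the 90-day window, which is an assumption, as are the function name and the sample dates.

```python
from datetime import date, timedelta


def is_churned(purchase_dates, as_of):
    """True if there is no purchase in the last 90 days but at least one
    purchase in the 180 days immediately before that window."""
    recent_start = as_of - timedelta(days=90)
    prior_start = recent_start - timedelta(days=180)
    has_recent = any(recent_start < d <= as_of for d in purchase_dates)
    has_prior = any(prior_start < d <= recent_start for d in purchase_dates)
    return has_prior and not has_recent


as_of = date(2026, 6, 1)
# One purchase in the prior window and nothing recent.
print(is_churned([date(2026, 1, 15)], as_of))
# Same customer, but with a purchase inside the last 90 days.
print(is_churned([date(2026, 1, 15), date(2026, 5, 1)], as_of))
```

Boundary choices such as whether day 90 counts as recent are exactly the details a glossary entry should pin down, because two reasonable readings give different churn counts.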

362
00:12:40,360 --> 00:12:42,240
Without this metadata richness,

363
00:12:42,240 --> 00:12:45,040
you're asking Copilot to operate in total ambiguity.

364
00:12:45,040 --> 00:12:47,040
With it, you're giving Copilot a shared language

365
00:12:47,040 --> 00:12:48,360
with your organization.

366
00:12:48,360 --> 00:12:51,160
Microsoft has recognized this and made it systematic.

367
00:12:51,160 --> 00:12:53,600
The prepare your data for AI feature in Power BI

368
00:12:53,600 --> 00:12:55,280
is now a first-class workflow.

369
00:12:55,280 --> 00:12:57,200
It's not buried in documentation tabs,

370
00:12:57,200 --> 00:12:59,040
but is instead a guided process.

371
00:12:59,040 --> 00:13:00,360
You go through your semantic model

372
00:13:00,360 --> 00:13:03,120
and you write descriptions, you define synonyms and you tag

373
00:13:03,120 --> 00:13:05,080
which fields are safe for AI to use.

374
00:13:05,080 --> 00:13:07,080
All of that becomes part of the model itself.

375
00:13:07,080 --> 00:13:08,880
This is the shift that matters most.

376
00:13:08,880 --> 00:13:11,600
Metadata used to be documentation that lived in a wiki

377
00:13:11,600 --> 00:13:13,120
somewhere separate from the data.

378
00:13:13,120 --> 00:13:15,520
Now metadata is operational infrastructure.

379
00:13:15,520 --> 00:13:17,480
It's how Copilot understands your business

380
00:13:17,480 --> 00:13:20,440
and it's how your predictive models stay aligned with your reports.

381
00:13:20,440 --> 00:13:22,360
When you're evaluating your semantic models

382
00:13:22,360 --> 00:13:25,320
for Copilot readiness, don't look first at the data volume

383
00:13:25,320 --> 00:13:26,440
or the table count.

384
00:13:26,440 --> 00:13:28,640
Look at the descriptions and the measure definitions.

385
00:13:28,640 --> 00:13:30,920
Count how many columns have no description at all.

386
00:13:30,920 --> 00:13:34,680
That gap between data and metadata is exactly where Copilot fails.
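
The readiness check described here is easy to automate once the field metadata is in hand. The Python sketch below assumes the metadata has already been exported as a list of records with `name` and `description` keys; that record shape and the sample entries are hypothetical, not a Power BI export format.

```python
def description_coverage(fields):
    """Return (share of fields with a non-empty description, names of undocumented fields)."""
    missing = [f["name"] for f in fields if not (f.get("description") or "").strip()]
    covered = len(fields) - len(missing)
    share = covered / len(fields) if fields else 1.0
    return share, missing


# Hypothetical field metadata for a small semantic model.
fields = [
    {"name": "Sales[Amount]", "description": "Net invoiced amount in USD."},
    {"name": "Sales[Status]", "description": ""},
    {"name": "Sales[Region]", "description": None},
    {"name": "Total Revenue USD", "description": "Invoiced sales excluding returns."},
]

share, missing = description_coverage(fields)
print(f"{share:.0%} documented; missing: {missing}")
```

A coverage number says nothing about whether the descriptions are any good, so it works best as a first filter before a human review.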

387
00:13:34,680 --> 00:13:36,640
Introducing semantic link as the bridge.

388
00:13:36,640 --> 00:13:38,240
So far we've established the problem.

389
00:13:38,240 --> 00:13:40,880
Copilot can't predict because it has no structured foundation

390
00:13:40,880 --> 00:13:41,840
to reason over.

391
00:13:41,840 --> 00:13:44,320
We've built the semantic model as that foundation

392
00:13:44,320 --> 00:13:47,080
which is a layer of meaning that encodes business logic

393
00:13:47,080 --> 00:13:48,400
and security rules.

394
00:13:48,400 --> 00:13:49,760
Now comes the critical question,

395
00:13:49,760 --> 00:13:52,360
how does Copilot actually access that semantic model?

396
00:13:52,360 --> 00:13:55,040
And more importantly, how does a real predictive engine

397
00:13:55,040 --> 00:13:56,720
access the same business logic

398
00:13:56,720 --> 00:13:58,960
so that predictions stay aligned with reports?

399
00:13:58,960 --> 00:14:00,440
The answer is semantic link.

400
00:14:00,440 --> 00:14:02,400
Semantic link is a feature in Microsoft Fabric

401
00:14:02,400 --> 00:14:03,880
that does one fundamental thing.

402
00:14:03,880 --> 00:14:06,400
It exposes your semantic models to code.

403
00:14:06,400 --> 00:14:09,240
Not to reports or dashboards, but to Python and PySpark.

404
00:14:09,240 --> 00:14:11,080
It's for any data science workload

405
00:14:11,080 --> 00:14:13,200
that needs to reason over your business logic

406
00:14:13,200 --> 00:14:14,720
without rewriting it from scratch.

407
00:14:14,720 --> 00:14:16,840
This is not a new database or a new reporting tool.

408
00:14:16,840 --> 00:14:18,240
It's a programmatic interface.

409
00:14:18,240 --> 00:14:20,880
It's a bridge that allows code to query your semantic model

410
00:14:20,880 --> 00:14:22,120
the same way a report does.

411
00:14:22,120 --> 00:14:24,440
It uses the same measures, the same security rules

412
00:14:24,440 --> 00:14:26,560
and the same table relationships.

413
00:14:26,560 --> 00:14:28,840
Think about what this means structurally.

414
00:14:28,840 --> 00:14:30,480
Right now in most organizations,

415
00:14:30,480 --> 00:14:33,320
the BI team builds a semantic model in Power BI.

416
00:14:33,320 --> 00:14:35,480
They define measures and set up security.

417
00:14:35,480 --> 00:14:38,600
That model powers reports and it's locked down and audited.

418
00:14:38,600 --> 00:14:41,520
Meanwhile, your data science team is off in their own world

419
00:14:41,520 --> 00:14:43,160
using Python notebooks.

420
00:14:43,160 --> 00:14:45,680
They're working with raw data or SQL queries

421
00:14:45,680 --> 00:14:47,560
and they're rebuilding metrics from scratch

422
00:14:47,560 --> 00:14:50,680
because they don't have access to the BI team's definitions.

423
00:14:50,680 --> 00:14:52,400
They define revenue one way

424
00:14:52,400 --> 00:14:54,320
and the BI team defines it another.

425
00:14:54,320 --> 00:14:56,320
Nobody knows which is right until someone compares

426
00:14:56,320 --> 00:14:58,320
the outputs and finds a discrepancy.

427
00:14:58,320 --> 00:15:00,440
Then you spend weeks reconciling definitions.

428
00:15:00,440 --> 00:15:02,160
Semantic link eliminates that split.

429
00:15:02,160 --> 00:15:04,800
It says the semantic model is the source of truth.

430
00:15:04,800 --> 00:15:07,120
All predictive code should run through it

431
00:15:07,120 --> 00:15:09,640
and all metrics should use the same definitions.

432
00:15:09,640 --> 00:15:12,400
The implementation is a Python library called SemPy.

433
00:15:12,400 --> 00:15:15,680
You import it in a notebook and point it at your Fabric workspace.

434
00:15:15,680 --> 00:15:19,120
Suddenly, your code can access your semantic model programmatically.

435
00:15:19,120 --> 00:15:21,080
You can list tables, you can list measures

436
00:15:21,080 --> 00:15:23,400
and you can see the DAX definitions behind those measures.

437
00:15:23,400 --> 00:15:26,080
You can execute DAX directly from Python and get results back

438
00:15:26,080 --> 00:15:27,200
as a data frame.

439
00:15:27,200 --> 00:15:29,480
You can read semantic model data into Spark

440
00:15:29,480 --> 00:15:32,960
for distributed processing while maintaining the full context of relationships.

441
00:15:32,960 --> 00:15:35,800
This changes everything about how you build predictive models.

442
00:15:35,800 --> 00:15:38,400
Before semantic link, a data scientist would write a query

443
00:15:38,400 --> 00:15:40,280
to pull historical revenue data.

444
00:15:40,280 --> 00:15:42,040
They'd calculate a trend and build a model.

445
00:15:42,040 --> 00:15:44,520
It would use a different definition of revenue than the reports

446
00:15:44,520 --> 00:15:47,280
so everyone would argue about which number is right.

447
00:15:47,280 --> 00:15:49,920
With semantic link, a data scientist imports SemPy.

448
00:15:49,920 --> 00:15:52,680
They call a function to list all measures in the semantic model.

449
00:15:52,680 --> 00:15:55,040
They find the total revenue measure, check its description

450
00:15:55,040 --> 00:15:56,760
and see exactly how it's calculated.

451
00:15:56,760 --> 00:16:00,640
They use that same measure definition to pull training data for their model.

452
00:16:00,640 --> 00:16:03,040
Their model inherits the semantic model's logic.

453
00:16:03,040 --> 00:16:06,440
When they deploy the model and expose its outputs as a new measure,

454
00:16:06,440 --> 00:16:10,080
that measure uses the same definitions as everything else in the organization.

455
00:16:10,080 --> 00:16:11,840
There is no discrepancy and no argument.

456
00:16:11,840 --> 00:16:13,560
It's just one source of truth.

457
00:16:13,560 --> 00:16:16,640
This is also where Copilot integration becomes possible at scale.

458
00:16:16,640 --> 00:16:18,840
Copilot doesn't need to understand how to predict.

459
00:16:18,840 --> 00:16:20,960
It needs to understand how to ask for predictions.

460
00:16:20,960 --> 00:16:24,160
It needs to know those predictions are grounded in the same semantic logic

461
00:16:24,160 --> 00:16:25,760
that powers your reports.

462
00:16:25,760 --> 00:16:28,520
When Copilot routes a user's question to a predictive model,

463
00:16:28,520 --> 00:16:31,120
that model can read the semantic model through SemPy

464
00:16:31,120 --> 00:16:35,440
to understand exactly what data to use and what security rules must be enforced.

465
00:16:35,440 --> 00:16:37,360
The bridge isn't magic, it's access.

466
00:16:37,360 --> 00:16:42,080
Semantic link is access to business logic, access to definitions and access to security.

467
00:16:42,080 --> 00:16:45,640
It's the foundation that makes predictions trustworthy instead of just plausible.

468
00:16:45,640 --> 00:16:48,480
How SemPy exposes semantic models to code.

469
00:16:48,480 --> 00:16:50,400
Let's get concrete about how this actually works.

470
00:16:50,400 --> 00:16:52,600
When you open a Fabric notebook and start working,

471
00:16:52,600 --> 00:16:55,040
SemPy changes the entire nature of your workflow.

472
00:16:55,040 --> 00:16:58,760
You start by importing the library and authenticating to your workspace

473
00:16:58,760 --> 00:17:03,920
and immediately you have full programmatic access to every semantic model in that environment.

474
00:17:03,920 --> 00:17:06,320
You can call a simple function like

475
00:17:06,320 --> 00:17:10,560
fabric.list_datasets(), for instance, to see every model available to you.

476
00:17:10,560 --> 00:17:13,600
From there, you can drill deeper by calling list_tables

477
00:17:13,600 --> 00:17:17,840
to see every table, its description and the specific data types within it.

478
00:17:17,840 --> 00:17:19,440
If you call list_measures,

479
00:17:19,440 --> 00:17:22,080
you'll see every single measure, the DAX formula behind it,

480
00:17:22,080 --> 00:17:24,040
and exactly which table it belongs to.

481
00:17:24,040 --> 00:17:27,440
This isn't just looking at data, it's introspection at the semantic level.
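
The introspection calls mentioned above can be sketched as follows. In a Fabric notebook you would pass in the real module (`import sempy.fabric as fabric`); here the module is an argument so the sketch can also be exercised elsewhere with a stand-in. The helper name `describe_model` and the dataset name in the usage comment are hypothetical, and the exact shape of what each call returns depends on the installed SemPy version.

```python
def describe_model(fabric, dataset):
    """Collect dataset, table, and measure metadata for one semantic model.

    `fabric` is expected to be the `sempy.fabric` module. It is injected
    rather than imported so this sketch does not require a Fabric runtime.
    """
    return {
        "datasets": fabric.list_datasets(),
        "tables": fabric.list_tables(dataset),
        "measures": fabric.list_measures(dataset),
    }


# Usage inside a Fabric notebook (dataset name is a placeholder):
#   import sempy.fabric as fabric
#   info = describe_model(fabric, "Sales Model")
#   info["measures"]  # measure names together with their DAX expressions
```

Reading the measure expressions before training anything is a cheap way to confirm that the notebook and the reports agree on what each metric means.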

482
00:17:27,440 --> 00:17:29,480
But here's where it really matters for your daily work.

483
00:17:29,480 --> 00:17:31,120
Imagine you're building a predictive model

484
00:17:31,120 --> 00:17:33,440
and you need historical sales data to train it.

485
00:17:33,440 --> 00:17:36,280
Before SemPy, you'd have to write a complex SQL query

486
00:17:36,280 --> 00:17:39,280
and handcraft a join between your fact and dimension tables

487
00:17:39,280 --> 00:17:41,280
while hoping you got the relationships right.

488
00:17:41,280 --> 00:17:45,600
With SemPy, you don't write the join because the semantic model has already defined it for you.

489
00:17:45,600 --> 00:17:49,560
You simply call a function that reads the data directly into a DataFrame

490
00:17:49,560 --> 00:17:52,520
and the relationships, column names and filters come with it.

491
00:17:52,520 --> 00:17:56,280
You aren't reconstructing business logic from scratch, you're inheriting it.

492
00:17:56,280 --> 00:17:58,520
And there's more to it than just reading tables.

493
00:17:58,520 --> 00:18:01,240
You can now execute DAX directly from your Python notebook.

494
00:18:01,240 --> 00:18:04,280
This is a massive shift because DAX is where your business logic lives,

495
00:18:04,280 --> 00:18:06,720
including your time intelligence and measure definitions.

496
00:18:06,720 --> 00:18:09,080
Normally, that logic only runs inside Power BI,

497
00:18:09,080 --> 00:18:11,840
but SemPy lets you evaluate those expressions from your notebook.

498
00:18:11,840 --> 00:18:15,320
You can write a DAX expression that sums revenue for the last 12 months

499
00:18:15,320 --> 00:18:19,320
while automatically handling your fiscal calendar and specific filtering rules.

500
00:18:19,320 --> 00:18:22,440
You execute it from Python and get the results back as a data frame.

501
00:18:22,440 --> 00:18:27,160
Your predictive model is now using the exact same calculation that your executive reports use.
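As a rough illustration of running DAX from a notebook, the sketch below builds a trailing-12-months query and hands it to SemPy's evaluate_dax. The measure name "Total Revenue", the date column 'Date'[Date], the model name "Sales Model" and both helper functions are assumptions made for the example, not names from the episode.

```python
def trailing_months_query(measure, date_column, months=12):
    """Build a DAX query that evaluates `measure` over the trailing N months."""
    return (
        "EVALUATE ROW(\n"
        f'    "Value", CALCULATE([{measure}],\n'
        f"        DATESINPERIOD({date_column}, MAX({date_column}), -{months}, MONTH))\n"
        ")"
    )


def run_query(fabric, dataset, query):
    """Send the query to the semantic model and return whatever SemPy returns."""
    return fabric.evaluate_dax(dataset, query)


if __name__ == "__main__":
    query = trailing_months_query("Total Revenue", "'Date'[Date]")
    print(query)
    try:
        import sempy.fabric as fabric  # available inside Fabric notebooks
    except ImportError:
        fabric = None  # running elsewhere: skip the live call
    if fabric is not None:
        print(run_query(fabric, "Sales Model", query))  # hypothetical model name
```

Because the measure is referenced by name, whatever filtering rules it carries in the semantic model apply to the result; the Python side never restates them.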

502
00:18:27,160 --> 00:18:30,040
Think about the errors this prevents in a typical organization.

503
00:18:30,040 --> 00:18:34,040
In the old model, a data scientist would try to replicate that last 12 months logic

504
00:18:34,040 --> 00:18:36,200
using pandas code and date filters.

505
00:18:36,200 --> 00:18:39,000
They'd probably get it slightly wrong because the company's fiscal year

506
00:18:39,000 --> 00:18:41,960
doesn't align with the calendar year and nobody thought to tell them.

507
00:18:41,960 --> 00:18:45,640
Their training data would be off, their model would be calibrated to the wrong numbers

508
00:18:45,640 --> 00:18:48,280
and your reports would eventually show one number

509
00:18:48,280 --> 00:18:50,360
while the model suggested something else.

510
00:18:50,360 --> 00:18:55,160
That leads to confusion, misalignment, and endless arguments about which number is actually right.

511
00:18:55,160 --> 00:18:59,720
With SemPy, that conflict disappears because there is only one calculation and one answer.

512
00:18:59,720 --> 00:19:02,280
The DAX runs the same way whether it's powering a dashboard

513
00:19:02,280 --> 00:19:03,960
or feeding a machine learning model.

514
00:19:03,960 --> 00:19:07,480
You can also evaluate measures with very specific filters and groupings.

515
00:19:07,480 --> 00:19:10,520
You might ask for total revenue for each product category,

516
00:19:10,520 --> 00:19:12,840
filtered to the West region for the last quarter,

517
00:19:12,840 --> 00:19:16,200
and SemPy will construct that query and return a data frame.

518
00:19:16,200 --> 00:19:19,320
The semantic model applies your row-level security automatically,

519
00:19:19,320 --> 00:19:23,160
so if a user only has access to certain regions, those rules are enforced here too.

520
00:19:23,160 --> 00:19:25,720
The aggregations follow the official measure definition,

521
00:19:25,720 --> 00:19:27,800
not a guess at what the definition should be.
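The filtered, grouped request described here might look roughly like this with SemPy's evaluate_measure. The model, measure and column names ("Sales Model", "Total Revenue", Product[Category], Geography[Region], Date[Fiscal Quarter]) and the quarter label are hypothetical placeholders.

```python
def revenue_by_category(fabric, dataset="Sales Model", quarter="FY2026 Q3"):
    """Evaluate one measure grouped by category for the West region.

    `fabric` is the `sempy.fabric` module or a stand-in with the same
    `evaluate_measure` function. All names below are illustrative.
    """
    return fabric.evaluate_measure(
        dataset,
        measure="Total Revenue",
        groupby_columns=["Product[Category]"],
        filters={
            "Geography[Region]": ["West"],
            "Date[Fiscal Quarter]": [quarter],
        },
    )


if __name__ == "__main__":
    try:
        import sempy.fabric as fabric  # available inside Fabric notebooks
    except ImportError:
        fabric = None  # running elsewhere: skip the live call
    if fabric is not None:
        print(revenue_by_category(fabric))
```

The caller only names the measure; the aggregation itself stays in the semantic model.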

522
00:19:27,800 --> 00:19:31,080
This creates a unified foundation for your entire data strategy.

523
00:19:31,080 --> 00:19:33,880
Your predictive models aren't just compatible with your semantic model.

524
00:19:33,880 --> 00:19:36,440
They are built directly on top of it, they use the same measures,

525
00:19:36,440 --> 00:19:38,120
they respect the same security rules,

526
00:19:38,120 --> 00:19:40,680
and they inherit every definition you've already built.

527
00:19:40,680 --> 00:19:43,240
When you deploy a model and expose its output as a new measure,

528
00:19:43,240 --> 00:19:46,440
that new data is automatically aligned with everything else in the system.

529
00:19:46,440 --> 00:19:49,720
This enables a fundamental structural shift in how teams operate.

530
00:19:49,720 --> 00:19:52,520
Your data science team stops asking for manual data exports,

531
00:19:52,520 --> 00:19:54,040
they stop writing custom SQL,

532
00:19:54,040 --> 00:19:57,080
and they stop maintaining their own parallel definitions of revenue.

533
00:19:57,080 --> 00:19:59,720
They work through the semantic model,

534
00:19:59,720 --> 00:20:02,280
which means everything they build is rooted in the same logic

535
00:20:02,280 --> 00:20:03,960
that powers the rest of the business.

536
00:20:03,960 --> 00:20:08,040
Discrepancies vanish because there is only one source of truth for every metric.

537
00:20:08,040 --> 00:20:11,320
Your data science and BI teams are no longer working in separate universes.

538
00:20:11,320 --> 00:20:13,000
They are finally working in the same one.

539
00:20:13,000 --> 00:20:16,200
That alignment is the only way to get predictions people actually trust.

540
00:20:17,160 --> 00:20:20,360
The architecture, where prediction actually happens.

541
00:20:20,360 --> 00:20:24,200
To set this up correctly, you have to understand what's actually happening under the hood.

542
00:20:24,200 --> 00:20:27,240
This is fundamentally different from how most people think Copilot works.

543
00:20:27,240 --> 00:20:29,480
The reality is that Copilot does not predict.

544
00:20:29,480 --> 00:20:32,680
I'll say that again because it contradicts almost all the marketing you've seen.

545
00:20:32,680 --> 00:20:35,160
Copilot does not forecast, it does not calculate,

546
00:20:35,160 --> 00:20:37,080
and it does not run statistical models.

547
00:20:37,080 --> 00:20:39,160
What Copilot actually does is orchestrate;

548
00:20:39,160 --> 00:20:41,400
the actual prediction happens somewhere else entirely.

549
00:20:41,400 --> 00:20:43,640
It happens in a Python notebook, a Spark job,

550
00:20:43,640 --> 00:20:45,800
or an R script running inside Fabric.

551
00:20:45,800 --> 00:20:49,800
That code uses semantic link to pull data and measures from your semantic model.

552
00:20:49,800 --> 00:20:53,800
And then it runs a real statistical model like ARIMA or gradient boosting.

553
00:20:53,800 --> 00:20:57,560
That code produces the actual number, the forecast, or the churn score.

554
00:20:57,560 --> 00:20:59,800
Copilot's job is to handle the interface.

555
00:20:59,800 --> 00:21:01,640
It takes the user's natural language request,

556
00:21:01,640 --> 00:21:04,280
like, "What will our customer churn be next month?"

557
00:21:04,280 --> 00:21:05,560
and parses the intent.

558
00:21:05,560 --> 00:21:10,280
It understands what churn means and identifies the time horizon the user is asking about.

559
00:21:10,280 --> 00:21:13,480
Then Copilot makes a decision about which model should answer the question.

560
00:21:13,480 --> 00:21:17,240
It checks if a churn model exists and verifies if the user has permission to see it.

561
00:21:17,240 --> 00:21:20,840
If everything checks out, Copilot routes the question to that specific model.

562
00:21:20,840 --> 00:21:22,760
Once the model runs and returns a result,

563
00:21:22,760 --> 00:21:24,520
Copilot does what it's actually good at.

564
00:21:24,520 --> 00:21:26,120
It contextualizes the data.

565
00:21:26,120 --> 00:21:28,520
If the model says churn will be 8.3%,

566
00:21:28,520 --> 00:21:31,240
Copilot explains that based on 12 months of history,

567
00:21:31,240 --> 00:21:33,960
we expect that specific percentage of customers to leave.

568
00:21:33,960 --> 00:21:37,320
It can even explain that the prediction is accurate within a certain range

569
00:21:37,320 --> 00:21:40,200
and point out that a seasonal uptick is the primary driver.

570
00:21:40,200 --> 00:21:42,120
The explanation comes from the language model,

571
00:21:42,120 --> 00:21:44,040
but the number comes from the predictive model.

572
00:21:44,040 --> 00:21:45,720
The data comes from the semantic layer.

573
00:21:45,720 --> 00:21:48,360
Each part of the system is doing exactly what it was built for.
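The division of labor described here can be sketched in plain Python. This is only an illustration of the routing idea, not how Copilot is implemented: the registry, the role names, the stand-in churn model and its numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class RegisteredModel:
    name: str
    allowed_roles: Set[str]
    predict: Callable[[str], dict]  # takes a time horizon, returns a result dict


def churn_model(horizon):
    # Stand-in for a real statistical model; the numbers are made up.
    return {"value": 8.3, "margin": 1.2, "history_months": 12}


REGISTRY: Dict[str, RegisteredModel] = {
    "churn": RegisteredModel("churn", {"sales_manager"}, churn_model),
}


def answer(intent, horizon, role):
    """Route a parsed intent to a registered model and explain its output."""
    model = REGISTRY.get(intent)
    if model is None:
        return "No predictive model is registered for that question."
    if role not in model.allowed_roles:
        return "You do not have permission to view this prediction."
    result = model.predict(horizon)  # the number comes from the model
    return (                          # the wording comes from the interface
        f"Based on {result['history_months']} months of history, expected "
        f"{intent} for {horizon} is {result['value']}% "
        f"(plus or minus {result['margin']} points)."
    )


if __name__ == "__main__":
    print(answer("churn", "next month", "sales_manager"))
```

The orchestrator never computes the 8.3 itself; it only decides who may ask, which model answers, and how the result is phrased.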

574
00:21:48,360 --> 00:21:50,760
This is the big architectural shift you need to understand.

575
00:21:50,760 --> 00:21:52,120
Copilot isn't the calculator.

576
00:21:52,120 --> 00:21:53,160
It's the translator.

577
00:21:53,160 --> 00:21:57,320
It turns user intent into a structured query for a predictive system

578
00:21:57,320 --> 00:22:01,560
and then it turns that system's numerical output back into a human-readable explanation.

579
00:22:01,560 --> 00:22:05,160
This matters because it allows the system to be honest about uncertainty.

580
00:22:05,160 --> 00:22:08,680
If the predictive model says the confidence interval is plus or minus 1.2%,

581
00:22:08,680 --> 00:22:10,440
Copilot communicates that clearly.

582
00:22:10,440 --> 00:22:14,440
If the model was only trained on six months of data for a specific segment,

583
00:22:14,440 --> 00:22:17,400
Copilot can flag that the forecast should be taken with caution.

584
00:22:17,400 --> 00:22:19,240
It can even mention known biases,

585
00:22:19,240 --> 00:22:21,240
like a tendency to underestimate outliers.

586
00:22:21,240 --> 00:22:24,520
The system becomes transparent about what it knows and what it doesn't.

587
00:22:24,520 --> 00:22:26,440
Compare that to the current state of AI,

588
00:22:26,440 --> 00:22:29,240
where a basic Copilot tries to predict things directly.

589
00:22:29,240 --> 00:22:31,160
It has no sense of uncertainty or caveats.

590
00:22:31,160 --> 00:22:34,200
It just generates a number that sounds plausible and moves on,

591
00:22:34,200 --> 00:22:36,600
even if that number is a total hallucination.

592
00:22:36,600 --> 00:22:39,160
This architecture also means your capabilities scale

593
00:22:39,160 --> 00:22:41,560
without needing Copilot to get smarter.

594
00:22:41,560 --> 00:22:45,720
As you improve your predictive models by adding more data or refining your statistics,

595
00:22:45,720 --> 00:22:49,400
Copilot automatically gets better because it's routing to superior models.

596
00:22:49,400 --> 00:22:51,160
Copilot itself hasn't changed,

597
00:22:51,160 --> 00:22:53,160
but it now has better systems to coordinate.

598
00:22:53,160 --> 00:22:55,800
Governance also becomes universal across the platform.

599
00:22:55,800 --> 00:22:58,120
Because your predictive model runs through semantic link,

600
00:22:58,120 --> 00:23:01,000
it inherits the row-level security of the semantic model.

601
00:23:01,000 --> 00:23:03,960
A manager in the West region will only see predictions for the West

602
00:23:03,960 --> 00:23:07,800
and a user without access to a specific product line won't see churn data for it.

603
00:23:07,800 --> 00:23:12,520
You don't have to tell Copilot any of these rules because the semantic layer handles them automatically.

604
00:23:12,520 --> 00:23:15,560
This is the only architecture that makes AI-driven prediction trustworthy.

605
00:23:15,560 --> 00:23:17,560
It isn't about Copilot doing the work.

606
00:23:17,560 --> 00:23:20,760
It's about Copilot coordinating the work while specialized systems

607
00:23:20,760 --> 00:23:24,120
and a strong semantic layer ensure everything stays accurate and secure.

608
00:23:24,120 --> 00:23:27,160
Building predictive models on the semantic layer.

609
00:23:27,160 --> 00:23:29,560
Let's look at how this actually works when you're on the ground.

610
00:23:29,560 --> 00:23:31,240
You have your semantic model ready,

611
00:23:31,240 --> 00:23:33,800
your measures are defined, you understand the architecture,

612
00:23:33,800 --> 00:23:35,400
but now you face the real challenge.

613
00:23:35,400 --> 00:23:39,800
How do you build a predictive model that lives inside this structure instead of sitting off to the side?

614
00:23:39,800 --> 00:23:41,080
It starts with your measures.

615
00:23:41,080 --> 00:23:42,600
You aren't looking at raw tables here,

616
00:23:42,600 --> 00:23:46,040
you're looking at the calculated fields that already hold your business logic.

617
00:23:46,040 --> 00:23:48,280
Total revenue, active customers,

618
00:23:48,280 --> 00:23:49,800
customer lifetime value.

619
00:23:49,800 --> 00:23:52,840
These metrics represent how your organization actually functions

620
00:23:52,840 --> 00:23:55,000
and because they live in the semantic model,

621
00:23:55,000 --> 00:23:57,320
they are already consistent and documented.

622
00:23:57,320 --> 00:23:59,320
This is the foundation of your predictive work.

623
00:23:59,320 --> 00:24:00,600
When you open a Fabric notebook,

624
00:24:00,600 --> 00:24:03,240
you import SemPy and authenticate to your workspace.

625
00:24:03,240 --> 00:24:06,120
From there you pull the specific measures you need for your training data.

626
00:24:06,120 --> 00:24:11,480
You aren't stuck writing complex SQL or manually joining tables while praying the cardinality is right.

627
00:24:11,480 --> 00:24:15,160
Instead, you call SemPy functions that evaluate those measures across your history.

628
00:24:15,160 --> 00:24:18,040
It automatically respects your relationships and hierarchies

629
00:24:18,040 --> 00:24:21,240
using the exact same logic that powers your executive reports.

630
00:24:21,240 --> 00:24:23,320
What you get back is a clean data frame,

631
00:24:23,320 --> 00:24:26,360
structured exactly the way your business defines its success.

632
00:24:26,360 --> 00:24:28,760
Once you have that grounded data, you start building.

633
00:24:28,760 --> 00:24:31,720
Maybe you need a time series forecast for revenue using ARIMA

634
00:24:31,720 --> 00:24:35,400
or perhaps a classification model to predict customer churn with gradient boosting.

635
00:24:35,400 --> 00:24:39,640
You might even be clustering your customer base to see who is most likely to buy a new product.

636
00:24:39,640 --> 00:24:44,280
The specific math doesn't change the fact that your training data came from the semantic layer.

637
00:24:44,280 --> 00:24:47,320
Your model is learning from data that was already cleaned and aligned

638
00:24:47,320 --> 00:24:49,160
with how the company sees itself.

639
00:24:49,160 --> 00:24:51,160
This inheritance is the part you can't skip.

640
00:24:51,160 --> 00:24:53,160
When you train on data pulled through SemPy,

641
00:24:53,160 --> 00:24:57,080
your model isn't just compatible with your system, it is grounded in it.

642
00:24:57,080 --> 00:25:00,520
The version of revenue your model learned from is the exact same version

643
00:25:00,520 --> 00:25:02,200
your dashboards show every morning.

644
00:25:02,200 --> 00:25:04,520
There is no definition drift and no awkward translation layer

645
00:25:04,520 --> 00:25:06,760
where definitions start to diverge over time.

646
00:25:06,760 --> 00:25:08,520
Security rules follow the data too,

647
00:25:08,520 --> 00:25:10,760
even if you don't see them working in the background.

648
00:25:10,760 --> 00:25:12,600
When you pull history through SemPy,

649
00:25:12,600 --> 00:25:15,880
any row-level security rules in the semantic model apply instantly.

650
00:25:15,880 --> 00:25:18,280
If your organization segments data by region

651
00:25:18,280 --> 00:25:20,840
and your notebook user only has access to the West,

652
00:25:20,840 --> 00:25:22,680
SemPy only returns West region data.

653
00:25:22,680 --> 00:25:25,960
Your model trains only on what that specific user is allowed to see.

654
00:25:25,960 --> 00:25:27,400
When the model is deployed,

655
00:25:27,400 --> 00:25:31,480
users only see predictions for the areas they are authorized to access

656
00:25:31,480 --> 00:25:34,040
all without you writing a single line of security code.
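To make the effect concrete, here is a toy simulation of a region-based row filter in plain Python. In a real deployment the semantic model enforces this, not your notebook code; the rows, user names and role mapping below are invented for illustration.

```python
ROWS = [
    {"region": "West", "customer": "C1", "revenue": 120.0},
    {"region": "East", "customer": "C2", "revenue": 200.0},
    {"region": "West", "customer": "C3", "revenue": 80.0},
]

# Hypothetical mapping of users to the regions their role may see.
USER_REGIONS = {
    "west_manager": {"West"},
    "analyst": {"West", "East"},
}


def visible_rows(user, rows=ROWS):
    """Return only the rows whose region the user is allowed to see."""
    allowed = USER_REGIONS.get(user, set())  # unknown users see nothing
    return [row for row in rows if row["region"] in allowed]


def total_revenue(user):
    """Aggregate over the user's authorized slice only."""
    return sum(row["revenue"] for row in visible_rows(user))


if __name__ == "__main__":
    print(total_revenue("west_manager"))
    print(total_revenue("analyst"))
```

A model trained by the West manager would see only the two West rows, which is the behavior the episode describes.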

657
00:25:34,040 --> 00:25:35,640
After the model is validated,

658
00:25:35,640 --> 00:25:37,480
you don't just leave it sitting in a notebook.

659
00:25:37,480 --> 00:25:39,800
You deploy it by creating a new table in your Fabric

660
00:25:39,800 --> 00:25:42,280
lakehouse or warehouse to hold those predictions.

661
00:25:42,280 --> 00:25:44,280
You might call it churn predictions,

662
00:25:44,280 --> 00:25:49,480
where rows represent customers and columns show the probability of them leaving next month.

663
00:25:49,480 --> 00:25:52,520
You've just turned a predictive insight into a permanent data asset.

664
00:25:52,520 --> 00:25:54,760
The final step brings everything full circle.

665
00:25:54,760 --> 00:25:57,560
You expose those new predictions back through the semantic layer

666
00:25:57,560 --> 00:26:00,440
by adding a measure like predicted churn rate.

667
00:26:00,440 --> 00:26:02,920
The DAX behind this measure aggregates your prediction table

668
00:26:02,920 --> 00:26:04,680
and calculates the metric for the whole company.
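As a sketch of what such a measure computes, the Python below averages per-customer churn probabilities from a small prediction table, for the whole company or for one region. Treating the predicted churn rate as the mean probability is an assumption made here; the table contents and column names are invented.

```python
from statistics import mean

# Hypothetical contents of a churn predictions table: one row per customer.
churn_predictions = [
    {"customer_id": "C1", "region": "West", "churn_probability": 0.10},
    {"customer_id": "C2", "region": "West", "churn_probability": 0.30},
    {"customer_id": "C3", "region": "East", "churn_probability": 0.05},
]


def predicted_churn_rate(rows, region=None):
    """Average churn probability, optionally restricted to one region."""
    selected = [
        row["churn_probability"]
        for row in rows
        if region is None or row["region"] == region
    ]
    return mean(selected) if selected else None  # None when nothing matches


if __name__ == "__main__":
    print(predicted_churn_rate(churn_predictions))          # whole company
    print(predicted_churn_rate(churn_predictions, "West"))  # one region
```

In the semantic model the regional slice would come from filter context rather than a function argument, but the aggregation is the same idea.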

669
00:26:04,680 --> 00:26:07,480
Now, Copilot can see that measure just like any other.

670
00:26:07,480 --> 00:26:10,120
When a manager asks about predicted churn by region,

671
00:26:10,120 --> 00:26:11,960
Copilot routes the request to this measure,

672
00:26:11,960 --> 00:26:14,440
the semantic model handles the regional filtering

673
00:26:14,440 --> 00:26:17,240
and the result comes back secured and consistent.

674
00:26:17,240 --> 00:26:19,720
This is the complete cycle of a modern data system.

675
00:26:19,720 --> 00:26:21,240
You start with established measures,

676
00:26:21,240 --> 00:26:23,320
build a predictive system grounded in that logic

677
00:26:23,320 --> 00:26:25,240
and store the results as fresh data.

678
00:26:25,240 --> 00:26:28,040
By exposing those results back through the semantic layer,

679
00:26:28,040 --> 00:26:30,840
your predictive capability is no longer a separate silo.

680
00:26:30,840 --> 00:26:33,800
It uses the same definitions, respects the same security,

681
00:26:33,800 --> 00:26:36,440
and follows the same governance as your standard BI.

682
00:26:36,440 --> 00:26:37,720
Because of this integration,

683
00:26:37,720 --> 00:26:40,040
Copilot can finally route difficult questions

684
00:26:40,040 --> 00:26:42,200
to models that actually understand what they are predicting.

685
00:26:42,200 --> 00:26:45,240
DAX as a language for prediction logic.

686
00:26:45,240 --> 00:26:47,640
Most people treat DAX like a simple reporting tool.

687
00:26:47,640 --> 00:26:50,040
They think you write a formula to show a number in Power BI

688
00:26:50,040 --> 00:26:51,320
and that's the end of the story.

689
00:26:51,320 --> 00:26:53,400
But that's a misunderstanding of what DAX actually is.

690
00:26:53,400 --> 00:26:55,720
DAX is a language for encoding business logic

691
00:26:55,720 --> 00:26:58,920
and it isn't tied to a single report or a specific visualization.

692
00:26:58,920 --> 00:26:59,880
When you write DAX,

693
00:26:59,880 --> 00:27:03,080
you are documenting the actual rules of how your business functions.

694
00:27:03,080 --> 00:27:04,440
This is a massive deal for prediction

695
00:27:04,440 --> 00:27:07,000
because DAX makes your logic executable,

696
00:27:07,000 --> 00:27:07,960
machine-readable,

697
00:27:07,960 --> 00:27:10,120
and consistent across every system you own.

698
00:27:10,120 --> 00:27:11,880
Take a look at a basic example like revenue.

699
00:27:11,880 --> 00:27:14,520
For most companies, revenue isn't just a simple sum

700
00:27:14,520 --> 00:27:16,520
of every transaction in the database.

701
00:27:16,520 --> 00:27:18,520
It might only include invoice transactions

702
00:27:18,520 --> 00:27:19,800
from completed orders,

703
00:27:19,800 --> 00:27:22,680
excluding any returns processed within 30 days

704
00:27:22,680 --> 00:27:25,240
and calculated in USD based on the invoice date.

705
00:27:25,240 --> 00:27:26,920
In a traditional SQL environment,

706
00:27:26,920 --> 00:27:29,160
you'd write a long WHERE clause with multiple filters

707
00:27:29,160 --> 00:27:31,000
for order status and transaction types.

708
00:27:31,000 --> 00:27:32,440
That is just query syntax.

709
00:27:32,440 --> 00:27:33,960
In DAX, you are writing logic.

710
00:27:33,960 --> 00:27:36,120
You use the CALCULATE function to specify a measure

711
00:27:36,120 --> 00:27:38,120
and then layer on the business context.

712
00:27:38,120 --> 00:27:40,680
You are telling the system to calculate a specific value

713
00:27:40,680 --> 00:27:43,960
but only for these rows and only for these specific periods.

714
00:27:43,960 --> 00:27:45,560
The syntax looks like this.

715
00:27:45,560 --> 00:27:48,280
CALCULATE([Amount], Orders[OrderStatus] = "Complete",

716
00:27:48,280 --> 00:27:51,160
ISBLANK(Orders[ReturnDate])).

717
00:27:51,160 --> 00:27:52,520
That isn't just code.

718
00:27:52,520 --> 00:27:54,680
It is a business rule translated into a language

719
00:27:54,680 --> 00:27:56,120
the computer understands.
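For readers more at home in Python, the same business rule (sum the amount for completed orders that have no return date) can be written as a plain filter. The sample rows and field names are made up; this only mirrors the logic and is not how the semantic model evaluates it.

```python
# Hypothetical order rows; a missing return date is represented as None.
orders = [
    {"amount": 100.0, "order_status": "Complete", "return_date": None},
    {"amount": 50.0, "order_status": "Complete", "return_date": "2026-01-15"},
    {"amount": 70.0, "order_status": "Pending", "return_date": None},
]


def revenue(rows):
    """Sum amounts for completed orders that were never returned."""
    return sum(
        row["amount"]
        for row in rows
        if row["order_status"] == "Complete" and row["return_date"] is None
    )


if __name__ == "__main__":
    print(revenue(orders))
```

The difference is where the rule lives: in Python it sits in one script, while as a measure it is shared by every consumer of the model.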

720
00:27:56,120 --> 00:27:57,800
The real power is that this formula

721
00:27:57,800 --> 00:27:59,960
doesn't just hide inside a single report.

722
00:27:59,960 --> 00:28:02,040
It becomes a permanent part of your semantic model

723
00:28:02,040 --> 00:28:03,400
as a documented measure.

724
00:28:03,400 --> 00:28:05,160
Every system that needs that data,

725
00:28:05,160 --> 00:28:07,240
whether it's a dashboard, a Copilot query,

726
00:28:07,240 --> 00:28:09,160
or a predictive model using semantic link,

727
00:28:09,160 --> 00:28:10,760
accesses the exact same logic.

728
00:28:10,760 --> 00:28:13,400
You can see this power most clearly with time intelligence.

729
00:28:13,400 --> 00:28:15,480
Your business likely operates on a fiscal year

730
00:28:15,480 --> 00:28:17,320
that doesn't align with the standard calendar,

731
00:28:17,320 --> 00:28:19,720
perhaps starting in July and ending in June.

732
00:28:19,720 --> 00:28:21,400
A standard data warehouse just sees dates

733
00:28:21,400 --> 00:28:23,080
and SQL doesn't inherently understand

734
00:28:23,080 --> 00:28:24,440
your specific fiscal calendar.

735
00:28:24,440 --> 00:28:25,240
DAX does.

736
00:28:25,240 --> 00:28:27,560
You can use a function like SAMEPERIODLASTYEAR

737
00:28:27,560 --> 00:28:30,760
and it shifts the context back by one fiscal year automatically.

738
00:28:30,760 --> 00:28:33,400
SQL can't do that without a lot of custom, manual logic,

739
00:28:33,400 --> 00:28:35,400
but DAX handles it natively.
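To show the kind of manual logic that has to be rebuilt outside the semantic model, here is a small Python sketch for a fiscal year that starts in July. Labeling a fiscal year by the calendar year in which it ends is an assumption; companies differ on that convention, and both helper names are hypothetical.

```python
from datetime import date


def fiscal_year(d, start_month=7):
    """Return the fiscal year label for a date.

    Assumes the fiscal year is named after the calendar year in which it
    ends, so with a July start, 1 July 2025 belongs to fiscal year 2026.
    """
    if start_month == 1:
        return d.year  # fiscal year equals calendar year
    return d.year + 1 if d.month >= start_month else d.year


def same_period_last_year(d):
    """Shift a date back one year, mapping 29 February to 28 February."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:  # 29 February has no counterpart in a non-leap year
        return d.replace(year=d.year - 1, day=28)


if __name__ == "__main__":
    print(fiscal_year(date(2025, 6, 30)), fiscal_year(date(2025, 7, 1)))
    print(same_period_last_year(date(2024, 2, 29)))
```

Every team that rewrites helpers like these has a chance to pick a different convention, which is the risk a shared definition removes.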

740
00:28:35,400 --> 00:28:38,600
When a user asks Copilot how performance compares to last year,

741
00:28:38,600 --> 00:28:39,800
Copilot doesn't have to guess

742
00:28:39,800 --> 00:28:42,200
what last year means for your specific company.

743
00:28:42,200 --> 00:28:44,200
It simply reads the DAX definition

744
00:28:44,200 --> 00:28:45,880
of your year-over-year measures.

745
00:28:45,880 --> 00:28:48,120
Since the DAX already holds the fiscal year logic

746
00:28:48,120 --> 00:28:51,400
in its metadata, Copilot knows exactly which dates to compare.

747
00:28:51,400 --> 00:28:54,120
There is no guessing and no room for reinterpretation.

748
00:28:54,120 --> 00:28:57,640
The same logic applies to every filter and calculation you create.

749
00:28:57,640 --> 00:28:59,080
These aren't just bits of syntax,

750
00:28:59,080 --> 00:29:01,160
they are structural facts about your business.

751
00:29:01,160 --> 00:29:03,720
You might have rules that exclude certain transaction types

752
00:29:03,720 --> 00:29:05,960
from revenue or define an active customer

753
00:29:05,960 --> 00:29:08,680
as someone who bought something in the last 12 months.

754
00:29:08,680 --> 00:29:10,440
DAX is the only place where these rules live

755
00:29:10,440 --> 00:29:11,960
in a way that the whole system can use.

756
00:29:11,960 --> 00:29:14,040
And this has a huge impact on your architecture.

757
00:29:14,040 --> 00:29:16,360
When you pull training data for a predictive model

758
00:29:16,360 --> 00:29:18,200
through semantic link, you are pulling data

759
00:29:18,200 --> 00:29:20,120
that has already been filtered by these DAX rules.

760
00:29:20,120 --> 00:29:23,320
Your model isn't training on messy raw transactions.

761
00:29:23,320 --> 00:29:25,720
It's training on business-defined aggregates.

762
00:29:25,720 --> 00:29:27,880
Revenue means what the DAX says it means,

763
00:29:27,880 --> 00:29:31,480
and customer counts follow the company's official definition of active.

764
00:29:31,480 --> 00:29:34,520
This creates a level of consistency that was previously impossible.

765
00:29:34,520 --> 00:29:36,600
Your predictive model can't accidentally use

766
00:29:36,600 --> 00:29:40,280
a different definition of profit than your CFO uses in the quarterly meeting.

767
00:29:40,280 --> 00:29:42,200
The definition is locked into the DAX,

768
00:29:42,200 --> 00:29:43,640
making it the same everywhere.

769
00:29:43,640 --> 00:29:46,280
This is also why your metadata tagging is so important.

770
00:29:46,280 --> 00:29:48,200
The descriptions you write for each measure,

771
00:29:48,200 --> 00:29:50,440
the assumptions and the filters you've baked in,

772
00:29:50,440 --> 00:29:53,080
become the context Copilot uses to answer questions.

773
00:29:53,080 --> 00:29:56,520
A measure with a clear description is transparent and useful

774
00:29:56,520 --> 00:29:59,400
while a measure with no description is just a black box.

775
00:29:59,400 --> 00:30:00,840
Copilot can work with the first one,

776
00:30:00,840 --> 00:30:02,760
but it will be forced to guess on the second.
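A quick way to find the black boxes is to audit measure metadata for missing descriptions. The sketch below works on a plain list of dictionaries with invented measure names; in practice the metadata would come from the semantic model, and the field names here are assumptions.

```python
# Hypothetical measure metadata, as it might be exported from a model.
measures = [
    {"name": "Total Revenue", "description": "Invoiced revenue from completed orders, net of returns."},
    {"name": "Active Customers", "description": ""},
    {"name": "Predicted Churn Rate", "description": None},
]


def undocumented(measure_list):
    """Return the names of measures with a missing or blank description."""
    return [
        m["name"]
        for m in measure_list
        if not (m.get("description") or "").strip()
    ]


if __name__ == "__main__":
    for name in undocumented(measures):
        print(f"Needs a description: {name}")
```

Running a check like this before exposing a model to Copilot gives you a concrete list of measures to document first.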

777
00:30:02,760 --> 00:30:07,160
At the end of the day, DAX is where your business rules become executable code.

778
00:30:07,160 --> 00:30:09,160
It is the bridge that transforms raw,

779
00:30:09,160 --> 00:30:11,480
meaningless data into actual business value.

780
00:30:12,520 --> 00:30:15,400
The security perimeter: RLS and Copilot.

781
00:30:15,400 --> 00:30:17,880
Most organizations haven't thought through this next problem yet,

782
00:30:17,880 --> 00:30:22,600
but it's going to force a total rethink of how security works in your semantic models.

783
00:30:22,600 --> 00:30:25,000
Row-level security is no longer just a reporting control.

784
00:30:25,000 --> 00:30:26,520
It is now a Copilot control.

785
00:30:26,520 --> 00:30:29,720
That distinction changes everything about how you decide who sees what.

786
00:30:29,720 --> 00:30:31,560
In the old world, RLS was simple.

787
00:30:31,560 --> 00:30:34,840
You defined roles in your model and gave each one a filter rule.

788
00:30:34,840 --> 00:30:37,880
Sales managers in the West only saw West region transactions,

789
00:30:37,880 --> 00:30:40,680
while accountants in Europe only saw European cost data.

790
00:30:40,680 --> 00:30:42,520
These rules lived behind the scenes.

791
00:30:42,520 --> 00:30:45,720
When a user opened a report, RLS silently filtered the data

792
00:30:45,720 --> 00:30:47,880
to show only their authorized slice.

793
00:30:47,880 --> 00:30:50,120
It was clean, invisible and predictable,

794
00:30:50,120 --> 00:30:52,600
but now Copilot can reason over that data.

795
00:30:52,600 --> 00:30:54,360
And this is where the shift happens.

796
00:30:54,360 --> 00:30:56,760
Copilot doesn't just look at what's on the report page,

797
00:30:56,760 --> 00:31:01,800
it can ask questions that drill into every single corner of the data the user is technically allowed to touch.

798
00:31:01,800 --> 00:31:03,480
Think about how this plays out in real life.

799
00:31:03,480 --> 00:31:06,040
A salesperson in the West has read access to a model,

800
00:31:06,040 --> 00:31:09,400
and their RLS rule limits them to West region transactions.

801
00:31:09,400 --> 00:31:11,880
They open a report and see their regional pipeline,

802
00:31:11,880 --> 00:31:13,560
which is exactly what you expected.

803
00:31:13,560 --> 00:31:17,080
But then they open Copilot and ask for a year-over-year comparison of sales

804
00:31:17,080 --> 00:31:19,480
by product category for their specific region.

805
00:31:19,480 --> 00:31:22,520
Copilot takes that request and runs it through the semantic model

806
00:31:22,520 --> 00:31:24,520
while applying the RLS filters.

807
00:31:24,520 --> 00:31:25,880
It returns the comparison,

808
00:31:25,880 --> 00:31:28,440
and suddenly the salesperson sees insights

809
00:31:28,440 --> 00:31:30,120
that were never in a pre-built report

810
00:31:30,120 --> 00:31:33,160
because no one thought to combine those specific dimensions before.

811
00:31:33,160 --> 00:31:34,360
This isn't a data breach.

812
00:31:34,360 --> 00:31:36,120
The data is there, the user is authorized,

813
00:31:36,120 --> 00:31:37,960
and RLS is doing its job perfectly.

814
00:31:37,960 --> 00:31:41,080
However, the user now has visibility into analytical slices

815
00:31:41,080 --> 00:31:43,000
that no one explicitly planned for them to see.

816
00:31:43,000 --> 00:31:45,240
They can ask questions you didn't anticipate.

817
00:31:45,240 --> 00:31:49,640
They can discover performance trends that used to require a formal request to the analytics team.

818
00:31:49,640 --> 00:31:51,400
This is what I mean by amplification.

819
00:31:51,400 --> 00:31:53,080
You aren't bypassing security,

820
00:31:53,080 --> 00:31:57,000
but you are amplifying a user's ability to reason over the data they already own.

821
00:31:57,000 --> 00:31:58,520
That is incredibly powerful,

822
00:31:58,520 --> 00:32:01,640
but it's also why your governance has to become much more intentional.

823
00:32:01,640 --> 00:32:05,400
The problem is that most RLS was designed with static reports in mind.

824
00:32:05,400 --> 00:32:08,040
You build roles around what should be visible on a dashboard,

825
00:32:08,040 --> 00:32:11,160
assuming the user wouldn't ask questions outside of those boundaries.

826
00:32:11,160 --> 00:32:13,800
That assumption breaks the moment you introduce Copilot.

827
00:32:13,800 --> 00:32:16,760
The user will ask questions that no report ever exposed,

828
00:32:16,760 --> 00:32:18,120
simply because they can.

829
00:32:18,120 --> 00:32:20,920
The real question you have to answer is whether that's okay.

830
00:32:20,920 --> 00:32:23,480
If a manager uses Copilot to find product insights

831
00:32:23,480 --> 00:32:25,080
that weren't in the official reports,

832
00:32:25,080 --> 00:32:25,960
was that your intention?

833
00:32:25,960 --> 00:32:28,440
If the answer is yes, then your RLS is fine.

834
00:32:28,440 --> 00:32:31,160
If the answer is no, you have a massive gap in your strategy.

835
00:32:31,160 --> 00:32:33,960
This is the point where RLS stops being a technical setting

836
00:32:33,960 --> 00:32:35,560
and becomes a governance decision.

837
00:32:35,560 --> 00:32:38,280
You aren't just asking what data a user should see anymore.

838
00:32:38,280 --> 00:32:42,440
You're asking what predictions and analyses that user should be allowed to request.

839
00:32:42,440 --> 00:32:45,000
The answers to those two questions might be very different.

840
00:32:45,000 --> 00:32:49,560
Imagine a customer service rep who has access to order history to help resolve issues.

841
00:32:49,560 --> 00:32:53,320
In a standard report, they see order details for their assigned customers,

842
00:32:53,320 --> 00:32:55,080
which is exactly what you intended,

843
00:32:55,080 --> 00:33:00,200
but now they can ask Copilot which of their customers are most likely to churn next quarter.

844
00:33:00,200 --> 00:33:03,480
Copilot routes that to your churn prediction measure and applies RLS,

845
00:33:03,480 --> 00:33:05,320
so they only see their own customers.

846
00:33:05,320 --> 00:33:09,160
They get the answer and start reaching out to at-risk accounts proactively.

847
00:33:09,160 --> 00:33:10,040
Is that a good thing?

848
00:33:10,040 --> 00:33:11,000
In most cases, yes.

849
00:33:11,000 --> 00:33:13,240
It's a perfect example of intended amplification

850
00:33:13,240 --> 00:33:16,280
where a rep uses predictive insights to do their job better.

851
00:33:16,280 --> 00:33:17,960
But you have to decide that on purpose.

852
00:33:17,960 --> 00:33:21,560
You have to design your RLS rules with that specific use case in mind

853
00:33:21,560 --> 00:33:24,200
to ensure the role configuration actually allows it.

854
00:33:24,200 --> 00:33:26,120
The inverse risk is just as real.

855
00:33:26,120 --> 00:33:28,680
If you aren't careful about what data a user can touch,

856
00:33:28,680 --> 00:33:32,600
Copilot will find ways to reason over sensitive info you never meant to highlight.

857
00:33:32,600 --> 00:33:37,960
A user with basic read access to payroll data could ask Copilot for a comparative salary analysis

858
00:33:37,960 --> 00:33:39,400
or ask it to flag outliers.

859
00:33:39,400 --> 00:33:42,600
If RLS isn't set up to stop that, Copilot will give them the answer.

860
00:33:42,600 --> 00:33:46,360
The architecture is forcing governance out of the abstract and into the concrete.

861
00:33:46,360 --> 00:33:50,520
RLS is now the control surface where you decide what users can do with AI,

862
00:33:50,520 --> 00:33:52,280
not just what they can see on a screen.

863
00:33:52,280 --> 00:33:55,640
The choices you make here will directly shape what happens when Copilot starts

864
00:33:55,640 --> 00:33:57,240
orchestrating your predictive models.

865
00:33:57,240 --> 00:33:59,720
Object-level security and AI-ready models.

866
00:33:59,720 --> 00:34:03,560
RLS handles the rows but there is another layer of security that is now critical

867
00:34:03,560 --> 00:34:05,320
because Copilot can see your entire model.

868
00:34:05,320 --> 00:34:07,560
This is object-level security, or OLS.

869
00:34:07,560 --> 00:34:13,160
While RLS filters rows, OLS operates at the table or column level to hide entire fields from specific roles.

870
00:34:13,160 --> 00:34:16,680
This distinction is a huge deal when you are getting ready for Copilot access.

871
00:34:16,680 --> 00:34:18,760
Think about the difference in how these two work.

872
00:34:18,760 --> 00:34:22,680
With RLS, an HR person might only see employees in their specific department.

873
00:34:22,680 --> 00:34:25,160
The rows are filtered but they still see every column,

874
00:34:25,160 --> 00:34:27,880
including names, salaries and performance ratings.

875
00:34:27,880 --> 00:34:29,640
With OLS you can go a step further.

876
00:34:29,640 --> 00:34:33,480
You can set it up so that the HR person sees everyone in their department

877
00:34:33,480 --> 00:34:35,400
but the salary column is completely gone.

878
00:34:35,400 --> 00:34:36,920
It isn't just blurred or masked.

879
00:34:36,920 --> 00:34:37,800
It is hidden.

880
00:34:37,800 --> 00:34:40,600
For that specific user, the column doesn't even exist.

881
00:34:40,600 --> 00:34:45,320
This is essential for sensitive data like health records, credit card numbers or disciplinary files.

882
00:34:45,320 --> 00:34:46,920
These aren't things you just filter.

883
00:34:46,920 --> 00:34:49,320
You exclude them from certain roles entirely.

884
00:34:49,320 --> 00:34:51,800
An HR coordinator needs names and departments

885
00:34:51,800 --> 00:34:53,960
but they have no reason to see salary data.

886
00:34:53,960 --> 00:34:57,000
OLS ensures they can't see it by making the field invisible

887
00:34:57,000 --> 00:34:58,520
within the semantic model itself.
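[Editor's sketch] The difference between filtering rows and removing columns can be illustrated with a toy example. This is a plain-Python simulation, not Power BI's security engine: the `Role` class, the `query` function, and the employee data are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet

# Toy employee table; every name and value here is made up.
EMPLOYEES = [
    {"name": "Ana", "department": "HR", "salary": 72000},
    {"name": "Ben", "department": "HR", "salary": 65000},
    {"name": "Cho", "department": "Sales", "salary": 80000},
]


@dataclass(frozen=True)
class Role:
    """Hypothetical role: a row predicate (RLS) plus hidden columns (OLS)."""
    name: str
    row_filter: Callable[[dict], bool]
    hidden_columns: FrozenSet[str]


def query(rows, role):
    """Return only the rows and columns this role is allowed to see."""
    visible = []
    for row in rows:
        if role.row_filter(row):  # RLS: drop rows outside the role's scope
            visible.append(
                {k: v for k, v in row.items() if k not in role.hidden_columns}
            )  # OLS: the hidden column is absent, not masked
    return visible


def in_hr(row):
    return row["department"] == "HR"


# RLS only: HR rows, but every column (including salary) is still visible.
hr_manager = Role("hr_manager", in_hr, frozenset())
# RLS plus OLS: same rows, and the salary column does not exist for this role.
hr_coordinator = Role("hr_coordinator", in_hr, frozenset({"salary"}))

manager_view = query(EMPLOYEES, hr_manager)
coordinator_view = query(EMPLOYEES, hr_coordinator)
```

In this sketch, `manager_view` still carries salaries for the two HR rows, while `coordinator_view` has the same two rows with no `salary` key at all.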

888
00:34:58,520 --> 00:35:01,080
Here is the part people miss.

889
00:35:01,080 --> 00:35:04,760
Hiding a column in the report interface is not the same as OLS.

890
00:35:04,760 --> 00:35:08,200
I see organizations hide salary columns from reports all the time

891
00:35:08,200 --> 00:35:11,720
and think they are secure but those columns are still sitting in the underlying data.

892
00:35:11,720 --> 00:35:15,000
If a user clicks the Copilot button and asks a question about pay,

893
00:35:15,000 --> 00:35:16,440
they might get a full answer.

894
00:35:16,440 --> 00:35:19,240
The data is there and the report just isn't showing it.

895
00:35:19,240 --> 00:35:22,040
Since Copilot has access to the whole model by default,

896
00:35:22,040 --> 00:35:25,160
nothing stops it from reading those hidden columns unless OLS is turned on.

897
00:35:25,800 --> 00:35:29,720
This is where Copilot's reasoning becomes a liability if your OLS isn't right.

898
00:35:29,720 --> 00:35:31,560
Copilot doesn't care about the report layer.

899
00:35:31,560 --> 00:35:33,400
It looks straight at the semantic model.

900
00:35:33,400 --> 00:35:36,200
If sensitive columns aren't protected, Copilot will use them.

901
00:35:36,200 --> 00:35:39,400
A user might ask a simple question about what's driving headcount costs

902
00:35:39,400 --> 00:35:42,120
and Copilot will pull salary data to explain the variance.

903
00:35:42,120 --> 00:35:46,040
That data was never supposed to be accessible but it was because you only hid it in the UI.

904
00:35:46,040 --> 00:35:47,560
OLS stops this at the source.

905
00:35:47,560 --> 00:35:51,880
You configure it at the model level and specify which columns belong to which security group.

906
00:35:51,880 --> 00:35:56,440
Salary data gets a specific classification and employees only see it if they have the right permission.

907
00:35:56,440 --> 00:35:58,520
Copilot inherits these rules automatically.

908
00:35:58,520 --> 00:36:02,440
When it looks at the model, it respects OLS just like it respects RLS.

909
00:36:02,440 --> 00:36:04,200
Sensitive columns stay locked away.

910
00:36:04,200 --> 00:36:07,560
The user can't ask Copilot to show them because from their perspective,

911
00:36:07,560 --> 00:36:08,600
those columns aren't there.

912
00:36:08,600 --> 00:36:11,720
This is exactly why the prepped for AI designation exists.

913
00:36:11,720 --> 00:36:14,440
Microsoft has a formal process for marking a model this way

914
00:36:14,440 --> 00:36:15,960
and it isn't just a marketing label.

915
00:36:15,960 --> 00:36:17,480
It is a governance checkpoint.

916
00:36:17,480 --> 00:36:19,480
When you mark a model as ready for AI,

917
00:36:19,480 --> 00:36:21,800
you are telling the system that a few things are true.

918
00:36:21,800 --> 00:36:23,800
You've provided clear names and descriptions.

919
00:36:23,800 --> 00:36:25,800
You've built proper relationships and measures.

920
00:36:25,800 --> 00:36:29,720
But most importantly, you've confirmed that RLS and OLS are configured correctly.

921
00:36:29,720 --> 00:36:33,480
You've thought about who should have access and you've tested your roles against Copilot.

922
00:36:33,480 --> 00:36:36,520
This certification tells your organization that the model is safe.

923
00:36:36,520 --> 00:36:40,760
Users can trust Copilot to give them answers without accidentally leaking sensitive details.

924
00:36:40,760 --> 00:36:44,520
Your data science team can build predictive models without worrying about security gaps.

925
00:36:44,520 --> 00:36:48,440
Your governance team can audit the model and know exactly what access really means.

926
00:36:48,440 --> 00:36:52,280
The prepped for AI label is the new foundation for your access control.

927
00:36:52,280 --> 00:36:54,680
It isn't just saying the model works with Copilot.

928
00:36:54,680 --> 00:36:57,080
It is a guarantee that the model has been audited,

929
00:36:57,080 --> 00:36:59,480
hardened and approved for AI-driven analysis.

930
00:36:59,480 --> 00:37:02,360
The prepped for AI certification.

931
00:37:02,360 --> 00:37:04,680
Let's move from the theory to the actual process.

932
00:37:04,680 --> 00:37:08,520
Microsoft has built a specific workflow in Power BI that turns prepped for AI

933
00:37:08,520 --> 00:37:11,000
from a vague goal into a measurable certification.

934
00:37:11,000 --> 00:37:14,600
You don't just declare that your model is ready for Copilot and hope for the best.

935
00:37:14,600 --> 00:37:17,160
Instead, you work through a structured checklist

936
00:37:17,160 --> 00:37:19,720
to confirm every single element before testing it.

937
00:37:19,720 --> 00:37:21,640
At the end of that process, you receive a badge.

938
00:37:21,640 --> 00:37:24,360
That badge signals something very specific.

939
00:37:24,360 --> 00:37:28,680
This model has been audited and is officially safe for AI-driven predictions.

940
00:37:28,680 --> 00:37:31,160
The workflow starts with naming and descriptions.

941
00:37:31,160 --> 00:37:33,720
We've talked about this a lot, so I'll keep it brief.

942
00:37:33,720 --> 00:37:38,440
Every table needs a clear name, every column needs a description that makes sense to a business user,

943
00:37:38,440 --> 00:37:43,320
and every measure needs an unambiguous name with its DAX logic documented in plain English.

944
00:37:43,320 --> 00:37:45,240
This isn't just a nice-to-have feature.

945
00:37:45,240 --> 00:37:48,760
It is the foundation of the entire system because without these labels,

946
00:37:48,760 --> 00:37:51,480
Copilot simply cannot understand what it's looking at.

947
00:37:51,480 --> 00:37:54,200
Next, you have to look at your relationships and measures.

948
00:37:54,200 --> 00:37:56,680
Your semantic model must be structurally sound,

949
00:37:56,680 --> 00:38:01,160
which means tables are related correctly and cardinality is explicitly defined.

950
00:38:01,160 --> 00:38:05,320
You need to ensure hierarchies are set up and measures are calculated consistently across the board.

951
00:38:05,320 --> 00:38:10,120
This is basic data architecture that has been required for good reporting since the beginning,

952
00:38:10,120 --> 00:38:12,520
but now it's a mandatory check for AI readiness.

953
00:38:12,520 --> 00:38:14,600
Then we get to the security configuration.

954
00:38:14,600 --> 00:38:19,960
You have to configure and test your row-level and object-level security to ensure they work exactly as intended.

955
00:38:19,960 --> 00:38:24,360
You define the roles, specify which rows or columns each role can see,

956
00:38:24,360 --> 00:38:27,000
and then run tests to prove those rules hold up.

957
00:38:27,000 --> 00:38:29,880
This step forces you to be intentional about governance.

958
00:38:29,880 --> 00:38:31,560
You aren't just checking a box here.

959
00:38:31,560 --> 00:38:35,400
You are documenting exactly who has access to what and why that access exists.

960
00:38:35,400 --> 00:38:38,200
Now we hit the part that is brand new and specific to AI.

961
00:38:38,200 --> 00:38:41,000
The fourth requirement is defining your AI schema.

962
00:38:41,000 --> 00:38:43,160
This is where you explicitly tell Power BI

963
00:38:43,160 --> 00:38:45,960
which tables and measures Copilot is allowed to use.

964
00:38:45,960 --> 00:38:48,520
You don't want to expose the entire model by default,

965
00:38:48,520 --> 00:38:49,960
so you curate the experience.

966
00:38:49,960 --> 00:38:53,400
You might have technical tables used for background calculations or development measures

967
00:38:53,400 --> 00:38:54,360
that aren't finished yet.

968
00:38:54,360 --> 00:38:55,960
You hide those from the AI.

969
00:38:55,960 --> 00:39:00,680
Your AI schema defines the specific surface area that Copilot is authorized to reason over.

970
00:39:00,680 --> 00:39:03,880
This curation is vital because it reduces the chance of the AI hallucinating.

971
00:39:03,880 --> 00:39:06,920
If your model has 30 tables and Copilot can see all of them,

972
00:39:06,920 --> 00:39:09,080
it might try to join data that shouldn't be joined.

973
00:39:09,080 --> 00:39:11,400
It might find a correlation between two things

974
00:39:11,400 --> 00:39:14,760
that aren't actually related just because they happen to be in the same model.

975
00:39:14,760 --> 00:39:17,000
By restricting the AI to a curated subset,

976
00:39:17,000 --> 00:39:21,400
you ensure that Copilot stays within the boundaries of what users should actually be asking about.
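[Editor's sketch] The idea of a curated surface can be pictured as an allowlist over the model's tables. The snippet below is a toy illustration in plain Python and does not reflect how Power BI stores or applies an AI schema; `MODEL`, `AI_SCHEMA`, and `exposed_surface` are hypothetical names.

```python
# Hypothetical model inventory: table name -> measure names.
MODEL = {
    "Sales": ["Total Net Revenue", "Order Count"],
    "Customers": ["Active Customers"],
    "_CalcHelper": ["Scratch Ratio"],   # technical table for background calculations
    "Dev": ["Draft Margin"],            # unfinished development measures
}

# The curated subset that the assistant is allowed to reason over.
AI_SCHEMA = {"Sales", "Customers"}


def exposed_surface(model, ai_schema):
    """Return only the allowlisted tables; fail loudly on unknown names."""
    unknown = ai_schema - set(model)
    if unknown:
        raise ValueError(f"AI schema references unknown tables: {sorted(unknown)}")
    return {table: list(measures) for table, measures in model.items()
            if table in ai_schema}


surface = exposed_surface(MODEL, AI_SCHEMA)
```

An allowlist (rather than a blocklist) is used here so that a newly added technical table stays hidden until someone deliberately exposes it.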

977
00:39:21,400 --> 00:39:23,240
Finally, you define your AI instructions.

978
00:39:23,240 --> 00:39:27,400
These are natural language guidelines that tell Copilot how to handle confusing requests.

979
00:39:27,400 --> 00:39:31,160
For example, your company might have three different definitions for revenue

980
00:39:31,160 --> 00:39:32,360
depending on the department.

981
00:39:32,360 --> 00:39:35,400
A user might just ask for revenue without being specific,

982
00:39:35,400 --> 00:39:39,000
so your instruction would tell Copilot to assume they mean total net revenue

983
00:39:39,000 --> 00:39:40,600
unless they say otherwise.

984
00:39:40,600 --> 00:39:43,720
Copilot reads that instruction and follows it every single time.

985
00:39:43,720 --> 00:39:46,840
The certification process walks you through every one of these steps.

986
00:39:46,840 --> 00:39:49,640
You confirm the names are clear, validate the security,

987
00:39:49,640 --> 00:39:51,160
and set the AI boundaries.

988
00:39:51,160 --> 00:39:54,920
Once that's done, the model is officially marked as prepped for AI.

989
00:39:54,920 --> 00:39:58,360
This badge tells your team that the model has passed a real governance review

990
00:39:58,360 --> 00:40:00,200
and is approved for predictive access.

991
00:40:00,200 --> 00:40:03,080
It turns the model into something data scientists can build on

992
00:40:03,080 --> 00:40:05,640
and something Copilot can reason over reliably.

993
00:40:05,640 --> 00:40:07,400
This isn't just more bureaucracy.

994
00:40:07,400 --> 00:40:08,600
It's a guarantee of quality.

995
00:40:09,240 --> 00:40:11,720
Semantic Link Labs and governance automation.

996
00:40:11,720 --> 00:40:14,200
You've prepared your first model, it's certified,

997
00:40:14,200 --> 00:40:15,480
and it's ready for the AI.

998
00:40:15,480 --> 00:40:18,600
But in a large organization, you don't just have one model to worry about.

999
00:40:18,600 --> 00:40:22,920
You likely have dozens or even hundreds of them spread across different teams and business units.

1000
00:40:22,920 --> 00:40:24,200
Some of these are well maintained,

1001
00:40:24,200 --> 00:40:29,080
but others have been neglected for years with metadata that hasn't been touched since 2019.

1002
00:40:29,080 --> 00:40:31,720
If you want to scale Copilot across the whole company,

1003
00:40:31,720 --> 00:40:35,080
every single one of those models needs to meet the same high standards.

1004
00:40:35,080 --> 00:40:36,920
This is where manual governance falls apart.

1005
00:40:36,920 --> 00:40:39,000
You cannot audit hundreds of models by hand.

1006
00:40:39,320 --> 00:40:42,040
And you certainly can't update every description one by one.

1007
00:40:42,040 --> 00:40:44,360
You need a way to handle governance at scale.

1008
00:40:44,360 --> 00:40:47,160
And that is exactly why Semantic Link Labs exists.

1009
00:40:47,160 --> 00:40:51,400
Semantic Link Labs is a specialized toolkit built on top of the standard Semantic Link.

1010
00:40:51,400 --> 00:40:54,040
It isn't a new product, but rather an enhancement

1011
00:40:54,040 --> 00:40:57,480
that gives you programmatic access to tasks that used to require

1012
00:40:57,480 --> 00:40:59,560
clicking through the Power BI interface.

1013
00:40:59,560 --> 00:41:01,720
You can manage roles, configure security filters,

1014
00:41:01,720 --> 00:41:04,760
and update descriptions across hundreds of models using a single script.

1015
00:41:04,760 --> 00:41:07,160
All of this happens through code in a Fabric notebook,

1016
00:41:07,160 --> 00:41:08,840
which is much faster than manual work.

1017
00:41:08,840 --> 00:41:10,680
Take role management as an example.

1018
00:41:10,680 --> 00:41:13,560
Usually you'd have to open Power BI Desktop, create a role,

1019
00:41:13,560 --> 00:41:15,800
and set up the filters for every single model.

1020
00:41:15,800 --> 00:41:18,840
If you have 50 models, you're doing that work 50 times.

1021
00:41:18,840 --> 00:41:20,200
With Semantic Link Labs,

1022
00:41:20,200 --> 00:41:23,400
you write one Python script that defines the role and its filters

1023
00:41:23,400 --> 00:41:25,800
and then you apply it to all 50 models at once.

1024
00:41:25,800 --> 00:41:29,080
This ensures that your security is consistent across the entire environment.
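[Editor's sketch] The "define once, apply everywhere" pattern looks roughly like the loop below. This is a stand-in that uses plain dictionaries instead of real semantic models and deliberately does not call Semantic Link Labs; `ROLE`, `apply_role`, and the model names are hypothetical, and the table and column in the DAX filter string are placeholders.

```python
import copy

# Hypothetical role definition. The filter is a DAX expression kept as a string.
ROLE = {
    "name": "SalesRep",
    "table_filters": {"Sales": "[SalesRepEmail] = USERPRINCIPALNAME()"},
}


def apply_role(models, role):
    """Add or overwrite the role on every model; return the names that changed."""
    changed = []
    for model_name, model in models.items():
        roles = model.setdefault("roles", {})
        if roles.get(role["name"]) != role["table_filters"]:
            roles[role["name"]] = copy.deepcopy(role["table_filters"])
            changed.append(model_name)
    return changed


# Fifty stand-in models, represented as plain dicts.
models = {f"sales_model_{i:02d}": {"roles": {}} for i in range(1, 51)}

first_run = apply_role(models, ROLE)
second_run = apply_role(models, ROLE)  # idempotent: nothing left to change
```

Making the function idempotent means the same script can run on a schedule and only report models that drifted from the standard definition.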

1025
00:41:29,080 --> 00:41:31,000
The same logic applies to managing users.

1026
00:41:31,000 --> 00:41:32,760
Instead of manually adding people to roles,

1027
00:41:32,760 --> 00:41:35,240
you can write a script that talks to your HR system.

1028
00:41:35,240 --> 00:41:37,160
When a new person joins the sales team,

1029
00:41:37,160 --> 00:41:39,480
the script identifies them and automatically adds them

1030
00:41:39,480 --> 00:41:42,760
to the correct roles across every sales-related model in the company.

1031
00:41:42,760 --> 00:41:44,760
When they leave, the script removes them.

1032
00:41:44,760 --> 00:41:47,000
This turns governance into an automated workflow

1033
00:41:47,000 --> 00:41:49,160
rather than a chore for your admins.

1034
00:41:49,160 --> 00:41:51,080
Metadata management scales in the same way.

1035
00:41:51,080 --> 00:41:53,160
If you need to ensure every measure has a description,

1036
00:41:53,160 --> 00:41:54,680
you don't have to check them manually.

1037
00:41:54,680 --> 00:41:57,080
You can run a validation script that scans every model,

1038
00:41:57,080 --> 00:41:58,840
flags anything that's missing a description,

1039
00:41:58,840 --> 00:42:00,280
and generates a report.
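[Editor's sketch] A description audit of this kind boils down to scanning metadata and collecting the gaps. The example below works on a hypothetical in-memory inventory; in a real Fabric notebook the inventory would come from the model metadata, which is not shown here. `find_missing_descriptions` and the sample models are invented for illustration.

```python
def find_missing_descriptions(models):
    """Return sorted (model, measure) pairs whose description is empty or blank."""
    report = []
    for model_name, measures in models.items():
        for measure in measures:
            description = measure.get("description") or ""
            if not description.strip():
                report.append((model_name, measure["name"]))
    return sorted(report)


# Hypothetical metadata inventory: model name -> list of measure records.
inventory = {
    "Finance": [
        {"name": "Total Net Revenue", "description": "Revenue after returns and discounts."},
        {"name": "Gross Margin", "description": ""},
    ],
    "Sales": [
        {"name": "Order Count", "description": None},
        {"name": "Win Rate", "description": "   "},
    ],
}

missing = find_missing_descriptions(inventory)
for model_name, measure_name in missing:
    print(f"{model_name}: measure '{measure_name}' has no description")
```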

1040
00:42:00,280 --> 00:42:02,280
From there, you can batch update those descriptions

1041
00:42:02,280 --> 00:42:04,840
using templates or have your team review them in bulk

1042
00:42:04,840 --> 00:42:05,800
before they go live.

1043
00:42:05,800 --> 00:42:08,680
Translation is another huge pain point that this toolkit solves.

1044
00:42:08,680 --> 00:42:10,280
If your company operates globally,

1045
00:42:10,280 --> 00:42:12,360
you need your model descriptions in multiple languages

1046
00:42:12,360 --> 00:42:14,040
like Spanish, French, and German.

1047
00:42:14,040 --> 00:42:15,640
Instead of translating these by hand,

1048
00:42:15,640 --> 00:42:17,880
you can write a script that sends your English descriptions

1049
00:42:17,880 --> 00:42:21,240
to a translation service and populates the results back into the models.

1050
00:42:21,240 --> 00:42:23,240
This allows you to update every language version

1051
00:42:23,240 --> 00:42:25,320
simultaneously without the manual effort.

1052
00:42:25,320 --> 00:42:27,960
Naming standards are where this gets even more powerful.

1053
00:42:27,960 --> 00:42:29,480
Your organization probably has rules

1054
00:42:29,480 --> 00:42:31,800
about how measures and columns should be named.

1055
00:42:31,800 --> 00:42:34,760
Labs can scan your entire fleet of models and flag any violations

1056
00:42:34,760 --> 00:42:36,520
like a measure named revenue that should be

1057
00:42:36,520 --> 00:42:38,600
"measure net revenue USD".

1058
00:42:38,600 --> 00:42:40,840
The governance team gets a clean report of these issues

1059
00:42:40,840 --> 00:42:42,920
and can decide which ones need an immediate fix

1060
00:42:42,920 --> 00:42:44,760
and which ones are acceptable exceptions.
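[Editor's sketch] A naming check is essentially a pattern match plus an exceptions list. The convention used below (two or more capitalized words, such as "Net Revenue USD") is an assumption made up for this example, not a rule from the episode or from Microsoft; `NAME_PATTERN`, `ACCEPTED_EXCEPTIONS`, and `find_violations` are hypothetical.

```python
import re

# Assumed convention: at least two words, each starting with a capital letter.
NAME_PATTERN = re.compile(r"^[A-Z][A-Za-z0-9]*( [A-Z][A-Za-z0-9]*)+$")

# Names the governance team has explicitly accepted despite breaking the rule.
ACCEPTED_EXCEPTIONS = {"EBITDA"}


def find_violations(measure_names):
    """Return measure names that break the convention and are not excepted."""
    return [
        name for name in measure_names
        if name not in ACCEPTED_EXCEPTIONS and not NAME_PATTERN.match(name)
    ]


violations = find_violations(["revenue", "Net Revenue USD", "Order Count", "EBITDA"])
```

Keeping the exceptions in a reviewed list makes each deviation a recorded decision rather than an oversight.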

1061
00:42:44,760 --> 00:42:46,600
You can even use this for report validation.

1062
00:42:46,600 --> 00:42:49,560
If you retire a column that was used in 50 different reports,

1063
00:42:49,560 --> 00:42:52,280
finding those reports manually would take weeks of work.

1064
00:42:52,280 --> 00:42:54,520
With Labs, you run a script that identifies

1065
00:42:54,520 --> 00:42:56,600
every report referencing that old column.

1066
00:42:56,600 --> 00:42:58,120
It can even autofix the logic

1067
00:42:58,120 --> 00:43:00,520
if the change is simple or flag it for a human to review

1068
00:43:00,520 --> 00:43:01,480
if it's more complex.

1069
00:43:01,480 --> 00:43:03,960
This is the shift to governance as code.

1070
00:43:03,960 --> 00:43:05,400
You write the scripts, version them,

1071
00:43:05,400 --> 00:43:06,520
and run them on a schedule.

1072
00:43:06,520 --> 00:43:07,960
Because they are automated,

1073
00:43:07,960 --> 00:43:10,440
they aren't prone to human error or the inconsistency

1074
00:43:10,440 --> 00:43:11,720
that comes with manual work.

1075
00:43:11,720 --> 00:43:13,800
You move from managing one model at a time to managing

1076
00:43:13,800 --> 00:43:15,800
an entire enterprise fleet reliably.

1077
00:43:15,800 --> 00:43:18,440
Governance is no longer just a final checkpoint.

1078
00:43:18,440 --> 00:43:21,880
It becomes a continuous part of how your models live and breathe.

1079
00:43:21,880 --> 00:43:23,720
Building the data pipeline:

1080
00:43:23,720 --> 00:43:25,080
from source to prediction.

1081
00:43:25,080 --> 00:43:27,240
The reality of raw data is that it's messy.

1082
00:43:27,240 --> 00:43:29,480
It lives in ERPs, transactional systems,

1083
00:43:29,480 --> 00:43:31,960
and third party databases designed for processing,

1084
00:43:31,960 --> 00:43:33,160
not for analysis.

1085
00:43:33,160 --> 00:43:34,360
To make this data useful,

1086
00:43:34,360 --> 00:43:36,040
you first have to get it into Fabric.

1087
00:43:36,040 --> 00:43:37,480
Everything starts in the Fabric lakehouse.

1088
00:43:37,480 --> 00:43:38,600
This is your bronze layer,

1089
00:43:38,600 --> 00:43:40,520
where data lands exactly as it exists

1090
00:43:40,520 --> 00:43:42,520
in the source system without any transformations.

1091
00:43:42,520 --> 00:43:45,240
You aren't worried about schema consistency or quality yet

1092
00:43:45,240 --> 00:43:47,480
because the lakehouse acts as a staging ground

1093
00:43:47,480 --> 00:43:49,720
for accumulating information from everywhere.

1094
00:43:49,720 --> 00:43:51,240
Once the data is staged,

1095
00:43:51,240 --> 00:43:52,920
your engineering team builds the pipelines

1096
00:43:52,920 --> 00:43:54,360
that actually clean it.

1097
00:43:54,360 --> 00:43:56,360
These pipelines handle missing values,

1098
00:43:56,360 --> 00:43:57,560
standardize formats,

1099
00:43:57,560 --> 00:44:00,600
and resolve duplicates to turn chaos into a cohesive structure.

1100
00:44:00,600 --> 00:44:01,880
This creates the silver layer,

1101
00:44:01,880 --> 00:44:04,920
which is a cleaner and de-duplicated version of your records.

1102
00:44:04,920 --> 00:44:08,280
Many organizations take it a step further by building a gold layer,

1103
00:44:08,280 --> 00:44:10,040
which is where fact and dimension tables

1104
00:44:10,040 --> 00:44:12,360
are finally constructed for high-level analysis.

1105
00:44:12,360 --> 00:44:15,480
On top of this prepared data sits the semantic model.

1106
00:44:15,480 --> 00:44:18,360
This is where the shift from engineering to meaning happens.

1107
00:44:18,360 --> 00:44:20,440
The semantic model doesn't move data around,

1108
00:44:20,440 --> 00:44:22,200
but it defines what that data represents

1109
00:44:22,200 --> 00:44:24,280
by creating measures for business metrics.

1110
00:44:24,280 --> 00:44:26,440
It establishes relationships between tables,

1111
00:44:26,440 --> 00:44:27,560
adds hierarchies,

1112
00:44:27,560 --> 00:44:29,160
and locks down row-level security

1113
00:44:29,160 --> 00:44:31,000
so everyone sees only what they should.

1114
00:44:31,000 --> 00:44:33,320
Now your predictive models can enter the picture.

1115
00:44:33,320 --> 00:44:35,080
You open a Fabric notebook and use SemPy

1116
00:44:35,080 --> 00:44:37,800
to pull in the measures directly from your semantic model.

1117
00:44:37,800 --> 00:44:39,560
If you're predicting customer churn,

1118
00:44:39,560 --> 00:44:41,800
you need historical outcomes and behavioral features

1119
00:44:41,800 --> 00:44:43,960
like transaction size or purchase frequency.

1120
00:44:43,960 --> 00:44:46,920
Because the semantic model already has these measures defined,

1121
00:44:46,920 --> 00:44:48,200
SemPy retrieves them

1122
00:44:48,200 --> 00:44:50,520
with the relationships and hierarchies intact.

1123
00:44:50,520 --> 00:44:52,920
Your data science team then trains an algorithm

1124
00:44:52,920 --> 00:44:54,200
on this grounded data.

1125
00:44:54,200 --> 00:44:56,760
The model learns based on your specific business definitions

1126
00:44:56,760 --> 00:44:57,720
and once it's validated,

1127
00:44:57,720 --> 00:45:00,760
it scores every active customer for churn probability.

1128
00:45:00,760 --> 00:45:02,600
These results are written back into the lakehouse

1129
00:45:02,600 --> 00:45:05,000
as a new table called churn predictions,

1130
00:45:05,000 --> 00:45:08,040
which includes customer IDs, scores, and timestamps.

1131
00:45:08,040 --> 00:45:09,480
But the circle only closes

1132
00:45:09,480 --> 00:45:12,040
when you connect those predictions back to the business.

1133
00:45:12,040 --> 00:45:14,520
You create a new measure in your semantic model

1134
00:45:14,520 --> 00:45:16,680
that reads from that churn predictions table.

1135
00:45:16,680 --> 00:45:19,400
This measure aggregates the data by region or product

1136
00:45:19,400 --> 00:45:22,280
so Copilot can reference it just like any other metric.
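[Editor's sketch] The aggregation step can be pictured as an average of churn scores grouped by region, restricted to the customers the current user may see. In the architecture described this would be a DAX measure over the predictions table; the plain-Python version below only illustrates the logic. The sample rows, `CUSTOMER_REGION`, and `avg_churn_by_region` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Toy stand-in for the predictions table: customer ID, score, timestamp.
CHURN_PREDICTIONS = [
    {"customer_id": "C1", "score": 0.75, "scored_at": "2026-06-01"},
    {"customer_id": "C2", "score": 0.25, "scored_at": "2026-06-01"},
    {"customer_id": "C3", "score": 0.50, "scored_at": "2026-06-01"},
]

# Toy customer dimension: customer ID -> region.
CUSTOMER_REGION = {"C1": "West", "C2": "West", "C3": "East"}


def avg_churn_by_region(predictions, customer_region, allowed_customers=None):
    """Average churn score per region, optionally limited to allowed customers."""
    scores_by_region = defaultdict(list)
    for row in predictions:
        customer_id = row["customer_id"]
        # Simulated row-level security: skip customers outside the user's scope.
        if allowed_customers is not None and customer_id not in allowed_customers:
            continue
        scores_by_region[customer_region[customer_id]].append(row["score"])
    return {region: round(mean(scores), 3)
            for region, scores in scores_by_region.items()}


everyone = avg_churn_by_region(CHURN_PREDICTIONS, CUSTOMER_REGION)
one_rep = avg_churn_by_region(CHURN_PREDICTIONS, CUSTOMER_REGION, {"C1"})
```

The same function answers both callers; only the security scope changes, which mirrors the point that predictions reuse the model's existing rules.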

1137
00:45:22,280 --> 00:45:24,600
When a user asks which segments are at risk,

1138
00:45:24,600 --> 00:45:26,120
Copilot routes to this measure,

1139
00:45:26,120 --> 00:45:27,480
applies security rules,

1140
00:45:27,480 --> 00:45:30,680
and returns an answer grounded in your actual predictive model.

1141
00:45:30,680 --> 00:45:33,960
This is the full pipeline: engineering transforms the raw data,

1142
00:45:33,960 --> 00:45:35,720
the semantic model adds meaning,

1143
00:45:35,720 --> 00:45:38,040
and the predictive models run on that foundation.

1144
00:45:38,040 --> 00:45:40,680
The results flow back through the semantic layer

1145
00:45:40,680 --> 00:45:42,760
so users can interact with predictions

1146
00:45:42,760 --> 00:45:45,880
through the same interface they use for regular analytics.

1147
00:45:45,880 --> 00:45:49,000
The structural point here is that predictions don't exist in a vacuum.

1148
00:45:49,000 --> 00:45:51,160
They aren't separate from your BI infrastructure

1149
00:45:51,160 --> 00:45:54,440
but are integrated into it using the same definitions and security.

1150
00:45:54,440 --> 00:45:57,240
The pipeline isn't a straight line from source to notebook

1151
00:45:57,240 --> 00:45:59,640
but a loop that moves from the lake to the model

1152
00:45:59,640 --> 00:46:01,400
and back to the user as an insight.

1153
00:46:01,400 --> 00:46:04,520
This integration makes Copilot-driven predictions trustworthy.

1154
00:46:04,520 --> 00:46:07,480
Every insight the AI surfaces has been through a structured,

1155
00:46:07,480 --> 00:46:09,880
auditable pipeline grounded in your business rules.

1156
00:46:09,880 --> 00:46:12,440
It isn't a black box output from a random notebook

1157
00:46:12,440 --> 00:46:14,920
but a properly engineered business capability.

1158
00:46:14,920 --> 00:46:17,480
Connecting Copilot Studio to predictive agents.

1159
00:46:17,480 --> 00:46:19,480
We've looked at the architecture in the abstract

1160
00:46:19,480 --> 00:46:22,520
but now we need to see how you actually wire these systems together.

1161
00:46:22,520 --> 00:46:24,840
Copilot sits on one side and your model sits on the other

1162
00:46:24,840 --> 00:46:26,840
and the tool that connects them is Copilot Studio.

1163
00:46:26,840 --> 00:46:30,280
Copilot Studio is a low-code platform for building custom AI agents

1164
00:46:30,280 --> 00:46:31,640
that do more than just chat.

1165
00:46:31,640 --> 00:46:33,880
It's a separate environment where you create experiences

1166
00:46:33,880 --> 00:46:36,600
that can act, coordinate, and deliver specific outcomes.

1167
00:46:36,600 --> 00:46:37,960
And this is where you take your predictions

1168
00:46:37,960 --> 00:46:40,760
and make them accessible to your team in a controlled way.

1169
00:46:40,760 --> 00:46:42,600
The process starts with a basic agent.

1170
00:46:42,600 --> 00:46:44,440
You give it a name, define its purpose,

1171
00:46:44,440 --> 00:46:46,600
and write instructions for how it should behave.

1172
00:46:46,600 --> 00:46:48,600
If you're building a sales performance agent,

1173
00:46:48,600 --> 00:46:51,640
its job is to answer questions about pipelines and forecasts

1174
00:46:51,640 --> 00:46:54,440
while knowing exactly which data sources to trust.

1175
00:46:54,440 --> 00:46:57,880
The real power comes when you connect this agent to Fabric data agents.

1176
00:46:57,880 --> 00:47:00,120
A Fabric data agent is specialized because it knows

1177
00:47:00,120 --> 00:47:03,240
how to query specific semantic models and evaluate measures.

1178
00:47:03,240 --> 00:47:05,240
It respects row-level security

1179
00:47:05,240 --> 00:47:07,800
and returns results with full lineage information,

1180
00:47:07,800 --> 00:47:09,880
acting as a data-native tool for analytics.

1181
00:47:09,880 --> 00:47:12,200
In Copilot Studio, you can connect your main agent

1182
00:47:12,200 --> 00:47:15,080
to several of these specialized Fabric agents at once.

1183
00:47:15,080 --> 00:47:17,640
You might have one agent for sales, one for customers,

1184
00:47:17,640 --> 00:47:19,080
and one for the supply chain.

1185
00:47:19,080 --> 00:47:22,760
You configure the system so that when a user asks about the pipeline,

1186
00:47:22,760 --> 00:47:24,120
it routes to the sales agent,

1187
00:47:24,120 --> 00:47:26,120
but when they ask about supply constraints,

1188
00:47:26,120 --> 00:47:28,200
it coordinates between multiple agents.

1189
00:47:28,200 --> 00:47:30,920
This is what we call multi-agent orchestration.

1190
00:47:30,920 --> 00:47:33,240
Your Copilot Studio agent acts as the middleman

1191
00:47:33,240 --> 00:47:34,680
that interprets what the user wants

1192
00:47:34,680 --> 00:47:36,920
and decides which Fabric agent has the answer.

1193
00:47:36,920 --> 00:47:39,880
The Fabric agent executes the query against the semantic model

1194
00:47:39,880 --> 00:47:41,800
and returns the results with citations.

1195
00:47:41,800 --> 00:47:44,120
The orchestrator then evaluates if the answer is complete

1196
00:47:44,120 --> 00:47:46,760
before sending a natural language response back to the user.
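[Editor's sketch] The orchestration flow (interpret the question, pick the domain agents, collect answers with citations) can be mocked up as below. Real routing in Copilot Studio is driven by the agent's instructions and a language model, not by keyword matching; the keyword table, `route`, `make_agent`, and `orchestrate` are hypothetical stand-ins used only to show the shape of the pattern.

```python
# Hypothetical routing table: domain agent -> trigger keywords.
AGENT_KEYWORDS = {
    "sales": ("pipeline", "forecast", "deal"),
    "customers": ("churn", "customer"),
    "supply_chain": ("supply", "inventory"),
}


def route(question):
    """Return the domain agents whose keywords appear in the question."""
    text = question.lower()
    return [name for name, keywords in AGENT_KEYWORDS.items()
            if any(keyword in text for keyword in keywords)]


def make_agent(name):
    """Build a stub domain agent that returns an answer with a citation."""
    def agent(question):
        return {
            "answer": f"[{name}] result for: {question}",
            "citation": f"{name} semantic model",
        }
    return agent


AGENTS = {name: make_agent(name) for name in AGENT_KEYWORDS}


def orchestrate(question):
    """Route the question, call each matching agent, and merge the results."""
    targets = route(question)
    if not targets:
        return {"answer": "No domain agent matched this question.", "citations": []}
    results = [AGENTS[target](question) for target in targets]
    return {
        "answer": " ".join(result["answer"] for result in results),
        "citations": [result["citation"] for result in results],
    }


single = orchestrate("How is my pipeline looking?")
combined = orchestrate("Which supply constraints threaten the forecast?")
```

In this mock-up the first question reaches only the sales agent, while the second is answered by coordinating the sales and supply chain agents.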

1197
00:47:46,760 --> 00:47:49,320
This design pattern solves the problem of scale.

1198
00:47:49,320 --> 00:47:52,520
You can create different Fabric agents for every business domain

1199
00:47:52,520 --> 00:47:54,360
and wire them together flexibly.

1200
00:47:54,360 --> 00:47:56,120
It also centralizes governance

1201
00:47:56,120 --> 00:47:58,120
because each data agent is already configured

1202
00:47:58,120 --> 00:48:00,440
to enforce your existing security rules.

1203
00:48:00,440 --> 00:48:03,080
You aren't managing permissions in two different places.

1204
00:48:03,080 --> 00:48:05,160
In a real-world scenario, a user might ask

1205
00:48:05,160 --> 00:48:07,640
why there's a variance in their region's Q3 forecast.

1206
00:48:07,640 --> 00:48:09,960
The Copilot Studio agent identifies the intent

1207
00:48:09,960 --> 00:48:11,400
as forecast analysis

1208
00:48:11,400 --> 00:48:14,200
and sends the request to the correct Fabric agent.

1209
00:48:14,200 --> 00:48:16,360
That agent queries the semantic model,

1210
00:48:16,360 --> 00:48:18,600
applies the user's regional security rules,

1211
00:48:18,600 --> 00:48:20,200
and calculates the variances.

1212
00:48:20,200 --> 00:48:21,880
The user then receives a response explaining

1213
00:48:21,880 --> 00:48:24,680
that the forecast is up 12% due to three large deals

1214
00:48:24,680 --> 00:48:27,000
and an 8% increase in deal size.

1215
00:48:27,000 --> 00:48:28,680
This insight came from a real measure

1216
00:48:28,680 --> 00:48:31,160
in your semantic model that was secured and accurate.

1217
00:48:31,160 --> 00:48:32,840
The system delivered a reliable answer

1218
00:48:32,840 --> 00:48:35,320
to a specific question that wasn't even on a dashboard.

1219
00:48:35,320 --> 00:48:38,200
This is how Copilot becomes a legitimate tool for the business.

1220
00:48:38,200 --> 00:48:40,200
By connecting it through Fabric data agents

1221
00:48:40,200 --> 00:48:41,560
to your semantic models,

1222
00:48:41,560 --> 00:48:43,880
you've grounded the AI in structure.

1223
00:48:43,880 --> 00:48:45,560
You've moved past conversational toys

1224
00:48:45,560 --> 00:48:47,640
and built something that is secured, governed,

1225
00:48:47,640 --> 00:48:49,560
and completely trustworthy.

1226
00:48:49,560 --> 00:48:51,800
The role of Fabric data agents in prediction.

1227
00:48:51,800 --> 00:48:54,840
A Fabric data agent is fundamentally different from a chatbot,

1228
00:48:54,840 --> 00:48:56,200
and understanding this distinction

1229
00:48:56,200 --> 00:48:59,880
is the only way to make prediction work reliably at scale.

1230
00:48:59,880 --> 00:49:02,920
A chatbot responds to natural language by generating text,

1231
00:49:02,920 --> 00:49:05,160
but a Fabric data agent responds to intent

1232
00:49:05,160 --> 00:49:07,080
by executing structured operations

1233
00:49:07,080 --> 00:49:08,920
against your data infrastructure.

1234
00:49:08,920 --> 00:49:11,080
When you configure a Fabric data agent,

1235
00:49:11,080 --> 00:49:13,000
you aren't building a conversational interface.

1236
00:49:13,000 --> 00:49:14,440
You're building a query interpreter.

1237
00:49:14,440 --> 00:49:17,560
You specify which semantic models this agent can access

1238
00:49:17,560 --> 00:49:21,560
and you define exactly which tables and measures it has the authority to use.

1239
00:49:21,560 --> 00:49:24,760
You write instructions that explain how it should handle edge cases

1240
00:49:24,760 --> 00:49:27,560
and you might even provide example queries to guide its logic.

1241
00:49:27,560 --> 00:49:29,160
If a user asks about revenue,

1242
00:49:29,160 --> 00:49:31,000
you tell the agent to look for measures

1243
00:49:31,000 --> 00:49:33,720
starting with revenue and filter by the user's region.

1244
00:49:33,720 --> 00:49:35,640
These examples aren't conversation starters

1245
00:49:35,640 --> 00:49:37,080
because they function as templates

1246
00:49:37,080 --> 00:49:38,440
that guide the agent's reasoning

1247
00:49:38,440 --> 00:49:40,200
when it encounters similar requests.

1248
00:49:40,200 --> 00:49:42,040
This is where prediction enters the picture.

1249
00:49:42,040 --> 00:49:45,720
A Fabric data agent understands that some of your measures are predictive by nature.

1250
00:49:45,720 --> 00:49:48,120
You might have a measure called predicted churn risk

1251
00:49:48,120 --> 00:49:49,640
that returns a probability

1252
00:49:49,640 --> 00:49:52,600
or another called forecasted revenue next quarter

1253
00:49:52,600 --> 00:49:54,840
that returns a specific dollar amount.

1254
00:49:54,840 --> 00:49:58,040
The agent knows these are different from standard descriptive measures.

1255
00:49:58,040 --> 00:49:59,960
When it sees a request for a prediction,

1256
00:49:59,960 --> 00:50:02,760
it routes to the right measure and structures the query correctly

1257
00:50:02,760 --> 00:50:04,680
so it can interpret the results properly.

1258
00:50:04,680 --> 00:50:08,040
This matters because predictions are incredibly easy to misinterpret.

1259
00:50:08,040 --> 00:50:10,680
If a user asks Copilot what churn will be next month,

1260
00:50:10,680 --> 00:50:12,520
the request is completely ambiguous.

1261
00:50:12,520 --> 00:50:14,600
Do they want the probability for every single customer

1262
00:50:14,600 --> 00:50:17,080
or are they looking for the aggregate rate for the whole company?

1263
00:50:17,080 --> 00:50:20,440
They might need it for their specific region or just one product line.

1264
00:50:20,440 --> 00:50:23,720
A generic chatbot would simply generate a plausible-sounding answer

1265
00:50:23,720 --> 00:50:25,880
and might even invent a number to fill the gap.

1266
00:50:25,880 --> 00:50:28,040
A Fabric data agent with the right configuration

1267
00:50:28,040 --> 00:50:29,320
understands this ambiguity

1268
00:50:29,320 --> 00:50:31,720
and will ask clarifying questions instead of guessing.

1269
00:50:31,720 --> 00:50:33,320
If you've given it good instructions,

1270
00:50:33,320 --> 00:50:36,840
it can infer context from the user's role and data access.

1271
00:50:36,840 --> 00:50:40,200
A support manager likely wants churn predictions for their own customers

1272
00:50:40,200 --> 00:50:43,240
while a VP of retention probably needs company-wide totals.

1273
00:50:43,240 --> 00:50:45,720
The agent learns these patterns and applies them.
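As a rough sketch of that behavior, the logic is: use the scope the user stated, otherwise infer one from their role, otherwise ask. The role names, the `DEFAULT_SCOPE_BY_ROLE` mapping, and `plan_churn_query` are hypothetical and only illustrate the decision order, not how a data agent is actually configured.

```python
from typing import Optional

# Hypothetical mapping from a user's role to the scope they most likely want.
DEFAULT_SCOPE_BY_ROLE = {
    "support_manager": "own customers",
    "vp_retention": "company-wide",
}


def plan_churn_query(requested_scope: Optional[str], role: str) -> tuple[str, str]:
    """Return ("query", scope) when the scope is known, else ("clarify", question)."""
    if requested_scope:
        return ("query", requested_scope)
    inferred = DEFAULT_SCOPE_BY_ROLE.get(role)
    if inferred:
        return ("query", inferred)
    # Nothing to go on: ask instead of guessing.
    return ("clarify", "Do you want churn per customer, per region, or company-wide?")


print(plan_churn_query(None, "support_manager"))
print(plan_churn_query(None, "analyst"))
```

The important design choice is the final branch: when neither the request nor the role settles the scope, the function returns a question rather than a number.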

1274
00:50:45,720 --> 00:50:47,560
Once the agent formulates a query,

1275
00:50:47,560 --> 00:50:49,400
it executes it through the semantic model

1276
00:50:49,400 --> 00:50:51,960
which is the most critical step in the process.

1277
00:50:51,960 --> 00:50:53,640
The query goes through the exact same engine

1278
00:50:53,640 --> 00:50:55,480
that powers your official reports.

1279
00:50:55,480 --> 00:50:56,920
The relationships are applied,

1280
00:50:56,920 --> 00:50:58,360
the hierarchies are resolved

1281
00:50:58,360 --> 00:51:00,760
and row-level security is strictly enforced.

1282
00:51:00,760 --> 00:51:03,960
The measures are calculated using their original DAX definitions

1283
00:51:03,960 --> 00:51:06,360
so there is no approximation or guessing involved.

1284
00:51:06,360 --> 00:51:07,560
There is no separate code path

1285
00:51:07,560 --> 00:51:09,640
because the prediction moves through the same infrastructure

1286
00:51:09,640 --> 00:51:11,000
as everything else in your system.

1287
00:51:11,000 --> 00:51:14,040
The agent also validates the results before the user ever sees them.

1288
00:51:14,040 --> 00:51:15,960
It checks to see if the numbers are reasonable

1289
00:51:15,960 --> 00:51:18,040
and if they fall within expected ranges.

1290
00:51:18,040 --> 00:51:21,000
It ensures the result has the required metadata attached

1291
00:51:21,000 --> 00:51:24,040
such as timestamps, model versions, and confidence intervals.

1292
00:51:24,040 --> 00:51:26,280
This validation layer prevents nonsense results

1293
00:51:26,280 --> 00:51:27,480
from reaching the user.

1294
00:51:27,480 --> 00:51:29,800
If something looks wrong, the agent can escalate the issue

1295
00:51:29,800 --> 00:51:32,360
or decline to answer rather than returning garbage data.
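A validation step like this could look something like the sketch below. The result shape, the `REQUIRED_METADATA` keys, and the function names are assumptions made for illustration; they mirror the checks described here (a reasonable range plus timestamp, model version, and confidence interval), not a documented Fabric interface.

```python
# Metadata the transcript says should accompany every prediction (key names assumed).
REQUIRED_METADATA = ("timestamp", "model_version", "confidence_interval")


def validate_prediction(result: dict, expected_range: tuple[float, float]) -> list[str]:
    """Return a list of problems; an empty list means the result may be shown."""
    problems = []
    low, high = expected_range
    value = result.get("value")
    if value is None:
        problems.append("missing value")
    elif not (low <= value <= high):
        problems.append(f"value {value} outside expected range [{low}, {high}]")
    for key in REQUIRED_METADATA:
        if key not in result:
            problems.append(f"missing metadata: {key}")
    return problems


def respond(result: dict, expected_range: tuple[float, float]) -> str:
    """Decline to answer rather than return a result that fails validation."""
    problems = validate_prediction(result, expected_range)
    if problems:
        return "Declined: " + "; ".join(problems)
    return f"Predicted value: {result['value']}"


churn = {"value": 0.18, "timestamp": "2026-06-05T09:00:00Z",
         "model_version": "v3", "confidence_interval": (0.14, 0.22)}
print(respond(churn, (0.0, 1.0)))
print(respond({"value": 1.7}, (0.0, 1.0)))
```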

1296
00:51:32,360 --> 00:51:33,800
Then we have lineage and evidence.

1297
00:51:33,800 --> 00:51:35,960
When the Fabric data agent returns a result,

1298
00:51:35,960 --> 00:51:37,400
it doesn't just hand over a number.

1299
00:51:37,400 --> 00:51:38,840
It provides the full context.

1300
00:51:38,840 --> 00:51:40,840
It shows which semantic model was queried,

1301
00:51:40,840 --> 00:51:42,040
which measures were used,

1302
00:51:42,040 --> 00:51:43,560
and which filters were applied.

1303
00:51:43,560 --> 00:51:45,880
It even notes the user's role and security context

1304
00:51:45,880 --> 00:51:47,320
at the time of the request.

1305
00:51:47,320 --> 00:51:50,120
This lineage is the only way to build real trust.

1306
00:51:50,120 --> 00:51:52,840
When a user sees a prediction and wants to know where it came from,

1307
00:51:52,840 --> 00:51:54,360
the agent provides a trace

1308
00:51:54,360 --> 00:51:56,520
that is both auditable and repeatable.

1309
00:51:56,520 --> 00:51:58,200
If another user asks the same question,

1310
00:51:58,200 --> 00:51:59,480
they get the same result

1311
00:51:59,480 --> 00:52:02,520
because the same query executed through the same infrastructure.
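One way to picture such a trace is a small record of the query context with a stable fingerprint, so two identical requests can be shown to be identical. This is only a sketch: `LineageRecord`, its fields, and the model and measure names are hypothetical, chosen to match the items listed above.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class LineageRecord:
    semantic_model: str
    measures: tuple[str, ...]
    filters: tuple[tuple[str, str], ...]  # (column, value) pairs that were applied
    user_role: str

    def fingerprint(self) -> str:
        """Stable hash of the query context, so identical requests can be compared."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


first = LineageRecord("Sales Forecasting", ("Forecasted Revenue Next Quarter",),
                      (("Region", "West"),), "regional_manager")
second = LineageRecord("Sales Forecasting", ("Forecasted Revenue Next Quarter",),
                       (("Region", "West"),), "regional_manager")

# Same model, measures, filters, and role give the same fingerprint.
print(first.fingerprint() == second.fingerprint())
```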

1312
00:52:02,520 --> 00:52:03,960
This is a massive departure

1313
00:52:03,960 --> 00:52:06,120
from Copilot generating predictions on its own.

1314
00:52:06,120 --> 00:52:07,480
Left to its own devices,

1315
00:52:07,480 --> 00:52:09,480
Copilot would try to reason through the numbers

1316
00:52:09,480 --> 00:52:11,800
and generate plausible text before moving on.

1317
00:52:11,800 --> 00:52:14,280
A Fabric data agent executing a structured query

1318
00:52:14,280 --> 00:52:15,400
against a semantic model

1319
00:52:15,400 --> 00:52:18,200
returns something that can actually be tested and verified.

1320
00:52:18,200 --> 00:52:20,920
The prediction either came from the right measure or it didn't,

1321
00:52:20,920 --> 00:52:23,800
and the security rules either applied correctly or they didn't.

1322
00:52:23,800 --> 00:52:26,840
These are binary facts rather than a spectrum of confidence.

1323
00:52:26,840 --> 00:52:29,400
This is how Copilot finally becomes trustworthy.

1324
00:52:29,400 --> 00:52:31,640
It doesn't happen by making the AI smarter,

1325
00:52:31,640 --> 00:52:33,240
but by connecting it to systems

1326
00:52:33,240 --> 00:52:34,920
that are structured and validated.

1327
00:52:34,920 --> 00:52:36,760
The intelligence lives in your semantic models

1328
00:52:36,760 --> 00:52:38,120
and your predictive engines.

1329
00:52:38,120 --> 00:52:40,280
Copilot's only job is routing users to those systems

1330
00:52:40,280 --> 00:52:41,800
and explaining the results.

1331
00:52:41,800 --> 00:52:43,480
The Fabric data agent is the mechanism

1332
00:52:43,480 --> 00:52:45,000
that makes that routing reliable.

1333
00:52:45,000 --> 00:52:47,240
Grounding Copilot in your business context.

1334
00:52:47,240 --> 00:52:49,960
Grounding is the word we use when everything works correctly

1335
00:52:49,960 --> 00:52:51,640
and Copilot's reasoning stays aligned

1336
00:52:51,640 --> 00:52:53,640
with how your business actually operates,

1337
00:52:53,640 --> 00:52:56,280
but grounding isn't something Copilot does on its own.

1338
00:52:56,280 --> 00:52:59,560
It is something your architecture either enables or it prevents.

1339
00:52:59,560 --> 00:53:01,000
Most organizations get this wrong

1340
00:53:01,000 --> 00:53:03,800
because they assume grounding is a Copilot feature.

1341
00:53:03,800 --> 00:53:05,480
It isn't. It is a baseline requirement.

1342
00:53:05,480 --> 00:53:08,440
If grounding isn't there, your predictions will fail silently.

1343
00:53:08,440 --> 00:53:10,280
Think about what grounding actually requires

1344
00:53:10,280 --> 00:53:11,800
in a real-world scenario.

1345
00:53:11,800 --> 00:53:14,280
Copilot needs context from four specific sources.

1346
00:53:14,280 --> 00:53:15,960
The structure of your semantic model,

1347
00:53:15,960 --> 00:53:17,480
the richness of your metadata,

1348
00:53:17,480 --> 00:53:19,480
the configuration of your security rules,

1349
00:53:19,480 --> 00:53:21,160
and the instructions you've provided.

1350
00:53:21,160 --> 00:53:22,760
Each of these is a dependency.

1351
00:53:22,760 --> 00:53:24,280
If even one of them is weak,

1352
00:53:24,280 --> 00:53:26,200
the whole system becomes unreliable.

1353
00:53:26,200 --> 00:53:28,360
It starts with semantic model relationships.

1354
00:53:28,360 --> 00:53:29,560
These aren't optional extras

1355
00:53:29,560 --> 00:53:32,040
because they are foundational to how the system thinks.

1356
00:53:32,040 --> 00:53:33,800
When you define that a customer table relates

1357
00:53:33,800 --> 00:53:35,800
to an order table through a customer ID,

1358
00:53:35,800 --> 00:53:38,680
you are encoding a structural fact about your business.

1359
00:53:38,680 --> 00:53:40,840
Copilot reads that relationship and understands

1360
00:53:40,840 --> 00:53:43,160
that it can aggregate orders to the customer level.

1361
00:53:43,160 --> 00:53:44,600
The join logic is implicit.

1362
00:53:44,600 --> 00:53:48,200
Now imagine your semantic model has unclear or messy relationships.

1363
00:53:48,200 --> 00:53:50,760
Maybe a customer ID exists in both tables

1364
00:53:50,760 --> 00:53:52,600
but you haven't explicitly linked them.

1365
00:53:52,600 --> 00:53:55,240
Or perhaps you've marked a relationship as inactive

1366
00:53:55,240 --> 00:53:57,480
because you wanted reports to use a different logic

1367
00:53:57,480 --> 00:53:58,440
in certain cases.

1368
00:53:58,440 --> 00:54:01,000
Copilot sees that ambiguity and is forced to guess.

1369
00:54:01,000 --> 00:54:03,000
It might create a join that is technically valid

1370
00:54:03,000 --> 00:54:04,280
but semantically wrong.

1371
00:54:04,280 --> 00:54:06,360
A user asks for the average order value per customer

1372
00:54:06,360 --> 00:54:08,360
but Copilot routes through an unintended path

1373
00:54:08,360 --> 00:54:11,640
and returns numbers that don't match your official reports.

1374
00:54:11,640 --> 00:54:14,840
The user loses trust and your team has to waste time investigating.

1375
00:54:14,840 --> 00:54:17,240
It turns out Copilot was using an inactive relationship

1376
00:54:17,240 --> 00:54:18,920
that nobody intended it to touch.

1377
00:54:18,920 --> 00:54:20,440
The problem wasn't the AI's reasoning

1378
00:54:20,440 --> 00:54:22,440
but the ambiguity of the model itself.

1379
00:54:22,440 --> 00:54:24,520
Metadata quality multiplies this effect.

1380
00:54:24,520 --> 00:54:27,480
A measure without a description is a total mystery to an AI.

1381
00:54:27,480 --> 00:54:28,920
Copilot has to guess what it means

1382
00:54:28,920 --> 00:54:30,520
based on the name and the formula.

1383
00:54:30,520 --> 00:54:33,800
A measure called total amount could mean gross revenue,

1384
00:54:33,800 --> 00:54:37,000
net revenue after returns or something else entirely.

1385
00:54:37,000 --> 00:54:39,480
The DAX formula tells you what is being calculated

1386
00:54:39,480 --> 00:54:40,920
but it never tells you why.

1387
00:54:40,920 --> 00:54:42,680
A measure with a proper description stating

1388
00:54:42,680 --> 00:54:45,640
it is net revenue minus approved returns in USD

1389
00:54:45,640 --> 00:54:47,080
removes all that doubt.

1390
00:54:47,080 --> 00:54:49,480
Copilot reads that description and understands the scope,

1391
00:54:49,480 --> 00:54:51,640
the calculation method and the currency.

1392
00:54:51,640 --> 00:54:54,600
When aggregation levels matter, the description flags it.

1393
00:54:54,600 --> 00:54:56,680
When the measure requires specific context

1394
00:54:56,680 --> 00:54:59,080
to be meaningful, the description explains it.

1395
00:54:59,080 --> 00:55:00,840
Without this, Copilot makes assumptions

1396
00:55:00,840 --> 00:55:02,520
that usually turn out to be wrong.

1397
00:55:02,520 --> 00:55:05,720
Security misalignment creates a different kind of grounding failure.

1398
00:55:05,720 --> 00:55:08,200
The row-level security rules you've set up for reports

1399
00:55:08,200 --> 00:55:10,040
might not cover everything Copilot can do.

1400
00:55:10,040 --> 00:55:12,120
A user might not be able to see a column in a report

1401
00:55:12,120 --> 00:55:15,160
because it's hidden but they can still ask Copilot about it

1402
00:55:15,160 --> 00:55:17,400
if object-level security isn't configured.

1403
00:55:17,400 --> 00:55:19,960
The data isn't supposed to be accessible but it is.

1404
00:55:19,960 --> 00:55:22,120
This is a grounding failure because Copilot's access

1405
00:55:22,120 --> 00:55:24,440
isn't grounded in your actual security intent.

1406
00:55:24,440 --> 00:55:26,120
The gap between what you meant to restrict

1407
00:55:26,120 --> 00:55:27,560
and what is actually restricted

1408
00:55:27,560 --> 00:55:29,000
becomes a major vulnerability.

1409
00:55:29,000 --> 00:55:31,240
It's a subtle problem because everything seems to work

1410
00:55:31,240 --> 00:55:32,680
without any error messages.

1411
00:55:32,680 --> 00:55:33,960
The prediction comes back.

1412
00:55:33,960 --> 00:55:36,920
But the user sees data they should never have accessed.

1413
00:55:36,920 --> 00:55:39,320
Grounding requires that your security rules align

1414
00:55:39,320 --> 00:55:42,360
with your actual intent, not just your reporting design.

1415
00:55:42,360 --> 00:55:45,320
Instructions are the fourth and final grounding mechanism.

1416
00:55:45,320 --> 00:55:46,920
These are the guidelines you give Copilot

1417
00:55:46,920 --> 00:55:49,640
to help it interpret messy or ambiguous requests.

1418
00:55:49,640 --> 00:55:51,240
If you don't provide these instructions,

1419
00:55:51,240 --> 00:55:52,920
Copilot has to make default assumptions

1420
00:55:52,920 --> 00:55:55,080
that might not match your business logic.

1421
00:55:55,080 --> 00:55:58,440
With clear instructions, you can tell the AI to use net revenue

1422
00:55:58,440 --> 00:56:01,160
when someone asks for revenue without being specific.

1423
00:56:01,160 --> 00:56:02,680
You can tell it to only include customers

1424
00:56:02,680 --> 00:56:05,160
who have had a transaction in the last 12 months.

1425
00:56:05,160 --> 00:56:07,480
Now, the reasoning stays grounded in your business rules

1426
00:56:07,480 --> 00:56:09,320
rather than the AI's own defaults.
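Those two example instructions can be read as explicit defaults. The sketch below encodes them in plain Python purely for illustration; in practice they would be written as natural-language agent instructions. The measure name `Net Revenue`, the 365-day approximation of twelve months, and both function names are assumptions.

```python
from datetime import date, timedelta

# Hypothetical instruction: an unqualified "revenue" means the net revenue measure.
MEASURE_ALIASES = {"revenue": "Net Revenue"}
ACTIVE_WINDOW_DAYS = 365  # approximation of "the last 12 months"


def resolve_measure(term: str) -> str:
    """Map an ambiguous business term to the measure the business has agreed on."""
    return MEASURE_ALIASES.get(term.strip().lower(), term)


def active_customers(last_transaction: dict[str, date], today: date) -> list[str]:
    """Keep only customers with a transaction inside the active window."""
    cutoff = today - timedelta(days=ACTIVE_WINDOW_DAYS)
    return sorted(c for c, d in last_transaction.items() if d >= cutoff)


print(resolve_measure("Revenue"))
print(active_customers({"C001": date(2025, 1, 10), "C002": date(2023, 1, 1)},
                       today=date(2025, 6, 1)))
```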

1427
00:56:09,320 --> 00:56:12,600
The critical insight here is that grounding isn't a property of Copilot.

1428
00:56:12,600 --> 00:56:15,800
It is a property of the system that Copilot is querying.

1429
00:56:15,800 --> 00:56:18,440
If your semantic model has a weak structure, grounding fails.

1430
00:56:18,440 --> 00:56:21,240
If your metadata is sparse or your security is inconsistent,

1431
00:56:21,240 --> 00:56:21,960
grounding fails.

1432
00:56:21,960 --> 00:56:23,880
Copilot is just the interface

1433
00:56:23,880 --> 00:56:27,080
and it can only be as grounded as the infrastructure sitting underneath it.

1434
00:56:27,080 --> 00:56:30,680
This is exactly why Prep for AI certification is so important.

1435
00:56:30,680 --> 00:56:32,680
It isn't about making the AI smarter.

1436
00:56:32,680 --> 00:56:35,240
It is about ensuring the architecture beneath the AI

1437
00:56:35,240 --> 00:56:37,480
is sound enough that the reasoning stays reliable.

1438
00:56:37,480 --> 00:56:39,080
You are auditing the semantic model

1439
00:56:39,080 --> 00:56:40,680
and verifying the relationships.

1440
00:56:40,680 --> 00:56:43,880
You are confirming the metadata and testing the security.

1441
00:56:43,880 --> 00:56:45,480
You are formalizing the instructions.

1442
00:56:45,480 --> 00:56:47,560
All of that hard work ensures that when Copilot

1443
00:56:47,560 --> 00:56:50,680
reasons over your data, it stays aligned with your business reality.

1444
00:56:50,680 --> 00:56:53,080
Monitoring and auditing Copilot predictions.

1445
00:56:53,080 --> 00:56:55,080
You have the predictive infrastructure ready.

1446
00:56:55,080 --> 00:56:57,480
You connected Copilot through Fabric data agents,

1447
00:56:57,480 --> 00:56:59,480
secured the perimeter with RLS and OLS

1448
00:56:59,480 --> 00:57:00,680
and now the users are in.

1449
00:57:00,680 --> 00:57:03,080
They are asking Copilot for sales forecasts,

1450
00:57:03,080 --> 00:57:05,480
customer churn risks and revenue impacts.

1451
00:57:05,480 --> 00:57:08,680
But this creates a new problem because now you actually need to know

1452
00:57:08,680 --> 00:57:10,280
what is happening inside the black box.

1453
00:57:10,280 --> 00:57:13,880
You need to see who is asking what, which predictions are driving real business decisions,

1454
00:57:13,880 --> 00:57:16,280
and whether those predictions are even accurate.

1455
00:57:16,280 --> 00:57:19,080
Monitoring and auditing are no longer just compliance checkboxes

1456
00:57:19,080 --> 00:57:20,680
to satisfy a legal team.

1457
00:57:20,680 --> 00:57:23,480
They are operational necessities for a functioning system.

1458
00:57:23,480 --> 00:57:25,880
Every time a user asks Copilot a question,

1459
00:57:25,880 --> 00:57:28,680
that query routes directly through your semantic model.

1460
00:57:28,680 --> 00:57:31,480
You cannot afford for that query to be logged vaguely.

1461
00:57:31,480 --> 00:57:34,280
You need the specifics, including the user name,

1462
00:57:34,280 --> 00:57:35,880
the exact question they typed,

1463
00:57:35,880 --> 00:57:37,880
the specific semantic model that was hit,

1464
00:57:37,880 --> 00:57:39,080
and which measures were used.

1465
00:57:39,080 --> 00:57:41,880
You also need to see what filters were applied based on RLS

1466
00:57:41,880 --> 00:57:44,280
and exactly what result the system returned to the user.

1467
00:57:44,280 --> 00:57:47,480
This trail is your evidence that the system is working as intended

1468
00:57:47,480 --> 00:57:49,480
and without it you have no way to prove

1469
00:57:49,480 --> 00:57:51,480
why a specific prediction was made.

1470
00:57:51,480 --> 00:57:53,880
Fabric audit logs capture this data automatically.

1471
00:57:53,880 --> 00:57:56,680
When Copilot executes a query against a semantic model,

1472
00:57:56,680 --> 00:57:58,680
the query engine logs the operation

1473
00:57:58,680 --> 00:58:01,080
without you needing to add a single line of logging code.

1474
00:58:01,080 --> 00:58:03,080
You are not instrumenting your application

1475
00:58:03,080 --> 00:58:06,680
or writing custom scripts because the platform handles the heavy lifting for you.

1476
00:58:06,680 --> 00:58:08,680
Your only job is to access those logs

1477
00:58:08,680 --> 00:58:11,480
and turn that raw data into something your team can actually understand.

1478
00:58:11,480 --> 00:58:13,080
Raw audit logs are overwhelming

1479
00:58:13,080 --> 00:58:15,880
because you might see thousands of entries every single day.

1480
00:58:15,880 --> 00:58:17,880
You need visibility into high-level patterns

1481
00:58:17,880 --> 00:58:19,480
rather than individual queries,

1482
00:58:19,480 --> 00:58:21,880
so this is where you build a dedicated monitoring dashboard.

1483
00:58:21,880 --> 00:58:25,480
You pull the data from Fabric audit logs into a Power BI semantic model

1484
00:58:25,480 --> 00:58:29,880
and create measures for total queries by user and average queries per day.

1485
00:58:29,880 --> 00:58:32,680
By tracking which models and measures are accessed most frequently,

1486
00:58:32,680 --> 00:58:35,080
you can see at a glance how the system is being used.
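Before building the dashboard, the same aggregations can be prototyped in a few lines. The entries below are invented, already-parsed records with assumed field names (`user`, `day`, `measure`); real audit log schemas will differ, and in the dashboard itself these would be measures in the semantic model.

```python
from collections import Counter

# Hypothetical, already-parsed audit entries; real log schemas will differ.
LOG = [
    {"user": "ana", "day": "2026-06-01", "measure": "Predicted Churn Risk"},
    {"user": "ana", "day": "2026-06-01", "measure": "Forecasted Revenue"},
    {"user": "ben", "day": "2026-06-02", "measure": "Predicted Churn Risk"},
    {"user": "ana", "day": "2026-06-02", "measure": "Predicted Churn Risk"},
]


def total_queries_by_user(entries: list[dict]) -> Counter:
    return Counter(e["user"] for e in entries)


def average_queries_per_day(entries: list[dict]) -> float:
    per_day = Counter(e["day"] for e in entries)
    return sum(per_day.values()) / len(per_day) if per_day else 0.0


def most_used_measures(entries: list[dict], n: int = 3) -> list[tuple[str, int]]:
    return Counter(e["measure"] for e in entries).most_common(n)


print(total_queries_by_user(LOG))
print(average_queries_per_day(LOG))
print(most_used_measures(LOG, n=1))
```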

1487
00:58:35,080 --> 00:58:37,080
A sudden spike in churn prediction queries

1488
00:58:37,080 --> 00:58:40,080
might mean your retention team is responding to a market event

1489
00:58:40,080 --> 00:58:42,080
while a drop in forecast queries could mean

1490
00:58:42,080 --> 00:58:44,280
users have lost confidence in the numbers.

1491
00:58:44,280 --> 00:58:46,680
These patterns tell you if your predictive system

1492
00:58:46,680 --> 00:58:49,480
is actually creating value or just sitting idle.

1493
00:58:49,480 --> 00:58:51,480
Sensitivity is a major factor here

1494
00:58:51,480 --> 00:58:54,080
because not all predictions carry the same weight.

1495
00:58:54,080 --> 00:58:57,480
A revenue forecast that drives next year's budgeting is a high-stakes event

1496
00:58:57,480 --> 00:59:01,080
while a curiosity query from an analyst exploring a what-if scenario

1497
00:59:01,080 --> 00:59:02,680
is relatively low stakes.

1498
00:59:02,680 --> 00:59:05,080
You need to classify these predictions by risk level

1499
00:59:05,080 --> 00:59:07,080
and then configure your governance rules to match.

1500
00:59:07,080 --> 00:59:09,480
High-risk predictions that affect major business decisions

1501
00:59:09,480 --> 00:59:11,280
should go through additional controls,

1502
00:59:11,280 --> 00:59:14,480
such as requiring a manual approval or a second validation model

1503
00:59:14,480 --> 00:59:16,080
before the data is surfaced.

1504
00:59:16,080 --> 00:59:19,880
Microsoft Purview can help you classify these results by sensitivity,

1505
00:59:19,880 --> 00:59:22,080
allowing you to mark high-confidence predictions

1506
00:59:22,080 --> 00:59:24,680
as gold and experimental ones as development.

1507
00:59:24,680 --> 00:59:26,480
Copilot respects those classifications

1508
00:59:26,480 --> 00:59:28,880
when it decides what to show to a specific user.

1509
00:59:28,880 --> 00:59:31,880
Data loss prevention policies also extend into this predictive space.

1510
00:59:31,880 --> 00:59:34,880
You can set DLP rules that flag whenever someone accesses predictions

1511
00:59:34,880 --> 00:59:39,080
about sensitive topics like compensation, health status, or creditworthiness.

1512
00:59:39,080 --> 00:59:40,880
You aren't necessarily blocking the user

1513
00:59:40,880 --> 00:59:43,480
but you are creating a record and a governance event

1514
00:59:43,480 --> 00:59:44,680
for the team to review.

1515
00:59:44,680 --> 00:59:48,080
If someone queries churn predictions filtered by salary level,

1516
00:59:48,080 --> 00:59:49,880
that is a red flag that needs attention.

1517
00:59:49,880 --> 00:59:52,080
It might be a legitimate compensation analysis

1518
00:59:52,080 --> 00:59:54,480
or it might be an inappropriate data correlation

1519
00:59:54,480 --> 00:59:58,280
but the monitoring system ensures your governance team can review the intent

1520
00:59:58,280 --> 01:00:00,280
and decide if it violates policy.

1521
01:00:00,280 --> 01:00:02,880
This continuous monitoring approach is fundamentally different

1522
01:00:02,880 --> 01:00:04,680
from a traditional compliance audit.

1523
01:00:04,680 --> 01:00:06,880
An audit usually happens once a year

1524
01:00:06,880 --> 01:00:10,880
where an auditor reviews old data and files a report that nobody reads.

1525
01:00:10,880 --> 01:00:14,480
Continuous monitoring happens in real time as the queries are actually occurring.

1526
01:00:14,480 --> 01:00:16,280
You see the patterns as they emerge,

1527
01:00:16,280 --> 01:00:19,280
which allows you to detect problems while they are still small.

1528
01:00:19,280 --> 01:00:21,680
You can respond to misuse the moment it happens

1529
01:00:21,680 --> 01:00:25,280
rather than waiting for a post-mortem review months after the damage is done.

1530
01:00:25,280 --> 01:00:28,080
The second piece of the puzzle is accuracy monitoring.

1531
01:00:28,080 --> 01:00:30,880
Your predictive models might have looked great on test data

1532
01:00:30,880 --> 01:00:33,480
but you need to know how they are performing in the real world.

1533
01:00:33,480 --> 01:00:36,080
You have to ask if the churn predictions are actually correlating

1534
01:00:36,080 --> 01:00:38,480
with the customers who leave the company later on.

1535
01:00:38,480 --> 01:00:40,480
To do this, you create feedback loops

1536
01:00:40,480 --> 01:00:44,080
where you log the actual outcome once a predicted event occurs.

1537
01:00:44,080 --> 01:00:46,480
By comparing predicted values to actual results,

1538
01:00:46,480 --> 01:00:48,080
you can calculate model drift,

1539
01:00:48,080 --> 01:00:50,080
which is the natural degradation in accuracy

1540
01:00:50,080 --> 01:00:52,080
as data patterns change over time.

1541
01:00:52,080 --> 01:00:54,080
When that drift hits a certain threshold,

1542
01:00:54,080 --> 01:00:55,680
you trigger a retraining process

1543
01:00:55,680 --> 01:00:58,280
so the model stays fresh and the predictions stay sharp.
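A minimal version of that feedback loop is sketched below. It measures drift as the drop in accuracy relative to the accuracy seen at training time, which is only one of several possible drift measures; the 0.5 cutoff, the 0.10 threshold, and the function names are assumptions for illustration.

```python
def accuracy(predicted_probs: list[float], actual: list[int], cutoff: float = 0.5) -> float:
    """Share of cases where the thresholded prediction matched the logged outcome."""
    if len(predicted_probs) != len(actual) or not actual:
        raise ValueError("predictions and outcomes must be non-empty and the same length")
    hits = sum((p >= cutoff) == bool(a) for p, a in zip(predicted_probs, actual))
    return hits / len(actual)


def needs_retraining(baseline_accuracy: float, current_accuracy: float,
                     max_drop: float = 0.10) -> bool:
    """Flag the model once accuracy has fallen more than max_drop below the baseline."""
    return (baseline_accuracy - current_accuracy) > max_drop


# Churn probabilities predicted last quarter, and whether each customer actually left.
predicted = [0.9, 0.2, 0.7, 0.4]
left = [1, 0, 0, 0]

current = accuracy(predicted, left)
print(current, needs_retraining(baseline_accuracy=0.90, current_accuracy=current))
```

Running a check like this on a schedule, and retraining only when the threshold is crossed, keeps the retraining decision tied to observed outcomes rather than to the calendar alone.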

1544
01:00:58,280 --> 01:01:00,880
This is not a one-time setup that you can just forget about.

1545
01:01:00,880 --> 01:01:04,280
It is a continuous cycle where you run accuracy checks every month

1546
01:01:04,280 --> 01:01:06,080
and retrain your models every quarter.

1547
01:01:06,080 --> 01:01:08,680
You are constantly monitoring real-time query patterns

1548
01:01:08,680 --> 01:01:11,480
and updating your security rules as the business evolves.

1549
01:01:11,480 --> 01:01:14,480
The monitoring system becomes the essential feedback mechanism

1550
01:01:14,480 --> 01:01:17,480
that keeps your predictive capabilities trustworthy for the long haul.

1551
01:01:17,480 --> 01:01:20,480
Governance is not something that happens after you deploy the system.

1552
01:01:20,480 --> 01:01:23,480
It is something that runs through the entire life of the project.

1553
01:01:23,480 --> 01:01:24,680
Common mistakes

1554
01:01:24,680 --> 01:01:26,680
when bridging Copilot and Power BI.

1555
01:01:26,680 --> 01:01:29,080
Let's look at what actually happens in the real world

1556
01:01:29,080 --> 01:01:31,080
when organizations try to implement this.

1557
01:01:31,080 --> 01:01:33,080
We aren't talking about the theory in a white paper

1558
01:01:33,080 --> 01:01:35,480
but the actual practice of building these systems.

1559
01:01:35,480 --> 01:01:37,480
You build your models, you connect Copilot,

1560
01:01:37,480 --> 01:01:39,080
and then things start to break.

1561
01:01:39,080 --> 01:01:42,480
When they do, there is almost always a specific pattern to the failure

1562
01:01:42,480 --> 01:01:43,680
that you could have avoided.

1563
01:01:43,680 --> 01:01:47,680
The first big mistake is assuming that raw data is enough for an LLM to understand.

1564
01:01:47,680 --> 01:01:51,080
A team will often point Copilot at a lakehouse full of raw tables

1565
01:01:51,080 --> 01:01:53,680
and expect it to start making predictions immediately.

1566
01:01:53,680 --> 01:01:56,280
They assume the AI will just know what each table represents

1567
01:01:56,280 --> 01:01:58,080
and how they relate to each other, but it won't.

1568
01:01:58,080 --> 01:02:02,080
Copilot sees columns with cryptic names like CustDD or TransDD

1569
01:02:02,080 --> 01:02:05,080
and has no idea what they actually mean in a business context.

1570
01:02:05,080 --> 01:02:07,680
Without a defined schema or calculated measures,

1571
01:02:07,680 --> 01:02:12,280
Copilot will generate text that sounds plausible even though the numbers are completely wrong.

1572
01:02:12,280 --> 01:02:13,880
The user ends up blaming the AI

1573
01:02:13,880 --> 01:02:15,680
but the real problem is the architecture

1574
01:02:15,680 --> 01:02:18,680
because they expected reasoning without providing any structure.

1575
01:02:18,680 --> 01:02:22,680
The second mistake usually happens inside the data science team.

1576
01:02:22,680 --> 01:02:26,880
A developer builds a predictive model in a notebook using raw tables from the data lake

1577
01:02:26,880 --> 01:02:30,080
and the model trains and generates predictions perfectly.

1578
01:02:30,080 --> 01:02:32,480
They surface those results to the business users

1579
01:02:32,480 --> 01:02:36,480
but they realize too late that the model used a different definition of revenue

1580
01:02:36,480 --> 01:02:38,280
than the official company reports.

1581
01:02:38,280 --> 01:02:41,880
Your BI team might have a measure that excludes specific transaction types,

1582
01:02:41,880 --> 01:02:43,680
but the data scientists didn't know that.

1583
01:02:43,680 --> 01:02:46,680
The predictions are mathematically correct but semantically useless

1584
01:02:46,680 --> 01:02:50,680
because they are predicting against a definition that nobody in the company actually uses.

1585
01:02:50,680 --> 01:02:54,880
You can fix this by building the model on top of semantic model measures using semantic link

1586
01:02:54,880 --> 01:02:58,880
but teams often skip this step to move faster and pay for it later in confusion.

1587
01:02:58,880 --> 01:03:02,480
The third mistake is treating metadata as optional documentation

1588
01:03:02,480 --> 01:03:04,280
instead of critical infrastructure.

1589
01:03:04,280 --> 01:03:07,080
A team might create a semantic model with clear table names

1590
01:03:07,080 --> 01:03:09,080
but they leave out the descriptions and synonyms.

1591
01:03:09,080 --> 01:03:11,080
They think Copilot will just figure it out from the names,

1592
01:03:11,080 --> 01:03:13,080
but that is a dangerous assumption.

1593
01:03:13,080 --> 01:03:15,080
If Copilot sees a measure simply named "Total",

1594
01:03:15,080 --> 01:03:19,280
it has to guess if that means total revenue, total cost, or total transaction count.

1595
01:03:19,280 --> 01:03:22,480
That generic name creates ambiguity that leads to hallucinations.

1596
01:03:22,480 --> 01:03:24,480
Every table needs a business definition

1597
01:03:24,480 --> 01:03:27,280
and every measure needs an explanation of its scope.

1598
01:03:27,280 --> 01:03:30,080
This isn't just bureaucracy, it is the operational infrastructure

1599
01:03:30,080 --> 01:03:32,080
that makes your predictions reliable.

1600
01:03:32,080 --> 01:03:34,880
The fourth mistake is what I call RLS configuration theater.

1601
01:03:34,880 --> 01:03:37,880
A team sets up row-level security for their reports

1602
01:03:37,880 --> 01:03:41,880
and verifies that a salesperson can only see their own region on the dashboard.

1603
01:03:41,880 --> 01:03:44,880
They assume the security is handled but they never actually test what happens

1604
01:03:44,880 --> 01:03:47,280
when Copilot queries that same model directly.

1605
01:03:47,280 --> 01:03:51,280
It often turns out that the RLS rules are too loose for an open-ended query interface.

1606
01:03:51,280 --> 01:03:54,480
A user might ask Copilot for a comparison across regions

1607
01:03:54,480 --> 01:03:58,880
and while the filter applies, it might not work the way you expected in a conversational context.

1608
01:03:58,880 --> 01:04:02,280
You have to test actual Copilot queries against your RLS rules

1609
01:04:02,280 --> 01:04:05,280
to ensure the security model covers every possible use case.

1610
01:04:05,280 --> 01:04:09,280
The fifth mistake is exposing sensitive columns without using object-level security.

1611
01:04:09,280 --> 01:04:11,880
A model might contain salary information that is necessary

1612
01:04:11,880 --> 01:04:14,080
for calculating certain high-level metrics.

1613
01:04:14,080 --> 01:04:17,880
The team uses RLS to restrict which employees each person can see

1614
01:04:17,880 --> 01:04:19,680
and they think the data is safe.

1615
01:04:19,680 --> 01:04:22,680
However, because they didn't use OLS to hide the salary column itself,

1616
01:04:22,680 --> 01:04:25,680
a user can ask Copilot a seemingly innocent question

1617
01:04:25,680 --> 01:04:28,280
that requires the AI to look at that hidden data.

1618
01:04:28,280 --> 01:04:32,080
Copilot then constructs an answer that gives away compensation insights

1619
01:04:32,080 --> 01:04:33,680
the user was never supposed to have.

1620
01:04:33,680 --> 01:04:36,280
Hiding a field in a report UI is not real security,

1621
01:04:36,280 --> 01:04:40,880
but OLS is, because it removes the column from the model entirely for unauthorized users.
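The difference between row-level and object-level security can be sketched with a toy table. This is not how Power BI implements either feature; the employee rows and the helper functions below are hypothetical and only illustrate why filtering rows does not protect a column.

```python
# Toy employee table; names, regions, and salaries are invented.
EMPLOYEES = [
    {"name": "Ana", "region": "North", "salary": 90000},
    {"name": "Ben", "region": "North", "salary": 70000},
    {"name": "Cy", "region": "South", "salary": 80000},
]


def apply_rls(rows, allowed_regions):
    """Row-level security: keep only the rows the user may see."""
    return [r for r in rows if r["region"] in allowed_regions]


def apply_ols(rows, hidden_columns):
    """Object-level security: drop restricted columns from every row."""
    return [{k: v for k, v in r.items() if k not in hidden_columns} for r in rows]


def average_salary(rows):
    """An 'innocent' aggregate that fails if the salary column is not available."""
    if not rows or any("salary" not in r for r in rows):
        raise KeyError("salary is not available to this user")
    return sum(r["salary"] for r in rows) / len(rows)


# RLS only: the user sees fewer rows, but salary is still queryable.
rls_only = apply_rls(EMPLOYEES, {"North"})
print(average_salary(rls_only))

# RLS plus OLS: the column does not exist for this user, so the query fails.
rls_and_ols = apply_ols(rls_only, {"salary"})
try:
    average_salary(rls_and_ols)
except KeyError as err:
    print("blocked:", err)
```

With row filtering alone, the aggregate still returns a compensation figure for the rows the user is allowed to see. Only removing the column stops the question from being answerable at all.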

1622
01:04:40,880 --> 01:04:44,280
The sixth mistake is releasing these predictive models without any versioning.

1623
01:04:44,280 --> 01:04:46,880
A model gets built and added to the semantic model

1624
01:04:46,880 --> 01:04:49,280
and users start relying on it for their daily work.

1625
01:04:49,280 --> 01:04:52,480
Then the data science team retrains the model with new data

1626
01:04:52,480 --> 01:04:54,080
and the performance changes.

1627
01:04:54,080 --> 01:04:55,880
Suddenly users are getting different predictions

1628
01:04:55,880 --> 01:04:59,080
and they have no idea why because there is no version tracking or change log.

1629
01:04:59,080 --> 01:05:02,880
You should version your predictive models exactly like you would version a piece of software.

1630
01:05:02,880 --> 01:05:04,080
Track what changed.

1631
01:05:04,080 --> 01:05:06,080
Test the new version before you release it

1632
01:05:06,080 --> 01:05:10,080
and make it clear to your users that the models can and will evolve over time.
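As a rough illustration of what "version it like software" could mean, here is a small in-memory registry. The class names, version numbers, and dates are hypothetical. A real setup would more likely rely on an experiment-tracking or model-registry tool than on hand-written classes.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class ModelVersion:
    version: str        # e.g. "1.1.0"
    released: date
    changes: str        # human-readable change log entry
    validated: bool     # True once the version passed pre-release tests


@dataclass
class ModelRegistry:
    name: str
    versions: list = field(default_factory=list)

    def release(self, entry: ModelVersion) -> None:
        """Refuse to release a version that has not been validated."""
        if not entry.validated:
            raise ValueError(f"{self.name} {entry.version} has not been tested")
        self.versions.append(entry)

    def current(self) -> ModelVersion:
        """The most recently released version."""
        return self.versions[-1]

    def changelog(self) -> list:
        """One line per release, oldest first."""
        return [f"{v.version} ({v.released.isoformat()}): {v.changes}" for v in self.versions]


registry = ModelRegistry("churn_model")
registry.release(ModelVersion("1.0.0", date(2026, 1, 15), "Initial release", True))
registry.release(ModelVersion("1.1.0", date(2026, 4, 1), "Retrained on Q1 data", True))
print(registry.current().version)
print("\n".join(registry.changelog()))
```

When predictions shift after a retrain, the change log gives users and stewards a place to look for the reason.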

1633
01:05:10,080 --> 01:05:12,480
These mistakes are not just theoretical possibilities.

1634
01:05:12,480 --> 01:05:16,280
They are the common pitfalls that happen when teams try to build Copilot systems

1635
01:05:16,280 --> 01:05:18,680
without respecting the underlying architecture.

1636
01:05:18,680 --> 01:05:21,680
If you want your system to work, you have to move past the hype

1637
01:05:21,680 --> 01:05:25,880
and focus on the structural requirements that keep the data accurate and secure.

1638
01:05:25,880 --> 01:05:29,080
The 2027 governance fabric: policy as code.

1639
01:05:29,080 --> 01:05:32,080
We need to stop building for today and start looking at where this is heading

1640
01:05:32,080 --> 01:05:34,080
because the shift is going to be massive.

1641
01:05:34,080 --> 01:05:38,280
Right now governance is a manual grind involving spreadsheets, approval chains

1642
01:05:38,280 --> 01:05:41,880
and static configurations that usually just sit there gathering dust.

1643
01:05:41,880 --> 01:05:44,680
By 2027, that entire model is going to disappear

1644
01:05:44,680 --> 01:05:46,880
as governance becomes automated, embedded,

1645
01:05:46,880 --> 01:05:50,680
and treated as a core operational power rather than a compliance headache.

1646
01:05:50,680 --> 01:05:53,880
The reality today is that governance happens after the work is done

1647
01:05:53,880 --> 01:05:57,080
meaning you build a semantic model and then you have to go back to document it,

1648
01:05:57,080 --> 01:05:59,280
set up security and wait for certification.

1649
01:05:59,280 --> 01:06:02,880
It is a slow sequence of checkpoints and friction that holds everything up.

1650
01:06:02,880 --> 01:06:05,480
By 2027, this process flips on its head

1651
01:06:05,480 --> 01:06:07,880
so that governance actually precedes creation.

1652
01:06:07,880 --> 01:06:12,080
When you start defining a new model, the rules will apply before you even finish building it.

1653
01:06:12,080 --> 01:06:14,880
If you create a new measure, the system will immediately check

1654
01:06:14,880 --> 01:06:17,680
if the name follows standards, if the description is there

1655
01:06:17,680 --> 01:06:19,480
and if the classification is correct.

1656
01:06:19,480 --> 01:06:22,280
If the measure fails those checks, you simply cannot move forward.

1657
01:06:22,280 --> 01:06:24,680
Governance stops being a gate at the end of the road

1658
01:06:24,680 --> 01:06:26,680
and becomes part of the engine itself.

1659
01:06:26,680 --> 01:06:29,680
This is the moment where Semantic Link turns into your control plane

1660
01:06:29,680 --> 01:06:31,280
instead of just a data access layer.

1661
01:06:31,280 --> 01:06:33,880
It becomes the governance layer that knows your policies

1662
01:06:33,880 --> 01:06:36,280
and enforces them every single second.

1663
01:06:36,280 --> 01:06:37,880
You define a rule once,

1664
01:06:37,880 --> 01:06:39,680
like requiring descriptions for all measures

1665
01:06:39,680 --> 01:06:41,480
or classifying sensitive data

1666
01:06:41,480 --> 01:06:44,080
and Semantic Link acts as the enforcement engine.

1667
01:06:44,080 --> 01:06:46,880
If a developer tries to publish a model that is missing descriptions,

1668
01:06:46,880 --> 01:06:48,080
the system blocks it.

1669
01:06:48,080 --> 01:06:50,080
But it won't just throw a cold error message.

1670
01:06:50,080 --> 01:06:51,880
It will offer guidance by showing a template

1671
01:06:51,880 --> 01:06:54,280
and giving examples of what a good description looks like.
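A policy-as-code gate of this kind might look something like the sketch below. It is plain Python over dictionaries, not a Semantic Link feature. The naming rule, the classification values, and the description template are assumptions made up for the example.

```python
import re

# Assumed standards; a real organization would define its own.
GENERIC_NAMES = {"total", "value", "amount", "count"}
ALLOWED_CLASSIFICATIONS = {"public", "internal", "confidential"}
DESCRIPTION_TEMPLATE = "<What it measures>, <what it includes or excludes>, <unit>."


def check_measure(measure: dict) -> list:
    """Return a list of policy violations, each with guidance on how to fix it."""
    problems = []
    name = measure.get("name", "").strip()
    if not name or name.lower() in GENERIC_NAMES:
        problems.append(f"Name '{name}' is too generic; say what is being measured.")
    elif not re.fullmatch(r"[A-Z][A-Za-z0-9 ]+", name):
        problems.append(f"Name '{name}' should be plain words starting with a capital letter.")
    if not measure.get("description", "").strip():
        problems.append(f"Description is missing; template: {DESCRIPTION_TEMPLATE}")
    if measure.get("classification") not in ALLOWED_CLASSIFICATIONS:
        problems.append("Classification must be one of: " + ", ".join(sorted(ALLOWED_CLASSIFICATIONS)))
    return problems


def can_publish(measures: list) -> bool:
    """The gate: publishing is allowed only when no measure has violations."""
    return all(not check_measure(m) for m in measures)


draft = [
    {"name": "Total", "description": "", "classification": "internal"},
    {"name": "Net Revenue", "description": "Sales net of refunds, excluding intercompany, in EUR.",
     "classification": "internal"},
]
for m in draft:
    for problem in check_measure(m):
        print(f"{m['name']}: {problem}")
print("publish allowed:", can_publish(draft))
```

Each violation carries its own guidance, so the developer sees what a passing measure would look like rather than a bare rejection.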

1672
01:06:54,280 --> 01:06:55,880
Security follows the same logic.

1673
01:06:55,880 --> 01:06:57,680
So when you add a user to a role,

1674
01:06:57,680 --> 01:07:00,080
your RLS rules will propagate automatically.

1675
01:07:00,080 --> 01:07:02,480
You won't have to manually assign that person to every single model

1676
01:07:02,480 --> 01:07:04,480
because the system will recognize the group change

1677
01:07:04,480 --> 01:07:06,280
and understand exactly what they should see.

1678
01:07:06,280 --> 01:07:08,480
It looks at the policies and classifications

1679
01:07:08,480 --> 01:07:10,280
then applies the correct RLS rules

1680
01:07:10,280 --> 01:07:12,680
across every relevant model in a matter of seconds.

1681
01:07:12,680 --> 01:07:15,480
A person goes from having zero access to being fully configured

1682
01:07:15,480 --> 01:07:18,280
without any manual intervention or forgotten models.

1683
01:07:18,280 --> 01:07:20,880
Consistency is finally enforced by design.

1684
01:07:20,880 --> 01:07:23,280
When you decide to expose a semantic model to Copilot,

1685
01:07:23,280 --> 01:07:24,880
the security checks will run on their own

1686
01:07:24,880 --> 01:07:26,480
to ensure everything is ready.

1687
01:07:26,480 --> 01:07:28,680
The system asks if the model is prepped for AI,

1688
01:07:28,680 --> 01:07:30,680
if the RLS and OLS are tight

1689
01:07:30,680 --> 01:07:33,280
and if the instructions match the data classification.

1690
01:07:33,280 --> 01:07:34,480
These are not optional steps

1691
01:07:34,480 --> 01:07:36,280
that a busy developer might skip over.

1692
01:07:36,280 --> 01:07:37,880
They are mandatory gates that prevent a model

1693
01:07:37,880 --> 01:07:40,280
from reaching Copilot until it passes every test.

1694
01:07:40,280 --> 01:07:42,680
The pieces of this technology are already here.

1695
01:07:42,680 --> 01:07:44,680
And tools like Semantic Link Labs

1696
01:07:44,680 --> 01:07:48,080
are already automating governance at scale for early adopters.

1697
01:07:48,080 --> 01:07:51,080
By 2027, this will be the mainstream way of working

1698
01:07:51,080 --> 01:07:54,080
and will be just as essential to Fabric as the data warehouse itself.

1699
01:07:54,080 --> 01:07:56,080
You won't even think of it as a separate task

1700
01:07:56,080 --> 01:07:58,480
because it will just be how the platform functions.

1701
01:07:58,480 --> 01:08:00,880
This shift changes everything for your governance team

1702
01:08:00,880 --> 01:08:02,880
because they can finally stop doing manual reviews

1703
01:08:02,880 --> 01:08:03,880
and start writing code.

1704
01:08:03,880 --> 01:08:05,480
They design the rules once, publish them

1705
01:08:05,480 --> 01:08:07,480
and the system handles the enforcement everywhere

1706
01:08:07,480 --> 01:08:08,280
and all the time.

1707
01:08:08,280 --> 01:08:10,080
The team moves from being a bottleneck

1708
01:08:10,080 --> 01:08:12,880
in the approval process to being the architects of the system.

1709
01:08:12,880 --> 01:08:14,680
Instead of looking at models one by one,

1710
01:08:14,680 --> 01:08:17,280
they are building logic that governs hundreds of models

1711
01:08:17,280 --> 01:08:18,280
simultaneously.

1712
01:08:18,280 --> 01:08:20,280
It also makes governance visible and measurable

1713
01:08:20,280 --> 01:08:21,880
in a way that actually makes sense.

1714
01:08:21,880 --> 01:08:24,580
Most compliance reports today are backward-looking documents

1715
01:08:24,580 --> 01:08:26,180
that show you what happened last month.

1716
01:08:26,180 --> 01:08:29,780
But by 2027, your dashboards will show compliance in real time.

1717
01:08:29,780 --> 01:08:31,880
You will know instantly if every model meets the standard

1718
01:08:31,880 --> 01:08:33,980
and if every user has the right access.

1719
01:08:33,980 --> 01:08:35,980
If an OLS rule drifts or changes,

1720
01:08:35,980 --> 01:08:37,680
the system detects it immediately.

1721
01:08:37,680 --> 01:08:39,480
If that change violates a policy,

1722
01:08:39,480 --> 01:08:41,480
you get an alert and if it was intentional,

1723
01:08:41,480 --> 01:08:43,280
you document it right then and there.

1724
01:08:43,280 --> 01:08:46,080
The mindset shift here is just as big as the technical one.

1725
01:08:46,080 --> 01:08:49,080
Today, builders see governance as something forced on them,

1726
01:08:49,080 --> 01:08:50,080
but in the near future,

1727
01:08:50,080 --> 01:08:52,480
it will be a tool they actually use to move faster.

1728
01:08:52,480 --> 01:08:54,680
It removes the guessing game about what is required

1729
01:08:54,680 --> 01:08:57,080
and makes it nearly impossible to accidentally break the rules.

1730
01:08:57,080 --> 01:08:59,880
The governance fabric is not a wall designed to slow you down.

1731
01:08:59,880 --> 01:09:01,280
It is a set of guardrails

1732
01:09:01,280 --> 01:09:04,280
that lets you run at full speed while staying completely safe.

1733
01:09:04,280 --> 01:09:05,680
This is the path we are on,

1734
01:09:05,680 --> 01:09:07,880
so you should start thinking about it right now.

1735
01:09:07,880 --> 01:09:09,280
When you build models today,

1736
01:09:09,280 --> 01:09:12,080
design them to be ready for this automated future

1737
01:09:12,080 --> 01:09:14,880
by using clear names and writing solid descriptions.

1738
01:09:14,880 --> 01:09:16,880
The organizations that get this right early

1739
01:09:16,880 --> 01:09:19,680
will be able to scale their AI capabilities much faster

1740
01:09:19,680 --> 01:09:22,880
than the ones still stuck doing manual paperwork in 2027.

1741
01:09:22,880 --> 01:09:24,880
Starting your implementation: first steps.

1742
01:09:24,880 --> 01:09:26,880
You have seen the full architecture now

1743
01:09:26,880 --> 01:09:28,680
from semantic models as meaning layers

1744
01:09:28,680 --> 01:09:31,080
to Semantic Link acting as the bridge for prediction.

1745
01:09:31,080 --> 01:09:33,680
We have covered how data agents talk to Copilot

1746
01:09:33,680 --> 01:09:36,080
and why monitoring is an operational must-have.

1747
01:09:36,080 --> 01:09:37,680
Now we have to get practical and figure out

1748
01:09:37,680 --> 01:09:40,280
where you actually start building this in your own organization.

1749
01:09:40,280 --> 01:09:41,880
The first step is an inventory audit

1750
01:09:41,880 --> 01:09:43,880
to see exactly what you are working with.

1751
01:09:43,880 --> 01:09:45,080
Open up your Fabric workspace

1752
01:09:45,080 --> 01:09:46,880
or your Power BI Premium capacity

1753
01:09:46,880 --> 01:09:49,480
and make a list of every single semantic model you own.

1754
01:09:49,480 --> 01:09:50,680
For every model on that list,

1755
01:09:50,680 --> 01:09:52,080
you need to check the metadata

1756
01:09:52,080 --> 01:09:54,280
to see if the tables have business descriptions

1757
01:09:54,280 --> 01:09:56,680
and if the measures are clearly defined.

1758
01:09:56,680 --> 01:09:59,280
Most companies find that about 40% of their models

1759
01:09:59,280 --> 01:10:00,080
are in good shape,

1760
01:10:00,080 --> 01:10:02,880
while the other 60% are missing basic info.

1761
01:10:02,880 --> 01:10:04,480
That is not a failure on your part,

1762
01:10:04,480 --> 01:10:06,480
but it is a baseline that tells you exactly

1763
01:10:06,480 --> 01:10:07,680
where you are starting from.

1764
01:10:07,680 --> 01:10:10,080
Use a simple color code where green is complete,

1765
01:10:10,080 --> 01:10:12,280
yellow is partial and red is missing.
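The color code can be computed rather than assigned by hand. The sketch below assumes the inventory has already been exported into plain dictionaries. The model names and descriptions are invented, and the thresholds (everything described, nothing described, anything in between) are one possible reading of green, red, and yellow.

```python
def metadata_coverage(items: list) -> float:
    """Share of tables or measures that have a non-empty description."""
    if not items:
        return 0.0
    documented = sum(1 for i in items if (i.get("description") or "").strip())
    return documented / len(items)


def rate_model(model: dict) -> str:
    """Green = everything described, red = nothing described, yellow = in between."""
    items = model["tables"] + model["measures"]
    coverage = metadata_coverage(items)
    if coverage == 1.0:
        return "green"
    if coverage == 0.0:
        return "red"
    return "yellow"


# Hypothetical inventory; in practice this would be exported from your workspace.
inventory = [
    {"name": "Sales", "tables": [{"description": "One row per order line"}],
     "measures": [{"description": "Net revenue in EUR"}]},
    {"name": "HR", "tables": [{"description": "One row per employee"}],
     "measures": [{"description": ""}]},
    {"name": "Legacy Ops", "tables": [{"description": None}], "measures": []},
]
for model in inventory:
    print(model["name"], rate_model(model))
```

Sorting the result by owner or workspace makes the team-level patterns visible.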

1766
01:10:12,280 --> 01:10:13,280
Once you see the patterns,

1767
01:10:13,280 --> 01:10:14,680
you will know which teams are doing well

1768
01:10:14,680 --> 01:10:17,080
and which legacy models need the most work.

1769
01:10:17,080 --> 01:10:19,080
While you are digging through the metadata,

1770
01:10:19,080 --> 01:10:20,680
you need to go talk to your users

1771
01:10:20,680 --> 01:10:23,080
to find out what they are actually asking Copilot.

1772
01:10:23,080 --> 01:10:24,480
Don't guess what they want,

1773
01:10:24,480 --> 01:10:27,680
but instead have real conversations with your power users

1774
01:10:27,680 --> 01:10:29,180
to see where they are struggling.

1775
01:10:29,180 --> 01:10:31,480
Ask them what questions they wish they could ask

1776
01:10:31,480 --> 01:10:33,280
but can't get an answer to right now.

1777
01:10:33,280 --> 01:10:35,880
You will start to hear the same things over and over.

1778
01:10:35,880 --> 01:10:38,080
Sales teams usually want better forecasts,

1779
01:10:38,080 --> 01:10:40,180
finance wants to understand variances

1780
01:10:40,180 --> 01:10:42,580
and operations is looking for anomaly detection.

1781
01:10:42,580 --> 01:10:44,180
Your job is to find the overlap

1782
01:10:44,180 --> 01:10:45,580
between what the business needs

1783
01:10:45,580 --> 01:10:47,280
and what your data can actually support.

1784
01:10:47,280 --> 01:10:49,380
Once the audit is done and you have your use cases,

1785
01:10:49,380 --> 01:10:51,080
pick one single target to start with.

1786
01:10:51,080 --> 01:10:53,180
Do not try to fix the entire company at once,

1787
01:10:53,180 --> 01:10:55,680
but instead choose one prediction domain

1788
01:10:55,680 --> 01:10:57,980
like sales forecasting or customer churn.

1789
01:10:57,980 --> 01:10:59,680
Pick something that solves a real problem

1790
01:10:59,680 --> 01:11:00,880
but is small enough for your team

1791
01:11:00,880 --> 01:11:02,380
to build in a reasonable amount of time.

1792
01:11:02,380 --> 01:11:05,380
It is much better to build on an existing semantic model

1793
01:11:05,380 --> 01:11:08,480
than to try and start from absolute zero for your first project.

1794
01:11:08,480 --> 01:11:10,780
Now it is time to build that first predictive model.

1795
01:11:10,780 --> 01:11:12,480
Open a Fabric notebook and use SemPy

1796
01:11:12,480 --> 01:11:14,680
to pull the measures directly from your semantic model

1797
01:11:14,680 --> 01:11:16,880
so you aren't redefining things like revenue.
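In a Fabric notebook, pulling governed measures could look roughly like this. The sketch uses `fabric.evaluate_measure` from the `sempy.fabric` module. The model name, measure names, and column names are hypothetical placeholders, and the SemPy import is deferred so the helper only needs the package when it is actually called inside Fabric.

```python
def column_ref(table: str, column: str) -> str:
    """Build a column reference in the Table[Column] form."""
    return f"{table}[{column}]"


def load_training_frame(dataset: str, measures: list, group_by: list):
    """Pull governed measures from a semantic model through Semantic Link.

    Runs only inside a Microsoft Fabric notebook where the semantic-link
    package (sempy) is available; the import is deferred for that reason.
    """
    import sempy.fabric as fabric

    return fabric.evaluate_measure(
        dataset=dataset,
        measure=measures,
        groupby_columns=group_by,
    )


# Hypothetical names: replace with a model, measures, and columns you own.
GROUP_BY = [column_ref("Customer", "CustomerKey"), column_ref("Customer", "Segment")]
# features = load_training_frame("Sales Model", ["Total Revenue", "Order Count"], GROUP_BY)
print(GROUP_BY)
```

Because the numbers come from the measures themselves, the training data inherits whatever exclusions and filters the BI team already encoded.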

1798
01:11:16,880 --> 01:11:19,380
You use those existing measures as your foundation

1799
01:11:19,380 --> 01:11:20,680
and then build out your features

1800
01:11:20,680 --> 01:11:23,180
like customer segments or transaction history.

1801
01:11:23,180 --> 01:11:25,280
You gather your training data, run the model

1802
01:11:25,280 --> 01:11:26,580
and validate the results.

1803
01:11:26,580 --> 01:11:28,280
Once you are sure the numbers are right,

1804
01:11:28,280 --> 01:11:30,480
you run the prediction for your entire population

1805
01:11:30,480 --> 01:11:33,080
whether that is every customer or every open sales lead.

1806
01:11:33,080 --> 01:11:34,380
You need to store those predictions

1807
01:11:34,380 --> 01:11:35,680
back in the lakehouse in a table

1808
01:11:35,680 --> 01:11:37,980
with a very clear name like churn predictions.

1809
01:11:37,980 --> 01:11:40,880
Make sure you include the primary key, the prediction itself

1810
01:11:40,880 --> 01:11:41,980
and a confidence score

1811
01:11:41,980 --> 01:11:43,980
so people know how much to trust the data.

1812
01:11:43,980 --> 01:11:45,580
This table becomes the official source

1813
01:11:45,580 --> 01:11:46,780
for your new predictive measures.
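The shape of that table can be sketched as a small record type. The field names and the scores below are made up, and defining confidence as the probability of the predicted class is an assumption of this example, not something the episode specifies.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ChurnPrediction:
    customer_id: str      # primary key that joins back to the customer table
    will_churn: bool      # the prediction itself
    confidence: float     # probability of the predicted class, between 0 and 1


def to_prediction(customer_id: str, churn_probability: float, threshold: float = 0.5) -> ChurnPrediction:
    """Turn a model's churn probability into a row for the predictions table."""
    if not 0.0 <= churn_probability <= 1.0:
        raise ValueError("churn_probability must be between 0 and 1")
    will_churn = churn_probability >= threshold
    confidence = churn_probability if will_churn else 1.0 - churn_probability
    return ChurnPrediction(customer_id, will_churn, round(confidence, 4))


# Made-up scores standing in for real model output.
scores = {"C001": 0.91, "C002": 0.12, "C003": 0.55}
rows = [asdict(to_prediction(cid, p)) for cid, p in scores.items()]
for row in rows:
    print(row)
```

Rows in this shape can then be written to the lakehouse table with whatever writer the notebook already uses.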

1814
01:11:46,780 --> 01:11:49,780
Now you can wire that data back into your semantic model,

1815
01:11:49,780 --> 01:11:51,780
create a new measure that reads from the prediction table

1816
01:11:51,780 --> 01:11:53,180
like average churn risk

1817
01:11:53,180 --> 01:11:55,580
and suddenly it is available for everyone to use.

1818
01:11:55,580 --> 01:11:56,880
Copilot can see it

1819
01:11:56,880 --> 01:11:58,380
and your data agents can query it

1820
01:11:58,380 --> 01:12:00,580
because it is now a native part of your semantic layer.

1821
01:12:00,580 --> 01:12:02,080
Before you turn it on for everyone,

1822
01:12:02,080 --> 01:12:04,180
mark the model as prepped for AI

1823
01:12:04,180 --> 01:12:06,280
and go through your certification checklist.

1824
01:12:06,280 --> 01:12:08,480
Double check that the naming is crystal clear

1825
01:12:08,480 --> 01:12:10,780
and that your security settings are locked down tight.

1826
01:12:10,780 --> 01:12:12,780
Write the AI instructions that tell Copilot

1827
01:12:12,780 --> 01:12:14,280
how to interpret these new numbers

1828
01:12:14,280 --> 01:12:17,280
and run some test queries to make sure the answers make sense.

1829
01:12:17,280 --> 01:12:19,080
This is how you build trust in the system

1830
01:12:19,080 --> 01:12:20,680
before the rest of the company sees it.

1831
01:12:20,680 --> 01:12:22,980
The final step is to create a Copilot Studio agent

1832
01:12:22,980 --> 01:12:25,080
that connects to this model as a data agent.

1833
01:12:25,080 --> 01:12:26,580
Keep the scope very narrow at first

1834
01:12:26,580 --> 01:12:28,580
by focusing on that one specific use case

1835
01:12:28,580 --> 01:12:30,680
and testing it with a small group of people.

1836
01:12:30,680 --> 01:12:31,880
Watch the questions they ask

1837
01:12:31,880 --> 01:12:33,680
and see where the system gets confused.

1838
01:12:33,680 --> 01:12:36,380
Every time a user asks a question that fails,

1839
01:12:36,380 --> 01:12:39,480
it tells you exactly how to improve your semantic model.

1840
01:12:39,480 --> 01:12:40,780
This is your first cycle

1841
01:12:40,780 --> 01:12:42,480
and it is small on purpose.

1842
01:12:42,480 --> 01:12:43,380
You are learning the ropes

1843
01:12:43,380 --> 01:12:46,180
and proving that the architecture actually works in the real world.

1844
01:12:46,180 --> 01:12:48,180
Once you have one successful loop finished,

1845
01:12:48,180 --> 01:12:51,280
you can start to scale up by adding more models and more domains.

1846
01:12:51,280 --> 01:12:52,980
But before you try to go big,

1847
01:12:52,980 --> 01:12:55,280
you have to prove the pattern works right here.

1848
01:12:55,280 --> 01:12:58,080
Building a sustainable predictive AI practice.

1849
01:12:58,080 --> 01:13:00,080
You've built your first predictive system

1850
01:13:00,080 --> 01:13:00,980
and it actually works.

1851
01:13:00,980 --> 01:13:03,080
Users are asking Copilot questions

1852
01:13:03,080 --> 01:13:06,280
and predictions are flowing through your semantic layer without a hitch.

1853
01:13:06,280 --> 01:13:07,680
But now you face the real test

1854
01:13:07,680 --> 01:13:10,980
that determines if this is a permanent capability or just a one-off project.

1855
01:13:10,980 --> 01:13:12,280
You have to make it operational.

1856
01:13:12,280 --> 01:13:13,880
That requires structure, clear ownership,

1857
01:13:13,880 --> 01:13:15,380
and a real commitment to the process.

1858
01:13:15,380 --> 01:13:16,480
It starts with ownership.

1859
01:13:16,480 --> 01:13:18,580
Every single piece of your predictive infrastructure

1860
01:13:18,580 --> 01:13:20,580
needs a dedicated steward to look after it.

1861
01:13:20,580 --> 01:13:22,280
Think about who owns each semantic model.

1862
01:13:22,280 --> 01:13:24,780
I don't mean the person who created it once and walked away.

1863
01:13:24,780 --> 01:13:27,380
I mean the person responsible for its accuracy and fitness

1864
01:13:27,380 --> 01:13:28,680
for use every single day.

1865
01:13:28,680 --> 01:13:30,380
This person is the one who gets the alert

1866
01:13:30,380 --> 01:13:31,880
when metadata is outdated

1867
01:13:31,880 --> 01:13:33,480
or data quality starts to drop.

1868
01:13:33,480 --> 01:13:34,680
They approve model retraining

1869
01:13:34,680 --> 01:13:36,080
and they are the first line of defense

1870
01:13:36,080 --> 01:13:37,480
when users run into trouble.

1871
01:13:37,480 --> 01:13:39,580
You need to assign this role explicitly

1872
01:13:39,580 --> 01:13:41,880
and make it a formal part of a job description

1873
01:13:41,880 --> 01:13:43,480
rather than just an assumption.

1874
01:13:43,480 --> 01:13:46,080
The predictive models themselves need a different kind of owner.

1875
01:13:46,080 --> 01:13:48,680
These are the machine learning models running in your notebooks

1876
01:13:48,680 --> 01:13:51,480
and they require constant attention to stay relevant.

1877
01:13:51,480 --> 01:13:52,880
They need versioning,

1878
01:13:52,880 --> 01:13:54,980
regular retraining as data patterns shift

1879
01:13:54,980 --> 01:13:57,280
and active monitoring to catch accuracy drift

1880
01:13:57,280 --> 01:13:58,680
before it causes problems.

1881
01:13:58,680 --> 01:13:59,980
Usually your data science team

1882
01:13:59,980 --> 01:14:02,080
or whoever maintains your analytics infrastructure

1883
01:14:02,080 --> 01:14:02,880
should take this on.

1884
01:14:02,880 --> 01:14:05,180
They aren't just responsible for the initial training

1885
01:14:05,180 --> 01:14:07,680
but for the long term performance of the model in the wild.

1886
01:14:07,680 --> 01:14:09,380
Then you have governance and compliance.

1887
01:14:09,380 --> 01:14:10,680
This is where your policies live.

1888
01:14:10,680 --> 01:14:12,880
You need a person or a team to define the standards

1889
01:14:12,880 --> 01:14:15,680
for naming, documentation and security configurations.

1890
01:14:15,680 --> 01:14:17,080
They conduct the audits to make sure

1891
01:14:17,080 --> 01:14:18,380
everyone is following the rules

1892
01:14:18,380 --> 01:14:19,780
but they also approve exceptions

1893
01:14:19,780 --> 01:14:22,380
when a policy needs to bend for a legitimate reason.

1894
01:14:22,380 --> 01:14:23,880
This team maintains the playbooks

1895
01:14:23,880 --> 01:14:26,780
and updates them as your requirements evolve over time.

1896
01:14:26,780 --> 01:14:27,880
In most organizations,

1897
01:14:27,880 --> 01:14:31,780
this ends up being a specialized subset of your BI or governance group.

1898
01:14:31,780 --> 01:14:33,280
Next, you need to look at your processes.

1899
01:14:33,280 --> 01:14:34,680
You have to establish exactly

1900
01:14:34,680 --> 01:14:36,180
how a new predictive model moves

1901
01:14:36,180 --> 01:14:38,080
from a development environment into full production.

1902
01:14:38,080 --> 01:14:39,580
This isn't about creating a committee

1903
01:14:39,580 --> 01:14:41,080
to slow things down with red tape.

1904
01:14:41,080 --> 01:14:42,380
It's about quality gates.

1905
01:14:42,380 --> 01:14:44,580
A data scientist develops a model in a dev workspace

1906
01:14:44,580 --> 01:14:46,780
and validates it against test data.

1907
01:14:46,780 --> 01:14:49,180
They document what it predicts and how it works.

1908
01:14:49,180 --> 01:14:51,580
Then they submit it for a structured review.

1909
01:14:51,580 --> 01:14:52,880
A peer from the BI team

1910
01:14:52,880 --> 01:14:54,380
checks the semantic model grounding

1911
01:14:54,380 --> 01:14:56,980
while the governance team reviews it for compliance.

1912
01:14:56,980 --> 01:14:58,080
Once they both sign off,

1913
01:14:58,080 --> 01:14:59,280
the model moves to production

1914
01:14:59,280 --> 01:15:01,480
and becomes a new measure that Copilot can query.

1915
01:15:01,480 --> 01:15:03,080
Don't forget about the retirement process

1916
01:15:03,080 --> 01:15:05,580
because predictive models do not last forever.

1917
01:15:05,580 --> 01:15:08,080
Business conditions change and data patterns shift,

1918
01:15:08,080 --> 01:15:10,380
which means a model trained on five years of data

1919
01:15:10,380 --> 01:15:12,380
eventually becomes stale and misleading.

1920
01:15:12,380 --> 01:15:14,780
You need an explicit way to retire old models,

1921
01:15:14,780 --> 01:15:16,280
perhaps through an annual review

1922
01:15:16,280 --> 01:15:19,280
where every model is evaluated for its continued relevance.

1923
01:15:19,280 --> 01:15:20,780
If a model isn't being used

1924
01:15:20,780 --> 01:15:23,480
or the accuracy has dropped below an acceptable level,

1925
01:15:23,480 --> 01:15:24,980
you mark it as deprecated.

1926
01:15:24,980 --> 01:15:27,580
You stop exposing it to Copilot and you tell your users

1927
01:15:27,580 --> 01:15:29,280
so they can find a better alternative.

1928
01:15:29,280 --> 01:15:31,380
Active management means you don't just let things decay

1929
01:15:31,380 --> 01:15:32,280
in the background.
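A retirement rule like this is easy to write down explicitly. The sketch below is one possible form. The accuracy floor, the idle window, and the dates are placeholders that each organization would set for itself.

```python
from datetime import date, timedelta


def review_model(accuracy: float, last_queried: date, today: date,
                 min_accuracy: float = 0.75, max_idle_days: int = 180) -> str:
    """Return 'active' or 'deprecated' for one model at review time.

    The thresholds are placeholders; each organization sets its own.
    """
    idle = today - last_queried
    if accuracy < min_accuracy or idle > timedelta(days=max_idle_days):
        return "deprecated"
    return "active"


today = date(2026, 6, 5)
print(review_model(0.82, date(2026, 5, 20), today))   # accurate and recently used
print(review_model(0.61, date(2026, 5, 20), today))   # accuracy below the floor
print(review_model(0.82, date(2025, 9, 1), today))    # unused for too long
```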

1930
01:15:32,280 --> 01:15:34,680
Training is another critical piece of the puzzle.

1931
01:15:34,680 --> 01:15:36,680
Your teams need to deeply understand Semantic Link

1932
01:15:36,680 --> 01:15:39,480
and they need to know exactly what SemPy can and cannot do.

1933
01:15:39,480 --> 01:15:41,580
They have to grasp why grounding matters so much

1934
01:15:41,580 --> 01:15:42,680
for reliability.

1935
01:15:42,680 --> 01:15:45,680
Governance shouldn't be seen as a boring compliance burden

1936
01:15:45,680 --> 01:15:46,880
but as the very infrastructure

1937
01:15:46,880 --> 01:15:48,880
that makes your predictions trustworthy.

1938
01:15:48,880 --> 01:15:50,780
You can achieve this by running workshops,

1939
01:15:50,780 --> 01:15:53,680
creating internal documentation and building templates.

1940
01:15:53,680 --> 01:15:56,080
Your goal is to make it easy for a new hire to step in

1941
01:15:56,080 --> 01:15:58,380
and understand exactly how predictive systems work

1942
01:15:58,380 --> 01:15:59,180
in your company.

1943
01:15:59,180 --> 01:16:01,680
You also need to track the metrics that actually matter.

1944
01:16:01,680 --> 01:16:04,080
To understand if this practice is delivering real value,

1945
01:16:04,080 --> 01:16:05,980
you should track how many users are actually asking

1946
01:16:05,980 --> 01:16:07,480
Copilot for predictions.

1947
01:16:07,480 --> 01:16:08,980
You need to measure accuracy by seeing

1948
01:16:08,980 --> 01:16:11,480
if those predictions correlate with real-world outcomes.

1949
01:16:11,480 --> 01:16:13,680
Most importantly, look at the business impact.

1950
01:16:13,680 --> 01:16:15,980
Are these predictions changing decisions for the better

1951
01:16:15,980 --> 01:16:17,880
and how fast can a user go from a question

1952
01:16:17,880 --> 01:16:19,280
to a solid prediction?

1953
01:16:19,280 --> 01:16:20,880
These numbers will tell you if you're building

1954
01:16:20,880 --> 01:16:21,980
something meaningful

1955
01:16:21,980 --> 01:16:24,880
or just maintaining a system that nobody uses.
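Checking predictions against real-world outcomes can start very simply. The sketch below computes plain accuracy over pairs of predicted and observed values. The five pairs are hypothetical, and a real practice would likely add further metrics suited to the use case.

```python
def accuracy(pairs):
    """Fraction of predictions that matched the observed outcome."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no outcomes recorded yet")
    return sum(1 for predicted, actual in pairs if predicted == actual) / len(pairs)


# (predicted churn, actually churned) for five hypothetical customers
observed = [(True, True), (False, False), (True, False), (False, False), (True, True)]
print(accuracy(observed))
```

Tracking this number per model version over time is what lets a steward notice accuracy drift.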

1956
01:16:24,880 --> 01:16:26,880
Finally, you need an iteration rhythm.

1957
01:16:26,880 --> 01:16:28,880
Set up quarterly reviews to look at what worked

1958
01:16:28,880 --> 01:16:30,280
and where the predictions failed.

1959
01:16:30,280 --> 01:16:33,580
Listen to user feedback and adjust your processes accordingly.

1960
01:16:33,580 --> 01:16:35,880
You might find that your governance is too restrictive

1961
01:16:35,880 --> 01:16:38,280
or perhaps it isn't strict enough to keep the data clean.

1962
01:16:38,280 --> 01:16:39,880
Maybe the teams need more hands-on training

1963
01:16:39,880 --> 01:16:41,980
or the documentation is too confusing.

1964
01:16:41,980 --> 01:16:44,280
Every cycle is an opportunity to learn something new

1965
01:16:44,280 --> 01:16:46,480
and evolve the practice based on reality.

1966
01:16:46,480 --> 01:16:48,680
This operating model is the line between saying

1967
01:16:48,680 --> 01:16:52,780
"we built a Copilot system" and actually running a predictive AI practice.

1968
01:16:52,780 --> 01:16:55,880
One of those is just a project, but the other is a core capability.

1969
01:16:55,880 --> 01:16:58,680
The difference always comes down to structure, ownership,

1970
01:16:58,680 --> 01:17:01,880
and a commitment to continuous improvement.

1971
01:17:01,880 --> 01:17:04,380
The gap between Copilot and predictive analytics

1972
01:17:04,380 --> 01:17:06,380
isn't a limitation of the technology,

1973
01:17:06,380 --> 01:17:07,780
but a problem with the architecture.

1974
01:17:07,780 --> 01:17:09,880
Copilot doesn't do the heavy lifting of calculation

1975
01:17:09,880 --> 01:17:11,780
because its job is to orchestrate.

1976
01:17:11,780 --> 01:17:13,780
Your semantic models, grounded in Semantic Link,

1977
01:17:13,780 --> 01:17:15,480
are what actually do the math.

1978
01:17:15,480 --> 01:17:17,580
Your data science teams are the ones building the engines

1979
01:17:17,580 --> 01:17:18,580
in their notebooks,

1980
01:17:18,580 --> 01:17:21,380
and your governance framework keeps the whole system trustworthy.

1981
01:17:21,380 --> 01:17:23,080
It all starts with metadata.

1982
01:17:23,080 --> 01:17:24,780
From there, you move to SemPy

1983
01:17:24,780 --> 01:17:27,080
and connect everything through Fabric data agents.

1984
01:17:27,080 --> 01:17:28,880
Once the technical pieces are in place,

1985
01:17:28,880 --> 01:17:31,180
you build the operating model that keeps it all running.

1986
01:17:31,180 --> 01:17:32,680
By the time we reach 2027,

1987
01:17:32,680 --> 01:17:34,880
this approach will be the standard way of doing business

1988
01:17:34,880 --> 01:17:36,780
so the best time to start is right now.

1989
01:17:36,780 --> 01:17:38,980
If you want to stay ahead on Microsoft 365,

1990
01:17:38,980 --> 01:17:40,680
Copilot, and the modern workplace,

1991
01:17:40,680 --> 01:17:42,380
make sure to subscribe to the channel.

1992
01:17:42,380 --> 01:17:44,280
If this was helpful, leave a review

1993
01:17:44,280 --> 01:17:45,880
and let's connect over on LinkedIn.

Mirko Peters

Founder of m365.fm, m365.show and m365con.net

Mirko Peters is a Microsoft 365 expert, content creator, and founder of m365.fm, a platform dedicated to sharing practical insights on modern workplace technologies. His work focuses on Microsoft 365 governance, security, collaboration, and real-world implementation strategies.

Through his podcast and written content, Mirko provides hands-on guidance for IT professionals, architects, and business leaders navigating the complexities of Microsoft 365. He is known for translating complex topics into clear, actionable advice, often highlighting common mistakes and overlooked risks in real-world environments.

With a strong emphasis on community contribution and knowledge sharing, Mirko is actively building a platform that connects experts, shares experiences, and helps organizations get the most out of their Microsoft 365 investments.