Aug. 14, 2025

Microsoft Fabric Notebooks: AI Model Training Explained

Stop torturing your laptop. Train models where the data lives. With Microsoft Fabric notebooks running on Spark next to your Lakehouse, you skip CSV exports, process terabytes in place, and iterate in Python or R without memory crashes. Push transforms to the data, engineer features at scale, monitor long runs in real time, checkpoint models, and evaluate across massive test sets, turning days of wrangling into hours of results.

You can transform AI model training with Microsoft Fabric notebooks. By running your training jobs directly next to your data, you skip long data transfers and work with the freshest information. The Native Execution Engine in Fabric is reported to speed up processing by up to six times compared to traditional Spark, which can lower compute costs. Real-time monitoring in notebooks helps you track resources and session details, making it easier to optimize your workflow and avoid errors.

Feature	Benefit
Real-time resource monitoring	See CPU and memory usage instantly for better performance.
Session tracking	Check session time and compute type to make smart choices.
Resource usage insights	Spot resource-heavy jobs quickly to prevent crashes.

Aspect	Improvement Description
Spark Cluster Startup Time	Sessions start in seconds, so you get results faster.
Latency Reduction	Lower startup delays mean you analyze data more quickly.
Cost Optimization	Smarter pooling keeps costs down by using only what you need.
SLA Compliance	Faster and more reliable jobs help you meet your goals.

Key Takeaways

Transform AI model training by using Microsoft Fabric Notebooks for direct data access, reducing transfer times.
Leverage the Native Execution Engine to speed up processing by up to six times, saving costs and enhancing performance.
Utilize real-time resource monitoring to track CPU and memory usage, optimizing your workflow and preventing errors.
Set up your workspace easily in Fabric Notebooks, allowing quick access to data and powerful analytics tools.
Use Spark for efficient data ingestion and processing, enabling you to handle large datasets without memory issues.
Document your workflow thoroughly to ensure reproducibility and clarity in your AI projects.
Employ hyperparameter tuning and experiment tracking to improve model accuracy and streamline your training process.
Collaborate effectively with your team by sharing notebooks and using built-in features for feedback and discussion.

7 Surprising Facts about Microsoft Fabric Notebooks

Deep OneLake integration: Fabric Notebooks can read and write directly to OneLake as a native first-class store, letting code operate on the same lakehouse files and tables used across Fabric without separate connectors.
Multiple languages in one notebook: a single Fabric notebook can run PySpark, Spark SQL, Scala, and SparkR cells side by side, switching per cell with magic commands such as %%sql, which enables mixed-language workflows without switching tools.
Serverless compute by default: notebooks can run on Fabric's managed, serverless Spark compute so you don’t need to provision clusters manually for many analytics workloads.
Live, embeddable visuals and Power BI integration: notebook outputs (charts and tables) can be promoted to Power BI visuals or embedded directly into Fabric dashboards, creating a seamless path from exploration to production reporting.
Built-in collaboration and real-time editing: colleagues can co-edit the same notebook in real time with autosave and shared execution context options, similar to collaborative document editors.
Direct access to Fabric Copilot and LLM features: notebooks can leverage Fabric’s Copilot/LLM integrations to generate or transform code, build queries, and create visualizations interactively from natural language prompts.
Data lineage and governance applied to notebook outputs: artifacts produced by notebooks (tables, models, visuals) are tracked in Fabric’s lineage and governance system, so notebook-derived assets inherit organizational policy and discoverability.

Fabric Notebooks Setup

Setting up your workspace in Fabric notebooks gives you a strong foundation for AI model training. This section gives an overview of Fabric notebooks and guides you through each step, from account creation to launching your first notebook. You will see how easy it is to connect to your data and use Spark for analytics at scale.

Accessing Microsoft Fabric

Creating an Account

To begin, you need access to Fabric. You can sign up for a free Fabric account, which lets you log in to the Fabric app. If you already use Microsoft Power BI, you can sign in with your existing Power BI account. Either way, you can start working with notebooks right away.

Tip: Sign-up is generally straightforward. If you cannot sign in, check with your tenant administrator that Fabric is enabled for your organization.

Navigating Workspace

Once you log in, you will see the Fabric workspace. Here, you can organize your projects, manage resources, and access your lakehouse. The workspace dashboard helps you find your notebooks, datasets, and other assets quickly. You can also check your workspace capacity and settings to make sure you meet the requirements for AI projects.

Requirement	Description
Microsoft Fabric	Set up Fabric with F64 capacity.
Lakehouse	Add a lakehouse to the notebook and download data from a public blob.
Azure AI Search	Configure Azure AI Search.
Copilot	Ensure the tenant setting for Copilot is enabled.
Workspace Capacity	Use a supported capacity (F2 or higher, or P1 or higher).
Cross-geo Settings	Enable tenant settings for cross-geo data processing if needed.

Launching a Notebook

Kernel Selection

After setting up your workspace, you can start creating a new notebook. When you launch a notebook, you choose a kernel. The kernel controls the programming language and environment. Fabric notebooks support Python and other languages, making them flexible for different AI tasks.

Library Setup

Before you begin your analysis, you may need to install or import libraries. Fabric notebooks make this process simple. You can add libraries for data science, machine learning, or visualization directly in your notebook. This setup allows you to use Spark for processing large datasets and connect seamlessly with your lakehouse.

Feature	Description
Language Flexibility	Supports Python and other languages for data science workflows.
Data Processing at Scale	Uses Spark to handle large datasets efficiently.
Interactivity	Lets you run code and see results in real time.
Seamless Integration with Storage	Connects directly with lakehouse, so you work with data where it lives.

When you create a new notebook, you avoid the hassle of moving data between tools. You can process, analyze, and visualize your data all in one place. This integration saves time and reduces errors. If you are new to Fabric, you may notice a learning curve, but the unified workspace and direct data access make the process much easier.

Note: If you work with large projects or teams, Fabric notebooks help you avoid fragmented workflows. You can keep everything organized and efficient from the start.

Data Preparation in Fabric Notebooks

Preparing your dataset in Fabric notebooks sets the stage for successful AI model training. You can streamline every step, from data ingestion to data visualization, using the built-in tools and Spark-powered features.

Importing Data

Connecting to Lakehouse

You start by connecting your notebook to a lakehouse. This connection gives you direct access to your datasets, so you do not need to move files between systems. You can work with data where it lives, which saves time and reduces errors. The lakehouse stores your dataset in Delta tables, making it easy to load and update information as needed.

Loading Data with Spark

After connecting, you use Spark for data ingestion. Spark lets you load large datasets quickly and efficiently. You can read data from multiple sources, including CSV files, Parquet files, or existing tables in your lakehouse. Spark handles the heavy lifting, so you can focus on working with data instead of worrying about memory limits.

Tip: Use Spark DataFrames to process your dataset at scale. This approach helps you manage even the largest datasets without slowing down your workflow.
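
The two load paths described above can be sketched as follows. This is a minimal example, assuming the notebook's built-in spark session and an attached default lakehouse; the table name sales and the path Files/raw/sales.csv are hypothetical placeholders, not names from this article.

```python
def load_table(spark, table_name):
    """Read an existing lakehouse (Delta) table into a Spark DataFrame."""
    return spark.read.table(table_name)


def load_csv(spark, path):
    """Read a CSV file with a header row, letting Spark infer column types."""
    return spark.read.csv(path, header=True, inferSchema=True)


# In a Fabric notebook, `spark` is already defined for you:
# sales = load_table(spark, "sales")              # hypothetical table name
# raw = load_csv(spark, "Files/raw/sales.csv")    # hypothetical relative path
```

Both calls return lazy DataFrames, so nothing is pulled into driver memory until you run an action such as count() or display().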

Data Cleaning

Handling Missing Values

Cleaning your dataset is a key step before training any model. You can launch Data Wrangler from your Fabric notebook to explore your dataset and spot missing values. Data Wrangler generates code automatically for common cleaning tasks, such as filling in missing values or removing incomplete rows. You can export these cleaning steps as reusable functions in pandas or PySpark, making your workflow more efficient.
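
If you prefer to write the cleaning step by hand, it might look like the sketch below. It is not Data Wrangler's generated output; the column names in the comments are hypothetical, and df is assumed to be a Spark DataFrame.

```python
def clean_missing(df, required_cols, fill_values):
    """Drop rows missing any required column, then fill gaps in the others.

    df            -- a Spark DataFrame
    required_cols -- columns that must be present, e.g. ["customer_id"]
    fill_values   -- dict of column -> default, e.g. {"amount": 0.0}
    """
    return df.dropna(subset=required_cols).fillna(fill_values)


# Example call with hypothetical columns:
# cleaned = clean_missing(raw, ["customer_id"], {"amount": 0.0})
```

Dropping first and filling second keeps the defaults from masking rows that should have been rejected.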

Feature Engineering

Feature engineering helps you create new variables from your existing dataset. You can use Python-based tools in Fabric notebooks to transform your data and build features that improve model accuracy. Spark makes it easy to apply these transformations across your entire dataset, even if you have millions of rows.
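
As one illustration, derived columns can be expressed as Spark SQL expressions so that the work runs on the cluster. The sketch assumes a Spark DataFrame with hypothetical columns unit_price, quantity, and order_date.

```python
def add_order_features(df):
    """Add two derived columns using Spark SQL expressions.

    Assumes hypothetical columns `unit_price`, `quantity`, and `order_date`.
    """
    return df.selectExpr(
        "*",                                     # keep every existing column
        "unit_price * quantity AS order_total",  # interaction feature
        "dayofweek(order_date) AS order_dow",    # calendar feature (1 = Sunday)
    )


# features = add_order_features(cleaned)
```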

Best Practice:
Use Spark and Python tools to clean and transform your dataset.
Create experiments to test different feature sets.
Track your runs and results using MLflow in the Fabric UI.

Data Exploration

Summary Statistics

Exploring your dataset helps you understand its structure and quality. You can generate summary statistics, such as mean, median, and standard deviation, directly in your notebook. These statistics give you a quick overview of your dataset and highlight any issues that need attention.
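
A small helper along these lines produces those statistics for selected columns. It assumes a Spark DataFrame; note that Spark computes percentiles such as the median approximately.

```python
def summarize(df, columns):
    """Return count, mean, median (50th percentile), and stddev for the given columns."""
    return df.select(*columns).summary("count", "mean", "50%", "stddev")


# summarize(features, ["order_total", "quantity"]).show()   # hypothetical columns
```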

Visualizations

Data visualization brings your dataset to life. Fabric notebooks support built-in visualization functions and integrate with libraries like Matplotlib and Bokeh. You can turn tabular results into charts without writing extra code. The display function lets you interact with your data and spot trends or outliers easily.

Note: Interactive data visualization helps you make better decisions when preparing your dataset for AI model training.

Train Models in Fabric

Writing Training Code

Using Python or R

You can write your model training code in several languages within Microsoft Fabric Notebooks. The platform supports:

PySpark (Python)
Spark (Scala)
Spark SQL
SparkR (R)

Most users choose Python for its flexibility and rich ecosystem, and it works well for deep learning with PyTorch: install the library if your environment does not already include it, then start building your model. PySpark lets you scale jobs across many nodes, which is important for large datasets. You can also use SparkR if you prefer R's statistical tools, or Spark SQL for quick data queries.

Structuring Code Cells

You should organize your code into clear, logical cells. Start with data loading and cleaning. Add cells for feature engineering and exploratory analysis. Place your model training code in its own cell. This structure helps you debug and rerun parts of your machine learning experiment without repeating earlier steps. You can also add markdown cells to explain your logic and document your experiments. This approach makes your notebooks easy to read and share with others.

Algorithm Selection

Choosing Models

You need to pick the right algorithm for your task. Logistic regression works well for binary classification problems. For more advanced models, you can use libraries like scikit-learn, PySpark ML, TensorFlow, or PyTorch. PyTorch is popular for deep learning, image recognition, and natural language processing, as well as for transfer learning and fine-tuning pre-trained models. If GPU-backed compute is available in your Fabric environment, it can speed up PyTorch training considerably.

When you select an algorithm, think about your data size, the type of problem, and the resources you have. FLAML, created by Microsoft Research, helps you train models efficiently. It uses less compute and works well with parallel jobs. This makes it a good choice for large-scale machine learning experiment runs.

Setting Hyperparameters

Hyperparameters control how your models learn. You can use flaml.tune to search for the best settings. Fabric lets you run many tuning trials at once, thanks to Spark's parallel processing, so you can quickly test different learning rates, batch sizes, or layer counts for a PyTorch model. Track each trial with experiment tracking, then use the visualization tools to compare results and pick the best configuration.

Tip: Use experiment tracking to log every trial, metric, and parameter. This helps you repeat successful experiments and avoid mistakes.

Running Training Jobs

Monitoring Progress

You can monitor your training jobs in real time. Fabric notebooks show how each step performs, with detailed logs for every notebook step. High concurrency mode, which lets several notebooks share one Spark session, can speed up your jobs by about 30%. MLflow integration gives you autologging, so you do not have to log metrics by hand. You can also define your own training runs and choose which parameters and metrics to record. This makes it easy to track progress and compare experiments.
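
A tracked training run might be wrapped like this. It is a sketch, assuming MLflow is available in the runtime; the experiment name and train_fn are hypothetical stand-ins for your own code.

```python
def train_with_tracking(experiment_name, params, train_fn):
    """Run train_fn(params) inside an MLflow run and log its metrics.

    train_fn is your own function; it should return a dict of metric name -> float.
    """
    import mlflow  # imported lazily so the helper can be defined on any kernel

    mlflow.set_experiment(experiment_name)
    mlflow.autolog()  # capture framework-specific details where supported
    with mlflow.start_run():
        mlflow.log_params(params)      # explicit parameters you choose to record
        metrics = train_fn(params)
        mlflow.log_metrics(metrics)    # explicit metrics you choose to record
    return metrics


# Example call (hypothetical names):
# train_with_tracking("churn-experiment", {"lr": 0.01}, my_train_fn)
```

Autologging covers supported frameworks automatically, while the explicit log_params and log_metrics calls record anything it does not capture.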

Scenario	Python Notebooks (2-core VM)	PySpark Notebooks (Spark Compute)
Handling of Large Datasets	Limited by single-node memory. May struggle with scaling.	Distributed processing ensures scalable handling of multi-GB to TB workloads.

As the table shows, PySpark notebooks handle large datasets better. This matters when you train PyTorch or other deep learning models on big data.

Checkpointing

Checkpointing lets you save your model's state during training. If your job stops or you want to pause, you can resume from the last checkpoint instead of starting over. This is especially useful for long-running PyTorch training jobs. You can also use checkpoints to test different training strategies or continue an experiment from a certain point.

Note: Always set checkpoints when you train models on large datasets. This practice saves time and protects your work.
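
The save-and-resume pattern can be sketched without any framework. The example below is a simplified stand-in that stores JSON-serializable state; with PyTorch you would typically store model.state_dict() and the optimizer state with torch.save instead. The step function and file path are hypothetical.

```python
import json
import os
import tempfile


def save_checkpoint(path, epoch, state):
    """Write the checkpoint atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump({"epoch": epoch, "state": state}, f)
    os.replace(tmp_path, path)  # atomic rename on the same filesystem


def load_checkpoint(path):
    """Return (next_epoch, state), or (0, None) when no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0, None
    with open(path) as f:
        data = json.load(f)
    return data["epoch"] + 1, data["state"]


def train(path, total_epochs, step):
    """Resume from the last checkpoint and run the remaining epochs.

    step(epoch, state) is your own function returning the updated state.
    """
    start, state = load_checkpoint(path)
    for epoch in range(start, total_epochs):
        state = step(epoch, state)
        save_checkpoint(path, epoch, state)
    return state
```

Calling train again with the same path skips the epochs that already completed, which is the behavior you want after an interrupted session.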

You now have the tools to train, test, and monitor machine learning models in Microsoft Fabric notebooks. You can run experiments, tune hyperparameters, and build PyTorch or other models for your data science problems.

AI Model Training Workflow

You can get more out of AI model training by mastering the workflow in Microsoft Fabric notebooks. This section guides you through distributed evaluation, model tuning, and the use of built-in AI functions. You will see how Spark, real-time monitoring, and Copilot features help you build better models faster.

Distributed Evaluation

Evaluating Performance

You can evaluate your model's performance across large datasets using Spark's distributed computing. Spark splits your test data into smaller parts and processes them in parallel, so you do not have to wait hours for results. You can check metrics like accuracy, precision, and recall right inside your notebook. This approach works well for deep learning with PyTorch, where you may need to test models on millions of records. Real-time monitoring in Fabric lets you watch resource usage and job progress as your evaluation runs.
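
One way to compute these metrics at scale is to aggregate on the cluster and do the final arithmetic on the driver. This is a sketch, assuming a Spark DataFrame of predictions with hypothetical columns label and prediction, and a binary task.

```python
def confusion_counts(predictions, label_col="label", pred_col="prediction"):
    """Aggregate (label, prediction) pair counts on the cluster.

    Only one row per distinct pair is collected to the driver.
    """
    rows = predictions.groupBy(label_col, pred_col).count().collect()
    return {(r[label_col], r[pred_col]): r["count"] for r in rows}


def binary_metrics(counts, positive=1):
    """Compute accuracy, precision, and recall from confusion counts."""
    tp = sum(n for (y, p), n in counts.items() if y == positive and p == positive)
    fp = sum(n for (y, p), n in counts.items() if y != positive and p == positive)
    fn = sum(n for (y, p), n in counts.items() if y == positive and p != positive)
    total = sum(counts.values())
    correct = sum(n for (y, p), n in counts.items() if y == p)
    return {
        "accuracy": correct / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# metrics = binary_metrics(confusion_counts(scored_df))   # scored_df is hypothetical
```

Because the group-by runs in parallel and returns at most four rows for a binary task, the test set can be arbitrarily large without straining the driver.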

Comparing Models

You can compare different models side by side in notebooks. For example, you might train a PyTorch model and a scikit-learn model on the same dataset. You can log each model's results and visualize them with built-in charts, which helps you pick the best model for your project. You can also use experiment tracking to save your results and share them with your team. This makes it easy to repeat successful experiments and improve your workflow.

Tuning Models

Hyperparameter Search

You can boost your model’s accuracy by tuning hyperparameters. Microsoft Fabric Notebooks support advanced tools like Optuna for this task. Here is a simple workflow you can follow:

Create a study with Optuna to store trial results:

import optuna

study = optuna.create_study(direction="maximize")

Optimize the study over a set number of trials, where objective is your own function that takes a trial and returns the score to maximize:

study.optimize(objective, n_trials=60, show_progress_bar=True)

Print the results of the trials:

print("Number of finished trials:", len(study.trials))

You can also visualize your search with plots from optuna.visualization. Use plot_optimization_history(study).show() to see how your model improves over time. Try plot_param_importances(study).show() to find out which settings matter most. Use plot_parallel_coordinate(study).show() to explore how different parameters interact. Do not just copy code: adapt it and experiment with different hyperparameters to get the best results for your PyTorch or other models.
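
For completeness, here is what an objective function and a full search could look like. The objective is a toy function invented for illustration (it peaks at lr = 0.01 and n_layers = 3); in practice its body would train a model and return a validation score. The parameter names and the seed are assumptions.

```python
import math


def objective(trial):
    """Toy objective: the score peaks at lr = 0.01 and n_layers = 3.

    Replace the body with real training and return the validation metric.
    """
    lr = trial.suggest_float("lr", 1e-4, 1e-1, log=True)
    n_layers = trial.suggest_int("n_layers", 1, 6)
    return -abs(math.log10(lr) + 2) - abs(n_layers - 3)


def run_search(n_trials=20, seed=0):
    """Create a study, run the trials, and return the best parameters and score."""
    import optuna  # imported lazily so `objective` can be tested without Optuna

    sampler = optuna.samplers.TPESampler(seed=seed)  # fixed seed for repeatability
    study = optuna.create_study(direction="maximize", sampler=sampler)
    study.optimize(objective, n_trials=n_trials)
    return study.best_params, study.best_value


# best_params, best_value = run_search(n_trials=60)
```

Sampling the learning rate on a log scale is a common choice because useful values often span several orders of magnitude.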

Avoiding Overfitting

You want your AI model training to produce models that work well on new data, not just the training set. Overfitting happens when a model learns the training data too well and fails on new examples. You can prevent this by using cross-validation and stacking techniques.

Stacking typically uses cross-validation to generate predictions for the meta-model, ensuring that no information leaks from training into testing. This adds complexity but also provides robustness.

You can also use early stopping, regularization, and dropout layers in PyTorch to make your models more robust. Always check your validation scores and adjust your training process if you see signs of overfitting.
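
Early stopping is usually a few lines in your own training loop. The sketch below is framework-agnostic; the patience value and the validation losses are made-up illustration values.

```python
class EarlyStopping:
    """Signal a stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss      # improvement: remember it and reset the counter
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1      # no improvement this epoch
        return self.bad_epochs >= self.patience


# Usage inside a training loop (losses are illustration values only):
stopper = EarlyStopping(patience=2)
for epoch, val_loss in enumerate([0.90, 0.70, 0.72, 0.71, 0.69]):
    if stopper.should_stop(val_loss):
        print(f"stopping after epoch {epoch}")
        break
```

Pairing this with checkpointing lets you restore the weights from the best epoch rather than the last one.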

Built-in AI Functions

Text Classification

You can save time on common tasks by using built-in AI functions in Microsoft Fabric notebooks. For text classification, you can use the Classify function to sort emails, support tickets, or documents into categories. This works well for business cases like routing urgent requests or tagging customer feedback. You can also build custom PyTorch models for more advanced classification tasks.

Sentiment Analysis

Sentiment analysis helps you understand the tone of text, such as customer reviews or social media posts. You can use the Sentiment Analysis function to flag negative comments and respond quickly. This feature works out of the box in notebooks, so you do not need to write complex code. You can also combine built-in functions with your own PyTorch models for even better results.

Here is a table of built-in AI functions you can use in your AI model training workflow:

Function	Description	Use Cases
Summarize	Condenses long text into a short summary	Turn lengthy internal company emails into concise summaries
Classify	Categorizes text based on custom labels or tags	Classify support tickets by severity (urgent, critical, etc.)
Extract	Retrieves specific information from input text	Extract names and locations from a customer email database
Translate	Converts text from one language to another	Translate customer emails from Spanish to English
Similarity	Compares two texts and reports how similar they are	Find similar customer support tickets that highlight the same problem
Sentiment Analysis	Identifies the tone of text: positive, negative, or neutral	Flag customer reviews containing words like “unacceptable” or “bad” and address them before they escalate

You can use Copilot to generate code, suggest improvements, and automate repetitive steps. This makes your AI model training process faster and more productive. You can focus on building and tuning models while Copilot handles the routine work.

By following this workflow in Microsoft Fabric notebooks, you can scale your AI model training, tune PyTorch models efficiently, and use built-in AI functions to solve real-world problems.

Model Management and Deployment

Saving Models

Exporting and Versioning

You need a reliable way to save and track your models in Fabric. Start by using MLflow to log every training run. MLflow records your parameters, metrics, and artifacts, which helps you find the best version of a model when you need it. Enable autologging before your first run so that you capture all important details from the start. Autologging cannot capture runs that have already finished, so start early.

You can use Delta Lake time travel to pin the exact version of your training data. This ensures you can always reproduce your results. Register each model in the MLflow model registry. Include the version, training data hash, performance metrics, and owner. This creates a clear audit trail for managing models. Before you move a model to production, set up a human review process. Use Azure DevOps pull request approvals linked to Fabric deployment pipelines. This step keeps your workflow safe and compliant.
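
Pinning a data version with time travel can be as small as the sketch below. It assumes the notebook's spark session and a Delta table; the path Tables/training_data and the version number are hypothetical.

```python
def load_training_snapshot(spark, table_path, version):
    """Read a Delta table exactly as it was at the given version number."""
    return (
        spark.read.format("delta")
        .option("versionAsOf", version)  # Delta Lake time travel by version
        .load(table_path)
    )


# train_df = load_training_snapshot(spark, "Tables/training_data", 3)
```

Recording the version number alongside the run's parameters is what makes the training set recoverable later.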

Tip: Always keep your best models registered in MLflow. You can roll back to a previous version if you find a problem.

Sharing Notebooks

Collaboration Features

You can work with your team easily in notebooks. The platform connects directly to your Lakehouse, so everyone uses the same data. Notebooks combine code, comments, and outputs in one place. This makes it simple to share ideas and results. Team members can use their favorite programming languages, like Python or R, on the same dataset. This flexibility helps everyone contribute to managing models and tracking experiments.

Feature	Description
Easy to work with Lakehouse data	Direct connection to the Fabric Lakehouse for seamless data interaction without extra setup.
Better collaboration	Notebooks combine code, comments, and outputs, enhancing understanding and sharing among team members.
Supports different coding styles	Allows team members to use their preferred programming languages, fostering collaboration on the same dataset.

Deploying Models

Integration with Fabric Pipelines

You can deploy your models using Fabric deployment pipelines. These pipelines help you manage models across development, test, and production environments, so you keep control and consistency at every stage. The visual workflow in deployment pipelines shows where your model is and what steps come next. Promoting models through controlled environments helps your team follow best practices for managing models.

Integrate with Azure OpenAI services using SynapseML.
Use the Python SDK for deployment.
Access your models through REST APIs.
Common use cases include text summarization and sentiment analysis.

Real-Time Inference

You can serve your models for real-time inference, which means making predictions on new data as soon as it arrives. For example, a deployed model can classify support tickets or analyze customer feedback instantly. The MLflow model registry records each model version, so you know which one is running and can update or roll back quickly if you need to.

Note: Real-time inference helps you respond to business needs without delay. Use MLflow to monitor and manage your models in production.

By following these steps, you can handle saving, sharing, and deploying your models in Fabric. You keep your workflow organized and your results reproducible. MLflow and notebooks give you the tools you need for tracking experiments and managing models at scale.

Best Practices in Fabric Notebooks

Reproducibility

Documenting Workflow

You should always document your workflow to make your AI projects easy to repeat and understand. Good documentation helps you and your team track every step of your model training process. Start by describing the business context for each experiment. Register your models with clear explanations of what they predict, how to use them, and any known limits. Use markdown cells in your notebooks to explain your logic and decisions. This habit makes your work easier to share and review.

Use Delta Lake time travel to pin the exact version of your training data for each experiment.
Separate feature engineering from training by creating reusable feature tables.
Enable autologging at the start of your experiments to capture all parameters and results.
Stage your models before production. Validate them against holdout data and compare with current models.
Schedule regular checks for model drift by comparing predictions with real outcomes.

Tip: Document every experiment as you go. You will save time and avoid confusion later.

Environment Management

Managing your environment ensures that your results stay consistent across different runs. You should record the versions of all libraries and dependencies you use. This practice helps you avoid surprises when you or your teammates rerun your notebooks. Use environment files or package lists to keep track of your setup. When you share your work, include these details so others can reproduce your results without issues.
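
A quick way to record versions from inside a notebook is Python's standard importlib.metadata module. The package names in the usage comment are examples only.

```python
from importlib import metadata


def snapshot_versions(packages):
    """Map each package name to its installed version, or None if it is not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def to_requirements(versions):
    """Render pinned `name==version` lines, skipping packages that are missing."""
    return "\n".join(f"{n}=={v}" for n, v in sorted(versions.items()) if v is not None)


# print(to_requirements(snapshot_versions(["mlflow", "torch", "scikit-learn"])))
```

Saving the output next to the notebook, or logging it as a run artifact, gives teammates a concrete list to install from.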

Collaboration

Team Sharing

Working as a team in Fabric is simple and effective. You can share notebooks with your colleagues and set permissions for each user. This control lets you decide who can view or edit your work. Teams can add cell-level comments to discuss code, ask questions, or suggest changes. These features help everyone stay on the same page and move projects forward together.

Share notebooks directly with your team.
Set user permissions to control access.
Use cell-level comments for feedback and discussion.

Code Review

Code review is important for quality and learning. Notebooks support version history, so you can see what changed and when. If you need to, you can roll back to a previous version. Integration with Git allows you to use source control for even better tracking. These tools make it easy to review code, spot errors, and keep your project safe.

Check version history to track changes.
Roll back to earlier versions if needed.
Use Git integration for advanced source control.

Performance Optimization

Resource Management

You can optimize performance by managing your resources wisely. Right-size your compute capacity based on your workload. Scale up during busy times and pause resources when not in use. Reserved capacity and spot workloads can help you save costs. Set budgets and alerts to avoid overspending. Monitor workload peaks to detect bottlenecks early.

Optimization Tip	Benefit
Right-size capacity	Matches resources to workload needs
Reserved/spot workloads	Reduces costs
Monitor peaks	Finds bottlenecks quickly
Set budgets/alerts	Prevents unexpected expenses

Efficient Data Handling

Efficient data handling speeds up your AI projects. Process data in-place to minimize movement and save time. Clean up your storage to improve performance. Use reusable feature tables to avoid repeating work. Parallel hyperparameter tuning lets you test many settings at once, making your experiments faster. Always track your experiments to keep your results organized.

Note: Efficient data handling and smart resource management help you get the most out of your AI projects.

Troubleshooting

When you work with Microsoft Fabric Notebooks, you may face challenges that slow down your progress. Knowing how to troubleshoot common issues helps you keep your projects on track. This section gives you practical steps for debugging and finding support when you need it.

Debugging

You can solve many problems by following a clear process. Start by checking the Monitoring Hub in the Microsoft Fabric portal. This tool shows you if any pipelines have failed or if there are errors in your workspace. If you see a problem, look at the details, such as the workspace name, activator, and eventstream source. These details help you find the root cause quickly.

Use the following steps to debug your data pipelines and notebooks:

Check the Monitoring Hub for failed jobs or errors.
Validate all connections and credentials, especially for Azure SQL or other linked services.
Use debug mode to step through your data pipelines and spot where they get stuck.
Review scheduling and trigger settings to make sure your jobs run as planned.
Monitor resource usage to see if slow performance comes from limited capacity.
Visit the Microsoft Fabric Status Page to check for service outages or updates.
Export and import pipelines using the 'Save As' feature if you need to duplicate or reset them.
Document rule IDs, conditions, timestamps, and affected objects to keep track of what happened.

Tip: When running and debugging notebooks, always keep a log of changes and actions. This habit makes it easier to trace issues and share findings with your team.

Support Resources

If you cannot fix a problem on your own, you have several support options. Microsoft provides a strong community and official help channels. You can use these resources to get answers and learn from others.

Microsoft Docs offer step-by-step guides and troubleshooting articles.
Community forums let you ask questions and get advice from both peers and Microsoft staff.
Support tickets connect you with Microsoft experts for more complex issues.
The Monitoring Hub and status pages keep you updated on ongoing problems or outages.

When you ask for help, include important details like your workspace name, eventstream source, rule IDs, and the time the issue happened. This information helps support teams respond faster and more accurately.

Note: Engaging with the community often leads to quick solutions. Many users share tips and best practices that can help you avoid similar issues in the future.

By following these troubleshooting steps and using available support resources, you can resolve most issues in Fabric notebooks and keep your projects moving forward.

You can accelerate AI model training with Microsoft Fabric Notebooks. The familiar interface, similar to Power BI, makes your workflow intuitive and productive. You work in a unified environment that combines data engineering, analytics, and AI assistance, so you avoid switching between tools. This integration streamlines every step, from data prep to deployment.

Explore these resources to deepen your skills:

Microsoft Learn for hands-on exercises
Certification programs to validate your expertise
Data Wrangler for easy data cleaning
MLflow for experiment tracking
Lakehouse for unified data storage

Start your journey today and unlock the full power of scalable, efficient AI model development.

Microsoft Fabric Notebooks Checklist

Confirm Microsoft Fabric workspace access and appropriate license
Authenticate with Microsoft Entra ID (formerly Azure Active Directory) and verify permissions for required resources
Select or configure the correct compute (Spark/Pool) and verify cluster status
Connect to data sources (OneLake, ADLS, SQL, Delta Lake) and test read/write operations
Choose appropriate kernel (Python/Scala/SQL) and install any required libraries/packages
Organize notebook into clear sections and use descriptive titles for cells
Parameterize inputs (paths, credentials, environment variables) for reusability
Add data validation and sanity checks after data ingestion
Optimize Spark configurations, caching, and data partitions for performance
Create and validate visualizations and dashboards; ensure visuals update correctly
Implement error handling and logging for critical steps
Use secure secret management (Key Vault) instead of hard-coded credentials
Verify notebook and data access controls and role assignments
Enable versioning, comments, and set collaboration guidelines
Create schedules or pipeline integration for recurring runs if needed
Write and run unit/integration tests for critical code paths
Configure monitoring, alerts, and cost tracking for compute usage
Export notebooks to desired formats (HTML, .ipynb) for sharing or archiving
Ensure notebooks and important outputs are backed up or stored in source control
Document assumptions, data lineage, dependencies, and run instructions
Conduct a security and compliance review before production deployment
Prepare a troubleshooting checklist for common failures and recovery steps
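
Several of the checklist items (parameterized inputs, validation after ingestion, logging) can be sketched together in a few lines. The example below is a minimal illustration in plain Python, not a Fabric API: the table name, column names, and the RunParams and validate_rows helpers are all hypothetical. It checks a small sample of rows before any long job starts, so a numeric column that loaded as text fails immediately.

```python
import logging
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("training_notebook")


@dataclass(frozen=True)
class RunParams:
    """Notebook inputs gathered in one place (all values are hypothetical)."""
    source_table: str = "customer_transactions"
    label_column: str = "churned"
    numeric_columns: tuple = ("order_total", "days_since_last_order")


def validate_rows(rows, params):
    """Fail fast if a required column is missing or a numeric column holds text."""
    if not rows:
        raise ValueError("no rows were ingested")
    required = set(params.numeric_columns) | {params.label_column}
    for i, row in enumerate(rows):
        missing = required - set(row)
        if missing:
            raise ValueError(f"row {i} is missing columns: {sorted(missing)}")
        for col in params.numeric_columns:
            value = row[col]
            # bool is a subclass of int, so exclude it explicitly
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise TypeError(
                    f"row {i}: column {col!r} expected a number, "
                    f"got {type(value).__name__}"
                )
    log.info("validated %d rows from %s", len(rows), params.source_table)
    return len(rows)


params = RunParams()
sample = [
    {"order_total": 42.5, "days_since_last_order": 3, "churned": 0},
    {"order_total": 10.0, "days_since_last_order": 40, "churned": 1},
]
validated = validate_rows(sample, params)
```

Running a check like this on the first few hundred rows costs seconds, which is cheap compared with discovering a type mismatch hours into a billed run.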

Notebooks in Microsoft Fabric: Learning and Use

What are Microsoft Fabric notebooks and what do they provide?

Microsoft Fabric notebooks are integrated development environments within the Fabric ecosystem that let data engineers and data scientists write code, run Apache Spark jobs, and perform data transformation and visualization. Fabric notebooks provide interactive cells for code snippets, support Python and Spark runtimes, and connect directly to lakehouses and semantic model layers for analytics and predictive workloads.

How do I create a notebook in Microsoft Fabric?

To create a notebook, open your Fabric workspace, choose the Notebooks experience, and select New Notebook. You can pick a language like Python, attach the notebook to a Fabric capacity or Spark runtime, and connect to data sources such as lakehouses for data ingestion and integration.

Can I use Apache Spark within Fabric notebooks?

Yes. Fabric notebooks run on a built-in Spark runtime, so the code in your cells can submit Apache Spark jobs directly. This makes it straightforward to run large-scale data transformations, scale job execution, and use distributed compute from the current notebook session.

How do Fabric notebooks compare with Databricks?

Both platforms support notebooks and Spark, but Fabric notebooks provide tighter integration with Microsoft Fabric features like semantic models, Power BI, lakehouses, and Fabric data engineering pipelines. Databricks focuses on being a dedicated Spark platform; Fabric emphasizes an end-to-end ecosystem that includes analytics, visualization, and governance alongside notebook activity and data integration.

Who should use Microsoft Fabric notebooks: data engineers or data scientists?

Both. Data engineers use Fabric notebooks to ingest data into a lakehouse, create pipelines, and run Spark jobs for batch transformation. Data scientists use notebooks to write code for exploratory analysis, build predictive models, and connect models to the semantic model and Power BI for visualization and reporting.

How do I run the notebook and execute a spark job?

Attach the notebook to a compute target (Fabric capacity or Spark runtime), then run cells one at a time or run the notebook end-to-end. Cells that submit distributed workloads create Apache Spark jobs; monitor notebook activity and job status from the session UI to track progress and logs.

Can I ingest data into a lakehouse using Fabric notebooks?

Yes. Use Fabric notebooks to ingest data from external data sources and perform data integration tasks. You can run data ingestion code snippets, use Fabric connectors, and write transformed data back to lakehouses for downstream analytics and the semantic model.

How do I share notebook content and results with colleagues or Power BI?

Notebook content can be exported, shared within the Fabric workspace, or referenced by Power BI for visualization. Notebooks also integrate with the semantic model so dashboards and reports can use curated datasets and predictive outputs created within the notebook.

What languages and Python libraries are supported in Fabric notebooks?

Fabric notebooks support Python, along with Scala, SQL, and R, and the libraries commonly used in data science and machine learning. You can install or reference Python packages within the notebook environment to run analysis, build models, and execute code snippets for data transformation and predictive tasks.

How does security and technical support work for Fabric notebooks?

Fabric notebooks inherit workspace security and governance settings, including role-based access and data protection for lakehouses and semantic models. Security updates and technical support are provided through Microsoft Fabric support channels and the Fabric community, with documentation and additional resources available for troubleshooting.

Are there tutorials or additional resources to learn using Fabric notebooks?

Yes. Microsoft publishes tutorials, learning paths, and documentation that cover creating notebooks, using Apache Spark, connecting to lakehouses, and building pipelines. Additional resources include community forums, Fabric data engineering guides, and sample notebooks to help you get started.

How do I manage long-running workloads and the current notebook session?

Monitor the current notebook session and notebook activity to track resource usage. For long-running or resource-intensive tasks, attach to an appropriate Fabric capacity or Spark runtime, schedule jobs through Fabric pipelines, and design your code to checkpoint intermediate results in lakehouses to avoid data loss if sessions end.
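
As a rough illustration of the checkpointing advice, here is a minimal sketch of a training loop that saves its state after every epoch and resumes from the last save. It is a toy example under stated assumptions: the "training step" is a stand-in, the train_with_checkpoints helper and file name are hypothetical, and the checkpoint goes to a temporary directory, where a real notebook would more likely write to storage in the attached lakehouse.

```python
import json
import os
import tempfile


def train_with_checkpoints(num_epochs, checkpoint_path, fail_at_epoch=None):
    """Toy training loop that resumes from the last saved epoch, if any."""
    state = {"epoch": 0, "weight": 0.0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)

    for epoch in range(state["epoch"], num_epochs):
        if fail_at_epoch is not None and epoch == fail_at_epoch:
            raise RuntimeError(f"simulated session loss at epoch {epoch}")
        state["weight"] += 0.5  # stand-in for a real training step
        state["epoch"] = epoch + 1
        # Write to a temp file, then rename, so a crash mid-write
        # never leaves a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)
    return state


checkpoint_dir = tempfile.mkdtemp()
checkpoint_file = os.path.join(checkpoint_dir, "churn_model.json")

try:
    train_with_checkpoints(5, checkpoint_file, fail_at_epoch=3)
except RuntimeError:
    pass  # the first run dies after three completed epochs

# The second run picks up at epoch 3 instead of starting over.
final_state = train_with_checkpoints(5, checkpoint_file)
```

The second call only repeats the two epochs that never completed, which is the whole point: a lost session costs you the work since the last save, not the entire run.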

Ever tried to train an AI model on your laptop only to watch it crawl for hours, or crash completely? You’re not alone. Most business datasets have outgrown our local hardware. But what if your entire multi-terabyte dataset was instantly accessible in your training notebook, with no extracts and no CSV chaos? Today, we’re stepping into Microsoft Fabric’s built-in notebooks, where your model training happens right next to your Lakehouse data. We’ll break down exactly how this setup can save days in processing time, while letting you work in Python or R without compromises.

When Big Data Outgrows Your Laptop

Imagine your laptop fan spinning loud enough to drown out your meeting as you work through a spreadsheet. Now, replace that spreadsheet with twelve terabytes of raw customer transactions, spread across years of activity, with dozens of fields per record. Even before you hit “run,” you already know this is going to hurt.

That’s exactly where a lot of marketing teams find themselves. They’ve got a transactional database that could easily be the backbone of an advanced AI project (predicting churn, segmenting audiences, personalizing campaigns in near real time), but their tools are still stuck on their desktops. They’re opening files in Excel or a local Jupyter Notebook, slicing and filtering in tiny chunks just to keep from freezing the machine, and hoping everything holds together long enough to get results they can use.

When teams try to do this locally, the cracks show quickly. Processing slows to a crawl, UI elements lag seconds behind clicks, and export scripts that once took minutes now run for hours. Even worse, larger workloads don’t just slow down; they stop. Memory errors, hard drive thrashing, or kernel restarts mean training runs don’t just take longer, they often never finish. And when you’re talking about training an AI model, that’s wasted compute, wasted time, and wasted opportunity.

One churn prediction attempt I’ve seen was billed as an “overnight run” in a local Python environment. Twenty hours later, the process finally failed because the last part of the dataset pushed RAM usage over the limit. The team lost an entire day without even getting a set of training metrics back.

If that sounds extreme, it’s becoming more common. Enterprise marketing datasets have been expanding year over year, driven by richer tracking, omnichannel experiences, and the rise of event-based logging. Even a fairly standard setup (campaign performance logs, web analytics, CRM data) can easily balloon to hundreds of gigabytes. Big accounts with multiple product lines often end up in the multi-terabyte range.

The problem isn’t just storage capacity. Large model training loads stress every limitation of a local machine. CPUs peg at 100% for extended periods, and even high-end GPUs end up idle while data trickles in too slowly. Disk input/output becomes a constant choke point, especially if the dataset lives on an external drive or network share. And then there’s the software layer: once files get large enough, even something as versatile as a Jupyter Notebook starts pushing its limits. You can’t just load “data.csv” into memory when “data.csv” is bigger than your SSD.

That’s why many teams have tried splitting files, sampling data, or building lightweight stand-ins for their real production datasets. It’s a compromise that keeps your laptop alive, but at the cost of losing insight. Sampling can drop subtle patterns that would have boosted model performance. Splitting files introduces all sorts of inconsistencies and makes retraining more painful than it needs to be.

There’s a smarter way to skip that entire download-and-import cycle. Microsoft Fabric shifts the heavy lifting off your local environment entirely. Training moves into the cloud, where compute resources sit right alongside the stored data in the Lakehouse. You’re not shuttling terabytes back and forth; you’re pushing your code to where the data already lives. Instead of worrying about which chunk of your customer history will fit in RAM, you can focus on the structure and logic of your training run. And here’s the part most teams overlook: the real advantage isn’t just the extra horsepower from cloud compute. It’s the fact that you no longer have to move the data at all.

Direct Lakehouse Access: No More CSV Chaos

What if your notebook could pull in terabytes of data instantly without ever flashing a “Downloading…” progress bar? No exporting to CSV. No watching a loading spinner creep across the screen. Just type the query, run it, and start working with the results right there. That’s the difference when the data layer isn’t an external step; it’s built into the environment you’re already coding in.

In Fabric, the Lakehouse isn’t just some separate storage bucket you connect to once in a while. It’s the native data layer for notebooks. That means your code is running in the same environment where the data physically sits. You’re not pushing millions of rows over the wire into your session. You’re sending instructions to the data at its home location. The model input pipeline isn’t a juggling act of exports and imports; it’s a direct line from storage to Spark to whatever Python or R logic you’re writing.

If you’ve been in a traditional workflow, you already know the usual pain points. Someone builds an extract from the data warehouse, writes it out to a CSV, and hands it to the data science team. Now the schema is frozen in time. The next week, the source data changes and the extract is already stale. In some cases, you even get two different teams each creating their own slightly different exports, and now you’ve got duplicated storage with mismatched definitions. Best case, that’s just inefficiency. Worst case, it’s the reason two models trained on “the same data” give contradictory predictions.

One team I worked with needed a filtered set of customer activity records for a new churn model. They pulled everything from the warehouse into a local SQL database, filtered it, then exported the result set to a CSV for the training environment. That alone took nearly a full day on their network. When new activity records were loaded the next week, they had to do the entire process again from scratch. By the time they could start actual training, they’d spent more time wrangling files than writing code.

The performance hit isn’t just about the clock time for transfers. Running transformations where the data is stored pays off consistently. When you can do the joins, filters, and aggregations in place instead of downstream, you cut out overhead, network hops, and redundant reads. Fabric notebooks tap into Spark under the hood to make that possible, so instead of pulling 400 million rows into your notebook session, Spark executes that aggregation inside the Lakehouse environment and only returns the results your model needs.

If you’re working in Python or R, you’re not starting from a bare shell either. Fabric comes with a stack of libraries already integrated for large-scale work (PySpark, pandas-on-Spark, sparklyr, and more), so distributed processing is an option from the moment you open a new notebook. That matters when you’re joining fact and dimension tables in the hundreds of gigabytes, or when you need to compute rolling windows across several years of customer history.

As soon as the query completes, the clean, aggregated dataset is ready to move directly into your feature engineering process. There’s no intermediary phase of saving to disk, checking schema, and re-importing into a local training notebook. You’ve skipped an entire prep stage. Teams used to spend days just aligning columns and re-running filters when source data changed. With this setup, they can be exploring feature combinations for the model within the same hour the raw data was updated. And that’s where it gets interesting, because once you have clean, massive datasets flowing directly into your notebook session, the way you think about building features starts to change.
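
To make the "send instructions to the data" idea concrete, here is a small sketch of a filter-and-aggregate step expressed as Spark SQL. The table name, column names, and the build_customer_summary_query helper are hypothetical. The only thing assumed about the environment is that a Fabric notebook provides a SparkSession named spark; outside such a session the last branch is simply skipped.

```python
def build_customer_summary_query(table, since):
    """Return Spark SQL that filters and aggregates before anything
    reaches the notebook. Inputs are trusted constants in this sketch."""
    return (
        "SELECT customer_id, "
        "COUNT(*) AS order_count, "
        "SUM(order_total) AS total_spend, "
        "MAX(order_date) AS last_order_date "
        f"FROM {table} "
        f"WHERE order_date >= '{since}' "
        "GROUP BY customer_id"
    )


query = build_customer_summary_query("customer_transactions", "2024-01-01")

# Fabric notebooks predefine a SparkSession named `spark`;
# anywhere else this branch does not run.
if "spark" in globals():
    customer_summary = spark.sql(query)  # lazy: nothing is computed yet
    customer_summary.show(5)             # Spark runs the job, returns 5 rows
```

The session only ever sees one summary row per customer. The raw transaction rows are read, filtered, and grouped by the Spark executors and never travel to the notebook.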

Feature Engineering and Model Selection at Scale

Your dataset might be big enough to predict just about anything, but that doesn’t mean every column in it belongs in your model. The difference between a model that produces meaningful predictions and one that spits out noise often comes down to how you select and shape your features. Scale gives you possibilities, but it also magnifies mistakes.

With massive datasets, throwing all raw fields at your algorithm isn’t just messy; it can actively erode performance. More columns mean more parameters to estimate, and more opportunities for your model to fit quirks in the training data that don’t generalize. Overfitting becomes easier, not harder, when the feature set is bloated. On top of that, every extra variable means more computation. Even in a well-provisioned cloud environment, 500 raw features will slow training, increase memory use, and complicate every downstream step compared to a lean set of 50 well-engineered ones.

The hidden cost isn’t always obvious from the clock. That “500-feature” run might finish without errors, but it could leave you with a model that’s marginally more accurate on the training data and noticeably worse on new data. When you shrink and refine those features by merging related variables, encoding categories more efficiently, or building aggregates that capture patterns instead of raw values, you cut down compute time while actually improving how well the model predicts the future.

Certain data shapes make this harder. High-cardinality features, like unique product SKUs or customer IDs, can explode into thousands of encoded columns if handled naively. Sparse data, where most fields are empty for most records, can hide useful signals but burn resources storing and processing mostly missing values. In something like customer churn prediction, you may also have temporal patterns (purchase cycles, seasonal activity, onboarding phases) that don’t show up in ordinary static fields. Feature engineering at this scale means designing transformations that condense and surface the patterns without flooding the dataset with noise.

That’s where automation and distributed processing tools start paying off. Libraries like Featuretools can automate the generation of aggregates and rolling features across large relational datasets. In Fabric, those transformations can run on Spark, so you can scale out creation of hundreds of candidate features without pulling everything into a single machine’s memory. Time-based groupings, customer-level aggregates, ratios between related metrics: all of these can be built and tested iteratively without breaking your workflow.

Once you’ve curated your feature set, model selection becomes its own balancing act. Different algorithms interact with large-scale data in different ways. Gradient boosting frameworks like XGBoost or LightGBM can handle large tabular datasets efficiently, but they still pay the cost per feature in both memory and iteration time. Logistic regression scales well and trains quickly, but it won’t capture complex nonlinear relationships unless you build those into the features yourself. Deep learning models can, in theory, discover richer patterns, but they also demand more tuning and more compute. In Fabric’s environment, you can provision that, but you’ll need to weigh whether the gains justify the training cost.

The good news is that with Fabric notebooks directly tied into your Lakehouse, you can test these strategies without the traditional bottlenecks. You can spin up multiple training runs with different feature sets and algorithms, using the same underlying data without having to reload or reshape it for each attempt. That ability to iterate quickly means you’re not locked into a guess about which approach will work best; you can measure and decide.

Well-engineered features matched to the right model architecture can cut runtimes significantly, drop memory usage, and still boost accuracy on unseen data. You get faster experimentation cycles and more reliable results, and you spend your compute budget on training that actually matters instead of processing dead weight. Next comes the step that keeps these large-scale runs productive: monitoring and evaluating them in real time so you know exactly what’s happening while the model trains in the cloud.
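
Two of the ideas in this section, condensing a high-cardinality column and building customer-level aggregates, can be sketched in plain Python on a handful of rows. The data, column names, and both helper functions are made up for illustration, and frequency encoding is just one possible choice among several. At real scale the same logic would be expressed as Spark group-by operations rather than Python loops.

```python
from collections import Counter, defaultdict

# Hypothetical raw transactions: one row per purchase.
transactions = [
    {"customer_id": "c1", "sku": "A100", "amount": 20.0},
    {"customer_id": "c1", "sku": "B200", "amount": 30.0},
    {"customer_id": "c2", "sku": "A100", "amount": 10.0},
    {"customer_id": "c2", "sku": "A100", "amount": 50.0},
]


def frequency_encode(rows, column):
    """Map each category to its share of rows: one numeric column
    instead of one encoded column per distinct SKU."""
    counts = Counter(row[column] for row in rows)
    total = len(rows)
    return {category: n / total for category, n in counts.items()}


def customer_aggregates(rows, sku_frequency):
    """Condense raw transactions into one feature row per customer."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["customer_id"]].append(row)

    features = {}
    for customer_id, items in grouped.items():
        amounts = [r["amount"] for r in items]
        features[customer_id] = {
            "order_count": len(items),
            "total_spend": sum(amounts),
            "avg_order": sum(amounts) / len(amounts),
            "avg_sku_frequency": sum(sku_frequency[r["sku"]] for r in items)
            / len(items),
        }
    return features


sku_frequency = frequency_encode(transactions, "sku")
features = customer_aggregates(transactions, sku_frequency)
```

However many distinct SKUs exist, the SKU signal stays a single numeric feature per customer, which is what keeps the feature count from exploding.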

Training, Monitoring, and Evaluating at Cloud Scale

Training on gigabytes of data sounds like the dream, until you’re sitting there wondering if the job is still running or if it quietly died an hour ago. When everything happens in the cloud, you lose the instant feedback you get from watching logs fly past in a local terminal. That’s fine if the job will finish in minutes. It’s a problem when the clock runs into hours and you have no idea whether you’re making progress.

Running training in a remote environment changes how you think about visibility. In a local session, you spot issues immediately: missing values in a field, a data type mismatch, or an import hang. On a cloud cluster, that same error might be buried in a log file you don’t check until much later. And because the resources are provisioned and billed while the process is technically “running,” every minute of a failed run is still money spent. The cost of catching a problem too late adds up quickly.

I’ve seen a churn prediction job that was kicked off on a Friday evening with an eight-hour estimate. On Monday morning, the team realized it had failed before the first epoch even started, because one column that should have been numeric loaded as text. The actual runtime? Ten wasted minutes up front, eight billed hours on the meter. That’s the kind of mistake that erodes confidence in the process and slows iteration cycles to a crawl.

Fabric tackles this with real-time job monitoring you can open alongside your notebook. You get live metrics on memory consumption, CPU usage, and progress through the training epochs. Logs stream in as the job runs, so you can spot warnings or errors before they turn into full-blown failures. If something looks off, you can halt the run right there instead of learning the hard way later.

It’s not just about watching, though. You can set up checkpoints during training so the model’s state is saved periodically. If the job stops (whether because of an error, resource limit, or intentional interruption), you can restart from the last checkpoint instead of starting from scratch. Versioning plays a role here too. By saving trained model versions with their parameters and associated data splits, you can revisit a past configuration without having to re-create the entire environment that produced it. Intermediate saves aren’t just a nice safeguard; they’re what make large-scale experimentation feasible. You can branch off a promising checkpoint and try different hyperparameters without paying the time cost of reloading and retraining the base model. With multi-gigabyte datasets, that can mean the difference between running three experiments in a day or just one.

Once the model finishes, evaluation at this scale comes with its own set of challenges. You can’t always score against the full test set in one pass without slowing things to a crawl. Balanced sampling helps here, keeping class proportions while cutting the dataset to a size that evaluates faster. When you need metrics on the full test set, distributed evaluation lets you split the scoring task across the cluster, with results aggregated automatically. Fabric supports libraries like Spark MLlib, along with distributed scikit-learn workflows, to make that possible. Instead of waiting for a single machine to run metrics on hundreds of millions of records, you can fan the task out and pull back the consolidated accuracy, precision, recall, or F1 scores in a fraction of the time. The data never leaves the Lakehouse, so you’re not dealing with test set exports or manual merges.

By the time you see the final metrics (say, a churn predictor evaluated over gigabytes of test data), you’ve also got the full training history, resource usage patterns, and any intermediate versions you saved. That’s a complete picture, without a single CSV download or a late-night “is this thing working?” moment. And when you can trust every run to be visible, recoverable, and fully evaluated at scale, the way you think about building projects in this environment starts to shift completely.
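
The reason evaluation fans out so cleanly is that the confusion-matrix counts are additive: each partition can be scored independently, and only four integers per partition have to be brought back and summed. The sketch below shows that idea in plain Python with two hand-made partitions of binary labels and predictions. It is an illustration of the principle, with hypothetical helper names, and not how MLlib or any Fabric component implements it.

```python
def confusion_counts(labels, predictions):
    """Count TP/FP/FN/TN for one partition of binary labels and predictions."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for y, p in zip(labels, predictions):
        if p == 1:
            counts["tp" if y == 1 else "fp"] += 1
        else:
            counts["fn" if y == 1 else "tn"] += 1
    return counts


def merge_counts(partials):
    """Sum per-partition counts; this is the only step that needs all results."""
    merged = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for part in partials:
        for key in merged:
            merged[key] += part[key]
    return merged


def metrics(c):
    """Derive accuracy, precision, recall, and F1 from merged counts."""
    total = sum(c.values())
    predicted_pos = c["tp"] + c["fp"]
    actual_pos = c["tp"] + c["fn"]
    precision = c["tp"] / predicted_pos if predicted_pos else 0.0
    recall = c["tp"] / actual_pos if actual_pos else 0.0
    denom = precision + recall
    return {
        "accuracy": (c["tp"] + c["tn"]) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / denom if denom else 0.0,
    }


# Each tuple stands in for one partition scored on a separate worker.
partitions = [
    ([1, 0, 1, 0], [1, 0, 0, 0]),
    ([1, 1, 0, 0], [1, 1, 1, 0]),
]
merged = merge_counts(confusion_counts(y, p) for y, p in partitions)
scores = metrics(merged)
```

Because only the counts are merged, the final scores are identical to what a single pass over the whole test set would produce, no matter how the rows are split across workers.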

Conclusion

Training right next to your data in Fabric doesn’t just make things faster—it removes the ceiling you’ve been hitting with local hardware. You can run bigger experiments, test more ideas, and actually use the full dataset instead of cutting it down to fit. That changes how quickly you can move from concept to a reliable model. If you haven’t tried it yet, spin up a small project in a Fabric Notebook with Lakehouse integration before your next major AI build. You’ll see the workflow shift immediately. In the next video, we’ll map out automated ML pipelines and deployment—without ever leaving Fabric.

Mirko Peters

Founder of m365.fm, m365.show and m365con.net

Mirko Peters is a Microsoft 365 expert, content creator, and founder of m365.fm, a platform dedicated to sharing practical insights on modern workplace technologies. His work focuses on Microsoft 365 governance, security, collaboration, and real-world implementation strategies.

Through his podcast and written content, Mirko provides hands-on guidance for IT professionals, architects, and business leaders navigating the complexities of Microsoft 365. He is known for translating complex topics into clear, actionable advice, often highlighting common mistakes and overlooked risks in real-world environments.

With a strong emphasis on community contribution and knowledge sharing, Mirko is actively building a platform that connects experts, shares experiences, and helps organizations get the most out of their Microsoft 365 investments.