DP-750T00: Implement data engineering solutions using Azure Databricks

Course Duration

4 Days

Delivery

ILT/VILT

Overview

Master end-to-end data engineering with Azure Databricks and Unity Catalog. This course moves from foundational setup to production deployment, covering environment configuration and enterprise-grade governance. Learn to build robust ingestion pipelines, implement security with Unity Catalog, and deploy optimized workloads. By the end, you will have the practical skills to implement, secure, and maintain scalable lakehouse solutions that meet rigorous enterprise requirements.

What You'll Learn

Understand the fundamentals of data engineering and big data processing on Microsoft Azure
Work with Azure Databricks to build scalable data processing solutions
Design and implement data pipelines using Apache Spark within Databricks
Perform data ingestion, transformation, and storage for large-scale datasets
Use Delta Lake for reliable and high-performance data lakes
Optimize data workflows and improve performance using Spark optimization techniques
Implement batch and real-time data processing solutions
Integrate Azure Databricks with other Azure services like Data Lake, Data Factory, and Synapse Analytics
Apply best practices for data security, governance, and monitoring
Collaborate using notebooks, version control, and shared workspaces in Databricks

Who Should Attend

The target audience is data engineers who have fundamental knowledge of data analytics concepts, a basic understanding of cloud storage, and familiarity with data organization principles. They should be comfortable working with SQL and have experience using Python, including notebooks, for data engineering tasks. Learners are expected to have a good understanding of Azure Databricks workspaces and Unity Catalog, along with familiarity with data access patterns and core data engineering and data warehouse concepts. In addition, they should have foundational knowledge of Azure security, including Microsoft Entra ID, and be familiar with Git version control fundamentals.

Prerequisites

Fundamental knowledge of data analytics concepts
Basic understanding of cloud storage concepts
Familiarity with SQL and data organization principles

Module 1: Set up and configure an Azure Databricks environment

Build a solid foundation in Azure Databricks by understanding its architecture, integrations, compute options, and data organization capabilities. Learn how Azure Databricks provides a unified platform for data engineering, analytics, and AI workloads in the cloud.

Module 2: Explore Azure Databricks

Azure Databricks is a cloud service that provides a scalable platform for data analytics using Apache Spark.

Module 3: Understand Azure Databricks architecture

Azure Databricks architecture separates control and compute planes while organizing resources through a hierarchical structure. This module explores how the account hierarchy works, the differences between serverless and classic compute planes, and the various storage options available including default storage, external storage, and Unity Catalog managed storage for organizing and governing your data.

Module 4: Understand Azure Databricks Integrations

Azure Databricks integrates with multiple Microsoft services to provide end-to-end data engineering, analytics, and AI capabilities. This module explores how Azure Databricks works with Microsoft Fabric, Power BI, Visual Studio Code, Power Platform, Copilot Studio, Microsoft Purview, and Microsoft Foundry to enable comprehensive solutions that combine data lakehouse capabilities with business intelligence, application development, and conversational AI.

Module 5: Select and Configure Compute in Azure Databricks

Azure Databricks provides multiple compute options optimized for different workloads. This module explores how to choose the right compute type, configure performance settings, manage access permissions, and install libraries. You'll learn when to use serverless versus classic compute, how to optimize clusters for cost and performance, and best practices for securing compute resources.

Module 6: Create and organize objects in Unity Catalog

Unity Catalog's three-layer namespace—catalogs, schemas, and objects—provides a flexible foundation for organizing data assets while maintaining centralized governance. This module explores how to create catalogs for environment isolation, organize schemas within those catalogs, and create tables, views, and volumes for structured and unstructured data. You'll learn to implement foreign catalogs for external database access, apply effective naming conventions, and configure AI/BI Genie instructions to enhance data discoverability.

Module 7: Secure Unity Catalog objects

Unity Catalog provides centralized governance and security for data assets in Azure Databricks. This module explores how to secure Unity Catalog objects through access control strategies, fine-grained permissions, credential management, and authentication mechanisms. You'll learn how to implement table and schema-level security, enforce row and column filtering, securely access secrets from Azure Key Vault, and authenticate data access using service principals and managed identities.

Module 8: Govern Unity Catalog objects

This module covers essential governance practices in Unity Catalog, enabling you to secure, monitor, and manage your data estate effectively. You will learn how to implement fine-grained access control, track data lineage, configure audit logs, and share data securely.

Module 9: Design and implement data modeling with Azure Databricks

Effective data modeling forms the foundation of a performant and maintainable data platform. This module explores how to design ingestion logic, select appropriate tools and table formats, implement partitioning schemes, manage slowly changing dimensions, choose appropriate data granularity, and optimize table performance through clustering strategies in Azure Databricks with Unity Catalog.
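
As an illustration of one topic named above, managing slowly changing dimensions, here is a minimal sketch of a Type 2 change written in plain Python rather than Spark. The `DimRow` class, the `apply_scd2` helper, and the customer/city columns are hypothetical names chosen for this example; they are not taken from the course material or from any Databricks API.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DimRow:
    customer_id: int           # business key (hypothetical)
    city: str                  # tracked attribute (hypothetical)
    valid_from: date
    valid_to: Optional[date]   # None means "still current"
    is_current: bool


def apply_scd2(dim: List[DimRow], customer_id: int,
               new_city: str, change_date: date) -> List[DimRow]:
    """Return a new dimension list with a Type 2 change applied."""
    result = []
    changed = False
    seen = False
    for row in dim:
        if row.customer_id == customer_id and row.is_current:
            seen = True
            if row.city != new_city:
                # Close the current version instead of overwriting it.
                result.append(replace(row, valid_to=change_date, is_current=False))
                changed = True
                continue
        result.append(row)
    if changed or not seen:
        # Open a new current version (this also covers brand-new keys).
        result.append(DimRow(customer_id, new_city, change_date, None, True))
    return result


dim = [DimRow(1, "Oslo", date(2023, 1, 1), None, True)]
dim = apply_scd2(dim, 1, "Bergen", date(2024, 6, 1))
```

After the call, the dimension holds two rows for customer 1: the closed "Oslo" version and a current "Bergen" version. History is preserved because the old row is end-dated rather than updated in place.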

Module 10: Ingest data into Unity Catalog

Data ingestion is a fundamental capability for any data platform. This module explores the comprehensive set of techniques available in Azure Databricks for loading data into Unity Catalog tables. You'll learn how to use managed connectors with Lakeflow Connect, write custom ingestion code in notebooks, apply SQL commands for batch file loading, process change data capture feeds, configure streaming ingestion from message buses, set up Auto Loader for automatic file detection, and orchestrate ingestion workflows with Lakeflow Spark Declarative Pipelines.

Module 11: Cleanse, transform, and load data into Unity Catalog

Data engineering requires transforming raw data into clean, well-structured formats ready for analysis. This module explores techniques for profiling data quality, selecting appropriate column types, resolving duplicates and null values, applying filtering and aggregation transformations, combining datasets with joins and set operators, reshaping data through pivoting and denormalization, and loading transformed data using append, overwrite, and merge strategies.
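
The difference between the append, overwrite, and merge load strategies mentioned above can be sketched in plain Python, with lists of dictionaries standing in for tables. The `deduplicate` and `load` helpers and the `id`/`qty` columns are assumptions made for this illustration; in Azure Databricks the same ideas would normally be expressed with Spark or SQL.

```python
def deduplicate(rows, key):
    """Keep the last row seen for each key value."""
    latest = {}
    for row in rows:
        latest[row[key]] = row
    return list(latest.values())


def load(target, source, key, mode):
    """Combine source rows into target using the named strategy."""
    if mode == "append":
        return target + source
    if mode == "overwrite":
        return list(source)
    if mode == "merge":
        merged = {row[key]: row for row in target}
        for row in source:
            merged[row[key]] = row  # update if matched, insert if not
        return list(merged.values())
    raise ValueError(f"unknown mode: {mode}")


target = [{"id": 1, "qty": 5}, {"id": 2, "qty": 7}]
incoming = [{"id": 2, "qty": 9}, {"id": 2, "qty": 10}, {"id": 3, "qty": None}]

# Cleanse: drop duplicate ids, then replace missing quantities with 0.
clean = [{**r, "qty": r["qty"] if r["qty"] is not None else 0}
         for r in deduplicate(incoming, "id")]

merged = load(target, clean, "id", "merge")
```

In this sketch, merge updates the row with `id` 2 and inserts the row with `id` 3, while append would leave two rows with `id` 2 in the target and overwrite would discard the row with `id` 1.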

Module 12: Implement and manage data quality constraints with Azure Databricks

This module explores strategies for maintaining high data quality in Azure Databricks. You will learn how to implement validation checks, enforce schemas, manage schema drift, and use pipeline expectations to ensure data integrity throughout your data pipelines.
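
To make the idea of validation checks and pipeline expectations concrete, the following plain-Python sketch applies named rules to each row and either keeps, drops, or rejects rows that violate them. The `apply_expectations` function, its `on_violation` parameter, and the rule names are hypothetical; this is an analogy for the concept, not the Lakeflow expectations API.

```python
def apply_expectations(rows, expectations, on_violation="drop"):
    """Validate rows against named predicates.

    on_violation: "warn" keeps failing rows, "drop" removes them,
    "fail" raises on the first failing row.
    Returns (kept_rows, violation_counts).
    """
    if on_violation not in ("warn", "drop", "fail"):
        raise ValueError(f"unknown action: {on_violation}")
    counts = {name: 0 for name in expectations}
    kept = []
    for row in rows:
        failed = [name for name, check in expectations.items() if not check(row)]
        for name in failed:
            counts[name] += 1
        if failed and on_violation == "fail":
            raise ValueError(f"row {row!r} violated: {', '.join(failed)}")
        if failed and on_violation == "drop":
            continue
        kept.append(row)
    return kept, counts


# Hypothetical rules for an orders feed.
expectations = {
    "id_not_null": lambda r: r.get("id") is not None,
    "amount_non_negative": lambda r: isinstance(r.get("amount"), (int, float))
    and r["amount"] >= 0,
}
rows = [
    {"id": 1, "amount": 10.0},
    {"id": None, "amount": 5.0},
    {"id": 3, "amount": -2},
]

kept, counts = apply_expectations(rows, expectations)
```

Returning the violation counts alongside the surviving rows mirrors the common practice of recording data quality metrics even when bad records are filtered out.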

Module 13: Design and implement data pipelines with Azure Databricks

Learn to design and implement robust data pipelines in Azure Databricks using notebooks and Lakeflow Spark Declarative Pipelines, covering orchestration, error handling, and task logic.

Module 14: Implement Lakeflow Jobs with Azure Databricks

This module guides you through the process of implementing Lakeflow Jobs in Azure Databricks. You will learn how to create jobs, configure triggers and schedules, set up alerts, and manage automatic restarts to ensure reliable data pipeline execution.

Module 15: Implement development lifecycle processes in Azure Databricks

Azure Databricks integrates with established development practices through Git folders for version control and Databricks Asset Bundles for infrastructure-as-code deployments. This module explores Git version control best practices, branching and pull request workflows, comprehensive testing strategies, and CLI-based bundle deployment across environments.

Module 16: Monitor, troubleshoot and optimize workloads in Azure Databricks

Monitoring and optimization are essential for running reliable, cost-effective data workloads in Azure Databricks. This module explores cluster consumption metrics, Lakeflow Jobs troubleshooting, Spark job diagnostics, performance optimization for caching, skew, spill, and shuffle issues, and log streaming to Azure Log Analytics.

Frequently Asked Questions (FAQs)

Why get Microsoft certified?

Microsoft certifications validate your skills and expertise in Microsoft technologies and solutions, demonstrating your ability to design, implement, and manage cutting-edge technologies.
These certifications are globally recognized and highly sought after by employers, as they signify your proficiency in using Microsoft products and services to drive innovation and solve business challenges.
Microsoft-certified professionals are in high demand, opening doors to new career opportunities and higher earning potential.

What should you expect from the examination?

Microsoft certification exams are designed to assess your knowledge and skills in specific Microsoft technologies and solutions.
Exams typically consist of multiple-choice, multiple-select, and case study questions, and some may include lab simulations to evaluate your practical skills.
Note: Certification requirements and policies may be updated by Microsoft from time to time. We apologize for any discrepancies; do get in touch with us if you have any questions.

How long is a Microsoft certification valid?

Most Microsoft role-based and specialty certifications are valid for one year from the date of passing the exam.
To maintain your certification, you will need to renew it annually by passing a free online assessment on Microsoft Learn.
However, Microsoft Applied Skills credentials and Fundamentals certifications do not expire.

Why take this course with Trainocate?

Here’s what sets us apart:
- Global Reach, Localized Accessibility: Benefit from our geographically diverse training hubs in 24 countries (and counting!).
- Top-Rated Instructors: Our team of subject matter experts (with high average CSAT and MTM scores) is passionate about helping you accelerate your digital transformation.
- Customized Training Solutions: Choose from on-site, virtual classrooms, or self-paced learning to fit your organization and individual needs.
- Experiential Learning: Dive into interactive training with our curated lesson plans. Participate in hands-on labs, solve real-world challenges, and take on comprehensive assessments.
- Learn From The Best: With 30+ authorized training partnerships and numerous awards from Microsoft, AWS, and Google, you learn from the industry's elite.
- Your Bridge To Success: We provide up-to-date course materials, helpful exam guides, and dedicated support to validate your expertise and elevate your career.