This course introduces the essential concepts and skills needed to build data pipelines with Lakeflow Declarative Pipelines in Databricks, covering incremental batch and streaming ingestion and processing across multiple streaming tables and materialized views. Designed for data engineers new to Lakeflow Declarative Pipelines, the course provides a comprehensive overview of core components such as incremental data processing, streaming tables, materialized views, and temporary views, highlighting their specific purposes and differences.
Topics covered include:
- Developing and debugging ETL pipelines with the multi-file editor in Lakeflow using SQL (with Python code examples provided)
- How Lakeflow Declarative Pipelines track data dependencies through the pipeline graph (see the sketch after this list)
- Configuring pipeline compute resources, data assets, trigger modes, and other advanced options
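To make these topics concrete, here is a minimal sketch of the kind of SQL the course works with: a streaming table that incrementally ingests raw files and a materialized view defined on top of it. The table names and volume path are illustrative assumptions, not course materials; Lakeflow derives the pipeline graph from the table references in these definitions.

```sql
-- Bronze: incrementally ingest raw JSON files into a streaming table.
-- The volume path below is a hypothetical example.
CREATE OR REFRESH STREAMING TABLE orders_bronze
AS SELECT *
FROM STREAM read_files(
  '/Volumes/main/demo/raw_orders/',
  format => 'json'
);

-- Silver: a materialized view aggregated from the bronze table.
-- The reference to orders_bronze is what creates the edge
-- orders_bronze -> orders_by_status in the pipeline graph.
CREATE OR REFRESH MATERIALIZED VIEW orders_by_status
AS SELECT status, COUNT(*) AS order_count
FROM orders_bronze
GROUP BY status;
```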
Next, the course introduces data quality expectations in Lakeflow, guiding users through the process of integrating expectations into pipelines to validate and enforce data integrity. Learners then explore how to put a pipeline into production, including scheduling options and enabling pipeline event logging to monitor pipeline performance and health.
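As a flavor of what expectations look like, the sketch below (with illustrative table and column names) attaches two expectations to a streaming table: one drops offending rows, while the other only records violations in pipeline metrics.

```sql
-- A streaming table with data quality expectations (illustrative names).
CREATE OR REFRESH STREAMING TABLE orders_silver (
  -- Rows with a NULL order_id are dropped from the target table.
  CONSTRAINT valid_order_id EXPECT (order_id IS NOT NULL) ON VIOLATION DROP ROW,
  -- Violations here are only tracked in the pipeline event log and metrics.
  CONSTRAINT non_negative_amount EXPECT (amount >= 0)
)
AS SELECT * FROM STREAM(orders_bronze);
```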
Finally, the course covers how to implement Change Data Capture (CDC) using the AUTO CDC INTO syntax within Lakeflow Declarative Pipelines to manage slowly changing dimensions (SCD Type 1 and Type 2), preparing users to integrate CDC into their own pipelines.
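For orientation, a minimal sketch of that pattern is shown below, using hypothetical source, target, and column names; the flow applies a CDC feed into a streaming table and keeps full history as SCD Type 2.

```sql
-- Target streaming table for the CDC flow (illustrative names throughout).
CREATE OR REFRESH STREAMING TABLE customers;

-- Apply the CDC feed with AUTO CDC INTO, storing history as SCD Type 2.
CREATE FLOW customers_cdc_flow AS AUTO CDC INTO customers
FROM STREAM(customers_cdc_bronze)
KEYS (customer_id)
APPLY AS DELETE WHEN operation = 'DELETE'
SEQUENCE BY sequence_num
COLUMNS * EXCEPT (operation, sequence_num)
STORED AS SCD TYPE 2;
```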
What You'll Learn
- Introduction to Data Engineering in Databricks
- Lakeflow Declarative Pipeline Fundamentals
- Building Lakeflow Declarative Pipelines
Who Should Attend
This course is designed for professionals who:
- Are data engineers or ETL/streaming specialists who need to build incremental-batch or streaming data pipelines using the Lakeflow Declarative Pipelines framework on the Databricks Lakehouse Platform.
- Are tasked with ingesting, transforming, and delivering data via streaming tables, materialized views, and temporary views, and with managing Change Data Capture (CDC) using Lakeflow, as described in the course outline.
- Want to accelerate development of production-ready data pipelines by leveraging declarative abstractions (SQL or Python), pipeline graphs for dependency tracking, built-in data quality expectations, and pipeline scheduling and monitoring in Databricks.
- Have intermediate-level working experience with SQL and familiarity with the Databricks workspace, Apache Spark, Delta Lake, the Medallion Architecture, and Unity Catalog (as indicated in the prerequisites).
- Are part of teams modernizing data engineering workflows from custom imperative pipelines toward declarative pipeline frameworks, aiming to improve the reliability, maintainability, and observability of their data flow architecture.
Prerequisites
- Basic understanding of the Databricks Data Intelligence Platform, including Databricks Workspaces, Apache Spark, Delta Lake, the Medallion Architecture, and Unity Catalog.
- Experience ingesting raw data into Delta tables, including using the read_files SQL function to load formats such as CSV, JSON, TXT, and Parquet (a brief example follows this list).
- Proficiency in transforming data using SQL, including writing intermediate-level queries and a basic understanding of SQL joins.
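For reference, here is a minimal example of the read_files pattern mentioned above; the table name and volume path are hypothetical.

```sql
-- Load raw JSON files into a Delta table using the read_files
-- table-valued function (path and table name are illustrative).
CREATE OR REPLACE TABLE orders_raw AS
SELECT *
FROM read_files(
  '/Volumes/main/demo/raw_orders/',
  format => 'json'
);
```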
Learning Journey
Coming Soon...