This course provides a comprehensive guide to managing data privacy within Databricks. It covers key topics such as Delta Lake architecture, regional data isolation, GDPR/CCPA compliance, and Change Data Feed (CDF) usage. Through practical demos and hands-on labs, participants learn to use Unity Catalog features to secure sensitive data and meet compliance requirements, equipping them to safeguard data integrity effectively.

What You'll Learn

  • Storing Data Securely
  • Unity Catalog
  • PII Data Security
  • Streaming Data and CDF

Who Should Attend

  • Data engineers, data governance specialists, security architects, and compliance professionals responsible for implementing data-privacy controls on the Databricks Lakehouse.
  • Professionals managing sensitive data (including PII) who need to apply masking, anonymization, encryption, and fine-grained access policies.
  • Individuals working with Unity Catalog, Delta Lake, audit logs and data-governance tooling to meet organisational or regulatory privacy requirements.
  • Practitioners with SQL, PySpark and Databricks experience seeking to enhance their skills in privacy-centred data-pipeline design.
  • Teams transitioning from manual or ad hoc permission models to governed, scalable, policy-driven data-privacy frameworks.

Prerequisites

  • Ability to perform basic code development tasks using the Databricks Data Engineering & Data Science workspace (create clusters, run code in notebooks, use basic notebook operations, import repos from Git, etc.)
  • Intermediate programming experience with PySpark, including the ability to:
      • Extract data from a variety of file formats and data sources
      • Apply a number of common transformations to clean data
      • Reshape and manipulate complex data using advanced built-in functions
  • Intermediate programming experience with Delta Lake (create tables, perform complete and incremental updates, compact files, restore previous versions, etc.)
  • Beginner experience configuring and scheduling data pipelines using the Delta Live Tables (DLT) UI
  • Beginner experience defining Delta Live Tables pipelines using PySpark, including the ability to:
      • Ingest and process data using Auto Loader and PySpark syntax
      • Process Change Data Capture feeds with APPLY CHANGES INTO syntax (see the sketch after this list)
      • Review pipeline event logs and results to troubleshoot DLT syntax
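
For readers new to the APPLY CHANGES INTO pattern mentioned above, here is a minimal DLT sketch in Python, assuming a hypothetical CDC feed table customers_bronze with customer_id, sequence_num, and operation columns:

```python
import dlt
from pyspark.sql.functions import col, expr

# Target streaming table that the CDC feed keeps in sync.
dlt.create_streaming_table("customers_silver")

# Apply inserts, updates, and deletes from the (hypothetical) bronze feed.
dlt.apply_changes(
    target="customers_silver",
    source="customers_bronze",
    keys=["customer_id"],                      # primary key column(s)
    sequence_by=col("sequence_num"),           # orders events per key
    apply_as_deletes=expr("operation = 'DELETE'"),
    except_column_list=["operation", "sequence_num"],  # drop CDC metadata
)
```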

Learning Journey

Coming Soon...

Module 1. Course Introduction

1.1. Storing Data Securely

  • Regulatory Compliance
  • Data Privacy

1.2. Unity Catalog

  • Key Concepts and Components
  • Audit Your Data
  • Data Isolation
  • Securing Data in Unity Catalog
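
As a taste of the access-control patterns covered in this module, here is a minimal sketch, assuming a hypothetical main.hr.employees table and hypothetical data_analysts and hr_admins account groups:

```python
# Grant read access on a table to an account-level group.
spark.sql("GRANT SELECT ON TABLE main.hr.employees TO `data_analysts`")

# Dynamic view that redacts a PII column for everyone except a
# privileged group, using the built-in is_account_group_member().
spark.sql("""
    CREATE OR REPLACE VIEW main.hr.employees_redacted AS
    SELECT
      employee_id,
      CASE
        WHEN is_account_group_member('hr_admins') THEN email
        ELSE 'REDACTED'
      END AS email
    FROM main.hr.employees
""")
```

A dynamic view like this lets one table serve both privileged and non-privileged readers without duplicating data.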

1.3. PII Data Security

  • Pseudonymization & Anonymization
  • Summary & Best Practices
  • PII Data Security
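
One pseudonymization technique this module works through is salted hashing of identifier columns. Here is a minimal PySpark sketch, assuming a hypothetical main.crm.customers table with an email column:

```python
from pyspark.sql.functions import sha2, concat, col, lit

df = spark.table("main.crm.customers")

# In practice the salt would come from a secret scope
# (e.g. dbutils.secrets.get(...)), never a hard-coded literal.
salt = "example-salt"

# Replace the raw identifier with a salted SHA-256 token.
pseudonymized = (
    df.withColumn("email_pseudo", sha2(concat(lit(salt), col("email")), 256))
      .drop("email")
)

pseudonymized.write.mode("overwrite").saveAsTable("main.crm.customers_pseudo")
```

Because the salt is secret, equal inputs map to the same token (so joins still work), but the mapping cannot be reversed without the salt; true anonymization goes further and breaks the linkage entirely.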

1.4. Streaming Data and CDF

  • Capturing Changed Data
  • Deleting Data in Databricks
  • Processing Records from CDF and Propagating Changes
  • Propagating Changes with CDF Lab
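
To preview the CDF pattern used throughout this module: once the delta.enableChangeDataFeed table property is set, row-level changes can be read as a batch or stream. A minimal sketch, assuming a hypothetical main.crm.customers_silver table with CDF enabled:

```python
# Read the Change Data Feed starting from a given table version.
changes = (
    spark.read.format("delta")
         .option("readChangeFeed", "true")
         .option("startingVersion", 1)
         .table("main.crm.customers_silver")
)

# Each row carries _change_type ('insert', 'update_preimage',
# 'update_postimage', 'delete'), _commit_version, and _commit_timestamp.
deletes = changes.filter("_change_type = 'delete'")
deletes.show()
```

Filtering on _change_type = 'delete' like this is the mechanism by which downstream tables learn about delete requests (e.g. GDPR/CCPA) and propagate them.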

Keep Exploring

  • Course Curriculum
  • Training Schedule
  • Exam & Certification
  • Frequently Asked Questions

Improve yourself and your career by taking this course.

Ready to Take Your Business from Great to Awesome?

Level up by partnering with Trainocate. Get in touch today.
