Data Engineering on AWS is a 3-day intermediate course designed for professionals seeking a deep dive into data engineering practices and solutions on AWS. Through a balanced combination of theory, practical labs, and activities, participants learn to design, build, optimize, and secure data engineering solutions using AWS services. From foundational concepts to hands-on implementation of data lakes, data warehouses, and both batch and streaming data pipelines, this course equips data professionals with the skills needed to architect and manage modern data solutions at scale.

What You'll Learn

  • Understand the foundational roles and key concepts of data engineering, including data personas, data discovery, and relevant AWS services.
  • Identify and explain the various AWS tools and services crucial for data engineering, encompassing orchestration, security, monitoring, CI/CD, IaC, networking, and cost optimization.
  • Design and implement a data lake solution on AWS, including storage, data ingestion, transformation, and serving data for consumption.
  • Optimize and secure a data lake solution by implementing open table formats, security measures, and troubleshooting common issues.
  • Design and set up a data warehouse using Amazon Redshift Serverless, understanding its architecture, data ingestion, processing, and serving capabilities.
  • Apply performance optimization techniques to data warehouses in Amazon Redshift, including monitoring, data optimization, query optimization, and orchestration.
  • Manage security and access control for data warehouses in Amazon Redshift, understanding authentication, data security, auditing, and compliance.
  • Design effective batch data pipelines using appropriate AWS services for processing and transforming data.
  • Implement comprehensive strategies for batch data pipelines, covering data processing, transformation, integration, cataloging, and serving data for consumption.
  • Optimize, orchestrate, and secure batch data pipelines, demonstrating advanced skills in data processing automation and security.
  • Architect streaming data pipelines, understanding various use cases, ingestion, storage, processing, and analysis using AWS services.
  • Optimize and secure streaming data solutions, including compliance considerations and access control.

Who Should Attend

This course is designed for professionals who are interested in designing, building, optimizing, and securing data engineering solutions using AWS services.

Prerequisites

  • Familiarity with basic machine learning concepts, such as supervised and unsupervised learning, regression, classification, and clustering algorithms. 
  • Working knowledge of Python programming language and common data science libraries like NumPy, Pandas, and Scikit-learn. 
  • Basic understanding of cloud computing concepts and familiarity with the AWS platform. 
  • Familiarity with SQL and relational databases is recommended but not mandatory. 
  • Experience with version control systems like Git is beneficial but not required.

Day 1

Module 1: Data Engineering Roles and Key Concepts

  • Role of a Data Engineer
  • Key functions of a Data Engineer
  • Data Personas
  • Data Discovery
  • AWS Data Services

Module 2: AWS Data Engineering Tools and Services

  • Orchestration and Automation
  • Data Engineering Security
  • Monitoring
  • Continuous Integration and Continuous Delivery
  • Infrastructure as Code
  • AWS Serverless Application Model
  • Networking Considerations
  • Cost Optimization Tools

Module 3: Designing and Implementing Data Lakes

  • Hands-on lab: Setting up a Data Lake on AWS
  • Data lake introduction
  • Data lake storage
  • Ingest data into a data lake
  • Catalog data
  • Transform data
  • Serve data for consumption

Module 4: Optimizing and Securing a Data Lake Solution

  • Open Table Formats
  • Security using AWS Lake Formation
  • Setting permissions with Lake Formation
  • Security and governance
  • Troubleshooting
  • Hands-on lab: Automating Data Lake Creation using AWS Lake Formation Blueprints

Day 2

Module 5: Data Warehouse Architecture and Design Principles

  • Hands-on lab: Setting up a Data Warehouse using Amazon Redshift Serverless
  • Introduction to data warehouses
  • Amazon Redshift Overview
  • Ingesting data into Redshift
  • Processing data
  • Serving data for consumption

Module 6: Performance Optimization Techniques for Data Warehouses

  • Monitoring and optimization options
  • Data optimization in Amazon Redshift
  • Query optimization in Amazon Redshift
  • Orchestration options

Module 7: Security and Access Control for Data Warehouses

  • Hands-on lab: Managing Access Control in Redshift
  • Authentication and access control in Amazon Redshift
  • Data security in Amazon Redshift
  • Auditing and compliance in Amazon Redshift

Module 8: Designing Batch Data Pipelines

  • Introduction to batch data pipelines
  • Designing a batch data pipeline
  • AWS services for batch data processing

Module 9: Implementing Strategies for Batch Data Pipeline

  • Hands-on lab: A Day in the Life of a Data Engineer
  • Elements of a batch data pipeline
  • Processing and transforming data
  • Integrating and cataloging your data
  • Serving data for consumption

Day 3

Module 10: Optimizing, Orchestrating, and Securing Batch Data Pipelines

  • Hands-on lab: Orchestrating Data Processing in Spark using AWS Step Functions
  • Optimizing the batch data pipeline
  • Orchestrating the batch data pipeline
  • Securing the batch data pipeline

Module 11: Streaming Data Architecture Patterns

  • Hands-on lab: Streaming Analytics with Amazon Managed Service for Apache Flink
  • Introduction to streaming data pipelines
  • Ingesting data from stream sources
  • Streaming data ingestion services
  • Storing streaming data
  • Processing Streaming Data
  • Analyzing Streaming Data with AWS Services

Module 12: Optimizing and Securing Streaming Solutions

  • Hands-on lab: Access Control with Amazon Managed Streaming for Apache Kafka
  • Optimizing a streaming data solution
  • Securing a streaming data pipeline
  • Compliance considerations

Frequently Asked Questions (FAQs)

  • Why get AWS certified?

    AWS certifications validate your expertise in cloud computing and your proficiency in using AWS services.

    These certifications are globally recognized and highly sought after by employers, as they demonstrate your ability to design, deploy, and manage scalable and secure cloud solutions on the AWS platform.

    AWS-certified professionals are in high demand, opening doors to new career opportunities and higher earning potential.

  • What should I expect from the examination?

    AWS offers a variety of certification exams at different levels (Foundational, Associate, Professional, and Specialty) covering various domains and services.

    The exams typically consist of multiple-choice and multiple-response questions, and some may include scenario-based questions that assess your ability to apply your knowledge in real-world situations.

    Note: Certification requirements and policies may be updated by AWS from time to time. We apologize for any discrepancies; do get in touch with us if you have any questions.

  • How long is an AWS certification valid?

    AWS certifications are valid for three years from the date of passing the exam.

    To maintain your certification, you will need to recertify by passing the latest version of the exam or by completing the AWS Cloud Quest game-based training (if that option is applicable).

  • Why take this course with Trainocate?

    Here’s what sets us apart:

    - Global Reach, Localized Accessibility: Benefit from our geographically diverse training hubs in 24 countries (and counting!).

    - Top-Rated Instructors: Our team of subject matter experts (with high average CSAT and MTM scores) is passionate about helping you accelerate your digital transformation.

    - Customized Training Solutions: Choose from on-site, virtual classrooms, or self-paced learning to fit your organization and individual needs.

    - Experiential Learning: Dive into interactive training with our curated lesson plans. Participate in hands-on labs, solve real-world challenges, and take on comprehensive assessments.

    - Learn From The Best: With 30+ authorized training partnerships and numerous awards from Microsoft, AWS, and Google, you learn from the industry's elite.

    - Your Bridge To Success: We provide up-to-date course materials, helpful exam guides, and dedicated support to validate your expertise and elevate your career.
