AWS-CBDS-B - AWS Certified Data Analytics - Specialty

This track includes:

Building Data Lakes on AWS - 1 Day

Building Batch Data Analytics Solutions on AWS - 1 Day

Building Streaming Data Analytics Solutions on AWS - 1 Day

Building Data Analytics Solutions using Amazon Redshift - 1 Day

Exam Readiness: AWS Certified Data Analytics - 1 Day

In this course, Building Data Lakes on AWS, you will learn how to build an operational data lake that supports analysis of both structured and unstructured data. You will learn the components and functionality of the services involved in creating a data lake. You will use AWS Lake Formation to build a data lake, AWS Glue to build a data catalog, and Amazon Athena to analyze data. The course lectures and labs further your learning with the exploration of several common data lake architectures.


In this Building Batch Data Analytics Solutions on AWS course, you will learn to build batch data analytics solutions using Amazon EMR, an enterprise-grade Apache Spark and Apache Hadoop managed service. You will learn how Amazon EMR integrates with open-source projects such as Apache Hive, Hue, and HBase, and with AWS services such as AWS Glue and AWS Lake Formation. The course addresses data collection, ingestion, cataloging, storage, and processing components in the context of Spark and Hadoop. You will learn to use EMR Notebooks to support both analytics and machine learning workloads. You will also learn to apply security, performance, and cost management best practices to the operation of Amazon EMR.

In this Building Streaming Data Analytics Solutions on AWS course, you will learn to build streaming data analytics solutions using AWS services, including Amazon Kinesis and Amazon Managed Streaming for Apache Kafka (Amazon MSK). Amazon Kinesis is a massively scalable and durable real-time data streaming service. Amazon MSK offers a secure, fully managed, and highly available Apache Kafka service. You will learn how Amazon Kinesis and Amazon MSK integrate with AWS services such as AWS Glue and AWS Lambda. The course addresses the streaming data ingestion, stream storage, and stream processing components of the data analytics pipeline. You will also learn to apply security, performance, and cost management best practices to the operation of Kinesis and Amazon MSK.

In this Building Data Analytics Solutions using Amazon Redshift course, you will build a data analytics solution using Amazon Redshift, a cloud data warehouse service. The course focuses on the data collection, ingestion, cataloging, storage, and processing components of the analytics pipeline. You will learn to integrate Amazon Redshift with a data lake to support both analytics and machine learning workloads. You will also learn to apply security, performance, and cost management best practices to the operation of Amazon Redshift.

The AWS Certified Data Analytics - Specialty exam validates technical skills and experience in designing and implementing AWS services to derive value from data. This Exam Readiness course is intended for individuals with a Cloud Practitioner or Associate-level AWS certification and two or more years of experience performing complex big data analysis. The course helps you prepare for the exam by taking a deep dive into several data-driven use cases.


Duration: 5.0 days

Objectives

Building Data Lakes on AWS
  • Apply data lake methodologies in planning and designing a data lake 
  • Articulate the components and services required for building an AWS data lake 
  • Secure a data lake with appropriate permissions
  • Ingest, store, and transform data in a data lake
  • Query, analyze, and visualize data within a data lake
Building Batch Data Analytics Solutions on AWS
  • Compare the features and benefits of data warehouses, data lakes, and modern data architectures 
  • Design and implement a batch data analytics solution 
  • Identify and apply appropriate techniques, including compression, to optimize data storage 
  • Select and deploy appropriate options to ingest, transform, and store data 
  • Choose the appropriate instance and node types, clusters, auto scaling, and network topology for a particular business use case 
  • Understand how data storage and processing affect the analysis and visualization mechanisms needed to gain actionable business insights 
  • Secure data at rest and in transit 
  • Monitor analytics workloads to identify and remediate problems 
  • Apply cost management best practices
Building Streaming Data Analytics Solutions on AWS
  • Understand the features and benefits of a modern data architecture
  • Learn how AWS streaming services fit into a modern data architecture
  • Design and implement a streaming data analytics solution 
  • Identify and apply appropriate techniques, such as compression, sharding, and partitioning, to optimize data storage 
  • Select and deploy appropriate options to ingest, transform, and store real-time and near real-time data 
  • Choose the appropriate streams, clusters, topics, scaling approach, and network topology for a particular business use case 
  • Understand how data storage and processing affect the analysis and visualization mechanisms needed to gain actionable business insights 
  • Secure streaming data at rest and in transit 
  • Monitor analytics workloads to identify and remediate problems 
  • Apply cost management best practices
Building Data Analytics Solutions using Amazon Redshift
  • Compare the features and benefits of data warehouses, data lakes, and modern data architectures 
  • Design and implement a data warehouse analytics solution
  • Identify and apply appropriate techniques, including compression, to optimize data storage
  • Select and deploy appropriate options to ingest, transform, and store data 
  • Choose the appropriate instance and node types, clusters, auto scaling, and network topology for a particular business use case 
  • Understand how data storage and processing affect the analysis and visualization mechanisms needed to gain actionable business insights 
  • Secure data at rest and in transit
  • Monitor analytics workloads to identify and remediate problems
  • Apply cost management best practices
Exam Readiness Workshop: AWS Certified Data Analytics Specialty
  • Navigate the AWS Certification process 
  • Understand the content domains that will be tested in the AWS Certified Data Analytics - Specialty exam
  • Implement core AWS Big Data services according to architectural best practices 
  • Leverage tools to automate data analysis on AWS

Content

Building Data Lakes on AWS

Module 1: Introduction to data lakes
  • Describe the value of data lakes 
  • Compare data lakes and data warehouses 
  • Describe the components of a data lake 
  • Recognize common architectures built on data lakes

Module 2: Data ingestion, cataloging, and preparation
  • Describe the relationship between data lake storage and data ingestion 
  • Describe AWS Glue crawlers and how they are used to create a data catalog 
  • Identify data formatting, partitioning, and compression for efficient storage and query 
  • Lab 1: Set up a simple data lake 

Module 3: Data processing and analytics
  • Recognize how data processing applies to a data lake 
  • Use AWS Glue to process data within a data lake 
  • Describe how to use Amazon Athena to analyze data in a data lake

Module 4: Building a data lake with AWS Lake Formation
  • Describe the features and benefits of AWS Lake Formation 
  • Use AWS Lake Formation to create a data lake 
  • Understand the AWS Lake Formation security model 
  • Lab 2: Build a data lake using AWS Lake Formation

Module 5: Additional Lake Formation configurations
  • Automate AWS Lake Formation using blueprints and workflows 
  • Apply security and access controls to AWS Lake Formation 
  • Match records with AWS Lake Formation FindMatches 
  • Visualize data with Amazon QuickSight
  • Lab 3: Automate data lake creation using AWS Lake Formation blueprints
  • Lab 4: Data visualization using Amazon QuickSight

Module 6: Architecture and course review
  • Post-course knowledge check
  • Architecture review 
  • Course review 

Building Batch Data Analytics Solutions on AWS

Module A: Overview of Data Analytics and the Data Pipeline
  • Data analytics use cases
  • Using the data pipeline for analytics

Module 1: Introduction to Amazon EMR
  • Using Amazon EMR in analytics solutions
  • Amazon EMR cluster architecture
  • Interactive Demo 1: Launching an Amazon EMR cluster
  • Cost management strategies

Module 2: Data Analytics Pipeline Using Amazon EMR: Ingestion and Storage
  • Storage optimization with Amazon EMR
  • Data ingestion techniques

Module 3: High-Performance Batch Data Analytics Using Apache Spark on Amazon EMR
  • Apache Spark on Amazon EMR use cases 
  • Why Apache Spark on Amazon EMR
  • Spark concepts
  • Interactive Demo 2: Connect to an EMR cluster and perform Scala commands using the Spark shell 
  • Transformation, processing, and analytics 
  • Using notebooks with Amazon EMR 
  • Practice Lab 1: Low-latency data analytics using Apache Spark on Amazon EMR

Module 4: Processing and Analyzing Batch Data with Amazon EMR and Apache Hive
  • Using Amazon EMR with Hive to process batch data 
  • Transformation, processing, and analytics 
  • Practice Lab 2: Batch data processing using Amazon EMR with Hive 
  • Introduction to Apache HBase on Amazon EMR

Module 5: Serverless Data Processing
  • Serverless data processing, transformation, and analytics 
  • Using AWS Glue with Amazon EMR workloads
  • Practice Lab 3: Orchestrate data processing in Spark using AWS Step Functions

Module 6: Security and Monitoring of Amazon EMR Clusters
  • Securing EMR clusters 
  • Interactive Demo 3: Client-side encryption with EMRFS 
  • Monitoring and troubleshooting Amazon EMR clusters 
  • Demo: Reviewing Apache Spark cluster history

Module 7: Designing Batch Data Analytics Solutions
  • Batch data analytics use cases

Building Streaming Data Analytics Solutions on AWS

Building Data Analytics Solutions Using Amazon Redshift


Module A: Overview of Data Analytics and the Data Pipeline
  • Data analytics use cases
  • Using the data pipeline for analytics

Module 1: Using Amazon Redshift in the Data Analytics Pipeline
  • Why Amazon Redshift for data warehousing?
  • Overview of Amazon Redshift

Module 2: Introduction to Amazon Redshift
  • Amazon Redshift architecture
  • Interactive Demo 1: Touring the Amazon Redshift console
  • Amazon Redshift features
  • Practice Lab 1: Load and query data in an Amazon Redshift cluster

Module 3: Ingestion and Storage
  • Ingestion
  • Interactive Demo 2: Connecting your Amazon Redshift cluster using a Jupyter notebook with Data API 
  • Data distribution and storage
  • Interactive Demo 3: Analyzing semi-structured data using the SUPER data type 
  • Querying data in Amazon Redshift 
  • Practice Lab 2: Data analytics using Amazon Redshift Spectrum

Module 4: Processing and Optimizing Data
  • Data transformation
  • Advanced querying
  • Practice Lab 3: Data transformation and querying in Amazon Redshift
  • Resource management
  • Interactive Demo 4: Applying mixed workload management on Amazon Redshift 
  • Automation and optimization
  • Interactive Demo 5: Resizing an Amazon Redshift cluster from dc2.large to ra3.xlplus

Module 5: Security and Monitoring of Amazon Redshift Clusters
  • Securing the Amazon Redshift cluster
  • Monitoring and troubleshooting Amazon Redshift clusters

Module 6: Designing Data Warehouse Analytics Solutions
  • Data warehouse use case review
  • Activity: Designing a data warehouse analytics workflow

Module B: Developing Modern Data Architectures on AWS
  • Modern data architectures

Exam Readiness Workshop: AWS Certified Data Analytics Specialty
  • Testing center information and expectations
  • Exam overview and structure
  • Content domains and question breakdown
  • Topics and concepts within content domains
  • Question structure and interpretation techniques

Audience

Who Should Attend
  • IT business decision makers
  • Individuals who are new to working with AWS
  • Individuals responsible for designing and implementing big data solutions, namely Solutions Architects and SysOps Administrators.
  • Data Scientists and Data Analysts interested in learning about big data solutions on AWS.
  • Data architects
  • Developers
  • Solutions architects

Prerequisites

We recommend that attendees of this course have the following prerequisites:
  • Working knowledge of IT infrastructure concepts
  • Familiarity with basic finance concepts
  • Familiarity with basic IT security concepts
  • Basic familiarity with big data technologies, including Apache Hadoop, HDFS, and SQL/NoSQL querying.
  • Students should complete the Big Data Technology Fundamentals web-based training or have equivalent experience.
  • Working knowledge of core AWS services and public cloud implementation.
  • Students should complete the AWS Essentials course or have equivalent experience.
  • Basic understanding of data warehousing, relational database systems, and database design.
  • AWS Certified Cloud Practitioner or an Associate-level AWS Certification
  • Two or more years of hands-on experience performing complex big data analyses on AWS

Certification

This course is not associated with any Certification.

Course Benefits

  • Career growth
  • Broad career opportunities
  • Worldwide recognition from leaders
  • Up-to-date technical skills
  • Popular certification badges
