AWS-CBDS-B - AWS Certified Data Analytics - Specialty

Overview

Duration: 4 days

This track includes:

Building Data Lakes on AWS - 1 Day

Building Batch Data Analytics Solutions on AWS - 1 Day

Building Data Analytics Solutions using Amazon Redshift - 1 Day

Exam Readiness: AWS Certified Data Analytics - 1 Day

In this course, Building Data Lakes on AWS, you will learn how to build an operational data lake that supports analysis of both structured and unstructured data. You will learn the components and functionality of the services involved in creating a data lake. You will use AWS Lake Formation to build a data lake, AWS Glue to build a data catalog, and Amazon Athena to analyze data. The course lectures and labs further your learning with the exploration of several common data lake architectures.


In this Building Batch Data Analytics Solutions on AWS course, you will learn to build batch data analytics solutions using Amazon EMR, an enterprise-grade Apache Spark and Apache Hadoop managed service. You will learn how Amazon EMR integrates with open-source projects such as Apache Hive, Hue, and HBase, and with AWS services such as AWS Glue and AWS Lake Formation. The course addresses data collection, ingestion, cataloging, storage, and processing components in the context of Spark and Hadoop. You will learn to use EMR Notebooks to support both analytics and machine learning workloads. You will also learn to apply security, performance, and cost management best practices to the operation of Amazon EMR.

In this Building Data Analytics Solutions using Amazon Redshift course, you will build a data analytics solution using Amazon Redshift, a cloud data warehouse service. The course focuses on the data collection, ingestion, cataloging, storage, and processing components of the analytics pipeline. You will learn to integrate Amazon Redshift with a data lake to support both analytics and machine learning workloads. You will also learn to apply security, performance, and cost management best practices to the operation of Amazon Redshift.

The AWS Certified Data Analytics - Specialty exam validates technical skills and experience in designing and implementing AWS services to derive value from data. This course is intended for individuals with a Cloud Practitioner or Associate-level AWS certification and two or more years of experience performing complex big data analysis. The course helps you prepare for the exam by taking a deep dive into several data-driven use cases.


Objectives

Building Data Lakes on AWS
  • Apply data lake methodologies in planning and designing a data lake 
  • Articulate the components and services required for building an AWS data lake 
  • Secure a data lake with appropriate permissions
  • Ingest, store, and transform data in a data lake
  • Query, analyze, and visualize data within a data lake
Building Batch Data Analytics Solutions on AWS
  • Compare the features and benefits of data warehouses, data lakes, and modern data architectures 
  • Design and implement a batch data analytics solution 
  • Identify and apply appropriate techniques, including compression, to optimize data storage 
  • Select and deploy appropriate options to ingest, transform, and store data 
  • Choose the appropriate instance and node types, clusters, auto scaling, and network topology for a particular business use case 
  • Understand how data storage and processing affect the analysis and visualization mechanisms needed to gain actionable business insights 
  • Secure data at rest and in transit 
  • Monitor analytics workloads to identify and remediate problems 
  • Apply cost management best practices
Building Data Analytics Solutions Using Amazon Redshift
  • Compare the features and benefits of data warehouses, data lakes, and modern data architectures 
  • Design and implement a data warehouse analytics solution
  • Identify and apply appropriate techniques, including compression, to optimize data storage
  • Select and deploy appropriate options to ingest, transform, and store data 
  • Choose the appropriate instance and node types, clusters, auto scaling, and network topology for a particular business use case 
  • Understand how data storage and processing affect the analysis and visualization mechanisms needed to gain actionable business insights 
  • Secure data at rest and in transit
  • Monitor analytics workloads to identify and remediate problems
  • Apply cost management best practices
Exam Readiness Workshop: AWS Certified Data Analytics Specialty
  • Navigate the AWS Certification process 
  • Understand the content domains that will be tested in the AWS Certified Data Analytics - Specialty exam
  • Implement core AWS Big Data services according to architectural best practices 
  • Leverage tools to automate data analysis on AWS

Audience

Who Should Attend
  • IT business decision makers
  • Individuals who are new to working with AWS
  • Individuals responsible for designing and implementing big data solutions, namely Solutions Architects and SysOps Administrators.
  • Data Scientists and Data Analysts interested in learning about big data solutions on AWS.
  • Data architects
  • Developers
  • Solutions architects

Content

Building Data Lakes on AWS

Module 1: Introduction to data lakes
  • Describe the value of data lakes 
  • Compare data lakes and data warehouses 
  • Describe the components of a data lake 
  • Recognize common architectures built on data lakes

Module 2: Data ingestion, cataloging, and preparation
  • Describe the relationship between data lake storage and data ingestion 
  • Describe AWS Glue crawlers and how they are used to create a data catalog 
  • Identify data formatting, partitioning, and compression for efficient storage and query 
  • Lab 1: Set up a simple data lake 

Module 3: Data processing and analytics
  • Recognize how data processing applies to a data lake 
  • Use AWS Glue to process data within a data lake 
  • Describe how to use Amazon Athena to analyze data in a data lake

Module 4: Building a data lake with AWS Lake Formation
  • Describe the features and benefits of AWS Lake Formation 
  • Use AWS Lake Formation to create a data lake 
  • Understand the AWS Lake Formation security model 
  • Lab 2: Build a data lake using AWS Lake Formation

Module 5: Additional Lake Formation configurations
  • Automate AWS Lake Formation using blueprints and workflows 
  • Apply security and access controls to AWS Lake Formation 
  • Match records with AWS Lake Formation FindMatches 
  • Visualize data with Amazon QuickSight
  • Lab 3: Automate data lake creation using AWS Lake Formation blueprints
  • Lab 4: Data visualization using Amazon QuickSight

Module 6: Architecture and course review
  • Post course knowledge check 
  • Architecture review 
  • Course review

Building Batch Data Analytics Solutions on AWS

Module A: Overview of Data Analytics and the Data Pipeline
  • Data analytics use cases
  • Using the data pipeline for analytics

Module 1: Introduction to Amazon EMR
  • Using Amazon EMR in analytics solutions
  • Amazon EMR cluster architecture
  • Interactive Demo 1: Launching an Amazon EMR cluster
  • Cost management strategies

Module 2: Data Analytics Pipeline Using Amazon EMR: Ingestion and Storage
  • Storage optimization with Amazon EMR
  • Data ingestion techniques

Module 3: High-Performance Batch Data Analytics Using Apache Spark on Amazon EMR
  • Apache Spark on Amazon EMR use cases 
  • Why Apache Spark on Amazon EMR
  • Spark concepts
  • Interactive Demo 2: Connect to an EMR cluster and perform Scala commands using the Spark shell 
  • Transformation, processing, and analytics 
  • Using notebooks with Amazon EMR 
  • Practice Lab 1: Low-latency data analytics using Apache Spark on Amazon EMR

Module 4: Processing and Analyzing Batch Data with Amazon EMR and Apache Hive
  • Using Amazon EMR with Hive to process batch data 
  • Transformation, processing, and analytics 
  • Practice Lab 2: Batch data processing using Amazon EMR with Hive 
  • Introduction to Apache HBase on Amazon EMR

Module 5: Serverless Data Processing
  • Serverless data processing, transformation, and analytics 
  • Using AWS Glue with Amazon EMR workloads
  • Practice Lab 3: Orchestrate data processing in Spark using AWS Step Functions

Module 6: Security and Monitoring of Amazon EMR Clusters
  • Securing EMR clusters 
  • Interactive Demo 3: Client-side encryption with EMRFS 
  • Monitoring and troubleshooting Amazon EMR clusters 
  • Demo: Reviewing Apache Spark cluster history

Module 7: Designing Batch Data Analytics Solutions
  • Batch data analytics use cases

Building Data Analytics Solutions Using Amazon Redshift

Module A: Overview of Data Analytics and the Data Pipeline
  • Data analytics use cases
  • Using the data pipeline for analytics

Module 1: Using Amazon Redshift in the Data Analytics Pipeline
  • Why Amazon Redshift for data warehousing?
  • Overview of Amazon Redshift

Module 2: Introduction to Amazon Redshift
  • Amazon Redshift architecture
  • Interactive Demo 1: Touring the Amazon Redshift console
  • Amazon Redshift features
  • Practice Lab 1: Load and query data in an Amazon Redshift cluster

Module 3: Ingestion and Storage
  • Ingestion
  • Interactive Demo 2: Connecting your Amazon Redshift cluster using a Jupyter notebook with Data API 
  • Data distribution and storage
  • Interactive Demo 3: Analyzing semi-structured data using the SUPER data type 
  • Querying data in Amazon Redshift 
  • Practice Lab 2: Data analytics using Amazon Redshift Spectrum

Module 4: Processing and Optimizing Data
  • Data transformation
  • Advanced querying
  • Practice Lab 3: Data transformation and querying in Amazon Redshift
  • Resource management
  • Interactive Demo 4: Applying mixed workload management on Amazon Redshift 
  • Automation and optimization
  • Interactive Demo 5: Resizing an Amazon Redshift cluster from dc2.large to ra3.xlplus

Module 5: Security and Monitoring of Amazon Redshift Clusters
  • Securing the Amazon Redshift cluster
  • Monitoring and troubleshooting Amazon Redshift clusters

Module 6: Designing Data Warehouse Analytics Solutions
  • Data warehouse use case review
  • Activity: Designing a data warehouse analytics workflow

Module B: Developing Modern Data Architectures on AWS
  • Modern data architectures

Exam Readiness Workshop: AWS Certified Data Analytics Specialty
  • Testing center information and expectations
  • Exam overview and structure
  • Content domains and question breakdown
  • Topics and concepts within content domains
  • Question structure and interpretation techniques

Prerequisites

We recommend that attendees of this course have the following prerequisites:
  • Working knowledge of IT infrastructure concepts
  • Familiarity with basic finance concepts
  • Familiarity with basic IT security concepts
  • Basic familiarity with big data technologies, including Apache Hadoop, HDFS, and SQL/NoSQL querying.
  • Students should complete the Big Data Technology Fundamentals web-based training or have equivalent experience.
  • Working knowledge of core AWS services and public cloud implementation.
  • Students should complete the AWS Essentials course or have equivalent experience.
  • Basic understanding of data warehousing, relational database systems, and database design.
  • AWS Certified Cloud Practitioner or an Associate-level AWS Certification
  • Two or more years of hands-on experience performing complex big data analyses on AWS

Certification

This course is not associated with any Certification.

Schedule

Scheduled Date           Country      Location     Fees
2022-11-14 - 2022-11-17  Philippines  Virtual ILT  PHP 60714


