ATC-SREA - SRE for Architects

Site Reliability Engineering (SRE) is a set of principles and practices that supports software delivery by keeping production systems stable while still shipping new features at speed. In this course, Site Reliability Engineering (SRE): The Big Picture, you'll get a thorough overview of how SRE works and why it's a good choice for many organizations. First, you'll learn the differences between SRE, DevOps, and traditional operations. Next, you'll discover how engineering practices reduce toil and free up time for high-value tasks. Finally, you'll learn how SRE approaches monitoring and alerting, and how it manages incidents. When you're finished with this course, you'll be able to evaluate SRE and decide whether it's a good fit for your organization.

Duration: 3.0 days

Objectives

  • Understand what Site Reliability Engineering (SRE) is
  • Gain a deeper understanding of the principles and practices of SRE
  • Understand what DevOps is
  • Differentiate between DevOps and SRE
  • Understand the various tools used in automation
  • Understand the various tools used in software build and release
  • Gain hands-on experience with Jenkins, Docker, Terraform, Kubernetes, and Ansible

Content

Module 1 : SRE - Big Picture

  • History of Site Reliability Engineering
  • Introduction to SRE
  • Define Site Reliability Engineering (SRE)
  • DevOps and SRE differences

Module 2.A : Principles of SRE

  • Embracing Risk
  • Service Level Objectives
  • Eliminating Toil
  • Monitoring Distributed Systems
  • The Evolution of Automation at Google
  • Release Engineering
  • Simplicity
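
To make the Service Level Objectives and Embracing Risk topics concrete, here is a minimal sketch of an error-budget calculation. It is an illustration only, not course material: the function names `allowed_failures` and `budget_remaining` and the example numbers are assumptions, and the budget is counted in failed requests over a single measurement window.

```python
def allowed_failures(slo_target: float, total_requests: int) -> int:
    """Number of failed requests the SLO tolerates over the window.

    slo_target is the success ratio, e.g. 0.999 for "three nines".
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests < 0:
        raise ValueError("total_requests must not be negative")
    # The error budget is the share of requests that is allowed to fail.
    return round((1.0 - slo_target) * total_requests)


def budget_remaining(slo_target: float, total_requests: int, failed: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = allowed_failures(slo_target, total_requests)
    if budget == 0:
        raise ValueError("window too small to hold any error budget")
    return 1.0 - failed / budget


if __name__ == "__main__":
    # Hypothetical window: 1,000,000 requests against a 99.9% SLO.
    print(allowed_failures(0.999, 1_000_000))
    print(budget_remaining(0.999, 1_000_000, failed=250))
```

A team might use a figure like this to decide whether there is room for a risky release or whether reliability work should take priority.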

Module 2.B : Hands-on lab – Before DevOps scenario

  • Create a repository on Bitbucket
  • Clone the repository with Git and install Maven
  • Package the application manually
  • Deploy the application

Module 3 : Practices in SRE - Part 1

  • Practical Alerting
  • Being On-Call
  • Effective Troubleshooting
  • Emergency Response
  • Managing Incidents
  • Postmortem Culture: Learning from Failure
  • Tracking Outages
  • Testing for Reliability
  • Software Engineering in SRE

Module 4 : Practices in SRE - Part 2

  • Load Balancing at the Frontend
  • Load Balancing in the Datacenter
  • Handling Overload
  • Addressing Cascading Failures
  • Managing Critical State: Distributed Consensus for Reliability
  • Distributed Periodic Scheduling with Cron
  • Data Processing Pipelines
  • Data Integrity: What You Read Is What You Wrote
  • Reliable Product Launches at Scale
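
One technique commonly discussed under Handling Overload and Addressing Cascading Failures is client-side retry with capped exponential backoff and jitter, so that retries do not arrive in synchronized waves. The sketch below only computes the delay schedule. The function name `backoff_delays` and its parameters are hypothetical, and "full jitter" (a uniform draw between zero and the current ceiling) is one of several possible variants.

```python
import random
from typing import List, Optional


def backoff_delays(base: float, cap: float, attempts: int,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Return one randomized delay (in seconds) per retry attempt.

    The ceiling doubles with each attempt, starting at `base`, and never
    exceeds `cap`. Each delay is drawn uniformly from [0, ceiling].
    """
    if base <= 0 or cap <= 0 or attempts < 0:
        raise ValueError("base and cap must be positive, attempts non-negative")
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0.0, ceiling))
    return delays


if __name__ == "__main__":
    # Hypothetical settings: 100 ms base, 2 s cap, 6 retries.
    for i, delay in enumerate(backoff_delays(0.1, 2.0, 6)):
        print(f"retry {i + 1}: wait {delay:.3f}s")
```

The cap bounds the worst-case wait, and the jitter spreads retrying clients out over time so that a recovering backend is not hit by all of them at once.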

Module 5 : Containerization and Microservices

  • Monolithic application overview
  • Microservice overview and benefits
  • What is virtualization
  • What are containers
  • Virtualization and container differences
  • Kubernetes overview - orchestration of containers
  • Kubernetes architecture and Components

Module 5.B : Hands-on lab

  • Install Docker
  • Create, log in to, stop, and delete a container
  • Create an image using a Dockerfile
  • Push the image to Docker Hub
  • Deploy a Kubernetes cluster on Google Cloud
  • Deploy your own Docker image on Kubernetes
  • Expose application behind a load balancer

Module 6 : DevOps Big Picture

  • Define Waterfall model and its challenges
  • Define Agile and its advantages
  • Define DevOps
  • Differences between Agile and DevOps
  • Continuous integration and continuous deployment
  • Application development and delivery before DevOps
  • Application development and delivery after DevOps

Module 7 : SRE and DevOps differences

  • Common myths, including the idea that SRE and DevOps are the same
  • Key differences between SRE and DevOps

Module 8.A : SRE Developer Toolchain

  • Source code management tools
    • GitHub, Bitbucket, and SVN
  • Static code analysis
    • SonarQube, Fortify, Nexus IQ
  • Build tools
    • Maven, Ant, and Gradle
  • Repository tools
    • Nexus, Artifactory, cloud storage
  • Orchestration tools
    • Jenkins, Bamboo CI, Travis
  • Release management tools
    • Jira release management, UrbanCode Release, BMC RLM

Module 8.B : Hands-on lab

  • Create a CI/CD pipeline on Jenkins that automates the following tasks
    • git clone
    • mvn install
    • Code analysis with SonarQube
    • mvn compile and mvn package
    • Upload the application package to Nexus
    • Deploy the application on the same machine

Module 9.A : SRE Operations Toolchain

  • Infrastructure-as-code tools – Terraform
  • Declarative infrastructure and deployment tools
    • AWS CloudFormation
    • Google Cloud Deployment Manager
    • Azure Resource Manager
    • OpenStack Heat
  • Ops automation tools
    • Ansible - overview, architecture, and components
    • Chef - overview, architecture, and components
    • Puppet - overview, architecture, and components
    • SaltStack - overview, architecture, and components
  • Monitoring and ticketing tools
    • Application monitoring and tracing tools
      • New Relic
      • AppDynamics
      • Datadog
      • AWS X-Ray
    • Infrastructure monitoring tools
      • Nagios
      • ELK and EFK
    • Ticketing tools
    • Cloud-native monitoring tools
      • AWS CloudWatch
      • Google Stackdriver
      • Azure Monitor

Module 9.B : Hands-on lab

  • Install Terraform
  • Deploy a Kubernetes cluster using Terraform
  • Write Ansible playbooks and apply them to nodes
  • AWS X-Ray – application monitoring and tracing

Audience

-

Prerequisites

None

Course Benefits

  • Career growth
  • Broad career opportunities
  • Worldwide recognition from leaders
  • Up-to-date technical skills
  • Popular Certification Badges
