C110019G - Performing the Post-incident Review

Overview

Duration: 1.0 day

This module explains how to perform post-incident reviews and the role of the Site Reliability Engineer (SRE) in this process.

Objectives

  • Understand the SRE role in post-incident reviews
  • Learn how to perform post-incident reviews

Audience

This course is intended for learners who are pursuing professional-level site reliability engineer certification on IBM Cloud.

Content

Module Introduction

  • Topic 1: Performing the Post-incident Review

Module Summary

Prerequisites

Before starting this curriculum, the target audience should understand:

  • Systems thinking
  • DevOps practices
  • Cloud architecture
  • Software engineering principles
  • System administration
  • Networking and the OSI model
  • Networking and security practices for IBM Cloud
  • Incident management
  • Root cause analysis

The target audience should also be able to:

  • Write code proficiently
  • Create run books as a reference
  • Make system components serviceable
  • Interpret data and statistics to determine actions
  • Use LogDNA, Sysdig, Grafana, Prometheus, and Kibana
  • Interpret schematics
  • Drive incidents to resolution
  • Remediate underlying sources of unreliability
  • Create and configure VMs
  • Create and configure containers on IBM Kubernetes Service (IKS)/Red Hat OpenShift Kubernetes Services (ROKS)
  • Create and configure containers using OpenShift
  • Create and configure serverless applications
  • Configure for high availability and scalability
