C110020G - Using Performance and Availability Metrics to Measure the Health of Services


Duration: 1.0 day

This module explains how to use performance and availability metrics to measure the health of services on IBM Cloud.


  • Understand which performance and availability metrics to use to achieve a desired outcome.
  • Learn to select tools based on the metrics needed.
  • Learn to use metrics tools for incident management.


This course is intended for learners who are pursuing professional-level site reliability engineer certification on IBM Cloud.


Module Introduction

  • Topic 1: Applying the Correct Metric for the Desired Outcome
  • Topic 2: Selecting the Correct Tool for the Desired Metric
  • Topic 3: Using Metrics Tools for Incident Management

Module Summary


Before starting this curriculum, the target audience should understand:

  • Systems thinking
  • DevOps practices
  • Cloud architecture
  • Software engineering principles
  • System administration
  • Networking and the OSI model
  • Networking and security practices for IBM Cloud
  • Incident management
  • Root cause analysis

The target audience should also be able to:

  • Write code proficiently
  • Create runbooks as a reference
  • Make system components serviceable
  • Interpret data and statistics to determine actions
  • Use LogDNA, Sysdig, Grafana, Prometheus, and Kibana
  • Interpret schematics
  • Drive incidents to resolution
  • Remediate underlying sources of unreliability
  • Create and configure VMs
  • Create and configure containers on IBM Cloud Kubernetes Service (IKS) or Red Hat OpenShift Kubernetes Service (ROKS)
  • Create and configure containers using OpenShift
  • Create and configure serverless applications
  • Configure for high availability and scalability
