LMOGC - Logging, Monitoring and Observability in Google Cloud

Overview

Duration: 3.0 days
This training course teaches participants techniques for monitoring, troubleshooting, and improving infrastructure and application performance in Google Cloud. Guided by the principles of Site Reliability Engineering (SRE), and using a combination of presentations, demos, hands-on labs, and real-world case studies, attendees gain experience with full-stack monitoring, real-time log management and analysis, debugging code in production, tracing application performance bottlenecks, and profiling CPU and memory usage.

Objectives

This course teaches participants the following skills:
  • Plan and implement a well-architected logging and monitoring infrastructure
  • Define Service Level Indicators (SLIs) and Service Level Objectives (SLOs)
  • Create effective monitoring dashboards and alerts
  • Monitor, troubleshoot, and improve Google Cloud infrastructure
  • Analyze and export Google Cloud audit logs
  • Find production code defects, identify bottlenecks, and improve performance
  • Optimize monitoring costs

Audience

This class is intended for the following participants:

  • Cloud architects, administrators, and SysOps personnel
  • Cloud developers and DevOps personnel

Content

The course includes presentations, demonstrations, and hands-on labs.

Module 1: Introduction to Google Cloud Monitoring Tools
  • Understand the purpose and capabilities of Google Cloud operations-focused components: Logging, Monitoring, Error Reporting, and Service Monitoring.
  • Understand the purpose and capabilities of Google Cloud application performance management components: Debugger, Trace, and Profiler.
Module 2: Avoiding Customer Pain
  • Build a monitoring foundation on the four golden signals: latency, traffic, errors, and saturation.
  • Measure customer pain with SLIs.
  • Define critical performance measures.
  • Create and use SLOs and SLAs.
  • Achieve harmony between development and operations teams with error budgets.
Module 3: Alerting Policies
  • Develop alerting strategies.
  • Define alerting policies.
  • Add notification channels.
  • Identify types of alerts and common uses for each.
  • Construct and alert on resource groups.
  • Manage alerting policies programmatically.
Module 4: Monitoring Critical Systems
  • Choose best-practice monitoring project architectures.
  • Differentiate Cloud IAM roles for monitoring.
  • Use the default dashboards appropriately.
  • Build custom dashboards to show resource consumption and application load.
  • Define uptime checks to track aliveness and latency.
Module 5: Configuring Google Cloud Services for Observability
  • Integrate logging and monitoring agents into Compute Engine VMs and images.
  • Enable and utilize Kubernetes Monitoring.
  • Extend and clarify Kubernetes monitoring with Prometheus.
  • Expose custom metrics through code, and with the help of OpenCensus.
Module 6: Advanced Logging and Analysis
  • Identify and choose among resource tagging approaches.
  • Define log sinks (inclusion filters) and exclusion filters.
  • Create metrics based on logs.
  • Define custom metrics.
  • Link application errors to Logging using Error Reporting.
  • Export logs to BigQuery.
Module 7: Monitoring Network Security and Audit Logs
  • Collect and analyze VPC Flow logs and Firewall Rules logs.
  • Enable and monitor Packet Mirroring.
  • Explain the capabilities of Network Intelligence Center.
  • Use Admin Activity audit logs to track changes to the configuration or metadata of resources.
  • Use Data Access audit logs to track accesses or changes to user-provided resource data.
  • Use System Event audit logs to track Google Cloud administrative actions.
Module 8: Managing Incidents
  • Define incident management roles and communication channels.
  • Mitigate incident impact.
  • Troubleshoot root causes.
  • Resolve incidents.
  • Document incidents in a post-mortem process.
Module 9: Investigating Application Performance Issues
  • Debug production code to correct code defects.
  • Trace latency through layers of service interaction to eliminate performance bottlenecks.
  • Profile and identify resource-intensive functions in an application.
Module 10: Optimizing the Costs of Monitoring
  • Analyze resource utilization costs for monitoring-related components within Google Cloud.
  • Implement best practices for controlling the cost of monitoring within Google Cloud.

Prerequisites

To get the most out of this course, participants should have:
  • Google Cloud Platform Fundamentals: Core Infrastructure or equivalent experience
  • Basic scripting or coding familiarity
  • Proficiency with command-line tools and Linux operating system environments

Certification

This course is not associated with any certification.
