Academy Training

Observability Training

Fundamentals, Tools, and Best Practices for Kubernetes & Cloud

2 days
Remote

Challenges

Modern IT systems are increasingly based on cloud-native architectures, microservices, and Kubernetes. These distributed, highly dynamic environments place new demands on operations, stability, and troubleshooting. Classic monitoring quickly reaches its limits here, as it often only makes symptoms visible without revealing the root causes of complex problems.

Observability enables a deep understanding of system behavior, but it requires solid knowledge of concepts, architectures, tools, and best practices. Many teams struggle not only to introduce observability technically, but also to integrate it into their operations in a way that is meaningful, scalable, and user-oriented. This training addresses exactly this gap and lays a solid foundation for the professional use of modern observability solutions.

Goal

The goal of the training is to teach participants the fundamentals of modern observability and enable them to apply observability concepts effectively in operations, using proven patterns and tools.

Participants will understand:

  • What observability is and why it is essential for modern systems.
  • How metrics, logs, and traces work together.
  • How observability tools are used to optimize performance, proactively detect errors, and sustainably improve the user experience.

After the training, participants will be able to interpret observability data and use it in a targeted way for operations, troubleshooting, and the further development of their systems.

Target group

Content

This 2-day training lays the theoretical and practical basis for modern observability concepts. The focus is on the three pillars of observability—Metrics, Logs, and Traces—as well as their integration into a holistic observability architecture.

In addition to teaching concepts and best practices, the training emphasizes the user perspective: How are observability tools used in day-to-day work? What requirements do operations teams have for dashboards, alerts, and analysis functions? And how does observability data support troubleshooting and performance optimization?

The training content is divided into thematic sections. Each section combines an interactive theory part with a subsequent Lab & Exercise unit, in which participants apply what they have learned independently.

Key Topics:

  • Introduction to modern Observability – Concepts, patterns, and distinctions.
  • Differences between Monitoring, Observability, and Logging.
  • The three pillars of Observability: Metrics, Traces, and Logs.
  • Modern observability architectures and end-to-end approaches.
  • Overview and use of relevant tools (e.g., Prometheus, Grafana, Kibana, Loki, OpenTelemetry, Jaeger).
  • Best practices for SLOs, error detection, and performance optimization.
  • Observability as part of the DevOps and CI/CD lifecycle.
  • Observability in Kubernetes and cloud-native environments.
  • User perspective: Using observability for operations and collaboration.
  • Troubleshooting and scaling observability systems.
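
As a small illustration of how the three pillars listed above can work together, the following Python sketch links a metric, a log record, and a trace span through a shared trace ID. It is a minimal, hypothetical example using only the standard library and is not part of the course material: the names `handle_request`, `investigate`, `Span`, and `LogRecord` are assumptions made for illustration, and real systems would typically use tools such as OpenTelemetry, Prometheus, or Loki instead of in-memory lists.

```python
import uuid
from collections import Counter
from dataclasses import dataclass


@dataclass
class Span:
    """One timed operation within a trace (hypothetical, simplified)."""
    trace_id: str
    name: str
    duration_ms: float


@dataclass
class LogRecord:
    """One log line, tagged with the trace it belongs to."""
    trace_id: str
    level: str
    message: str


# In-memory stand-ins for a metrics store, a log store, and a trace store.
request_errors = Counter()
logs = []
spans = []


def handle_request(service, duration_ms, failed):
    """Simulate one request and emit all three signal types."""
    trace_id = uuid.uuid4().hex
    spans.append(Span(trace_id, f"{service}.handle", duration_ms))
    if failed:
        request_errors[service] += 1  # metric: something is wrong
        logs.append(LogRecord(trace_id, "ERROR", f"{service}: upstream timeout"))
    return trace_id


def investigate(trace_id):
    """Collect the spans and logs that share a trace ID."""
    return {
        "spans": [s for s in spans if s.trace_id == trace_id],
        "logs": [r for r in logs if r.trace_id == trace_id],
    }


ok_id = handle_request("checkout", 42.0, failed=False)
bad_id = handle_request("checkout", 5000.0, failed=True)
report = investigate(bad_id)
```

In this sketch, the error counter signals that a problem exists, while the shared trace ID leads from that symptom to the slow span and the matching error log for the affected request.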

Structure / Agenda

Day 1: Fundamentals, Concepts, and Architectures

  • Introduction to Observability: Motivation, benefits, and use cases.
  • Monitoring vs. Observability vs. Logging.
  • The three pillars of Observability.
  • Distributed Tracing: Concepts and application.
  • Metrics and Logs as central analysis instruments.
  • End-to-end observability architectures.
  • Practical Exercises: Analysis of logs, metrics, and traces.

Day 2: Tools, Operations, and Practice

  • Overview of observability tools and their interaction.
  • Prometheus, Grafana, Kibana, Loki.
  • OpenTelemetry and Jaeger.
  • Best practices for observability in the DevOps environment.
  • Observability in Kubernetes and cloud environments.
  • Using dashboards and KPIs for proactive monitoring.
  • Troubleshooting and performance optimization with observability data.
  • Practical Exercises:
    • Creating and customizing dashboards.
    • Log analysis and error identification.
    • Distributed tracing and performance analysis.
    • Scaling observability solutions.

Organizational information

  • Duration: 2 days (expandable to 3 days for a Deep Dive).
  • Location: On-site or remote, as preferred.
  • Language: German or English, as preferred.
  • Number of participants: 5 to 10.

Prerequisites

  • Confident use of Git, YAML, Bash, and Docker.
  • Basic knowledge of cloud-native development.
  • Ideally, practical experience with Kubernetes.

Technical Requirements

  • Laptop with an internet connection.
  • Current web browser.

Our Trainer

Kevin Schu
Kevin Schu is Director of Cloud & DevOps Consulting, specializing in IT process automation and cloud-native architectures. He helps companies develop and scale CI/CD pipelines and integrates AI workloads like ChatGPT to streamline complex communication processes. With expertise in Docker, Kubernetes, and multi-cloud strategies on AWS, Azure, and Google Cloud, he focuses on driving efficiency, flexibility, and security in modern IT infrastructures.

Our Academy expert

We are happy to create your individual training

Cordula Kartheininger

HR & AOE Academy Strategy Lead