2025-01 (5 picks)
kubernetes/kubernetes (Infra / Kubernetes)

The upstream Kubernetes repo. SIG-Node, SIG-Scheduling, and SIG-Autoscaling have active good-first-issues.

Why I care: Contributing here teaches you how the scheduler and kubelet actually make decisions, which is invaluable for debugging production issues.
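
To make "how the scheduler makes decisions" concrete, here is a minimal sketch of the filter-then-score shape that scheduling follows: rule out nodes that cannot fit the pod, then rank the rest. This is not the real scheduler code. The Node and Pod classes, the resource fields, and the scoring rule are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_cpu_millis: int  # unallocated CPU, in millicores (hypothetical field)
    free_mem_mib: int     # unallocated memory, in MiB (hypothetical field)


@dataclass
class Pod:
    name: str
    cpu_millis: int  # requested CPU, in millicores
    mem_mib: int     # requested memory, in MiB


def feasible(pod: Pod, node: Node) -> bool:
    """Filter step: keep only nodes with enough free resources for the pod."""
    return (node.free_cpu_millis >= pod.cpu_millis
            and node.free_mem_mib >= pod.mem_mib)


def _fraction_left(free: int, requested: int) -> float:
    """Share of the free capacity that would remain after placing the pod."""
    return 0.0 if free == 0 else (free - requested) / free


def score(pod: Pod, node: Node) -> float:
    """Score step (illustrative rule only): prefer the most remaining headroom."""
    cpu_left = _fraction_left(node.free_cpu_millis, pod.cpu_millis)
    mem_left = _fraction_left(node.free_mem_mib, pod.mem_mib)
    return (cpu_left + mem_left) / 2


def pick_node(pod: Pod, nodes: List[Node]) -> Optional[str]:
    """Return the best-scoring feasible node name, or None if nothing fits."""
    candidates = [n for n in nodes if feasible(pod, n)]
    if not candidates:
        return None  # in a real cluster the pod would stay Pending
    return max(candidates, key=lambda n: score(pod, n)).name
```

When a pod sits in Pending, the useful debugging question is usually which half failed: did every node get filtered out, or did scoring pick a node you did not expect?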

prometheus/prometheus (SRE / Observability)

The Prometheus time-series database. Active issues around native histogram migration and OTLP ingestion.

Why I care: Understanding how Prometheus stores and queries data changes how you design dashboards and alerts.

grafana/grafana (SRE / Observability)

Grafana visualization platform. Good-first-issues around panel plugins and alerting UX.

Why I care: The plugin architecture is well-documented. Good entry point into the Go + React codebase.

Terraform core. Issues around provider development, state management edge cases, and CLI UX.

Why I care: Reading the provider protocol and state handling code explains many real-world Terraform bugs.

The OTel Collector: the pipeline between your instrumentation and your backend.

Why I care: Active contributor community. Understanding the processor/exporter model helps you build better telemetry pipelines.
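
As a rough mental model of the processor/exporter flow, the sketch below passes received records through an ordered chain of processors and hands the survivors to every exporter. It is a toy in plain Python, not the Collector's API: the dict-based record, the function names, and the "/healthz" filter are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Optional

Record = Dict[str, str]                            # toy telemetry record
Processor = Callable[[Record], Optional[Record]]   # return None to drop a record
Exporter = Callable[[List[Record]], None]          # receives the processed batch


def drop_health_checks(record: Record) -> Optional[Record]:
    """Hypothetical filter: discard noisy health-check traffic."""
    return None if record.get("route") == "/healthz" else record


def add_environment(env: str) -> Processor:
    """Hypothetical enrichment: tag every record with a deployment environment."""
    def processor(record: Record) -> Optional[Record]:
        return {**record, "deployment.environment": env}
    return processor


def run_pipeline(received: Iterable[Record],
                 processors: List[Processor],
                 exporters: List[Exporter]) -> List[Record]:
    """Apply processors in order, then send the batch to each exporter."""
    batch: List[Record] = []
    for record in received:
        current: Optional[Record] = record
        for process in processors:
            current = process(current)
            if current is None:  # a processor dropped this record
                break
        if current is not None:
            batch.append(current)
    for export in exporters:
        export(list(batch))  # each exporter gets its own copy of the batch
    return batch
```

The point of the sketch is that processor order matters: filtering before enrichment means dropped records never cost you the enrichment work.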

2024-12 (3 picks)

ML platform on Kubernetes. Pipeline orchestration and model serving components.

Why I care: Bridges SRE skills with ML platform work, and demand for this intersection is growing.

argoproj/argo-cd (Infra / GitOps)

GitOps continuous delivery for Kubernetes. Application sync and health checking.

Why I care: Argo CD is widely used in production for GitOps. The app-of-apps pattern code is worth reading.

Certificate management for Kubernetes. ACME, Vault, and private CA integrations.

Why I care: Good-first-issues around controller logic. Certificate rotation is a real production pain, so understanding this pays dividends.
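
For a feel of what rotation logic has to decide, here is a small sketch of picking a renewal time from a certificate's validity window. It assumes a default of renewing once one third of the lifetime remains, which is my understanding of cert-manager's documented default for renewBefore; treat that as an assumption and check the docs for your version. The function and parameter names are hypothetical, not cert-manager's code.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def renewal_time(not_before: datetime,
                 not_after: datetime,
                 renew_before: Optional[timedelta] = None) -> datetime:
    """Return the moment a renewal should be attempted.

    If renew_before is not given, assume renewal starts when one third of
    the certificate's lifetime remains (assumed default, see lead-in).
    """
    lifetime = not_after - not_before
    if lifetime <= timedelta(0):
        raise ValueError("not_after must be later than not_before")
    if renew_before is None:
        renew_before = lifetime / 3
    if renew_before >= lifetime:
        raise ValueError("renew_before must be shorter than the lifetime")
    return not_after - renew_before


# Usage: a 90-day certificate issued on 2025-01-01 (UTC).
issued = datetime(2025, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(days=90)
print(renewal_time(issued, expires))                      # 60 days after issuance
print(renewal_time(issued, expires, timedelta(days=15)))  # 75 days after issuance
```

The production pain tends to sit around this calculation: a renewal window shorter than your alerting or retry horizon leaves little room when the issuer is down.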