Site Reliability • Platform Engineering • DevOps
Kubernetes • Cloud • Terraform • Kafka • Python

I design, migrate, and operate production-grade platforms across AWS/GCP/Azure with a heavy focus on Kubernetes reliability, infrastructure automation, streaming systems, and high-signal observability.

High-signal overview

What I do

I build and operate production platforms where reliability is non-negotiable: Kubernetes, cloud infrastructure, CI/CD, streaming systems, telemetry, and automation.

What you'll find here

  • Engagement Picks: weekly curated engineering reads + commentary
  • Open Source Picks: monthly OSS projects worth tracking (SRE/infra/security/data)
  • Evidence pages: Kubernetes / Terraform / Kafka / Cloud / Automation with real-world patterns and artifacts

Core strengths

  • Kubernetes: EKS/GKE ops, upgrades, autoscaling, RBAC/IRSA
  • Terraform: modules, remote state, CI gating, guardrails
  • Kafka: consumer lag, partitions, DLQ, retries, safe rollouts
  • Python: automation tooling for migration, validation, ops
  • Observability: Prometheus/Grafana/OTel, SLOs, alert hygiene
  • Cloud: AWS/GCP/Azure, HA design, security + cost controls

Featured case studies