Why your dashboard isn't actually observability
A Grafana dashboard with 200 panels is not observability. It's a wall of charts. The difference matters when something breaks at 2am.
- Dashboards are great for exploration, not for production decisions
- Production decisions need SLIs (what does 'broken' mean?) and SLOs (how broken is too broken?)
- Most dashboards I've inherited had no defined question, just metrics someone thought were interesting
- If your on-call team opens a dashboard during an incident and can't find the right panel in 30 seconds, the dashboard failed
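To make the SLI/SLO distinction concrete, here is a minimal sketch. It assumes a request-based availability SLI (good requests divided by total requests) and a single SLO target over some window. The names `SLO`, `sli`, and `error_budget_remaining`, the `checkout-availability` service, and the request counts are all hypothetical illustrations, not taken from any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    """How broken is too broken: a target for the SLI over a window."""
    name: str
    target: float  # e.g. 0.999 means 99.9% of requests must be good


def sli(good: int, total: int) -> float:
    """What 'broken' means: the fraction of requests that were good.

    Assumption: a window with no traffic counts as fully healthy.
    """
    return 1.0 if total == 0 else good / total


def error_budget_remaining(slo: SLO, good: int, total: int) -> float:
    """Fraction of the error budget left (negative means overspent)."""
    allowed_bad = (1.0 - slo.target) * total
    actual_bad = total - good
    if allowed_bad == 0:
        # No budget at all: any bad request overspends it.
        return 1.0 if actual_bad == 0 else float("-inf")
    return 1.0 - actual_bad / allowed_bad


# Hypothetical numbers: 50 bad requests out of 100,000 against a 99.9% target.
checkout = SLO(name="checkout-availability", target=0.999)
measured = sli(good=99_950, total=100_000)
budget = error_budget_remaining(checkout, good=99_950, total=100_000)

print(f"SLI={measured:.4%} meets target: {measured >= checkout.target}")
print(f"error budget remaining: {budget:.0%}")
```

With these made-up numbers the SLI is above target and roughly half the budget is left. The point is that "is it broken?" becomes a single comparison instead of a scan across panels.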
Bottom line: Define the question first. Build the dashboard second. If you can't name the person who owns the alert, the dashboard isn't production-ready.
Page on user impact. Ticket on symptoms. Route by owner.
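The routing rule can be sketched as a small function. This is an illustration under stated assumptions, not any alerting tool's API: the `Alert` fields, the `Action` values, and the choice to reject ownerless alerts outright are hypothetical, and the alert and team names are invented.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Action(Enum):
    PAGE = "page"      # wake someone up
    TICKET = "ticket"  # handle during working hours
    REJECT = "reject"  # no owner, so not production-ready


@dataclass(frozen=True)
class Alert:
    name: str
    user_impacting: bool   # True if users are affected right now
    owner: Optional[str]   # team or person responsible, if any


def route(alert: Alert) -> Tuple[Action, Optional[str]]:
    """Page on user impact, ticket on symptoms, route by owner."""
    if not alert.owner:
        # Assumption: an alert nobody owns should never reach on-call.
        return (Action.REJECT, None)
    if alert.user_impacting:
        return (Action.PAGE, alert.owner)
    return (Action.TICKET, alert.owner)


# Hypothetical alerts for illustration.
alerts = [
    Alert("checkout-error-budget-burn", user_impacting=True, owner="payments"),
    Alert("disk-usage-above-80-percent", user_impacting=False, owner="platform"),
    Alert("interesting-queue-depth", user_impacting=False, owner=None),
]
for a in alerts:
    action, owner = route(a)
    print(f"{a.name}: {action.value} -> {owner}")
```

Checking ownership first is a deliberate choice in this sketch: it enforces the rule that an alert without a named owner is not ready for production, whatever it measures.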