Golden Signals
The Four Golden Signals, from Google's SRE book, are Latency, Traffic, Errors, and Saturation: the minimum set of signals to monitor on any user-facing service. They overlap with RED and USE but keep a user-facing framing: a latency spike that customers feel matters more than CPU saturation that they don't. TantraDev alert policies are built around these four signals.
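As a rough illustration of what an alert check built around the four signals could look like, here is a minimal Python sketch. The `GoldenSnapshot` type, the `breached_signals` function, and every threshold value are hypothetical, chosen only for this example; they are not taken from the SRE book or from any actual TantraDev policy.

```python
from dataclasses import dataclass


@dataclass
class GoldenSnapshot:
    """One observation window for a service (hypothetical shape)."""
    p99_latency_ms: float  # Latency: 99th-percentile response time
    requests_per_s: float  # Traffic: demand on the service
    error_ratio: float     # Errors: failed requests / total requests
    saturation: float      # Saturation: 0.0-1.0 fill of the tightest resource


# Illustrative thresholds only; real values depend on the service.
MAX_P99_LATENCY_MS = 500.0
MIN_REQUESTS_PER_S = 1.0   # a sudden drop to near zero is itself a symptom
MAX_ERROR_RATIO = 0.01
MAX_SATURATION = 0.90


def breached_signals(snapshot: GoldenSnapshot) -> list[str]:
    """Return the names of the golden signals that are outside their limits."""
    breached = []
    if snapshot.p99_latency_ms > MAX_P99_LATENCY_MS:
        breached.append("latency")
    if snapshot.requests_per_s < MIN_REQUESTS_PER_S:
        breached.append("traffic")
    if snapshot.error_ratio > MAX_ERROR_RATIO:
        breached.append("errors")
    if snapshot.saturation > MAX_SATURATION:
        breached.append("saturation")
    return breached
```

Calling `breached_signals(GoldenSnapshot(820.0, 140.0, 0.002, 0.95))` returns `["latency", "saturation"]`. Alerting on the user-visible signal first and treating saturation as supporting evidence matches the framing above.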
Concepts that travel with this one.
Architecture rarely lives in isolation — these are the terms that come up in the same conversation.
RED Method
RED stands for Rate, Errors, Duration: the three service-level signals every request-driven service should emit. Rate is requests per second; Errors is the fraction of those requests that fail (often tracked as failed requests per second); Duration is the latency distribution. A RED dashboard answers 'is this service healthy right now' in under five seconds. TantraDev ships a RED dashboard per service before the first cutover on every cloud engagement.
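The three RED numbers can be derived from nothing more than a list of request records. The sketch below assumes a hypothetical `Request` record and a fixed observation window, and uses a simple nearest-rank percentile. A production dashboard would normally read these values from a metrics backend rather than compute them in application code.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """A single handled request (hypothetical record shape)."""
    duration_ms: float
    ok: bool


def percentile(sorted_values: list[float], q: float) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(q / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def red_summary(requests: list[Request], window_s: float) -> dict[str, float]:
    """Compute Rate, Errors, and Duration for one observation window."""
    if not requests:
        raise ValueError("no requests in window")
    durations = sorted(r.duration_ms for r in requests)
    failed = sum(1 for r in requests if not r.ok)
    return {
        "rate_per_s": len(requests) / window_s,    # Rate
        "error_ratio": failed / len(requests),     # Errors
        "p50_ms": percentile(durations, 50),       # Duration (median)
        "p99_ms": percentile(durations, 99),       # Duration (tail)
    }
```

Reporting Duration as percentiles rather than a mean is a deliberate choice here: a mean hides the slow tail that users actually notice.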
USE Method
USE — Utilisation, Saturation, Errors — is Brendan Gregg's framework for diagnosing resource-level health (CPU, memory, disk, network). Utilisation is the percent of time the resource was busy; Saturation is the queue depth waiting on it; Errors is the count of operations that failed. RED tells you *that* a service is unhealthy; USE tells you *which resource* is to blame.
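The three USE figures for a single resource can be sketched the same way. The example assumes a hypothetical `ResourceSample` taken at a regular interval (for instance, one disk polled once per second). The field names are illustrative and do not correspond to any particular monitoring agent.

```python
from dataclasses import dataclass


@dataclass
class ResourceSample:
    """One periodic sample of a resource (hypothetical record shape)."""
    busy: bool        # was the resource doing work at sample time?
    queue_depth: int  # operations waiting on the resource
    errors: int       # failed operations since the previous sample


def use_summary(samples: list[ResourceSample]) -> dict[str, float]:
    """Compute Utilisation, Saturation, and Errors over a set of samples."""
    if not samples:
        raise ValueError("no samples")
    n = len(samples)
    return {
        # Utilisation: percentage of samples in which the resource was busy
        "utilisation_pct": 100 * sum(1 for s in samples if s.busy) / n,
        # Saturation: average number of operations queued on the resource
        "avg_queue_depth": sum(s.queue_depth for s in samples) / n,
        # Errors: total failed operations across the window
        "error_count": sum(s.errors for s in samples),
    }
```

Running this per resource (CPU, memory, disk, network) and comparing the results is one way to move from "the service is unhealthy" to "this resource is the bottleneck".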
Observability
Observability is the property of a system that lets you answer questions about its behaviour from its outputs alone, without shipping new code. The three signals are metrics, logs, and traces; the operational test is whether an on-call engineer can root-cause a novel incident from the existing dashboards. Observability is in the data model from week one on every TantraDev engagement, not bolted on after launch.
SLO & Error Budget
A Service Level Objective is the explicit reliability target a service commits to: say, 99.9% successful requests measured over 30 days. The complement (0.1%) is the error budget: the amount of failure the service is allowed during that period. While budget remains, the team ships features; once it is exhausted, feature releases pause until reliability is restored. This is the contract that turns 'reliability' from a feeling into a number.
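The arithmetic behind an error budget is short enough to sketch directly. The function name and the returned fields below are assumptions made for illustration; the 99.9% target is the figure used in the definition above.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Report how much of a request-based error budget has been used.

    slo_target is a fraction such as 0.999 for a 99.9% SLO.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1")
    # The budget is the complement of the SLO, applied to the traffic seen.
    allowed_failures = (1 - slo_target) * total_requests
    consumed = failed_requests / allowed_failures if allowed_failures else float("inf")
    return {
        "allowed_failures": allowed_failures,
        "consumed_fraction": consumed,
        "exhausted": failed_requests >= allowed_failures,
    }


# Example: 99.9% SLO, one million requests in the window, 400 failures so far.
status = error_budget(0.999, 1_000_000, 400)
```

With one million requests in the window, a 99.9% target permits roughly 1,000 failures, so 400 failures consume about 40% of the budget and releases can continue. At 1,200 failures the budget is exhausted.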