SLO & Error Budget
Also known as: SLO · Error budget · Service Level Objective
A Service Level Objective (SLO) is the explicit reliability target a service commits to: say, 99.9% successful requests measured over 30 days. The complement (0.1%) is the error budget: the share of requests that is allowed to fail during the period. While budget remains, the team ships features; once it is exhausted, feature releases pause until reliability is restored. This is the contract that turns 'reliability' from a feeling into a number.
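The arithmetic behind the 99.9% example can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function names and the traffic figures in the usage lines are hypothetical, and the downtime conversion assumes a time-based reading of the same target.

```python
def error_budget(slo_target: float, total_requests: int) -> float:
    """Number of requests allowed to fail in the window under the SLO."""
    return (1.0 - slo_target) * total_requests


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the error budget still unspent (negative once overspent)."""
    budget = error_budget(slo_target, total_requests)
    if budget == 0:
        return 0.0  # a 100% target leaves no budget to spend
    return 1.0 - failed_requests / budget


def allowed_downtime_minutes(slo_target: float, window_days: int = 30) -> float:
    """The same budget expressed as minutes of full outage in the window."""
    return (1.0 - slo_target) * window_days * 24 * 60


# Hypothetical traffic: 1,000,000 requests in the window, 250 of them failed.
budget = error_budget(0.999, 1_000_000)                 # about 1,000 failed requests allowed
remaining = budget_remaining(0.999, 1_000_000, 250)     # about 0.75 of the budget left
downtime = allowed_downtime_minutes(0.999)              # about 43.2 minutes per 30 days
```

A negative result from budget_remaining means the budget is overspent, which is the condition under which feature releases would pause.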
Concepts that travel with this one.
Architecture rarely lives in isolation — these are the terms that come up in the same conversation.
Observability
Observability is the property of a system that lets you answer questions about its behaviour from its outputs alone, without shipping new code. The three signals are metrics, logs, and traces; the operational test is whether an on-call engineer can root-cause a novel incident from the existing dashboards. Observability is in the data model from week one on every TantraDev engagement, not bolted on after launch.
Golden Signals
The Four Golden Signals, from Google's SRE book, are Latency, Traffic, Errors, and Saturation: the minimum set of signals to monitor on any user-facing service. They overlap with the RED (Rate, Errors, Duration) and USE (Utilization, Saturation, Errors) methods but stay user-facing in framing: a latency spike that customers feel matters more than CPU saturation that they don't. TantraDev alert policies are built around the golden signals.
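As a rough sketch of how the four signals might be derived from raw request data: the Request record, the choice of the 95th percentile for latency, and the idea of passing saturation in as a pre-measured utilization figure are all assumptions made for illustration, not part of any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass
class Request:
    latency_ms: float  # time taken to serve the request
    ok: bool           # True if the request succeeded


def golden_signals(requests: list[Request], window_seconds: float, utilization: float) -> dict:
    """Summarise one observation window as the four golden signals."""
    latencies = sorted(r.latency_ms for r in requests)
    # Nearest-rank 95th percentile, using integer arithmetic to avoid float rounding.
    rank = (95 * len(latencies) + 99) // 100
    p95 = latencies[rank - 1] if latencies else 0.0
    errors = sum(1 for r in requests if not r.ok)
    return {
        "latency_p95_ms": p95,
        "traffic_rps": len(requests) / window_seconds,
        "error_rate": errors / len(requests) if requests else 0.0,
        "saturation": utilization,  # e.g. fraction of CPU or connection pool in use
    }


# Hypothetical window: 20 requests in 10 seconds, one failure, 60% utilization.
sample = [Request(latency_ms=float(i), ok=(i != 7)) for i in range(1, 21)]
signals = golden_signals(sample, window_seconds=10, utilization=0.6)
```

In practice these values would come from a metrics system rather than an in-memory list; the sketch only shows which quantity each signal refers to.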
RTO & RPO
Recovery Time Objective (RTO) is how long the business can tolerate the system being down before the outage causes unacceptable loss. Recovery Point Objective (RPO) is how much data the business can tolerate losing, usually expressed as a window of time before the failure. The pair is the input to the disaster-recovery design: backups, replicas, failover automation, restore drills. Without an explicit RTO/RPO, every architectural choice that touches recoverability is implicit and untestable.
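One way to make the pair testable is to compare a recovery plan against the stated objectives. The sketch below is illustrative only: the RecoveryPlan fields are hypothetical, RPO is assumed to be expressed in minutes, and worst-case data loss is approximated as the full interval between backups (a failure just before the next backup runs).

```python
from dataclasses import dataclass


@dataclass
class RecoveryPlan:
    backup_interval_minutes: float   # time between successive backups or replication checkpoints
    measured_restore_minutes: float  # restore duration observed in the latest restore drill


def meets_rpo(plan: RecoveryPlan, rpo_minutes: float) -> bool:
    """Worst-case data loss (one full backup interval) must fit within the RPO."""
    return plan.backup_interval_minutes <= rpo_minutes


def meets_rto(plan: RecoveryPlan, rto_minutes: float) -> bool:
    """The restore time measured in a drill must fit within the RTO."""
    return plan.measured_restore_minutes <= rto_minutes


# Hypothetical plan: hourly backups, 45-minute restore measured in a drill.
plan = RecoveryPlan(backup_interval_minutes=60, measured_restore_minutes=45)
rpo_ok = meets_rpo(plan, rpo_minutes=15)   # False: hourly backups can lose up to an hour
rto_ok = meets_rto(plan, rto_minutes=120)  # True: 45 minutes is within two hours
```

A failed check points at the design lever to change: a shorter backup interval or continuous replication for RPO, faster failover or restore automation for RTO.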