Pattern

Observability

Observability is the property of a system that lets you answer questions about its behaviour from its outputs alone, without shipping new code. The three core signals are metrics, logs, and traces; the operational test is whether an on-call engineer can root-cause a novel incident from the telemetry and dashboards that already exist. On every TantraDev engagement, observability is in the data model from week one, not bolted on after launch.

Related terms

Concepts that travel with this one.

Architecture rarely lives in isolation — these are the terms that come up in the same conversation.

TECHNOLOGY

OpenTelemetry

OpenTelemetry is the CNCF project that standardises how traces, metrics, and structured logs are collected from production systems. Its wire protocol (OTLP) is vendor-neutral, so the same instrumentation can feed Grafana, Datadog, Honeycomb, or CloudWatch without code changes. TantraDev installs OpenTelemetry from sprint one: observability is part of the data model, not a bolt-on.

TECHNOLOGY

Grafana

Grafana is the de facto dashboarding layer for production telemetry: Prometheus metrics, Loki logs, Tempo traces, and a long list of other sources rendered into one queryable surface. TantraDev ships Grafana dashboards as a deliverable on every cloud engagement, with the RED and USE dashboards built before the first cutover and the alert rules wired into the client's existing PagerDuty.

PATTERN

RED Method

RED stands for Rate, Errors, Duration — the three service-level signals every request-driven service should emit. Rate is requests per second; Errors is the fraction that fail; Duration is the latency distribution. A RED dashboard answers 'is this service healthy right now' in under five seconds. TantraDev ships a RED dashboard per service before the first cutover on every cloud engagement.
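
As a rough illustration of the three RED numbers, the sketch below computes them from an in-memory list of request records for a single time window. The names `RequestRecord`, `nearest_rank_percentile`, and `red_summary` are hypothetical, the nearest-rank percentile is just one reasonable way to summarise the latency distribution, and a production service would normally emit these through a metrics library rather than compute them by hand.

```python
import math
from typing import Dict, NamedTuple, Optional, Sequence


class RequestRecord(NamedTuple):
    """One handled request (hypothetical shape, for illustration only)."""
    duration_ms: float  # how long the request took
    ok: bool            # False if the request failed


def nearest_rank_percentile(sorted_values: Sequence[float], q: float) -> Optional[float]:
    """Return the q-th percentile of pre-sorted values using the nearest-rank rule."""
    if not sorted_values:
        return None
    rank = max(1, math.ceil(q * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def red_summary(records: Sequence[RequestRecord], window_seconds: float) -> Dict[str, Optional[float]]:
    """Summarise Rate, Errors, and Duration for one observation window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    total = len(records)
    failed = sum(1 for r in records if not r.ok)
    durations = sorted(r.duration_ms for r in records)
    return {
        # Rate: requests per second over the window
        "rate_per_s": total / window_seconds,
        # Errors: fraction of requests that failed (0.0 when there was no traffic)
        "error_fraction": failed / total if total else 0.0,
        # Duration: two points on the latency distribution
        "p50_ms": nearest_rank_percentile(durations, 50),
        "p99_ms": nearest_rank_percentile(durations, 99),
    }


# Ten requests taking 10 ms to 100 ms, two of them failing, observed over 5 seconds.
sample = [RequestRecord(duration_ms=10.0 * i, ok=(i not in (3, 7))) for i in range(1, 11)]
summary = red_summary(sample, window_seconds=5.0)
print(summary)
```

Reporting percentiles rather than a mean keeps the Duration signal honest about tail latency, which is usually what a RED dashboard is consulted for.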

PATTERN

Golden Signals

The Four Golden Signals, from Google's SRE book, are Latency, Traffic, Errors, and Saturation — the minimum set of signals to monitor on any user-facing service. They overlap with RED and USE but stay user-facing in framing: a latency spike that customers feel matters more than CPU saturation that they don't. TantraDev alert policies are golden-signal-shaped.
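
One way to read "golden-signal-shaped" is that symptoms users can feel escalate further than resource pressure they cannot. The sketch below is an assumption about how such a policy might look, not TantraDev's actual rules: the `GoldenSignals` and `Thresholds` types, the threshold values, and the `"page"` / `"ticket"` / `"ok"` severities are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoldenSignals:
    """A snapshot of the four signals for one service (hypothetical shape)."""
    latency_p99_ms: float   # Latency: 99th-percentile response time
    traffic_rps: float      # Traffic: requests per second (context, not an alert input here)
    error_fraction: float   # Errors: fraction of requests failing, 0.0 to 1.0
    saturation: float       # Saturation: utilisation of the tightest resource, 0.0 to 1.0


@dataclass(frozen=True)
class Thresholds:
    """Illustrative limits only; real values depend on the service's objectives."""
    latency_p99_ms: float = 500.0
    error_fraction: float = 0.01
    saturation: float = 0.90


def alert_severity(signals: GoldenSignals, limits: Thresholds = Thresholds()) -> str:
    """Page on user-felt symptoms; file a ticket for saturation alone."""
    user_felt = (
        signals.latency_p99_ms > limits.latency_p99_ms
        or signals.error_fraction > limits.error_fraction
    )
    if user_felt:
        return "page"
    if signals.saturation > limits.saturation:
        # Resource pressure that customers do not feel yet: worth fixing, not worth waking someone.
        return "ticket"
    return "ok"


# A latency spike pages; high CPU with healthy latency and errors only opens a ticket.
print(alert_severity(GoldenSignals(900.0, 120.0, 0.001, 0.40)))  # page
print(alert_severity(GoldenSignals(120.0, 120.0, 0.001, 0.95)))  # ticket
```

Traffic is carried in the snapshot but left out of the decision in this sketch; in practice it is often used to suppress alerts when request volume is too low for the error fraction to be meaningful.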

ARCHITECTURE AUDIT

Building a system where Observability is the load-bearing decision?

30 minutes on the phone, one page in your inbox — what to build, what to skip, what it will cost. You keep the audit even if we are not the right fit.