Transaction-level operations
At Level 3, alerting works at the aggregate level — percentiles, error rates, and SLOs. Traces are for investigation after an alert fires, not the source of the alert itself.
Alerting
| What to alert on | Example |
|---|
| Latency percentiles | P99 latency > 2s for payment flow across all users |
| Span error rates | Database span error rate > threshold |
| SLO burn rate | Critical path success rate burning error budget too fast |
SLOs
| SLO type | Example |
|---|
| Transaction success | 99% of checkout flows complete successfully |
| Critical path latency | 95% of payment transactions < 1s |
| End-to-end latency | 90% of user journeys < 3s total |
Dashboards
| Dashboard type | What you see |
|---|
| Trace analysis | Span breakdown, latency distribution |
| Flame graphs | Where code spends time (profiling) |
| Frontend performance | Core Web Vitals, user experience metrics |
| AI/LLM metrics | Token usage, model latency, prompt analysis |
Investigation
| Tool | How you use it at Level 3 |
|---|
| Trace Explorer | Search traces by attributes, find slow spans |
| Trace-to-logs | Jump from trace span to related logs |
| Trace-to-profiles | See code-level performance for a trace |
| Session replay | Watch what the user actually experienced |
At Level 4, you’ll alert on custom metrics and KPIs.