Traceroute check

A Traceroute check runs a traceroute from probes to targets to visualize network paths, track how paths change over time, and show how traffic reaches a destination.

If a test fails in certain locations but works in others, it may indicate a problem with the network path. A traceroute can help resolve the ambiguity by showing where the traffic stops.

How it works

The traceroute check uses an open source Go implementation of the mtr (“My Traceroute”) command-line tool. Each time the check runs, it attempts 5 probe cycles, which is equivalent to running `mtr -r -c 5 <hostname>`. During a probe cycle, the check sends a packet to each hop along the route and collects timing and packet loss information. A probe cycle continues until it:

Reaches the target. After the target is reached, a new cycle begins.
Reaches the maximum allowed number of hops. When the limit is reached, a new cycle begins.
Reaches the maximum allowed number of unknown hops or hop failures. When the limit is reached, the test aborts and fails with a “max unknown hops exceeded” error.
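
The termination rules above can be sketched as a small simulation. This is a minimal illustration in Python, not the check's actual Go implementation. The function names, the `route` representation (one entry per TTL, with `None` for an unknown hop), and the assumption that the test aborts as soon as the unknown-hop count reaches the limit are all hypothetical.

```python
from typing import List, Optional, Sequence

CYCLES = 5  # the check attempts 5 probe cycles per test execution


class MaxUnknownHopsExceeded(Exception):
    """Raised when a cycle hits the unknown-hop limit and the test aborts."""


def run_cycle(route: Sequence[Optional[str]], target: str,
              max_hops: int, max_unknown_hops: int) -> str:
    """Walk one probe cycle and return the reason it ended.

    `route` lists the responder at each TTL; None marks an unknown hop.
    """
    unknown = 0
    for ttl in range(1, max_hops + 1):
        # Past the end of the known route, treat every hop as unknown.
        host = route[ttl - 1] if ttl <= len(route) else None
        if host is None:
            unknown += 1
            # Assumption: abort as soon as the count reaches the limit.
            if unknown >= max_unknown_hops:
                raise MaxUnknownHopsExceeded("max unknown hops exceeded")
        elif host == target:
            return "reached target"
    return "reached max hops"


def run_check(route: Sequence[Optional[str]], target: str,
              max_hops: int, max_unknown_hops: int) -> List[str]:
    """Run all cycles; an abort in any cycle propagates as an exception."""
    return [run_cycle(route, target, max_hops, max_unknown_hops)
            for _ in range(CYCLES)]
```

For example, `run_check(["10.200.1.1", None, "8.8.8.8"], "8.8.8.8", max_hops=30, max_unknown_hops=5)` returns `"reached target"` for all five cycles. The limits used here are illustrative values, not documented defaults.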

Options

The following options are common to all check types:

Option	Description
Enabled	Whether the check is enabled or not.
Job name	The name of the check. Check metrics include a `job` label with the value of this option.
Target	Target of the check request. Check metrics include an `instance` label with the value of this option.
Probe locations	The locations where the check should run from. Check metrics include a `probe` label with the value of the probe location running the check.
Frequency	How often the check runs, in seconds. The value can range from 120 to 3600 seconds. Only the `sm_check_info` metric includes the `frequency` label.
Timeout	Maximum execution time for the check. For traceroute checks, the value is fixed at 30 seconds.
Custom labels	(Optional) Custom labels applied to check metrics. Refer to Custom labels for querying instructions.

Additionally, Traceroute checks have the following options:

Option	Description
Max hops	Maximum number of hops in a cycle before giving up and starting the next cycle.
Max unknown hops	Maximum number of hop failures or unknown hops allowed in a cycle. When the test reaches this limit, it aborts with a “max unknown hops exceeded” error.
PTR lookup	Perform a reverse lookup from IP to hostname.
Publish full set of metrics	Whether to publish additional metrics to create histograms (used for Apdex scores or heatmaps). Default is false to reduce the number of active series.

The Traceroute-specific options don’t produce any additional labels in the resulting check metrics.
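
As a rough illustration of the constraints in the tables above, the following Python sketch models the options and checks the two documented limits: the 120 to 3600 second frequency range and the fixed 30 second timeout. The class and field names are hypothetical and do not reflect any Grafana API schema. Only `publish_full_metrics` has a default because it is the only default the tables state.

```python
from dataclasses import dataclass
from typing import Dict, List

MIN_FREQUENCY_S = 120
MAX_FREQUENCY_S = 3600
TRACEROUTE_TIMEOUT_S = 30  # fixed for traceroute checks


@dataclass
class TracerouteCheckOptions:
    # Common options (hypothetical field names).
    enabled: bool
    job: str
    target: str
    probes: List[str]
    frequency_s: int
    # Traceroute-specific options.
    max_hops: int
    max_unknown_hops: int
    ptr_lookup: bool
    publish_full_metrics: bool = False  # documented default
    timeout_s: int = TRACEROUTE_TIMEOUT_S

    def problems(self) -> List[str]:
        """Return a list of violated constraints (empty if none)."""
        found = []
        if not MIN_FREQUENCY_S <= self.frequency_s <= MAX_FREQUENCY_S:
            found.append("frequency must be between 120 and 3600 seconds")
        if self.timeout_s != TRACEROUTE_TIMEOUT_S:
            found.append("timeout is fixed at 30 seconds for traceroute checks")
        return found

    def metric_labels(self, probe: str) -> Dict[str, str]:
        """Labels derived from the common options for one probe location.

        The Traceroute-specific options contribute no labels.
        """
        return {"job": self.job, "instance": self.target, "probe": probe}
```

Custom labels are left out of `metric_labels` because how they appear on metrics is covered separately under Custom labels.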

Logs

A typical mtr report shows a list of hops, the addresses at each hop, and summary statistics for each hop:

> sudo mtr -r -c 5 8.8.8.8
HOP:    Address                Loss%  Sent    Last     Avg    Best   Worst
  1:|-- 10.200.1.1              0.0%     5     1.4     1.8     1.2     3.2
  2:|-- 192.168.2.1             0.0%     5     3.3     3.1     2.4     3.5
  3:|-- 242.12.188.135          0.0%     5    10.9    11.6    10.9    12.6
        242.12.189.131
  4:|-- ???                   100.0%     5     0.0     0.0     0.0     0.0
  5:|-- 99.83.65.231            0.0%     5    24.5    25.0    23.6    25.8
  6:|-- 142.250.236.113         0.0%     5    25.4    25.6    24.0    30.0
        172.253.77.117
  7:|-- 8.8.8.8                 0.0%     5    23.9    24.2    23.6    25.0

A traceroute check produces similar output in the form of logs. Each hop or “TTL” generates a log line that includes a list of hosts at each hop, the number of packets sent per hop, packet loss percentage, and the average time to receive a packet for a hop.

The following example shows the equivalent synthetic log output for the above mtr command:

TTL=1 Hosts=10.200.1.1                       LossPercent=0   Sent=5 ElapsedTime=2ms
TTL=2 Hosts=192.168.2.1                      LossPercent=0   Sent=5 ElapsedTime=3ms
TTL=3 Hosts=242.12.188.135,242.12.189.131    LossPercent=0   Sent=5 ElapsedTime=12ms
TTL=4 Hosts=                                 LossPercent=100 Sent=5 ElapsedTime=0ms
TTL=5 Hosts=99.83.65.231                     LossPercent=0   Sent=5 ElapsedTime=25ms
TTL=6 Hosts=142.250.236.113,172.253.77.117   LossPercent=0   Sent=5 ElapsedTime=26ms
TTL=7 Hosts=8.8.8.8                          LossPercent=0   Sent=5 ElapsedTime=24ms

Note
The Sent value in the logs indicates the number of packets sent to each hop. A successful test execution has Sent=5 for each TTL. If a test is aborted due to timeout or “max unknown hops exceeded” during a cycle, partial results are returned, indicated by a Sent value less than 5.
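
To work with these log lines programmatically, a small parser might look like the following. This is a sketch that assumes each line contains exactly the `key=value` fields shown in the example above and that `ElapsedTime` is always expressed in milliseconds. Real log lines may carry additional fields. The `HopLog` type and both function names are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List

EXPECTED_SENT = 5  # a successful execution sends 5 packets to each hop

# Matches key=value pairs; the value may be empty (for example "Hosts=").
_PAIR = re.compile(r"(\w+)=(\S*)")


@dataclass
class HopLog:
    ttl: int
    hosts: List[str]
    loss_percent: float
    sent: int
    elapsed_ms: float


def parse_hop_log(line: str) -> HopLog:
    """Parse one per-hop log line into a HopLog record."""
    fields = dict(_PAIR.findall(line))
    elapsed = fields["ElapsedTime"]
    if elapsed.endswith("ms"):
        elapsed = elapsed[:-2]
    return HopLog(
        ttl=int(fields["TTL"]),
        # An unknown hop has an empty host list.
        hosts=[h for h in fields.get("Hosts", "").split(",") if h],
        loss_percent=float(fields["LossPercent"]),
        sent=int(fields["Sent"]),
        elapsed_ms=float(elapsed),
    )


def is_partial(hop: HopLog) -> bool:
    """True when the hop received fewer packets than a full run sends."""
    return hop.sent < EXPECTED_SENT
```

Applied to the `TTL=4` line above, `parse_hop_log` yields an empty `hosts` list and a `loss_percent` of 100, which matches the `???` row in the mtr report.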

Metrics

Checks store their results as Prometheus metrics. The following metrics are common to all check types:

Metric	Description
`probe_all_duration_seconds`	Returns how long the probe took to complete in seconds (histogram).
`probe_duration_seconds`	Returns how long the probe took to complete in seconds.
`probe_all_success`	Displays whether or not the probe was a success (summary).
`probe_success`	Displays whether or not the probe was a success.
`sm_check_info`	Provides information about a single check configuration.

Additionally, Traceroute checks produce the following metrics:

Metric	Description
`probe_traceroute_packet_loss_percent`	Overall percentage of packet loss during the traceroute.
`probe_traceroute_route_hash`	Hash of all the hosts in a traceroute path. Used to determine route volatility.
`probe_traceroute_total_hops`	Total hops to reach a traceroute destination.
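
One way to read route volatility from `probe_traceroute_route_hash` is to count how often the hash value changes between consecutive samples, which is similar in spirit to what the PromQL `changes()` function computes over a range. The following Python sketch does this for a list of samples that has already been fetched. The function name and the sample values are hypothetical.

```python
from typing import List


def count_route_changes(route_hashes: List[float]) -> int:
    """Count how many times the route hash differs from the previous sample."""
    return sum(
        1
        for previous, current in zip(route_hashes, route_hashes[1:])
        if previous != current
    )
```

A stable path produces a count of 0, while a path that flips between two routes produces a count that grows with every flip.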
