Grafana Cloud
Last reviewed: April 1, 2026

PDC agent metrics

The PDC agent exposes Prometheus-compatible metrics for monitoring and alerting. By default, you can access these metrics at http://<agent-host>:8090/metrics. You can change the metrics port using the -metrics-addr flag. You can also disable metric parsing from SSH logs with the -parse-metrics flag.

Note

For Grafana Cloud-side PDC metrics such as connected agent count and request duration, refer to View PDC activity in the grafanacloud-usage data source.

Scrape the metrics

To collect PDC agent metrics, configure a Prometheus-compatible scraper to target the agent’s metrics endpoint. The following example shows a Grafana Alloy configuration:

Alloy
prometheus.scrape "pdc_agent" {
  targets = [
    {"__address__" = "<agent-host>:8090"},
  ]
  forward_to = [prometheus.remote_write.default.receiver]
}

Replace <agent-host> with the hostname or IP address of the machine running the PDC agent. If you changed the metrics port with -metrics-addr, update the port accordingly.

Available metrics

The agent exposes counters, gauges, and native histograms that cover the following aspects of its behavior:

  • SSH connection count and duration
  • TCP connection counts
  • Signing request latency
  • Restart counts with exit codes
  • Agent version information

The PDC agent exposes the following metrics:

| Metric name | Type | Description | Labels |
|---|---|---|---|
| pdc_agent_agent_info | Gauge | Set to 1 with labels identifying the agent build | version, ssh_version, stack_id |
| pdc_agent_signing_requests_duration_seconds | Native histogram | Duration of signing requests in seconds | status |
| pdc_agent_ssh_connections | Gauge | Number of open SSH connections | none |
| pdc_agent_ssh_restarts_total | Counter | Total number of SSH restarts | connection, exit_code |
| pdc_agent_ssh_open_channels | Gauge | Number of open SSH channels | connection |
| pdc_agent_tcp_connections_total | Counter | Number of opened TCP connections | connection, target, status |
| pdc_agent_ssh_time_to_connect_seconds | Native histogram | Time spent to establish SSH connection | connection |

The connection label

The connection label appears on several metrics and identifies which parallel SSH connection emitted the metric. It corresponds to the -connections flag. When running with the default single connection, the label value is 0. If you increase -connections to 3, you see values 0, 1, and 2.
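
When you run more than one connection, you may want a single value per agent rather than one series per connection. The following query is a minimal sketch that aggregates the connection label away. It assumes the remaining labels, such as the instance label that your scraper attaches, are enough to tell agents apart:

promql
sum without (connection) (pdc_agent_ssh_open_channels)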

Use cases and example queries

The following examples show how to use PDC agent metrics for monitoring, alerting, and troubleshooting.

Confirm agent availability

Use pdc_agent_agent_info to verify that a PDC agent is running and to identify its version:

promql
pdc_agent_agent_info

A result of 1 for each agent instance confirms it is up. The version and ssh_version labels help you verify that agents are running the expected software versions.
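
To see how many agents run each version, you can count the info series by label. This is a sketch that relies only on the version label listed in the metrics table:

promql
count by (version) (pdc_agent_agent_info)

For alerting, one possible approach is to detect when no agent reports at all. Note that absent() returns a result only when the metric has no series whatsoever, so it does not catch a single missing agent among several:

promql
absent(pdc_agent_agent_info)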

Track SSH restart rate

A high rate of SSH restarts can indicate network instability or server-side issues. Use the exit_code label to break down restarts by cause:

promql
sum by (exit_code) (rate(pdc_agent_ssh_restarts_total[5m]))

Refer to PDC agent exit codes for details on what each exit code means.
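
For alerting, a count over a longer window can be easier to reason about than a per-second rate. The following sketch returns only the connection and exit code combinations that restarted more than three times in the last hour. The one-hour window and the threshold of 3 are placeholder values, not recommendations, so tune them to your environment:

promql
sum by (connection, exit_code) (increase(pdc_agent_ssh_restarts_total[1h])) > 3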

Monitor open SSH channels

A rising number of open channels can indicate that an agent is becoming overloaded. The troubleshooting guide recommends monitoring CPU usage as the primary indicator, and the open channel count provides a complementary signal:

promql
pdc_agent_ssh_open_channels
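
To check whether the count is rising rather than read its instantaneous value, you can look at the change over a window. The following sketch uses delta(), which is intended for gauges. A positive result means that more channels are open now than at the start of the window. The one-hour window is an arbitrary choice:

promql
delta(pdc_agent_ssh_open_channels[1h])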

Track TCP connection success and failure rates

Use the status label to compare successful and failed TCP connections to your data source targets:

promql
sum by (target, status) (rate(pdc_agent_tcp_connections_total[5m]))

A high failure rate for a specific target suggests the data source is unreachable from the agent. Refer to the troubleshooting guide for common causes.
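
To express failures as a fraction of all connection attempts per target, you can divide the failed rate by the total rate. This is a sketch only: this page does not list the values of the status label, so <failure-status> is a placeholder that you must replace with the value your agent reports for failed connections:

promql
sum by (target) (rate(pdc_agent_tcp_connections_total{status="<failure-status>"}[5m]))
/
sum by (target) (rate(pdc_agent_tcp_connections_total[5m]))

Targets with no failed connections in the window have no series on the left-hand side, so they do not appear in the result.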

Measure signing request latency

Track the p99 latency of certificate signing requests to detect API performance issues:

promql
histogram_quantile(0.99, rate(pdc_agent_signing_requests_duration_seconds[5m]))
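
If you also want the mean latency, the histogram_sum() and histogram_count() functions work on native histograms. The following sketch divides the rate of observed seconds by the rate of observations, which gives the average request duration over the window for each series:

promql
histogram_sum(rate(pdc_agent_signing_requests_duration_seconds[5m]))
/
histogram_count(rate(pdc_agent_signing_requests_duration_seconds[5m]))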

Monitor SSH connection establishment time

Track how long it takes the agent to establish an SSH connection to the PDC server. Elevated connection times can indicate network latency or congestion:

promql
histogram_quantile(0.99, rate(pdc_agent_ssh_time_to_connect_seconds[5m]))
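
The query above returns one quantile per connection label value. To get a single p99 per agent across all of its parallel connections, you can sum the histograms before computing the quantile. This sketch assumes that your scraper attaches an instance label that identifies each agent:

promql
histogram_quantile(0.99, sum by (instance) (rate(pdc_agent_ssh_time_to_connect_seconds[5m])))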