
Metrics API


NOTE: The metrics API may change in the future; this page serves as a snapshot of the current metrics.

Admin

Administrators can monitor the Serving control plane based on the metrics exposed by each Serving component. The metrics are listed below.

Activator

The following metrics allow the user to understand how an application responds when traffic goes through the Activator, for example when scaling from zero. High request latency, for instance, means that requests are taking too long to be fulfilled.
| Metric Name | Description | Type | Tags | Unit | Status |
|:-|:-|:-|:-|:-|:-|
| request_concurrency | Concurrent requests that are routed to Activator.<br>These are requests reported by the concurrency reporter, which may not be done yet.<br>This is the average concurrency over a reporting period | Gauge | configuration_name<br>container_name<br>namespace_name<br>pod_name<br>revision_name<br>service_name | Dimensionless | Stable |
| request_count | The number of requests that are routed to Activator.<br>These are requests that have been fulfilled from the activator handler. | Counter | configuration_name<br>container_name<br>namespace_name<br>pod_name<br>response_code<br>response_code_class<br>revision_name<br>service_name | Dimensionless | Stable |
| request_latencies | The response time in milliseconds for the fulfilled routed requests | Histogram | configuration_name<br>container_name<br>namespace_name<br>pod_name<br>response_code<br>response_code_class<br>revision_name<br>service_name | Milliseconds | Stable |
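
For instance, to watch for slow cold starts you could track the upper percentiles of request_latencies per revision. The following is a minimal sketch, assuming these metrics are scraped into a Prometheus server; the Prometheus URL, the `activator_` name prefix, and the 1-second threshold are assumptions for illustration, not part of the metrics API.

```python
import requests  # third-party HTTP client

PROMETHEUS_URL = "http://prometheus.example.com:9090"  # hypothetical Prometheus endpoint

# 95th-percentile activator request latency per revision over the last 5 minutes.
# The "activator_" prefix and "_bucket" suffix assume a typical Prometheus export;
# adjust the name to match how your cluster exposes Knative metrics.
QUERY = (
    "histogram_quantile(0.95, "
    "sum(rate(activator_request_latencies_bucket[5m])) by (le, revision_name))"
)

resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query", params={"query": QUERY}, timeout=10)
resp.raise_for_status()

for sample in resp.json()["data"]["result"]:
    revision = sample["metric"].get("revision_name", "<unknown>")
    p95_ms = float(sample["value"][1])
    if p95_ms > 1000:  # flag revisions whose p95 latency exceeds 1 second (arbitrary threshold)
        print(f"revision {revision}: p95 latency {p95_ms:.0f} ms")
```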

Autoscaler

The Autoscaler component exposes a number of metrics related to its decisions per revision. For example, at any given time you can monitor the number of pods the Autoscaler wants to allocate for a service, the average number of requests per second during the stable window, whether the Autoscaler is in panic mode (KPA), and so on. For more information about how the Autoscaler works, see the Autoscaling documentation.
| Metric Name | Description | Type | Tags | Unit | Status |
|:-|:-|:-|:-|:-|:-|
| desired_pods | Number of pods the autoscaler wants to allocate | Gauge | configuration_name<br>namespace_name<br>revision_name<br>service_name | Dimensionless | Stable |
| excess_burst_capacity | Excess burst capacity observed over the stable window | Gauge | configuration_name<br>namespace_name<br>revision_name<br>service_name | Dimensionless | Stable |
| stable_request_concurrency | Average number of requests per observed pod over the stable window | Gauge | configuration_name<br>namespace_name<br>revision_name<br>service_name | Dimensionless | Stable |
| panic_request_concurrency | Average number of requests per observed pod over the panic window | Gauge | configuration_name<br>namespace_name<br>revision_name<br>service_name | Dimensionless | Stable |
| target_concurrency_per_pod | The desired number of concurrent requests for each pod | Gauge | configuration_name<br>namespace_name<br>revision_name<br>service_name | Dimensionless | Stable |
| stable_requests_per_second | Average requests-per-second per observed pod over the stable window | Gauge | configuration_name<br>namespace_name<br>revision_name<br>service_name | Dimensionless | Stable |
| panic_requests_per_second | Average requests-per-second per observed pod over the panic window | Gauge | configuration_name<br>namespace_name<br>revision_name<br>service_name | Dimensionless | Stable |
| target_requests_per_second | The desired requests-per-second for each pod | Gauge | configuration_name<br>namespace_name<br>revision_name<br>service_name | Dimensionless | Stable |
| panic_mode | 1 if the autoscaler is in panic mode, 0 otherwise | Gauge | configuration_name<br>namespace_name<br>revision_name<br>service_name | Dimensionless | Stable |
| requested_pods | Number of pods the autoscaler requested from Kubernetes | Gauge | configuration_name<br>namespace_name<br>revision_name<br>service_name | Dimensionless | Stable |
| actual_pods | Number of pods that are allocated and currently in a ready state | Gauge | configuration_name<br>namespace_name<br>revision_name<br>service_name | Dimensionless | Stable |
| not_ready_pods | Number of pods that are currently not ready | Gauge | configuration_name<br>namespace_name<br>revision_name<br>service_name | Dimensionless | Stable |
| pending_pods | Number of pods that are currently pending | Gauge | configuration_name<br>namespace_name<br>revision_name<br>service_name | Dimensionless | Stable |
| terminating_pods | Number of pods that are currently terminating | Gauge | configuration_name<br>namespace_name<br>revision_name<br>service_name | Dimensionless | Stable |
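
As a rough illustration, the sketch below compares desired_pods with actual_pods and flags revisions where the Autoscaler is in panic mode. It assumes the metrics are available from a Prometheus server; the Prometheus URL and the `autoscaler_` name prefix are assumptions and may differ in your cluster.

```python
import requests  # third-party HTTP client

PROMETHEUS_URL = "http://prometheus.example.com:9090"  # hypothetical Prometheus endpoint

def instant_query(promql: str) -> dict:
    """Run an instant PromQL query and return {revision_name: value}."""
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query", params={"query": promql}, timeout=10)
    resp.raise_for_status()
    return {
        s["metric"].get("revision_name", "<unknown>"): float(s["value"][1])
        for s in resp.json()["data"]["result"]
    }

# The "autoscaler_" prefix assumes a typical scrape configuration; adjust as needed.
desired = instant_query("autoscaler_desired_pods")
actual = instant_query("autoscaler_actual_pods")
panic = instant_query("autoscaler_panic_mode")

for revision, want in desired.items():
    have = actual.get(revision, 0.0)
    mode = "PANIC" if panic.get(revision, 0.0) == 1 else "stable"
    if want != have:
        print(f"{revision}: desired={want:.0f} actual={have:.0f} mode={mode}")
```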

Controller

The following metrics are emitted by any component that implements controller logic. They show details about reconciliation operations and the behavior of the workqueue on which reconciliation requests are enqueued.

| Metric Name | Description | Type | Tags | Unit | Status |
|:-|:-|:-|:-|:-|:-|
| work_queue_depth | Depth of the work queue | Gauge | reconciler | Dimensionless | Stable |
| reconcile_count | Number of reconcile operations | Counter | reconciler<br>success | Dimensionless | Stable |
| reconcile_latency | Latency of reconcile operations | Histogram | reconciler<br>success | Milliseconds | Stable |
| workqueue_adds_total | Total number of adds handled by workqueue | Counter | name | Dimensionless | Stable |
| workqueue_depth | Current depth of workqueue | Gauge | reconciler | Dimensionless | Stable |
| workqueue_queue_latency_seconds | How long in seconds an item stays in workqueue before being requested | Histogram | name | Seconds | Stable |
| workqueue_retries_total | Total number of retries handled by workqueue | Counter | name | Dimensionless | Stable |
| workqueue_work_duration_seconds | How long in seconds processing an item from a workqueue takes | Histogram | name | Seconds | Stable |
| workqueue_unfinished_work_seconds | How long in seconds the outstanding workqueue items have been in flight (total) | Histogram | name | Seconds | Stable |
| workqueue_longest_running_processor_seconds | How long in seconds the longest outstanding workqueue item has been in flight | Histogram | name | Seconds | Stable |
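
A persistently deep work queue usually means the controller cannot keep up with reconciliation requests. The following is a minimal sketch, assuming the controller metrics are scraped into Prometheus; the Prometheus URL, the `controller_` name prefix, and the depth threshold are assumptions for illustration.

```python
import requests  # third-party HTTP client

PROMETHEUS_URL = "http://prometheus.example.com:9090"  # hypothetical Prometheus endpoint

# Work queue depth per reconciler; the "controller_" prefix is an assumption and
# depends on how the metrics are exported in your cluster.
QUERY = "controller_work_queue_depth"

resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query", params={"query": QUERY}, timeout=10)
resp.raise_for_status()

for sample in resp.json()["data"]["result"]:
    reconciler = sample["metric"].get("reconciler", "<unknown>")
    depth = float(sample["value"][1])
    if depth > 10:  # arbitrary threshold: a sustained backlog suggests the controller is falling behind
        print(f"reconciler {reconciler}: queue depth {depth:.0f}")
```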

Webhook

Webhook metrics report useful information about operations on Serving resources (for example, CREATE operations) and whether admission was allowed. For example, if a large number of operations fail, this could indicate an issue with the submitted user resources.
| Metric Name | Description | Type | Tags | Unit | Status |
|:-|:-|:-|:-|:-|:-|
| request_count | The number of requests that are routed to webhook | Counter | admission_allowed<br>kind_group<br>kind_kind<br>kind_version<br>request_operation<br>resource_group<br>resource_namespace<br>resource_resource<br>resource_version | Dimensionless | Stable |
| request_latencies | The response time in milliseconds | Histogram | admission_allowed<br>kind_group<br>kind_kind<br>kind_version<br>request_operation<br>resource_group<br>resource_namespace<br>resource_resource<br>resource_version | Milliseconds | Stable |
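
One way to spot problematic user resources is to watch the fraction of webhook requests that are denied admission. A minimal sketch follows, assuming the metrics are in Prometheus; the Prometheus URL, the `webhook_` name prefix, the "false" value for the admission_allowed tag, and the 5% threshold are all assumptions for illustration.

```python
import requests  # third-party HTTP client

PROMETHEUS_URL = "http://prometheus.example.com:9090"  # hypothetical Prometheus endpoint

# Fraction of webhook requests denied admission over the last 10 minutes, per resource kind.
# The metric name, prefix, and the admission_allowed="false" label value are assumptions;
# check the actual exported series in your cluster.
QUERY = (
    'sum(rate(webhook_request_count{admission_allowed="false"}[10m])) by (kind_kind) '
    "/ sum(rate(webhook_request_count[10m])) by (kind_kind)"
)

resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query", params={"query": QUERY}, timeout=10)
resp.raise_for_status()

for sample in resp.json()["data"]["result"]:
    kind = sample["metric"].get("kind_kind", "<unknown>")
    denied_ratio = float(sample["value"][1])
    if denied_ratio > 0.05:  # more than 5% denied: likely malformed user resources
        print(f"{kind}: {denied_ratio:.1%} of admission requests denied")
```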

Go Runtime - memstats

Each Knative Serving control plane process emits a number of Go runtime memory statistics (shown next). As a baseline for monitoring purposes, you could start with a subset of the metrics: current allocations (go_alloc), total allocations (go_total_alloc), system memory (go_sys), mallocs (go_mallocs), frees (go_frees), total garbage collection pause time (go_total_gc_pause_ns), the next GC target heap size (go_next_gc), and the number of garbage collection cycles (go_num_gc).
| Metric Name | Description | Type | Tags | Unit | Status |
|:-|:-|:-|:-|:-|:-|
| go_alloc | The number of bytes of allocated heap objects (same as heap_alloc) | Gauge | name | Dimensionless | Stable |
| go_total_alloc | The cumulative bytes allocated for heap objects | Gauge | name | Dimensionless | Stable |
| go_sys | The total bytes of memory obtained from the OS | Gauge | name | Dimensionless | Stable |
| go_lookups | The number of pointer lookups performed by the runtime | Gauge | name | Dimensionless | Stable |
| go_mallocs | The cumulative count of heap objects allocated | Gauge | name | Dimensionless | Stable |
| go_frees | The cumulative count of heap objects freed | Gauge | name | Dimensionless | Stable |
| go_heap_alloc | The number of bytes of allocated heap objects | Gauge | name | Dimensionless | Stable |
| go_heap_sys | The number of bytes of heap memory obtained from the OS | Gauge | name | Dimensionless | Stable |
| go_heap_idle | The number of bytes in idle (unused) spans | Gauge | name | Dimensionless | Stable |
| go_heap_in_use | The number of bytes in in-use spans | Gauge | name | Dimensionless | Stable |
| go_heap_released | The number of bytes of physical memory returned to the OS | Gauge | name | Dimensionless | Stable |
| go_heap_objects | The number of allocated heap objects | Gauge | name | Dimensionless | Stable |
| go_stack_in_use | The number of bytes in stack spans | Gauge | name | Dimensionless | Stable |
| go_stack_sys | The number of bytes of stack memory obtained from the OS | Gauge | name | Dimensionless | Stable |
| go_mspan_in_use | The number of bytes of allocated mspan structures | Gauge | name | Dimensionless | Stable |
| go_mspan_sys | The number of bytes of memory obtained from the OS for mspan structures | Gauge | name | Dimensionless | Stable |
| go_mcache_in_use | The number of bytes of allocated mcache structures | Gauge | name | Dimensionless | Stable |
| go_mcache_sys | The number of bytes of memory obtained from the OS for mcache structures | Gauge | name | Dimensionless | Stable |
| go_bucket_hash_sys | The number of bytes of memory in profiling bucket hash tables | Gauge | name | Dimensionless | Stable |
| go_gc_sys | The number of bytes of memory in garbage collection metadata | Gauge | name | Dimensionless | Stable |
| go_other_sys | The number of bytes of memory in miscellaneous off-heap runtime allocations | Gauge | name | Dimensionless | Stable |
| go_next_gc | The target heap size of the next GC cycle | Gauge | name | Dimensionless | Stable |
| go_last_gc | The time the last garbage collection finished, as nanoseconds since 1970 (the UNIX epoch) | Gauge | name | Nanoseconds | Stable |
| go_total_gc_pause_ns | The cumulative nanoseconds in GC stop-the-world pauses since the program started | Gauge | name | Nanoseconds | Stable |
| go_num_gc | The number of completed GC cycles | Gauge | name | Dimensionless | Stable |
| go_num_forced_gc | The number of GC cycles that were forced by the application calling the GC function | Gauge | name | Dimensionless | Stable |
| go_gc_cpu_fraction | The fraction of this program's available CPU time used by the GC since the program started | Gauge | name | Dimensionless | Stable |

NOTE: The name tag is empty.
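
As a starting point for the baseline above, the sketch below pulls two of these gauges for each control plane process from Prometheus. The Prometheus URL and the pod/instance labels used for grouping are assumptions; the metric names match the table above but may carry a prefix in your setup.

```python
import requests  # third-party HTTP client

PROMETHEUS_URL = "http://prometheus.example.com:9090"  # hypothetical Prometheus endpoint

def instant_query(promql: str) -> list:
    """Run an instant PromQL query and return the raw result vector."""
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query", params={"query": promql}, timeout=10)
    resp.raise_for_status()
    return resp.json()["data"]["result"]

# Heap in use and cumulative GC pause time per scraped control plane process.
for sample in instant_query("go_heap_in_use"):
    target = sample["metric"].get("pod", sample["metric"].get("instance", "<unknown>"))
    print(f"{target}: heap in use {float(sample['value'][1]) / 1024 / 1024:.1f} MiB")

for sample in instant_query("go_total_gc_pause_ns"):
    target = sample["metric"].get("pod", sample["metric"].get("instance", "<unknown>"))
    print(f"{target}: total GC pause {float(sample['value'][1]) / 1e6:.1f} ms")
```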

Developer - User Services

Every Knative Service has a proxy container (queue-proxy) that proxies connections to the application container. A number of metrics are reported for the queue proxy's performance. Using the following metrics, application developers and operators can measure whether requests are queued at the proxy side (indicating a need for backpressure) and what the actual delay is in serving requests at the application side.

Queue proxy

Requests endpoint

| Metric Name | Description | Type | Tags | Unit | Status |
|:-|:-|:-|:-|:-|:-|
| revision_request_count | The number of requests that are routed to queue-proxy | Counter | configuration_name<br>container_name<br>namespace_name<br>pod_name<br>response_code<br>response_code_class<br>revision_name<br>service_name | Dimensionless | Stable |
| revision_request_latencies | The response time in milliseconds | Histogram | configuration_name<br>container_name<br>namespace_name<br>pod_name<br>response_code<br>response_code_class<br>revision_name<br>service_name | Milliseconds | Stable |
| revision_app_request_count | The number of requests that are routed to user-container | Counter | configuration_name<br>container_name<br>namespace_name<br>pod_name<br>response_code<br>response_code_class<br>revision_name<br>service_name | Dimensionless | Stable |
| revision_app_request_latencies | The response time in milliseconds | Histogram | configuration_name<br>namespace_name<br>pod_name<br>response_code<br>response_code_class<br>revision_name<br>service_name | Milliseconds | Stable |
| revision_queue_depth | The current number of items in the serving and waiting queue, or not reported if unlimited concurrency | Gauge | configuration_name<br>container_name<br>namespace_name<br>pod_name<br>response_code_class<br>revision_name<br>service_name | Dimensionless | Stable |
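
To estimate how much time requests spend queued at the proxy rather than in the application, you can compare the two latency histograms above. A minimal sketch follows, assuming these metrics are available in Prometheus; the Prometheus URL and any name prefix added by your exporter are assumptions.

```python
import requests  # third-party HTTP client

PROMETHEUS_URL = "http://prometheus.example.com:9090"  # hypothetical Prometheus endpoint

def p95(metric: str) -> dict:
    """95th-percentile latency per revision over the last 5 minutes, in milliseconds."""
    promql = f"histogram_quantile(0.95, sum(rate({metric}_bucket[5m])) by (le, revision_name))"
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query", params={"query": promql}, timeout=10)
    resp.raise_for_status()
    return {
        s["metric"].get("revision_name", "<unknown>"): float(s["value"][1])
        for s in resp.json()["data"]["result"]
    }

proxy = p95("revision_request_latencies")    # measured at queue-proxy
app = p95("revision_app_request_latencies")  # measured at the user-container

for revision, total_ms in proxy.items():
    app_ms = app.get(revision)
    if app_ms is not None:
        # The gap roughly approximates time spent queued in the proxy before reaching the app.
        queueing_ms = max(total_ms - app_ms, 0)
        print(f"{revision}: proxy p95 {total_ms:.0f} ms, app p95 {app_ms:.0f} ms, "
              f"queueing ~{queueing_ms:.0f} ms")
```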