Questions tagged [prometheus]
The Prometheus monitoring system, including the server, alert manager, push gateway, exporters, client libraries and other components.
7,026
questions
213
votes
3
answers
117k
views
Do I understand Prometheus's rate vs increase functions correctly?
I have read the Prometheus documentation carefully, but it's still a bit unclear to me, so I am here to get confirmation about my understanding.
(Please note that for the sake of the simplest examples ...
196
votes
11
answers
347k
views
Get Total requests in a period of time
I need to show, in Grafana, a panel with the number of requests in the period of time selected in the upper right corner.
For this I need to solve two issues. I will ask the Prometheus question ...
79
votes
2
answers
54k
views
How to persist data in Prometheus running in a Docker container?
I'm developing something that needs Prometheus to persist its data between restarts. Having followed the instructions
$ docker volume create a-new-volume
$ docker run \
--publish 9090:9090 \
...
77
votes
4
answers
51k
views
Usecases: InfluxDB vs. Prometheus [closed]
According to the Prometheus webpage, one main difference between Prometheus and InfluxDB is the use case: while Prometheus stores time series only, InfluxDB is better geared towards storing individual ...
65
votes
6
answers
106k
views
Prometheus - add target specific label in static_configs
I have a job definition as follows:
- job_name: 'test-name'
  static_configs:
    - targets: [ '192.168.1.1:9100', '192.168.1.1:9101', '192.168.1.1:9102' ]
      labels:
        group: '...
65
votes
9
answers
193k
views
How do I write a Prometheus query that returns the value of a label?
I'm making a Grafana dashboard and want a panel that reports the latest version of our app. The version is reported as a label in the app_version_updated (say) metric like so:
app_version_updated{...
63
votes
4
answers
136k
views
Prometheus query to count unique label values
I want to count the number of unique label values, kind of like
select count (distinct a) from hello_info
For example, if my metric 'hello_info' has labels a and b, I want to count the number of unique a's. ...
62
votes
5
answers
208k
views
How to calculate containers' cpu usage in kubernetes with prometheus as monitoring?
I want to calculate the CPU usage of all pods in a Kubernetes cluster. I found two metrics in Prometheus that may be useful:
container_cpu_usage_seconds_total: Cumulative cpu time consumed per cpu in ...
58
votes
3
answers
190k
views
How can I group labels in a Prometheus query?
If I have a metric with the following labels:
my_metric{group="group a"} 100
my_metric{group="group b"} 100
my_metric{group="group c"} 100
my_metric{group="misc group a"} 1
my_metric{group="misc ...
57
votes
3
answers
119k
views
How can I 'join' two metrics in a Prometheus query?
I am using the consul exporter to ingest the health and status of my services into Prometheus. I'd like to fire alerts when the status of services and nodes in Consul is critical and then use tags ...
53
votes
5
answers
109k
views
Increasing Prometheus storage retention
I have a Prometheus server installed on my AWS instance, but the data is removed automatically after 15 days. I need to keep data for months or a year. Is there anything I need to change in my ...
52
votes
4
answers
167k
views
Prometheus - Convert cpu_user_seconds to CPU Usage %?
I'm monitoring docker containers via Prometheus.io. My problem is that I'm just getting cpu_user_seconds_total or cpu_system_seconds_total.
How to convert this ever-increasing value to a CPU ...
49
votes
14
answers
160k
views
Context Deadline Exceeded - prometheus
I have a Prometheus configuration with many jobs where I am scraping metrics over HTTP. But I have one job where I need to scrape the metrics over HTTPS.
When I access:
https://ip-address:port/metrics
I ...
48
votes
1
answer
77k
views
How do I write an "or" logical operator on Prometheus or Grafana
I need to write a query that uses any of the different jobs I define.
{job="traefik" OR job="cadvisor" OR job="prometheus"}
Is it possible to write logical binary operators?
47
votes
2
answers
22k
views
What does the "instant" checkbox in Grafana graphs based on prometheus do?
I have no clue what the option "instant" means in Grafana when creating a graph with Prometheus.
Any ideas?
42
votes
8
answers
91k
views
Getting error "Get http://localhost:9443/metrics: dial tcp 127.0.0.1:9443: connect: connection refused"
I'm trying to configure Prometheus and Grafana with my Hyperledger Fabric v1.4 network to analyze the peer and chaincode metrics. I've mapped the peer container's port 9443 to my host machine's port 9443 ...
41
votes
2
answers
50k
views
Prometheus endpoint of all available metrics
I was curious concerning the workings of Prometheus. Using the Prometheus interface I am able to see a drop-down list which I assume contains all available metrics. However, I am not able to access ...
40
votes
4
answers
83k
views
Different Prometheus scrape URL for every target
Every instance of my application has a different URL.
How can I configure prometheus.yml so that it takes the path of a target along with the host name?
scrape_configs:
  - job_name: 'example-random'...
40
votes
11
answers
89k
views
Relabel instance to hostname in Prometheus
I have Prometheus scraping metrics from node exporters on several machines with a config like this:
scrape_configs:
  - job_name: node_exporter
    static_configs:
      - targets:
        - 1.2.3.4:...
39
votes
2
answers
66k
views
Monitor custom kubernetes pod metrics using Prometheus
I am using Prometheus to monitor my Kubernetes cluster. I have set up Prometheus in a separate namespace. I have multiple namespaces and multiple pods are running. Each pod container exposes a custom ...
39
votes
8
answers
47k
views
Is there a way to monitor kube cron jobs using prometheus
Is there a way to monitor a kube cronjob?
I have a kube cronjob which runs every 10 minutes on my cluster. Is there a way to collect metrics every time my cronjob fails due to some error or notify when my ...
38
votes
3
answers
49k
views
How to use the selected period of time in a query?
I'm using Grafana with Prometheus and I'd like to build a query that depends on the period of time selected in the upper right corner of the screen.
Is there any variable (or something like ...
38
votes
4
answers
59k
views
How can I alert for container restarted?
I'd like to monitor containers using Prometheus and cAdvisor so that I get an alert when a container restarts. Does anyone have a sample Prometheus alert for this?
38
votes
4
answers
37k
views
Monitoring log files using some metrics exporter + Prometheus + Grafana
I need to monitor very different log files for errors, success status, etc. I also need to grab the corresponding metrics using Prometheus, show them in Grafana, and set up some alerting on them. Prometheus + Grafana ...
37
votes
4
answers
30k
views
Why there are both counters and gauges in Prometheus if gauges can act as counters?
When deciding between Counter and Gauge, the Prometheus documentation states that
To pick between counter and gauge, there is a simple rule of thumb: if
the value can go down, it is a gauge. Counters ...
36
votes
3
answers
38k
views
How to display all metrics that don't have a specific label
I want to select all metrics that don't have the label "container". Is there a way to do that with a Prometheus query?
35
votes
3
answers
65k
views
What's the difference between Prometheus and Zabbix? [closed]
What are the differences between Prometheus and Zabbix?
35
votes
3
answers
45k
views
Prometheus: grouping metrics by metric names
Is there a way to group all metrics of an app by metric names? A portion from a query listing all metrics for an app (i.e. {app="bar"}):
ch_qos_logback_core_Appender_all_total{affiliation="foo",app="...
35
votes
2
answers
31k
views
How to add https url on target prometheus
I want to add my HTTPS target URL to Prometheus, but an error like this appears:
"https://myDomain.dev" is not a valid hostname"
My domain is reachable and runs using an Nginx proxy pass with ...
35
votes
3
answers
36k
views
increase() in Prometheus sometimes doubles values: how to avoid?
I've found that for some graphs I get doubled values from Prometheus where there should be just ones:
Query I use:
increase(signups_count[4m])
Scrape interval is set to the recommended maximum of 2 ...
35
votes
4
answers
32k
views
Prometheus (in Docker container) Cannot Scrape Target on Host
Prometheus is running inside a Docker container (version 18.09.2, build 6247962, docker-compose.xml below), and the scrape target is on localhost:8000, which is created by a Python 3 script.
Error ...
35
votes
2
answers
16k
views
Why does increase() return a value of 1.33 in prometheus?
We graph a timeseries with sum(increase(foo_requests_total[1m])) to show the number of foo requests per minute. Requests come in quite sporadically - just a couple of requests per day. The value that ...
34
votes
7
answers
125k
views
Most recent value or last seen value
Prometheus is built around returning a time series representation of metrics. In many cases, however, I only care about what the state of a metric is right now, and I'm having a hard time figuring out ...
34
votes
4
answers
49k
views
How can I visualize a histogram with Promdash or Grafana?
I'm attracted to Prometheus by the histogram (and summary) time series, but I've been unable to display a histogram in either Promdash or Grafana. What I expect is to be able to show:
a ...
34
votes
2
answers
28k
views
How dangerous are high-cardinality labels in Prometheus?
I'm considering exporting some metrics to Prometheus, and I'm getting nervous about what I'm planning to do.
My system consists of a workflow engine, and I'd like to track some metrics for each step ...
33
votes
3
answers
46k
views
Prometheus vs ElasticSearch. Which is better for container and server monitoring? [closed]
ElasticSearch is a document store and more of a search engine. I think ElasticSearch is not a good choice for monitoring high-dimensional data, as it consumes a lot of resources. On the other hand ...
33
votes
3
answers
98k
views
Get total and free disk space using Prometheus
I'm trying to get the total and free disk space on my Kubernetes VM so I can display the percentage of used space on it. I tried various metrics that included "filesystem" in the name, but none of these displayed the correct total ...
32
votes
5
answers
97k
views
How to rename label within a metric in Prometheus
I have a query:
node_systemd_unit_state{instance="server-01",job="node-exporters",name="kubelet.service",state="active"} 1
I want the label name to be renamed (or replaced) to unit_name ONLY within ...
31
votes
1
answer
35k
views
Prometheus - exclude 0 values from query result
I'm displaying a Prometheus query in a Grafana table.
This is the query (Counter metric):
sum(increase(check_fail{app="monitor"}[20m])) by (reason)
The result is a table of failure reason and its count....
31
votes
3
answers
105k
views
prometheus doesn't match regex query
I'm trying to write a prometheus query in grafana that will select visits_total{route!~"/api/docs/*"}
What I'm trying to say is that it should select all the instances where the route doesn't match /...
30
votes
3
answers
53k
views
Generating range vectors from return values in Prometheus queries
I have a metric varnish_main_client_req of type counter and I want to set up an alert that triggers if the rate of requests drops/rises by a certain amount in a given time (e.g. "Amount of ...
29
votes
10
answers
37k
views
Prometheus instant vector vs range vector
There's something I still don't understand about instant vectors and range vectors
Instant vector - a set of time series containing a single sample for each time series, all sharing the same timestamp.
...
29
votes
6
answers
25k
views
How to scrape all metrics from a federate endpoint?
We have a hierarchical Prometheus setup with some servers scraping others.
We'd like to have some servers scrape all metrics from others.
Currently we try to use match[]="{__name__=~".*"}" as a metric ...
29
votes
5
answers
46k
views
Understanding histogram_quantile based on rate in Prometheus
According to the Prometheus documentation, in order to get the 95th percentile from a histogram metric I can use the following query:
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) ...
29
votes
2
answers
21k
views
Prometheus - Aggregate and relabel by regex
I currently have the following PromQL query, which allows me to query the memory used by each of my K8S pods:
sum(container_memory_working_set_bytes{image!="",name=~"^k8s_.*"}) by (pod_name)
The pod's ...
29
votes
5
answers
30k
views
How to gracefully avoid divide by zero in Prometheus
There are times when you need to divide one metric by another metric.
For example, I'd like to calculate a mean latency like this:
rate({__name__="hystrix_command_latency_total_seconds_sum"}[60s])
/
...
28
votes
3
answers
19k
views
Prometheus/PromQL subtract two gauge metrics
I have this gauge metric "metric_awesome" from two different instances.
What I want to do is subtract instance one from instance two, like so
metric_awesome{instance="one"} - metric_awesome{instance="...
28
votes
2
answers
49k
views
multiple values from grafana variable in prometheus query
We have a situation where we need to select multiple values (instances/servers) from a Grafana variable field, and the multiple values need to be passed to the Prometheus query using some regex, so that I ...
28
votes
3
answers
39k
views
How to execute multiple queries in one call in Prometheus
I'm running Prometheus inside a Kubernetes cluster.
I need to send queries to Prometheus every minute, to gather information about many metrics from many containers. There are too many queries, so I ...
26
votes
3
answers
92k
views
Filter prometheus results by metric value, not by label value
Because Prometheus topk returns more results than expected, and because https://github.com/prometheus/prometheus/issues/586 requires client-side processing that has not yet been made available via ...