ljzsdut
GitHubToggle Dark/Light/Auto modeToggle Dark/Light/Auto modeToggle Dark/Light/Auto modeBack to homepage
Edit page

00 Prometheus Operater 监控体系原理梳理文档

一、部署概述

k8s集群:1.18.6,rke部署

监控:prometheus-operator-9.3.1

相关CRD资源介绍及原理

# kubectl get crd |grep coreos
alertmanagers.monitoring.coreos.com           2020-08-31T08:53:59Z
podmonitors.monitoring.coreos.com             2020-09-08T08:46:07Z
prometheuses.monitoring.coreos.com            2020-08-31T08:53:58Z
prometheusrules.monitoring.coreos.com         2020-08-31T08:53:58Z
servicemonitors.monitoring.coreos.com         2020-08-31T08:53:59Z
thanosrulers.monitoring.coreos.com            2020-09-08T08:46:10Z
  • prometheuses:根据此对象,可以自动部署一个Prometheus应用的statefulset
  • alertmanagers:根据此对象,可以自动部署一个Prometheus应用的statefulset
  • servicemonitors:根据此对象,operater会自动生成抓取目标service的配置
  • podmonitors:根据此对象,operater会自动生成抓取目标pod的配置
  • prometheusrules:根据此对象,operater会自动生成rule配置

operater逻辑

Prometheus对象yaml文件:

#kubectl get Prometheus -o yaml
apiVersion: v1
items:
- apiVersion: monitoring.coreos.com/v1
  kind: Prometheus
  metadata:
    annotations:
      meta.helm.sh/release-name: prometheus-operator
      meta.helm.sh/release-namespace: monitoring
      project.cattle.io/namespaces: '["cattle-system","kube-node-lease","kube-public","kube-system","monitoring"]'
    labels:
      app: prometheus-operator-prometheus
      app.kubernetes.io/managed-by: Helm
      chart: prometheus-operator-9.3.1
      heritage: Helm
      release: prometheus-operator
    name: prometheus-operator-prometheus
    namespace: monitoring
  spec:
    alerting:
    	# AlertmanagerEndpoints配置
    	# 指明Prometheus向哪个alertmanager发出告警信息
      alertmanagers:
      - apiVersion: v2
        name: prometheus-operator-alertmanager   #Endpoints对象的名字
        namespace: monitoring
        pathPrefix: /
        port: web
    arbitraryFSAccessThroughSMs: {}
    baseImage: quay.io/prometheus/prometheus
    externalUrl: http://prometheus.rencaiyoujia.cn/
    logFormat: logfmt
    logLevel: info
    
    # podMonitor选择器
    podMonitorNamespaceSelector: {} # 如果未nil,只检查Prometheus这个crd所在的ns,例如本例中的monitoring
    podMonitorSelector:
      matchLabels:
        release: prometheus-operator
        
    portName: web
    replicas: 1
    resources:
      limits:
        cpu: "1"
        memory: 1000Mi
      requests:
        cpu: 750m
        memory: 750Mi
    retention: 10d
    # prometheus路由前缀
    routePrefix: /
    
    # rule选择器:从PrometheusRules资源中加载告警规则等
    ruleNamespaceSelector: {}
    ruleSelector:
      matchLabels:
        app: prometheus-operator
        release: prometheus-operator
        
    rules:
      alert: {}  #指定/--rules.alert.*/命令行参数
    securityContext:
      fsGroup: 2000
      runAsGroup: 2000
      runAsNonRoot: true
      runAsUser: 1000
    ##运行Prometheus这个Pod的ServiceAccount,用于RBAC权限分配。该sa通过ClusterRoleBinding(prometheus-cluster-monitoring-cattle-prometheus)绑定了ClusterRole(prometheus-cluster-monitoring-cattle-prometheus)集群角色
    serviceAccountName: prometheus-operator-prometheus
    
    # serviceMonitor选择器:monitoring名称空间下具有release: prometheus-operator标签的service可以被check
    serviceMonitorNamespaceSelector: {}
    serviceMonitorSelector:
      matchLabels:
        release: prometheus-operator

    storage:
      volumeClaimTemplate:
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 50Gi
          storageClassName: alicloud-nas-subpath
    #Prometheus的版本信息
    version: v2.18.2
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

从上面可以看出,Prometheus资源把alertmanagers、servicemonitors、podmonitors、prometheusrules等资源给联系在一起了。

Prometheus配置文件

以上Prometheus对象会被Operater生成Pormetheus的配置文件,如下:

global:
  scrape_interval: 30s
  scrape_timeout: 10s
  evaluation_interval: 30s
  external_labels:
    prometheus: monitoring/prometheus-operator-prometheus
    prometheus_replica: prometheus-prometheus-operator-prometheus-0
alerting:
  alert_relabel_configs:
  - separator: ;
    regex: prometheus_replica
    replacement: $1
    action: labeldrop
  alertmanagers:
  - kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
        - monitoring
    scheme: http
    path_prefix: /
    timeout: 10s
    api_version: v2
    relabel_configs:
    - source_labels: [__meta_kubernetes_service_name]
      separator: ;
      regex: prometheus-operator-alertmanager
      replacement: $1
      action: keep
    - source_labels: [__meta_kubernetes_endpoint_port_name]
      separator: ;
      regex: web
      replacement: $1
      action: keep
rule_files:
- /etc/prometheus/rules/prometheus-prometheus-operator-prometheus-rulefiles-0/*.yaml
scrape_configs:
- job_name: monitoring/prometheus-operator-alertmanager/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - monitoring
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app]
    separator: ;
    regex: prometheus-operator-alertmanager
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release]
    separator: ;
    regex: prometheus-operator
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_self_monitor]
    separator: ;
    regex: "true"
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: web
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: web
    action: replace
- job_name: monitoring/prometheus-operator-apiserver/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: https
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - default
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    server_name: kubernetes
    insecure_skip_verify: false
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_component]
    separator: ;
    regex: apiserver
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_provider]
    separator: ;
    regex: kubernetes
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: https
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_component]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: https
    action: replace
- job_name: monitoring/prometheus-operator-coredns/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app]
    separator: ;
    regex: prometheus-operator-coredns
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release]
    separator: ;
    regex: prometheus-operator
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_jobLabel]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-metrics
    action: replace
- job_name: monitoring/prometheus-operator-grafana/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - monitoring
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_instance]
    separator: ;
    regex: prometheus-operator
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name]
    separator: ;
    regex: grafana
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: service
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: service
    action: replace
- job_name: monitoring/prometheus-operator-kube-controller-manager/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app]
    separator: ;
    regex: prometheus-operator-kube-controller-manager
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release]
    separator: ;
    regex: prometheus-operator
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_jobLabel]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-metrics
    action: replace
- job_name: monitoring/prometheus-operator-kube-etcd/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app]
    separator: ;
    regex: prometheus-operator-kube-etcd
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release]
    separator: ;
    regex: prometheus-operator
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_jobLabel]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-metrics
    action: replace
- job_name: monitoring/prometheus-operator-kube-proxy/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app]
    separator: ;
    regex: prometheus-operator-kube-proxy
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release]
    separator: ;
    regex: prometheus-operator
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_jobLabel]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-metrics
    action: replace
- job_name: monitoring/prometheus-operator-kube-scheduler/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app]
    separator: ;
    regex: prometheus-operator-kube-scheduler
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release]
    separator: ;
    regex: prometheus-operator
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_jobLabel]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-metrics
    action: replace
- job_name: monitoring/prometheus-operator-kube-state-metrics/0
  honor_labels: true
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - monitoring
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_instance]
    separator: ;
    regex: prometheus-operator
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name]
    separator: ;
    regex: kube-state-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http
    action: replace
- job_name: monitoring/prometheus-operator-kubelet/0
  honor_labels: true
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: https
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: kubelet
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: https-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: https-metrics
    action: replace
  - source_labels: [__metrics_path__]
    separator: ;
    regex: (.*)
    target_label: metrics_path
    replacement: $1
    action: replace
- job_name: monitoring/prometheus-operator-kubelet/1
  honor_labels: true
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics/cadvisor
  scheme: https
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: kubelet
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: https-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: https-metrics
    action: replace
  - source_labels: [__metrics_path__]
    separator: ;
    regex: (.*)
    target_label: metrics_path
    replacement: $1
    action: replace
- job_name: monitoring/prometheus-operator-kubelet/2
  honor_labels: true
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics/probes
  scheme: https
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: kubelet
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: https-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: https-metrics
    action: replace
  - source_labels: [__metrics_path__]
    separator: ;
    regex: (.*)
    target_label: metrics_path
    replacement: $1
    action: replace
- job_name: monitoring/prometheus-operator-kubelet/3
  honor_labels: true
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics/resource/v1alpha1
  scheme: https
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: kubelet
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: https-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: https-metrics
    action: replace
  - source_labels: [__metrics_path__]
    separator: ;
    regex: (.*)
    target_label: metrics_path
    replacement: $1
    action: replace
- job_name: monitoring/prometheus-operator-node-exporter/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - monitoring
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app]
    separator: ;
    regex: prometheus-node-exporter
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release]
    separator: ;
    regex: prometheus-operator
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_jobLabel]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: metrics
    action: replace
- job_name: monitoring/prometheus-operator-operator/0
  honor_labels: true
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - monitoring
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app]
    separator: ;
    regex: prometheus-operator-operator
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release]
    separator: ;
    regex: prometheus-operator
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http
    action: replace
- job_name: monitoring/prometheus-operator-prometheus/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - monitoring
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app]
    separator: ;
    regex: prometheus-operator-prometheus
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release]
    separator: ;
    regex: prometheus-operator
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_self_monitor]
    separator: ;
    regex: "true"
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: web
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: web
    action: replace

rancher中Prometheus对象的yaml文件,仅供参考:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  annotations:
  labels:
    app: prometheus
    chart: prometheus-0.0.1
    heritage: Tiller
    io.cattle.field/appId: cluster-monitoring
    release: cluster-monitoring
  name: cluster-monitoring
  namespace: cattle-prometheus
spec:
  # 指定了Prometheus的中AlertManager的额外配置。指定配置将附加到Prometheus Operator生成的配置中。
  additionalAlertManagerConfigs:
    key: additional-alertmanager-configs.yaml  #secret中的key
    name: prometheus-cluster-monitoring-additional-alertmanager-configs  #值为secret-name
  # 指定了Prometheus的额外的抓取配置的Secret。指定的抓取配置将附加到Prometheus Operator生成的配置中。
  additionalScrapeConfigs:
    key: additional-scrape-configs.yaml
    name: prometheus-cluster-monitoring-additional-scrape-configs  #secret,其中定义了一系列的job_name,实现自定义scrape配置文件
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchLabels:
              app: prometheus
              prometheus: cluster-monitoring
          topologyKey: kubernetes.io/hostname
        weight: 100
  arbitraryFSAccessThroughSMs: {}
  baseImage: rancher/prom-prometheus
  configMaps:  # 在Prometheus Pod中生成/etc/prometheus/configmaps/<configmap-name>
  - prometheus-cluster-monitoring-nginx
  containers:
  - command:
    - /bin/sh
    - -c
    - cp /nginx/run-sh.tmpl /var/cache/nginx/nginx-start.sh; chmod +x /var/cache/nginx/nginx-start.sh;
      /var/cache/nginx/nginx-start.sh
    env:
    - name: POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
    image: rancher/nginx:1.17.4-alpine
    name: prometheus-proxy
    ports:
    - containerPort: 8080
      name: nginx-http
      protocol: TCP
    resources:
      limits:
        cpu: 100m
        memory: 100Mi
      requests:
        cpu: 50m
        memory: 50Mi
    securityContext:
      runAsGroup: 101
      runAsUser: 101
    volumeMounts:
    - mountPath: /nginx
      name: configmap-prometheus-cluster-monitoring-nginx
    - mountPath: /var/cache/nginx
      name: nginx-home
  - args:
    - --proxy-url=http://127.0.0.1:9090
    - --listen-address=$(POD_IP):9090
    - --filter-reader-labels=prometheus
    - --filter-reader-labels=prometheus_replica
    command:
    - prometheus-auth
    env:
    - name: POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
    image: rancher/prometheus-auth:v0.2.0
    livenessProbe:
      failureThreshold: 6
      httpGet:
        path: /-/healthy
        port: web
        scheme: HTTP
      initialDelaySeconds: 300
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 10
    name: prometheus-agent
    ports:
    - containerPort: 9090
      name: web
      protocol: TCP
    readinessProbe:
      failureThreshold: 10
      httpGet:
        path: /-/ready
        port: web
        scheme: HTTP
      initialDelaySeconds: 60
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 10
    resources:
      limits:
        cpu: 500m
        memory: 200Mi
      requests:
        cpu: 100m
        memory: 100Mi
  evaluationInterval: 60s
  externalLabels:
    prometheus_from: rke
  listenLocal: true
  logFormat: logfmt
  logLevel: info
  nodeSelector:
    kubernetes.io/os: linux
  podMetadata:
    labels:
      app: prometheus
      chart: prometheus-0.0.1
      release: cluster-monitoring
  replicas: 1
  resources:
    limits:
      cpu: "1"
      memory: 1000Mi
    requests:
      cpu: 750m
      memory: 750Mi
  retention: 12h
  # rule选择器
  ruleNamespaceSelector:
    matchExpressions:
    - key: field.cattle.io/projectId   #“field.cattle.io/projectId”是ns的label
      operator: In
      values:
      - p-vrc2z
    - key: field.cattle.io/projectId
      operator: In
      values:
      - p-vrc2z
  ruleSelector:
    matchExpressions:
    - key: source
      operator: In
      values:
      - rancher-alert
      - rancher-monitoring
  rules:
    alert: {}
  scrapeInterval: 60s
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: cluster-monitoring 
  # serviceMonitor选择器
  serviceMonitorNamespaceSelector:
    matchExpressions:
    - key: field.cattle.io/projectId
      operator: In
      values:
      - p-vrc2z
    - key: field.cattle.io/projectId
      operator: In
      values:
      - p-vrc2z
  serviceMonitorSelector: {}
  tolerations:
  - effect: NoSchedule
    key: cattle.io/os
    operator: Equal
    value: linux
  version: v2.17.2
  volumes:
  - emptyDir: {}
    name: nginx-home

二、自定义Prometheus配置文件

0、Prometheus的配置文件介绍

Prometheus的配置文件主要有4部分:

global:n. #第1部分:全局配置
  scrape_interval: 1m
  scrape_timeout: 10s
  evaluation_interval: 1m
  external_labels:
    prometheus: cattle-prometheus/cluster-monitoring
    prometheus_from: k8s-prodenv
    prometheus_replica: prometheus-cluster-monitoring-0
alerting: #第2部分:alertmanager配置
  alert_relabel_configs:
  - separator: ;
    regex: prometheus_replica
    replacement: $1
    action: labeldrop
  alertmanagers:
  - static_configs:
    - targets:
      - alertmanager-operated.cattle-prometheus:9093
      labels:
        cluster_id: c-qtpxd
        cluster_name: k8s-prodenv
        level: cluster
    scheme: http
    timeout: 10s
    api_version: v1
rule_files:  #第3部分:规则配置
- /etc/prometheus/rules/prometheus-cluster-monitoring-rulefiles-0/*.yaml
scrape_configs:  #第4部分:抓取配置
- job_name: cattle-prometheus/alertmanager-cluster-alerting/0
...省略...

从Prometheus对象的yaml配置文件中,我们发现可以通过如下方式自定义Prometheus的配置:

  • additionalScrapeConfigs中指定secret,其内容直接追加到Prometheus的配置文件中的scrape_configs部分
  • 定义serviceMonitor对象,由Prometheus-Operater转化为配置文件的scrape_configs部分
  • additionalAlertManagerConfigs中指定secret,其内容直接追加到Prometheus的配置文件中的alerting部分
  • 定义PrometheusRule对象,由Prometheus-Operater转化为配置文件的rule_files部分

1、secret指定配置文件scrape_configs内容

# kubectl get secret prometheus-cluster-monitoring-additional-scrape-configs -o yaml
apiVersion: v1
data:
  additional-scrape-configs.yaml: ***base64格式的密文***
kind: Secret
metadata:
  creationTimestamp: "2020-09-08T08:44:51Z"
  labels:
    app: prometheus
    chart: prometheus-0.0.1
    heritage: Tiller
    io.cattle.field/appId: cluster-monitoring
    release: cluster-monitoring
  name: prometheus-cluster-monitoring-additional-scrape-configs
  namespace: cattle-prometheus
type: Opaque


#解析密文得到:
- job_name: 'istio/envoy-stats'
  scrape_interval: 15s
  metrics_path: /stats/prometheus
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_container_port_name]
      action: keep
      regex: '.*-envoy-prom'
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:15090
      target_label: __address__
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: namespace
    - source_labels: [__meta_kubernetes_pod_name]
      action: replace
      target_label: pod_name
- job_name: 'istio/istio-mesh'
...省略其他...

Prometheus指定alertmanager

操作步骤:

1、修改additional-scrape-configs.yaml文件内容:

- job_name: 'istio/envoy-stats'
  scrape_interval: 15s
  metrics_path: /stats/prometheus
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_container_port_name]
      action: keep
      regex: '.*-envoy-prom'
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:15090
      target_label: __address__
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: namespace
    - source_labels: [__meta_kubernetes_pod_name]
      action: replace
      target_label: pod_name
- job_name: 'istio/istio-mesh'
  scrape_interval: 15s
  kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
          - istio-system
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: istio-telemetry;prometheus
  metric_relabel_configs:
    - source_labels: [source_workload_namespace]
      action: replace
      target_label: namespace
- job_name: 'istio/istio-policy'
  scrape_interval: 15s
  kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
          - istio-system
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: istio-policy;http-monitoring
- job_name: 'istio/istio-telemetry'
  scrape_interval: 15s
  kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
          - istio-system
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: istio-telemetry;http-monitoring
- job_name: 'istio/pilot'
  scrape_interval: 15s
  kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
          - istio-system
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: istio-pilot;http-monitoring
- job_name: 'istio/galley'
  scrape_interval: 15s
  kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
          - istio-system
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: istio-galley;http-monitoring
- job_name: 'istio/citadel'
  scrape_interval: 15s
  kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
          - istio-system
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: istio-citadel;http-monitoring
- job_name: 'prometheus-io-scrape'
  kubernetes_sd_configs:
  - role: pod
    namespaces:
      names:
      - ingress-nginx
      - ingress-controller
      - kube-system
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
    action: replace
    target_label: __scheme__
    regex: (https?)
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    target_label: __address__
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
  - source_labels: [__meta_kubernetes_pod_node_name]
    action: replace
    target_label: node
    regex: (.+)
    replacement: $1
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: namespace
    regex: (.+)
    replacement: $1
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: pod
    regex: (.+)
    replacement: $1
  - source_labels: [__meta_kubernetes_pod_ip]
    action: replace
    target_label: pod_ip
    regex: (.+)
    replacement: $1
  - source_labels: [__meta_kubernetes_pod_host_ip]
    action: replace
    target_label: host_ip
    regex: (.+)
    replacement: $1
  - source_labels: [__meta_kubernetes_pod_controller_kind]
    action: replace
    target_label: created_by_kind
    regex: (.+)
    replacement: $1
  - source_labels: [__meta_kubernetes_pod_controller_name]
    action: replace
    target_label: created_by_kind
    regex: (.+)
    replacement: $1
- job_name: 'kubernetes-http-services'
  metrics_path: /probe
  params:
    module: [http_2xx]  # 使用定义的http模块
  kubernetes_sd_configs:
  - role: service  # service 类型的服务发现
  relabel_configs:
  # 只有service的annotation中配置了 prometheus.io/http_probe=true 的才进行发现
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_http_probe]
    action: keep
    regex: true
  - source_labels: [__address__]
    target_label: __param_target
  - target_label: __address__
    replacement: blackbox:9115  #blackbox-exporter访问地址
  - source_labels: [__param_target]
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    target_label: kubernetes_name

- job_name: 'kubernetes-ingresses'
  metrics_path: /probe
  params:
    module: [http_2xx]  # 使用定义的http模块
  kubernetes_sd_configs:
  - role: ingress  # ingress 类型的服务发现
  relabel_configs:
  # 只有ingress的annotation中配置了 prometheus.io/http_probe=true的才进行发现
  - source_labels: [__meta_kubernetes_ingress_annotation_prometheus_io_http_probe]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_ingress_scheme,__address__,__meta_kubernetes_ingress_path]
    regex: (.+);(.+);(.+)
    replacement: ${1}://${2}${3}
    target_label: __param_target
  - target_label: __address__
    replacement: blackbox:9115
  - source_labels: [__param_target]
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_ingress_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_ingress_name]
    target_label: kubernetes_name
- job_name: 'blackbox_http_2xx'
  scrape_interval: 45s
  metrics_path: /probe
  params:
    module: [http_2xx]  # Look for a HTTP 200 response.
  static_configs:
      - targets:
        - https://oppc.rencaiyoujia.com/ping
        - https://api.rencaiyoujia.com/ping
        - https://www.rencaiyoujia.com
        - https://m.rencaiyoujia.com
        - https://www.m.rencaiyoujia.com
        - https://fa.rencaiyoujia.com
        - https://rencaiyoujia.com
  relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox:9115

2、重新创建secret

#删除原来的secret
kubectl delete secret prometheus-cluster-monitoring-additional-scrape-configs
#重新创建secret
kubectl create secret generic prometheus-cluster-monitoring-additional-scrape-configs --from-file=additional-scrape-configs.yaml

2、定义ServiceMonitor自动生成Prometheus中job部分的配置文件

文档:https://cloud.tencent.com/document/product/248/53288

Prometheus对象可以关联Alertmanager、PrometheusRule、ServiceMonitor。而ServiceMonitor可以被operater生成为Prometheus的配置文件,从而关联到相应的Service(Endpoints)。

pAtTzp-1596614354991

action: keep的作用

l277OZ-1596611840832

3、服务自动发现配置

后面详解!!!

4、自定义告警规则

后面详解!!!

三、自动发现配置

官方示例

如果在我们的 Kubernetes 集群中有了很多的 Service/Pod,那么我们都需要一个一个的去建立一个对应的 ServiceMonitor 对象来进行监控吗?这样岂不是又变得麻烦起来了?

为解决上面的问题,我们可以使用Prometheus的自动发现的抓取配置的来解决这个问题,想要在 Prometheus当中去自动发现并监控具有prometheus.io/scrape=true这个 annotations 的 Service,我们定义的 Prometheus 的抓取配置如下:

- job_name: 'kubernetes-service-endpoints'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
    action: replace
    target_label: __scheme__
    regex: (https?)
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    action: replace
    target_label: __address__
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: kubernetes_name

1、如何在prometheus-operater中配置自动服务发现呢?

cat >> additionalScrapeConfigs-values.yaml <<"EOF"
prometheus:  
  prometheusSpec:
    additionalScrapeConfigs:
    #让Prometheus Operator去自动发现并监控具有prometheus.io/scrape=true这个annotations的 Service
    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name
EOF        

升级Prometheus-operater

helm upgrade prometheus-operator stable/prometheus-operator -f additionalScrapeConfigs-values.yaml  -n monitoring

除了配置prometheus.prometheusSpec.additionalScrapeConfigs,也可以将上面的配置配置成k8s的secret,使用prometheus.prometheusSpec.additionalScrapeConfigsSecret配置。

以上是书写的Prometheus原生的配置文件,我们也可以将其转换为ServiceMonitor对象。

2、测试服务的自动发现

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysqld-exporter
  namespace: cattle-prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysqld-exporter
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: mysqld-exporter
    spec:
      containers:    
      - name: mysqld-exporter
        image: prom/mysqld-exporter
        imagePullPolicy: IfNotPresent
        env:
        - name: DATA_SOURCE_NAME
          value: exporter:rcyj1qaz2wsx@(10.15.9.210:3306)/
          # CREATE USER 'exporter'@'172.22.22.%' IDENTIFIED BY 'rcyj1qaz2wsx' WITH MAX_USER_CONNECTIONS 3;
          # GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'172.22.22.%';  
          # CREATE USER 'exporter'@'10.15.9.%' IDENTIFIED BY 'rcyj1qaz2wsx' WITH MAX_USER_CONNECTIONS 3;
          # GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'10.15.9.%';    
        ports:
        - containerPort: 9104
          protocol: TCP
          name: metrics
        resources:
          limits:
            cpu: 500m
            memory: 500Mi
          requests:
            cpu: 100m
            memory: 128Mi
        # livenessProbe:
        #   failureThreshold: 3
        #   httpGet:
        #     path: /rcyj-platformB
        #     port: 8080
        #     scheme: HTTP
        #   initialDelaySeconds: 35
        #   periodSeconds: 5
        #   successThreshold: 1
        #   timeoutSeconds: 2          
        # readinessProbe:
        #   failureThreshold: 3
        #   httpGet:
        #     path: /rcyj-platformB
        #     port: 8080
        #     scheme: HTTP
        #   initialDelaySeconds: 35
        #   periodSeconds: 5
        #   successThreshold: 1
        #   timeoutSeconds: 2

---
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: "true"  #带有这个注解的svc会自动被抓取
  labels:
    app: mysqld-exporter
  name: mysqld-exporter
  namespace: cattle-prometheus
spec:
  ports:
  - port: 9104
    protocol: TCP
    targetPort: 9104
    name: metrics
  selector:
    app: mysqld-exporter
  sessionAffinity: None
  type: ClusterIP

如果现有服务的port-name不符合服务发现的预配置。可以通过设定注解来改变port-name的值。

  1. __address__:当前Target实例的访问地址[host]:[port]
  2. __scheme__:采集目标服务访问地址的HTTP Scheme,HTTP或者HTTPS
  3. __metrics_path__:采集目标服务访问地址的访问路径

例如:

  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9121"

参考文档:

Prometheus Operator 自动发现以及数据持久化

Prometheus基于service自动添加监控

四、实现自定义告警规则

自定义告警规则可以通过PrometheusRule对象实现。

PrometheusRule对象与Prometheus配置文件的关系:

创建一个 PrometheusRule 资源对象后,operater会将该对象会被转换成configmap/prometheus-prometheus-operator-prometheus-rulefiles-0。然后自动在prometheus这个Pod的/etc/prometheus/rules/prometheus-prometheus-operator-prometheus-rulefiles-0/目录下面生成一个对应的<namespace>-<name>.yaml文件。

1、添加告警规则

创建文件 prometheus-etcdRules.yaml,报警标准:不可用的 etcd 数量超过了一半那么就触发报警

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: etcd-rules
  namespace: monitoring
spec:
  groups:
  - name: etcd
    rules:
    - alert: EtcdClusterUnavailable
      annotations:
        summary: etcd cluster small
        description: If one more etcd peer goes down the cluster will be unavailable
      expr: |
                count(up{job="etcd"} == 0) > (count(up{job="etcd"}) / 2 - 1)
      for: 3m
      labels:
        severity: critical

注意 label 标签一定至少要有 prometheus=k8s 和 role=alert-rules,创建完成后,隔一会儿再去容器中查看下 rules 文件夹,会自动在上面的 prometheus-k8s-rulefiles-0 目录下面生成一个对应的<namespace>-<name>.yaml文件,

kubectl exec -it prometheus-k8s-0 /bin/sh -n monitoring
Defaulting container name to prometheus.
Use 'kubectl describe pod/prometheus-k8s-0 -n monitoring' to see all of the containers in this pod.
/prometheus $ ls /etc/prometheus/rules/prometheus-k8s-rulefiles-0/
monitoring-etcd-rules.yaml            monitoring-prometheus-k8s-rules.yaml

可以看到我们创建的 rule 文件已经被注入到了对应的 rulefiles 文件夹下面了,证明我们上面的设想是正确的。然后再去 Prometheus Dashboard 的 Alert 页面下面就可以查看到上面我们新建的报警规则了:

2、修改或删除告警规则

直接修改PrometheusRule即可。

如果不是helm部署的,也可以通过如下方式:

1、 在prometheus-operator/contrib/kube-prometheus/manifests目录下(这个是Prometheus的k8s的yaml文件,不是helm的文件,可以通过helm template生成。)

先备份原来的prometheus-rules.yaml为prometheus-rules.yaml.default

2、修改vim prometheus-rules.yaml,修改完成后apply

3、 进入pod中kubectl exec -it prometheus-k8s-0 /bin/sh -n monitoring

查看上面注释删除掉告警是否已经不存在了

cat  /etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml |grep xx

4、进入Prometheus web界面,查看规则是否已经删除。

五、其他监控配置

1、etcd监控

2、黑盒监控

监控域名访问可达性、监控证书过期

1、修改secret的yaml文件,添加监控地址(job_name: 'blackbox_http_2xx')

➜  ~ cat additional-scrape-configs.yaml
- job_name: 'istio/envoy-stats'
  scrape_interval: 15s
  metrics_path: /stats/prometheus
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_container_port_name]
      action: keep
      regex: '.*-envoy-prom'
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:15090
      target_label: __address__
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: namespace
    - source_labels: [__meta_kubernetes_pod_name]
      action: replace
      target_label: pod_name
- job_name: 'istio/istio-mesh'
  scrape_interval: 15s
  kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
          - istio-system
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: istio-telemetry;prometheus
  metric_relabel_configs:
    - source_labels: [source_workload_namespace]
      action: replace
      target_label: namespace
- job_name: 'istio/istio-policy'
  scrape_interval: 15s
  kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
          - istio-system
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: istio-policy;http-monitoring
- job_name: 'istio/istio-telemetry'
  scrape_interval: 15s
  kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
          - istio-system
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: istio-telemetry;http-monitoring
- job_name: 'istio/pilot'
  scrape_interval: 15s
  kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
          - istio-system
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: istio-pilot;http-monitoring
- job_name: 'istio/galley'
  scrape_interval: 15s
  kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
          - istio-system
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: istio-galley;http-monitoring
- job_name: 'istio/citadel'
  scrape_interval: 15s
  kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
          - istio-system
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: istio-citadel;http-monitoring
- job_name: 'prometheus-io-scrape'
  kubernetes_sd_configs:
  - role: pod
    namespaces:
      names:
      - ingress-nginx
      - ingress-controller
      - kube-system
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
    action: replace
    target_label: __scheme__
    regex: (https?)
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    target_label: __address__
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
  - source_labels: [__meta_kubernetes_pod_node_name]
    action: replace
    target_label: node
    regex: (.+)
    replacement: $1
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: namespace
    regex: (.+)
    replacement: $1
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: pod
    regex: (.+)
    replacement: $1
  - source_labels: [__meta_kubernetes_pod_ip]
    action: replace
    target_label: pod_ip
    regex: (.+)
    replacement: $1
  - source_labels: [__meta_kubernetes_pod_host_ip]
    action: replace
    target_label: host_ip
    regex: (.+)
    replacement: $1
  - source_labels: [__meta_kubernetes_pod_controller_kind]
    action: replace
    target_label: created_by_kind
    regex: (.+)
    replacement: $1
  - source_labels: [__meta_kubernetes_pod_controller_name]
    action: replace
    target_label: created_by_kind
    regex: (.+)
    replacement: $1
- job_name: 'kubernetes-http-services'
  metrics_path: /probe
  params:
    module: [http_2xx]  # 使用定义的http模块
  kubernetes_sd_configs:
  - role: service  # service 类型的服务发现
  relabel_configs:
  # 只有service的annotation中配置了 prometheus.io/http_probe=true 的才进行发现
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_http_probe]
    action: keep
    regex: true
  - source_labels: [__address__]
    target_label: __param_target
  - target_label: __address__
    replacement: blackbox:9115  #blackbox-exporter访问地址
  - source_labels: [__param_target]
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    target_label: kubernetes_name

- job_name: 'kubernetes-ingresses'
  metrics_path: /probe
  params:
    module: [http_2xx]  # 使用定义的http模块
  kubernetes_sd_configs:
  - role: ingress  # ingress 类型的服务发现
  relabel_configs:
  # 只有ingress的annotation中配置了 prometheus.io/http_probe=true的才进行发现
  - source_labels: [__meta_kubernetes_ingress_annotation_prometheus_io_http_probe]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_ingress_scheme,__address__,__meta_kubernetes_ingress_path]
    regex: (.+);(.+);(.+)
    replacement: ${1}://${2}${3}
    target_label: __param_target
  - target_label: __address__
    replacement: blackbox:9115
  - source_labels: [__param_target]
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_ingress_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_ingress_name]
    target_label: kubernetes_name
- job_name: 'blackbox_http_2xx'
  scrape_interval: 45s
  metrics_path: /probe
  params:
    module: [http_2xx]  # Look for a HTTP 200 response.  
  static_configs:
      - targets:
        - https://oppc.rencaiyoujia.com/ping
        - https://api.rencaiyoujia.com/ping
        - https://www.rencaiyoujia.com
        - https://m.rencaiyoujia.com
        - https://www.m.rencaiyoujia.com
        - https://fa.rencaiyoujia.com
        - https://rencaiyoujia.com
        - https://rcpgadmin.rencaiyoujia.com
        - https://dsfh5.rencaiyoujia.com
        - http://pubapi.rencaiyoujia.com/ping
        - https://assessh5.rencaiyoujia.com
        - https://ommp.rencaiyoujia.com
        - https://ommp.rencaiyoujia.com
        - https://risk-rule.rencaiyoujia.com/ping
        - https://bcpay-ad-api.rencaiyoujia.com/ping
        - https://bcpay-api.rencaiyoujia.com/ping
        - https://walletapi.rencaiyoujia.com/ping
        - https://risk-frontend.rencaiyoujia.com
        - https://withdraw.rencaiyoujia.com
        - https://wallet.rencaiyoujia.com
        - https://nacos.rencaiyoujia.com/nacos
        - https://s.rencaiyoujia.com
        - https://www.xiaocaiyoujia.com
        - https://www.xiaocaiyoujia.com/h5/
        - https://www.m.govjob.rencaiyoujia.com/
        - https://www.rencaiyoujia.com/govapi/test/ping
        - https://govjob.rencaiyoujia.com/auth
        - https://govjobadmin.rencaiyoujia.com/
  relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__  #指定使用黑盒监控的target地址
        replacement: blackbox:9115  

2、删除老的secret,并生成新的secret。注意secret的名称要有Prometheus这个crd中定义的一致。

➜  ~ kubectl delete secret prometheus-cluster-monitoring-additional-scrape-configs -n cattle-prometheus
secret "prometheus-cluster-monitoring-additional-scrape-configs" deleted
➜  ~ kubectl create secret generic prometheus-cluster-monitoring-additional-scrape-configs --from-file=additional-scrape-configs.yaml -n cattle-prometheus
secret/prometheus-cluster-monitoring-additional-scrape-configs created

3、监控外部MySQL

alertmanager配置

查看alertmanager的配置:

kubectl get secret alertmanager-prometheus-operator-alertmanager -o go-template='{{ index .data "alertmanager.yaml" }}' |base64 -D

更新alertmanager的配置:

直接更新此secret即可,自动生效。