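Overview
Tencent Kubernetes Engine (TKE) supports many auto scaling metrics through the Custom Metrics API, covering CPU, memory, disk, network, and GPU, which is enough for most HPA scenarios. For the full list, see Auto Scaling Metrics. For more complex scenarios, such as scaling on the QPS handled by a single replica, you can install prometheus-adapter. Kubernetes provides the Custom Metrics API and the External Metrics API to extend HPA metrics so that users can define metrics that match their needs. prometheus-adapter supports both APIs; in practice, the Custom Metrics API is sufficient for most scenarios. This document describes how to use the Custom Metrics API to scale on a custom metric.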
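Prerequisites
You have created a TKE cluster of version 1.12 or later. For details, see Creating a Cluster.
You have deployed Prometheus and it is collecting the relevant custom metrics.
You have installed Helm.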
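Procedure
Exposing Monitoring Metrics
This document uses a Golang application as an example. The sample program exposes the httpserver_requests_total metric, which counts the HTTP requests received. The application's QPS can be calculated from this metric. Example: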
package main

import (
	"net/http"
	"strconv"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// HTTPRequests counts the HTTP requests received, labeled by response status code.
var HTTPRequests = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "httpserver_requests_total",
		Help: "Number of the http requests received since the server started",
	},
	[]string{"status"},
)

func init() {
	// Register the counter with the default registry served by promhttp.Handler().
	prometheus.MustRegister(HTTPRequests)
}

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		path := r.URL.Path
		code := 200
		switch path {
		case "/test":
			w.WriteHeader(code)
			w.Write([]byte("OK"))
		case "/metrics":
			promhttp.Handler().ServeHTTP(w, r)
		default:
			code = 404 // record the status actually returned for unknown paths
			w.WriteHeader(code)
			w.Write([]byte("Not Found"))
		}
		HTTPRequests.WithLabelValues(strconv.Itoa(code)).Inc()
	})
	http.ListenAndServe(":80", nil)
}
Deploying the Application
Package the program above into a container image and deploy it to the cluster, for example with a Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpserver
  namespace: httpserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpserver
  template:
    metadata:
      labels:
        app: httpserver
    spec:
      containers:
      - name: httpserver
        image: registry.imroc.cc/test/httpserver:custom-metrics
        imagePullPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  name: httpserver
  namespace: httpserver
  labels:
    app: httpserver
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/metrics"
    prometheus.io/port: "http"
spec:
  type: ClusterIP
  ports:
  - port: 80
    protocol: TCP
    name: http
  selector:
    app: httpserver
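Collecting Application Metrics with Prometheus
You can configure Prometheus to collect the metrics exposed by the application either through a Prometheus scrape rule or through a ServiceMonitor.
Method 1: Configure a Prometheus scrape rule
Add the following scrape rule to the Prometheus scrape configuration file. Example: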
scrape_configs:
- job_name: httpserver
  scrape_interval: 10s
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - httpserver
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    regex: httpserver
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http
Method 2: Configure a ServiceMonitor
If prometheus-operator is installed, you can configure Prometheus by creating a ServiceMonitor custom resource. Example:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: httpserver
spec:
  endpoints:
  - port: http
    interval: 5s
  namespaceSelector:
    matchNames:
    - httpserver
  selector:
    matchLabels:
      app: httpserver
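Installing prometheus-adapter
1. prometheus-adapter is installed with Helm. Before installing, decide on and configure the custom metric. Following the example in Exposing Monitoring Metrics above, the application records HTTP requests in the httpserver_requests_total metric, so the QPS of each application Pod can be calculated with the following PromQL query. Example: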
sum(rate(httpserver_requests_total[2m])) by (pod)
2. Convert the query into a prometheus-adapter configuration by creating a values.yaml file with the following content:
rules:
  default: false
  custom:
  - seriesQuery: 'httpserver_requests_total'
    resources:
      template: <<.Resource>>
    name:
      matches: "httpserver_requests_total"
      as: "httpserver_requests_qps" # QPS metric calculated by the PromQL query
    metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
prometheus:
  url: http://prometheus.monitoring.svc.cluster.local # Replace with your Prometheus API address (without the port)
  port: 9090
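The metricsQuery field is a Go template that prometheus-adapter fills in with the series name, the label matchers for the requested objects, and the group-by label. The sketch below is an illustration of that substitution using only the Go standard library and the << >> delimiters. It is not the adapter's actual code. The queryArgs type, the expandQuery function, and the label matcher value are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// queryArgs holds the values substituted into a metricsQuery-style template.
type queryArgs struct {
	Series        string
	LabelMatchers string
	GroupBy       string
}

// expandQuery renders a template that uses << and >> as delimiters.
func expandQuery(tmpl string, args queryArgs) (string, error) {
	t, err := template.New("metricsQuery").Delims("<<", ">>").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	if err := t.Execute(&b, args); err != nil {
		return "", err
	}
	return b.String(), nil
}

func main() {
	q, err := expandQuery(
		"sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)",
		queryArgs{
			Series: "httpserver_requests_total",
			// Hypothetical matcher for a single Pod in the httpserver namespace.
			LabelMatchers: `namespace="httpserver",pod="httpserver-6f94475d45-7rln9"`,
			GroupBy:       "pod",
		},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(q)
}
```

The rendered string is an ordinary PromQL query of the same shape as the one in step 1, restricted to the Pods the HPA asks about.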
3. Install prometheus-adapter with Helm. Note: before installing, delete the Custom Metrics API that TKE has already registered by running the following command:
kubectl delete apiservice v1beta1.custom.metrics.k8s.io
Then run the following Helm commands to install prometheus-adapter:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Helm 3
helm install prometheus-adapter prometheus-community/prometheus-adapter -f values.yaml
# Helm 2
# helm install --name prometheus-adapter prometheus-community/prometheus-adapter -f values.yaml
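4. Add Prometheus authentication parameters. The chart currently provided by the community does not expose authentication-related parameters, so authentication fails and the adapter cannot connect to the TMP service. For background on this issue, see the community documentation. The workaround is to edit the prometheus-adapter Deployment manually and add --prometheus-header=Authorization=Basic {token} to the adapter's startup arguments, where {token} is the token obtained from the console.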
Verification
If the installation succeeded, running the following command returns the configured QPS metric from the Custom Metrics API. Example:
$ kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "jobs.batch/httpserver_requests_qps",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "pods/httpserver_requests_qps",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "namespaces/httpserver_requests_qps",
      "singularName": "",
      "namespaced": false,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}
Run the following command to view the QPS value of each Pod. Example:
Note: the QPS in the example below is 500m, which means a QPS value of 0.5.
$ kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1/namespaces/httpserver/pods/*/httpserver_requests_qps
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/httpserver/pods/%2A/httpserver_requests_qps"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "httpserver",
        "name": "httpserver-6f94475d45-7rln9",
        "apiVersion": "/v1"
      },
      "metricName": "httpserver_requests_qps",
      "timestamp": "2020-11-17T09:14:36Z",
      "value": "500m",
      "selector": null
    }
  ]
}
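The "m" suffix in values such as 500m is the Kubernetes milli-unit notation, meaning thousandths. The following Go sketch converts such a value to a plain number. It is a minimal illustration that handles only bare numbers and the "m" suffix; other Kubernetes quantity suffixes (k, M, Ki, and so on) are deliberately left out, and parseMetricValue is a hypothetical helper, not part of any Kubernetes library.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMetricValue converts a metric value such as "500m" or "50" to a float.
// Only plain numbers and the "m" (milli) suffix are handled.
func parseMetricValue(s string) (float64, error) {
	if strings.HasSuffix(s, "m") {
		milli, err := strconv.ParseFloat(strings.TrimSuffix(s, "m"), 64)
		if err != nil {
			return 0, err
		}
		return milli / 1000, nil
	}
	return strconv.ParseFloat(s, 64)
}

func main() {
	for _, raw := range []string{"500m", "50"} {
		v, err := parseMetricValue(raw)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s = %v\n", raw, v)
	}
}
```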
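Testing HPA
Suppose scale-out should be triggered when the average QPS per application Pod reaches 50, with a minimum of 1 replica and a maximum of 1000 replicas. A sample configuration is as follows: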
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: httpserver
  namespace: httpserver
spec:
  minReplicas: 1
  maxReplicas: 1000
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: httpserver
  metrics:
  - type: Pods
    pods:
      metric:
        name: httpserver_requests_qps
      target:
        averageValue: 50
        type: AverageValue
Run a load test against the application, then run the following commands to check whether it scales out automatically. Example:
$ kubectl get hpa
NAME         REFERENCE               TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
httpserver   Deployment/httpserver   83933m/50   1         1000      2          18h

$ kubectl get pods
NAME                          READY   STATUS              RESTARTS   AGE
httpserver-6f94475d45-47d5w   1/1     Running             0          3m41s
httpserver-6f94475d45-7rln9   1/1     Running             0          37h
httpserver-6f94475d45-6c5xm   0/1     ContainerCreating   0          1s
httpserver-6f94475d45-wl78d   0/1     ContainerCreating   0          1s
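The output above is consistent with the scaling rule documented for the Kubernetes HPA: desired replicas = ceil(current replicas × current value / target value). With 2 replicas averaging 83933m (about 83.9 QPS) against a target of 50, the rule yields 4 replicas, which matches the two Pods being created. The Go sketch below reproduces only this arithmetic, assuming the default tolerance of 0.1. The real controller also accounts for unready Pods, missing metrics, and stabilization windows, none of which are modeled here. The desiredReplicas function is a hypothetical helper for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas applies ceil(current * currentValue / targetValue),
// skips scaling when the ratio is within the tolerance of 1.0, and clamps
// the result to [minReplicas, maxReplicas].
func desiredReplicas(current int, currentValue, targetValue, tolerance float64, minReplicas, maxReplicas int) int {
	ratio := currentValue / targetValue
	desired := current
	if math.Abs(ratio-1.0) > tolerance {
		desired = int(math.Ceil(float64(current) * ratio))
	}
	if desired < minReplicas {
		desired = minReplicas
	}
	if desired > maxReplicas {
		desired = maxReplicas
	}
	return desired
}

func main() {
	// Values taken from the kubectl get hpa output: 2 replicas,
	// average 83.933 QPS per Pod, target 50, min 1, max 1000.
	fmt.Println(desiredReplicas(2, 83.933, 50, 0.1, 1, 1000)) // prints 4
}
```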
If the application scales out as expected, HPA is now scaling on the application's custom metric.