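When request traffic to an application's API surges, you can configure a Horizontal Pod Autoscaler (HPA) policy based on the queries per second (QPS) of a Java application's API, so that the application scales automatically. This topic describes how to implement HPA-based scaling by using the Application Monitoring (APM) service of Application Real-Time Monitoring Service (ARMS).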
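Contents
- How it works
- Prerequisites
- Step 1: Install the ARMS APM application monitoring component
- Step 2: Grant access permissions on ARMS resources
- Step 3: Enable ARMS APM application monitoring for the Java application
- Step 4: Connect the alibaba-cloud-metrics-adapter component
- Step 5: Configure HPA scaling based on the APM metric
- Verify scaling with a load test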
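How it works
After a Java application in an ACK cluster is connected to ARMS APM, you can use ARMS APM to obtain detailed access data for the application's APIs. For more information about how to connect a Java application to ARMS APM, see Application Monitoring. ARMS APM converts its data into the Managed Service for Prometheus data format, and the alibaba-cloud-metrics-adapter component converts those Prometheus metrics into metrics that the HPA can consume. Together, these components enable HPA-based scaling of the application.
This topic deploys the arms-springboot-demo application as an example and runs a load test against its /demo/queryUser/10 API.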
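Prerequisites
- The Managed Service for Prometheus monitoring component is deployed. For more information, see Enable Managed Service for Prometheus.
- The alibaba-cloud-metrics-adapter component is deployed in the kube-system namespace. For more information, see Deploy the alibaba-cloud-metrics-adapter component.
- A namespace is created. For more information, see Manage namespaces and quotas. This topic uses the example namespace arms-demo.
- A JDK is installed. For the JDK versions that ARMS APM supports, see Java components and frameworks supported by ARMS Application Monitoring.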
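Step 1: Install the ARMS APM application monitoring component
To connect an application to ARMS APM, you must install the ARMS APM application monitoring component, ack-onepilot, in the cluster.
- Log on to the Container Service for Kubernetes (ACK) console. In the left-side navigation pane, click Clusters.
- On the Clusters page, click the name of the target cluster. In the left-side navigation pane, choose Operations > Add-ons.
- On the Add-ons page, search for the ack-onepilot component and click Install on its card. Configure the parameters as prompted in the dialog box, and then click OK.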
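After the installation completes, one way to confirm that the component is up is to look for its Pods from the command line. The following is a minimal sketch rather than an official procedure. It assumes that kubectl is configured for the target cluster and that the component's Pod names contain the string "onepilot" (an assumption derived from the component name ack-onepilot). The helper name onepilot_not_running is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads the output of
#   kubectl get pods --all-namespaces --no-headers
# on stdin (columns: NAMESPACE NAME READY STATUS RESTARTS AGE) and prints
# every Pod whose name contains "onepilot" and whose STATUS is not Running.
# Assumption: ack-onepilot Pod names contain the string "onepilot".
onepilot_not_running() {
  awk '$2 ~ /onepilot/ && $4 != "Running" { print $1 "/" $2 ": " $4 }'
}

# Usage (requires access to the cluster):
#   kubectl get pods --all-namespaces --no-headers | onepilot_not_running
```

If the pipeline prints nothing, every matching Pod reports the Running status. If it prints nothing because no Pod matched at all, check the Pod names manually.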
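Step 2: Grant access permissions on ARMS resources
- To monitor applications in an ACK Serverless cluster or in a cluster that is connected to ECI, complete the authorization on the Cloud Resource Access Authorization page, and then restart all Pods of the ack-onepilot component.
- To monitor applications in an ACK cluster, first check whether an ARMS Addon Token exists in the cluster. For more information, see Check whether an ARMS Addon Token exists in a cluster.
  - If an ARMS Addon Token exists in the ACK cluster, ARMS performs password-free authorization.
    Note
    ACK managed clusters have an ARMS Addon Token by default. However, some ACK managed clusters that were created early may not have one. In that case, follow the instructions below to manually grant the cluster access to ARMS resources.
  - If no ARMS Addon Token exists in the ACK cluster, perform the following steps to manually grant the cluster access to ARMS resources.
    - Log on to the ACK console. In the left-side navigation pane, click Clusters.
    - On the Clusters page, click the name of the target cluster. In the left-side navigation pane, click Cluster Information.
    - Click the Cluster Resources tab, and then click the link next to Worker RAM Role. On the RAM role management page, click the policy name on the Permissions tab.
    - On the Policy Document tab, click Modify Policy Document, add the following content on the script editor tab, and then click Next to edit the basic information.
{
  "Action": "arms:*",
  "Resource": "*",
  "Effect": "Allow"
}
    - On the policy editing page, confirm the policy content, and then click OK.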
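Step 3: Enable ARMS APM application monitoring for the Java application
When you deploy a Java application in the cluster, you enable ARMS APM application monitoring by adding labels to the application.
- Log on to the ACK console. In the left-side navigation pane, click Clusters.
- On the Clusters page, click the name of the target cluster. In the left-side navigation pane, choose Workloads > Deployments.
- In the upper-right corner of the Deployments page, click Create from YAML.
- Select a sample template, and add the following labels under spec.template.metadata in the template (YAML format).
labels:
  armsPilotAutoEnable: "on"
  armsPilotCreateAppName: ""            # Replace the value with your application name.
  one-agent.jdk.version: "OpenJDK11"    # Required only if the application runs on JDK 11.
  armsSecAutoEnable: "on"               # Required only if you want to enable Application Security.
  Note
  - For details about Application Security, see What is Application Security.
  - Enabling Application Security incurs fees. For billing details, see Billing rules.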
The following sample YAML template shows how to create stateless applications (Deployments) with ARMS APM application monitoring enabled.
Complete sample YAML file (Java):
apiVersion: v1
kind: Namespace
metadata:
  name: arms-demo
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: arms-springboot-demo
  namespace: arms-demo
  labels:
    app: arms-springboot-demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: arms-springboot-demo
  template:
    metadata:
      labels:
        app: arms-springboot-demo
        armsPilotAutoEnable: "on"
        armsPilotCreateAppName: "arms-k8s-demo"
        one-agent.jdk.version: "OpenJDK11"
    spec:
      containers:
        - resources:
            limits:
              cpu: 0.5
          image: registry.cn-hangzhou.aliyuncs.com/arms-docker-repo/arms-springboot-demo:v0.1
          imagePullPolicy: Always
          name: arms-springboot-demo
          env:
            - name: SELF_INVOKE_SWITCH
              value: "true"
            - name: COMPONENT_HOST
              value: "arms-demo-component"
            - name: COMPONENT_PORT
              value: "6666"
            - name: MYSQL_SERVICE_HOST
              value: "arms-demo-mysql"
            - name: MYSQL_SERVICE_PORT
              value: "3306"
---
apiVersion: v1
kind: Service
metadata:
  labels:
    name: arms-springboot-demo
  name: arms-springboot-demo
  namespace: arms-demo
spec:
  ports:
    # the port that this service should serve on
    - name: arms-demo-svc
      port: 6666
      targetPort: 8888
  # label keys and values that must match in order to receive traffic for this service
  selector:
    app: arms-springboot-demo
---
apiVersion: apps/v1 # for versions before 1.8.0 use apps/v1beta1
kind: Deployment
metadata:
  name: arms-springboot-demo-subcomponent
  namespace: arms-demo
  labels:
    app: arms-springboot-demo-subcomponent
spec:
  replicas: 2
  selector:
    matchLabels:
      app: arms-springboot-demo-subcomponent
  template:
    metadata:
      labels:
        app: arms-springboot-demo-subcomponent
        armsPilotAutoEnable: "on"
        armsPilotCreateAppName: "arms-k8s-demo-subcomponent"
        one-agent.jdk.version: "OpenJDK11"
    spec:
      containers:
        - resources:
            limits:
              cpu: 0.5
          image: registry.cn-hangzhou.aliyuncs.com/arms-docker-repo/arms-springboot-demo:v0.1
          imagePullPolicy: Always
          name: arms-springboot-demo-subcomponent
          env:
            - name: SELF_INVOKE_SWITCH
              value: "false"
            - name: MYSQL_SERVICE_HOST
              value: "arms-demo-mysql"
            - name: MYSQL_SERVICE_PORT
              value: "3306"
---
apiVersion: v1
kind: Service
metadata:
  labels:
    name: arms-demo-component
  name: arms-demo-component
  namespace: arms-demo
spec:
  ports:
    # the port that this service should serve on
    - name: arms-demo-component-svc
      port: 6666
      targetPort: 8888
  # label keys and values that must match in order to receive traffic for this service
  selector:
    app: arms-springboot-demo-subcomponent
---
apiVersion: apps/v1 # for versions before 1.8.0 use apps/v1beta1
kind: Deployment
metadata:
  name: arms-demo-mysql
  namespace: arms-demo
  labels:
    app: mysql
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - resources:
            limits:
              cpu: 0.5
          image: registry.cn-hangzhou.aliyuncs.com/arms-docker-repo/arms-demo-mysql:v0.1
          name: mysql
          ports:
            - containerPort: 3306
              name: mysql
---
apiVersion: v1
kind: Service
metadata:
  labels:
    name: mysql
  name: arms-demo-mysql
  namespace: arms-demo
spec:
  ports:
    # the port that this service should serve on
    - name: arms-mysql-svc
      port: 3306
      targetPort: 3306
  # label keys and values that must match in order to receive traffic for this service
  selector:
    app: mysql
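If the Deployment already exists, the same labels can in principle be added from the command line instead of through the console. The following is a hedged sketch that builds a JSON merge patch for spec.template.metadata.labels and applies it with kubectl patch. The helper name arms_label_patch is hypothetical, the sketch assumes an application name without quotation marks or backslashes, and changing Pod template labels triggers a rolling restart of the Deployment.

```shell
#!/bin/sh
# Hypothetical helper: prints a JSON merge patch that adds the two ARMS labels
# from this step to spec.template.metadata.labels.
# $1 is the application name to report to ARMS (armsPilotCreateAppName).
arms_label_patch() {
  printf '{"spec":{"template":{"metadata":{"labels":{"armsPilotAutoEnable":"on","armsPilotCreateAppName":"%s"}}}}}\n' "$1"
}

# Usage (requires access to the cluster; the names are the ones used in this topic):
#   kubectl patch deployment arms-springboot-demo -n arms-demo \
#     --type merge -p "$(arms_label_patch arms-k8s-demo)"
```

A merge patch only adds or overwrites the listed keys, so the existing app label on the Pod template is left untouched.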
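- Check the result of the deployment.
  On the Deployments page, an ARMS Console button appears in the Actions column of the target application.
  Click ARMS Console to view the monitoring data. In the left-side navigation pane, click Interface Invocation to view access details for the application's APIs, such as HTTP APIs. The demo application provided here, arms-springboot-demo, automatically generates a steady stream of API calls.
- Manually create a Service that is associated with the arms-springboot-demo application, and enable load balancing so that you can access the application's APIs.
  - On the Clusters page, click the name of the target cluster. In the left-side navigation pane, choose Network > Services.
  - In the upper-right corner of the page, click Create, configure a Service that is associated with the application, and then click Create. For descriptions of the parameters, see Create a Service.
    Wait until the Service is created. On the Services page, record the external endpoint of arms-demo-svc, for example 47.94.XX.XX:8080.
  - Run the following command to access the /demo/queryUser/10 API of the Service through the external endpoint.
curl http://47.94.XX.XX:8080/demo/queryUser/10
    Expected output:
{"id":1,"name":"KeyOfSpectator","password":"12****"}
    The output indicates that the API is accessible.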
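Step 4: Connect the alibaba-cloud-metrics-adapter component
Important
- Make sure that the Managed Service for Prometheus monitoring component is deployed. Otherwise, you cannot perform this step. For more information, see Enable Managed Service for Prometheus.
- Make sure that the alibaba-cloud-metrics-adapter component is deployed in the kube-system namespace. Otherwise, you cannot perform this step. For more information, see Deploy the alibaba-cloud-metrics-adapter component.
- Log on to the ARMS console. In the left-side navigation pane, choose Prometheus Monitoring > Prometheus Instances.
  Note
  After ARMS APM is activated in a region, Managed Service for Prometheus creates a corresponding Prometheus instance named arms_metrics_{RegionId}_XXX, for example arms_metrics_cn-beijing_cloud_beijing.
- Click the name of the target instance (in the format arms_metrics_{RegionId}_XXX). In the left-side navigation pane, click Settings. At the bottom of the Settings tab on the right, view and record the HTTP API URL (Grafana read URL). This is the Prometheus URL.
- Enter the HTTP API URL (Grafana read URL) recorded in the previous step into ack-alibaba-cloud-metrics-adapter.
  - Log on to the ACK console. In the left-side navigation pane, click Clusters.
  - On the Clusters page, click the name of the target cluster. In the left-side navigation pane, choose Applications > Helm.
  - On the Helm page, find ack-alibaba-cloud-metrics-adapter and click Update in the Actions column.
  - In the Update Release panel, insert the Prometheus URL that you recorded in step 2.
  - Write the following configuration into the adapter-config of ack-alibaba-cloud-metrics-adapter.
rules:
- metricsQuery: sum by (rpc) (sum_over_time(<<.Series>>{rpc="/demo/queryUser/{id}",service="arms-demo:arms-k8s-demo",prpc="__all__",ppid="__all__",endpoint="__all__",destId="__all__",<<.LabelMatchers>>}[1m]))
  name:
    as: ${1}_per_second_queryuser
    matches: ^(.*)_count
  resources:
    namespaced: false
  seriesQuery: arms_app_requests_count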
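The name section of the rule determines the metric name that the HPA sees: matches captures everything before _count in the series name arms_app_requests_count, and as appends _per_second_queryuser to the captured group. The sketch below reproduces that rewrite with sed purely as an illustration. The adapter performs the rewrite itself, and the function name exposed_metric_name is hypothetical.

```shell
#!/bin/sh
# Illustration only: mimic the rule's name rewrite
#   matches: ^(.*)_count
#   as:      ${1}_per_second_queryuser
exposed_metric_name() {
  printf '%s\n' "$1" | sed -E 's/^(.*)_count/\1_per_second_queryuser/'
}

exposed_metric_name arms_app_requests_count
# prints: arms_app_requests_per_second_queryuser
```

The printed name is the one to look for in the external metrics API and to reference in the HPA definition.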
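- Run the following commands to view the metric data in the cluster.
  - Run the following command to check whether the metric arms_app_requests_per_second_queryuser exists.
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"
    Expected output: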
{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"external.metrics.k8s.io/v1beta1","resources":[{"name":"k8s_workload_memory_working_set","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"k8s_workload_memory_rss","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"sls_ingress_latency_p9999","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"slb_l7_qps","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"cost_memory_usage","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"slb_l4_traffic_rx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"sls_ingress_inflow","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"slb_l7_upstream_rt","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"cost_ratio","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"slb_l4_traffic_tx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"slb_l4_packet_rx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"ahas_sentinel_pass_qps","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"cost_memory_request","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"sls_ingress_latency_avg","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"slb_l4_max_connection","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"cost_cpu_limit","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"cost_day","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"cost_month","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"slb_l4_connection_utilization","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"slb_l7_status_5xx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"k8s_workload_cpu_request","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"k8s_workload_network_rx_rate","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"k8s_workload_network_rx_errors","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"cost_week","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"k8s_workload_memory_cache","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"cost_cpu_request","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"cost_percorepricing","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"sls_ingress_qps","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"slb_l4_packet_tx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"slb_l7_status_4xx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"slb_l7_upstream_5xx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"k8s_workload_cpu_limit","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"ahas_sentinel_block_qps","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"cost_cpu_utilization","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"sls_alb_ingress_qps","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"cost_memory_utilization","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"sls_ingress_latency_p95","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"slb_l4_active_connection","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"k8s_workload_memory_limit","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"cost_hour","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"slb_l7_status_2xx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"slb_l7_status_3xx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"k8s_workload_cpu_util","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"k8s_workload_memory_usage","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"k8s_workload_network_tx_rate","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"ahas_sentinel_total_qps","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"cost_cpu_usage","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"k8s_workload_network_tx_errors","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"slb_l7_rt","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"cost_min","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"sls_ingress_latency_p50","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"sls_ingress_latency_p99","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"slb_l7_upstream_4xx","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"k8s_workload_memory_request","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"ahas_sentinel_avg_rt","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"cost_memory_limit","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]},{"name":"arms_app_requests_per_second_queryuser","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]}]}
    The output indicates that the metric arms_app_requests_per_second_queryuser exists.
  - Run the following command to view the real-time value of the metric.
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/arms-demo/arms_app_requests_per_second_queryuser" | jq .
    Expected output:
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "arms_app_requests_per_second_queryuser",
      "metricLabels": {
        "rpc": "/demo/queryUser/10"
      },
      "timestamp": "2022-11-09T07:49:07Z",
      "value": "6"
    }
  ]
}
    The output indicates that real-time data is returned as expected.
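Step 5: Configure HPA scaling based on the APM metric
- Create a file named hpa.yaml with the following content.
  Note
  - The metric name in hpa.yaml must match the metric name defined in ack-alibaba-cloud-metrics-adapter in the previous step.
  - The target in hpa.yaml is the scaling threshold. The application scales out when the QPS exceeds 40.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: test-hpa
  namespace: arms-demo    # Must match the namespace of the target Deployment and the -n arms-demo option used in the commands below.
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: arms-springboot-demo
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: arms_app_requests_per_second_queryuser
        # External metrics support only the Value and AverageValue target types.
        target:
          type: AverageValue
          averageValue: 40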
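For an External metric with an AverageValue target, the HPA in effect divides the total metric value by the target and rounds up, and then keeps the result between minReplicas and maxReplicas. The sketch below works through that arithmetic with the values used in this topic. It is a simplification that ignores the controller's tolerance and stabilization behavior, and the function name desired_replicas is hypothetical.

```shell
#!/bin/sh
# Simplified HPA arithmetic for an External metric with an AverageValue target.
# Usage: desired_replicas METRIC_TOTAL TARGET_AVERAGE MIN_REPLICAS MAX_REPLICAS
desired_replicas() {
  awk -v total="$1" -v target="$2" -v min="$3" -v max="$4" 'BEGIN {
    n = int(total / target)
    if (n * target < total) n++      # round up to the next whole replica
    if (n < min) n = min             # never go below minReplicas
    if (n > max) n = max             # never go above maxReplicas
    print n
  }'
}

desired_replicas 4216 40 1 10   # metric value under load in this topic; prints 10 (capped by maxReplicas)
desired_replicas 100 40 1 10    # prints 3
desired_replicas 6 40 1 10      # metric value before the load test in this topic; prints 1
```

With a target of 40 and maxReplicas set to 10, any total above 360 already drives the Deployment to its maximum size.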
- Run the following command to deploy the HPA for the arms-springboot-demo application.
kubectl apply -f hpa.yaml
- Run the following command to view the current value of the metric.
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/arms-demo/arms_app_requests_per_second_queryuser" | jq .
  Expected output:
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "arms_app_requests_per_second_queryuser",
      "metricLabels": {
        "rpc": "/demo/queryUser/10"
      },
      "timestamp": "2022-11-09T07:53:16Z",
      "value": "4216"
    }
  ]
}
- Run the following command to view the HPA details.
kubectl get hpa -n arms-demo
  Expected output:
NAME       REFERENCE                         TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
test-hpa   Deployment/arms-springboot-demo   300m/40 (avg)   1         10        10         148m
  The output indicates that the TARGETS column contains data, which means that the HPA is configured successfully.
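The TARGETS column uses Kubernetes quantity notation, in which the suffix m means one thousandth, so 300m/40 reads as a current per-Pod average of 0.3 against a target of 40. The following small sketch converts such values to decimals. It handles only plain numbers and the m suffix, which is an assumption that covers the values shown in this topic, and the function name quantity_to_decimal is hypothetical.

```shell
#!/bin/sh
# Convert a Kubernetes quantity such as "300m" or "40" to a decimal number.
# Only plain numbers and the milli suffix "m" are handled in this sketch.
quantity_to_decimal() {
  case "$1" in
    *m) awk -v v="${1%m}" 'BEGIN { print v / 1000 }' ;;
    *)  awk -v v="$1" 'BEGIN { print v + 0 }' ;;
  esac
}

quantity_to_decimal 300m   # prints 0.3
quantity_to_decimal 40     # prints 40
```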
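Verify scaling with a load test
- Run the following command to run a load test against the demo application.
ab -c 50 -n 2000 http://47.94.XX.XX:8080/demo/queryUser/10
  Note
  47.94.XX.XX:8080 is the external endpoint of the arms-demo-svc Service.
- View the scaling result.
  - In the ARMS APM console, the request volume of this API surges because of the load test.
  - On the Prometheus dashboard, the HPA scales the application after the QPS of the API exceeds the threshold.
  - In the ACK cluster, the number of Pod replicas of the demo application scales with the QPS of API calls.
    You can run the command kubectl describe hpa test-hpa -n arms-demo to view the scaling events that occurred.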
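To focus on the scaling decisions in the describe output, you can filter it for rescale events. The following sketch assumes that the relevant events carry the reason SuccessfulRescale, which the HPA controller records when it changes the replica count. The helper name rescale_events is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: keep only the rescale events from
# `kubectl describe hpa` output read on stdin.
rescale_events() {
  grep 'SuccessfulRescale' || true   # exit cleanly when no event is found
}

# Usage (requires access to the cluster):
#   kubectl describe hpa test-hpa -n arms-demo | rescale_events
# To follow replica changes live during the load test:
#   kubectl get hpa test-hpa -n arms-demo --watch
```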