
Tracking down the kubectl top node error: metrics not available yet

政采云运维团队 (Zhengcaiyun Ops Team) 2021-10-16

Running the command that shows node resource usage fails with an error:

kubectl top node
error: metrics not available yet

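Before tackling the problem, let's review some background on Metrics Server.

  • Concept

Metrics Server is the aggregator of core monitoring data in a Kubernetes cluster. It collects resource metrics from the Kubelet and serves them through the Metrics API in the Kubernetes APIServer, where the HPA (Horizontal Pod Autoscaler) consumes them to scale resources automatically. The same Metrics API also backs kubectl top, which shows the resource usage of nodes and Pods.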

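  • Design

Metrics Server is an important part of the Kubernetes monitoring stack and consists of two main parts: the API and the Server. The Metrics API exposes Pod resource usage through the APIServer to consumers such as HPA, kubectl top, and the Kubernetes dashboard. Metrics Server follows the Kubernetes monitoring architecture: it periodically fetches metrics from the Kubelet on each cluster node through the Summary API, then aggregates them and stores them in memory. Only the latest state of each metric is kept, and the data is lost whenever the component restarts. With the data collected and the API in place, kube-aggregator forwards the matching API Server requests to Metrics Server, so that everything is exposed uniformly through the Metrics API.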
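The "latest state only, lost on restart" behavior can be pictured with a toy model. The following is a minimal Python sketch and not Metrics Server's actual implementation; the names `Sample` and `LatestMetricsStore` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Sample:
    """One scrape result for a node (hypothetical, simplified shape)."""
    timestamp: float      # seconds since epoch when the sample was taken
    cpu_millicores: int
    memory_mib: int


class LatestMetricsStore:
    """Toy in-memory store that keeps only the newest sample per node."""

    def __init__(self) -> None:
        # Plain dict: nothing is persisted, so a new instance starts out empty.
        self._latest: Dict[str, Sample] = {}

    def record(self, node: str, sample: Sample) -> None:
        # Overwrite instead of append: no history is retained.
        current = self._latest.get(node)
        if current is None or sample.timestamp >= current.timestamp:
            self._latest[node] = sample

    def get(self, node: str) -> Optional[Sample]:
        return self._latest.get(node)


store = LatestMetricsStore()
store.record("node-a", Sample(timestamp=100.0, cpu_millicores=250, memory_mib=1024))
store.record("node-a", Sample(timestamp=160.0, cpu_millicores=300, memory_mib=1100))
print(store.get("node-a"))  # only the newer sample is still available
```

A freshly created `LatestMetricsStore` holds nothing, which mirrors why a metrics consumer can see an empty result right after the component restarts.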

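Request flow diagram for kubectl top

Flow description

1. When kubectl top node runs, it sends a request to the kube-apiserver address /apis/metrics.k8s.io/. kube-apiserver authenticates the user who sent the request and authorizes access to the requested API path.
2. kube-apiserver forwards requests that arrive at /apis/metrics.k8s.io to the metrics API, which is an extension API. This relies on kube-aggregator, a powerful extension of the apiserver that lets Kubernetes developers write their own service and register it in the Kubernetes API as an extension API.
3. The metrics API handles the request and forwards it to the configured service: metrics-server.
4. The service forwards the request to its associated endpoint, and it finally reaches the pod.
5. The pod ultimately fetches the data from the kubelet Summary API.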

Troubleshooting step 1

Verbose log of kubectl top

  • Run kubectl --v=8 top node (with the --v=8 log level)

# kubectl --v=8 top node
I1012 10:11:38.322658   39273 loader.go:359] Config loaded from file: /root/.kube/config
I1012 10:11:38.323804   39273 round_trippers.go:416] GET https://192.168.1.3:6443/api?timeout=32s
I1012 10:11:38.323815   39273 round_trippers.go:423] Request Headers:
I1012 10:11:38.323821   39273 round_trippers.go:426]     Accept: application/json, */*
I1012 10:11:38.323827   39273 round_trippers.go:426]     User-Agent: kubectl/v1.15.2 (linux/amd64) kubernetes/f627830
I1012 10:11:38.333052   39273 round_trippers.go:441] Response Status: 200 OK in 9 milliseconds
I1012 10:11:38.333063   39273 round_trippers.go:444] Response Headers:
I1012 10:11:38.333068   39273 round_trippers.go:447]     Content-Type: application/json
I1012 10:11:38.333091   39273 round_trippers.go:447]     Content-Length: 137
I1012 10:11:38.333095   39273 round_trippers.go:447]     Date: Tue, 12 Oct 2021 02:11:38 GMT
I1012 10:11:38.333125   39273 request.go:947] Response Body: {"kind":"APIVersions","versions":["v1"],"serverAddressByClientCIDRs":[{"clientCIDR":"0.0.0.0/0","serverAddress":"172.18.213.151:6443"}]}
I1012 10:11:38.333291   39273 round_trippers.go:416] GET https://192.168.1.3:6443/apis?timeout=32s
I1012 10:11:38.333299   39273 round_trippers.go:423] Request Headers:
I1012 10:11:38.333304   39273 round_trippers.go:426]     Accept: application/json, */*
I1012 10:11:38.333310   39273 round_trippers.go:426]     User-Agent: kubectl/v1.15.2 (linux/amd64) kubernetes/f627830
I1012 10:11:38.334485   39273 round_trippers.go:441] Response Status: 200 OK in 1 milliseconds
I1012 10:11:38.334499   39273 round_trippers.go:444] Response Headers:
I1012 10:11:38.334504   39273 round_trippers.go:447]     Content-Type: application/json
I1012 10:11:38.334509   39273 round_trippers.go:447]     Date: Tue, 12 Oct 2021 02:11:38 GMT
I1012 10:11:38.334623   39273 request.go:947] Response Body: {"kind":"APIGroupList","apiVersion":"v1","groups":[{"name":"apiregistration.k8s.io","versions":[{"groupVersion":"apiregistration.k8s.io/v1","version":"v1"},{"groupVersion":"apiregistration.k8s.io/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"apiregistration.k8s.io/v1","version":"v1"}},{"name":"extensions","versions":[{"groupVersion":"extensions/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"extensions/v1beta1","version":"v1beta1"}},{"name":"apps","versions":[{"groupVersion":"apps/v1","version":"v1"},{"groupVersion":"apps/v1beta2","version":"v1beta2"},{"groupVersion":"apps/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"apps/v1","version":"v1"}},{"name":"events.k8s.io","versions":[{"groupVersion":"events.k8s.io/v1beta1","version":"v1beta1"}],"preferredVersion":{"groupVersion":"events.k8s.io/v1beta1","version":"v1beta1"}},{"name":"authentication.k8s.io","versions":[{"groupVersion":"authentication.k8s.io/v1","version":"v1"},{"groupVersion":"authenticati [truncated 4396 chars]
I1012 10:11:38.334925   39273 round_trippers.go:416] GET https://192.168.1.3:6443/apis/metrics.k8s.io/v1beta1/nodes
I1012 10:11:38.334933   39273 round_trippers.go:423] Request Headers:
I1012 10:11:38.334939   39273 round_trippers.go:426]     User-Agent: kubectl/v1.15.2 (linux/amd64) kubernetes/f627830
I1012 10:11:38.334944   39273 round_trippers.go:426]     Accept: application/json, */*
I1012 10:11:38.353651   39273 round_trippers.go:441] Response Status: 200 OK in 18 milliseconds
I1012 10:11:38.353659   39273 round_trippers.go:444] Response Headers:
I1012 10:11:38.353666   39273 round_trippers.go:447]     Content-Type: application/json
I1012 10:11:38.353671   39273 round_trippers.go:447]     Date: Tue, 12 Oct 2021 02:11:38 GMT
I1012 10:11:38.353675   39273 round_trippers.go:447]     Content-Length: 137
I1012 10:11:38.353688   39273 request.go:947] Response Body: {"kind":"NodeMetricsList","apiVersion":"metrics.k8s.io/v1beta1","metadata":{"selfLink":"/apis/metrics.k8s.io/v1beta1/nodes"},"items":[]}
F1012 10:11:38.354504   39273 helpers.go:114] error: metrics not available yet

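The log shows that the request reached the apiserver and the response status code is 200, but the returned NodeMetricsList has an empty items array. There is no other useful information here, so we keep investigating.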
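In the log above, the final request returns 200 OK, yet the body is a NodeMetricsList whose items array is empty, which is consistent with the "metrics not available yet" message. A minimal Python sketch of that check follows; the function name `has_node_metrics` is hypothetical, and the body is copied from the log.

```python
import json

# Response body copied from the verbose log above.
RESPONSE_BODY = (
    '{"kind":"NodeMetricsList","apiVersion":"metrics.k8s.io/v1beta1",'
    '"metadata":{"selfLink":"/apis/metrics.k8s.io/v1beta1/nodes"},"items":[]}'
)


def has_node_metrics(body: str) -> bool:
    """Return True if a NodeMetricsList response contains at least one item."""
    doc = json.loads(body)
    if doc.get("kind") != "NodeMetricsList":
        raise ValueError(f"unexpected kind: {doc.get('kind')!r}")
    return len(doc.get("items") or []) > 0


if not has_node_metrics(RESPONSE_BODY):
    print("HTTP 200, but the metrics list is empty")
```

A successful status code alone therefore says little here; the payload has to be inspected as well.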

Troubleshooting step 2

Check whether kube-aggregator is enabled on kube-apiserver

The metrics API exposed by metrics-server relies on kube-aggregator to forward apiserver requests to metrics-server. The relevant apiserver flags are:

--proxy-client-cert-file=/etc/kubernetes/certs/proxy.crt
--proxy-client-key-file=/etc/kubernetes/certs/proxy.key
--requestheader-client-ca-file=/etc/kubernetes/certs/proxy-ca.crt
--requestheader-allowed-names=aggregator
--requestheader-extra-headers-prefix=X-Remote-Extra-
--requestheader-group-headers=X-Remote-Group
--requestheader-username-headers=X-Remote-User

If kube-proxy is not running on the master, you also need to set:

--enable-aggregator-routing=true

Comparing these against the kube-apiserver startup flags shows that the aggregation layer is configured correctly.
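The comparison can also be scripted. The sketch below is a minimal Python example that checks an argument list for the flags listed above. The function name `missing_aggregator_flags` is hypothetical, and how you obtain the real argument list (process list, static pod manifest, systemd unit) depends on how the cluster was installed.

```python
from typing import Iterable, List

# Flag names taken from the list above; their values differ per cluster.
AGGREGATOR_FLAGS = (
    "--proxy-client-cert-file",
    "--proxy-client-key-file",
    "--requestheader-client-ca-file",
    "--requestheader-allowed-names",
    "--requestheader-extra-headers-prefix",
    "--requestheader-group-headers",
    "--requestheader-username-headers",
)


def missing_aggregator_flags(args: Iterable[str]) -> List[str]:
    """Return the listed flags that do not appear in the given argument list."""
    # Compare on the flag name only, ignoring anything after "=".
    present = {arg.split("=", 1)[0] for arg in args if arg.startswith("--")}
    return [flag for flag in AGGREGATOR_FLAGS if flag not in present]


# Example argument list using the values shown in this article.
example_args = [
    "kube-apiserver",
    "--proxy-client-cert-file=/etc/kubernetes/certs/proxy.crt",
    "--proxy-client-key-file=/etc/kubernetes/certs/proxy.key",
    "--requestheader-client-ca-file=/etc/kubernetes/certs/proxy-ca.crt",
    "--requestheader-allowed-names=aggregator",
    "--requestheader-extra-headers-prefix=X-Remote-Extra-",
    "--requestheader-group-headers=X-Remote-Group",
    "--requestheader-username-headers=X-Remote-User",
]

print(missing_aggregator_flags(example_args))
```

An empty list means every flag from the list is present; it does not validate the certificate files themselves.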

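Troubleshooting step 3

Check whether the configuration that the Kubernetes apiserver uses to route metrics API requests to the extension apiserver is set correctly.

Inspect the APIService v1beta1.metrics.k8s.io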

# kubectl describe apiservices v1beta1.metrics.k8s.io
Name:         v1beta1.metrics.k8s.io
Namespace:    
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"apiregistration.k8s.io/v1","kind":"APIService","metadata":{"annotations":{},"name":"v1beta1.metrics.k8s.io"},"spec":{"group...
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2019-11-26T07:32:03Z
  Resource Version:    202235905
  Self Link:           /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io
  UID:                 d5e26540-8f0b-4823-83d1-f92c2fb46e23
Spec:
  Group:                     metrics.k8s.io
  Group Priority Minimum:    100
  Insecure Skip TLS Verify:  true
  Service:
    Name:            prometheus-adapter
    Namespace:       monitoring
    Port:            443
  Version:           v1beta1
  Version Priority:  100
Status:
  Conditions:
    Last Transition Time:  2021-10-09T07:01:36Z
    Message:               all checks passed
    Reason:                Passed
    Status:                True
    Type:                  Available
Events:                    <none>

Here we find that the v1beta1.metrics.k8s.io API is bound to a service named prometheus-adapter rather than metrics-server. That is the root cause of the problem.
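The same check can be expressed in code. The following minimal Python sketch compares the Service behind an APIService object with the expected one. The function name `apiservice_backend` is hypothetical, and the `observed` dictionary is a trimmed object that mirrors the describe output above rather than a complete API response.

```python
from typing import Tuple

# Expected backend (namespace, name) for the metrics API in this article.
EXPECTED_BACKEND = ("kube-system", "metrics-server")


def apiservice_backend(apiservice: dict) -> Tuple[str, str]:
    """Return (namespace, name) of the Service behind an APIService object."""
    service = apiservice.get("spec", {}).get("service") or {}
    return service.get("namespace", ""), service.get("name", "")


# Trimmed object mirroring the describe output above.
observed = {
    "apiVersion": "apiregistration.k8s.io/v1",
    "kind": "APIService",
    "metadata": {"name": "v1beta1.metrics.k8s.io"},
    "spec": {
        "group": "metrics.k8s.io",
        "version": "v1beta1",
        "service": {"name": "prometheus-adapter", "namespace": "monitoring", "port": 443},
    },
}

backend = apiservice_backend(observed)
if backend != EXPECTED_BACKEND:
    print(f"metrics.k8s.io is served by {backend[0]}/{backend[1]}, not metrics-server")
```

Note that the APIService reported "all checks passed", so its Available condition alone would not have revealed the mismatch.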

Modify the v1beta1.metrics.k8s.io APIService configuration:

Service:
  Name:           prometheus-adapter
  Namespace:       monitoring
  Port:           443
   
Change it to:

Service:
  Name:           metrics-server
  Namespace:       kube-system
  Port:           443
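
The article does not show how the change was applied. One possible approach, given here only as a hedged sketch, is a JSON merge patch on the APIService. The Python helper `build_service_patch` below is hypothetical and only builds the patch body; it assumes a metrics-server Service already exists in kube-system on port 443.

```python
import json


def build_service_patch(name: str, namespace: str, port: int = 443) -> str:
    """Build a JSON merge patch that repoints an APIService at another Service."""
    patch = {"spec": {"service": {"name": name, "namespace": namespace, "port": port}}}
    return json.dumps(patch, separators=(",", ":"))


patch_body = build_service_patch("metrics-server", "kube-system")
print(patch_body)

# One way to apply it (an assumption, not the command used in the article):
#   kubectl patch apiservice v1beta1.metrics.k8s.io --type merge -p '<patch_body>'
```

If the APIService is managed by a manifest or deployment tool, the source manifest would need the same change, otherwise the next apply may revert it.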

Run kubectl top again. The result is as follows:

# kubectl top node
NAME             CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
192.168.213.133   2208m        13%    56386Mi         88%       
192.168.213.134   989m         6%     38148Mi         59%       
192.168.213.137   837m         5%     39297Mi         61%       
192.168.213.140   5867m        37%    46577Mi         73%       
192.168.213.151   265m         3%     7596Mi          24%      


