Troubleshooting a Pod Stuck in Pending

2025-11-25 19:31:00
丁国栋
Summary: This post documents a case where Pods could not be scheduled onto any node.

During today's routine inspection I found two newly created Pods that could not be assigned to a node. The scheduler message was: 0/12 nodes are available: 5 Insufficient cpu, 7 node(s) didn't match Pod's node affinity.

Yet the nodes matching the Pod's node selector label appeared to have plenty of spare resources.

The Pod's resource definition is:

    resources:
      limits:
        cpu: "1"
        memory: 512Mi
      requests:
        cpu: 500m
        memory: 256Mi

kubectl top node -l name=value shows:
NAME        CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
10.8.0.15   413m         2%     6522Mi          23%       
10.8.0.4    447m         2%     7661Mi          27%       
10.8.0.41   625m         4%     8365Mi          30%       
10.8.0.49   548m         3%     8070Mi          29%       
10.8.0.9    811m         5%     7435Mi          26%
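
Checking each node's resource allocation with kubectl describe node -l name=value |grep 'Allocated resources' -A 12 showed that CPU Requests had already reached 99% on every one of them. The scheduler places Pods according to the sum of requests already reserved on a node, not the live usage reported by kubectl top, so a node can be nearly idle and still be rejected with Insufficient cpu.
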
kubectl describe node -l name=value |grep 'Allocated resources' -A 12
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                      Requests           Limits
  --------                      --------           ------
  cpu                           15552m (99%)       35220m (225%)
  memory                        14625692160 (50%)  27878641152 (96%)
  ephemeral-storage             0 (0%)             0 (0%)
  hugepages-1Gi                 0 (0%)             0 (0%)
  hugepages-2Mi                 0 (0%)             0 (0%)
  tke.cloud.tencent.com/eip     0                  0
  tke.cloud.tencent.com/eni-ip  1                  1
Events:                         <none>
--
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                      Requests           Limits
  --------                      --------           ------
  cpu                           15502m (99%)       38220m (245%)
  memory                        17310046720 (59%)  33784221184 (116%)
  ephemeral-storage             0 (0%)             0 (0%)
  hugepages-1Gi                 0 (0%)             0 (0%)
  hugepages-2Mi                 0 (0%)             0 (0%)
  tke.cloud.tencent.com/eip     0                  0
  tke.cloud.tencent.com/eni-ip  1                  1
Events:                         <none>
--
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                      Requests           Limits
  --------                      --------           ------
  cpu                           15502m (99%)       32320m (207%)
  memory                        15967869440 (55%)  28683947520 (99%)
  ephemeral-storage             0 (0%)             0 (0%)
  hugepages-1Gi                 0 (0%)             0 (0%)
  hugepages-2Mi                 0 (0%)             0 (0%)
  tke.cloud.tencent.com/eip     0                  0
  tke.cloud.tencent.com/eni-ip  1                  1
Events:                         <none>
--
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                      Requests           Limits
  --------                      --------           ------
  cpu                           15552m (99%)       39420m (252%)
  memory                        17146468864 (59%)  34857963008 (120%)
  ephemeral-storage             0 (0%)             0 (0%)
  hugepages-1Gi                 0 (0%)             0 (0%)
  hugepages-2Mi                 0 (0%)             0 (0%)
  tke.cloud.tencent.com/eip     0                  0
  tke.cloud.tencent.com/eni-ip  1                  1
Events:                         <none>
--
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                      Requests           Limits
  --------                      --------           ------
  cpu                           15552m (99%)       38620m (247%)
  memory                        14059461120 (48%)  30026124800 (103%)
  ephemeral-storage             0 (0%)             0 (0%)
  hugepages-1Gi                 0 (0%)             0 (0%)
  hugepages-2Mi                 0 (0%)             0 (0%)
  tke.cloud.tencent.com/eip     0                  0
  tke.cloud.tencent.com/eni-ip  0                  0
Events:                         <none>
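
To make the numbers above concrete, here is a minimal sketch that reads the Allocated resources lines and estimates an upper bound on the CPU still unreserved on each node. It assumes the percentage printed by kubectl is rounded down to a whole number, so 99% means allocatable is at most requests * 100 / 99. The function name cpu_headroom_report is made up for this illustration.

```shell
# Reads `kubectl describe node` output on stdin and prints, for every
# "cpu" line expressed in millicores, an upper bound on unreserved CPU.
# Assumption: the printed percentage is truncated, so
#   allocatable <= requests * 100 / pct
cpu_headroom_report() {
  awk '$1 == "cpu" && $2 ~ /m$/ {
         req = $2; sub(/m$/, "", req)        # "15552m" -> "15552"
         pct = $3; gsub(/[()%]/, "", pct)    # "(99%)"  -> "99"
         if (pct + 0 > 0)
           printf "requests=%sm pct=%s%% max_free=%dm\n",
                  req, pct, int(req * 100 / pct) - req
       }'
}

# Usage against a live cluster (needs kubectl and cluster access):
#   kubectl describe node -l name=value | cpu_headroom_report
```

For the nodes above this gives at most about 157m of unreserved CPU per node, well below the 500m the new Pod requests, which is consistent with the Insufficient cpu message.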

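My first impression was that the Requests of the existing Pods had been set too high. Changing the Requests of Pods that are already running was not realistic, and adding nodes would have meant buying more resources, which was not a good option either. As a temporary fix, I applied the same label to other nodes that still had spare capacity, so the Pods could be scheduled there.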
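
The labeling step can be sketched roughly as follows. The helper prints the kubectl label commands instead of running them, so they can be reviewed first. The node names node-a and node-b are hypothetical placeholders, and name=value stands in for the real label as it does in the commands above.

```shell
# Print (do not run) the kubectl command that adds a label to each node.
# $1 is the label in key=value form; the remaining arguments are node names.
print_label_commands() {
  local label=$1
  shift
  local node
  for node in "$@"; do
    printf 'kubectl label node %s %s\n' "$node" "$label"
  done
}

# node-a and node-b are hypothetical node names.
print_label_commands name=value node-a node-b
```

After running the printed commands, kubectl get nodes -l name=value should list the newly labeled nodes, and the Pending Pods can then be scheduled onto them if they have enough unreserved CPU.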
