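By default, kube-scheduler scores nodes with the LeastAllocated strategy (called LeastRequested in older releases): the more free resources a node has, the higher it ranks. We need to switch to MostAllocated, where the more resources a node already has in use, the higher its score and the higher its priority.
We want to tune this policy for our production environment. The concrete requirement is that Pods using GPUs should be scheduled first onto nodes whose GPUs are partly used but not yet exhausted.
With the requirement covered, let's do the actual configuration. It is fairly simple and takes only a few steps.
1. Create the scheduler configuration file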
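tee /etc/kubernetes/scheduler-config.yaml <<EOF
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /etc/kubernetes/scheduler.conf
profiles:
- schedulerName: default-scheduler
  plugins:
    score:
      enabled:
      - name: NodeResourcesFit
        weight: 100
      disabled:
      - name: NodeResourcesBalancedAllocation
  pluginConfig:
  - name: NodeResourcesFit
    args:
      scoringStrategy:
        type: MostAllocated
        resources:
        - name: nvidia.com/gpu
          weight: 100
EOF
Since we only care about GPU resources, all of the weight goes to the GPU. The resource name must match the GPU resource name used in your own environment.
In the v1 API the balanced-allocation score plugin is named NodeResourcesBalancedAllocation, and there is no separate LeastRequestedPriority plugin to disable (that name belongs to the legacy Policy API). Least-allocated versus most-allocated behavior is selected by scoringStrategy.type of NodeResourcesFit.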
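To make the effect of MostAllocated concrete, here is a minimal sketch of how a single-resource score could be computed. It is a simplification based on my understanding of the strategy (requested amount, including the incoming Pod, divided by allocatable, scaled to 0-100), not the scheduler's actual code. The function name most_allocated_score and the two example nodes are hypothetical.

```shell
#!/bin/sh
# Simplified sketch of MostAllocated scoring for one resource.
# Usage: most_allocated_score REQUESTED ALLOCATABLE
#   REQUESTED   GPUs already requested on the node plus the incoming Pod's request
#   ALLOCATABLE total GPUs the node can allocate
most_allocated_score() {
  requested=$1
  allocatable=$2
  # A node without the resource scores 0.
  if [ "$allocatable" -eq 0 ]; then
    echo 0
    return
  fi
  # Clamp so the score never exceeds the maximum of 100.
  if [ "$requested" -gt "$allocatable" ]; then
    requested=$allocatable
  fi
  # Integer division, scaled to the 0-100 range.
  echo $(( requested * 100 / allocatable ))
}

# Hypothetical nodes with 8 GPUs each; the incoming Pod requests 1 GPU.
echo "node-a (6 in use): $(most_allocated_score 7 8)"   # 87
echo "node-b (0 in use): $(most_allocated_score 1 8)"   # 12
```

Under these assumptions the partly used node-a scores far higher than the idle node-b, which is the bin-packing behavior we are after.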
2. Modify the scheduler manifest
cat /etc/kubernetes/manifests/kube-scheduler.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --config=/etc/kubernetes/scheduler-config.yaml # add this line
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    image: registry.k8s.io/kube-scheduler:v1.33.0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /livez
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: kube-scheduler
    readinessProbe:
      failureThreshold: 3
      httpGet:
        host: 127.0.0.1
        path: /readyz
        port: 10259
        scheme: HTTPS
      periodSeconds: 1
      timeoutSeconds: 15
    resources:
      requests:
        cpu: 100m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /livez
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler.conf
      name: kubeconfig
      readOnly: true
    # mount the scheduler configuration inside the Pod
    - mountPath: /etc/kubernetes/scheduler-config.yaml
      name: scheduler-config
      readOnly: true
  hostNetwork: true
  priority: 2000001000
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler.conf
      type: FileOrCreate
    name: kubeconfig
  # host file that backs the scheduler-config mount
  - hostPath:
      path: /etc/kubernetes/scheduler-config.yaml
      type: File
    name: scheduler-config
status: {}
Once the manifest is saved, the kubelet restarts the scheduler Pod automatically, and we can then test Pod scheduling.
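One possible way to run such a test is sketched below. The Pod name gpu-test, the busybox image, and the manifest path are placeholders chosen for illustration, not part of the original setup. The script only talks to the cluster when kubectl is available.

```shell
#!/bin/sh
# Write a hypothetical test Pod that requests one GPU.
manifest="${TMPDIR:-/tmp}/gpu-test-pod.yaml"

cat > "$manifest" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test            # hypothetical name
spec:
  restartPolicy: Never
  containers:
  - name: main
    image: busybox          # placeholder image, enough for a scheduling test
    command: ["sleep", "3600"]
    resources:
      limits:
        nvidia.com/gpu: 1   # must match the resource name in scheduler-config.yaml
EOF

# Only contact the cluster when kubectl is installed.
if command -v kubectl >/dev/null 2>&1; then
  # The static Pod carries the label component=kube-scheduler.
  kubectl -n kube-system get pod -l component=kube-scheduler
  kubectl apply -f "$manifest"
  # The NODE column shows where the scheduler placed the Pod.
  kubectl get pod gpu-test -o wide
fi
```

If the new strategy is active, the test Pod should land on a node that already has GPUs in use, as long as that node still has a free GPU.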