Install the Latest K8S on CentOS 7

Get K8S deployed in one article

Image credit: kubernetes.io

K8s installation and deployment guide

System environment

The operating system used in this article is a minimal CentOS 7.4+ installation running as a KVM virtual machine, with 4 CPUs, 20 GB of RAM, and a 100 GB disk.

Below are the system configurations that need to be done before installing K8S.

yum install net-tools screen wget 

Disable SELinux

sudo setenforce 0
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

Disable the firewall

For simplicity, just turn the firewall off for now. Once the installation is complete, re-enable the firewall and open the required ports (a firewall-cmd sketch for those ports follows below).

systemctl stop firewalld
systemctl disable firewalld
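
If you later re-enable firewalld, the ports that the Kubernetes documentation lists for a control-plane node can be opened roughly like this (a sketch; worker nodes instead need 10250/tcp and the NodePort range 30000-32767/tcp):

sudo firewall-cmd --permanent --add-port=6443/tcp         # Kubernetes API server
sudo firewall-cmd --permanent --add-port=2379-2380/tcp    # etcd server client API
sudo firewall-cmd --permanent --add-port=10250-10252/tcp  # kubelet, scheduler, controller-manager
sudo firewall-cmd --reload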

Disable swap

swapoff -a
yes | cp /etc/fstab /etc/fstab_bak
cat /etc/fstab_bak |grep -v swap > /etc/fstab
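
An alternative to rewriting /etc/fstab is to comment out the swap entry in place and then confirm that no swap is active (a sketch; check the file afterwards to make sure the pattern matched what you expected):

sudo sed -i '/ swap / s/^/#/' /etc/fstab   # comment out the swap entry
free -h                                    # after swapoff -a the Swap line should show 0B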

Let iptables see bridged traffic

sudo modprobe br_netfilter
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sudo sysctl --system
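
Note that modprobe does not survive a reboot. To have br_netfilter loaded automatically at boot, you can also drop it into /etc/modules-load.d, which systemd reads at startup (a small sketch):

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF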

Set the hostname and add hostname-to-IP mappings

Change the hostname with the following command:

$ vi /etc/hostname

Add the following entries to the /etc/hosts file:

192.168.75.27 node27
192.168.75.26 node26
192.168.75.25 node25
192.168.75.24 node24
192.168.75.23 node23
192.168.75.22 node22
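
For example, on the node with IP 192.168.75.24 you could set the hostname non-interactively and append the mappings in one go (a sketch using this article's example names; hostnamectl is standard on CentOS 7):

sudo hostnamectl set-hostname node24
cat <<EOF | sudo tee -a /etc/hosts
192.168.75.27 node27
192.168.75.26 node26
192.168.75.25 node25
192.168.75.24 node24
192.168.75.23 node23
192.168.75.22 node22
EOF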

Container runtimes

What is a runtime?

The word runtime literally means "the time when the program runs"; once you understand what a runtime is, it becomes easy to understand what container runtimes are.

According to the classification on Wikipedia, a computer program generally goes through the following phases between being written and being run:

  1. Edit time: when the code is being written
  2. Compile time: when a compiler turns the code into executables, libraries, and so on
  3. Link time, distribution time, installation time, and load time
  4. Run time: when the program is actually running

In this sense, a runtime library is a library that a program depends on while it is running.

But the runtime in "container runtimes" is not that concept; it is shorthand for the runtime environment (or runtime system). So, as I see it, the full reading of container runtime is: the container environment that programs depend on at run time on the K8S platform. Once this basic idea is clear, the rest of the material is easy to follow.

container runtime

K8S's container runtime support was not this convoluted at first, but as K8S, Docker, rkt, and the organizations behind them jockeyed for position, it headed further and further down a path of remarkable complexity. That story deserves a long article of its own.

This article is aimed at people who are new to K8S, so to avoid falling into that pit right at the start, it uses the most common option, Docker, as the example. In one sentence: choose Docker as the container runtime, and change Docker's cgroup driver to systemd.

Install Docker on CentOS/RHEL 7.4+

# (Install Docker CE)
## Set up the repository
### Install required packages
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
## Add the Docker repository
sudo yum-config-manager --add-repo \
  https://download.docker.com/linux/centos/docker-ce.repo
# Install Docker CE
sudo yum update -y && sudo yum install -y \
  containerd.io-1.2.13 \
  docker-ce-19.03.11 \
  docker-ce-cli-19.03.11
## Create /etc/docker
sudo mkdir /etc/docker

Change Docker's cgroup driver to systemd

# Set up the Docker daemon
cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ]
}
EOF
sudo mkdir -p /etc/systemd/system/docker.service.d
# Restart Docker
sudo systemctl daemon-reload
sudo systemctl restart docker

Set the Docker service to start automatically at boot

sudo systemctl enable docker
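
To confirm that the cgroup driver change took effect, check docker info (a quick sanity check; the exact wording of the output can vary slightly between Docker versions):

docker info 2>/dev/null | grep -i "cgroup driver"
# Expected: Cgroup Driver: systemd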

Deploy K8S with kubeadm

Install kubeadm

Every node needs kubelet and kubeadm. The master node must have kubectl; on worker nodes kubectl is optional.

Use the Aliyun mirror, the most convenient option inside mainland China, as the yum repository for K8S:

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

Install kubelet, kubeadm, and kubectl

sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
sudo systemctl enable --now kubelet

The versions installed here are:

$ yum list installed | grep kube

cri-tools.x86_64                   1.13.0-0                          @kubernetes
kubeadm.x86_64                     1.19.3-0                          @kubernetes
kubectl.x86_64                     1.19.3-0                          @kubernetes
kubelet.x86_64                     1.19.3-0                          @kubernetes
kubernetes-cni.x86_64              0.8.7-0                           @kubernetes

Create the cluster master node with kubeadm

Pick one node as the master and perform the following steps on it.

Initialization parameters

Export the default kubeadm configuration file so it can be edited:

cd ~
kubeadm config print init-defaults > kubeadm-init.yaml

You may see the warning below; it can be ignored. It will be removed in version 1.20 and carries no useful information.

W1012 15:27:12.495492   28528 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]

Take a look at the contents of the default configuration file. It defaults kubernetesVersion to v1.19.0, and it is still missing quite a few parameters that we need.

apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 1.2.3.4
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: node24
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.19.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
scheduler: {}

One approach is to add all of the parameters to this configuration file and then use the modified file to initialize the kubernetes master node (a sed sketch for these edits follows the list below).

  • Change advertiseAddress in the file to this machine's IP address, or to 0.0.0.0 to mean the default NIC; when the host has several NICs it is better to write the IP explicitly and avoid 0.0.0.0
  • Change imageRepository to registry.cn-hangzhou.aliyuncs.com/google_containers, which downloads faster inside mainland China
  • Add podSubnet: 172.16.0.0/16
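
If you prefer to make these three edits from the shell rather than in an editor, a sed sketch like the following should work against the default file shown above (double-check the resulting kubeadm-init.yaml before using it):

sed -i 's#advertiseAddress: 1.2.3.4#advertiseAddress: 192.168.75.24#' kubeadm-init.yaml
sed -i 's#imageRepository: k8s.gcr.io#imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers#' kubeadm-init.yaml
sed -i 's#serviceSubnet: 10.96.0.0/12#&\n  podSubnet: 172.16.0.0/16#' kubeadm-init.yaml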

The modified configuration file looks like this:

apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens: 
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s   
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.75.24
  bindPort: 6443 
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: node24
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS  
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.19.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 172.16.0.0/16
scheduler: {}

This approach, however, is only convenient when a handful of simple parameters need to change. The official documentation only describes the command-line flags, and the flag names do not match the field names in the configuration file; for example, --pod-network-cidr corresponds to podSubnet. The documentation also provides no mapping between the flags and the configuration fields; see #1825, #1899, and stackoverflow. For that reason I recommend initializing with command-line flags instead. If you really prefer configuration files, refer to the kubeadm api reference.

As long as the kubernetes version in the file matches what you need, you can still use the configuration file to pull the images ahead of time:

$ kubeadm config images pull --config kubeadm-init.yaml
W1012 15:30:42.774486   31905 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.19.0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.19.0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.19.0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.19.0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.13-0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.7.0

Initialize Kubernetes

Initialize with command-line flags:

kubeadm init \
 --apiserver-advertise-address 192.168.75.24 \
 --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers \
 --pod-network-cidr 172.16.0.0/16 

Or initialize with the configuration file:

kubeadm init --config kubeadm-init.yaml

The run log looks like this:

W1112 15:27:01.666828    7218 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.19.4
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local node24] and IPs [10.96.0.1 192.168.75.24]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost node24] and IPs [192.168.75.24 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost node24] and IPs [192.168.75.24 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 30.070425 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.19" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node node24 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node node24 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 6gyfn0.n8sdkiqf1vad0u85
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.75.24:6443 --token 6gyfn0.n8sdkiqf1vad0u85 \
    --discovery-token-ca-cert-hash sha256:7dcdd2514207c94926599e5ad61b08e1dc765f980657f0c9047ef80959032b62

At this point the basic installation of K8S is done. To let a regular user use K8S, run the commands from the end of the output above:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

For the root user, besides the method above, the following also works:

export KUBECONFIG=/etc/kubernetes/admin.conf
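
Note that export only applies to the current shell session. To make it permanent for root, append it to the shell profile:

echo 'export KUBECONFIG=/etc/kubernetes/admin.conf' >> ~/.bashrc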

Otherwise you will run into this problem:

[root@node27 ~]# kubectl get cs
The connection to the server localhost:8080 was refused - did you specify the right host or port?
[root@node27 ~]# mkdir -p $HOME/.kube
[root@node27 ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@node27 ~]# chown $(id -u):$(id -g) $HOME/.kube/config
[root@node27 ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                       ERROR
scheduler            Unhealthy   Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused   
controller-manager   Unhealthy   Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused   
etcd-0               Healthy     {"health":"true"}                                       
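
The Unhealthy status of scheduler and controller-manager above is a known quirk of v1.19: kubeadm starts both components with --port=0, which disables the insecure health endpoints that kubectl get cs probes on ports 10251/10252, so the components are in fact running fine. If you want get cs to report Healthy anyway, one common workaround is to remove that flag from the static Pod manifests on the master (a sketch; the kubelet restarts the Pods automatically once the files change):

sudo sed -i '/- --port=0/d' /etc/kubernetes/manifests/kube-scheduler.yaml
sudo sed -i '/- --port=0/d' /etc/kubernetes/manifests/kube-controller-manager.yaml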

At this point the cluster status is still not normal:

[root@node27 ~]# kubectl cluster-info
Kubernetes master is running at https://192.168.75.27:6443
KubeDNS is running at https://192.168.75.27:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
[root@node27 ~]# kubectl get node
NAME     STATUS     ROLES    AGE   VERSION
node27   NotReady   master   23m   v1.19.2

This is because the cluster still needs a Pod network so that Pods can talk to each other. The network only has to be configured on the master node; nodes that join later configure their networking automatically based on the master's information.

Install the Pod network add-on

Install Calico as the K8S network add-on; you can also choose one of the other add-ons.

Note: Currently Calico is the only CNI plugin that the kubeadm project performs e2e tests against. If you find an issue related to a CNI plugin you should log a ticket in its respective issue tracker instead of the kubeadm or kubernetes issue trackers.

Download the Calico manifest. The latest version at the time of writing is 3.16; check the Calico website for the current version number.

sudo yum install -y wget
cd ~
wget https://docs.projectcalico.org/v3.16/manifests/calico.yaml

Change CALICO_IPV4POOL_CIDR in the manifest so that it matches the pod-network-cidr of your K8S cluster, then deploy Calico.
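
In the 3.16 manifest CALICO_IPV4POOL_CIDR is commented out with a default of 192.168.0.0/16, so the edit amounts to uncommenting it and pointing it at 172.16.0.0/16 (a sketch; review calico.yaml by hand if the pattern does not match your version of the manifest):

sed -i -e 's|# - name: CALICO_IPV4POOL_CIDR|- name: CALICO_IPV4POOL_CIDR|' \
       -e 's|#   value: "192.168.0.0/16"|  value: "172.16.0.0/16"|' calico.yaml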

kubectl apply -f calico.yaml
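
It takes a minute or two for the Calico and CoreDNS Pods to come up; you can watch their progress and then re-check the node status:

kubectl get pods -n kube-system -w   # Ctrl-C to stop watching
kubectl get nodes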

At this point the cluster initialization work is essentially complete.

Add worker nodes

Worker nodes must meet the system requirements described above and have Docker, kubeadm, and kubelet installed, with kubelet already started; kubectl is optional.

Following the instructions printed when the master node was initialized, join the worker node to the cluster:

kubeadm join 192.168.75.24:6443 --token 6gyfn0.n8sdkiqf1vad0u85 \
    --discovery-token-ca-cert-hash sha256:7dcdd2514207c94926599e5ad61b08e1dc765f980657f0c9047ef80959032b62

The run log looks like this:

[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[kubelet-check] Initial timeout of 40s passed.

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

If a worker node joins long after the master finished initializing, the token from initialization may have expired and you will see an error like this:

[preflight] Running pre-flight checks
error execution phase preflight: couldn't validate the identity of the API Server: could not find a JWS signature in the cluster-info ConfigMap for token ID "abcdef"
To see the stack trace of this error execute with --v=5 or higher

In that case, generate a new token on the master node:

$ kubeadm token create --print-join-command
W1022 11:37:04.312804   26939 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
kubeadm join 192.168.75.24:6443 --token 6thykj.h31vjswc2h86bey7     --discovery-token-ca-cert-hash sha256:b1bb3693bea2c5e733152d8f7113ee36d1b5ae033f7c7a0911b13d1a359de1a8 

Then use the newly generated token on the worker node to join the cluster.

Leave the cluster

If a node needs to leave the cluster, run the following command on that worker node:

$ kubeadm reset -f

The run log looks like this:

[preflight] Running pre-flight checks
W1112 14:14:29.879137    3323 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
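
kubeadm reset only cleans up the worker itself; the node object will still show up in kubectl get nodes on the master. To remove it completely, drain and delete it from the master as well (a sketch; node23 stands in for whichever node is leaving, and --delete-local-data is the flag name in v1.19):

kubectl drain node23 --ignore-daemonsets --delete-local-data
kubectl delete node node23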

Deployment complete

If all of the steps above went smoothly, the cluster deployment is now finished. On the master node you can check the state of the cluster with the following commands.

$ kubectl cluster-info
Kubernetes master is running at https://192.168.75.24:6443
KubeDNS is running at https://192.168.75.24:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.


$ kubectl get nodes
NAME     STATUS   ROLES    AGE   VERSION
node23   Ready    <none>   19m   v1.19.3
node24   Ready    master   26m   v1.19.3

$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-5c6f6b67db-sg2ph   1/1     Running   0          22m
kube-system   calico-node-db7d7                          1/1     Running   0          8m29s
kube-system   calico-node-vtddc                          1/1     Running   0          22m
kube-system   coredns-6c76c8bb89-l7r5b                   1/1     Running   0          25m
kube-system   coredns-6c76c8bb89-n9mcq                   1/1     Running   0          25m
kube-system   etcd-node24                                1/1     Running   0          26m
kube-system   kube-apiserver-node24                      1/1     Running   0          26m
kube-system   kube-controller-manager-node24             1/1     Running   1          26m
kube-system   kube-proxy-5c4cw                           1/1     Running   0          25m
kube-system   kube-proxy-cqmcc                           1/1     Running   0          18m
kube-system   kube-scheduler-node24                      1/1     Running   1          26m

