kubeadm commands: a detailed beginner's tutorial on building a basic Kubernetes (k8s) cluster on Linux

Ⅰ What does the command yum list kubeadm --showduplicates do?

It lists every kubeadm version available in the configured yum repositories.

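Ⅱ Building a basic Kubernetes (k8s) cluster on Linux by yourself: a detailed tutorial

Official k8s website: https://kubernetes.io/zh/ (see the documentation there for details).

k8s-master: Ubuntu, 192.168.152.100

k8s-node01: Ubuntu, 192.168.152.101

k8s-node02: Ubuntu, 192.168.152.102

Docker is already installed on all three machines. If it is not, install it by following the official documentation: https://docs.docker.com/get-docker/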

1. Disable swap

Kubernetes requires swap to be disabled; if it is left on, initialization fails with an error.

Disable swap on every host.
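A minimal sketch of this step, assuming swap is configured through /etc/fstab. The helper name disable_swap_in_fstab is hypothetical. swapoff -a turns swap off for the running system, and the helper comments out swap entries so that the change survives a reboot.

```shell
# Comment out every active swap entry in an fstab-style file (default: /etc/fstab).
# A backup copy is kept next to the file with a .bak suffix.
disable_swap_in_fstab() {
  fstab="${1:-/etc/fstab}"
  sed -i.bak -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}

# Typical use on each host, as root:
#   swapoff -a              # turn swap off immediately
#   disable_swap_in_fstab   # keep it off after the next reboot
```

The function takes the file path as an argument so that it can be tried on a copy before touching the real /etc/fstab.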

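2. Make sure the time zone and clock are correct

Set the time zone on every host.

3. Turn off the firewall and SELinux

On Ubuntu, ufw status shows the firewall state. On Ubuntu 20.04 both are off by default, so nothing needs to be changed.

4. Set host names and hosts entries (optional)

This is not required, but it makes the machines easier to identify and manage, so it is recommended.

Set the host name on each machine to k8s-master, k8s-node01, and k8s-node02 respectively.

Then add matching entries to the hosts file on every machine.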
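One way to sketch the host name and hosts setup, using the three addresses listed at the top of this tutorial. The helper add_host_entries is a hypothetical name; hostnamectl set-hostname is the standard systemd command for setting a host name.

```shell
# Append "IP hostname" pairs to a hosts file, skipping names that are already present.
add_host_entries() {
  hosts_file="$1"; shift
  while [ "$#" -ge 2 ]; do
    ip="$1"; name="$2"; shift 2
    grep -qwF -- "$name" "$hosts_file" || printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
  done
}

# Typical use, as root on each machine:
#   hostnamectl set-hostname k8s-master     # use the matching name on each node
#   add_host_entries /etc/hosts 192.168.152.100 k8s-master 192.168.152.101 k8s-node01 192.168.152.102 k8s-node02
```

Because existing names are skipped, the helper can be re-run safely after adding more nodes.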

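1. Change Docker's default cgroup driver to systemd

To avoid a series of errors during initialization, check that Docker and the kubelet use the same cgroup driver; otherwise the kubelet cannot start. Depending on the version, Docker may default to cgroupfs while the kubelet defaults to systemd, in which case Docker's driver must be changed.

Check the driver Docker currently uses with docker info.

To change the driver, edit /etc/docker/daemon.json (create it if it does not exist) and add the cgroup driver setting as a startup option.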
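A sketch of the daemon.json change, assuming the file holds no other settings yet. The function name write_docker_systemd_config is hypothetical; the exec-opts key with native.cgroupdriver=systemd is the Docker daemon option that selects the systemd cgroup driver.

```shell
# Write a Docker daemon.json that selects the systemd cgroup driver.
# Note: this overwrites the target file; merge by hand if it already has settings.
write_docker_systemd_config() {
  target="${1:-/etc/docker/daemon.json}"
  cat > "$target" <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}

# Typical use, as root:
#   docker info | grep -i 'cgroup driver'   # check the current driver
#   write_docker_systemd_config
#   systemctl restart docker
```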

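Then restart Docker.

The following packages must be installed on every machine: kubelet, kubeadm, and kubectl.

2. Update the apt package index and install the packages needed to use the Kubernetes apt repository

Install the packages that let apt use a repository over HTTPS; skip any that are already installed.

3. Download the public signing key and add the k8s repository

Outside China: download the Google Cloud public signing key.

Inside China: the Alibaba Cloud mirror can be used instead.

Note that the repository line uses the Ubuntu 16.04 (Xenial) name, because kubernetes-xenial is the newest Kubernetes repository available; that is why it does not say focal for 20.04.

4. Update the apt package index, install kubelet, kubeadm, and kubectl, and pin their versions

Pinning the versions prevents incompatibilities. For example, a 1.7.0 kubelet is fully compatible with a 1.8.0 API server, but not the other way around.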
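The compatibility rule in that example can be sketched as a small check. This is a simplification that only tests "the kubelet is not newer than the API server" and assumes both versions share the same major number; the function name is hypothetical. On apt-based systems the usual way to pin the packages is apt-mark hold kubelet kubeadm kubectl.

```shell
# Succeed (exit 0) when the kubelet's minor version is not newer than the API server's.
# Versions are given as MAJOR.MINOR.PATCH, with or without a leading "v".
kubelet_not_newer_than_apiserver() {
  kubelet_minor=$(printf '%s' "${1#v}" | cut -d. -f2)
  api_minor=$(printf '%s' "${2#v}" | cut -d. -f2)
  [ "$kubelet_minor" -le "$api_minor" ]
}

kubelet_not_newer_than_apiserver 1.7.0 1.8.0 && echo "1.7.0 kubelet with 1.8.0 API server: ok"
kubelet_not_newer_than_apiserver 1.8.0 1.7.0 || echo "1.8.0 kubelet with 1.7.0 API server: not supported"
```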

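The following steps are needed only on the master.

1. Fixing initialization errors (skip this if you saw none)

Error 1:

Cause: the kubelet cannot start. Run journalctl -xe to see the startup error.

Solution: Kubernetes recommends the systemd driver, so change Docker's driver. Edit /etc/docker/daemon.json (create it if it does not exist) and add the cgroup driver setting as a startup option.

Then restart Docker and the kubelet.

Error 2:

Cause: files generated by a previous initialization still exist. Delete them before initializing again.

Error 3:

Solution: reset the configuration.

2. Initialization complete

If there are no errors and the success message appears at the end, initialization is complete; follow the remaining steps it prints.

The commands differ for root and regular users. Most environments do not run as root, and I am a regular user too, so I used the regular-user commands.

A root user runs the root variant of the commands instead.

After initialization, run the kubeadm join ... command printed at the end on each node machine to join it to the cluster.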

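3. Pod network setup on the master

Network add-ons supported on the master: https://kubernetes.io/zh/docs/concepts/cluster-administration/addons/

Here we install the Calico network add-on: https://docs.projectcalico.org/getting-started/kubernetes/self-managed-onprem/onpremises

The Calico site offers three installation methods: 1) 50 nodes or fewer, 2) more than 50 nodes, 3) the etcd datastore (not recommended by Calico).

We choose the first.

After installation, kubectl get node shows the node status. A change from NotReady to Ready means the network is working; this can take a few minutes.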
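Waiting for the nodes can be scripted. The sketch below assumes the usual column layout of kubectl get nodes (NAME first, STATUS second); the helper name not_ready_nodes is hypothetical. A status such as Ready,SchedulingDisabled is deliberately reported as not ready.

```shell
# Print the names of nodes whose STATUS column is not exactly "Ready".
# Reads the output of `kubectl get nodes --no-headers` from stdin.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Typical use on the master: poll until every node is Ready.
#   while [ -n "$(kubectl get nodes --no-headers | not_ready_nodes)" ]; do sleep 10; done
```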

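1. Joining nodes to the master

Do this on every node machine, after kubelet, kubeadm, and kubectl have been installed. Run the join command printed at the end of the master's initialization, and be sure to run it as root.

A success message is printed once the node has joined.

Checking the kubelet service again shows that it now starts normally.

2. Pitfalls to watch for

1: The join command must be run as root, or the node cannot join the master.

Before a node has joined the master, its kubelet service cannot start and reports an error. This is normal.

The cause is a missing file that kubeadm init generates on the master.

Nodes do not need to be initialized; running kubeadm join as root generates the file on the node.

2: The join may report that certain files already exist.

This happens when the node tried to join before: the files are created even if the join failed. Reset the node with kubeadm reset and join again.

3. Viewing nodes on the master

After the nodes have joined, kubectl get node on the master lists all of them.

The k8s cluster is now complete. For the next step, see my follow-up article: a first look at using k8s, building an nginx cluster in practice.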

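Ⅲ Basic k8s usage (part 1)

This article introduces some of the most basic k8s commands and explains a few basic concepts along the way. It is a practical article rather than an academic one. If you want some background on k8s first, read an introduction to it before continuing.

k8s follows the classic one-to-many model: one main management node (master) and many worker nodes. k8s can also be configured with several management nodes; a cluster with two or more is called highly available. k8s consists of many components, each running in its own Docker container and reaching the others over a virtual network that k8s sets up. Run kubectl get pod -n kube-system to see the component containers on all nodes.

The management node runs more k8s components than a worker node, and those extra components are what let it direct the workers. Their names are not listed here, because they do not matter for basic usage.

To understand something, first understand the idea behind it. Put simply, what does k8s do? To provide a more reliable service, you add servers and shrink each one so the load is spread out, but a growing number of virtual machines brings growing operations costs. How can a small operations team manage a large number of servers and the services on them? That is the job k8s does.

k8s abstracts a large number of servers into one unified resource pool. Operators no longer think in terms of server 1 and server 2; they see a single pool, and adding a server simply increases the pool's capacity. Beyond that, k8s treats everything usable as a resource, which gives a more uniform and simpler way to manage it.

The following sections each cover one basic command and explain related concepts. The commands in this series cover creating, deleting, updating, and querying. Because of the length, create and the less common commands after it are covered in the next article, Basic k8s usage (part 2).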

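Let's begin with the most frequently used k8s command, kubectl get. Remember that k8s treats everything as a resource; kubectl get is how you view those resources. The most common resource is the pod.

Not only are our own services packaged as pods; k8s itself runs on a set of pods. Let's look at them with kubectl get pod -n kube-system.

The -n option selects the namespace whose pods are listed. All of k8s's own pods live in the kube-system namespace.

Running kubectl get pod -n kube-system prints a list of pods.

Each row is one resource, here a pod. The number of pods you see may differ from mine, because the list includes the pods k8s runs on every node; the more nodes you add, the more pods appear. The output is read column by column.

kubectl get can list every kind of resource in k8s

So far we have only listed pods. Do not tie get to pod in your mind: a pod is just one kind of k8s resource. You can also run get svc (services), get rs (replica sets), get deploy (deployments), and so on. kubectl get pod is the most common, but whenever you want to see a resource and do not know the command, kubectl get <resource-type> is the answer.

To see more information, add the -o wide option.

With it, the output also shows each resource's IP and the node it runs on.

Remember to add -n

-n is probably the most used option of kubectl get. In real deployments resources are almost never published to the default namespace, so make a habit of adding -n to get commands.

To summarize: kubectl get lists k8s resources, kubectl get pod is the everyday command for viewing pods, and -n selects the namespace the pods are in.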

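kubectl describe shows the details of a particular resource. It works for every resource type, though it is used most often on pods, and it also accepts -n to select the namespace.

For example, kubectl describe pod <pod-name> -n kube-system shows one of the pods from the list above; replace the pod name with one of your own.

The output contains a lot of information, so let's take it in parts. First come the basic attributes, at the top of the output.

Basic attributes

The most useful are Node, Labels, and Controlled By. Node tells you which machine the pod is on, so you can check whether that machine has a problem or is down. Labels indicate the pod's general purpose and role. Controlled By tells you which kind of k8s resource created the pod, so you can continue investigating with kubectl get <resource-type>. For example, for DaemonSet/kube-flannel-ds-amd64 you can run kubectl get DaemonSet -n kube-system to get information about the owning resource.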
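The step from a Controlled By value to the follow-up command can be sketched as below. The helper owner_lookup_command is a hypothetical name; it only prints the command and does not contact the cluster.

```shell
# Turn a "Controlled By" value such as DaemonSet/kube-flannel-ds-amd64
# into the kubectl command that shows the owning resource.
owner_lookup_command() {
  ref="$1"; ns="$2"
  kind="${ref%%/*}"   # text before the first slash, e.g. DaemonSet
  name="${ref#*/}"    # text after the first slash, e.g. kube-flannel-ds-amd64
  printf 'kubectl get %s %s -n %s\n' "$kind" "$name" "$ns"
}

owner_lookup_command DaemonSet/kube-flannel-ds-amd64 kube-system
```

Naming the owner explicitly narrows the listing to that one resource instead of every DaemonSet in the namespace.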

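Container image information

In the middle of the output is the Containers section. It describes each Docker container in the pod in detail. A commonly used field is Image: when a pod shows an ImagePullBackOff error, check this field to confirm which image is being pulled. The other field names are self-explanatory.

Events

The most useful part of the describe output is the Events section, at the end.

If it lists no events, the pod is healthy. When a pod's status is not Running, this section will contain entries, and you can use them to work out in detail why the pod has a problem.

To summarize: kubectl describe <resource-type> <name> shows the details of one resource; the most common form is kubectl describe pod <pod-name> -n <namespace>. If something is wrong, the Events section at the end of the output records the cause.

To see a pod's logs, use kubectl logs <pod-name>. Note that this works only for pods. Add -f to follow the log continuously. For example, you can view the log of one of the flannel pods in the kube-system namespace; remember to change the pod name.

The log output is then printed.

If a pod's service misbehaves while its status still shows Running, use kubectl logs to look at its detailed log.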
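The three commands covered so far combine naturally into one inspection routine. This is a sketch: the function name inspect_pod and the KUBECTL variable are hypothetical conveniences, and --tail=50 is just an example limit on the number of log lines.

```shell
KUBECTL="${KUBECTL:-kubectl}"   # set KUBECTL=echo to print the commands instead of running them

# Inspect one pod in three steps: list it, describe it, then show its recent log lines.
inspect_pod() {
  ns="$1"; pod="$2"
  "$KUBECTL" get pod "$pod" -n "$ns" -o wide
  "$KUBECTL" describe pod "$pod" -n "$ns"
  "$KUBECTL" logs "$pod" -n "$ns" --tail=50
}

# To follow a log continuously instead:  kubectl logs -f <pod-name> -n <namespace>
```

Setting KUBECTL=echo first makes it easy to see exactly which commands the routine would run.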

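This article covered the purpose of k8s and some basic concepts, along with the most common commands: get, describe, and logs. With these three you can retrieve almost all the everyday information in k8s. The next article, Basic k8s usage (part 2), goes a step further and shows how to create, modify, and delete resources.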

Ⅳ Kubernetes: notes on building a Kubernetes v1.9.0 cluster with kubeadm

Goal: build a Kubernetes v1.9.0 cluster with kubeadm

Operating system: Ubuntu 16.04.3

Ubuntu-001: 192.168.1.110

Ubuntu-002: 192.168.1.106

Summary of steps:

1. Install Docker CE

2. Install kubeadm, kubectl, and kubelet

3. Initialize the Kubernetes cluster with kubeadm init

4. Join the node to the cluster with kubeadm join

Detailed steps:

Install Docker CE on Ubuntu 16.04 (using apt-get)

# Step 1: install the required system tools

sudo apt-get update

sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common

# Step 2: install the GPG key

curl -fsSL http://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -

# Step 3: add the package source

sudo add-apt-repository "deb [arch=amd64] http://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"

# Step 4: update and install Docker CE

sudo apt-get -y update

sudo apt-get -y install docker-ce

# To install a specific version of Docker CE:

# Step 1: find the available Docker CE versions:

# apt-cache madison docker-ce

#   docker-ce | 17.03.1~ce-0~ubuntu-xenial | http://mirrors.aliyun.com/docker-ce/linux/ubuntu xenial/stable amd64 Packages

#   docker-ce | 17.03.0~ce-0~ubuntu-xenial | http://mirrors.aliyun.com/docker-ce/linux/ubuntu xenial/stable amd64 Packages

# Step 2: install the chosen version (VERSION is, for example, 17.03.1~ce-0~ubuntu-xenial from the list above)

# sudo apt-get -y install docker-ce=[VERSION]

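Install kubelet, kubeadm, and kubectl

Google is blocked in mainland China, so the official instructions cannot be followed as written. Adding the Aliyun mirror makes it possible to install kubelet, kubeadm, and kubectl.

# Step 1: install the required system tools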

apt-get update && apt-get install -y apt-transport-https

# Step 2: install the GPG key

curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -

# Step 3: add the package source

cat << EOF >/etc/apt/sources.list.d/kubernetes.list

deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main

EOF

# Step 4: update and install kubelet, kubeadm, and kubectl

apt-get update

apt-get install -y kubelet kubeadm kubectl

# Or install specific versions of kubelet, kubeadm, and kubectl

apt-get install -y kubelet=1.9.6-00 kubeadm=1.9.6-00 kubectl=1.9.6-00

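# Step 5: enable kubelet at boot and start it

systemctl enable kubelet && systemctl start kubelet

Initialize the Kubernetes cluster with kubeadm

Inside China, the Kubernetes images need to be prepared in advance; see the article on obtaining the Kubernetes images from within China.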

root@Ubuntu-001:~# kubeadm init --kubernetes-version=v1.9.0 --pod-network-cidr=10.244.0.0/16

[init] Using Kubernetes version: v1.9.0

[init] Using Authorization modes: [Node RBAC]

[preflight] Running pre-flight checks.

         [WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 17.12.0-ce. Max validated version: 17.03

         [WARNING FileExisting-crictl]: crictl not found in system path

[preflight] Starting the kubelet service

[certificates] Generated ca certificate and key.

[certificates] Generated apiserver certificate and key.

[certificates] apiserver serving cert is signed for DNS names [ubuntu-001 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.1.110]

[certificates] Generated apiserver-kubelet-client certificate and key.

[certificates] Generated sa key and public key.

[certificates] Generated front-proxy-ca certificate and key.

[certificates] Generated front-proxy-client certificate and key.

[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"

[kubeconfig] Wrote KubeConfig file to disk: "admin.conf"

[kubeconfig] Wrote KubeConfig file to disk: "kubelet.conf"

[kubeconfig] Wrote KubeConfig file to disk: "controller-manager.conf"

[kubeconfig] Wrote KubeConfig file to disk: "scheduler.conf"

[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"

[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"

[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"

[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"

[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".

[init] This might take a minute or longer if the control plane images have to be pulled.

[apiclient] All control plane components are healthy after 38.006067 seconds

[uploadconfig] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace

[markmaster] Will mark node ubuntu-001 as master by adding a label and a taint

[markmaster] Master ubuntu-001 tainted and labelled with key/value: node-role.kubernetes.io/master=""

[bootstraptoken] Using token: 3ef896.6fe4c166c546aa89

[bootstraptoken] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials

[bootstraptoken] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token

[bootstraptoken] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster

[bootstraptoken] Creating the "cluster-info" ConfigMap in the "kube-public" namespace

[addons] Applied essential addon: kube-dns

[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube

  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.

Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:

https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node

as root:

kubeadm join --token 3ef896.6fe4c166c546aa89 192.168.1.110:6443 --discovery-token-ca-cert-hash sha256:

The master node is now set up.

Common errors:

1. Port 2379 is in use

[preflight] Some fatal errors occurred:

         [ERROR Port-2379]: Port 2379 is in use

Solution: run netstat -anp | grep 2379 to find the process using the port. 2379 is etcd's port, so the likely cause is running the initialization more than once. Kill that process.

2. Swap is reported as enabled

[ERROR Swap]: running with swap on is not supported. Please disable swap

[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`

Solution: run swapoff -a.

3. Other errors

https://kubernetes.io/docs/setup/independent/troubleshooting-kubeadm/

Next, configure kubectl as instructed in the kubeadm init output.

For a non-root user:

mkdir -p $HOME/.kube

sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

sudo chown $(id -u):$(id -g) $HOME/.kube/config

For the root user:

export KUBECONFIG=/etc/kubernetes/admin.conf

For pods to communicate with each other, install a pod network add-on:

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.1/Documentation/kube-flannel.yml

Once the pod network is installed, check it with:

kubectl get pods --all-namespaces -o wide

Join the Ubuntu-002 node to the cluster with kubeadm join

Docker, kubeadm, kubectl, and kubelet are installed on Ubuntu-002, and the Kubernetes images have already been pulled locally.

Using the kubeadm join command printed at the end of kubeadm init, join Ubuntu-002 to the cluster as a node.

root@Ubuntu-002:~# kubeadm join --token 6aefa6.a55aba3998eda615 192.168.1.110:6443 --discovery-token-ca-cert-hash sha256:

[preflight] Running pre-flight checks.

         [WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 17.12.0-ce. Max validated version: 17.03

         [WARNING FileExisting-crictl]: crictl not found in system path

[discovery] Trying to connect to API Server "192.168.1.110:6443"

[discovery] Created cluster-info discovery client, requesting info from "https://192.168.1.110:6443"

[discovery] Requesting info from "https://192.168.1.110:6443" again to validate TLS against the pinned public key

[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "192.168.1.110:6443"

[discovery] Successfully established connection with API Server "192.168.1.110:6443"

This node has joined the cluster:

* Certificate signing request was sent to master and a response

  was received.

* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

Node ubuntu-002 has now joined the cluster; further nodes can be added the same way. Run kubectl get nodes on the master to list them.

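To run kubectl on a machine other than the master (a node, or a remote host outside the cluster), copy the admin kubeconfig to it first:

scp root@<master ip>:/etc/kubernetes/admin.conf .

kubectl --kubeconfig ./admin.conf get nodes

Without that file, kubectl on such a machine fails with an error.

Removing node ubuntu-002 from the cluster

1. kubectl drain ubuntu-002 --delete-local-data --force --ignore-daemonsets --kubeconfig ./admin.conf

2. kubectl delete node ubuntu-002 --kubeconfig admin.conf

3. kubeadm reset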

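Ⅴ How to upgrade a kubeadm cluster smoothly

Upgrade one minor version at a time: first from 1.6.x to 1.7, then to 1.8. Versions below 1.6 are not supported.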
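The one-minor-version-at-a-time rule can be illustrated with a tiny helper that lists the intermediate targets. The function name upgrade_steps is hypothetical, and the sketch assumes a 1.x cluster.

```shell
# List the intermediate targets when upgrading one minor version at a time.
# Arguments: current minor and target minor of a 1.x cluster (for example 6 and 8).
upgrade_steps() {
  from="$1"; to="$2"
  step=$((from + 1))
  while [ "$step" -le "$to" ]; do
    echo "1.$step"
    step=$((step + 1))
  done
}

upgrade_steps 6 8   # prints 1.7 then 1.8, one per line
```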

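1. Upgrade the system packages. Update the packages for kubectl, kubeadm, kubelet, and kubernetes-cni.
a. On Debian:
sudo apt-get update
sudo apt-get upgrade

b. On CentOS/Fedora:
sudo yum update

2. Restart the kubelet

systemctl restart kubelet

3. Delete the kube-proxy DaemonSet
The upgrade updates most components automatically, but kube-proxy currently still has to be deleted by hand so that it is recreated with the correct version:
sudo KUBECONFIG=/etc/kubernetes/admin.conf kubectl delete daemonset kube-proxy -n kube-system

4. Run the kubeadm upgrade
Warning: every parameter passed to the first kubeadm init when the cluster was bootstrapped must be given again in the kubeadm init command used for the upgrade. This is a limitation that is planned to be addressed in v1.8.

sudo kubeadm init --skip-preflight-checks --kubernetes-version <DESIRED_VERSION>

For example, to upgrade to 1.7.0, run:

sudo kubeadm init --skip-preflight-checks --kubernetes-version v1.7.0

5. Upgrade the CNI provider
Your CNI provider may now have its own upgrade instructions. Check the addons page, find your CNI provider, and see whether any extra upgrade steps are required.

Reference: Upgrading kubeadm clusters from 1.7 to 1.8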

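Ⅵ A complete guide to installing K8S and creating a cluster (single master, multiple workers)

This article records, as simply, directly, and completely as possible, the steps for building a Kubernetes (K8S below) cluster with a single master and multiple worker nodes.

It relies on three core concepts, which are worth a quick look before starting.

At least 4 GB of memory is recommended.

Q: How do I view the host name?

A: Run hostname.

Q: How do I change the host name?

A: To change it permanently, run vi /etc/hostname, delete the first line (delete it, do not comment it out), type the new host name (following the naming rules), then save and reboot.

To change it only until the next reboot, set it from the command line instead.

Q: How do I view the MAC address?

A: Run ip link and look at the first network interface.

Q: How do I view the product_uuid?

A: Run sudo cat /sys/class/dmi/id/product_uuid

Note: 30000-32767 is the port range that the Services we create must use; a port outside it is rejected with a message and creation fails. This range is set by Kubernetes.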
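A quick way to check a port against that range before putting it into a Service definition. The function name is_valid_nodeport is hypothetical, and 30000-32767 is treated here as the default range.

```shell
# Succeed when a port lies inside the default NodePort range 30000-32767.
is_valid_nodeport() {
  [ "$1" -ge 30000 ] && [ "$1" -le 32767 ]
}

is_valid_nodeport 31030 && echo "31030 is usable as a NodePort"
is_valid_nodeport 8080 || echo "8080 is outside the NodePort range"
```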

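If you would rather turn the firewall off entirely, you can do that instead.

⑥ Swap must be disabled

A Swap total greater than 0 means the swap partition is enabled.

Q: How do I turn off swap?

A: Edit /etc/fstab, comment out the swap line by putting # in front of it, then save and reboot the server.

Checking the swap status again confirms the change has taken effect.

Several container engines (container runtimes, or runtimes for short) are in common use.

This article uses Docker as the container engine.

After installing it, check the installed version.

When you hit odd errors that may be related to the Docker engine, try removing Docker completely and reinstalling it, but first check whether any images, containers, volumes, or configuration files need to be backed up.

The steps for uninstalling the Docker engine are:

① Uninstall the Docker Engine, CLI, and containerd packages.

② Images, containers, volumes, and custom configuration files on the host are not removed automatically. Delete all images, containers, and volumes.

③ A configuration file containing invalid characters makes Docker fail to start; delete it and recreate it.

Docker is now completely removed.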

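The official instructions use Google's yum repository, which cannot be reached from China, so it is replaced here with the yum repository provided by Alibaba Cloud.

① Install

The installation output shows that the version is 1.22:

Installing:

kubeadm x86_64 1.22.4-0 kubernetes 9.3 M

kubectl x86_64 1.22.4-0 kubernetes 9.7 M

kubelet x86_64 1.22.4-0 kubernetes 20 M

② Start

The cgroup driver is just a driver; take care not to confuse cgroup with cgroupfs.

To quote the official documentation:

"Because kubeadm manages the kubelet as a system service, the systemd driver is recommended for kubeadm-based installations instead of the cgroupfs driver."

kubeadm uses the systemd driver by default, while our Docker defaults to cgroupfs (docker info shows this), so Docker's driver must be changed to systemd.

① Edit the Docker configuration file

② Restart the Docker service

Running docker info again shows that the driver is now systemd.

That completes the minimum configuration for worker nodes.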

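① The image repository parameter

By default kubeadm pulls images from the k8s.gcr.io registry, which cannot be reached from China. The official documentation explicitly allows another imageRepository to be used in place of k8s.gcr.io.

--image-repository <your image repository address>

I then looked at several mirrors inside China and compared them briefly.

Based on that comparison, I chose the Alibaba Cloud mirror.

② The IP address range parameter

--pod-network-cidr=192.168.0.0/16

Note: if 192.168.0.0/16 is already in use on your network, you must choose a different pod network CIDR and replace 192.168.0.0/16 in the command above.

The cluster is initialized with kubeadm init, combining the two parameters above.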
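A sketch of how the two parameters fit into one command. The helper only prints the command rather than running it. The function name is hypothetical, and registry.aliyuncs.com/google_containers is an assumed address for the Alibaba Cloud mirror, so substitute the repository you actually chose.

```shell
# Print (not run) a kubeadm init command for a given image repository and pod CIDR.
kubeadm_init_command() {
  repo="$1"; cidr="$2"
  printf 'kubeadm init --image-repository %s --pod-network-cidr=%s\n' "$repo" "$cidr"
}

# registry.aliyuncs.com/google_containers is an assumed Alibaba Cloud mirror address;
# 192.100.0.0/16 is the custom pod CIDR this article uses later on.
kubeadm_init_command registry.aliyuncs.com/google_containers 192.100.0.0/16
```

Run the printed command as root on the master once the values look right.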

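Because this is a demo machine, the full output is shown for reference; in real work, take care to protect sensitive information. (My IP address range is customized to suit the demonstrations below. The first init also has to download the images, which usually takes a few minutes.)

The cluster has initialized successfully. Pay close attention to the instructions at the end of the output: three further steps are required after a successful initialization.

Note: if you find after a successful init that the parameters need adjusting, run kubeadm reset. It makes a best-effort attempt to revert the changes made by kubeadm init or kubeadm join.

To start using your cluster, you need to run the following as a regular user:

In other words, before using the cluster as a regular (non-root) user, run the commands printed at that point in the output.

Alternatively, if you are the root user, you can run:

In other words, as root you can run the alternative command printed there instead.

(Note: export takes effect only for the current session, so it has to be run again after every login.)

Network configuration means configuring the pod network. I chose Calico as the network add-on.

The CIDR is the IP address range. If you use the pod CIDR 192.168.0.0/16, skip to the next step.

This article uses the pod CIDR 192.100.0.0/16, so I need to uncomment the CALICO_IPV4POOL_CIDR variable in the manifest and set it to the same value as my chosen pod CIDR. (Be careful with the formatting and keep the indentation aligned.)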
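The edit can also be scripted. The sketch below assumes the manifest contains the two commented lines in the form "# - name: CALICO_IPV4POOL_CIDR" followed by "#   value: "192.168.0.0/16"", which may differ between Calico versions, so check the file afterwards. The function name and the calico.yaml file name are hypothetical.

```shell
# Uncomment CALICO_IPV4POOL_CIDR in a Calico manifest and set it to the given CIDR,
# keeping the original indentation so the YAML stays aligned.
set_calico_pool_cidr() {
  manifest="$1"; cidr="$2"
  sed -i -E \
    -e 's|^([[:space:]]*)# (- name: CALICO_IPV4POOL_CIDR)|\1\2|' \
    -e "s|^([[:space:]]*)#   value: \"192\.168\.0\.0/16\"|\1  value: \"${cidr}\"|" \
    "$manifest"
}

# Typical use:
#   set_calico_pool_cidr calico.yaml 192.100.0.0/16
#   kubectl apply -f calico.yaml
```

Removing exactly the "# " prefix is what keeps the list item and its value at the same indentation as the neighbouring environment variables.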

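Customize the manifest further if needed; usually nothing more is required and this step can be skipped.

Run the join command on every worker node (copy the join command returned after the successful initialization and run it on each worker).

Check the status of all nodes on the master.

The cluster is now created.

Finally, I install the Kubernetes web UI, kubernetes-dashboard, for day-to-day convenience.

① Download the YAML file

② Edit the YAML file, adding type and nodePort so the service can be reached from outside

③ Install it and check that it is running

④ Create a user

Save the file once it is written, then apply it.

⑤ Get the token used to log in

⑥ Log in to the dashboard

192.168.189.128 is the IP of my master server. Note that HTTPS is required and that a browser in IE engine mode will not work.

Paste the token generated in step ⑤ into the input box and click to log in.

The dashboard is now installed and configured.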

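Q: How do I check on resources?

A: Run the corresponding kubectl get command on the master (-o wide shows more detail).

① List all nodes

② List all namespaces

③ List the pods in a namespace

④ List the pods in all namespaces

⑤ Watch the pods in a namespace in real time

Q: kubeadm join fails with [ERROR Port-10250]: Port 10250 is in use. How do I fix it?

A: A previous join attempt failed. Run kubeadm reset first, then join again.

Q: The network interface suddenly disappeared while testing in a virtual machine. How do I fix it? (An off-topic note.)

A:

① Confirm which interface is missing; its name starts with ens (optional step)

ifconfig -a

② Bring the interface back from the command line.

Q: How do I check the K8S version?

A: kubectl version

Q: What if I have forgotten the join command, or it has expired?

A:

Generate one that never expires

Generate one that is valid for 24 hours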
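Both variants can be produced with kubeadm token create, whose --ttl option sets the token lifetime (0 means the token never expires) and whose --print-join-command option prints the complete join command. The wrapper names and the KUBEADM variable below are hypothetical conveniences.

```shell
KUBEADM="${KUBEADM:-kubeadm}"   # set KUBEADM=echo to print the commands instead of running them

# Print a fresh join command whose token never expires (--ttl 0).
join_command_no_expiry() {
  "$KUBEADM" token create --ttl 0 --print-join-command
}

# Print a fresh join command whose token is valid for 24 hours.
join_command_24h() {
  "$KUBEADM" token create --ttl 24h --print-join-command
}
```

Run either function as root on the master, then copy the printed command to the worker. A non-expiring token is convenient on a test cluster but is best avoided in production.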

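Q: What if a Pod keeps restarting and there are no other error messages?

A: This usually happens when the cluster has only a master and no worker nodes. A master is created with a taint by default, which prevents new Pods from being scheduled on it. If you need Pods to run there (this is not recommended), remove the taint from the master.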
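A sketch of the taint removal, assuming the taint key is node-role.kubernetes.io/master, as applied by the kubeadm versions used in this article (newer releases use a different key). The wrapper name and the KUBECTL variable are hypothetical.

```shell
KUBECTL="${KUBECTL:-kubectl}"   # set KUBECTL=echo to print the command instead of running it

# Remove the master taint from every node that carries it.
# The trailing "-" after the taint key means "remove".
untaint_masters() {
  "$KUBECTL" taint nodes --all node-role.kubernetes.io/master-
}
```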

The command prints a confirmation when it succeeds.

Ⅶ Troubleshooting kubeadm

As with any program, you might run into an error installing or running kubeadm. This page lists some common failure scenarios and provides steps that can help you understand and fix the problem.

ebtables or some similar executable not found during installation

If you see warnings about missing executables while running kubeadm init, you may be missing ebtables, ethtool, or a similar executable on your node. Install them with your distribution's package manager.

kubeadm init can hang partway through its output. This may be caused by a number of problems. The most common include:

A cgroup driver problem, for which there are two common fixes.

Control plane Docker containers that are crashlooping or hanging. You can check this by running docker ps and investigating each container by running docker logs.

Problems can also occur if Docker halts and does not remove any Kubernetes-managed containers. A possible solution is to restart the Docker service and then re-run kubeadm reset. Inspecting the logs for docker may also be useful.

Right after kubeadm init there should not be any pods in the RunContainerError, CrashLoopBackOff, or Error state.

If there are pods in one of these states right after kubeadm init, please open an issue in the kubeadm repo. coredns (or kube-dns) should be in the Pending state until you have deployed the network solution.

If you see Pods in the RunContainerError, CrashLoopBackOff, or Error state after deploying the network solution and nothing happens to coredns (or kube-dns), it is very likely that the Pod Network solution that you installed is somehow broken.

You might have to grant it more RBAC privileges or use a newer version. Please file an issue in the Pod Network provider's issue tracker and get the issue triaged there.

If you install a version of Docker older than 1.12.1, remove the MountFlags=slave option when booting dockerd with systemd and restart docker. You can see the MountFlags in /usr/lib/systemd/system/docker.service. MountFlags can interfere with volumes mounted by Kubernetes, and put the Pods in CrashLoopBackOff state. The error happens when Kubernetes does not find /var/run/secrets/kubernetes.io/serviceaccount files.

coredns (or kube-dns) is stuck in the Pending state

This is expected and part of the design. kubeadm is network provider-agnostic, so the admin should install the pod network solution of choice. You have to install a Pod Network before CoreDNS may be deployed fully. Hence the Pending state before the network is set up.

The HostPort and HostIP functionality is available depending on your Pod Network provider. Please contact the author of the Pod Network solution to find out whether HostPort and HostIP functionality are available.

Calico, Canal, and Flannel CNI providers are verified to support HostPort.

For more information, see the CNI portmap documentation.

If your network provider does not support the portmap CNI plugin, you may need to use the NodePort feature of services or use HostNetwork=true.

Many network add-ons do not yet enable hairpin mode, which allows pods to access themselves via their Service IP. This is an issue related to CNI. Please contact the network add-on provider to get the latest status of their support for hairpin mode.

If you are using VirtualBox (directly or via Vagrant), you will need to ensure that hostname -i returns a routable IP address. By default the first interface is connected to a non-routable host-only network. A workaround is to modify /etc/hosts; see this Vagrantfile for an example.

Certain kubectl errors indicate a possible certificate mismatch.

Other errors might indicate that something is wrong in the pod network.

If you are using flannel as the pod network inside Vagrant, then you will have to specify the default interface name for flannel.

Vagrant typically assigns two interfaces to all VMs. The first, for which all hosts are assigned the IP address 10.0.2.15, is for external traffic that gets NATed.

This may lead to problems with flannel, which defaults to the first interface on a host. This leads to all hosts thinking they have the same public IP address. To prevent this, pass the --iface eth1 flag to flannel so that the second interface is chosen.

In some situations kubectl logs and kubectl run commands may return errors in an otherwise functional cluster.

Use ip addr show to check for this scenario instead of ifconfig, because ifconfig will not display the offending alias IP address.

Alternatively, an API endpoint specific to Digital Ocean allows you to query for the anchor IP from the droplet.

The workaround is to tell kubelet which IP to use with --node-ip. When using Digital Ocean, it can be the public one (assigned to eth0) or the private one (assigned to eth1) should you want to use the optional private network. The KubeletExtraArgs section of the kubeadm NodeRegistrationOptions structure can be used for this.

Then restart kubelet:

systemctl daemon-reload
systemctl restart kubelet

coredns pods have CrashLoopBackOff or Error state

If you have nodes that are running SELinux with an older version of Docker you might experience a scenario where the coredns pods are not starting. To solve that you can try one of the following options:

Upgrade to a newer version of Docker.

Disable SELinux.

Modify the coredns deployment to set allowPrivilegeEscalation to true:

kubectl -n kube-system get deployment coredns -o yaml |
sed 's/allowPrivilegeEscalation: false/allowPrivilegeEscalation: true/g' |
kubectl apply -f -

Another cause for CoreDNS to have CrashLoopBackOff is when a CoreDNS Pod deployed in Kubernetes detects a loop. A number of workarounds are available to avoid Kubernetes trying to restart the CoreDNS Pod every time CoreDNS detects the loop and exits.

Warning: Disabling SELinux or setting allowPrivilegeEscalation to true can compromise the security of your cluster.

etcd pods restart continually

If you encounter the following error:

rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:110: decoding init error from pipe caused "read parent: connection reset by peer""

this issue appears if you run CentOS 7 with Docker 1.13.1.84. This version of Docker can prevent the kubelet from executing into the etcd container.

To work around the issue, choose one of these options:

Roll back to an earlier version of Docker, such as 1.13.1-75:

yum downgrade docker-1.13.1-75.git8633870.el7.centos.x86_64 docker-client-1.13.1-75.git8633870.el7.centos.x86_64 docker-common-1.13.1-75.git8633870.el7.centos.x86_64

Install one of the more recent recommended versions, such as 18.06:

sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install docker-ce-18.06.1.ce-3.el7.x86_64

Not possible to pass a comma separated list of values to arguments inside a --component-extra-args flag

kubeadm init flags such as --component-extra-args allow you to pass custom arguments to a control-plane component like the kube-apiserver. However, this mechanism is limited due to the underlying type used for parsing the values (mapStringString).

If you decide to pass an argument that supports multiple, comma-separated values such as --apiserver-extra-args "enable-admission-plugins=LimitRanger,NamespaceExists" this flag will fail with flag: malformed pair, expect string=string. This happens because the list of arguments for --apiserver-extra-args expects key=value pairs and in this case NamespaceExists is considered as a key that is missing a value.

Alternatively, you can try separating the key=value pairs like so: --apiserver-extra-args "enable-admission-plugins=LimitRanger,enable-admission-plugins=NamespaceExists" but this will result in the key enable-admission-plugins only having the value of NamespaceExists.

A known workaround is to use the kubeadm configuration file.
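A sketch of that workaround, assuming the v1beta2 kubeadm configuration API; the apiVersion depends on the kubeadm release in use, and the file name kubeadm-config.yaml is arbitrary. In a configuration file the whole comma-separated list is a single YAML string, so it is not split into key=value pairs.

```shell
# Write a kubeadm configuration file that carries the comma-separated plugin list.
cat > kubeadm-config.yaml <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
apiServer:
  extraArgs:
    enable-admission-plugins: "LimitRanger,NamespaceExists"
EOF

# Then, as root:  kubeadm init --config kubeadm-config.yaml
```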

kube-proxy scheduled before node is initialized by cloud-controller-manager

In cloud provider scenarios, kube-proxy can end up being scheduled on new worker nodes before the cloud-controller-manager has initialized the node addresses. This causes kube-proxy to fail to pick up the node's IP address properly and has knock-on effects to the proxy function managing load balancers.

The following error can be seen in kube-proxy Pods:

server.go:610] Failed to retrieve node IP: host IP unknown; known addresses: []
proxier.go:340] invalid nodeIP, initializing kube-proxy with 127.0.0.1 as nodeIP

A known solution is to patch the kube-proxy DaemonSet to allow scheduling it on control-plane nodes regardless of their conditions, keeping it off of other nodes until their initial guarding conditions abate:

kubectl -n kube-system patch ds kube-proxy -p='{ "spec": { "template": { "spec": { "tolerations": [ { "key": "CriticalAddonsOnly", "operator": "Exists" }, { "effect": "NoSchedule", "key": "node-role.kubernetes.io/master" } ] } } } }'

The tracking issue for this problem is here.

The NodeRegistration.Taints field is omitted when marshalling kubeadm configuration

Note: This issue only applies to tools that marshal kubeadm types (e.g. to a YAML configuration file). It will be fixed in kubeadm API v1beta2.

By default, kubeadm applies the node-role.kubernetes.io/master:NoSchedule taint to control-plane nodes. If you prefer kubeadm to not taint the control-plane node, and set InitConfiguration.NodeRegistration.Taints to an empty slice, the field will be omitted when marshalling. When the field is omitted, kubeadm applies the default taint.

There are at least two workarounds:

Use the node-role.kubernetes.io/master:PreferNoSchedule taint instead of an empty slice. Pods will get scheduled on masters, unless other nodes have capacity.

Remove the taint after kubeadm init exits:

kubectl taint nodes NODE_NAME node-role.kubernetes.io/master:NoSchedule-

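Ⅷ Building a highly available K8S cluster with kubeadm (verified in January 2022)

kubeadm is a tool released by the official community for deploying Kubernetes clusters quickly.

It can deploy a Kubernetes cluster with two commands: kubeadm init creates a master node, and kubeadm join adds a node to the cluster.

Before starting, the machines for the Kubernetes cluster must meet a few requirements.

3.1 Install the required packages and keepalived

3.2 Configure the master nodes

master1 configuration

master2 configuration

3.3 Start and check

Run on both master nodes

After starting, check the network interface information on master1

4.1 Install

4.2 Configure

Both master nodes use the same configuration. It declares the two master servers as the proxied backends and sets haproxy to listen on port 16443, so port 16443 is the entry point to the cluster.

4.3 Start and check

Start it on both masters

Check the port

Kubernetes uses Docker as its default CRI (container runtime), so install Docker first.

5.1 Install Docker

5.2 Add the Alibaba Cloud YUM repository

5.3 Install kubeadm, kubelet, and kubectl

Because releases are frequent, a specific version is installed here.

6.1 Create the kubeadm configuration file

Do this on the master that holds the VIP, here master1

6.2 Run on master1

Save the output as instructed (from the kubeadm init output); it is needed shortly.

Configure the environment as instructed so that kubectl can be used.

Check the cluster status

Create kube-flannel.yml on master1

Install the flannel network

Check

8.1 Copy the keys and related files

Copy the keys and related files from master1 to master2

8.2 Join master2 to the cluster

Run the join command printed by init on master1, with the --control-plane option, which joins the machine as a control-plane (master) node (see the earlier kubeadm init output).

Check the status (run on master1)

Run on node1

To add a new node to the cluster, run the kubeadm join command printed by kubeadm init (see the earlier output; do not add --control-plane).

Reinstall the cluster network, because a new node has been added (run on master1)

Check the status (run on master1)

Create a pod in the Kubernetes cluster and verify that it runs normally.

Access URL: http://192.168.3.158:31030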

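Ⅸ Fixing certificate expiry on k8s 1.14

This one has been painful. I have been fighting certificate expiry and automatic renewal for three years, and it still hurts.
The one-size-fits-all fix is still to patch the kubeadm source so that every certificate lasts 100 years.
But what if clusters are already running in production, some with 10-year certificates and some with 1-year ones?

First, check when each k8s certificate expires.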
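One common way to do this check is with openssl, sketched below. The function name show_cert_expiry is hypothetical, and the sketch assumes the certificates are the .crt files that kubeadm places under /etc/kubernetes/pki.

```shell
# Print the expiry date of every .crt file in a directory (default: /etc/kubernetes/pki).
show_cert_expiry() {
  dir="${1:-/etc/kubernetes/pki}"
  for crt in "$dir"/*.crt; do
    [ -e "$crt" ] || continue                 # no certificates in this directory
    printf '%s: ' "$crt"
    openssl x509 -in "$crt" -noout -enddate   # prints notAfter=<date>
  done
}

# Typical use, as root on the master:
#   show_cert_expiry
#   show_cert_expiry /etc/kubernetes/pki/etcd
```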

Output for the certificates under pki:

Output for the certificates under /etc/kubernetes:

cp -R /etc/kubernetes /etc/kubernetes$(date "+%Y%m%d")

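Or do it all with one command:
kubeadm init phase kubeconfig all
One detail to note: before doing its work, kubeadm looks up the cluster's K8s version information on the internet. On a purely internal corporate network with no outside access, it hangs at this point.
It can still be done offline, however.
First generate a config view file from the cluster.
kubeadm config view > kubeadm.conf
Then pass this file with --config when generating certificates, for example:
kubeadm alpha phase kubeconfig scheduler --config kubeadm.conf
(This is the kubeadm 1.10 form of the command. In newer versions it has moved out of alpha; -h shows the current form.)

If you are out of practice, check the help output.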

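1. Back up first; a backup keeps you safe.
cp -R /etc/kubernetes /etc/kubernetes$(date "+%Y%m%d")
2. Rename the certificates that are about to expire.

3. Generate the k8s config view, then use kubeadm to generate the new certificate pairs.

4. Renew the other expiring certificates in turn, including the certificate pairs used to connect to etcd.
5. Note that three root certificate pairs expire after 20 years; I did not renew them (mainly because I do not know what would happen afterwards).

6. The command for checking certificate expiry differs between versions, so it is worth recording it again.
Check expiry for the certificates in /etc/kubernetes/pki

Check expiry for the certificates embedded in the conf files under /etc/kubernetes/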

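Ⅹ Common k8s commands

Commands typed at the command line fall into two kinds.

Ways of managing resources

Operate on k8s resources directly with a command; the command and its parameters appear together.

Operate on k8s resources with a command plus a configuration file; the command is the same, but the parameters live in the configuration file.

Use apply to create resources.

Notes

Run the delete command on the master node to remove a node.

Then clear the kubeadm init configuration on the worker node as well.

First create a token on the master.

Then run the join command on the node that is to join the cluster; this is the command returned by kubeadm token create --print-join-command.

Notes

Remember that names cannot contain an underscore (_) but can contain a hyphen (-).

First method: create from the command line

Second method: command line plus configuration file
Create a file named namespace-dev.yaml that defines the namespace (mind the capitalization; the first letter of the kind value must be uppercase).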
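A minimal sketch of such a file, assuming the namespace is called dev, as the file name suggests. Note the capitalized Namespace in the kind field.

```shell
# Write a minimal Namespace manifest to namespace-dev.yaml.
cat > namespace-dev.yaml <<'EOF'
apiVersion: v1
kind: Namespace
metadata:
  name: dev
EOF

# Create it with:  kubectl apply -f namespace-dev.yaml
# Delete it with:  kubectl delete -f namespace-dev.yaml
```

Keeping the manifest in a file means the same file can be used later to delete the namespace.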

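Then run the command to create it.

Note that deleting a namespace also deletes the pods, deployments, and containers in it.

First method: delete with a command

Second method: delete with the configuration file

Notes

List the pods in all namespaces and watch for changes
Adding -w watches for resource changes. The command blocks, and any change to a pod is shown immediately.

Other options

A pod must contain at least one container, so the container is created together with the pod. Create a file named pod.yaml for it.

Then run the command, pointing it at the configuration file, to create the pod.

The next example labels a pod. The label is created together with the pod; create a configuration file named label.yaml for it.

Run the command to create the pod.

Suitable for updating a label's value; the label key must already exist.

Delete the label whose key is labelKey

There are many kinds of pod controller; here we use a deployment.

Use the run command to start nginx; the deployment is named run-cmd-nginx-deploy-3.

The resulting resource automatically gets the label app=run-cmd-nginx-deploy-3.

Create a deployment.yaml file for the deployment.

Note that deleting a pod controller also deletes all the pods and containers under it.

A pod is reachable only from inside the cluster by default, so a port has to be opened for outside access. Creating a service is how that port is exposed.

Notes

After creating the service, view its details; in this example the exposed port is 30474.

Create a service.yaml file for the service.

Any of the following three forms works.

Query pod controllers and pods

An Endpoint is a Kubernetes resource object stored in etcd. It records the access addresses of all the pods behind a service, and it is generated from the selector in the service's configuration file.

A Service is backed by a group of Pods, which are exposed through Endpoints; Endpoints are the set of endpoints that actually implement the service. In other words, the link between a service and its pods is made through endpoints.

Whenever a service is created, k8s automatically creates an Endpoint with the same name.

If an endpoint was created by a service, deleting it causes one with the same name to be recreated immediately; to remove it, delete the service first.

Because k8s creates a same-named Endpoint for every service, we only need to create the service.

--help shows the help text. If you do not know how to use a command, use --help to look up its usage.

explain shows the structure of a resource as written in a configuration file. If you do not know which fields a resource supports, use the explain command to find out.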
