Lab-41: Create Multi-node infrastructure using Vagrant

I will create a multi-node Vagrant configuration and use it for my future labs. These nodes have very basic provisioning: hostname, IP address, and some useful packages.

It is really simple to set up VMs using Vagrant. You can read more about Vagrant here. In this lab I am creating a 3-VM configuration using the Vagrant box centos/7. I will add basic packages and then save the VMs as Vagrant boxes for future use.

You can check the Vagrant registry for boxes here.

Below is the procedure I followed to build my multi-node infrastructure.

  1. Download Vagrant. I am using a CentOS 7 host machine; select the Vagrant package according to your machine here.
  2. Install the Vagrant package:
$ sudo rpm -Uvh vagrant_1.9.1_x86_64.rpm
We trust you have received the usual lecture from the local System
Administrator. It usually boils down to these three things:

    #1) Respect the privacy of others.
    #2) Think before you type.
    #3) With great power comes great responsibility.

[sudo] password for divine:
Preparing...                          ################################# [100%]
Updating / installing...
   1:vagrant-1:1.9.1-1                ################################# [100%]

3. Install VirtualBox. Below is the procedure from the official CentOS site. I am using VirtualBox for my VMs; it is the default provider for Vagrant. Other options are VMware, Hyper-V, and Docker.

$sudo yum update -y
$sudo yum install dkms
$sudo wget -P /etc/yum.repos.d http://download.virtualbox.org/virtualbox/rpm/rhel/virtualbox.repo
$sudo yum install VirtualBox-5.0
$sudo usermod -a -G vboxusers <your_user_name>

4. Create a Vagrantfile in the directory you want to work from. Note: Vagrant looks for the Vagrantfile in the current directory.

This command will create a generic Vagrantfile:

$vagrant init
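If you already know which box you want, vagrant init can pre-fill it for you; for example, the following writes a Vagrantfile with config.vm.box already set to centos/7:

$ vagrant init centos/7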

5. Open the Vagrantfile and add the configuration below. It will spin up 3 VMs, which I have named Master, Node1 & Node2. Each VM is assigned 2048 MB RAM, 2 CPU cores, an IP address, and a hostname.

As can be seen, I am using centos/7 for vm.box.

$ cat Vagrantfile
Vagrant.configure("2") do |config|

#VM1: Master
 config.vm.define "master" do |master|
  master.vm.box = "centos/7"
  master.vm.hostname = "Master"
  master.vm.network :private_network, ip: "192.168.11.11"
  master.ssh.insert_key = false
  config.vm.provider :virtualbox do |v|
    v.customize ["modifyvm", :id, "--memory", 2048]
    v.customize ["modifyvm", :id, "--cpus", 2]
  end
 end

#VM2: Node1
 config.vm.define "node1" do |node1|
  node1.vm.box = "centos/7"
  node1.vm.hostname = "Node1"
  node1.vm.network :private_network, ip: "192.168.11.12"
  node1.ssh.insert_key = false
  config.vm.provider :virtualbox do |v|
    v.customize ["modifyvm", :id, "--memory", 2048]
    v.customize ["modifyvm", :id, "--cpus", 2]
  end
 end

#VM3: Node2
 config.vm.define "node2" do |node2|
  node2.vm.box = "centos/7"
  node2.vm.hostname = "Node2"
  node2.vm.network :private_network, ip: "192.168.11.13"
  node2.ssh.insert_key = false
  config.vm.provider :virtualbox do |v|
    v.customize ["modifyvm", :id, "--memory", 2048]
    v.customize ["modifyvm", :id, "--cpus", 2]
  end
 end
end

6. Bring up the VMs. This command will spin up the VMs and set up two network interfaces: 1) eth0: NAT and 2) eth1: host-only, with the IP address provided in the Vagrantfile.

$ vagrant up
Bringing machine 'master' up with 'virtualbox' provider...
Bringing machine 'node1' up with 'virtualbox' provider...
Bringing machine 'node2' up with 'virtualbox' provider...
==> master: Importing base box 'centos/7'...
==> master: Matching MAC address for NAT networking...
==> master: Checking if box 'centos/7' is up to date...
==> master: Setting the name of the VM: kubernetes_master_1487800004247_23604
==> master: Clearing any previously set network interfaces...
==> master: Preparing network interfaces based on configuration...
    master: Adapter 1: nat
    master: Adapter 2: hostonly
==> master: Forwarding ports...
    master: 22 (guest) => 2222 (host) (adapter 1)
==> master: Booting VM...
==> master: Waiting for machine to boot. This may take a few minutes...
    master: SSH address: 127.0.0.1:2222
    master: SSH username: vagrant
    master: SSH auth method: private key
    master:
    master: Vagrant insecure key detected. Vagrant will automatically replace
    master: this with a newly generated keypair for better security.
    master:
    master: Inserting generated public key within guest...
    master: Removing insecure key from the guest if it's present...
    master: Key inserted! Disconnecting and reconnecting using new SSH key...
==> master: Machine booted and ready!
==> master: Checking for guest additions in VM...
    master: No guest additions were detected on the base box for this VM! Guest
    master: additions are required for forwarded ports, shared folders, host only
    master: networking, and more. If SSH fails on this machine, please install
    master: the guest additions and repackage the box to continue.
    master:
    master: This is not an error message; everything may continue to work properly,
    master: in which case you may ignore this message.
==> master: Setting hostname...
==> master: Configuring and enabling network interfaces...

<--------> output truncated

7. Check VM status.

//this command gives ssh port info for each VM
$ vagrant ssh-config
Host master
  HostName 127.0.0.1
  User vagrant
  Port 2222
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /home/divine/vagrant/proj/kubernetes/.vagrant/machines/master/virtualbox/private_key
  IdentitiesOnly yes
  LogLevel FATAL

Host node1
  HostName 127.0.0.1
  User vagrant
  Port 2200
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /home/divine/vagrant/proj/kubernetes/.vagrant/machines/node1/virtualbox/private_key
  IdentitiesOnly yes
  LogLevel FATAL

Host node2
  HostName 127.0.0.1
  User vagrant
  Port 2201
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /home/divine/vagrant/proj/kubernetes/.vagrant/machines/node2/virtualbox/private_key
  IdentitiesOnly yes
  LogLevel FATAL

$ vagrant status
Current machine states:
master                    running (virtualbox)
node1                     running (virtualbox)
node2                     running (virtualbox)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.

8. Log in to the VMs, do housekeeping, and load packages. Below is an example on my Master node.

Note: The root password for a Vagrant box is vagrant.

//ssh to Master node
$vagrant ssh master

//bring up the eth1 interface; by default eth1 is down
[vagrant@Master ~]$ sudo ifup eth1
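To make eth1 come up automatically on boot instead, you can set ONBOOT=yes in its interface config; a one-liner sketch, assuming the standard CentOS network-scripts layout and that the file currently contains ONBOOT=no:

[vagrant@Master ~]$ sudo sed -i 's/^ONBOOT=no/ONBOOT=yes/' /etc/sysconfig/network-scripts/ifcfg-eth1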

[vagrant@Master ~]$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:22:5b:53 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic eth0
       valid_lft 74600sec preferred_lft 74600sec
    inet6 fe80::5054:ff:fe22:5b53/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:f8:f1:21 brd ff:ff:ff:ff:ff:ff
    inet 192.168.11.11/24 brd 192.168.11.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fef8:f121/64 scope link
       valid_lft forever preferred_lft forever

//load some useful packages
[vagrant@Master ~]$ sudo yum update -y
[vagrant@Master ~]$ sudo yum install net-tools
[vagrant@Master ~]$ sudo yum install wget

//setup ssh key
[vagrant@Master ~]$ wget https://raw.githubusercontent.com/mitchellh/vagrant/master/keys/vagrant.pub -O .ssh/authorized_keys
[vagrant@Master ~]$ chmod 700 .ssh
[vagrant@Master ~]$ chmod 600 .ssh/authorized_keys
[vagrant@Master ~]$ chown -R vagrant:vagrant .ssh
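Because insert_key = false is set in the Vagrantfile and we just installed the public half of Vagrant's shared insecure key, you can verify key-based login over the private network; a quick check from the host, assuming the default key location:

$ ssh -i ~/.vagrant.d/insecure_private_key vagrant@192.168.11.11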

9. Package the VMs into Vagrant boxes. I will create 3 boxes, named Master, Node1 & Node2.

//we need the VirtualBox VM name to create a box; the command below provides the names
$ ps -ef | grep virtualbox
divine     366 19858  0 22:52 pts/2    00:00:00 grep --color=auto virtualbox
divine    9630     1  0 19:29 ?        00:00:14 /usr/lib/virtualbox/VBoxXPCOMIPCD
divine    9635     1  0 19:29 ?        00:00:39 /usr/lib/virtualbox/VBoxSVC --auto-shutdown
divine   10449  9635  1 19:29 ?        00:02:01 /usr/lib/virtualbox/VBoxHeadless --comment kubernetes_master_1487800004247_23604 --startvm c2acbdc8-457a-40e3-ac21-fe151b886ca7 --vrde config
divine   12633  9635  0 19:30 ?        00:01:52 /usr/lib/virtualbox/VBoxHeadless --comment kubernetes_node1_1487800067490_38506 --startvm 3bedc067-befc-4b81-9042-4a08c126ca45 --vrde config
divine   14735  9635  0 19:30 ?        00:01:51 /usr/lib/virtualbox/VBoxHeadless --comment kubernetes_node2_1487800127574_14083 --startvm cb10ab77-0940-4243-b01b-9ce2e3fcf2af --vrde config

//create box
$ vagrant package --base kubernetes_master_1487800004247_23604 --output Master --vagrantfile Vagrantfile
$ vagrant package --base kubernetes_node1_1487800067490_38506 --output Node1 --vagrantfile Vagrantfile
$ vagrant package --base kubernetes_node2_1487800127574_14083 --output Node2 --vagrantfile Vagrantfile

$ ls
Master  Node1  Node2  Vagrantfile
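Note that vm.box refers to boxes registered with Vagrant, so the packaged box files may need to be added to the local box list before the new Vagrantfile can reference them by name; a sketch:

$ vagrant box add Master ./Master
$ vagrant box add Node1 ./Node1
$ vagrant box add Node2 ./Node2
$ vagrant box list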

10. Now that I have created my boxes, I will destroy the current VMs.

$vagrant destroy

11. The last step is to modify the Vagrantfile to use my newly minted boxes. The change is in the vm.box parameter: instead of centos/7 I am providing my box names.

$ cat Vagrantfile
Vagrant.configure("2") do |config|

#VM1: Master
 config.vm.define "master" do |master|
  master.vm.box = "Master"
  master.vm.hostname = "Master"
  master.vm.network :private_network, ip: "192.168.11.11"
  master.ssh.insert_key = false
  config.vm.provider :virtualbox do |v|
    v.customize ["modifyvm", :id, "--memory", 2048]
    v.customize ["modifyvm", :id, "--cpus", 2]
  end
 end

#VM2: Node1
 config.vm.define "node1" do |node1|
  node1.vm.box = "Node1"
  node1.vm.hostname = "Node1"
  node1.vm.network :private_network, ip: "192.168.11.12"
  node1.ssh.insert_key = false
  config.vm.provider :virtualbox do |v|
    v.customize ["modifyvm", :id, "--memory", 2048]
    v.customize ["modifyvm", :id, "--cpus", 2]
  end
 end

#VM3: Node2
 config.vm.define "node2" do |node2|
  node2.vm.box = "Node2"
  node2.vm.hostname = "Node2"
  node2.vm.network :private_network, ip: "192.168.11.13"
  node2.ssh.insert_key = false
  config.vm.provider :virtualbox do |v|
    v.customize ["modifyvm", :id, "--memory", 2048]
    v.customize ["modifyvm", :id, "--cpus", 2]
  end
 end
end

Lab-40: Container orchestration: Kubernetes

In Lab-35 I went over Docker Swarm, which is a container orchestration framework from Docker. In this lab I will go over another orchestration framework called Kubernetes. Kubernetes is an open source platform developed by Google. It provides orchestration for Docker and other types of containers.

The purpose of this lab is to get familiar with Kubernetes, install it on Linux, and deploy a simple Kubernetes pod. This lab is a multi-node deployment of a Kubernetes cluster.

Let’s get familiar with Kubernetes terminology

Master:

A Master is a VM or a physical computer responsible for managing the cluster. The master coordinates all activities in your cluster, such as scheduling applications, maintaining applications’ desired state, scaling applications, and rolling out new updates.

By default pods are not scheduled on the Master. But if you would like to schedule pods on the Master, try this command on the Master:

# kubectl taint nodes --all dedicated-
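To check whether the master carries the taint (before or after removing it), you can inspect the node; a quick check, assuming the node is named master:

# kubectl describe node master | grep -i taint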

Node:

A node is a VM or a physical computer that serves as a worker machine in a Kubernetes cluster. Each node has a Kubelet, which is an agent for managing the node and communicating with the Kubernetes master. The node should also have tools for handling container operations, such as Docker.

Pod:

A pod is a group of one or more containers. All the containers in a pod are scheduled together, live together, and die together. Why does Kubernetes deploy pods and not containers? Because some applications are tightly coupled and make sense to deploy together, e.g. a web server and a cache server. You can have separate containers for the web server and the cache server but deploy them together, which ensures they are scheduled on the same node and terminated together. It is also easier to manage a pod than individual containers. Read about pods here.

A pod is similar to a VM in terms of process virtualization: both run multiple processes (in the case of a pod, containers), all processes share the same IP address, all processes can communicate over localhost, and they use a network namespace separate from the host.

Some key points about pod:

  1. Containers in a pod are always co-located and co-scheduled, and run in a shared context
  2. Pod contains one or more application containers which are relatively tightly coupled — in a pre-container world, they would have executed on the same physical or virtual machine
  3. The shared context of a pod is a set of Linux namespaces, cgroups, and potentially other facets of isolation – the same things that isolate a Docker container
  4. Containers within a pod share an IP address, port space, and hostname, and communicate with each other using localhost
  5. Every pod gets its own IP address

Below is an example of pod deployment on a node.

[Figure kubernetes_5: pod deployment on a node]

Replication controller

The replication controller in Kubernetes is responsible for replicating pods. A ReplicationController ensures that a specified number of pod “replicas” are running at any one time. It checks pod health, and if a pod dies it quickly re-creates it automatically.

API server

Kubernetes deploys the API server on the Master. The API server provides the front end to the cluster and serves the REST API. You can interact with the cluster using 1) the CLI (kubectl), 2) the REST API, or 3) the GUI. kubectl and the GUI internally use the REST API.
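For instance, once the cluster is up, the API server on the Master also listens on the local insecure port 8080 (visible in the netstat output later in this lab), so you can poke the REST API directly with curl; a small sketch:

[root@Master ~]# curl -s http://localhost:8080/api/v1/nodes
[root@Master ~]# curl -s http://localhost:8080/api/v1/namespaces/default/pods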

[Figure kubernetes_6: cluster access via kubectl, REST API, and GUI]

Prerequisite:

In this lab I am using my 3-node Vagrant infrastructure. Check Lab-41 for details on how to set up VMs in Vagrant. I have one Master and two Nodes. This is my VM topology:

[Figure kubernetes_4: VM topology with one Master and two Nodes]

My VM specification:

[root@Master ~]# cat /etc/*release*
CentOS Linux release 7.2.1511 (Core)
Derived from Red Hat Enterprise Linux 7.2 (Source)
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

CentOS Linux release 7.2.1511 (Core)
CentOS Linux release 7.2.1511 (Core)
cpe:/o:centos:centos:7

[root@Master ~]# uname -r
3.10.0-327.el7.x86_64

Procedure:

Fire up the VMs

$vagrant up

Note: There is an issue when running Kubernetes in a Vagrant VM environment. By default kubeadm picks the Vagrant NAT interface (eth0, IP 10.0.2.15), but we need it to pick the second interface (eth1), on which the Master and Nodes communicate. To force kubeadm to pick the eth1 interface, edit your /etc/hosts file so that hostname -i returns the VM's IP address:


[root@Master ~]# cat /etc/hosts
192.168.11.11 Master
[root@Master ~]# hostname -i
192.168.11.11
[root@Master ~]#

Try these steps on all VMs (Master and Nodes). I am following the installation instructions from the official Kubernetes site, which use kubeadm to install Kubernetes.


//create file kubernetes.repo in this directory /etc/yum.repos.d
[root@Master ~]# cat /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://yum.kubernetes.io/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
       https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg

//disable SELinux
[root@Master ~]# sudo setenforce 0

[root@Master ~]# sudo yum install -y docker kubeadm kubelet kubectl kubernetes-cni
[root@Master ~]# sudo systemctl enable docker && sudo systemctl start docker
[root@Master ~]# sudo systemctl enable kubelet && sudo systemctl start kubelet

Initialize Master

Try the step below on the Master only. This command initializes the master. You can allow kubeadm to pick the IP address or specify it explicitly, which I am doing here. This is the IP address of my Master machine's eth1 interface. Make sure the Nodes can reach the Master on this address.

At the end of its output the command provides the join command for the Nodes.


//this command may take a couple of minutes
[root@Master ~]#  sudo kubeadm init --api-advertise-addresses 192.168.11.11
[kubeadm] WARNING: kubeadm is in alpha, please do not use it for production clusters.
[preflight] Running pre-flight checks
[init] Using Kubernetes version: v1.5.3
[tokens] Generated token: "084173.692e29a481ef443d"
[certificates] Generated Certificate Authority key and certificate.
[certificates] Generated API Server key and certificate
[certificates] Generated Service Account signing keys
[certificates] Created keys and certificates in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[apiclient] Created API client, waiting for the control plane to become ready
[apiclient] All control plane components are healthy after 51.082637 seconds
[apiclient] Waiting for at least one node to register and become ready
[apiclient] First node is ready after 1.017582 seconds
[apiclient] Creating a test deployment
[apiclient] Test deployment succeeded
[token-discovery] Created the kube-discovery deployment, waiting for it to become ready
[token-discovery] kube-discovery is ready after 30.503718 seconds
[addons] Created essential addon: kube-proxy
[addons] Created essential addon: kube-dns

Your Kubernetes master has initialized successfully!

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
    http://kubernetes.io/docs/admin/addons/

You can now join any number of machines by running the following on each node:

kubeadm join --token=084173.692e29a481ef443d 192.168.11.11

//keep note of the kubeadm join command from the output above; you need to run
//it on the Nodes to join the Master
kubeadm join --token=084173.692e29a481ef443d 192.168.11.11

Deploy POD network

Try this command on the Master only. Note: per the Kubernetes installation instructions, this step needs to be performed before the Nodes join.


[root@Master ~]# kubectl apply -f https://git.io/weave-kube

Once a pod network has been installed, you can confirm that it is working by checking that the kube-dns pod is Running in the output of kubectl get pods --all-namespaces.
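For example:

[root@Master ~]# kubectl get pods --all-namespaces | grep kube-dns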

And once the kube-dns pod is up and running, you can continue by joining your nodes.

Join the Master

Try the command below on both Nodes to join the Master. This command also starts the kubelet on the Nodes.


[root@Node1 ~]# kubeadm join --token=084173.692e29a481ef443d 192.168.11.11
[kubeadm] WARNING: kubeadm is in alpha, please do not use it for production clusters.
[preflight] Running pre-flight checks
[preflight] Starting the kubelet service
[tokens] Validating provided token
[discovery] Created cluster info discovery client, requesting info from "http://192.254.211.168:9898/cluster-info/v1/?token-id=084173"
[discovery] Cluster info object received, verifying signature using given token
[discovery] Cluster info signature and contents are valid, will use API endpoints [https://192.254.211.168:6443]
[bootstrap] Trying to connect to endpoint https://192.168.11.11:6443
[bootstrap] Detected server version: v1.5.3
[bootstrap] Successfully established connection with endpoint "https://192.168.11.11:6443"
[csr] Created API client to obtain unique certificate for this node, generating keys and certificate signing request
[csr] Received signed certificate from the API server:
Issuer: CN=kubernetes | Subject: CN=system:node:Minion_1 | CA: false
Not before: 2017-02-15 22:24:00 +0000 UTC Not After: 2018-02-15 22:24:00 +0000 UTC
[csr] Generating kubelet configuration
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"

Node join complete:
* Certificate signing request sent to master and response
  received.
* Kubelet informed of new secure connection details.

Run 'kubectl get nodes' on the master to see this machine join.

Let’s check which Kubernetes processes have started on the Master.


//these are the Kubernetes-related processes running on the Master
[root@Master ~]# netstat -pan
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:2379          0.0.0.0:*               LISTEN      21274/etcd
tcp        0      0 127.0.0.1:10251         0.0.0.0:*               LISTEN      21091/kube-schedule
tcp        0      0 127.0.0.1:10252         0.0.0.0:*               LISTEN      21540/kube-controll
tcp        0      0 127.0.0.1:2380          0.0.0.0:*               LISTEN      21274/etcd
tcp        0      0 127.0.0.1:8080          0.0.0.0:*               LISTEN      21406/kube-apiserve
tcp        0      0 0.0.0.0:6783            0.0.0.0:*               LISTEN      4820/weaver
tcp        0      0 127.0.0.1:6784          0.0.0.0:*               LISTEN      4820/weaver
tcp        0      0 127.0.0.1:10248         0.0.0.0:*               LISTEN      20432/kubelet

//all kube-system pods are running which is a good sign
[root@Master ~]# kubectl get pods --all-namespaces
NAMESPACE     NAME                              READY     STATUS    RESTARTS   AGE
kube-system   dummy-2088944543-hp7nl            1/1       Running   0          1d
kube-system   etcd-master                       1/1       Running   0          1d
kube-system   kube-apiserver-master             1/1       Running   0          1d
kube-system   kube-controller-manager-master    1/1       Running   0          1d
kube-system   kube-discovery-1769846148-qtjkn   1/1       Running   0          1d
kube-system   kube-dns-2924299975-15b4q         4/4       Running   0          1d
kube-system   kube-proxy-9rfxv                  1/1       Running   0          1d
kube-system   kube-proxy-qh191                  1/1       Running   0          1d
kube-system   kube-proxy-zhtlg                  1/1       Running   0          1d
kube-system   kube-scheduler-master             1/1       Running   0          1d
kube-system   weave-net-bc9k9                   2/2       Running   11         1d
kube-system   weave-net-nx7t0                   2/2       Running   2          1d
kube-system   weave-net-ql04q                   2/2       Running   11         1d
[root@Master ~]#

As you can see below, the Master and Nodes are in Ready state. The cluster is ready to deploy pods.


[root@Master ~]# kubectl get nodes
NAME      STATUS         AGE
master    Ready,master   1h
node1     Ready          1h
node2     Ready          1h
[root@Master ~]# kubectl cluster-info
Kubernetes master is running at http://Master:8080
KubeDNS is running at http://Master:8080/api/v1/proxy/namespaces/kube-system/services/kube-dns

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

//check components status. everything looks healthy here
[root@Master ~]# kubectl get cs
NAME                 STATUS    MESSAGE              ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health": "true"}

Kubernetes version

[root@Master ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-14T00:57:05Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.3", GitCommit:"029c3a408176b55c30846f0faedf56aae5992e9b", GitTreeState:"clean", BuildDate:"2017-02-15T06:34:56Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
[root@Master ~]#

Kubernetes UI

As I said earlier, there are 3 ways to interact with your cluster. Let’s try the UI; I am following the procedure specified here.

Try the command below to check if the Kubernetes dashboard is already installed on the Master:

[root@Master ~]# kubectl get pods --all-namespaces | grep dashboard

If it is not installed, as in my case, try the command below to install the Kubernetes dashboard:

[root@Master ~]# kubectl create -f https://rawgit.com/kubernetes/dashboard/master/src/deploy/kubernetes-dashboard.yaml
deployment "kubernetes-dashboard" created
service "kubernetes-dashboard" created

Try the command below to set up a proxy. kubectl will handle authentication with the apiserver and make the Dashboard available at http://localhost:8001/ui.

[root@Master ~]# kubectl proxy &
Starting to serve on 127.0.0.1:8001

Open a browser and point it to http://localhost:8001/ui. You should get a Kubernetes dashboard UI like the one below. You can check your cluster status and deploy pods in the cluster using the UI.

[Figure kubernetes_3: Kubernetes dashboard UI]

Deploy pod

Let’s deploy a pod using the kubectl CLI. I am using a YAML template. Create the template below on your Master. My template file name is single_container_pod.yaml.

This template will deploy a pod with one container, in this case an nginx server. I named my pod web-server and exposed container port 8000.


[root@Master ~]# cat single_container_pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-server
  labels:
    app: web-server
spec:
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 8000

Create the pod using the above template:

[root@Master ~]# kubectl create -f single_container_pod.yaml
pod "web-server" created

//1/1 means this pod is running with one container
[root@Master ~]# kubectl get pods
NAME         READY     STATUS    RESTARTS   AGE
web-server   1/1       Running   0          47s

[root@Master ~]# kubectl describe pod web-server
Name:           web-server
Namespace:      default
Node:           node2/192.168.11.13
Start Time:     Sun, 26 Feb 2017 06:29:28 +0000
Labels:         app=web-server
Status:         Running
IP:             10.36.0.1
Controllers:    <none>
Containers:
  nginx:
    Container ID:       docker://3b63cab5804d1842659c6424369e6b4a163b482f560ed6324460ea16fdce791e
    Image:              nginx
    Image ID:           docker-pullable://docker.io/nginx@sha256:4296639ebdf92f035abf95fee1330449e65990223c899838283c9844b1aaac4c
    Port:               8000/TCP
    State:              Running
      Started:          Sun, 26 Feb 2017 06:29:30 +0000
    Ready:              True
    Restart Count:      0
    Volume Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-pdsm6 (ro)
    Environment Variables:      <none>
Conditions:
  Type          Status
  Initialized   True
  Ready         True
  PodScheduled  True
Volumes:
  default-token-pdsm6:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-pdsm6
QoS Class:      BestEffort
Tolerations:    <none>
Events:
  FirstSeen     LastSeen        Count   From                    SubObjectPath           Type            Reason          Message
  ---------     --------        -----   ----                    -------------           --------        ------          -------
  21s           21s             1       {default-scheduler }                            Normal          Scheduled       Successfully assigned web-server to node2
  20s           20s             1       {kubelet node2}         spec.containers{nginx}  Normal          Pulling         pulling image "nginx"
  19s           19s             1       {kubelet node2}         spec.containers{nginx}  Normal          Pulled          Successfully pulled image "nginx"
  19s           19s             1       {kubelet node2}         spec.containers{nginx}  Normal          Created         Created container with docker id 3b63cab5804d; Security:[seccomp=unconfined]
  19s           19s             1       {kubelet node2}         spec.containers{nginx}  Normal          Started         Started container with docker id 3b63cab5804d

//this command tells you on which Node the pod is running; looks like our pod
//was scheduled on Node2
[root@Master ~]# kubectl get pods -o wide
NAME         READY     STATUS    RESTARTS   AGE       IP          NODE
web-server   1/1       Running   0          2m        10.36.0.1   node2
[root@Master ~]#

You can log in to the container using the kubectl exec command:

[root@Master ~]# kubectl exec web-server -it sh
# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
15: eth0@if16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1376 qdisc noqueue state UP group default
    link/ether ee:9d:a1:cb:db:ee brd ff:ff:ff:ff:ff:ff
    inet 10.44.0.2/12 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::ec9d:a1ff:fecb:dbee/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
# env
KUBERNETES_SERVICE_PORT=443
KUBERNETES_PORT=tcp://10.96.0.1:443
HOSTNAME=web-server
HOME=/root
KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
NGINX_VERSION=1.11.10-1~jessie
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_SERVICE_HOST=10.96.0.1
PWD=/
#
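Note that containerPort is essentially informational: the stock nginx image actually listens on port 80, not 8000. Since the weave pod network is routable from the nodes, you should be able to reach nginx on the pod IP reported by kubectl describe (10.36.0.1 here); a quick check:

[root@Node2 ~]# curl -s http://10.36.0.1:80 | head -4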

Let’s log in to Node2 and check the container using the Docker CLI.

//as can be seen, the nginx container is running on Node2
[root@Node2 ~]# docker ps
CONTAINER ID        IMAGE                                              COMMAND                  CREATED             STATUS              PORTS               NAMES
3b63cab5804d        nginx                                              "nginx -g 'daemon off"   4 minutes ago       Up 4 minutes                            k8s_nginx.f4302f56_web-server_default_ec8bd607-fbec-11e6-ac27-525400225b53_2dc5c9e9
ce8cd44bd08e        gcr.io/google_containers/pause-amd64:3.0           "/pause"                 4 minutes ago       Up 4 minutes                            k8s_POD.d8dbe16c_web-server_default_ec8bd607-fbec-11e6-ac27-525400225b53_85ec4303
5a20ca6bed11        weaveworks/weave-kube:1.9.0                        "/home/weave/launch.s"   24 hours ago        Up 24 hours                             k8s_weave.c980d315_weave-net-ql04q_kube-system_5f5b0916-fb1e-11e6-ac27-525400225b53_125b8d34
733d2927383f        weaveworks/weave-npc:1.9.0                         "/usr/bin/weave-npc"     24 hours ago        Up 24 hours                             k8s_weave-npc.a8b5954e_weave-net-ql04q_kube-system_5f5b0916-fb1e-11e6-ac27-525400225b53_7ab4a6a7
d270cb27e576        gcr.io/google_containers/kube-proxy-amd64:v1.5.3   "kube-proxy --kubecon"   24 hours ago        Up 24 hours                             k8s_kube-proxy.3cceb559_kube-proxy-zhtlg_kube-system_5f5b707f-fb1e-11e6-ac27-525400225b53_b38dc39e
042abc6ec49c        gcr.io/google_containers/pause-amd64:3.0           "/pause"                 24 hours ago        Up 24 hours                             k8s_POD.d8dbe16c_weave-net-ql04q_kube-system_5f5b0916-fb1e-11e6-ac27-525400225b53_02af8f33
56d00c47759f        gcr.io/google_containers/pause-amd64:3.0           "/pause"                 24 hours ago        Up 24 hours                             k8s_POD.d8dbe16c_kube-proxy-zhtlg_kube-system_5f5b707f-fb1e-11e6-ac27-525400225b53_56485a90

//the nginx docker image is loaded on Node2
[root@Node2 ~]# docker images
REPOSITORY                                  TAG                 IMAGE ID            CREATED             SIZE
docker.io/nginx                             latest              db079554b4d2        10 days ago         181.8 MB
gcr.io/google_containers/kube-proxy-amd64   v1.5.3              932ee3606ada        10 days ago         173.5 MB
docker.io/weaveworks/weave-npc              1.9.0               460b9ad16e86        3 weeks ago         58.22 MB
docker.io/weaveworks/weave-kube             1.9.0               568b0ac069ad        3 weeks ago         162.7 MB
gcr.io/google_containers/pause-amd64        3.0                 99e59f495ffa        9 months ago        746.9 kB

Delete the pod; try these commands on the Master.

[root@Master ~]# kubectl get pods
NAME         READY     STATUS    RESTARTS   AGE
web-server   1/1       Running   0          59m

[root@Master ~]# kubectl delete pod web-server
pod "web-server" deleted

[root@Master ~]# kubectl get pods
No resources found.

Check Node2 and make sure the container is deleted.

//as you can see there is no nginx container running on Node2
[root@Node2 ~]# docker ps
CONTAINER ID        IMAGE                                              COMMAND                  CREATED             STATUS              PORTS               NAMES
5a20ca6bed11        weaveworks/weave-kube:1.9.0                        "/home/weave/launch.s"   24 hours ago        Up 24 hours                             k8s_weave.c980d315_weave-net-ql04q_kube-system_5f5b0916-fb1e-11e6-ac27-525400225b53_125b8d34
733d2927383f        weaveworks/weave-npc:1.9.0                         "/usr/bin/weave-npc"     24 hours ago        Up 24 hours                             k8s_weave-npc.a8b5954e_weave-net-ql04q_kube-system_5f5b0916-fb1e-11e6-ac27-525400225b53_7ab4a6a7
d270cb27e576        gcr.io/google_containers/kube-proxy-amd64:v1.5.3   "kube-proxy --kubecon"   24 hours ago        Up 24 hours                             k8s_kube-proxy.3cceb559_kube-proxy-zhtlg_kube-system_5f5b707f-fb1e-11e6-ac27-525400225b53_b38dc39e
042abc6ec49c        gcr.io/google_containers/pause-amd64:3.0           "/pause"                 24 hours ago        Up 24 hours                             k8s_POD.d8dbe16c_weave-net-ql04q_kube-system_5f5b0916-fb1e-11e6-ac27-525400225b53_02af8f33
56d00c47759f        gcr.io/google_containers/pause-amd64:3.0           "/pause"                 24 hours ago        Up 24 hours                             k8s_POD.d8dbe16c_kube-proxy-zhtlg_kube-system_5f5b707f-fb1e-11e6-ac27-525400225b53_56485a90

//image remains for future use
[root@Node2 ~]# docker images
REPOSITORY                                  TAG                 IMAGE ID            CREATED             SIZE
docker.io/nginx                             latest              db079554b4d2        10 days ago         181.8 MB
gcr.io/google_containers/kube-proxy-amd64   v1.5.3              932ee3606ada        10 days ago         173.5 MB
docker.io/weaveworks/weave-npc              1.9.0               460b9ad16e86        3 weeks ago         58.22 MB
docker.io/weaveworks/weave-kube             1.9.0               568b0ac069ad        3 weeks ago         162.7 MB
gcr.io/google_containers/pause-amd64        3.0                 99e59f495ffa        9 months ago        746.9 kB
[root@Node2 ~]#

Replication controller

Create a YAML template for the replication controller. You can read more about replication controllers here.

This template replicates 10 pods via ‘replicas: 10’.

[root@Master ~]# cat web-rc.yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx
spec:
  replicas: 10
  selector:
    app: web-server
  template:
    metadata:
      name: web-server
      labels:
        app: web-server
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 8000

Create the replication controller:


[root@Master ~]# kubectl create -f web-rc.yaml
replicationcontroller "nginx" created

//10 pods created
[root@Master ~]# kubectl get pods
NAME          READY     STATUS    RESTARTS   AGE
nginx-498mx   1/1       Running   0          10s
nginx-9vfsd   1/1       Running   0          10s
nginx-dgvg6   1/1       Running   0          10s
nginx-fh4bv   1/1       Running   0          10s
nginx-k7j9d   1/1       Running   0          10s
nginx-mz5r0   1/1       Running   0          10s
nginx-q2z79   1/1       Running   0          10s
nginx-w6b4d   1/1       Running   0          10s
nginx-wkshq   1/1       Running   0          10s
nginx-wz7ss   1/1       Running   0          10s


[root@Master ~]# kubectl describe replicationcontrollers/nginx
Name:           nginx
Namespace:      default
Image(s):       nginx
Selector:       app=web-server
Labels:         app=web-server
Replicas:       10 current / 10 desired
Pods Status:    10 Running / 0 Waiting / 0 Succeeded / 0 Failed
No volumes.
Events:
  FirstSeen     LastSeen        Count   From                            SubObjectPath   Type            Reason                  Message
  ---------     --------        -----   ----                            -------------   --------        ------                  -------
  2m            2m              1       {replication-controller }                       Normal          SuccessfulCreate        Created pod: nginx-fh4bv
  2m            2m              1       {replication-controller }                       Normal          SuccessfulCreate        Created pod: nginx-k7j9d
  2m            2m              1       {replication-controller }                       Normal          SuccessfulCreate        Created pod: nginx-mz5r0
  2m            2m              1       {replication-controller }                       Normal          SuccessfulCreate        Created pod: nginx-dgvg6
  2m            2m              1       {replication-controller }                       Normal          SuccessfulCreate        Created pod: nginx-498mx
  2m            2m              1       {replication-controller }                       Normal          SuccessfulCreate        Created pod: nginx-w6b4d
  2m            2m              1       {replication-controller }                       Normal          SuccessfulCreate        Created pod: nginx-9vfsd
  2m            2m              1       {replication-controller }                       Normal          SuccessfulCreate        Created pod: nginx-q2z79
  2m            2m              1       {replication-controller }                       Normal          SuccessfulCreate        Created pod: nginx-wkshq
  2m            2m              1       {replication-controller }                       Normal          SuccessfulCreate        (events with common reason combined)
[root@Master ~]#

Delete one pod. Since we desire 10 replicas, the replication controller will start a new pod so the total is always 10.


[root@Master ~]# kubectl get pods
NAME          READY     STATUS    RESTARTS   AGE
nginx-498mx   1/1       Running   0          6m
nginx-9vfsd   1/1       Running   0          6m
nginx-dgvg6   1/1       Running   0          6m
nginx-fh4bv   1/1       Running   0          6m
nginx-k7j9d   1/1       Running   0          6m
nginx-mz5r0   1/1       Running   0          6m
nginx-q2z79   1/1       Running   0          6m
nginx-w6b4d   1/1       Running   0          6m
nginx-wkshq   1/1       Running   0          6m
nginx-wz7ss   1/1       Running   0          6m

[root@Master ~]# kubectl delete pod nginx-k7j9d
pod "nginx-k7j9d" deleted

[root@Master ~]# kubectl get pods
NAME          READY     STATUS              RESTARTS   AGE
nginx-498mx   1/1       Running             0          6m
nginx-74qp9   0/1       ContainerCreating   0          3s
nginx-9vfsd   1/1       Running             0          6m
nginx-dgvg6   1/1       Running             0          6m
nginx-fh4bv   1/1       Running             0          6m
nginx-mz5r0   1/1       Running             0          6m
nginx-q2z79   1/1       Running             0          6m
nginx-w6b4d   1/1       Running             0          6m
nginx-wkshq   1/1       Running             0          6m
nginx-wz7ss   1/1       Running             0          6m

[root@Master ~]# kubectl get pods
NAME          READY     STATUS    RESTARTS   AGE
nginx-498mx   1/1       Running   0          6m
nginx-74qp9   1/1       Running   0          6s
nginx-9vfsd   1/1       Running   0          6m
nginx-dgvg6   1/1       Running   0          6m
nginx-fh4bv   1/1       Running   0          6m
nginx-mz5r0   1/1       Running   0          6m
nginx-q2z79   1/1       Running   0          6m
nginx-w6b4d   1/1       Running   0          6m
nginx-wkshq   1/1       Running   0          6m
nginx-wz7ss   1/1       Running   0          6m
[root@Master ~]#

Increase and decrease the number of replicas:

//increase the number of replicas to 15
[root@Master ~]# kubectl scale --replicas=15 replicationcontroller/nginx
replicationcontroller "nginx" scaled
[root@Master ~]# kubectl get pods
NAME          READY     STATUS    RESTARTS   AGE
nginx-1jdn9   1/1       Running   0          7s
nginx-498mx   1/1       Running   0          17m
nginx-74qp9   1/1       Running   0          11m
nginx-9vfsd   1/1       Running   0          17m
nginx-bgdc6   1/1       Running   0          7s
nginx-dgvg6   1/1       Running   0          17m
nginx-fh4bv   1/1       Running   0          17m
nginx-j2xtf   1/1       Running   0          7s
nginx-m8vlq   1/1       Running   0          7s
nginx-mz5r0   1/1       Running   0          17m
nginx-q2z79   1/1       Running   0          17m
nginx-rmrqt   1/1       Running   0          7s
nginx-w6b4d   1/1       Running   0          17m
nginx-wkshq   1/1       Running   0          17m
nginx-wz7ss   1/1       Running   0          17m
[root@Master ~]#

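//decrease the number of replicas to 5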
[root@Master ~]# kubectl scale --replicas=5 replicationcontroller/nginx
replicationcontroller "nginx" scaled
[root@Master ~]# kubectl get pods
NAME          READY     STATUS        RESTARTS   AGE
nginx-9vfsd   1/1       Running       0          19m
nginx-dgvg6   1/1       Running       0          19m
nginx-fh4bv   1/1       Running       0          19m
nginx-mz5r0   0/1       Terminating   0          19m
nginx-q2z79   1/1       Running       0          19m
nginx-w6b4d   1/1       Running       0          19m

[root@Master ~]# kubectl get pods
NAME          READY     STATUS    RESTARTS   AGE
nginx-9vfsd   1/1       Running   0          19m
nginx-dgvg6   1/1       Running   0          19m
nginx-fh4bv   1/1       Running   0          19m
nginx-q2z79   1/1       Running   0          19m
nginx-w6b4d   1/1       Running   0          19m
[root@Master ~]#
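When you are done, deleting the replication controller also removes the pods it manages:

[root@Master ~]# kubectl delete replicationcontroller nginx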

Note: I found that after the Vagrant VMs shut down and restart, things don’t work properly; the API server doesn’t come up. The Kubernetes documentation explains these steps to restart your cluster if you get into trouble. I tried them but was only able to bring up the cluster with one Node.

Reset the cluster. Perform the command below on all VMs.


#kubeadm reset

Redo the steps:


#systemctl enable kubelet && systemctl start kubelet
#kubeadm init --api-advertise-addresses 192.168.11.11
#kubectl apply -f https://git.io/weave-kube
#kubeadm join


Lab-38: Linux bridge with Linux containers (lxc)

This is a fun lab involving Linux bridges, Linux containers, and the Spanning Tree Protocol. I will create an L2 switched network using three Linux bridges and three containers connected to the bridges. It is going to be a ring network, so we can test some basic spanning tree behavior.

Prerequisite:

Install Linux containers using the procedure in Lab-36.

In this lab I am using CentOS 7. We will be using the brctl command to create bridges, so install bridge-utils if it is not already installed:

#yum install bridge-utils

Procedure:

Create three Linux bridges, br0, br1 & br2, using the brctl command:

sudo brctl addbr <bridge name>


[root@localhost]# sudo brctl addbr br0
[root@localhost]# sudo brctl addbr br1
[root@localhost]# sudo brctl addbr br2

//bring bridge up
[root@localhost]# ifconfig br0 up
[root@localhost]# ifconfig br0
br0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::c0f6:91ff:fed2:38b0  prefixlen 64  scopeid 0x20
        ether c2:f6:91:d2:38:b0  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 6  bytes 508 (508.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@localhost]# ifconfig br2 up
[root@localhost]# ifconfig br1 up

Create virtual Ethernet (veth) links and bring them up. veth interfaces are created in pairs; these will serve as trunk ports between the bridges.


//create a veth pair, i.e. veth0 paired with veth1
[root@localhost ~]# ip link add veth0 type veth peer name veth1

[root@localhost ~]# ifconfig veth0 up
[root@localhost ~]# ifconfig veth1 up

//add veth to respective bridges (see topology diagram)
[root@localhost ~]# sudo brctl addif br0 veth0
[root@localhost ~]# sudo brctl addif br1 veth1
[root@localhost ~]# sudo brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.e274c68ffca6       no              veth0
br1             8000.d28623e04c88       no              veth1

//create veth2 paired with veth3, veth4 paired with veth5
[root@localhost ~]# ip link add veth2 type veth peer name veth3
[root@localhost ~]# ip link add veth4 type veth peer name veth5

//bring veth up
[root@localhost ~]# ifconfig veth2 up
[root@localhost ~]# ifconfig veth3 up
[root@localhost ~]# ifconfig veth4 up
[root@localhost ~]# ifconfig veth5 up

//add veth to respective bridges
[root@localhost ~]# sudo brctl addif br1 veth2
[root@localhost ~]# sudo brctl addif br2 veth3
[root@localhost ~]# sudo brctl addif br2 veth4
[root@localhost ~]# sudo brctl addif br0 veth5
//as can be seen veth interfaces are attached to bridges
[root@localhost ~]# sudo brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.c2f691d238b0       no              veth0
                                                        veth5
br1             8000.d28623e04c88       no              veth1
                                                        veth2
br2             8000.9a127ad7bf76       no              veth3
                                                        veth4

//check bridge and veth interfaces
[root@localhost ~]# ip addr
52: br0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether c2:f6:91:d2:38:b0 brd ff:ff:ff:ff:ff:ff
53: br1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether d2:86:23:e0:4c:88 brd ff:ff:ff:ff:ff:ff
54: veth1@veth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br1 state UP qlen 1000
    link/ether d2:86:23:e0:4c:88 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::d086:23ff:fee0:4c88/64 scope link
       valid_lft forever preferred_lft forever
55: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP qlen 1000
    link/ether e2:74:c6:8f:fc:a6 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::e074:c6ff:fe8f:fca6/64 scope link
       valid_lft forever preferred_lft forever
56: veth3@veth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br2 state UP qlen 1000
    link/ether 9a:12:7a:d7:bf:76 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::9812:7aff:fed7:bf76/64 scope link
       valid_lft forever preferred_lft forever
57: veth2@veth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br1 state UP qlen 1000
    link/ether ee:bf:b0:92:54:a0 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::ecbf:b0ff:fe92:54a0/64 scope link
       valid_lft forever preferred_lft forever
58: veth5@veth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP qlen 1000
    link/ether c2:f6:91:d2:38:b0 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::c0f6:91ff:fed2:38b0/64 scope link
       valid_lft forever preferred_lft forever
59: veth4@veth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br2 state UP qlen 1000
    link/ether fe:fe:da:e0:3a:09 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fcfe:daff:fee0:3a09/64 scope link
       valid_lft forever preferred_lft forever
60: br2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 9a:12:7a:d7:bf:76 brd ff:ff:ff:ff:ff:ff
[root@localhost ~]#

Create three Linux containers, c1, c2 & c3:

#lxc-create -t ubuntu -n c1
#lxc-create -t ubuntu -n c2
#lxc-create -t ubuntu -n c3


[root@localhost ~]# lxc-create -t ubuntu -n c1
Checking cache download in /var/cache/lxc/precise/rootfs-amd64 ...
Copy /var/cache/lxc/precise/rootfs-amd64 to /var/lib/lxc/c1/rootfs ...
Copying rootfs to /var/lib/lxc/c1/rootfs ...
Generating locales...
  en_US.UTF-8... up-to-date
Generation complete.
Creating SSH2 RSA key; this may take some time ...
Creating SSH2 DSA key; this may take some time ...
Creating SSH2 ECDSA key; this may take some time ...
Timezone in container is not configured. Adjust it manually.

##
# The default user is 'ubuntu' with password 'ubuntu'!
# Use the 'sudo' command to run tasks as root in the container.
##

[root@localhost ~]# lxc-create -t ubuntu -n c2
Checking cache download in /var/cache/lxc/precise/rootfs-amd64 ...
Copy /var/cache/lxc/precise/rootfs-amd64 to /var/lib/lxc/c2/rootfs ...
Copying rootfs to /var/lib/lxc/c2/rootfs ...
Generating locales...
  en_US.UTF-8... up-to-date
Generation complete.
Creating SSH2 RSA key; this may take some time ...
Creating SSH2 DSA key; this may take some time ...
Creating SSH2 ECDSA key; this may take some time ...
Timezone in container is not configured. Adjust it manually.

##
# The default user is 'ubuntu' with password 'ubuntu'!
# Use the 'sudo' command to run tasks as root in the container.
##

[root@localhost ~]# lxc-create -t ubuntu -n c3
Checking cache download in /var/cache/lxc/precise/rootfs-amd64 ...
Copy /var/cache/lxc/precise/rootfs-amd64 to /var/lib/lxc/c3/rootfs ...
Copying rootfs to /var/lib/lxc/c3/rootfs ...
Generating locales...
  en_US.UTF-8... up-to-date
Generation complete.
Creating SSH2 RSA key; this may take some time ...
Creating SSH2 DSA key; this may take some time ...
Creating SSH2 ECDSA key; this may take some time ...
Timezone in container is not configured. Adjust it manually.

##
# The default user is 'ubuntu' with password 'ubuntu'!
# Use the 'sudo' command to run tasks as root in the container.
##

Edit the container config files to use the veth network type. On my system the container config files are located at /var/lib/lxc/<container name>/config.

Container c1 is attached to bridge br0, c2 to br1, and c3 to br2.

Below is the sample container c1 config file /var/lib/lxc/c1/config:


# Network configuration
lxc.network.type = veth
lxc.network.hwaddr = 00:16:3e:80:81:03
lxc.network.flags = up
lxc.network.link = br0
lxc.network.ipv4 = 192.168.2.1/24

Below is the sample container c2 config file /var/lib/lxc/c2/config:


# Network configuration
lxc.network.type = veth
lxc.network.hwaddr = 00:16:3e:6e:70:a7
lxc.network.flags = up
lxc.network.link = br1
lxc.network.ipv4 = 192.168.2.2/24

Below is the sample container c3 config file /var/lib/lxc/c3/config:


# Network configuration
lxc.network.type = veth
lxc.network.hwaddr = 00:16:3e:18:56:d5
lxc.network.flags = up
lxc.network.link = br2
lxc.network.ipv4 = 192.168.2.3/24
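After editing the config files, start (or restart) the containers so the new network settings take effect; for example, for c1:

[root@localhost]# lxc-stop -n c1
[root@localhost]# lxc-start -n c1 -d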

Check the container network interfaces. Use the lxc-attach command to list IP interfaces:

#lxc-attach -n <container name> /sbin/ip addr


//container c1 is assigned IP address 192.168.2.1
[root@localhost]# lxc-attach -n c1 /sbin/ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
67: eth0@if68: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 00:16:3e:80:81:03 brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.1/24 brd 192.168.2.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe80:8103/64 scope link
       valid_lft forever preferred_lft forever
//container c2 assigned ip address 192.168.2.2
[root@localhost]# lxc-attach -n c2 /sbin/ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
69: eth0@if70: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 00:16:3e:6e:70:a7 brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.2/24 brd 192.168.2.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe6e:70a7/64 scope link
       valid_lft forever preferred_lft forever

//container c3 assigned ip address 192.168.2.3
[root@localhost]# lxc-attach -n c3 /sbin/ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
71: eth0@if72: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 00:16:3e:18:56:d5 brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.3/24 brd 192.168.2.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe18:56d5/64 scope link
       valid_lft forever preferred_lft forever

Check bridge status. As can be seen, three virtual Ethernet interfaces are attached to each bridge.


[root@localhost]# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.c2f691d238b0       no              veth0
                                                        veth5
                                                        vethOC0NPG
br1             8000.d28623e04c88       no              veth1
                                                        veth2
                                                        vethSQUCH3
br2             8000.9a127ad7bf76       no              veth3
                                                        veth4
                                                        veth577WF3

At this point we have created the topology below: containers c1, c2 & c3 are attached to their respective bridges, and all bridges are connected in a ring. Our L2 network is ready for testing.

[Figure stp_1: ring topology of three bridges with attached containers]

Check spanning tree (STP) status on the bridges. As can be seen, STP is disabled on all bridges, and every bridge considers itself the designated root bridge.


[root@localhost]# brctl showstp br0
br0
 bridge id              8000.c2f691d238b0
 designated root        8000.c2f691d238b0
 root port                 0                    path cost                  0
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay            15.00                 bridge forward delay      15.00
 ageing time             300.00
 hello timer               1.66                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                  53.31
 flags


veth0 (1)
 port id                8001                    state                forwarding
 designated root        8000.c2f691d238b0       path cost                  2
 designated bridge      8000.c2f691d238b0       message age timer          0.00
 designated port        8001                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.66
 flags

veth5 (2)
 port id                8002                    state                forwarding
 designated root        8000.c2f691d238b0       path cost                  2
 designated bridge      8000.c2f691d238b0       message age timer          0.00
 designated port        8002                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.66
 flags

vethOC0NPG (3)
 port id                8003                    state                forwarding
 designated root        8000.c2f691d238b0       path cost                  2
 designated bridge      8000.c2f691d238b0       message age timer          0.00
 designated port        8003                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.66
 flags

[root@localhost]# brctl showstp br1
br1
 bridge id              8000.d28623e04c88
 designated root        8000.d28623e04c88
 root port                 0                    path cost                  0
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay            15.00                 bridge forward delay      15.00
 ageing time             300.00
 hello timer               1.04                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                  45.06
 flags


veth1 (1)
 port id                8001                    state                forwarding
 designated root        8000.d28623e04c88       path cost                  2
 designated bridge      8000.d28623e04c88       message age timer          0.00
 designated port        8001                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.04
 flags

veth2 (2)
 port id                8002                    state                forwarding
 designated root        8000.d28623e04c88       path cost                  2
 designated bridge      8000.d28623e04c88       message age timer          0.00
 designated port        8002                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.04
 flags

vethSQUCH3 (3)
 port id                8003                    state                forwarding
 designated root        8000.d28623e04c88       path cost                  2
 designated bridge      8000.d28623e04c88       message age timer          0.00
 designated port        8003                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.04
 flags

[root@localhost]# brctl showstp br2
br2
 bridge id              8000.9a127ad7bf76
 designated root        8000.9a127ad7bf76
 root port                 0                    path cost                  0
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay            15.00                 bridge forward delay      15.00
 ageing time             300.00
 hello timer               1.07                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                  16.98
 flags


veth3 (1)
 port id                8001                    state                forwarding
 designated root        8000.9a127ad7bf76       path cost                  2
 designated bridge      8000.9a127ad7bf76       message age timer          0.00
 designated port        8001                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.07
 flags

veth4 (2)
 port id                8002                    state                forwarding
 designated root        8000.9a127ad7bf76       path cost                  2
 designated bridge      8000.9a127ad7bf76       message age timer          0.00
 designated port        8002                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.07
 flags

veth577WF3 (3)
 port id                8003                    state                forwarding
 designated root        8000.9a127ad7bf76       path cost                  2
 designated bridge      8000.9a127ad7bf76       message age timer          0.00
 designated port        8003                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.07
 flags

Log in to container c1 and ping container c2 (192.168.2.2). Because of the loop in the topology, the network experiences a broadcast storm and becomes congested; the ping only intermittently succeeds.

You can confirm the broadcast storm by checking the packet counts on the bridges. You will see the counts increment at a very high rate
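
A quick way to watch the counters is to poll the sysfs statistics; this is a small sketch using the bridge names from this lab:

# Refresh rx_packets for each bridge every second; during the storm
# the counters climb rapidly between refreshes
watch -n 1 'for b in br0 br1 br2; do echo -n "$b "; cat /sys/class/net/$b/statistics/rx_packets; done'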


ubuntu@c1:~$ ifconfig
eth0      Link encap:Ethernet  HWaddr 00:16:3e:80:81:03
          inet addr:192.168.2.1  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::216:3eff:fe80:8103/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:8 errors:0 dropped:0 overruns:0 frame:0
          TX packets:87 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:648 (648.0 B)  TX bytes:18666 (18.6 KB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:56 errors:0 dropped:0 overruns:0 frame:0
          TX packets:56 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1
          RX bytes:5440 (5.4 KB)  TX bytes:5440 (5.4 KB)

ubuntu@c1:~$ ip route
192.168.2.0/24 dev eth0  proto kernel  scope link  src 192.168.2.1
ubuntu@c1:~$ ping 192.168.2.2
PING 192.168.2.2 (192.168.2.2) 56(84) bytes of data.
64 bytes from 192.168.2.2: icmp_req=13 ttl=64 time=4009 ms
64 bytes from 192.168.2.2: icmp_req=19 ttl=64 time=3007 ms
64 bytes from 192.168.2.2: icmp_req=20 ttl=64 time=2005 ms

[root@localhost ~]# ifconfig br0
br0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::c0f6:91ff:fed2:38b0  prefixlen 64  scopeid 0x20<link>
        ether c2:f6:91:d2:38:b0  txqueuelen 1000  (Ethernet)
        RX packets 233826705  bytes 49096357638 (45.7 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8  bytes 648 (648.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@localhost ~]# ifconfig br0
br0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::c0f6:91ff:fed2:38b0  prefixlen 64  scopeid 0x20<link>
        ether c2:f6:91:d2:38:b0  txqueuelen 1000  (Ethernet)
        RX packets 233921147  bytes 49127334614 (45.7 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8  bytes 648 (648.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Stop the ping and check the bridge packet counts again. As can be seen, the counts are still incrementing at a very high rate: packets keep circulating in the network because of the topology loop


[root@localhost]# ifconfig br0
br0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::c0f6:91ff:fed2:38b0  prefixlen 64  scopeid 0x20
        ether c2:f6:91:d2:38:b0  txqueuelen 1000  (Ethernet)
        RX packets 158174506  bytes 28354814180 (26.4 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8  bytes 648 (648.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@localhost]# ifconfig br0
br0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::c0f6:91ff:fed2:38b0  prefixlen 64  scopeid 0x20
        ether c2:f6:91:d2:38:b0  txqueuelen 1000  (Ethernet)
        RX packets 159069415  bytes 28530733852 (26.5 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8  bytes 648 (648.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@localhost]# ifconfig br1
br1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::d086:23ff:fee0:4c88  prefixlen 64  scopeid 0x20
        ether d2:86:23:e0:4c:88  txqueuelen 1000  (Ethernet)
        RX packets 160435925  bytes 28800879432 (26.8 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3  bytes 258 (258.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@localhost]# ifconfig br1
br1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::d086:23ff:fee0:4c88  prefixlen 64  scopeid 0x20
        ether d2:86:23:e0:4c:88  txqueuelen 1000  (Ethernet)
        RX packets 161046481  bytes 28923084308 (26.9 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3  bytes 258 (258.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@localhost]# ifconfig br2
br2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::9812:7aff:fed7:bf76  prefixlen 64  scopeid 0x20
        ether 9a:12:7a:d7:bf:76  txqueuelen 1000  (Ethernet)
        RX packets 162073378  bytes 29129182172 (27.1 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8  bytes 648 (648.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@localhost]# ifconfig br2
br2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::9812:7aff:fed7:bf76  prefixlen 64  scopeid 0x20
        ether 9a:12:7a:d7:bf:76  txqueuelen 1000  (Ethernet)
        RX packets 162616331  bytes 29238047188 (27.2 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8  bytes 648 (648.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Enable spanning tree on all three bridges: br0, br1 & br2.

As can be seen, bridge br2 is elected the designated root bridge by the spanning tree protocol. This is because all three bridges share the default priority, so the tie is broken by MAC address, and br2's MAC is lower than those of br0 & br1.

Port veth1 (1) on bridge br1 is blocked by STP to break the loop, and there is no longer a broadcast storm. The new topology with STP enabled should look like this.

Why is veth1 blocked? The BPDUs arriving on that port advertise a higher path cost to the root than those arriving on veth2, which is directly connected to the root bridge

[Figure stp_2: topology with STP enabled; veth1 (1) on br1 in blocking state]

Another logical view:

[Figure stp_3: the same spanning tree drawn as a logical hierarchy rooted at br2]
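
The per-port STP state can also be read straight from sysfs, which makes it easy to spot the blocked port (the kernel encodes the state as 0 = disabled, 1 = listening, 2 = learning, 3 = forwarding, 4 = blocking):

# List the STP state of every port on br1
for p in /sys/class/net/br1/brif/*; do echo -n "$(basename $p): "; cat $p/state; done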


[root@localhost]# brctl stp br0 on
[root@localhost]# brctl stp br1 on
[root@localhost]# brctl stp br2 on

[root@localhost]# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.c2f691d238b0       yes             veth0
                                                        veth5
                                                        vethOC0NPG
br1             8000.d28623e04c88       yes             veth1
                                                        veth2
                                                        vethSQUCH3
br2             8000.9a127ad7bf76       yes             veth3
                                                        veth4
                                                        veth577WF3
//all ports in bridge br0 are in forwarding state
[root@localhost]# brctl showstp br0
br0
 bridge id              8000.c2f691d238b0
 designated root        8000.9a127ad7bf76
 root port                 2                    path cost                  2
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay            15.00                 bridge forward delay      15.00
 ageing time             300.00
 hello timer               0.00                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                  62.44
 flags                  TOPOLOGY_CHANGE


veth0 (1)
 port id                8001                    state                forwarding
 designated root        8000.9a127ad7bf76       path cost                  2
 designated bridge      8000.c2f691d238b0       message age timer          0.00
 designated port        8001                    forward delay timer        0.00
 designated cost           2                    hold timer                 0.00
 flags

veth5 (2)
 port id                8002                    state                forwarding
 designated root        8000.9a127ad7bf76       path cost                  2
 designated bridge      8000.9a127ad7bf76       message age timer         18.79
 designated port        8002                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

vethOC0NPG (3)
 port id                8003                    state                forwarding
 designated root        8000.9a127ad7bf76       path cost                  2
 designated bridge      8000.c2f691d238b0       message age timer          0.00
 designated port        8003                    forward delay timer        0.00
 designated cost           2                    hold timer                 0.00
 flags

//port veth1(1) is in blocking state
[root@localhost]# brctl showstp br1
br1
 bridge id              8000.d28623e04c88
 designated root        8000.9a127ad7bf76
 root port                 2                    path cost                  2
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay            15.00                 bridge forward delay      15.00
 ageing time             300.00
 hello timer               0.00                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                  73.66
 flags


veth1 (1)
 port id                8001                    state                  blocking
 designated root        8000.9a127ad7bf76       path cost                  2
 designated bridge      8000.c2f691d238b0       message age timer         19.64
 designated port        8001                    forward delay timer        0.00
 designated cost           2                    hold timer                 0.00
 flags

veth2 (2)
 port id                8002                    state                forwarding
 designated root        8000.9a127ad7bf76       path cost                  2
 designated bridge      8000.9a127ad7bf76       message age timer         19.64
 designated port        8001                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

vethSQUCH3 (3)
 port id                8003                    state                forwarding
 designated root        8000.9a127ad7bf76       path cost                  2
 designated bridge      8000.d28623e04c88       message age timer          0.00
 designated port        8003                    forward delay timer        0.00
 designated cost           2                    hold timer                 0.57
 flags

//all ports in bridge br2 are in forwarding state
[root@localhost]# brctl showstp br2
br2
 bridge id              8000.9a127ad7bf76
 designated root        8000.9a127ad7bf76
 root port                 0                    path cost                  0
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay            15.00                 bridge forward delay      15.00
 ageing time             300.00
 hello timer               1.68                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                  66.65
 flags


veth3 (1)
 port id                8001                    state                forwarding
 designated root        8000.9a127ad7bf76       path cost                  2
 designated bridge      8000.9a127ad7bf76       message age timer          0.00
 designated port        8001                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.68
 flags

veth4 (2)
 port id                8002                    state                forwarding
 designated root        8000.9a127ad7bf76       path cost                  2
 designated bridge      8000.9a127ad7bf76       message age timer          0.00
 designated port        8002                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.68
 flags

veth577WF3 (3)
 port id                8003                    state                forwarding
 designated root        8000.9a127ad7bf76       path cost                  2
 designated bridge      8000.9a127ad7bf76       message age timer          0.00
 designated port        8003                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.68
 flags

Let's try to change the root bridge selection. The easiest way to do this is by changing the bridge priority.

Change the priority of bridge br1 to make it the root bridge; the bridge with the lowest priority becomes root. The default priority is 32768, displayed as hex 8000 in the bridge ID. I set br1's priority to 7999; brctl takes the value in decimal, and 7999 is 0x1f3f, which is why br1's bridge ID below reads 1f3f.d28623e04c88.

As can be seen, br1 is elected root bridge, and port veth5 (2) on bridge br0 transitions to the blocking state to break the loop


[root@localhost ~]# sudo brctl setbridgeprio br1 7999
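
You can verify the hex encoding of the new priority from the shell:

printf '%x\n' 7999     # prints 1f3f, the priority field of the bridge id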

[root@localhost]# brctl showstp br1
br1
 bridge id              1f3f.d28623e04c88
 designated root        1f3f.d28623e04c88
 root port                 0                    path cost                  0
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay            15.00                 bridge forward delay      15.00
 ageing time             300.00
 hello timer               1.30                 tcn timer                  0.00
 topology change timer    19.86                 gc timer                 243.09
 flags                  TOPOLOGY_CHANGE TOPOLOGY_CHANGE_DETECTED


veth1 (1)
 port id                8001                    state                forwarding
 designated root        1f3f.d28623e04c88       path cost                  2
 designated bridge      1f3f.d28623e04c88       message age timer          0.00
 designated port        8001                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.30
 flags

veth2 (2)
 port id                8002                    state                forwarding
 designated root        1f3f.d28623e04c88       path cost                  2
 designated bridge      1f3f.d28623e04c88       message age timer          0.00
 designated port        8002                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.30
 flags

vethSQUCH3 (3)
 port id                8003                    state                forwarding
 designated root        1f3f.d28623e04c88       path cost                  2
 designated bridge      1f3f.d28623e04c88       message age timer          0.00
 designated port        8003                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.30
 flags

[root@localhost ns]# brctl showstp br0
br0
 bridge id              8000.c2f691d238b0
 designated root        1f3f.d28623e04c88
 root port                 1                    path cost                  2
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay            15.00                 bridge forward delay      15.00
 ageing time             300.00
 hello timer               0.00                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                 223.34
 flags                  TOPOLOGY_CHANGE


veth0 (1)
 port id                8001                    state                forwarding
 designated root        1f3f.d28623e04c88       path cost                  2
 designated bridge      1f3f.d28623e04c88       message age timer         19.57
 designated port        8001                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

veth5 (2)
 port id                8002                    state                  blocking
 designated root        1f3f.d28623e04c88       path cost                  2
 designated bridge      8000.9a127ad7bf76       message age timer         19.57
 designated port        8002                    forward delay timer        0.00
 designated cost           2                    hold timer                 0.00
 flags

vethOC0NPG (3)
 port id                8003                    state                forwarding
 designated root        1f3f.d28623e04c88       path cost                  2
 designated bridge      8000.c2f691d238b0       message age timer          0.00
 designated port        8003                    forward delay timer        0.00
 designated cost           2                    hold timer                 0.55
 flags

[root@localhost]# brctl showstp br2
br2
 bridge id              8000.9a127ad7bf76
 designated root        1f3f.d28623e04c88
 root port                 1                    path cost                  2
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay            15.00                 bridge forward delay      15.00
 ageing time             300.00
 hello timer               0.00                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                   0.32
 flags                  TOPOLOGY_CHANGE


veth3 (1)
 port id                8001                    state                forwarding
 designated root        1f3f.d28623e04c88       path cost                  2
 designated bridge      1f3f.d28623e04c88       message age timer         18.34
 designated port        8002                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

veth4 (2)
 port id                8002                    state                forwarding
 designated root        1f3f.d28623e04c88       path cost                  2
 designated bridge      8000.9a127ad7bf76       message age timer          0.00
 designated port        8002                    forward delay timer        0.00
 designated cost           2                    hold timer                 0.00
 flags

veth577WF3 (3)
 port id                8003                    state                forwarding
 designated root        1f3f.d28623e04c88       path cost                  2
 designated bridge      8000.9a127ad7bf76       message age timer          0.00
 designated port        8003                    forward delay timer        0.00
 designated cost           2                    hold timer                 0.00
 flags

The new topology should look like this

[Figure stp_4: br1 as the new root bridge; veth5 (2) on br0 in blocking state]

Let's change the root port on br0 from veth0 to veth5. The easiest way to do that is by raising the path cost on veth0: set it to 10. The route to the root via veth5 (total cost 4) then becomes cheaper than the route via veth0 (total cost 10), so veth5 moves to the forwarding state and veth0 to blocking.

As can be seen, port veth0 transitioned to the blocking state. Note in the first snapshot that veth5 passes through the listening state before reaching forwarding in the second


[root@localhost ~]# brctl setpathcost br0 veth0 10
[root@localhost ~]# brctl showstp br0
br0
 bridge id              8000.c2f691d238b0
 designated root        1f3f.d28623e04c88
 root port                 2                    path cost                  4
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay            15.00                 bridge forward delay      15.00
 ageing time             300.00
 hello timer               0.00                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                 226.87
 flags                  TOPOLOGY_CHANGE


veth0 (1)
 port id                8001                    state                  blocking
 designated root        1f3f.d28623e04c88       path cost                 10
 designated bridge      1f3f.d28623e04c88       message age timer         18.81
 designated port        8001                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

veth5 (2)
 port id                8002                    state                 listening
 designated root        1f3f.d28623e04c88       path cost                  2
 designated bridge      8000.9a127ad7bf76       message age timer         18.81
 designated port        8002                    forward delay timer       12.41
 designated cost           2                    hold timer                 0.00
 flags

vethOC0NPG (3)
 port id                8003                    state                forwarding
 designated root        1f3f.d28623e04c88       path cost                  2
 designated bridge      8000.c2f691d238b0       message age timer          0.00
 designated port        8003                    forward delay timer        0.00
 designated cost           4                    hold timer                 0.78
 flags

[root@localhost ~]# brctl showstp br0
br0
 bridge id              8000.c2f691d238b0
 designated root        1f3f.d28623e04c88
 root port                 2                    path cost                  4
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay            15.00                 bridge forward delay      15.00
 ageing time             300.00
 hello timer               0.00                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                  32.26
 flags


veth0 (1)
 port id                8001                    state                  blocking
 designated root        1f3f.d28623e04c88       path cost                 10
 designated bridge      1f3f.d28623e04c88       message age timer         19.07
 designated port        8001                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

veth5 (2)
 port id                8002                    state                forwarding
 designated root        1f3f.d28623e04c88       path cost                  2
 designated bridge      8000.9a127ad7bf76       message age timer         19.01
 designated port        8002                    forward delay timer        0.00
 designated cost           2                    hold timer                 0.00
 flags

vethOC0NPG (3)
 port id                8003                    state                forwarding
 designated root        1f3f.d28623e04c88       path cost                  2
 designated bridge      8000.c2f691d238b0       message age timer          0.00
 designated port        8003                    forward delay timer        0.00
 designated cost           4                    hold timer                 0.00
 flags
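
For reference, the brctl knobs used in this lab also have iproute2 equivalents, in case brctl is not available; this is a sketch assuming a reasonably recent iproute2 and kernel:

ip link set dev br0 type bridge stp_state 1        # brctl stp br0 on
ip link set dev br1 type bridge priority 7999      # brctl setbridgeprio br1 7999
ip link set dev veth0 type bridge_slave cost 10    # brctl setpathcost br0 veth0 10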

Another useful command is brctl showmacs, which dumps the bridge's MAC forwarding table


[root@localhost ~]# brctl showmacs br0
port no mac addr                is local?       ageing timer
  2     00:16:3e:18:56:d5       no                10.04
  2     00:16:3e:6e:70:a7       no                12.00
  3     00:16:3e:80:81:03       no                17.73
  2     c2:f6:91:d2:38:b0       yes                0.00
  1     e2:74:c6:8f:fc:a6       yes                0.00
  1     e2:74:c6:8f:fc:a6       yes                0.00
  3     fe:0e:18:bc:7e:e3       yes                0.00
  3     fe:0e:18:bc:7e:e3       yes                0.00
  2     fe:fe:da:e0:3a:09       no                 0.81

You can also dump packets on the virtual interfaces and bridges. In the capture below, note the root bridge ID 1f3f.d28623e04c88 inside the BPDU payload (the 1f3f d286 23e0 4c88 bytes in the hex dump), matching the priority change made earlier


[root@localhost ~]# tcpdump -i veth5 -XX
tcpdump: WARNING: veth5: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth5, link-type EN10MB (Ethernet), capture size 65535 bytes
22:03:48.738230 STP 802.1d, Config, Flags [none], bridge-id 8000.9a:12:7a:d7:bf:76.8002, length 35
        0x0000:  0180 c200 0000 fefe dae0 3a09 0026 4242  ..........:..&BB
        0x0010:  0300 0000 0000 1f3f d286 23e0 4c88 0000  .......?..#.L...
        0x0020:  0002 8000 9a12 7ad7 bf76 8002 0001 1400  ......z..v......
        0x0030:  0200 0f00                                ....
22:03:50.738236 STP 802.1d, Config, Flags [none], bridge-id 8000.9a:12:7a:d7:bf:76.8002, length 35
        0x0000:  0180 c200 0000 fefe dae0 3a09 0026 4242  ..........:..&BB
        0x0010:  0300 0000 0000 1f3f d286 23e0 4c88 0000  .......?..#.L...
        0x0020:  0002 8000 9a12 7ad7 bf76 8002 0001 1400  ......z..v......
        0x0030:  0200 0f00
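
To capture only the BPDUs and skip the data traffic, you can add the stp filter; -v makes tcpdump decode the root id, path cost and timers:

tcpdump -i veth5 -v stp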