Deploying with Kubernetes
As a cloud-native distributed database, PolarDB-X also supports deployment on Kubernetes clusters, where PolarDB-X Operator provides full lifecycle management.
Before deploying the PolarDB-X database in a Kubernetes environment according to this article, it's assumed that the following prerequisites have been met:
- A set of servers has been prepared, and their system parameters and software have been configured according to System and Environment Configuration;
- The deployment machine (ops) has internet access, or the relevant software packages and images have been prepared in advance according to Software Package Download.
Preparation
Installing Helm
PolarDB-X Operator is deployed with Helm, the standard package management and deployment tool for Kubernetes; version 3 or above is recommended.
If using an online installation method (the ops machine can access the internet), Helm needs to be downloaded and installed in advance. If using an offline installation method, as the offline package comes with Helm's binary file, this step can be skipped.
The Helm package is in binary form; choose the package according to the server's CPU architecture. For example, for x86 architecture:
wget https://labfileapp.oss-cn-hangzhou.aliyuncs.com/helm-v3.9.0-linux-amd64.tar.gz
After extracting, copy the helm file to the /usr/local/bin directory to complete the installation:
tar -zxvf helm-v3.9.0-linux-amd64.tar.gz -C $HOME/
sudo cp $HOME/linux-amd64/helm /usr/local/bin/helm
Execute the following command to output the installed version of Helm:
helm version
Installing Docker
Kubernetes uses Docker as the container runtime environment, so Docker must be installed first. Additionally, the use of a private image registry is recommended. Follow the Installing Docker and Image Repository document for deployment.
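After Docker and the registry are in place, a quick check from the ops machine can confirm that both are reachable. The commands below are illustrative only and assume the private registry is exposed at registry:5000, the address used throughout this document:
docker info
curl http://registry:5000/v2/_catalog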
Installing Kubernetes
Before installing the PolarDB-X Operator, you need to install a Kubernetes cluster. You can follow the Installing Kubernetes guide for deployment.
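Once the Kubernetes cluster is up, a simple sanity check (illustrative) is to confirm that all nodes are Ready and that the system Pods are running:
kubectl get nodes
kubectl get pods -n kube-system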
Installing PolarDB-X Operator
Directory Mapping
PolarDB-X stores data, logs, and backup files under the following directories by default. It is advisable to create the corresponding directories on the data disk in advance and point symbolic links at them:
ansible -i ${ini_file} all -m shell -a " mkdir -p /polarx/data /polarx/data-log /polarx/filestream "
ansible -i ${ini_file} all -m shell -a " ln -sf /polarx/data /data "
ansible -i ${ini_file} all -m shell -a " ln -sf /polarx/data-log /data-log "
ansible -i ${ini_file} all -m shell -a " ln -sf /polarx/filestream /filestream "
Online Installation
If the deployment machine (ops) has internet access, it is recommended to install using the online method.
- Add the PolarDB-X Helm chart repository:
helm repo add polardbx https://polardbx-charts.oss-cn-beijing.aliyuncs.com
- Check the released versions of PolarDB-X Operator:
helm search repo polardbx/polardbx-operator -l
It's recommended to install the latest version, for example, v1.7.0.
- Create the polardbx-operator-system namespace:
kubectl create namespace polardbx-operator-system
- Install version v1.7.0 using the following Helm command:
helm install --namespace polardbx-operator-system \
  --set node.volumes.data=/polarx/data --set imageRepo=registry:5000 \
  --set extension.config.images.store.galaxy.exporter=registry:5000/mysqld-exporter:master \
  --version 1.7.0 \
  polardbx-operator polardbx/polardbx-operator
- Check if the containers have started successfully:
kubectl get pods -n polardbx-operator-system
Once all container Pods are in the "Running" state, the installation is complete.
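If you prefer a command that blocks until the operator Pods are ready instead of polling manually, a kubectl wait one-liner (illustrative) can be used:
kubectl wait --for=condition=Ready pods --all -n polardbx-operator-system --timeout=300s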
Offline Installation
If the deployment environment does not have internet access, refer to Installing Kubernetes, download the offline installation package for the corresponding architecture, and copy it to the deployment machine (ops) in the deployment environment. By default, the offline package bundles the latest version of PolarDB-X Operator and its dependent Docker images.
The directory structure of the K8s offline installation package is as follows:
polardbx-install
|-- helm                                   # Helm package directory
|   |-- bin
|   |   `-- helm                           # Binary file of helm, used for installation
|   |-- logger-values.yaml                 # Values file for logcollector installation
|   |-- monitor-values.yaml                # Values file for monitor installation
|   |-- operator-values.yaml               # Values file for operator installation
|   |-- polardbx-logcollector-1.3.0.tgz    # Helm package for logcollector
|   |-- polardbx-monitor-1.3.0.tgz         # Helm package for monitor
|   `-- polardbx-operator-1.3.0.tgz        # Helm package for operator
|-- images                                 # Docker image directory
|   |-- image.list                         # List of downloaded Docker images
|   |-- image.manifest                     # Offline environment image manifest, script parameters
|   |-- load_image.sh                      # Script for importing images in an offline environment
|   |-- polardbx-init-latest-arm64.tar.gz
|   |-- xstore-tools-latest-arm64.tar.gz
|   `-- ...
`-- install.sh                             # Installation script
- On the deployment machine (ops) in the deployment environment, execute the following commands to enter the offline installation package directory, import Docker images with one click, and install PolarDB-X Operator and its related components.
cd polardbx-install
sh install.sh
The above install.sh script mainly performs the following tasks:
- Imports the Docker images from the images directory and pushes them to the specified private repository; this document uses registry:5000 as an example (a sketch of this step is shown below).
- Installs the Helm packages from the helm directory and points the image repository used by PolarDB-X Operator to the private registry address you specified, via the corresponding values.yaml files.
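For reference, the image import step is roughly equivalent to the following sketch. It is illustrative only: the actual install.sh reads image.manifest and handles tagging automatically, and registry:5000 is assumed as the private registry address.
# Import every archived image, then re-tag and push it to the private registry (example image name)
for f in images/*.tar.gz; do docker load -i "$f"; done
docker tag polardbx/polardbx-init:latest registry:5000/polardbx-init:latest
docker push registry:5000/polardbx-init:latest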
Check whether the containers have started successfully:
kubectl get pods -n polardbx-operator-system
When all container Pods are in the "Running" state, it means the installation is complete.
Deploying PolarDB-X
Planning the Cluster Topology
A PolarDB-X database cluster consists of Compute Nodes (CN), Data Nodes (DN), a Global Meta Service (GMS), and, optionally, Change Data Capture (CDC) nodes. The minimum production deployment requires three servers, while a typical cluster contains at least two compute nodes, two or more groups of data nodes, one group of GMS nodes, and optionally one group of CDC nodes. Data Nodes (DN) and the Global Meta Service (GMS) achieve high availability through the X-Paxos protocol, so each group of data nodes and the metadata service consist of three independent replica containers, which must be deployed on different servers.
The main principles for planning the PolarDB-X cluster topology based on different server models/quantities are:
- Due to different requirements for computing and I/O capabilities, it is recommended to deploy Compute Nodes (CN) and Data Nodes (DN) on two separate groups of servers.
- Select servers with better I/O capabilities and larger disk space to deploy Data Nodes (DN); servers with weaker I/O but superior CPU capabilities should be allocated for Compute Nodes (CN).
- If a single server has fewer than 16 CPUs, it is advisable to deploy one Compute Node (CN); otherwise, deploy multiple Compute Nodes.
- All Compute Nodes (CN) should be allocated the same server resources to prevent the "weakest link effect" in the cluster, where the node with the least resources becomes a performance bottleneck. All Data Nodes (DN), except for log replicas, should follow the same principle.
- The three replicas of the same set of Data Nodes should be deployed on three separate servers. The three replicas of different Data Nodes can be mixed on the same server to maximize server performance while ensuring high availability of the cluster.
- The Global Meta Service (GMS) stores the cluster's metadata, and its three replicas should be deployed on three separate servers to ensure high availability.
Here is a cluster topology planned according to the above principles:
| Server | Server Specifications | Node | Container Limits | Number of Containers |
| --- | --- | --- | --- | --- |
| 192.168.1.102 | 16c64G | CN1 | 16c32G | 1 |
| 192.168.1.103 | 16c64G | CN2 | 16c32G | 1 |
| 192.168.1.104 | 16c64G | CN3 | 16c32G | 1 |
| 192.168.1.105 | 16c128G | DN1-1 (leader) | 16c50G | 4 |
| | | DN2-3 (log) | - | |
| | | DN3-2 (follower) | 16c50G | |
| | | GMS-1 | 8c16G | |
| 192.168.1.106 | 16c128G | DN1-2 (follower) | 16c50G | 4 |
| | | DN2-1 (leader) | 16c50G | |
| | | DN3-3 (log) | - | |
| | | GMS-2 | 8c16G | |
| 192.168.1.107 | 16c128G | DN1-3 (log) | - | 4 |
| | | DN2-2 (follower) | 16c50G | |
| | | DN3-1 (leader) | 16c50G | |
| | | GMS-3 | 2c4G | |
Note: The memory limit for a DN's log replica defaults to 4GB, because the log replica only persists logs and therefore requires minimal memory and CPU resources.
Labeling Cluster Nodes
In accordance with the aforementioned topology plan, Kubernetes cluster nodes need to be labeled into two groups: one for deploying CNs and the other for deploying DNs.
First, display the node names:
kubectl get nodes -o wide
Example of output:
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
pxc-server-02 Ready <none> 4d23h v1.21.0 192.168.1.2 <none> CentOS Linux 7 (Core) 3.10.0-1160.81.1.el7.x86_64 docker://19.3.15
pxc-server-03 Ready <none> 4d23h v1.21.0 192.168.1.3 <none> CentOS Linux 7 (Core) 3.10.0-1160.81.1.el7.x86_64 docker://19.3.15
pxc-server-04 Ready <none> 4d23h v1.21.0 192.168.1.4 <none> CentOS Linux 7 (Core) 3.10.0-1160.81.1.el7.x86_64 docker://19.3.15
pxc-server-05 Ready <none> 4d23h v1.21.0 192.168.1.5 <none> CentOS Linux 7 (Core) 3.10.0-1160.81.1.el7.x86_64 docker://19.3.15
pxc-server-06 Ready control-plane,master 4d23h v1.21.0 192.168.1.6 <none> CentOS Linux 7 (Core) 3.10.0-1160.81.1.el7.x86_64 docker://19.3.15
pxc-server-07 Ready <none> 4d23h v1.21.0 192.168.1.7 <none> CentOS Linux 7 (Core) 3.10.0-1160.81.1.el7.x86_64 docker://19.3.15
Label the cluster nodes with the specified names:
# Label the nodes with the specified names to deploy CN
kubectl label node pxc-server-02 pxc-server-03 pxc-server-04 polardbx/node=cn
# Label the nodes with the specified names to deploy DN
kubectl label node pxc-server-05 pxc-server-06 pxc-server-07 polardbx/node=dn
The Ansible tool can be used to perform this task more quickly:
# Batch label CN nodes
ansible -i ${ini_file} cn -m shell -a " kubectl label node \`hostname\` polardbx/node=cn "
# Batch label DN nodes
ansible -i ${ini_file} dn -m shell -a " kubectl label node \`hostname\` polardbx/node=dn "
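To verify the result, list the nodes together with the polardbx/node label column (illustrative):
kubectl get nodes -L polardbx/node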
If the cluster has a small number of nodes, the following command can be executed to allow scheduling on the Kubernetes master node:
kubectl taint node -l node-role.kubernetes.io/master node-role.kubernetes.io/master-
Preparing the Topology Configuration
First, you need to obtain the latest image tags for each PolarDB-X component with the following command:
curl -s "https://polardbx-opensource.oss-cn-hangzhou.aliyuncs.com/scripts/get-version.sh" | sh
The output will be as follows (taking PolarDB-X V2.4.0 as an example):
CN polardbx/polardbx-sql:v2.4.0_5.4.19
DN polardbx/polardbx-engine:v2.4.0_8.4.19
CDC polardbx/polardbx-cdc:v2.4.0_5.4.19
According to the topology planning, create the YAML configuration file as follows and adjust the image tags according to the output above:
vi polarx_lite.yaml
Contents of the configuration file:
apiVersion: polardbx.aliyun.com/v1
kind: PolarDBXCluster
metadata:
  name: pxc-product
spec:
  # Initial account password for PolarDB-X
  privileges:
    - username: admin
      password: "123456"
      type: SUPER
  # Configuration template, using production settings
  parameterTemplate:
    name: product-80
  # PolarDB-X cluster configuration
  config:
    # CN related configuration
    cn:
      # Static configuration
      static:
        # Use the new RPC protocol
        RPCProtocolVersion: 2
  # PolarDB-X cluster topology
  topology:
    # Cluster deployment rules
    rules:
      # Predefined node selectors
      selectors:
        - name: node-cn
          nodeSelector:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: polardbx/node
                    operator: In
                    values:
                      - cn
        - name: node-dn
          nodeSelector:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: polardbx/node
                    operator: In
                    values:
                      - dn
      components:
        # DN deployment rules
        dn:
          nodeSets:
            - name: cands
              role: Candidate
              replicas: 2
              selector:
                reference: node-dn
            - name: log
              role: Voter
              replicas: 1
              selector:
                reference: node-dn
        # CN deployment rules
        cn:
          - name: cn
            selector:
              reference: node-cn
    nodes:
      # GMS specification configuration
      gms:
        template:
          # DN image
          image: registry:5000/polardbx-engine:v2.4.0_8.4.19
          # Use the host network
          hostNetwork: true
          # GMS resource specifications
          resources:
            requests:
              cpu: 4
              memory: 16Gi
            limits:
              cpu: 16
              memory: 16Gi
      # DN specification configuration
      dn:
        # DN quantity configuration
        replicas: 3
        template:
          image: registry:5000/polardbx-engine:v2.4.0_8.4.19
          # Use the host network
          hostNetwork: true
          # DN resource specifications
          resources:
            requests:
              cpu: 8
              memory: 50Gi
            limits:
              cpu: 16
              memory: 50Gi
      # CN specification configuration
      cn:
        # CN quantity configuration
        replicas: 3
        template:
          image: registry:5000/polardbx-sql:v2.4.0_5.4.19
          # Use the host network
          hostNetwork: true
          resources:
            requests:
              cpu: 8
              memory: 64Gi
            limits:
              cpu: 16
              memory: 64Gi
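Optionally, before creating the cluster you can ask the API server to validate the manifest against the PolarDBXCluster CRD without actually creating anything (illustrative; requires kubectl access to the cluster):
kubectl apply --dry-run=server -f polarx_lite.yaml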
Checking the Parameter Template
By default, the PolarDB-X Operator creates a parameter template named product-80 in the polardbx-operator-system namespace. If your PolarDB-X cluster is created in the default namespace, as in the YAML above, you need to execute the following command to copy the product-80 parameter template into the default namespace.
kubectl get pxpt product-80 -n polardbx-operator-system -o json | jq '.metadata.namespace = "default"' | kubectl apply -f -
Note: The above command requires the jq tool to be installed.
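You can confirm that the copy succeeded by listing the template in the default namespace (illustrative):
kubectl get pxpt product-80 -n default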
Deploying the Database
Run the following command to deploy the PolarDB-X database in the Kubernetes cluster:
kubectl create -f polarx_lite.yaml
Check the container Pod status until all containers display "Running":
kubectl get pods
Confirm the status of the PolarDB-X database with the following command:
kubectl get pxc pxc-product
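Cluster creation can take several minutes. To follow the progress, you can watch the resource until its reported status indicates the cluster is running (illustrative; the exact columns depend on the operator version):
kubectl get pxc pxc-product -w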
Accessing the Database
After the PolarDB-X database deployment is complete, use the following command to query the Cluster-IP address:
kubectl get svc pxc-product
Example of output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
pxc-product ClusterIP 10.110.125.95 <none> 3306/TCP,8081/TCP 122m
Within the Kubernetes cluster, access the database using the CLUSTER-IP and the username/password specified in the topology YAML configuration:
mysql -h 10.110.125.95 -P3306 -u admin -p123456 -Ac
Once connected, you can run the following SQL to display information about the storage nodes:
MySQL [(none)]> show storage;
Example of output:
+--------------------+---------------------+------------+-----------+----------+-------------+--------+-----------+-------+--------+
| STORAGE_INST_ID | LEADER_NODE | IS_HEALTHY | INST_KIND | DB_COUNT | GROUP_COUNT | STATUS | DELETABLE | DELAY | ACTIVE |
+--------------------+---------------------+------------+-----------+----------+-------------+--------+-----------+-------+--------+
| pxc-product-dn-0 | 192.168.1.105:15420 | true | MASTER | 5 | 21 | 0 | false | null | null |
| pxc-product-dn-1 | 192.168.1.106:17907 | true | MASTER | 5 | 19 | 0 | true | null | null |
| pxc-product-dn-2 | 192.168.1.107:16308 | true | MASTER | 5 | 19 | 0 | true | null | null |
| pxc-product-gms | 192.168.1.105:17296 | true | META_DB | 2 | 2 | 0 | false | null | null |
+--------------------+---------------------+------------+-----------+----------+-------------+--------+-----------+-------+--------+
4 rows in set (0.01 sec)
Post-Deployment
Switching Access Mode
Within the Kubernetes cluster, the PolarDB-X database typically provides services through a Cluster-IP. Servers outside the Kubernetes cluster, however, cannot reach the Cluster-IP, in which case you need to adjust the PolarDB-X configuration to provide services in NodePort mode.
Run the following command:
kubectl edit svc pxc-product
This opens the Service definition in YAML edit mode; change spec.type from ClusterIP to NodePort, then save and exit:
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2023-03-20T08:25:53Z"
  labels:
    polardbx/cn-type: rw
    polardbx/name: pxc-product
    polardbx/rand: brgs
    polardbx/role: cn
  name: pxc-product
  namespace: default
  ownerReferences:
  - apiVersion: polardbx.aliyun.com/v1
    blockOwnerDeletion: true
    controller: true
    kind: PolarDBXCluster
    name: pxc-product
    uid: fe377807-928a-45a2-990d-756181d0e655
  resourceVersion: "2928246"
  uid: fcd423d2-27c7-4319-8840-eaf0ca1308a0
spec:
  clusterIP: 10.110.125.95
  clusterIPs:
  - 10.110.125.95
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: mysql
    port: 3306
    protocol: TCP
    targetPort: mysql
  - name: metrics
    port: 8081
    protocol: TCP
    targetPort: metrics
  selector:
    polardbx/cn-type: rw
    polardbx/name: pxc-product
    polardbx/rand: brgs
    polardbx/role: cn
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}
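If you prefer a non-interactive change instead of kubectl edit, the same switch can be applied with a patch (illustrative, equivalent in effect):
kubectl patch svc pxc-product -p '{"spec":{"type":"NodePort"}}'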
Check the PolarDB-X database service address:
kubectl get svc pxc-product
Example of output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
pxc-product NodePort 10.110.125.95 <none> 3306:31504/TCP,8081:30975/TCP 86m
Now, from outside the Kubernetes cluster, the PolarDB-X database can be accessed through the IP address of any Kubernetes node and port 31504:
mysql -h 192.168.1.105 -P31504 -u admin -p123456 -Ac