Deploying with Kubernetes

As a cloud-native distributed database, PolarDB-X can also be deployed on Kubernetes clusters, where PolarDB-X Operator provides full lifecycle management.

This article assumes that the following prerequisites are met before you deploy PolarDB-X in a Kubernetes environment:

  1. A set of servers has been prepared, with server parameters configured and software installed as described in System and Environment Configuration;
  2. The deployment machine (ops) has internet access, or the required software packages and images have been prepared in advance as described in Software Package Download.


Installing Helm

The PolarDB-X Operator is deployed through Helm, Kubernetes' standard package management and deployment tool, with version 3 or above recommended.

If using an online installation method (the ops machine can access the internet), Helm needs to be downloaded and installed in advance. If using an offline installation method, as the offline package comes with Helm's binary file, this step can be skipped.

The Helm package is in binary form; choose the package according to the server's CPU architecture. For example, for x86 architecture:


Extract the package, then copy the helm binary to the /usr/local/bin directory to complete the installation:

tar -zxvf helm-v3.9.0-linux-amd64.tar.gz -C $HOME/
sudo cp $HOME/linux-amd64/helm /usr/local/bin/helm

Execute the following command to output the installed version of Helm:

helm version
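
The version 3 recommendation above can be checked in a script. The following is a minimal sketch, not part of the official procedure: helm_major is a hypothetical helper name, and the check assumes that helm version --short prints a string beginning with "v" followed by the major version, as Helm 3 does.

```shell
#!/bin/sh
# Hypothetical helper: extract the major version from a string such as "v3.9.0".
helm_major() {
    ver=${1#v}          # drop a leading "v" if present
    echo "${ver%%.*}"   # keep everything before the first dot
}

# Only query Helm if it is actually on the PATH.
if command -v helm >/dev/null 2>&1; then
    installed=$(helm version --short)
    if [ "$(helm_major "$installed")" -ge 3 ] 2>/dev/null; then
        echo "Helm $installed meets the version 3 recommendation"
    else
        echo "Helm $installed does not look like version 3 or above" >&2
    fi
fi
```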

Installing Docker

Kubernetes uses Docker as the container runtime environment, so Docker must be installed first. Additionally, the use of a private image registry is recommended. Follow the Installing Docker and Image Repository document for deployment.

Installing Kubernetes

Before installing the PolarDB-X Operator, you need to install a Kubernetes cluster. You can follow the Installing Kubernetes guide for deployment.

Installing PolarDB-X Operator

Directory Mapping

PolarDB-X by default stores data, logs, and backup files in the following directories. It is advisable to create symbolic links to the data disk in advance:

ansible -i ${ini_file} all -m shell -a " mkdir -p /polarx/data /polarx/data-log /polarx/filestream "

ansible -i ${ini_file} all -m shell -a " ln -sf /polarx/data /data "
ansible -i ${ini_file} all -m shell -a " ln -sf /polarx/data-log /data-log "
ansible -i ${ini_file} all -m shell -a " ln -sf /polarx/filestream /filestream "
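
Before continuing, it may be worth confirming on each server that the three links exist and point where expected. The sketch below is one way to do that locally; link_points_to is a hypothetical helper, and the script assumes the readlink utility is available. Note that if /data already exists as a real directory, ln -sf creates the link inside it instead of replacing it, which this check would report as missing.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 is a symbolic link that points at $2.
link_points_to() {
    [ -L "$1" ] && [ "$(readlink "$1")" = "$2" ]
}

# Link and target pairs taken from the commands above.
for pair in /data:/polarx/data /data-log:/polarx/data-log /filestream:/polarx/filestream; do
    link=${pair%%:*}
    target=${pair#*:}
    if link_points_to "$link" "$target"; then
        echo "ok:      $link -> $target"
    else
        echo "missing: $link -> $target"
    fi
done
```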

Online Installation

If the deployment machine (ops) has internet access, it is recommended to install using the online method.

  1. Add the PolarDB-X related URL to the Helm repository:
helm repo add polardbx
  2. Check the released versions of PolarDB-X Operator:
helm search repo polardbx/polardbx-operator -l

It's recommended to install the latest version, for example, v1.6.0.

  3. Create the polardbx-operator-system namespace:
kubectl create namespace polardbx-operator-system
  4. Install version v1.6.0 using the following Helm command:
helm install --namespace polardbx-operator-system \
    --set --set imageRepo=registry:5000 \
    --set \
    --version 1.6.0 \
    polardbx-operator polardbx/polardbx-operator
  5. Check if the containers have started successfully:
kubectl get pods -n polardbx-operator-system

Once all container Pods are in the "Running" state, the installation is complete.
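
Instead of re-running the command by hand, the "all Pods Running" condition can be polled. The sketch below is an illustration under assumptions: all_running and wait_for_operator are hypothetical helper names, the STATUS value is assumed to be the third column of kubectl get pods --no-headers output, and the polling interval and attempt limit are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin and
# succeed only if at least one pod is listed and every STATUS (column 3) is Running.
all_running() {
    awk '$3 != "Running" { bad = 1 } END { exit (NR == 0 || bad) }'
}

# Poll the operator namespace every 5 seconds, giving up after 60 attempts
# (both limits are arbitrary choices for this sketch).
wait_for_operator() {
    tries=0
    until kubectl get pods -n polardbx-operator-system --no-headers 2>/dev/null | all_running; do
        tries=$((tries + 1))
        if [ "$tries" -ge 60 ]; then
            echo "timed out waiting for operator pods" >&2
            return 1
        fi
        sleep 5
    done
}
```

Calling wait_for_operator after helm install blocks until the Pods are up or the limit is reached. kubectl also ships a built-in kubectl wait subcommand that may serve the same purpose.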

Offline Installation

If the deployment environment does not have internet access, refer to Installing Kubernetes, download the offline installation package for the matching CPU architecture, and copy it to the deployment machine (ops) in the deployment environment. By default, the offline package contains the latest version of PolarDB-X Operator and the Docker images it depends on.

The directory structure of the K8s offline installation package is as follows:

|-- helm                                       # Helm package directory
|   |-- bin
|   |   `-- helm                               # Binary file of helm, used for installation
|   |-- logger-values.yaml                     # Values file for logcollector installation
|   |-- monitor-values.yaml                    # Values file for monitor installation
|   |-- operator-values.yaml                   # Values file for operator installation
|   |-- polardbx-logcollector-1.3.0.tgz        # Helm package for logcollector
|   |-- polardbx-monitor-1.3.0.tgz             # Helm package for monitor
|   `-- polardbx-operator-1.3.0.tgz            # Helm package for operator
|-- images                                     # Docker image directory
|   |-- image.list                             # List of downloaded Docker images
|   |-- image.manifest                         # Offline environment image manifest, script parameters
|   |--                                        # Script for importing images in an offline environment
|   |-- polardbx-init-latest-arm64.tar.gz
|   |-- xstore-tools-latest-arm64.tar.gz
|   `-- ....
`--                                            # Installation script
  1. On the deployment machine (ops) in the deployment environment, run the following commands to enter the offline installation package directory, import the Docker images, and install PolarDB-X Operator and its related components in a single step.
cd polardbx-install

The above script mainly performs the following tasks:

  • Import the Docker images from the images directory and push them to the specified private repository. This document uses registry:5000 as an example.
  • Install the helm packages from the helm directory and modify the image repository in the PolarDB-X Operator to the private image repository address you specified via the corresponding values.yaml file.

  2. Check whether the containers have started successfully:

kubectl get pods -n polardbx-operator-system

When all container Pods are in the "Running" state, it means the installation is complete.

Deploying PolarDB-X

Planning the Cluster Topology

A PolarDB-X database cluster consists of Compute Nodes (CN), Data Nodes (DN), Global Meta Service (GMS), and as an optional feature, Change Data Capture (CDC) Nodes. The minimum deployment scale for a production environment involves three servers, while a typical database cluster contains at least two compute nodes, more than two sets of data nodes, one set of GMS nodes, and optionally, one set of CDC nodes. Data Nodes (DN) and the Global Meta Service (GMS) achieve high availability through the X-Paxos protocol. Thus, each set of data nodes or metadata services should consist of three independent replica container instances, which need to be deployed on different servers.

The main principles for planning the PolarDB-X cluster topology based on different server models/quantities are:

  • Due to different requirements for computing and I/O capabilities, it is recommended to deploy Compute Nodes (CN) and Data Nodes (DN) on two separate groups of servers.
  • Select servers with better I/O capabilities and larger disk space to deploy Data Nodes (DN); servers with weaker I/O but superior CPU capabilities should be allocated for Compute Nodes (CN).
  • If a single server has fewer than 16 CPUs, it is advisable to deploy one Compute Node (CN); otherwise, deploy multiple Compute Nodes.
  • All Compute Nodes (CN) should be allocated the same server resources to prevent the "weakest link effect" in the cluster, where the node with the least resources becomes a performance bottleneck. All Data Nodes (DN), except for log replicas, should follow the same principle.
  • The three replicas of the same set of Data Nodes should be deployed on three separate servers. The three replicas of different Data Nodes can be mixed on the same server to maximize server performance while ensuring high availability of the cluster.
  • The Global Meta Service (GMS) stores the cluster's metadata, and its three replicas should be deployed on three separate servers to ensure high availability.
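
The replica placement rules above (three replicas of one DN set or of GMS on three separate servers) can be checked mechanically against a draft plan. The following is a minimal sketch: check_placement is a hypothetical helper, and the input format of one "node-set server" pair per replica is an assumption made for this example.

```shell
#!/bin/sh
# Hypothetical helper: stdin carries one "<node-set> <server>" pair per replica.
# Prints every node set whose replicas are not spread over exactly three
# distinct servers, and fails if any such set exists.
check_placement() {
    awk '
        { sets[$1] = 1 }
        !seen[$1, $2]++ { distinct[$1]++ }   # count each (set, server) pair once
        END {
            for (s in sets)
                if (distinct[s] != 3) { print s; bad = 1 }
            exit bad + 0
        }'
}

# Example plan with placeholder server names: DN1 is spread correctly,
# DN2 has two replicas on the same server.
printf '%s\n' \
    "DN1 server-a" "DN1 server-b" "DN1 server-c" \
    "DN2 server-a" "DN2 server-a" "DN2 server-c" | check_placement \
    || echo "placement violates the three-server rule"
```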

Here is a cluster topology planned according to the above principles:

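| Server | Server Specifications | Node            | Container Limits | Number of Containers |
|        | 16c64G                | CN1             | 16c32G           | 1                    |
|        | 16c64G                | CN2             | 16c32G           | 1                    |
|        | 16c64G                | CN3             | 16c32G           | 1                    |
|        | 16c128G               | DN1-1(leader)   | 16c50G           | 4                    |
|        |                       | DN2-3(log)      | -                |                      |
|        |                       | DN3-2(follower) | 16c50G           |                      |
|        |                       | GMS-1           | 8c16G            |                      |
|        | 16c128G               | DN1-2(follower) | 16c50G           | 4                    |
|        |                       | DN2-1(leader)   | 16c50G           |                      |
|        |                       | DN3-3(log)      | -                |                      |
|        |                       | GMS-2           | 8c16G            |                      |
|        | 16c128G               | DN1-3(log)      | -                | 4                    |
|        |                       | DN2-2(follower) | 16c50G           |                      |
|        |                       | DN3-1(leader)   | 16c50G           |                      |
|        |                       | GMS-3           | 2c4G             |                      |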

Note: The default memory limit for the DN's log node is set to 4GB, as the log only records events and requires minimal memory and CPU resources.
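
As a quick sanity check of the plan, the memory limits of the containers on one DN server can be added up and compared with the server's memory. The sketch below does this for the first 16c128G server in the table, assuming the log replica uses the 4GB default mentioned in the note; sum_gib is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: add up container memory limits given in whole gigabytes.
sum_gib() {
    total=0
    for g in "$@"; do
        total=$((total + g))
    done
    echo "$total"
}

# DN1-1 (leader), DN3-2 (follower), DN2-3 (log, default 4G), GMS-1.
used=$(sum_gib 50 50 4 16)
echo "memory limits: ${used}G of 128G"   # prints: memory limits: 120G of 128G
```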

Labeling Cluster Nodes

In accordance with the aforementioned topology plan, Kubernetes cluster nodes need to be labeled into two groups: one for deploying CNs and the other for deploying DNs.

First, display the node names:

kubectl get nodes -o wide

Example of output:

NAME              STATUS   ROLES                  AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
pxc-server-02     Ready    <none>                 4d23h   v1.21.0    <none>        CentOS Linux 7 (Core)   3.10.0-1160.81.1.el7.x86_64   docker://19.3.15
pxc-server-03     Ready    <none>                 4d23h   v1.21.0    <none>        CentOS Linux 7 (Core)   3.10.0-1160.81.1.el7.x86_64   docker://19.3.15
pxc-server-04     Ready    <none>                 4d23h   v1.21.0    <none>        CentOS Linux 7 (Core)   3.10.0-1160.81.1.el7.x86_64   docker://19.3.15
pxc-server-05     Ready    <none>                 4d23h   v1.21.0    <none>        CentOS Linux 7 (Core)   3.10.0-1160.81.1.el7.x86_64   docker://19.3.15
pxc-server-06     Ready    control-plane,master   4d23h   v1.21.0    <none>        CentOS Linux 7 (Core)   3.10.0-1160.81.1.el7.x86_64   docker://19.3.15
pxc-server-07     Ready    <none>                 4d23h   v1.21.0    <none>        CentOS Linux 7 (Core)   3.10.0-1160.81.1.el7.x86_64   docker://19.3.15

Label the cluster nodes with the specified names:

# Label the nodes with the specified names to deploy CN
kubectl label node pxc-server-02 pxc-server-03 pxc-server-04 polardbx/node=cn

# Label the nodes with the specified names to deploy DN
kubectl label node pxc-server-05 pxc-server-06 pxc-server-07 polardbx/node=dn

The Ansible tool can be used to perform this task more quickly:

# Batch label CN nodes
ansible -i ${ini_file} cn -m shell -a " kubectl label node \`hostname\` polardbx/node=cn "

# Batch label DN nodes
ansible -i ${ini_file} dn -m shell -a " kubectl label node \`hostname\` polardbx/node=dn "
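
After labeling, the result can be verified by listing the nodes that carry each label. The following sketch assumes three nodes per group, as in the topology plan above; has_node_count and verify_labels are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: stdin is the output of
#   kubectl get nodes -l polardbx/node=<role> --no-headers
# Succeeds only if the number of listed nodes equals the expected count in $1.
has_node_count() {
    actual=$(awk 'NF { n++ } END { print n + 0 }')
    [ "$actual" -eq "$1" ]
}

# Check both groups; the expected count of 3 comes from the topology plan.
verify_labels() {
    for role in cn dn; do
        if kubectl get nodes -l "polardbx/node=$role" --no-headers 2>/dev/null | has_node_count 3; then
            echo "$role: 3 nodes labeled"
        else
            echo "$role: expected 3 labeled nodes" >&2
        fi
    done
}
```

Run verify_labels on a machine where kubectl is configured for the cluster.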

If the cluster has a small number of nodes, the following command can be executed to allow scheduling on the Kubernetes master node:

kubectl taint node -l

Preparing the Topology Configuration

First, you need to obtain the latest image tags for each PolarDB-X component with the following command:

curl -s "" | sh

The output will be as follows (taking PolarDB-X V2.4.0 as an example):

CN polardbx/polardbx-sql:v2.4.0_5.4.19
DN polardbx/polardbx-engine:v2.4.0_8.4.19
CDC polardbx/polardbx-cdc:v2.4.0_5.4.19
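
The configuration file in this section references these images through the private registry used in this document (registry:5000) rather than the polardbx/ prefix shown in the output. A minimal sketch of that rewrite follows; to_private_image is a hypothetical helper, and it assumes the images were pushed to the private registry under the same name and tag.

```shell
#!/bin/sh
# Hypothetical helper: swap the "polardbx/" prefix for a private registry address.
# $1 = registry address, $2 = image reference as printed by the tag listing.
to_private_image() {
    registry=$1
    image=$2
    echo "$registry/${image#polardbx/}"
}

to_private_image registry:5000 polardbx/polardbx-sql:v2.4.0_5.4.19
# prints: registry:5000/polardbx-sql:v2.4.0_5.4.19
```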

Based on the topology plan, create the YAML configuration file as follows and modify the image tags according to the output:

vi polarx_lite.yaml

Contents of the configuration file:

apiVersion: polardbx.aliyun.com/v1
kind: PolarDBXCluster
metadata:
  name: pxc-product
spec:
  # Initial account password for PolarDB-X
  privileges:
  - username: admin
    password: "123456"
    type: SUPER

  # Configuration template, using production settings
  parameterTemplate:
    name: product-80

  # PolarDB-X cluster configuration
  config:
    # CN related configuration
    cn:
      # Static configuration
      static:
        # Use the new RPC protocol
        RPCProtocolVersion: 2
  # PolarDB-X cluster topology
  topology:
    # Cluster deployment rules
    rules:
      # Predefined node selector
      selectors:
      - name: node-cn
        nodeSelector:
          nodeSelectorTerms:
          - matchExpressions:
            - key: polardbx/node
              operator: In
              values:
              - cn
      - name: node-dn
        nodeSelector:
          nodeSelectorTerms:
          - matchExpressions:
            - key: polardbx/node
              operator: In
              values:
              - dn
      components:
        # DN deployment rules
        dn:
          nodeSets:
          - name: cands
            role: Candidate
            replicas: 2
            selector:
              reference: node-dn
          - name: log
            role: Voter
            replicas: 1
            selector:
              reference: node-dn

        # CN deployment rules
        cn:
        - name: cn
          selector:
            reference: node-cn
    nodes:
      # GMS specification configuration
      gms:
        template:
          # DN image
          image: registry:5000/polardbx-engine:v2.4.0_8.4.19
          # Use the host network
          hostNetwork: true
          # GMS resource specifications
          resources:
            requests:
              cpu: 4
              memory: 16Gi
            limits:
              cpu: 16
              memory: 16Gi

      # DN specification configuration
      dn:
        # DN quantity configuration
        replicas: 3
        template:
          image: registry:5000/polardbx-engine:v2.4.0_8.4.19
          # Use the host network
          hostNetwork: true
          # DN resource specifications
          resources:
            requests:
              cpu: 8
              memory: 50Gi
            limits:
              cpu: 16
              memory: 50Gi

      # CN specification configuration
      cn:
        # CN quantity configuration
        replicas: 3
        template:
          image: registry:5000/polardbx-sql:v2.4.0_5.4.19
          # Use the host network
          hostNetwork: true
          # CN resource specifications
          resources:
            requests:
              cpu: 8
              memory: 64Gi
            limits:
              cpu: 16
              memory: 64Gi

Checking the Parameter Template

By default, the PolarDB-X Operator creates a parameter template named product-80 in the polardbx-operator-system namespace. If your PolarDB-X cluster is created in the default namespace, as in the YAML above, run the following command to copy the product-80 parameter template to the default namespace:

kubectl get pxpt product-80 -n polardbx-operator-system -o json | jq '.metadata.namespace = "default"' | kubectl apply -f  -

Note: The above command requires the jq tool to be installed.
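
Since the copy command depends on both jq and kubectl, a short pre-check can fail early with a clear message. The sketch below also shows, on a made-up one-line JSON document, that the jq filter only rewrites metadata.namespace; require_tool is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command is available on the PATH.
require_tool() {
    command -v "$1" >/dev/null 2>&1 || { echo "missing tool: $1" >&2; return 1; }
}

if require_tool jq && require_tool kubectl; then
    echo "jq and kubectl are available"
fi

# Illustration of the filter on a made-up document (runs only if jq is present).
if command -v jq >/dev/null 2>&1; then
    echo '{"metadata":{"name":"product-80","namespace":"polardbx-operator-system"}}' |
        jq -c '.metadata.namespace = "default"'
    # prints: {"metadata":{"name":"product-80","namespace":"default"}}
fi
```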

Deploying the Database

Run the following command to deploy the PolarDB-X database in the Kubernetes cluster:

kubectl create -f polarx_lite.yaml

Check the container Pod status until all containers display "Running":

kubectl get pods

Confirm the status of the PolarDB-X database with the following command:

kubectl get pxc pxc-product

Accessing the Database

After the PolarDB-X database deployment is complete, use the following command to query the Cluster-IP address:

kubectl get svc pxc-product

Example of output:

NAME          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
pxc-product   ClusterIP   <none>        3306/TCP,8081/TCP   122m

Within the Kubernetes cluster, access the database using the CLUSTER-IP and the username/password specified in the topology YAML configuration:

mysql -h -P3306 -u admin -p123456 -Ac

Once connected, you can run the following SQL to display information about the storage nodes:

MySQL [(none)]> show storage;

Example of output:

| pxc-product-dn-0   | | true       | MASTER    | 5        | 21          | 0      | false     | null  | null   |
| pxc-product-dn-1   | | true       | MASTER    | 5        | 19          | 0      | true      | null  | null   |
| pxc-product-dn-2   | | true       | MASTER    | 5        | 19          | 0      | true      | null  | null   |
| pxc-product-gms    | | true       | META_DB   | 2        | 2           | 0      | false     | null  | null   |
4 rows in set (0.01 sec)


Switching Access Mode

Inside the Kubernetes cluster, the PolarDB-X database typically serves clients through a ClusterIP service. Servers outside the Kubernetes cluster cannot reach the Cluster-IP, so to serve them you need to switch the PolarDB-X service to NodePort mode.

Run the following command:

kubectl edit svc pxc-product

In the YAML editor that opens, change type: ClusterIP under spec to type: NodePort, then save and exit:

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2023-03-20T08:25:53Z"
  labels:
    polardbx/cn-type: rw
    polardbx/name: pxc-product
    polardbx/rand: brgs
    polardbx/role: cn
  name: pxc-product
  namespace: default
  ownerReferences:
  - apiVersion: polardbx.aliyun.com/v1
    blockOwnerDeletion: true
    controller: true
    kind: PolarDBXCluster
    name: pxc-product
    uid: fe377807-928a-45a2-990d-756181d0e655
  resourceVersion: "2928246"
  uid: fcd423d2-27c7-4319-8840-eaf0ca1308a0
spec:
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: mysql
    port: 3306
    protocol: TCP
    targetPort: mysql
  - name: metrics
    port: 8081
    protocol: TCP
    targetPort: metrics
  selector:
    polardbx/cn-type: rw
    polardbx/name: pxc-product
    polardbx/rand: brgs
    polardbx/role: cn
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}

Check the PolarDB-X database service address:

kubectl get svc pxc-product

Example of output:

NAME          TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)                         AGE
pxc-product   NodePort  <none>        3306:31504/TCP,8081:30975/TCP   86m

Now, from outside the Kubernetes cluster, we can access the PolarDB-X database using any Kubernetes server's IP address and port 31504:

mysql -h -P31504 -u admin -p123456 -Ac
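
The external port does not have to be read off by eye. As a sketch, the PORT(S) value from the service listing can be parsed to find the NodePort mapped to a given service port; node_port_for is a hypothetical helper, and the value format is assumed to match the example output above.

```shell
#!/bin/sh
# Hypothetical helper: given a PORT(S) value and a service port, print the
# NodePort mapped to it. Entries without a NodePort (e.g. "3306/TCP") are skipped.
node_port_for() {
    echo "$1" | tr ',' '\n' |
        awk -F'[:/]' -v port="$2" 'NF == 3 && $1 == port { print $2 }'
}

node_port_for "3306:31504/TCP,8081:30975/TCP" 3306   # prints 31504
```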
