Use a program to import data

This topic describes how to write code to import data into PolarDB-X.

In this topic, a sample table is used in the example to describe the operations that are performed to import data. The following statement is executed to create the sample table:

CREATE TABLE `test1` (
    `id` int(11) NOT NULL,
    `k` int(11) NOT NULL DEFAULT '0',
    `c` char(120) NOT NULL DEFAULT '',
    `pad` char(60) NOT NULL DEFAULT '',
    PRIMARY KEY (`id`),
    KEY `k_1` (`k`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 dbpartition by hash(`id`);

Export source data from a database

You can generate the source data that you want to import to PolarDB-X. You can also export the source data from a database. You can run the mysql -e command to export the source data from a database that contains the sample table. PolarDB-X and MySQL support the command. The following sample code provides an example on how to run this command:

mysql -h ip  -P port -u usr -pPassword db_name -N -e "SELECT id,k,c,pad FROM test1;" >/home/data_1000w.txt
## The raw data is tab-separated, with the data format being: 188092293    27267211    59775766593-64673028018-...-09474402685    01705051424-...-54211554755
mysql -h ip  -P port -u usr -pPassword db_name -N -e "SELECT id,k,c,pad FROM test1;" | sed 's/\t/,/g' >/home/data_1000w.csv
## The CSV (Comma-Separated Values) file format uses commas to separate data, with the format being: 188092293,27267211,59775766593-64673028018-...-09474402685,01705051424-...-54211554755

We recommend that you export the data in the CSV format to help the program that is described in the following section to read the data.

Create a table in your destination PolarDB-X database

The source data does not contain the statement that is used to create a table. Therefore, you must create a table in your destination PolarDB-X database. The following sample code provides an example on how to create a table:

CREATE TABLE `test1` (
    `id` int(11) NOT NULL,
    `k` int(11) NOT NULL DEFAULT '0',
    `c` char(120) NOT NULL DEFAULT '',
    `pad` char(60) NOT NULL DEFAULT '',
    PRIMARY KEY (`id`),
    KEY `k_1` (`k`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 dbpartition by hash(`id`);

Use a program to import the source data into PolarDB-X

You can write a program and use the program to connect to your destination PolarDB-X database. After the destination database is connected, use the program to read the source data and perform batch inserts to import the source data into the destination PolarDB-X database.

The following sample code provides an example on how to write a Java program:

// mysql-connector-java.jar is required. For more information, visit https://mvnrepository.com/artifact/mysql/mysql-connector-java.
// Download link: https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.20/mysql-connector-java-8.0.20.jar.
// Note: Different versions of mysql-connector-java.jar contain Class.forName("com.mysql.cj.jdbc.Driver") class in different paths.
// Compile javac LoadData.java.
// Run java -cp .:mysql-connector-java-8.0.20.jar LoadData

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class LoadData {
  public static void main(String[] args) throws IOException, ClassNotFoundException, SQLException {
    File dataFile = new File("/home/data_1000w.csv");
    String sql = "insert into test1(id, k, c, pad) values(?, ?, ?, ?)";
    int batchSize = 1000;
    try (
            Connection connection = getConnection("ip", 3306, "db", "usr", "password");
            BufferedReader br = new BufferedReader(new FileReader(dataFile))) {
      String line;
      PreparedStatement st = connection.prepareStatement(sql);
      long startTime = System.currentTimeMillis();
      int batchCount = 0;

      while ((line = br.readLine()) != null) {
        String[] data = line.split(",");
        st.setInt(1, Integer.valueOf(data[0]));
        st.setInt(2, Integer.valueOf(data[1]));
        st.setObject(3, data[2]);
        st.setObject(4, data[3]);

        st.addBatch();
        if (++batchCount % batchSize == 0) {
          st.executeBatch();
          System.out.println(String.format("insert %d records", batchCount));
        }
      }
      if (batchCount % batchSize != 0) {
        st.executeBatch();
      }
      long cost = System.currentTimeMillis() - startTime;
      System.out.println(String.format("Take %d second,insert %d records, tps %d", cost/1000, batchCount, batchCount/(cost/1000)));
    }
  }
  /**
   * Obtain the information about the connection to the destination PolarDB-X database.
   *
   * @param host     The IP address of the destination PolarDB-X database.
   * @param port    The port number of the destination PolarDB-X database.
   * @param database The name of the destination PolarDB-X database.
   * @param username The username of the account that is used to connect to the destination PolarDB-X database.
   * @param password: The password of the account that is used to connect to the destination PolarDB-X database.
   * @return
   * @throws ClassNotFoundException
   * @throws SQLException
   */
  private static Connection getConnection(String host, int port, String database, String username, String password)
          throws ClassNotFoundException, SQLException {
    Class.forName("com.mysql.cj.jdbc.Driver");
    String url = String.format(
            "jdbc:mysql://%s:%d/%s?autoReconnect=true&socketTimeout=600000&rewriteBatchedStatements=true", host, port,
            database);
    Connection con = DriverManager.getConnection(url, username, password);
    return con;
  }
}

You can write your program based on actual application scenarios. For example, you can set an appropriate batch size and specify multiple threads to enhance performance.

results matching ""

    No results matching ""