Use a program to import data
This topic describes how to write code to import data into PolarDB-X.
In this topic, a sample table is used in the example to describe the operations that are performed to import data. The following statement is executed to create the sample table:
CREATE TABLE `test1` (
`id` int(11) NOT NULL,
`k` int(11) NOT NULL DEFAULT '0',
`c` char(120) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `k_1` (`k`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 dbpartition by hash(`id`);
Export source data from a database
You can generate the source data that you want to import to PolarDB-X. You can also export the source data from a database. You can run the mysql -e command to export the source data from a database that contains the sample table. PolarDB-X and MySQL support the command. The following sample code provides an example on how to run this command:
mysql -h ip -P port -u usr -pPassword db_name -N -e "SELECT id,k,c,pad FROM test1;" >/home/data_1000w.txt
## The raw data is tab-separated, with the data format being: 188092293 27267211 59775766593-64673028018-...-09474402685 01705051424-...-54211554755
mysql -h ip -P port -u usr -pPassword db_name -N -e "SELECT id,k,c,pad FROM test1;" | sed 's/\t/,/g' >/home/data_1000w.csv
## The CSV (Comma-Separated Values) file format uses commas to separate data, with the format being: 188092293,27267211,59775766593-64673028018-...-09474402685,01705051424-...-54211554755
We recommend that you export the data in the CSV format to help the program that is described in the following section to read the data.
Create a table in your destination PolarDB-X database
The source data does not contain the statement that is used to create a table. Therefore, you must create a table in your destination PolarDB-X database. The following sample code provides an example on how to create a table:
CREATE TABLE `test1` (
`id` int(11) NOT NULL,
`k` int(11) NOT NULL DEFAULT '0',
`c` char(120) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `k_1` (`k`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 dbpartition by hash(`id`);
Use a program to import the source data into PolarDB-X
You can write a program and use the program to connect to your destination PolarDB-X database. After the destination database is connected, use the program to read the source data and perform batch inserts to import the source data into the destination PolarDB-X database.
The following sample code provides an example on how to write a Java program:
// mysql-connector-java.jar is required. For more information, visit https://mvnrepository.com/artifact/mysql/mysql-connector-java.
// Download link: https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.20/mysql-connector-java-8.0.20.jar.
// Note: Different versions of mysql-connector-java.jar contain Class.forName("com.mysql.cj.jdbc.Driver") class in different paths.
// Compile javac LoadData.java.
// Run java -cp .:mysql-connector-java-8.0.20.jar LoadData
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
public class LoadData {
public static void main(String[] args) throws IOException, ClassNotFoundException, SQLException {
File dataFile = new File("/home/data_1000w.csv");
String sql = "insert into test1(id, k, c, pad) values(?, ?, ?, ?)";
int batchSize = 1000;
try (
Connection connection = getConnection("ip", 3306, "db", "usr", "password");
BufferedReader br = new BufferedReader(new FileReader(dataFile))) {
String line;
PreparedStatement st = connection.prepareStatement(sql);
long startTime = System.currentTimeMillis();
int batchCount = 0;
while ((line = br.readLine()) != null) {
String[] data = line.split(",");
st.setInt(1, Integer.valueOf(data[0]));
st.setInt(2, Integer.valueOf(data[1]));
st.setObject(3, data[2]);
st.setObject(4, data[3]);
st.addBatch();
if (++batchCount % batchSize == 0) {
st.executeBatch();
System.out.println(String.format("insert %d records", batchCount));
}
}
if (batchCount % batchSize != 0) {
st.executeBatch();
}
long cost = System.currentTimeMillis() - startTime;
System.out.println(String.format("Take %d second,insert %d records, tps %d", cost/1000, batchCount, batchCount/(cost/1000)));
}
}
/**
* Obtain the information about the connection to the destination PolarDB-X database.
*
* @param host The IP address of the destination PolarDB-X database.
* @param port The port number of the destination PolarDB-X database.
* @param database The name of the destination PolarDB-X database.
* @param username The username of the account that is used to connect to the destination PolarDB-X database.
* @param password: The password of the account that is used to connect to the destination PolarDB-X database.
* @return
* @throws ClassNotFoundException
* @throws SQLException
*/
private static Connection getConnection(String host, int port, String database, String username, String password)
throws ClassNotFoundException, SQLException {
Class.forName("com.mysql.cj.jdbc.Driver");
String url = String.format(
"jdbc:mysql://%s:%d/%s?autoReconnect=true&socketTimeout=600000&rewriteBatchedStatements=true", host, port,
database);
Connection con = DriverManager.getConnection(url, username, password);
return con;
}
}
You can write your program based on actual application scenarios. For example, you can set an appropriate batch size and specify multiple threads to enhance performance.