CREATE TABLE
This article mainly introduces the syntax, clauses, parameters, and basic methods of creating tables using DDL statements.
Syntax
CREATE [PARTITION] TABLE [IF NOT EXISTS] tbl_name
(create_definition, ...)
[table_options]
[table_partition_definition]
create_definition:
col_name column_definition
| mysql_create_definition
| [UNIQUE] GLOBAL INDEX index_name [index_type] (index_sharding_col_name,...)
[global_secondary_index_option]
[index_option] ...
index_sharding_col_name:
col_name [(length)] [ASC | DESC]
index_option:
KEY_BLOCK_SIZE [=] value
| index_type
| WITH PARSER parser_name
| COMMENT 'string'
index_type:
USING {BTREE | HASH}
# Global Secondary Indexes (GSI) Related
global_secondary_index_option:
[COVERING (col_name,...)]
[partition_options]
# Partitioned Table Type Definition
table_partition_definition:
single
| broadcast
| partition_options
# Partition Strategy Definition
partition_options:
PARTITION BY
HASH({column_name | partition_func(column_name)})
| KEY(column_list)
| RANGE{({column_name | partition_func(column_name)})
| RANGE COLUMNS(column_list)}
| LIST{({column_name | partition_func(column_name)})
| LIST COLUMNS(column_list)} }
partition_list_spec
# Partition Function Definition
partition_func:
YEAR
| TO_DAYS
| TO_SECOND
| UNIX_TIMESTAMP
| MONTH
# Partition List Definition
partition_list_spec:
hash_partition_list
| range_partition_list
| list_partition_list
# Partitioned Table Column Definition Of Hash / Key Type
hash_partition_list:
PARTITIONS partition_count
# Partitioned Table Column Definition Of Range / Range Columns Type
range_partition_list:
range_partition [, range_partition ...]
range_partition:
PARTITION partition_name VALUES LESS THAN {(expr | value_list)} [partition_spec_options]
# Partitioned Table Column Definition Of List / List Columns Type
list_partition_list:
list_partition [, list_partition ...]
list_partition:
PARTITION partition_name VALUES IN (value_list) [partition_spec_options]
partition_spec_options:
[[STORAGE] ENGINE [=] engine_name]
[COMMENT [=] 'string']
[{CHARSET | CHARACTER SET} [=] charset]
[COLLATE [=] collation]
[TABLEGROUP [=] table_group_id]
[LOCALITY [=] locality_option]
locality_option:
'dn=storage_inst_id_list'
storage_inst_id_list:
storage_inst_id[,storage_inst_id_list]
Note The DDL syntax of PolarDB-X is based on MySQL syntax. The above mainly lists the differences. For detailed syntax, please refer toMySQL Document 。
Default Automatic Partitioning
In PolarDB-X, when creating a table without specifying a partition key, the system will default to using the primary key (or an implicit primary key if none is specified) and apply the KEY partitioning strategy. The default number of partitions created is 64.
Users can also explicitly instruct PolarDB-X to choose the primary key for automatic partitioning by using the CREATE PARTITION TABLE syntax keyword during table creation. An example is as follows:
CREATE PARTITION TABLE auto_part_tbl( id bigint not null auto_increment, bid int, name varchar(30), primary key(id) );
Single Table And Broadcast Table
In PolarDB-X, when creating a table, you can specify the keyword SINGLE to create a single table that is not partitioned. Here is an example of how to create such a table:
CREATE TABLE single_tbl(
id bigint not null auto_increment,
bid int,
name varchar(30),
primary key(id)
) SINGLE;
In PolarDB-X, when creating a table, you can specify the keyword BROADCAST to create a broadcast table, which means that the table will have an identical copy of its data on all Data Nodes (DN) in the cluster. Here is an example of how to create a broadcast table:
CREATE TABLE broadcast_tbl(
id bigint not null auto_increment,
bid int,
name varchar(30),
primary key(id)
) BROADCAST;
Partitioned Table
In PolarDB-X, when creating a table, you can create a partitioned table that suits business requirements by specifying the semantics of the partition clause. Overall, PolarDB-X supports three types of partitioning strategies:
Hash Partitioning Strategy :Based on the values of user-specified partition columns or partition function expressions, the hash partitioning strategy uses a built-in consistent hashing algorithm to calculate hash values and perform partition routing. Depending on whether it supports the use of partition function expressions or multiple partition columns as the partition key, the hash partitioning strategy can be further subdivided into two types: Hash Partitioning and Key Partitioning.
Range Partitioning Strategy :Based on the values of user-specified partition columns or partition function expressions, this strategy determines which predefined partition ranges the data falls into and routes the data accordingly. Depending on whether it supports the use of partition function expressions or multiple partition columns as the partition key, the range partitioning strategy can be further subdivided into Range Partitioning and Range Columns Partitioning.
List Partitioning Strategy :Similar to the range partitioning strategy, the list partitioning strategy determines which predefined partition value sets the data falls into based on the values of user-specified partition columns or partition function expressions, and routes the data accordingly. Depending on the support for partition function expressions or the use of multiple partition columns as the partition key, the list partitioning strategy can be further subdivided into List Partitioning and List Columns Partitioning types.
Hash Partitioning Strategy
In PolarDB-X, the hash partitioning strategy is further subdivided into two types based on whether it supports the use of partition function expressions or the use of multiple partition columns as the partition key: Hash Partitioning and Key Partitioning.
Hash Partitioning
Hash partitioning in table creation only supports using a single integer type partition column as the partition key. However, for time-type partition columns, it supports wrapping these partition columns with a partition function expression (e.g., YEAR, TO_DAYS, TO_SECONDS, MONTH, etc.) to convert them to an integer type. It's also important to note that hash partitioning does not support the direct use of string types as partition columns.
If you want to partition a table by bid column and you intend to predefine 8 hash partitions, you can create the table using the following SQL command:
CREATE TABLE hash_tbl( id bigint not null auto_increment, bid int, name varchar(30), birthday datetime not null, primary key(id) ) partition by hash(bid) partitions 8;
If you want to partition by the birthday column and have a predefined number of 8 hash partitions, you can execute the following command to create the table:
CREATE TABLE hash_tbl_todays( id bigint not null auto_increment, bid int, name varchar(30), birthday datetime not null, primary key(id) ) PARTITION BY HASH(TO_DAYS(birthday)) PARTITIONS 8;
Currently, PolarDB-X's partitioning functions only support the following list:
YEAR
TO_DAYS
TO_SECOND
MONTH
UNIX_TIMESTAMP
Key Partitioning
Unlike Hash partitioning, Key partitioning allows for the use of multiple partition columns as the partition key when creating a table, but it does not allow for any partition function expressions to be applied to the outside of the partition columns. Key partitioning supports a richer range of column types than Hash partitioning, with supported types as follows:
Integer types:BIGINT/BIGINT UNSINGEDINT/INT UNSINGED/MEDIUMINT/MEDIUMINT UNSINGED/SMALLINT/SMALLINT UNSINGED/TINYINT/TINYINT UNSINGED
Date and time types:DATETIME/DATE/TIMESTAMP
String types:CHAR/VARCHR
Therefore, Key partitioning is the default partitioning strategy for PolarDB-X.
If you want to partition by both the id column and birthday column as partition keys, and you have a predefined number of 8 hash partitions, you can execute the following command to create the table:
CREATE TABLE key_tbl( id bigint not null auto_increment, bid int, name varchar(30), birthday datetime not null, primary key(id) ) PARTITION BY KEY(id, birthday) PARTITIONS 8;
Range Partitioning Strategy
In PolarDB-X, Range partitioning strategy is further divided into two types based on whether partition function expressions are supported or multiple partition columns are used as the partition key: Range partitioning and Range Columns partitioning.
Range Partitioning
Range partitioning only supports using a single integer type column as the partition key. However, for date and time type columns, it allows for the application of a partition function expression (such as YEAR, TO_DAYS, TO_SECONDS, MONTH, etc.) to be wrapped around these columns to convert them into integer types. It is also worth noting that Range partitioning does not directly support the use of string types as the partition column. Note Range partitioning does not support using NULL values as boundary values.
If the business requirement is to partition orders by order_time using a range partitioning strategy, with one partition per quarter, you could use the following SQL command to create the table:
CREATE TABLE orders( id int, order_time datetime not null) PARTITION BY RANGE(to_days(order_time)) ( PARTITION p1 VALUES LESS THAN (to_days('2021-01-01')), PARTITION p2 VALUES LESS THAN (to_days('2021-04-01')), PARTITION p3 VALUES LESS THAN (to_days('2021-07-01')), PARTITION p4 VALUES LESS THAN (to_days('2021-10-01')), PARTITION p5 VALUES LESS THAN (to_days('2022-01-01')) );
Range Columns Partitioning
Range Columns partitioning allows the use of multiple columns as the partition key, unlike Range partitioning, which typically only allows one partition key column. However, Range Columns partitioning does not permit the application of partition function expressions around the partition columns.The column types supported by Range Columns partitioning are more diverse than those supported by standard Range partitioning.The supported types are as follows:
Integer types:BIGINT/BIGINT UNSINGEDINT/INT UNSINGED/MEDIUMINT/MEDIUMINT UNSINGED/SMALLINT/SMALLINT UNSINGED/TINYINT/TINYINT UNSINGED
Date and time types:DATETIME/DATE
String types:CHAR/VARCHR
Note
- Range Columns partitioning currently does not support the TIMESTAMP type, but it will be improved in subsequent versions.
- Does not support using NULL values as boundary values.
If you want to partition by both order_id and order_time using range partitioning, you'll need to use a database system that supports range columns partitioning, or "range partitioning" with multiple columns. Assuming the database system in question supports this feature, here's an example SQL command to create such a table:
CREATE TABLE orders( order_id int, order_time datetime not null) PARTITION BY range columns(order_id,order_time) ( PARTITION p1 VALUES LESS THAN (10000,'2021-01-01'), PARTITION p2 VALUES LESS THAN (20000,'2021-01-01'), PARTITION p3 VALUES LESS THAN (30000,'2021-01-01'), PARTITION p4 VALUES LESS THAN (40000,'2021-01-01'), PARTITION p5 VALUES LESS THAN (50000,'2021-01-01') );
List Partitioning Strategy
Similar to Range partitioning, PolarDB-X further subdivides List partitioning strategy into two types based on whether partition function expressions are supported or multiple partition columns are used as the partition key: List partitioning and List Columns partitioning.
List Partitioning
List partitioning only supports using a single integer column as the partition key. However, for time-based columns, it supports using a partition function expression (such as YEAR, TO_DAYS, TO_SECONDS, MONTH, etc.) to convert them into integer types. Additionally, it is worth noting that Range partitioning does not support using string types directly as partition columns.
If the business wants to partition by the order_time using List partitioning, with one partition for each quarter, the following SQL command can be executed to create the table:
CREATE TABLE orders( id int, order_region varchar(64), order_time datetime not null) PARTITION BY LIST(YEAR(order_time)) ( PARTITION p1 VALUES IN (1990,1991,1992,1993,1994,1995,1996,1997,1998,1999), PARTITION p2 VALUES IN (2000,2001,2002,2003,2004,2005,2006,2007,2008,2009), PARTITION p3 VALUES IN (2010,2011,2012,2013,2014,2015,2016,2017,2018,2019) );
List Columns Partitioning
List Columns partitioning supports the use of multiple columns as the partition key, but it does not allow for any partition function expressions to be applied around the partition columns. The types of partition columns supported by List Columns partitioning are more diverse than those supported by List partitioning, and the supported types are as follows:
Integer types:BIGINT/BIGINT UNSINGEDINT/INT UNSINGED/MEDIUMINT/MEDIUMINT UNSINGED/SMALLINT/SMALLINT UNSINGED/TINYINT/TINYINT UNSINGED
Date and time types:DATETIME/DATE
String types:CHAR/VARCHR
Note
List Columns partitioning currently does not support the TIMESTAMP type; it will be improved in subsequent versions.
Does not support using NULL values as boundary values.
If the business wants to partition by both order_id and order_time using Range partitioning, the following SQL command can be executed to create the table:
CREATE TABLE orders( id int, order_region varchar(64), order_time datetime not null) PARTITION BY LIST COLUMNS(order_region) ( PARTITION p1 VALUES IN ('Hangzhou', 'Shenzhen'), PARTITION p2 VALUES IN ('Beijing', 'Shanghai'), PARTITION p3 VALUES IN ('Qingdao') );
Partitioned Table Data Types
The various partitioning strategies and their support for data types are as follows:
Parameter Description
Parameter | Description |
---|---|
CHARSET \CHARACTER SET | To specify the default character set for columns in a table, you can use the following character sets:
|
COLLATE | To specify the default collation for columns in a table, the following collations can be used:
|
TABLEGROUP | Used to specify the table group to which the partitioned table belongs. If not specified, it will automatically search for or create a table group with an identical partitioning scheme. |
LOCALITY | Used to specify the Data Node where the partitioned table resides. |