Subqueries
This topic describes the subquery categories supported by PolarDB-X. This topic also describes the limits on and the notes for subqueries in PolarDB-X.
Limits
Compared with the native MySQL, PolarDB-X provides the following limits on subqueries:
Subqueries cannot be used in HAVING clauses. The following code block provides an example on how to use subqueries in HAVING clauses:
SELECT name, AVG( quantity ) FROM tb1 GROUP BY name HAVING AVG( quantity ) > 2* ( SELECT AVG( quantity ) FROM tb2 );
Subqueries cannot be used in JOIN ON clauses. The following code block provides an example on how to use subqueries in JOIN ON clauses:
SELECT * FROM tb1 p JOIN tb2 s on (p.id=s.id and p.quantity>All(select quantity from tb3))
PolarDB-X does not support the ROW functions in the scalar subqueries that use equal signs (=) as operators. The following code block provides an example on how to use the ROW functions:
SELECT * FROM tb1 WHERE row(id, name) = (select id, name from tb2)
Subqueries cannot be used in UPDATE SET clauses. The following code block provides an example on how to use subqueries in UPDATE SET clauses:
UPDATE t1 SET c1 = (SELECT c2 FROM t2 WHERE t1.c1 = t2.c1) LIMIT 10
Note
In PolarDB-X, only the APPLY operator can be used to run specific subqueries. This makes queries inefficient. We recommend that you do not execute the following inefficient SQL statements:
SQL statements in which WHERE clauses contain OR operators and subqueries. The execution efficiency of the SQL statements is reduced based on the data in the foreign tables. The following code block provides an example on how to specify the SQL statements:
Efficient statement: select * from tb1 where id in (select id from tb2) Efficient statement: select * from tb1 where id in (select id from tb2) and id>3 Inefficient statement: select * from tb1 where id in (select id from tb2) or id>3
Correlated subqueries in which correlated items contain functions or non-equal signs. The following code block provides examples on how to specify the statement that does not contain the correlated subquery and the statements that contain the correlated subqueries:
Efficient statement: select * from tb1 a where id in (select id from tb2 b where a.name=b.name) Inefficient statement: select * from tb1 a where id in (select id from tb2 b where UPPER(a.name)=b.name) Inefficient statement: select * from tb1 a where id in (select id from tb2 b where a.decimal_test=abs(b.decimal_test)) Inefficient statement: select * from tb1 a where id in (select id from tb2 b where a.name!=b.name) Inefficient statement: select * from tb1 a where id in (select id from tb2 b where a.name>=b.name)
Correlated subqueries in which correlated items are connected with other conditions by using OR operators. The following code block provides examples on how to specify the statement that does not contain the correlated subquery and the statements that contain the correlated subqueries.
Efficient statement: select * from tb1 a where id in (select id from tb2 b where a.name=b.name and b.date_test<'2015-12-02') Inefficient statement: select * from tb1 a where id in (select id from tb2 b where a.name=b.name or b.date_test<'2015-12-02') Inefficient statement: select * from tb1 a where id in (select id from tb2 b where a.name=b.name or b.date_test=a.date_test)
Scalar subqueries that contain correlated items. The following code block provide examples on how to specify the statement that does not contain the correlated subquery and the statement that contains the correlated subquery:
Efficient statement: select * from tb1 a where id > (select id from tb2 b where b.date_test<'2015-12-02') Inefficient statement: select * from tb1 a where id > (select id from tb2 b where a.name=b.name and b.date_test<'2015-12-02')
Subqueries in which correlated items span the correlation levels.
An SQL statement has multiple correlation levels. The correlated items in each subquery are correlated only with the upper level. The statements that contain the subqueries are efficient. The following code block provides an example on how to specify the statements.
Efficient statement: select * from tb1 a where id in(select id from tb2 b where a.name=b.name and exists (select name from tb3 c where b.address=c.address))
An SQL statement has multiple correlation levels. The correlated items of subqueries in
Table c
are correlated with columns inTable a
. The statements that contain the subqueries are inefficient. The following code block provides an example on how to specify the statements.Inefficient statement: select * from tb1 a where id in(select id from tb2 b where a.name=b.name and exists (select name from tb3 c where a.address=c.address))
Note In the preceding examples, Table a
and Table b
belong to the same correlation level and Table b
and Table c
belong to the same correlation level. Table a
and Table c
are correlated across the correlation levels.
Subqueries that contain GROUP BY clauses. Make sure that the grouping columns specified in GROUP BY clauses contain the correlated items.
An SQL subquery contains aggregate functions and correlated items. The correlated item
b.pk
is correlated to thepk
grouping column. The statements that contain the subqueries are efficient. The following code block provides an example on how to specify the statements.Efficient statement: select * from tb1 a where exists (select pk from tb2 b where a.pk=b.pk and b.date_test='2003-04-05' group by pk);
An SQL subquery contains aggregate functions and correlated items. The correlated item
b.date_test
is not correlated to thepk
grouping column. The statements that contain the SQL subqueries are inefficient. The following code block provides an example on how to specify the statements.Inefficient statement: select * from tb1 a where exists (select pk from tb2 b where a.date_test=b.date_test and b.date_test='2003-04-05' group by pk);
Supported subqueries
PolarDB-X supports the following types of subqueries:
Comparisons using subqueries indicate subqueries that use comparison operators. These subqueries are most commonly used.
Syntax
non_subquery_operand comparison_operator (subquery) comparison_operator: = > < >= <= <> != <=> like
Example
select * from tb1 WHERE 'a' = (SELECT column1 FROM t1)
Note Subqueries can be specified only to the right of comparison operators.
Subqueries with ANY, ALL, IN/NOT IN, and EXISTS/NOT EXISTS
Syntax
operand comparison_operator ANY (subquery) operand comparison_operator ALL (subquery) operand IN (subquery) operand NOT IN (subquery) operand EXISTS (subquery) operand NOT EXISTS (subquery) comparison_operator:= > < >= <= <> !=
Example
ANY: If a row returned by the subquery meets the expression before ANY, the ANY operator returns TRUE. Otherwise, the ANY operator returns FALSE.
ALL: If all rows returned by the subquery meet the expression before ALL, the ALL operator returns TRUE. Otherwise, the ANY operator returns FALSE.
IN: If IN is specified before the subquery, IN is used in the same manner as
=ANY
. The following code block provides an example on how to use the subquery.SELECT s1 FROM t1 WHERE s1 = ANY (SELECT s1 FROM t2); SELECT s1 FROM t1 WHERE s1 IN (SELECT s1 FROM t2);
NOT IN: If NOT IN is specified before the subquery, NOT IN is used in the same manner as
<>ALL
. The following code block provides an example on how to use the subquery.SELECT s1 FROM t1 WHERE s1 <> ALL (SELECT s1 FROM t2); SELECT s1 FROM t1 WHERE s1 NOT IN (SELECT s1 FROM t2);
EXISTS: If the subquery returns a row, the EXISTS operator returns TRUE. Otherwise, the EXISTS operator returns FALSE. The following code block provides an example on how to use the subquery.
SELECT column1 FROM t1 WHERE EXISTS (SELECT * FROM t2);
Note If a subquery contains a row, the WHERE clause returns TRUE even if the subquery contains only rows in which the values are NULL.
NOT EXISTS: If the subquery returns a row, the NOT EXISTS operator returns FALSE. If the subquery returns NULL, the NOT EXISTS operator returns TRUE.
Row Subqueries
ROW subqueries support the following comparison operators:
comparison_operator: = > < >= <= <> != <=>
Example
SELECT * FROM t1 WHERE (col1,col2) = (SELECT col3, col4 FROM t2 WHERE id = 10); SELECT * FROM t1 WHERE ROW(col1,col2) = (SELECT col3, col4 FROM t2 WHERE id = 10);
The preceding two SQL statements can be executed in the same manner. Data rows in the t1 table are returned only when the following conditions are met:
The subquery
SELECT col3, col4 FROM t2 WHERE id=10
returns only one row. An error is reported if multiple rows are returned.col3
andcol4
returned by the subquery are equal tocol1
andcol2
in the primary table.
Correlated subqueries are subqueries that contain references to tables that appear in the outer query. The following code block provides an example on how to use the correlated subqueries.
SELECT * FROM t1 WHERE column1 = ANY (SELECT column1 FROM t2 WHERE t2.column2 = t1.column2);
In the example, the subquery does not contain the t1 table and its column column2. In this case, the subquery refers to the table in the outer query.
Derived tables (subqueries in a FROM clause)Derived tables are subqueries specified in a FROM clause.
Syntax
SELECT ... FROM (subquery) [AS] tbl_name ...
Example
Prepare data.Execute the following statements to create the t1 table:
CREATE TABLE t1 (s1 INT, s2 CHAR(5), s3 FLOAT); INSERT INTO t1 VALUES (1,'1',1.0); INSERT INTO t1 VALUES (2,'2',2.0);
Execute the following statement. The query result is
2, '2', 4.0
.SELECT sb1,sb2,sb3 FROM (SELECT s1 AS sb1, s2 AS sb2, s3*2 AS sb3 FROM t1) AS sb WHERE sb1 > 1;
Query the average value of grouped data processed by the SUM function. If you execute the following SQL statement, an error is reported and no result is returned.
SELECT AVG(SUM(s1)) FROM t1 GROUP BY s1;
You can execute the following statement that contains a derived table. The query result is
1.5000
.SELECT AVG(sum_s1) FROM (SELECT SUM(s1) AS sum_s1 FROM t1 GROUP BY s1) AS t1;
Note
A derived table must have an alias such as
t1
specified in the example.A derived table can return a scalar, a column, a row, or a table.
Derived tables cannot be correlated subqueries. Derived tables cannot contain references to foreign tables in outer queries.