Optimize and execute sorting operators

This topic describes how to use sorting operators together with the ORDER BY clause to reduce the amount of data to be transmitted and improve execution efficiency.

Sorting

The semantics of sorting is to sort input rows based on the columns that are specified in the ORDER BY clause. The sorting operators described in this topic are not pushed down to LogicalView. If a sorting operator is pushed down to LogicalView, MySQL at the storage layer determines how to execute the operator.

Sorting operators

PolarDB-X provides the MemSort, TopN, and MergeSort operators to sort rows.

MemSort MemSort is a common operator used to sort rows in PolarDB-X. If you execute this operator, the quicksort algorithm is run in the memory to sort data. The following example shows how to use MemSort:

> explain select t1.name from t1 join t2 on t1.id = t2.id order by t1.name,t2.name;

Project(name="name")
  MemSort(sort="name ASC,name0 ASC")
    Project(name="name", name0="name0")
      BKAJoin(condition="id = id", type="inner")
        Gather(concurrent=true)
          LogicalView(tables="t1", shardCount=2, sql="SELECT `id`, `name` FROM `t1` AS `t1`")
        Gather(concurrent=true)
          LogicalView(tables="t2_[0-3]", shardCount=4, sql="SELECT `id`, `name` FROM `t2` AS `t2` WHERE (`id` IN ('?'))")

TopN

If both ORDER BY and LIMIT are used in an SQL statement, TopN is used to optimize the SQL statement. TopN manages a list of top N rows. All rows are sorted based on the columns specified in the ORDER BY clause in descending or ascending order. After all input rows are processed, the top N rows are the requested result. In specific cases, less than N rows are returned.

> explain select t1.name from t1 join t2 on t1.id = t2.id order by t1.name,t2.name limit 10;

Project(name="name")
  TopN(sort="name ASC,name0 ASC", offset=0, fetch=?0)
    Project(name="name", name0="name0")
      BKAJoin(condition="id = id", type="inner")
        Gather(concurrent=true)
          LogicalView(tables="t1", shardCount=2, sql="SELECT `id`, `name` FROM `t1` AS `t1`")
        Gather(concurrent=true)
          LogicalView(tables="t2_[0-3]", shardCount=4, sql="SELECT `id`, `name` FROM `t2` AS `t2` WHERE (`id` IN ('?'))")

MergeSort In most cases, sorting operators in SQL statements are pushed down to MySQL and then executed if their semantics are supported by MySQL. Then, PolarDB-X executes the MergeSort operator. The MergeSort operator not only sorts data but also redistributes data in a similar way as the Gather operator. The following SQL statement is used to sort the rows in a table named t1. After the SQL statement is optimized by the query optimizer of PolarDB-X, the sorting operator is pushed down to all MySQL shards and then executed. PolarDB-X executes only the MergeSort operator.

> explain select name from t1 order by name;

MergeSort(sort="name ASC")
  LogicalView(tables="t1", shardCount=2, sql="SELECT `name` FROM `t1` AS `t1` ORDER BY `name`")

Compared with MemSort, the MergeSort algorithm reduces the memory consumed by PolarDB-X and makes full use of the computing power of MySQL.

results matching ""

    No results matching ""