Memory Management Mechanism
TiDB's memory management basically consists of a memory usage quota settings for each query, and two interfaces, called Tracker
and OOMAction
.
Tracker
Tracker
tracks the memory usage of each element with a tree structure.
Genral use cases:
/--- Tracker(Component in Executor, e.g. list/rowContainer/worker)
| ...
/--- Tracker(Executor1) ---+--- Tracker(Component)
|
Tracker(Session) ---+--- Tracker(Executor2)
| | ...
| \--- Tracker(Executor3)
OOM-Action1
|
|
OOM-Action2
...
When a component allocates some memory, it will call the function Tracker.Consume(bytes)
to tell the Tracker
how much memory it uses. Tracker.Comsume
will traverse all its ancestor nodes, accumulate memory usage and trigger OOM-Action when exceeded.
OOM-Action
OOM-Action
is a series of actions grouped in a linked list to reduce memory usage. Each node on the linked list abstracts a strategy to be used when the memory usage of a SQL exceeds the memory quota. For example, we define the spill to disk strategy as SpillDiskAction
, rate limit strategy as rateLimitAction
and cancel strategy as PanicOnExceed
.
Rate Limit
TiDB supports dynamic memory control for the operator that reads data. By default, this operator uses the maximum number of threads that tidb_disql_scan_concurrency
allows to read data. When the memory usage of a single SQL execution exceeds tidb_mem_quota_query
each time, the operator that reads data stops one thread.
We use rateLimitAction
to dynamically control the data reading speed of TableReader
.
Spill Disk
TiDB supports disk spill for execution operators. When the memory usage of a SQL execution exceeds the memory quota, tidb-server can spill the intermediate data of execution operators to the disk to relieve memory pressure. Operators supporting disk spill include Sort, MergeJoin, HashJoin, and HashAgg.
SpillDiskAction
We use SpillDiskAction
to control the spill disk of HashJoin
and MergeJoin
. The data will be placed in Chunk unit when spilling. We can get any data in Chunk through random I/O.
SortAndSpillDiskAction
We use SortAndSpillDiskAction
to control the spill disk of Sort
.
If the input of SortExec
is small, then it sorts in memory. If the input is large, the SortAndSpillDiskAction
will be triggered, and an external sort algorithm will be used. We can split the input into multiple partitions and perform a merge sort on them.
External sorting algorithms generally have two stages, sort and merge. In the sort stage, chunks of data small enough to fit in main memory are read, sorted, and written out to a temporary file. In the merge stage, the sorted subfiles are combined, and the final result will be outputted.
AggSpillDiskAction
We use AggSpillDiskAction
to control the spill disk of HashAgg
. When AggSpillDiskAction
is triggered, it will switch HashAgg executor to spill-mode, and the memory usage of HashAgg won't grow.
We use the following algorithm to control the memory increasing:
- When the memory usage is higher than the
mem-quota-query
, switch the HashAgg executor to spill-mode. - When HashAgg is in spill-mode, keep the tuple in the hash map no longer growing. a. If the processing key exists in the Map, aggregate the result. b. If the processing key doesn't exist in the Map, spill the data to disk.
- After all data have been processed, output the aggregate result in the map, clear the map. Then read the spilling data from disk, repeat the Step1-Step3 until all data gets aggregated.
As we can see, unlike other spilling implementations, AggSpillDiskAction
does not make the memory drop immediately, but keeps the memory no longer growing.
Log/Cancel
When the above methods cannot control the memory within the threshold, we will try to use PanicOnExceed
to cancel the SQL or use LogOnExceed
to log the SQL info.