Memory Management Mechanism
TiDB's memory management consists of a memory usage quota setting for each query and two interfaces, called Tracker and OOM-Action.
Tracker tracks the memory usage of each element with a tree structure.
General use cases:
                                               /--- Tracker(Component in Executor, e.g. list/rowContainer/worker)
                                               |           ...
                    /--- Tracker(Executor1) ---+--- Tracker(Component)
                    |
Tracker(Session) ---+--- Tracker(Executor2)
       |            |         ...
       |            \--- Tracker(Executor3)
  OOM-Action1
       |
       |
  OOM-Action2
      ...
When a component allocates memory, it calls Tracker.Consume(bytes) to tell its Tracker how much memory it uses. Tracker.Consume traverses all of its ancestor nodes, accumulates the memory usage on each of them, and triggers the OOM-Action when the quota is exceeded.
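The propagation up the tree can be sketched in a few lines of Go. This is a minimal illustration, not TiDB's actual Tracker implementation: the struct fields, the NewTracker constructor, and the onExceed callback (which stands in for an OOM-Action) are hypothetical names, and the sketch ignores concurrency entirely.

```go
package main

import "fmt"

// Tracker is a hypothetical, simplified node of the tracking tree.
type Tracker struct {
	label    string
	parent   *Tracker
	consumed int64
	quota    int64            // <= 0 means this node has no quota
	onExceed func(t *Tracker) // stands in for the attached OOM-Action
}

// NewTracker creates a node and links it to its parent (nil for the root).
func NewTracker(label string, quota int64, parent *Tracker) *Tracker {
	return &Tracker{label: label, quota: quota, parent: parent}
}

// Consume adds bytes to this node and to every ancestor, and fires the
// callback of each node whose accumulated usage exceeds its quota.
func (t *Tracker) Consume(bytes int64) {
	for n := t; n != nil; n = n.parent {
		n.consumed += bytes
		if n.quota > 0 && n.consumed > n.quota && n.onExceed != nil {
			n.onExceed(n)
		}
	}
}

func main() {
	exceeded := 0
	session := NewTracker("session", 100, nil)
	session.onExceed = func(t *Tracker) { exceeded++ }
	executor := NewTracker("executor", 0, session)
	component := NewTracker("rowContainer", 0, executor)

	component.Consume(60) // session total: 60, under the quota
	component.Consume(60) // session total: 120, over the quota of 100
	fmt.Println(session.consumed, executor.consumed, exceeded) // prints "120 120 1"
}
```

Only the session node carries a quota here, so a component never needs to know the limit: it reports its own allocations and the check happens wherever a quota is attached.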
OOM-Action is a series of actions grouped in a linked list to reduce memory usage. Each node on the linked list abstracts a strategy to be used when the memory usage of a SQL statement exceeds the memory quota. For example, we define the spill-to-disk strategy as SpillDiskAction, the rate-limit strategy as rateLimitAction, and the cancel strategy as PanicOnExceed.
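As a rough illustration of strategies chained in a linked list, the sketch below walks the list and applies the first strategy that can still help. The ActionNode type, its fields, and the "fall through when a strategy cannot be applied" rule are assumptions made for this example; the text above only states that the actions form a linked list.

```go
package main

import "fmt"

// ActionNode is a hypothetical node of the strategy list.
type ActionNode struct {
	name     string
	try      func() bool // reports whether the strategy could be applied
	fallback *ActionNode // next strategy in the list
}

// Trigger walks the list from n and returns the name of the first
// strategy that could be applied, or "" if none could.
func (n *ActionNode) Trigger() string {
	for cur := n; cur != nil; cur = cur.fallback {
		if cur.try() {
			return cur.name
		}
	}
	return ""
}

func main() {
	spillsLeft := 1 // pretend only one spill is possible

	cancel := &ActionNode{name: "cancel", try: func() bool { return true }}
	spill := &ActionNode{name: "spill", fallback: cancel, try: func() bool {
		if spillsLeft == 0 {
			return false
		}
		spillsLeft--
		return true
	}}

	fmt.Println(spill.Trigger()) // prints "spill"
	fmt.Println(spill.Trigger()) // prints "cancel": spill is exhausted
}
```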
TiDB supports dynamic memory control for the operator that reads data. By default, this operator reads data with the maximum number of threads that tidb_distsql_scan_concurrency allows. Each time the memory usage of a single SQL execution exceeds tidb_mem_quota_query, the operator that reads data stops one thread.
TiDB uses rateLimitAction to dynamically control the data reading speed of this operator.
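The "stop one thread per quota violation" rule can be pictured with a counter. This is only a sketch: readerPool and onQuotaExceeded are hypothetical names, and keeping at least one thread alive is an assumption of the example, not something the text above states.

```go
package main

import "fmt"

// readerPool is a hypothetical stand-in for the operator that reads data.
type readerPool struct {
	active int // number of reading threads still running
}

// onQuotaExceeded stops one thread per call. Keeping one thread alive is
// an assumption of this sketch, so that the read can still finish.
func (p *readerPool) onQuotaExceeded() {
	if p.active > 1 {
		p.active--
	}
}

func main() {
	pool := &readerPool{active: 3} // 3 plays the role of the concurrency limit
	for i := 0; i < 4; i++ {
		pool.onQuotaExceeded()
		fmt.Println(pool.active) // prints 2, 1, 1, 1 on successive lines
	}
}
```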
TiDB supports disk spill for execution operators. When the memory usage of a SQL execution exceeds the memory quota, tidb-server can spill the intermediate data of execution operators to the disk to relieve memory pressure. Operators supporting disk spill include Sort, MergeJoin, HashJoin, and HashAgg.
SpillDiskAction controls the disk spill of HashJoin and MergeJoin. When spilling, the data is stored in units of Chunk, and any data in a spilled Chunk can be read back through random I/O.
SortAndSpillDiskAction controls the disk spill of SortExec. If the input of SortExec is small, it sorts in memory. If the input is large, SortAndSpillDiskAction is triggered and an external sort algorithm is used: the input is split into multiple partitions, and a merge sort is performed on them.
External sorting algorithms generally have two stages, sort and merge. In the sort stage, chunks of data small enough to fit in main memory are read, sorted, and written out to a temporary file. In the merge stage, the sorted subfiles are combined to produce the final result.
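The two stages can be sketched as follows. This is a generic external merge sort over integers, not SortExec itself: in-memory slices stand in for the temporary files, the function names are made up for the example, and runSize (which must be positive) plays the role of "what fits in main memory".

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// sortRuns is the sort stage: cut the input into runs of at most runSize
// values and sort each run. Each run stands in for one temporary file.
func sortRuns(input []int, runSize int) [][]int {
	var runs [][]int
	for start := 0; start < len(input); start += runSize {
		end := start + runSize
		if end > len(input) {
			end = len(input)
		}
		run := append([]int(nil), input[start:end]...)
		sort.Ints(run)
		runs = append(runs, run)
	}
	return runs
}

// cursor points at the next unread value of one run.
type cursor struct{ run, pos int }

// mergeHeap is a min-heap of cursors ordered by the value they point at.
type mergeHeap struct {
	runs    [][]int
	cursors []cursor
}

func (h *mergeHeap) Len() int { return len(h.cursors) }
func (h *mergeHeap) Less(i, j int) bool {
	a, b := h.cursors[i], h.cursors[j]
	return h.runs[a.run][a.pos] < h.runs[b.run][b.pos]
}
func (h *mergeHeap) Swap(i, j int)      { h.cursors[i], h.cursors[j] = h.cursors[j], h.cursors[i] }
func (h *mergeHeap) Push(x interface{}) { h.cursors = append(h.cursors, x.(cursor)) }
func (h *mergeHeap) Pop() interface{} {
	last := h.cursors[len(h.cursors)-1]
	h.cursors = h.cursors[:len(h.cursors)-1]
	return last
}

// mergeRuns is the merge stage: a k-way merge that repeatedly takes the
// smallest head value among all runs.
func mergeRuns(runs [][]int) []int {
	h := &mergeHeap{runs: runs}
	for i, r := range runs {
		if len(r) > 0 {
			h.cursors = append(h.cursors, cursor{run: i})
		}
	}
	heap.Init(h)
	var out []int
	for h.Len() > 0 {
		c := heap.Pop(h).(cursor)
		out = append(out, runs[c.run][c.pos])
		if c.pos+1 < len(runs[c.run]) {
			heap.Push(h, cursor{run: c.run, pos: c.pos + 1})
		}
	}
	return out
}

func main() {
	input := []int{9, 4, 7, 1, 8, 2, 6, 3, 5}
	fmt.Println(mergeRuns(sortRuns(input, 4))) // prints "[1 2 3 4 5 6 7 8 9]"
}
```

During the merge only one value per run has to be held in memory at a time, which is why the merge stage stays cheap even when the total input is far larger than the quota.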
AggSpillDiskAction controls the disk spill of HashAgg. When AggSpillDiskAction is triggered, it switches the HashAgg executor to spill-mode, and the memory usage of HashAgg stops growing.
We use the following algorithm to control the memory growth:
- Step 1: When the memory usage is higher than the mem-quota-query, switch the HashAgg executor to spill-mode.
- Step 2: While HashAgg is in spill-mode, stop adding new tuples to the hash map.
  a. If the key being processed exists in the map, aggregate the result.
  b. If the key being processed doesn't exist in the map, spill the data to disk.
- Step 3: After all data has been processed, output the aggregate result in the map and clear the map. Then read the spilled data from disk and repeat Step 1 to Step 3 until all data is aggregated.
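The steps above can be sketched for a SUM ... GROUP BY aggregation. This is an illustration under simplifying assumptions, not the HashAgg executor: the quota is modeled as a maximum number of groups (maxKeys, which must be at least 1) rather than bytes, a slice stands in for the spill file, and the row type and aggregate function are hypothetical names.

```go
package main

import "fmt"

type row struct {
	key string
	val int
}

// aggregate computes SUM(val) GROUP BY key while never holding more than
// maxKeys groups in the hash map at once.
func aggregate(rows []row, maxKeys int) map[string]int {
	result := make(map[string]int)
	for len(rows) > 0 {
		groups := make(map[string]int)
		var spilled []row // stands in for the data spilled to disk
		for _, r := range rows {
			if _, ok := groups[r.key]; ok {
				groups[r.key] += r.val // key already in the map: aggregate
			} else if len(groups) < maxKeys {
				groups[r.key] = r.val // under the quota: the map may grow
			} else {
				spilled = append(spilled, r) // spill-mode: new key is spilled
			}
		}
		// Output this round's groups, then drop the map.
		for k, v := range groups {
			result[k] = v
		}
		rows = spilled // the next round re-reads the spilled rows
	}
	return result
}

func main() {
	rows := []row{{"a", 1}, {"b", 2}, {"a", 3}, {"c", 4}, {"b", 5}, {"c", 6}}
	res := aggregate(rows, 2)
	fmt.Println(res["a"], res["b"], res["c"]) // prints "4 7 10"
}
```

Once the map is full it stays full for the rest of the round, so every row of a spilled key is spilled too. Each key is therefore finalized in exactly one round, and no partial results need to be merged across rounds.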
As we can see, unlike other spilling implementations, AggSpillDiskAction does not make the memory usage drop immediately. Instead, it keeps the memory usage from growing any further.
When the above methods cannot keep the memory usage within the threshold, we use PanicOnExceed to cancel the SQL statement or LogOnExceed to log the SQL info.