TopSQLAgent is the persistent agent service for TopSQL records
ReportPlanMeta reports plan meta to the agent. The agent should deal with plan meta similarly to SQL meta.
ReportSQLMeta reports SQL meta to the agent. The agent should ensure that the SQL meta exists before sending the SQL CPU time records to the remote database.
ReportTopRURecords is called periodically to save the in-memory TopRU records.
ReportTopSQLRecords is called periodically (e.g. per minute) to save the in-memory TopSQL records.
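The ordering requirement on ReportSQLMeta (SQL meta must exist before CPU time records are sent on) can be sketched as a small buffer. This is only an illustration: the class `AgentBuffer`, its methods, and the record shape are hypothetical and are not part of the service definition.

```python
class AgentBuffer:
    """Hypothetical agent-side buffer: hold records until their SQL meta is known."""

    def __init__(self):
        self.sql_meta = {}   # sql_digest -> normalized SQL text
        self.pending = []    # (sql_digest, cpu_time_ms) records waiting to be sent

    def add_sql_meta(self, sql_digest, sql_text):
        self.sql_meta[sql_digest] = sql_text

    def add_record(self, sql_digest, cpu_time_ms):
        self.pending.append((sql_digest, cpu_time_ms))

    def flush(self):
        """Return the records whose SQL meta exists; keep the rest for later."""
        ready = [r for r in self.pending if r[0] in self.sql_meta]
        self.pending = [r for r in self.pending if r[0] not in self.sql_meta]
        return ready


buf = AgentBuffer()
buf.add_record("d1", 10)
print(buf.flush())           # nothing is ready: meta for "d1" is still missing
buf.add_sql_meta("d1", "select ?")
print(buf.flush())           # the record for "d1" can now be sent
```

Holding records back rather than dropping them is one possible design; an agent could equally write the meta and the records in a single ordered batch.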
TiDB implements TopSQLPubSub service for clients to subscribe to TopSQL data.
Clients subscribe to TopSQL data through this RPC, and TiDB periodically (e.g. per minute) publishes TopSQL data to clients via gRPC stream.
Semantics:
- collectors empty => TOPSQL is enabled by default
- collectors non-empty => authoritative (only the listed collectors are enabled)
Examples:
- TOPSQL only: collectors=[TOPSQL] (or empty)
- TOPRU only: collectors=[TOPRU]
- both: collectors=[TOPSQL, TOPRU]
Only used when COLLECTOR_TYPE_TOPRU is present in collectors.
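The collectors semantics can be sketched as a small resolution function. The enum `CollectorType`, its numeric values, and both function names are assumptions made for this illustration; only the empty/non-empty rule comes from the comments above.

```python
from enum import Enum


class CollectorType(Enum):
    # Names follow the collectors mentioned above; the numeric values are assumed.
    TOPSQL = 1
    TOPRU = 2


def enabled_collectors(collectors):
    """Empty list enables TOPSQL by default; a non-empty list is authoritative."""
    if not collectors:
        return {CollectorType.TOPSQL}
    return set(collectors)


def uses_topru_options(collectors):
    """TopRU-only options matter only when TOPRU is among the enabled collectors."""
    return CollectorType.TOPRU in enabled_collectors(collectors)


print(enabled_collectors([]))
print(uses_topru_options([CollectorType.TOPSQL, CollectorType.TOPRU]))
```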
ANN = Approximate Nearest Neighbor. For some queries, ANN index can be used.
Used in: , ,
For debug purpose only. Currently only used in explain.
Deprecated field. The ID is now read from column first, and this field is no longer set. It is retained for compatibility with older versions of TiDB.
The reference vector to calculate the distance against. Each element is a Float32.
Only for ANNQueryType==Where
Only for HNSW indexes
Persists the original vector column's type information (including nullability) to ensure correct data handling. This field is always populated with the column's schema metadata, regardless of whether enable_distance_proj is enabled.
If enabled, the content of TableScan's vector output column (whose ID is column.id) is removed, and TableScan reads a distance column (whose ID must be -2000) as its replacement. Once the index has been built, the TiFlash persistent layer does not need to actually read the vector data column.
Used in:
Where = 2; // Not supported for now
Group by clause.
Aggregate functions.
If it is a stream aggregation.
bucket_size is the maximum histogram bucket size. It is needed because the histogram is built directly when the primary key is the handle.
sample_size is the max number of samples that will be collected.
sketch_size is the max sketch size.
columns_info is the info of all the columns that need to be analyzed.
sample_rate is the sampling rate that determines how many samples will be collected. At least one of sample_rate and sample_size must be non-zero.
ndv_rate is the sampling rate that determines how many samples will be used to calculate the NDV.
Used in:
collectors is the sample collectors for columns.
pk_hist is the histogram for primary key when it is the handle.
Used in:
bucket_size is the max histograms bucket size.
num_columns is the number of columns in the index.
Used in:
Deprecated. Start Ts has been moved to coprocessor.Request.
Bucket is an element of histogram.
Used in:
ByItem type for group by and order by.
Deprecated. Start Ts has been moved to coprocessor.Request.
Chunk contains multiple rows data and rows meta.
Used in: ,
Data for all rows in the chunk.
Meta data for every row.
Used in:
Represents the endian.
Used in:
For compatibility when this variable is not present. Defaults to COLLECTOR_TYPE_TOPSQL.
future: COLLECTOR_TYPE_XXX = 3;
Used in: , , , , , , , ,
MySQL type.
Encoded datum.
PK handle column value is row handle.
consts represents all non-null const args in repeated Datum format.
Data compression mode
Used in:
no compression
fast compression/decompression speed, compression ratio is lower than HC mode
high compression (HC) ratio mode
DAGRequest represents the request that will be handled with DAG mode.
Transaction start timestamp. Deprecated. Start Ts has been moved to coprocessor.Request.
It represents the pushed-down Executors and follows the order of a depth-first search with post-order traversal. That is: left child first, then right child, then parent.
For example, a DAG:
          A
         /
        B
       / \
      C   D
     /   / \
    E   F   G
   /
  H
Its order should be: [H, E, C, F, G, D, B, A]
In most cases each parent has only one child, which makes executors a simple array from the source to the outermost executor, and the response only needs to output the final rows. But when an executor has more than one child, for example IndexLookUp, some intermediate results must also be output. The field `intermediate_output_channels` describes them.
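The post-order layout of the example DAG can be reproduced with a short recursive traversal. This is a sketch: the dictionary encoding of the tree and the function name `post_order` are illustrative and do not reflect how executors are actually built.

```python
def post_order(children, root):
    """Return nodes in post-order: children left to right, then the parent."""
    order = []

    def visit(node):
        for child in children.get(node, []):
            visit(child)
        order.append(node)

    visit(root)
    return order


# The example DAG from the comment, as parent -> ordered list of children.
dag = {
    "A": ["B"],
    "B": ["C", "D"],
    "C": ["E"],
    "D": ["F", "G"],
    "E": ["H"],
}

print(post_order(dag, "A"))  # ['H', 'E', 'C', 'F', 'G', 'D', 'B', 'A']
```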
time zone offset in seconds
flags stores flags that change the execution mode. It contains:
- ignore_truncate = 1: truncate errors should be ignored if set.
- truncate_as_warning = 1 << 1: when ignore_truncate is not set, return a warning instead of an error if this flag is set.
- ... add more when needed.
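The precedence between the two truncate flags can be sketched with plain bit tests. The constant names and the function `truncate_action` are illustrative; only the bit values and the rule that ignore_truncate wins over truncate_as_warning come from the comment above.

```python
FLAG_IGNORE_TRUNCATE = 1           # bit 0
FLAG_TRUNCATE_AS_WARNING = 1 << 1  # bit 1


def truncate_action(flags: int) -> str:
    """Decide how a truncate error is handled for the given flags value."""
    if flags & FLAG_IGNORE_TRUNCATE:
        return "ignore"
    if flags & FLAG_TRUNCATE_AS_WARNING:
        return "warning"
    return "error"


for flags in (0, 1, 2, 3):
    print(flags, truncate_action(flags))
```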
It represents which columns we should output.
It represents whether we collect the detailed scan counts in each range.
It indicates the maximum number of warnings, which is the number of messages that SHOW WARNINGS displays.
It indicates the encode type of response.
It indicates the sql_mode.
Supplying only an offset is not enough, since daylight saving time is observed in some regions.
It represents whether or not TiKV should collect execution summaries. Execution summaries will be collected into `execution_summaries` field in the response.
Represents the maximum size of one packet, any generated string, or any parameter sent as long data.
Represents the chunk memory layout.
Represents whether the expression uses RPN form.
UserIdentity is used for privilege checks. It is only used in TiDB cluster memory tables.
Represents tree-structured executors. If this field is set, the executors field should be ignored. Currently only used in TiFlash.
Forces the use of the encode type specified by encode_type. Currently only used in TiFlash.
It indicates the number of digits by which to increase the scale of the result of division operations performed with the / operator.
It indicates the intermediate result channels.
DynamicPartitionAccessObject represents the partitions accessed by the children of this operator.
Used as response type in: TopSQLAgent.ReportPlanMeta, TopSQLAgent.ReportSQLMeta, TopSQLAgent.ReportTopRURecords, TopSQLAgent.ReportTopSQLRecords
(message has no fields)
Used in: , ,
TypeCHBlock is used by TiSpark and TiFlash, in this encode mode, TiFlash will encode the data using native ch block format
ExchangeReceiver accepts connections and receives data from ExchangeSender.
Used in:
ExchangeSender builds a connection with ExchangeReceiver.
Used in:
partition keys' types
TODO: Rename it to hash aggregation after support stream aggregation in TiKV.
Expand executor is used to expand underlying data sources to feed different grouping sets.
Expand2 executor is used to expand underlying data sources to feed different grouping sets.
It represents an Executor.
Used in: , , , , , , , , , , , , ,
It indicates the parent index of current executor. Not set indicates its parent is the next executor in `DAGRequest.executors`.
Used in: ,
Total time cost in this executor. Includes self time cost and children time cost.
How many rows this executor produced totally.
How many times executor's `next()` is called.
Corresponding executor ID.
The execution concurrency for this executor
Serialized kvproto resource_manager.Consumption, used to report the consumption info to TiDB. For now it is only for TiFlash, and it is the RU consumption of one MPPTask/cop/batchCop rather than of one executor.
Only for TiFlash, records the wait info.
Only for TiFlash, records network info.
Deprecated for near-future usage.
Used in:
for grouping sets like: expr[a,b] and expr[c]
expand version
Used in:
for grouping sets generated projection levels, like [a, b, c, 1#gen_col1, 2#gen_col2]
for output names for generated cols like grouping_id etc, like "gen_col1", "gen_col2" here
with_runtime_stats represents if runtime stats are available. If not available, the act_rows, *_exec_info, memory_bytes and disk_bytes should not be used.
If discarded_due_to_too_long is true, the main and ctes fields should be empty and should not be used. This field can be changed to an enum or int if more states need to be represented in the future.
Used in:
the cost of the current operator
The XXXReader/XXXScan/MemTable/PointGet/BatchPointGet may use this
memory_bytes and disk_bytes are expected to be displayed as "N/A" when they are -1, this will be consistent with the result of EXPLAIN ANALYZE.
Evaluators should implement evaluation functions for every expression type.
Children count 0.
Used in:
Values are encoded bytes.
MySQL specific types.
Encoded value list.
Column reference. value is int64 column ID.
Aggregate functions.
Window functions
Scalar Function
FMSketch is used to count distinct values for columns.
Currently only one column is supported.
For debug purpose only. Currently only used in explain.
Always the same as parser/model/index_full_text.go:FullTextParserType
These fields support pushdown for order-by and limit scenarios.
Distinguish match_word, match_expression, match_prefix, match_regexp
Pass through the inverted index conditions for the hybrid index.
Pass through the FTS boolean query.
Used in:
Means no scoring is ever needed, encourages the engine to use a fast path.
for grouping expressions like: expr[a,b]
Two dimensions here: the outermost dimension is for grouping(a,b) = (grouping(a) << 1) + grouping(b), so a slice of grouping marks must be maintained.
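The grouping(a,b) formula packs one bit per argument, with the leftmost argument in the most significant position. A minimal sketch, assuming each per-column grouping mark is 0 or 1 (the function name `grouping_id` is illustrative):

```python
def grouping_id(marks):
    """Combine per-column grouping marks (each 0 or 1) into one integer.

    For two columns this is (grouping(a) << 1) + grouping(b).
    """
    result = 0
    for mark in marks:
        # Shift the bits collected so far and append the next mark.
        result = (result << 1) | mark
    return result


print(grouping_id([1, 0]))  # 2: only a is marked
print(grouping_id([0, 1]))  # 1: only b is marked
print(grouping_id([1, 1]))  # 3: both a and b are marked
```

The explicit parentheses matter when writing this in code: in many languages, including Python, `+` binds tighter than `<<`, so `x << 1 + y` would be parsed as `x << (1 + y)`.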
Contain the grouping's meta info
Do 'and' operation, e.g. x & y
Compare two numbers
Check whether a number is in the set
ndv is the number of distinct values.
buckets represents all the buckets.
It represents the index columns we should use to build the row handle.
Used in:
check whether it is a unique index.
only used by TiFlash
only used by TiFlash, for TiCI hybrid vector queries
It is the data of an intermediate output channel.
Used in:
IntermediateOutputChannel is the channel description for the intermediate output. The SelectResponse of a DAGRequest may output some intermediate data because not all rows can be processed in the DAG. For example, the executor IndexLookUp scans the index records and looks up the rows locally. If the row related to an index record is not found locally, this index record should be output into the intermediate channel for further processing on the TiDB side.
Used in:
executor_idx indicates which executor outputs this intermediate result.
It represents which columns we should output.
0 or 1
used by TiFlash join when new collation is enabled.
only used by TiFlash
only used by TiFlash join key null-safe equal
KeyRange is the encoded index key range, low is closed, high is open. (low <= x < high)
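The half-open convention means adjacent ranges that share a boundary never overlap. A minimal sketch, assuming keys compare lexicographically as raw bytes (the function name `key_in_range` is illustrative):

```python
def key_in_range(key: bytes, low: bytes, high: bytes) -> bool:
    """Half-open membership test: low is closed, high is open."""
    return low <= key < high


# The boundary key b"c" belongs to the second range only.
print(key_in_range(b"c", b"a", b"c"))  # False
print(key_in_range(b"c", b"c", b"e"))  # True
```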
Query indicates whether to terminate a single query on this connection or the whole connection. If Query is true, it terminates the statement the connection is currently executing, but leaves the connection itself intact. If Query is false, it terminates the connection associated with the given ConnectionID, after terminating any statement the connection is executing. See https://dev.mysql.com/doc/refman/8.0/en/kill.html.
Used in:
Limit the result to be returned.
If partition_by is not empty, the limit applies to each partition, that is, limit N rows are returned per partition. Generally used in SQL statements like `where row_number() over (partition by ...) < X`
Used in:
empty is not expected to be used.
Used in:
fast_scan is a feature only provided by TiFlash (but not TiKV).
conditions that are pushed down to storage layer, only used by TiFlash.
only used by TiFlash
only used by TiFlash
only used by TiFlash
only used by TiFlash
Used as request type in: TopSQLAgent.ReportPlanMeta
Plan text with sensitive fields trimmed. Producers should limit the size to less than 4KiB. Consider using `encoded_normalized_plan` if the size exceeds this limit.
If `normalized_plan` is unacceptably large, set `encoded_normalized_plan` instead. The textual normalized plan can be obtained by the following steps:
1. decode from base64
2. decode from snappy
3. decode from github.com/pingcap/tidb/util/plancodec.DecodeNormalizedPlan
Used in:
Projection expressions.
Used for range frame's comparison when finding frame's boundary
Used in:
Used to label the handling kv type of the request. This is for TiKV resource_metering to collect execution information by the key label.
Used in:
values are all in text format.
Used in:
RowMeta contains row handle and length of a row.
singleton_sketch estimates the number of values that appear only once in the sample.
sketch_sample_count is the number of samples processed when building fm_sketch and singleton_sketch.
Expression organized in RPN form. https://en.wikipedia.org/wiki/Reverse_Polish_notation
Used in: , , ,
All children fields in exprs should be empty.
query_vector_f32_le is f32 little-endian bytes, without the dims prefix.
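The byte layout of query_vector_f32_le can be sketched with Python's standard `struct` module: each element is packed as a 4-byte little-endian float32, with no length or dimension prefix. The helper names below are illustrative.

```python
import struct


def encode_query_vector(values):
    """Pack floats as consecutive little-endian float32 values, no dims prefix."""
    return struct.pack(f"<{len(values)}f", *values)


def decode_query_vector(data: bytes):
    """Inverse of encode_query_vector; the dimension is implied by the length."""
    if len(data) % 4 != 0:
        raise ValueError("length must be a multiple of 4 bytes")
    return list(struct.unpack(f"<{len(data) // 4}f", data))


payload = encode_query_vector([1.0, -2.5, 0.0])
print(len(payload))                  # 12 bytes for 3 dimensions
print(decode_query_vector(payload))  # [1.0, -2.5, 0.0]
```

Because there is no prefix, the receiver derives the dimension as the byte length divided by 4.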
Optional tuning parameters.
If zero, the server uses its default base beam size.
read_only disables background fixups (split/merge) triggered by search observations. When enabled, the server must not enqueue fixups and should avoid starting any background fixup workers.
When true, the server populates SPFreshSearchRow.distance in each result row.
Time costs of each major stage in microseconds.
Aggregated tikv-client RPC stats (best-effort) for this /spfresh_search request.
Effective oversample factor used by the server for this request.
Process-local partition cache stats for this request.
Used as request type in: TopSQLAgent.ReportSQLMeta
SQL text with sensitive fields trimmed. Producers should limit the size to less than 4KiB. Truncation can be chosen to reduce size.
If true, this SQL and plan are generated internally by TiDB itself, not by a user.
SampleCollector is used to collect samples and calculate the count and NDV of a column.
Casting
compare
unimplemented in tidb
arithmetic
math
op
unimplemented in tidb
other
control
unimplemented in tidb
encryption
info
miscellaneous
like
json
vector
fts
time
String functions
Greatest and Least functions return a Date-typed value when all the parameters are Date types.
ScanAccessObject represents the access to a single table. It may contain multiple indexes and multiple partitions.
Used in:
Response for SelectRequest.
Result rows.
Use multiple chunks to reduce memory allocation and avoid allocating large contiguous memory.
The execution summary of each executor, in the order in request.
It indicates the encode type of response.
ndvs collects the number of distinct values per range. It serves as execution feedback that helps improve the table's statistics.
It contains all the intermediate outputs.
Used in:
Where conditions.
Data for all rows
output row count for each executor
Used in:
Which engine should be used in the next step. Only used by TiFlash.
For global read in join, we must point out the key ranges when we don't have the region info.
fast_scan is a feature only provided by TiFlash (but not TiKV).
conditions that are pushed down to storage layer, only used by TiFlash.
only used by TiFlash
only used by TiFlash
only used by TiFlash
only used by TiFlash
Used in:
TiCIVectorQueryInfo carries a TiCI vector top-k query. Used when a hybrid index with a vector component is chosen by the planner. The primary operator is ANN top-k; an optional filter expression can be pushed down so that TiCI can apply it before or during the ANN search.
Used in:
The index that contains the vector component.
Column ID of the vector column being searched.
Distance metric to use for the search.
Number of nearest neighbors to return.
The query vector, serialized as a sequence of little-endian float32 values.
Dimensionality of the vector. Must match the index definition.
Optional pushed-down filter expression (inverted/scalar predicates). When present, TiCI applies filtering in conjunction with ANN search.
Column name, for debug/explain purposes only.
The execution summary of each executor, no order limitation.
Index file not available in disk
Index file available in disk and not cached in memory
Index file available in disk and cached in memory
Used in:
Time waited for minTSO to be satisfied.
Time waited in task queue.
Time waited for dependent pipelines to finish.
Used in:
Order by clause.
If partition_by is not empty, the top N rows of each partition are returned. Generally used in SQL statements like `where row_number() over (partition by ... order by ...) < X`
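The per-partition behaviour can be sketched as grouping rows by the partition key and keeping the first N rows of each group in sort order. This is an illustration only: rows are plain tuples, the function name is hypothetical, and ascending order is assumed.

```python
import heapq
from collections import defaultdict


def top_n_per_partition(rows, partition_key, order_key, n):
    """Keep the n smallest rows (by order_key) within each partition."""
    groups = defaultdict(list)
    for row in rows:
        groups[partition_key(row)].append(row)
    return {
        part: heapq.nsmallest(n, group, key=order_key)
        for part, group in groups.items()
    }


rows = [("a", 3), ("a", 1), ("a", 2), ("b", 5), ("b", 4)]
result = top_n_per_partition(rows, lambda r: r[0], lambda r: r[1], 2)
print(result["a"])  # [('a', 1), ('a', 2)]
print(result["b"])  # [('b', 4), ('b', 5)]
```

For a filter written as `row_number() ... < X`, the number of rows kept per partition is X - 1.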
TopRU-only options
Used in:
allowed: 15/30/60; server validates and applies default if 0
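The validation rule can be sketched as follows. The default value of 60 and the choice to reject other values with an error are assumptions made for this illustration; the comment above only states the allowed values and that 0 selects the server default.

```python
ALLOWED_VALUES = (15, 30, 60)
ASSUMED_DEFAULT = 60  # hypothetical: the real server default is not stated here


def resolve_topru_option(requested: int, default: int = ASSUMED_DEFAULT) -> int:
    """Apply the default for 0 and accept only the allowed values."""
    if requested == 0:
        return default
    if requested not in ALLOWED_VALUES:
        # Assumed behaviour: reject anything outside the allowed set.
        raise ValueError(f"value must be one of {ALLOWED_VALUES}, got {requested}")
    return requested


print(resolve_topru_option(0))
print(resolve_topru_option(15))
```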
TopRURecord represents RU statistics for a single (user, sql_digest, plan_digest) combination.
Used as request type in: TopSQLAgent.ReportTopRURecords
TopRURecordItem represents statistics within a single time bucket.
Used in:
timestamp in second
cumulative RU consumption (RRU + WRU)
execution count
cumulative execution time (nanoseconds)
Used as request type in: TopSQLAgent.ReportTopSQLRecords
Used in:
timestamp in second
this value can be greater than 1000 when counting concurrent running SQL queries
target => count
traffic from client
traffic to client
Used in:
Note: The name of the enum is intentionally aligned with tidb/parser/index_vector.go.
Used in: ,
Note: The name of the enum is intentionally aligned with tidb/parser/index_vector.go.
Only used for the `rows` frame type
Deprecated
Only used for the `range` frame type
Only used for the `range` frame type
Used in: