total binary blob length
schema tree node ID
stream sequence ID
key information attached to current node-sequence
This encoding indicates that a map node is flattened.
Encryption metadata.
Encryption groups. There is one encryption group per set of columns that share the same encryption mechanism. At FB, that means one group per crypto project.
Key provider, which determines what is stored in the `keyMetadata` blobs and how they should be used for encryption. At FB, we have one solution, which is the crypto service.
Encryption metadata for a set of columns/fields that share the same encryption mechanism.
Node IDs for the columns/fields included in the encryption group. Once a node is marked for encryption, all of its child nodes are automatically included in the same encryption group. This also implies that no other node between that node and the root can be present in any encryption group.
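The inheritance rule can be sketched as follows. This is a minimal illustration, not the format's actual implementation: the schema tree is assumed to be a plain `children` dictionary keyed by node ID, `group_roots` stands in for the per-group node lists, and the function name `assign_encryption_groups` is hypothetical.

```python
from typing import Dict, List, Optional


def assign_encryption_groups(
    children: Dict[int, List[int]],
    group_roots: List[List[int]],
    root: int = 0,
) -> Dict[int, Optional[int]]:
    """Map each node ID to the index of its encryption group (None if unencrypted).

    `children` and `group_roots` are hypothetical in-memory stand-ins for the
    schema tree and for the node lists of the encryption groups.
    """
    # Index every explicitly listed node by the group that names it.
    listed: Dict[int, int] = {}
    for group_index, nodes in enumerate(group_roots):
        for node in nodes:
            if node in listed:
                raise ValueError(f"node {node} is listed more than once")
            listed[node] = group_index

    assignment: Dict[int, Optional[int]] = {}

    def visit(node: int, inherited: Optional[int]) -> None:
        # A listed node must not sit below another node that is already encrypted.
        if node in listed and inherited is not None:
            raise ValueError(f"node {node} has an ancestor in an encryption group")
        group = listed.get(node, inherited)
        assignment[node] = group
        # Children inherit the group of their parent automatically.
        for child in children.get(node, []):
            visit(child, group)

    visit(root, None)
    return assignment
```

With a tree where node 0 has children 1 and 2, and node 1 has children 3 and 4, listing node 1 in the first group places nodes 1, 3, and 4 in that group, while listing both node 1 and node 3 is rejected.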
Arbitrary blob representing the DEK metadata used in the file footer. The content of the blob is key provider specific: it could be a key identifier in a KMS, an encrypted DEK, or something else. As an optimization, when the key is not present, the key from the stripe information where the DEK is set is reused.
Serialized and encrypted `FileStatistics`. The number of `ColumnStatistics` inside it should match the number of nodes defined in this encryption group.
Statistics for a subtree of the schema, in depth-first traversal order.
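As a rough sketch of what depth-first ordering means for a reader, the snippet below lists the node IDs of a subtree and pairs them with a statistics list of the same length. It assumes pre-order traversal (parent before children), which the text does not state explicitly; the `children` dictionary and both function names are hypothetical.

```python
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")


def subtree_preorder(children: Dict[int, List[int]], root: int) -> List[int]:
    """Return the node IDs of the subtree at `root`, parent before children."""
    order: List[int] = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(children.get(node, [])))
    return order


def pair_statistics(
    children: Dict[int, List[int]], root: int, statistics: Sequence[T]
) -> Dict[int, T]:
    """Attach the i-th statistics entry to the i-th node in traversal order."""
    nodes = subtree_preorder(children, root)
    if len(nodes) != len(statistics):
        raise ValueError("statistics count does not match subtree size")
    return dict(zip(nodes, statistics))
```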
Encryption metadata
This covers all integer widths, including byte, short, int, and long.
This covers binary keys and UTF-8 strings.
Serialized length must be less than 255 bytes.
Defines a single type conversion schema.
stream sequence ID within a node
logical column index in schema
The `offset` of a stream can be calculated from the lengths of all streams in front of it. However, with encryption, a client without the key cannot decrypt the encrypted stripe footer metadata and therefore cannot collect all the information needed. For that reason, `offset` records the offset of the stream relative to the beginning of the stripe.
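The difference between implied and recorded offsets can be sketched like this. The `Stream` class and `stream_offsets` function are hypothetical simplifications: each stream carries only a `length` and an optional `offset`, and the running position is assumed to restart from a recorded offset whenever one is present.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stream:
    length: int                   # stream length in bytes
    offset: Optional[int] = None  # offset from the start of the stripe, if recorded


def stream_offsets(streams: List[Stream]) -> List[int]:
    """Resolve each stream's offset relative to the beginning of the stripe."""
    offsets: List[int] = []
    position = 0
    for stream in streams:
        # Prefer the recorded offset; otherwise fall back to the running total
        # of the lengths of the streams seen so far.
        start = stream.offset if stream.offset is not None else position
        offsets.append(start)
        position = start + stream.length
    return offsets
```

If every stream is visible, `[Stream(10), Stream(20), Stream(5)]` resolves to offsets 0, 10, and 30. A client that cannot see the 20-byte encrypted stream only gets the right answer for the last stream when its offset is recorded, as in `[Stream(10), Stream(5, offset=30)]`.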
If you add new index stream kinds, make sure to update StreamName so that they are added to the stripe in the right area.
This type of stream records map key presence.
total length of all strings
Metadata for a set of columns that share the same data encryption key (DEK)
Serialized and encrypted (using the DEK) `StripeEncryptionGroup`, one for each encryption group. The number of items should match the number of encryption groups defined in the file footer. A client with the key to an encryption group can decrypt the corresponding blob and retrieve the streams/encodings for the encrypted columns.
group size to preload hint - 0 means no grouping
DEK metadata used in the stripe. The number of items should match the number of encryption groups defined in the file footer. At write time, the writer generates one DEK per encryption group. In the common case, only the first stripe has `keyMetadata` set, and all other stripes and the file footer share the keys defined in the first stripe. In the case of file concatenation, stripe information is copied from the original files along with their existing keys, so the first stripe of each original file has its own keys set, and the file footer has a different key.
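One way to read the sharing rule is that a stripe without its own `keyMetadata` uses the keys of the nearest preceding stripe that has them. The sketch below implements that reading; it is an assumption about lookup order rather than something stated above, and the function name and the list-of-lists input are hypothetical.

```python
from typing import List, Optional


def effective_key_metadata(stripes: List[List[bytes]]) -> List[List[bytes]]:
    """Resolve the key metadata that applies to each stripe.

    `stripes[i]` holds the keyMetadata blobs of stripe i (one per encryption
    group), or an empty list when the stripe does not set its own keys.
    """
    resolved: List[List[bytes]] = []
    current: Optional[List[bytes]] = None
    for index, key_metadata in enumerate(stripes):
        if key_metadata:
            # This stripe sets its own keys, e.g. the first stripe of a file
            # or the first stripe copied from each concatenated file.
            current = key_metadata
        elif current is None:
            raise ValueError(f"stripe {index} has no key metadata to inherit")
        resolved.append(current)
    return resolved
```

For a file concatenated from two originals, an input such as `[[b"k1"], [], [], [b"k2"], []]` resolves the first three stripes to `k1` and the last two to `k2`.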