sum stores the total length of all binary blobs in a stripe
The encoding of the bloom filters for this column:
  0 or missing = none or original
  1 = ORC-135 (UTC for timestamps)
min and max values are saved as days since the epoch
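As an illustration of the days-since-epoch representation, here is a minimal Python sketch. It assumes the epoch is 1970-01-01 (the text only says "epoch"), and the helper names `date_to_epoch_days` and `epoch_days_to_date` are hypothetical, not part of any ORC API.

```python
from datetime import date, timedelta

# Assumption: "epoch" means the Unix epoch, 1970-01-01.
EPOCH = date(1970, 1, 1)


def date_to_epoch_days(d: date) -> int:
    """Return the number of whole days between the epoch and d (negative before it)."""
    return (d - EPOCH).days


def epoch_days_to_date(days: int) -> date:
    """Inverse of date_to_epoch_days."""
    return EPOCH + timedelta(days=days)
```

A min/max pair stored this way can be compared as plain integers, and dates before the epoch simply map to negative values.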
The contents of the file tail that must be serialized. It is serialized as part of OrcSplit and is also used by the footer cache.
Each implementation that writes ORC files should register for a code:
  0 = ORC Java
  1 = ORC C++
  2 = Presto
Serialized length must be less than 255 bytes
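The size limit can be enforced with a small guard before writing. The following Python sketch is only an illustration: the function name `check_serialized_length` and the constant are hypothetical, and the payload is treated as an opaque byte string produced elsewhere by a protobuf serializer.

```python
# Exclusive upper bound taken from the comment above.
MAX_SERIALIZED_LENGTH = 255


def check_serialized_length(payload: bytes) -> int:
    """Return len(payload), or raise ValueError if it is not under the limit."""
    length = len(payload)
    if length >= MAX_SERIALIZED_LENGTH:
        raise ValueError(
            f"serialized length {length} must be less than {MAX_SERIALIZED_LENGTH} bytes"
        )
    return length


def length_as_single_byte(payload: bytes) -> bytes:
    """Encode the checked length as one unsigned byte (always possible under the limit)."""
    return bytes([check_serialized_length(payload)])
```

Any length that passes the check fits in a single unsigned byte, which is what `length_as_single_byte` demonstrates.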
The version of the file format:
  [0, 11] = Hive 0.11
  [0, 12] = Hive 0.12
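Because the version is a list of integers, a reader can compare versions as tuples. The Python sketch below assumes the list is ordered major, minor; the names `KNOWN_VERSIONS`, `describe_version`, and `is_at_least` are hypothetical helpers for illustration only.

```python
from typing import Sequence

# The two mappings listed in the comment above.
KNOWN_VERSIONS = {
    (0, 11): "Hive 0.11",
    (0, 12): "Hive 0.12",
}


def describe_version(version: Sequence[int]) -> str:
    """Map a version list such as [0, 12] to its label, or report it as unknown."""
    key = tuple(version)
    return KNOWN_VERSIONS.get(key, "unknown " + ".".join(str(part) for part in key))


def is_at_least(version: Sequence[int], minimum: Sequence[int]) -> bool:
    """Compare versions element by element, most significant part first."""
    return tuple(version) >= tuple(minimum)
```

For example, `is_at_least([0, 12], [0, 11])` is true, while a file marked `[0, 11]` would not satisfy a `[0, 12]` requirement.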
The version of the writer that wrote the file. This number is updated when we make fixes or large changes to the writer so that readers can detect whether a given bug is present in the data.
Only the Java ORC writer may use values under 6 (or missing) so that readers that predate ORC-202 treat the new writers correctly. Each writer should assign their own sequence of versions starting from 6.
Version of the ORC Java writer:
  0 = original
  1 = HIVE-8732 fixed (fixed stripe/file maximum statistics & string statistics use UTF-8 for min/max)
  2 = HIVE-4243 fixed (use real column names from Hive tables)
  3 = HIVE-12055 fixed (vectorized writer implementation)
  4 = HIVE-13083 fixed (decimals write present stream correctly)
  5 = ORC-101 fixed (bloom filters use UTF-8 consistently)
  6 = ORC-135 fixed (timestamp statistics use UTC)
Version of the ORC C++ writer:
  6 = original
Version of the Presto writer:
  6 = original
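A reader might use the Java writer version to decide which fixes are reflected in a file. The Python sketch below covers only the ORC Java sequence listed above and treats a missing version as 0 (original); the names `JAVA_WRITER_FIXES`, `java_fixes_present`, and `has_java_fix` are hypothetical and do not come from any ORC library.

```python
from typing import List, Optional

# Java writer version -> fix introduced at that version (from the list above).
JAVA_WRITER_FIXES = {
    1: "HIVE-8732",
    2: "HIVE-4243",
    3: "HIVE-12055",
    4: "HIVE-13083",
    5: "ORC-101",
    6: "ORC-135",
}


def java_fixes_present(writer_version: Optional[int]) -> List[str]:
    """List the fixes included in a file written by the Java writer.

    A missing version (None) is treated as 0, the original writer.
    """
    version = writer_version or 0
    return [fix for v, fix in sorted(JAVA_WRITER_FIXES.items()) if v <= version]


def has_java_fix(writer_version: Optional[int], fix: str) -> bool:
    """Return True if the named fix is included at this Java writer version."""
    return fix in java_fixes_present(writer_version)
```

With this table, `has_java_fix(5, "ORC-135")` is false, so a reader could choose to distrust timestamp statistics in such a file. How a reader should treat the C++ and Presto sequences is not specified here and is left out of the sketch.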
Leave this last in the record
If you add new index stream kinds, update StreamName as well so that they are added to the stripe in the right area.
sum stores the total length of all strings in a stripe
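A minimal Python sketch of how such a sum could be accumulated is shown below. It assumes that length is measured in UTF-8 bytes and that null values are skipped; the text does not state either point, and `total_string_length` is a hypothetical name.

```python
from typing import Iterable, Optional


def total_string_length(values: Iterable[Optional[str]]) -> int:
    """Sum the lengths of all non-null strings.

    Assumption: length is counted in UTF-8 encoded bytes, not characters.
    """
    return sum(len(v.encode("utf-8")) for v in values if v is not None)
```

Under the byte-length assumption, a value such as `"\u00e9"` contributes 2 to the sum even though it is a single character.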
min and max values are saved as milliseconds since the epoch
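The millisecond representation can be illustrated with the following Python sketch. It assumes the Unix epoch (1970-01-01T00:00:00) interpreted in UTC and timezone-aware inputs; the text does not specify timezone handling, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the epoch is 1970-01-01T00:00:00 in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ONE_MILLISECOND = timedelta(milliseconds=1)


def to_epoch_millis(dt: datetime) -> int:
    """Convert a timezone-aware datetime to whole milliseconds since the epoch.

    Sub-millisecond precision is floored away.
    """
    return (dt - EPOCH) // ONE_MILLISECOND


def from_epoch_millis(millis: int) -> datetime:
    """Convert milliseconds since the epoch back to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)
```

Integer division by a one-millisecond `timedelta` floors toward negative infinity, so timestamps before the epoch round consistently.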