Proto commits in polarsignals/frostdb

The Protocol Buffers files changed in these 27 commits:

Commit:2c5b58b
Author:Alfonso Subiotto Marqués
Committer:GitHub

db: ignore writes below new block transaction on recovery (#864)
* db: ignore writes below new block transaction on recovery
  Previously, the recovery code was ignoring writes below the persist transaction of a table block. However, this could lead to dropped writes since the active block is swapped atomically on rotation *before* the old block is persisted. Writes in between these two events would be written to the new table block, but ignored on recovery given the recovery code assumed they were in the old table block.
* db: fix WAL off-by-one truncation error
  Previously snapshots were performed using a write txn and the WAL was truncated so that that txn would be the first txn in the truncated WAL. However, snapshots were changed to use a read txn so if a write at txn k was included in the snapshot, a truncated WAL would still contain this write as the first write in the WAL, resulting in duplicate data after recovery.
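
A minimal sketch of the off-by-one fix described in this commit message, not the FrostDB implementation. The `walRecord` type and `truncateAfterSnapshot` function are hypothetical names, and the WAL is assumed to be a slice sorted by ascending txn. Because a snapshot taken at read txn k already contains the write at txn k, the truncated WAL has to start at the first record strictly above k.

```go
package main

import "fmt"

// walRecord is a hypothetical, simplified WAL entry that only carries the
// txn id of the write it represents.
type walRecord struct {
	Txn uint64
}

// truncateAfterSnapshot returns the records that still need to be kept once a
// snapshot taken at read txn snapshotTxn exists. The snapshot is assumed to
// contain every write with Txn <= snapshotTxn, so only strictly newer records
// are retained. records is assumed to be sorted by ascending Txn.
func truncateAfterSnapshot(records []walRecord, snapshotTxn uint64) []walRecord {
	for i, r := range records {
		if r.Txn > snapshotTxn {
			return records[i:]
		}
	}
	return nil
}

func main() {
	wal := []walRecord{{Txn: 3}, {Txn: 4}, {Txn: 5}, {Txn: 6}}
	// Keeping txn 4 here would replay a write the snapshot already holds.
	fmt.Println(truncateAfterSnapshot(wal, 4)) // prints [{5} {6}]
}
```

Using `>=` instead of `>` in the comparison reproduces the bug: the write at txn k would be loaded from the snapshot and then replayed from the WAL.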

The documentation is generated from this commit.

Commit:fcd703d
Author:Matthias Loibl
Committer:GitHub

Add proto CI runs (#807) * Add proto CI runs Mostly copied from Parca. * Add comments to proto messages where buf lint complains * buf generate

Commit:827dcc5
Author:Matthias Loibl
Committer:GitHub

Add FrostDB gRPC Query service (#784) Add FrostDB gRPC Query service to query with proto

Commit:f84496b
Author:Matthias Loibl
Committer:GitHub

Add support for uint64 columns (#618) * Add support for uint64 columns * Add query support for Uint64 The commit adds support for Uint64 in binaryscalarexpr.go and various places. Most of these involve additional case statements to handle inputs of Uint64 type.

Commit:5e5bde7
Author:Geofrey Ernest
Committer:GitHub

buf format -w (#673) * buf format -w I was working on protobuf files and noticed that my VS Code was formatting the files differently. We already use buf for generating, so it is reasonable to use buf for formatting as well, strictly for consistency. * add workflow for buf format

Commit:8b67228
Author:Nicolas Takashi
Committer:GitHub

Add support for int32 columns (#647) Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

Commit:cc5a616
Author:Thor
Committer:GitHub

use reserved keyword for deprecated txnmetadata proto (#612) * use reserved keyword for deprecated txnmetadata proto * lint

Commit:597879f
Author:Thor
Committer:GitHub

Deprecate txn metadata (#609) * proto: deprecate: txn metadata * proto: removed txn metadata from snapshots This feature is no longer used, and it never quite worked with snapshots as intended.

Commit:f6a0acc
Author:Alfonso Subiotto Marqués
Committer:GitHub

dynparquet: add unique primary index option (#553) This commit adds the option to specify that the primary index should be unique. This results in dropping duplicate rows during compaction. Tables with this option might experience slower compactions.
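
A minimal sketch of dropping duplicate rows during a compaction pass, under stated assumptions. The `row` type and `dropDuplicates` function are hypothetical, rows are assumed to arrive already sorted by their primary-index key, and keeping the first row of each run is an arbitrary choice made for illustration. The commit message does not say which duplicate survives.

```go
package main

import "fmt"

// row is a hypothetical row reduced to its primary-index key and one value.
type row struct {
	Key   string
	Value int
}

// dropDuplicates assumes sorted is ordered by Key, as a compaction merge
// would produce, and keeps only the first row of each run of equal keys.
func dropDuplicates(sorted []row) []row {
	out := make([]row, 0, len(sorted))
	for i, r := range sorted {
		if i > 0 && sorted[i-1].Key == r.Key {
			continue // same key as the previous row: drop it
		}
		out = append(out, r)
	}
	return out
}

func main() {
	rows := []row{{"a", 1}, {"a", 2}, {"b", 3}}
	fmt.Println(dropDuplicates(rows)) // prints [{a 1} {b 3}]
}
```

The extra comparison per row is one plausible reason for the slower compactions the commit message warns about.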

Commit:69f3a11
Author:Thor
Committer:GitHub

Pre hash columns (#524) * proto/schema: add prehash column option * Prehash columns based on schema

Commit:89559f6
Author:Alfonso Subiotto Marqués
Committer:GitHub

*: allow attaching user-defined metadata to txns (#483)
* *: add user-defined TxnMetadata to wal and snapshot protos
* *: add WithUserDefinedTxnMetadatProvider option
  This allows users to associate txns with user-defined metadata. These are persisted in WAL records and snapshots.
* *: store txn metadata in db high watermark
* *: expose high watermark
* *: fix TestDBRecover flake

Commit:00646f5
Author:Thor
Committer:GitHub

Repeated schema (#435) * proto: repeated field in storage layout * support repeated storage layouts

Commit:5e56e97
Author:Thor
Committer:GitHub

381 wal persist table config (#383) * proto: TableConfig primitive * support new table config def

Commit:98d47fd
Author:Alfonso Subiotto Marqués
Committer:GitHub

frostdb: miscellaneous snapshot improvements (#374)
* frostdb: snapshot quality of life improvements
  This commit:
  - Adds Info/Debug log messages to snapshots.
  - Adds an atomic that ensures only one snapshot is taken at a time.
  - Moves the async snapshot code into a method.
  - Writes WAL record before taking the snapshot.
  - Adds metrics.
* frostdb: add active TableBlock information to snapshot
  This is necessary since the block size and last snapshot size were not set. The txns are also serialized (min/prev) since some stuff will probably break if not.
* design: add snapshot design document

Commit:974038e
Author:Alfonso Subiotto Marqués
Committer:GitHub

frostdb: enable snapshots (#371)
* frostdb: fix txpool race on shutdown
  Close would close a channel that is written to from another goroutine, which is a race condition since a write could panic. This commit changes the txpool to use a context with a cancel function to shut down instead.
* frostdb: enable snapshots
  This commit adds the option to enable snapshotting a database whenever a block exceeds a given size or is rotated out. A snapshot is taken of the db state and stored in the snapshots/ folder. The use of the WAL is not modified yet, even though it could be truncated. This will be done in a future change once snapshots have been running for a while. Note that old snapshots are not cleaned up yet, which gives us more txns to reload at in case of issues. In the future, a cleanup policy will be implemented.
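
A minimal sketch of the shutdown pattern named in the txpool fix, not the actual txpool code. The `pool` type and its `Submit` and `Close` methods are hypothetical. The channel is never closed, so a concurrent send cannot panic. Shutdown is signalled by cancelling a context that both the sender and the consumer select on.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// pool is a hypothetical stand-in for a transaction pool with one
// background consumer goroutine.
type pool struct {
	txns   chan uint64
	ctx    context.Context
	cancel context.CancelFunc
	wg     sync.WaitGroup
}

func newPool() *pool {
	ctx, cancel := context.WithCancel(context.Background())
	p := &pool{txns: make(chan uint64), ctx: ctx, cancel: cancel}
	p.wg.Add(1)
	go func() {
		defer p.wg.Done()
		for {
			select {
			case <-p.ctx.Done():
				return // shutdown requested
			case <-p.txns:
				// A real pool would process the txn here.
			}
		}
	}()
	return p
}

// Submit reports whether the txn was handed to the consumer. It returns
// false after shutdown instead of panicking on a closed channel.
func (p *pool) Submit(txn uint64) bool {
	select {
	case <-p.ctx.Done():
		return false
	case p.txns <- txn:
		return true
	}
}

// Close cancels the context and waits for the consumer to exit.
// The txns channel is intentionally left open.
func (p *pool) Close() {
	p.cancel()
	p.wg.Wait()
}

func main() {
	p := newPool()
	fmt.Println(p.Submit(1)) // prints true
	p.Close()
	fmt.Println(p.Submit(2)) // prints false
}
```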

Commit:332af94
Author:Alfonso Subiotto Marqués
Committer:GitHub

*: add snapshots (#357)
This commit adds the ability to write a database snapshot at a given tx to disk and load a snapshot into a database. This commit does not yet add a public-facing API for snapshots.
The motivation for this change is to improve use cases where users enable the WAL. Snapshots aim to solve two problems:
1) WAL replay can be very slow given that each insert needs to be replayed to get to a "final" state. Snapshots can be loaded into a database in O(1) time with respect to the number of inserts performed against the database instead of the O(n) inserts required during a WAL replay.
2) With multiple tables, the WAL can only be truncated when all tables have been persisted/rotated out and only at the minimum persisted tx across all tables. If the user has a small table that doesn't change often and is therefore not persisted/rotated out, the WAL can never be truncated. Snapshots allow us to persist these small tables and truncate the WAL at the snapshot txn.
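
A minimal sketch of the second problem, assuming a hypothetical `truncationPoint` helper and a map from table name to the last persisted txn of that table. Without snapshots, the WAL can only be truncated at the minimum across all tables, so one rarely persisted table pins the whole log.

```go
package main

import "fmt"

// truncationPoint returns the txn at which the WAL could be truncated
// without snapshots: the minimum persisted txn across all tables.
// The boolean is false when there are no tables. This is a hypothetical
// helper for illustration only.
func truncationPoint(persisted map[string]uint64) (uint64, bool) {
	var lowest uint64
	found := false
	for _, txn := range persisted {
		if !found || txn < lowest {
			lowest = txn
			found = true
		}
	}
	return lowest, found
}

func main() {
	persisted := map[string]uint64{
		"large": 900, // rotated out often
		"small": 12,  // rarely persisted, so it holds the WAL back
	}
	fmt.Println(truncationPoint(persisted)) // prints 12 true
}
```

With a snapshot that captures every table at some txn, the truncation point would no longer depend on the slowest table and could move to the snapshot txn, as the commit message describes.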

Commit:aca9ebd
Author:Thor
Committer:GitHub

Arrow wal (#320) * proto: add arrow field to wal entry * log arrow records to wal * wal: support arrow record writes * replay arrow wal entries

Commit:4288082
Author:Alfonso Subiotto Marqués
Committer:GitHub

wal: fix wal.proto backwards incompatibility (#289) The new schema changes overrode the proto field, which caused the old schema message to be incorrectly unmarshalled. This commit adds the field back in as a deprecated oneof so that we can handle WALs generated before the change.

Commit:5d971f6
Author:Thor
Committer:GitHub

Schema v2 nested (#273) * proto: added v1alphav2 schema to support nested schemas WAL: changed the schema to a generic pb.Any * wal: support pb.Any

Commit:f446aa3
Author:thorfour

collapse v2 schema

Commit:5976362
Author:thorfour

allow groups to be nullable/repeated

Commit:8c0aba9
Author:thorfour

v1alpha2: nested schema

Commit:1a71cfa
Author:thorfour

proto: add repeated field to storage layout

Commit:1f6817e
Author:Thor
Committer:GitHub

Support boolean types (#256) * support boolean types * logictest: boolean logic test

Commit:d77193a
Author:Julien Fabre
Committer:GitHub

Add support for Delta Byte Array and Delta Length Byte Array encoding (#144)

Commit:8afb5ac
Author:Frederic Branczyk
Committer:GitHub

Add parquet-reencode tool to switch encodings and compressions (#136) This tool is useful in order to try and identify whether a different encoding or compression might be useful for a certain column. It allows for a quick cycle to experiment without having to change the ingesting application and wait for data to accumulate.

Commit:e68abd7
Author:Frederic Branczyk
Committer:Frederic Branczyk

Add write-ahead log