These are the commits in which the Protocol Buffers files changed (only the last 100 relevant commits are shown):
Commit: | 1715355 | |
---|---|---|
Author: | sicheng | |
Committer: | sicheng |
[CLN] Rename $matches to $regex
Commit: | 0d136b9 | |
---|---|---|
Author: | Max Isom | |
Committer: | Max Isom |
[ENH]: add batch get version file paths method to Sysdb
Commit: | 09e96e8 | |
---|---|---|
Author: | Drew Kim | |
Committer: | GitHub |
[ENH] Add RPC on SysDB to get fork count for a collection (#4484) ## Description of changes _Summarize the changes made by this PR._ - Improvements & Bug fixes - N/A - New functionality - Adds a new RPC to the SysDB, `CountForks`, that returns the number of forks for a given collection. ## Test plan _How are these changes tested?_ - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes _Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs section](https://github.com/chroma-core/chroma/tree/main/docs/docs.trychroma.com)?_
The documentation is generated from this commit.
Commit: | 151e03a | |
---|---|---|
Author: | Drew Kim | |
Committer: | Drew Kim |
[ENH] Add RPC on SysDB to get fork count for a collection
The documentation is generated from this commit.
Commit: | 6a235d2 | |
---|---|---|
Author: | Robert Escriva | |
Committer: | GitHub |
[ENH] A route and tool to inspect the dirty log. (#4461) ## Description of changes To get ground truth on the dirty log, add a tool that can print it as it gets interpreted in the 'get all to compact' call. ## Test plan - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes N/A
Commit: | 4d29f0d | |
---|---|---|
Author: | Max Isom | |
Committer: | Max Isom |
[ENH]: add batch get version file paths method to Sysdb
Commit: | ecb9980 | |
---|---|---|
Author: | Max Isom | |
Committer: | Max Isom |
wip
Commit: | ddf02f3 | |
---|---|---|
Author: | Robert Escriva |
[ENH] A route and tool to inspect the dirty log. To get ground truth on the dirty log, add a tool that can print it as it gets interpreted in the 'get all to compact' call.
Commit: | 7e6889e | |
---|---|---|
Author: | Max Isom | |
Committer: | Max Isom |
[ENH]: add batch get version file paths method to Sysdb
Commit: | ab87d31 | |
---|---|---|
Author: | Max Isom |
[ENH]: add batch get version file paths method to Sysdb
Commit: | d6d8129 | |
---|---|---|
Author: | Macronova | |
Committer: | GitHub |
[ENH] Wire up regex filter from client to query node (#4410) ## Description of changes _Summarize the changes made by this PR._ - Improvements & Bug fixes - N/A - New functionality - Introduce the regex filters for documents: - `{"$matches": "<regex>"}`: Evaluates to true for documents matching the regex - `{"$not_matches": "<regex>"}`: The exact opposite of `$matches`. (i.e. it will evaluate to true on records without documents) - Propagates the regex filters from the client to the query node. Currently the implementation is `todo!()` for local and distributed chroma, and the code will panic when reached. The `todo!`s will be replaced with actual implementations in future changes. ## Test plan _How are these changes tested?_ - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes _Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs section](https://github.com/chroma-core/chroma/tree/main/docs/docs.trychroma.com)?_
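The filter semantics described in this commit message can be sketched in a few lines of Python. This is only an illustration, not Chroma's implementation: the helper `matches_filter` is hypothetical, and the use of an unanchored `re.search` is an assumption, since the message does not say how the pattern is anchored. A newer commit in this list renames `$matches` to `$regex`; the sketch keeps the operator names used in this message.

```python
import re
from typing import Optional


def matches_filter(document: Optional[str], where_document: dict) -> bool:
    """Evaluate a single-operator regex filter against one record's document.

    Hypothetical helper for illustration; `document` is None when the
    record has no document attached.
    """
    operator, pattern = next(iter(where_document.items()))
    # Assumption: the pattern may match anywhere in the document.
    found = document is not None and re.search(pattern, document) is not None
    if operator == "$matches":
        return found
    if operator == "$not_matches":
        # Exact opposite of $matches, so a record without a document passes.
        return not found
    raise ValueError(f"unsupported operator: {operator}")
```

Under these assumptions, a record without a document fails `$matches` and passes `$not_matches`, which is the behavior the message calls out.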
Commit: | 79c608f | |
---|---|---|
Author: | sicheng |
Reuse existing field for rename
Commit: | bf8f966 | |
---|---|---|
Author: | sicheng | |
Committer: | sicheng |
[ENH] Wire up regex filter from client to query node
Commit: | 1fe8a68 | |
---|---|---|
Author: | Macronova | |
Committer: | GitHub |
[ENH] Implement log forking (#4326) ## Description of changes *Summarize the changes made by this PR.* - Improvements & Bug fixes - N/A - New functionality - Implement log forking. Currently it is only implemented for legacy log service, by naively cloning the log offsets and records. ## Test plan *How are these changes tested?* - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
Commit: | 5d81b6b | |
---|---|---|
Author: | Sicheng Pan | |
Committer: | Macronova |
[ENH] Implement log forking
Commit: | f4a0e6b | |
---|---|---|
Author: | Sicheng Pan | |
Committer: | Macronova |
Do not lock log when fork
Commit: | e174d32 | |
---|---|---|
Author: | Macronova | |
Committer: | GitHub |
[ENH] Wire up proto defs for sysdb fork endpoint (#4299) ## Description of changes *Summarize the changes made by this PR.* - Improvements & Bug fixes - N/A - New functionality - Defines the SysDB fork service request and response - Wires up the request up to `table_catalog.go` ## Test plan *How are these changes tested?* - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
Commit: | b9599fd | |
---|---|---|
Author: | Sicheng Pan |
Do not lock log when fork
Commit: | df88b14 | |
---|---|---|
Author: | Sicheng Pan | |
Committer: | Sicheng Pan |
[ENH] Implement log forking
Commit: | 077e554 | |
---|---|---|
Author: | Sicheng Pan | |
Committer: | Sicheng Pan |
Add log offsets to fork request
Commit: | cca7add | |
---|---|---|
Author: | Sicheng Pan | |
Committer: | Sicheng Pan |
Specify new collection id in request for retriable fork
Commit: | fbf42a2 | |
---|---|---|
Author: | Sicheng Pan | |
Committer: | Sicheng Pan |
Update lineage file format
Commit: | 359f720 | |
---|---|---|
Author: | macronova | |
Committer: | Sicheng Pan |
Wire up proto defs for sysdb fork endpoint
Commit: | 67eba39 | |
---|---|---|
Author: | Sicheng Pan | |
Committer: | Sicheng Pan |
[ENH] Implement log forking
Commit: | f593801 | |
---|---|---|
Author: | Sicheng Pan | |
Committer: | Sicheng Pan |
Update lineage file format
Commit: | 4b4f32a | |
---|---|---|
Author: | macronova | |
Committer: | Sicheng Pan |
Wire up proto defs for sysdb fork endpoint
Commit: | 8418e8f | |
---|---|---|
Author: | Sicheng Pan | |
Committer: | Sicheng Pan |
Specify new collection id in request for retriable fork
Commit: | 9840862 | |
---|---|---|
Author: | Sicheng Pan |
[ENH] Implement log forking
Commit: | 3a3a22a | |
---|---|---|
Author: | Sicheng Pan | |
Committer: | Sicheng Pan |
Update lineage file format
Commit: | 436d62d | |
---|---|---|
Author: | Sicheng Pan | |
Committer: | Sicheng Pan |
Specify new collection id in request for retriable fork
Commit: | 2b2d028 | |
---|---|---|
Author: | macronova | |
Committer: | Sicheng Pan |
Wire up proto defs for sysdb fork endpoint
Commit: | 1cdcd41 | |
---|---|---|
Author: | Jai Radhakrishnan | |
Committer: | GitHub |
[ENH] grpc changes for update collection config (#4083) ## Description of changes *Summarize the changes made by this PR.* - Improvements & Bug fixes - ... - New functionality - add `configuration_json_str` to go grpc implementation and enable writing to grpc service from rust - add logic to overwrite existing configuration attributes with the updated configuration attributes ## Test plan *How are these changes tested?* - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
Commit: | 161ea0f | |
---|---|---|
Author: | Jai Radhakrishnan | |
Committer: | Jai Radhakrishnan |
add grpc changes for update
Commit: | 0e5dcf1 | |
---|---|---|
Author: | Sicheng Pan | |
Committer: | Sicheng Pan |
Specify new collection id in request for retriable fork
Commit: | 7d6afc5 | |
---|---|---|
Author: | Sicheng Pan | |
Committer: | Sicheng Pan |
Update lineage file format
Commit: | 8f33cf8 | |
---|---|---|
Author: | macronova | |
Committer: | Sicheng Pan |
Wire up proto defs for sysdb fork endpoint
Commit: | 9f394e1 | |
---|---|---|
Author: | Jai Radhakrishnan | |
Committer: | Jai Radhakrishnan |
add grpc changes for update
Commit: | 70a8d27 | |
---|---|---|
Author: | Sicheng Pan |
Specify new collection id in request for retriable fork
Commit: | 181ef3e | |
---|---|---|
Author: | macronova | |
Committer: | Sicheng Pan |
Wire up proto defs for sysdb fork endpoint
Commit: | b89ad51 | |
---|---|---|
Author: | Sicheng Pan | |
Committer: | Sicheng Pan |
Update lineage file format
Commit: | 4b5fc93 | |
---|---|---|
Author: | Jai Radhakrishnan | |
Committer: | Jai Radhakrishnan |
add grpc changes for update
Commit: | 247c330 | |
---|---|---|
Author: | Jai Radhakrishnan | |
Committer: | Jai Radhakrishnan |
add grpc changes for update
Commit: | 61d879b | |
---|---|---|
Author: | Robert Escriva | |
Committer: | GitHub |
Revert "[BUG] OBO between log service and compaction. (#4276)" This reverts commit e33f7703c3fbe6819553a228f77b794eee217708.
Commit: | e33f770 | |
---|---|---|
Author: | Robert Escriva | |
Committer: | GitHub |
[BUG] OBO between log service and compaction. (#4276) Compaction assumes that enumeration position t_i means t_i was the last record seen and therefore next reader should read from t_i + 1. Log service was built and tested for t_i meaning t_i was the first record to return. Note: I changed the go code, but only by moving a +1 out a layer. The inner version was inconsistent with convention, so I updated it.
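The two position conventions named in this commit message can be made concrete with a small sketch. It is not taken from the log service or the compactor; the function names and the list-backed log are hypothetical, chosen only to show what happens when a position written under one convention is read under the other.

```python
from typing import List


def read_from_first_unread(log: List[str], position: int) -> List[str]:
    """Log-service convention: `position` is the first record to return."""
    return log[position:]


def read_from_last_seen(log: List[str], position: int) -> List[str]:
    """Compaction convention: `position` was already seen; resume at position + 1."""
    return log[position + 1:]


log = ["r0", "r1", "r2", "r3"]
last_seen = 1  # compaction has consumed r0 and r1

# Same convention on both sides: reading resumes with r2.
consistent = read_from_last_seen(log, last_seen)

# Mixed conventions: the last-seen position is treated as first-unread,
# so r1 is returned a second time.
mixed = read_from_first_unread(log, last_seen)
```

In this toy setup the mismatch shows up as one re-read record; with the conventions swapped the other way, one record would be skipped instead.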
Commit: | 9866b83 | |
---|---|---|
Author: | Robert Escriva | |
Committer: | GitHub |
[ENH] Add a scout-logs function to find the max log position. (#4232) This PR makes a currently-not-used scout-logs function. It adds it to the go service, the rust log service over grpc, and the rust log client over grpc and in-memory.
Commit: | 77f4d01 | |
---|---|---|
Author: | Max Isom | |
Committer: | GitHub |
[BUG]: use absolute cutoff time when listing collections to garbage collect (#4245) ## Description of changes We currently send a relative cutoff time in seconds to the Sysdb when it expects an absolute cutoff time. This updates the garbage collection service to send an absolute cutoff time instead. ## Test plan *How are these changes tested?* - [x] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?* n/a
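The difference between a relative and an absolute cutoff can be sketched as follows. The helper name is hypothetical, and two things are assumed rather than stated in the commit message: that the relative value means "this many seconds before now", and that the absolute value is a Unix timestamp in seconds.

```python
import time
from typing import Optional


def absolute_cutoff_secs(relative_cutoff_secs: int, now_secs: Optional[int] = None) -> int:
    """Convert a relative window (seconds before now) into an absolute Unix timestamp."""
    if now_secs is None:
        now_secs = int(time.time())
    return now_secs - relative_cutoff_secs


# A one-hour window evaluated at a fixed "now" for reproducibility.
cutoff = absolute_cutoff_secs(3600, now_secs=1_700_000_000)
```

Sending `3600` where `cutoff` is expected would be read as a timestamp shortly after the Unix epoch, which is the kind of mismatch this fix addresses.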
Commit: | 8b6f04c | |
---|---|---|
Author: | Max Isom | |
Committer: | GitHub |
[ENH]: allow overriding default garbage collection mode per tenant & clean up cutoff time parameters (#4135) ## Description of changes The main change here is allowing setting different garbage collection modes per tenant. This will allow us to enable delete mode for one tenant while testing and keep everyone else in dry run mode. I added a new test, but for it to work I ended up needing to make some changes around how cutoff time is handled (was planning to make these changes, just in a later PR). ## Test plan *How are these changes tested?* - [x] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust Added a new E2E test to validate tenant overrides. ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
Commit: | 094f248 | |
---|---|---|
Author: | Max Isom | |
Committer: | Max Isom |
[ENH]: allow overriding default garbage collection mode per tenant
Commit: | 8c0d636 | |
---|---|---|
Author: | Max Isom | |
Committer: | GitHub |
[ENH]: rename GC `ListOnly` mode -> `DryRun`, do not delete versions in `DryRun` mode (#4126) ## Description of changes - Renames garbage collection `ListOnly` mode to `DryRun` for better clarity. - Does not delete versions in `DryRun` mode, only marks them for deletion. These changes make it possible to transition from running `DryRun` mode to running `Rename` or `Delete` modes with no additional tooling. ## Test plan *How are these changes tested?* Added a new E2E test for dry run mode. ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?* n/a
Commit: | ecc42d3 | |
---|---|---|
Author: | Macronova | |
Committer: | GitHub |
[ENH] Implement rebuild endpoint for compaction server (#4132) ## Description of changes *Summarize the changes made by this PR.* - Improvements & Bug fixes - Updates compaction job definition - New functionality - Add the rebuild endpoint for compaction server ## Test plan *How are these changes tested?* - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
Commit: | d02152d | |
---|---|---|
Author: | Max Isom | |
Committer: | Max Isom |
[ENH]: rename GC `ListOnly` mode -> `DryRun`, do not delete versions in `DryRun` mode
Commit: | e6adcec | |
---|---|---|
Author: | Rohit P | |
Committer: | GitHub |
[ENH] Prevent tracing:error calls for soft deleted collection in compaction. (#4146)
Commit: | d50cd02 | |
---|---|---|
Author: | Robert Escriva | |
Committer: | GitHub |
[ENH] Wire up the log service's offline path. (#4084) This makes the log-service "compact" the dirty log.
Commit: | fa48a6f | |
---|---|---|
Author: | Rohit | |
Committer: | Rohit |
Adding new Sysdb Grpc for fetching single collection. The grpc returns FailedPreCondition if the collection has been soft deleted.
Commit: | 2cccd82 | |
---|---|---|
Author: | Robert Escriva | |
Committer: | Robert Escriva |
Allow collections to be forgotten; they will be purged from dirty log on delete from sysdb.
Commit: | 965ff1f | |
---|---|---|
Author: | Jai Radhakrishnan |
.
Commit: | 6f7296d | |
---|---|---|
Author: | Jai Radhakrishnan |
Merge branch 'main' into jai/update-coll-config
Commit: | 8352744 | |
---|---|---|
Author: | Macronova | |
Committer: | GitHub |
[ENH] Calculate and update collection logical size during compaction (#4071) Fixes issues in https://github.com/chroma-core/chroma/pull/4019
Commit: | dba2436 | |
---|---|---|
Author: | Macronova | |
Committer: | GitHub |
Revert "[ENH] Calculate and update collection logical size during compaction" (#4070) Reverts chroma-core/chroma#4019
Commit: | 9a72ac9 | |
---|---|---|
Author: | Macronova | |
Committer: | GitHub |
[ENH] Calculate and update collection logical size during compaction (#4019) ## Description of changes *Summarize the changes made by this PR.* - Improvements & Bug fixes - N/A - New functionality - Propagate the new collection logical size and compaction time to SysDB ## Test plan *How are these changes tested?* - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?* --------- Co-authored-by: Sicheng Pan <sicheng@trychroma.com>
Commit: | 29858bc | |
---|---|---|
Author: | Jai Radhakrishnan |
add update collection config plumbing
Commit: | d5046d6 | |
---|---|---|
Author: | Rohit | |
Committer: | Rohit |
Prop Tests for GC. Use RefState and SUT to test GC. Update TestSysDb.
Commit: | a0fb1d1 | |
---|---|---|
Author: | Rohit P | |
Committer: | GitHub |
[ENH] Make SysDb return eligible collections to GC. (#3916) ## Description of changes *Summarize the changes made by this PR.* - Improvements & Bug fixes - Limit the number of collections, and return those that have at least 1 version that is eligible for GC. - New functionality - ... ## Test plan *How are these changes tested?* - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
Commit: | 36deeaa | |
---|---|---|
Author: | Rohit |
Prop Tests for GC. Use RefState and SUT to test GC. Update TestSysDb.
Commit: | 7c09a04 | |
---|---|---|
Author: | Macronova | |
Committer: | GitHub |
[ENH] Add log bytes pulled as part of query result (#3972) ## Description of changes *Summarize the changes made by this PR.* - Improvements & Bug fixes - N/A - New functionality - Added the size of pulled log on the read endpoint results ## Test plan *How are these changes tested?* - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?* --------- Co-authored-by: Sicheng Pan <sicheng@trychroma.com>
Commit: | 6a58ccb | |
---|---|---|
Author: | Drew Kim | |
Committer: | GitHub |
[ENH] Add size_bytes_post_compaction and last_compaction_time_secs to SysDB collection (#3886) ## Description of changes *Summarize the changes made by this PR.* - Improvements & Bug fixes - N/A - New functionality - Adds `size_bytes_post_compaction` and `last_compaction_time_secs` fields to the SysDB's Collection. ## Test plan *How are these changes tested?* - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
Commit: | 523c6b9 | |
---|---|---|
Author: | Drew Kim | |
Committer: | Drew Kim |
[ENH] Add size_bytes_post_compaction and last_compaction_time_secs to SysDB collection
Commit: | 7401b7a | |
---|---|---|
Author: | Rohit P | |
Committer: | GitHub |
[ENH] Garbage Collection property tests. (#3881) ## Description of changes *Summarize the changes made by this PR.* - Improvements & Bug fixes - ... - New functionality - ... ## Test plan *How are these changes tested?* - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
Commit: | 7a01076 | |
---|---|---|
Author: | Rohit | |
Committer: | Rohit |
[ENH] GC - Property test. Property tested: Create multiple batches of embeddings. All records should exist irrespective of the number of GC runs.
Commit: | 930da14 | |
---|---|---|
Author: | Rohit P | |
Committer: | GitHub |
[ENH] GC Tool - Tilt based tests. (#3749) ## Description of changes Includes changes to SysDb. Added operators for GC. Contains tests in garbage collection orchestrator. *Summarize the changes made by this PR.* - Improvements & Bug fixes - ... - New functionality - ... ## Test plan *How are these changes tested?* - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
Commit: | 44ce94d | |
---|---|---|
Author: | Rohit | |
Committer: | Rohit |
[ENH] GC - Property test. Property tested: Create multiple batches of embeddings. All records should exist irrespective of the number of GC runs.
Commit: | f87af07 | |
---|---|---|
Author: | Sanket Kedia | |
Committer: | Evan Culver |
Resolve merge conflicts
Commit: | 5e8699e | |
---|---|---|
Author: | Sanket Kedia | |
Committer: | github-actions[bot] |
Cherry-pick with conflicts: 3a9a76ee1f9e0e7464ff9e40a0b55ec49792a468
Commit: | 3a9a76e | |
---|---|---|
Author: | Sanket Kedia | |
Committer: | GitHub |
[ENH] Count collections implementation (#3785) ## Description of changes *Summarize the changes made by this PR.* - Improvements & Bug fixes - Introduces and Implements count_collections rpc in sysdb - Previously, we were suboptimally performing a len(get_collections()) which results in fan out of multiple db queries to fetch collection metadata one per collection - Both the python sysdb client and rust sysdb client consume the rpc introduced in sysdb - Both the rust frontend and python frontend consume the new impl - New functionality - ... ## Test plan *How are these changes tested?* - Added a test in test_system.py - [x] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes None
Commit: | cc816dd | |
---|---|---|
Author: | Sanket Kedia |
Make database name optional
Commit: | 760044d | |
---|---|---|
Author: | Sanket Kedia | |
Committer: | Sanket Kedia |
[ENH] Count collections implementation
Commit: | d9bc9b0 | |
---|---|---|
Author: | Rohit | |
Committer: | Rohit |
Fixes and tests.
Commit: | e85717d | |
---|---|---|
Author: | Rohit P | |
Committer: | GitHub |
[ENH] Add GRPCs in SysDB to help with garbage collection (#3539) ## Description of changes *Summarize the changes made by this PR.* - Improvements & Bug fixes - ... - New functionality - New GRPCs to mark versions for deletion, and to delete versions. ## Test plan *How are these changes tested?* - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
Commit: | 712fd91 | |
---|---|---|
Author: | Macronova | |
Committer: | GitHub |
[ENH] Serialize Where clause to protobuf (#3565) ## Description of changes *Summarize the changes made by this PR.* - Improvements & Bug fixes - Refactor the deserialization logic for `Where` clause - New functionality - Implement the serialization logic for `Where` clause ## Test plan *How are these changes tested?* - [x] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
Commit: | bfa2f11 | |
---|---|---|
Author: | rohitcpbot | |
Committer: | rohitcpbot |
Implementation and test cases of the GRPCs.
Commit: | 93097ea | |
---|---|---|
Author: | rohitcpbot | |
Committer: | Rohit |
Added the GRPCs
Commit: | 9addc9f | |
---|---|---|
Author: | Rohit P | |
Committer: | GitHub |
[ENH] Create VersionFiles in S3 from Sysdb. GI#822 (#3475) ## Description of changes SysDB will create version files in S3 as part of FlushCollectionCompaction call. CreateCollection will also create a version file. *Summarize the changes made by this PR.* - Improvements & Bug fixes - - New functionality - Creation of version files to aid with GC and Restore Ops. ## Test plan *How are these changes tested?* - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust Added New unit tests. ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
Commit: | 3ceb2b0 | |
---|---|---|
Author: | rohitcpbot | |
Committer: | rohitcpbot |
Addressed review comments.
Commit: | fcd258e | |
---|---|---|
Author: | Macronova | |
Committer: | GitHub |
[ENH] Implement compactor server interface (#3375) ## Description of changes *Summarize the changes made by this PR.* - Improvements & Bug fixes - N/A - New functionality - We would like to manually start a compaction on certain collections. This PR is the first in the stack to implement this feature. Specifically, it: - Introduces the grpc interface for compactor. Currently there is only a single variant of request that allows the user to specify an optional list of collection ids for compaction. - Implements the `CompactionServer` struct in rust that implements this interface. It receives the request and sends it to the running `CompactionManager` - The next PRs will handle the request in `CompactionManager`. ## Test plan *How are these changes tested?* - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
Commit: | 1f21ef8 | |
---|---|---|
Author: | Sanket Kedia | |
Committer: | GitHub |
[ENH] Implement Get collections to GC + add to rust sysdb client (#3508) ## Description of changes *Summarize the changes made by this PR.* - New functionality - Adds an rpc on Go side that is dumb and gets all collections - Adds this rpc to the rust sysdb client + uses this rpc from rust client in gc ## Test plan *How are these changes tested?* - [x] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes None
Commit: | 48dc2c7 | |
---|---|---|
Author: | Sanket Kedia | |
Committer: | Sanket Kedia |
Get collections to GC
Commit: | 4b8d799 | |
---|---|---|
Author: | rohitcpbot | |
Committer: | Rohit P |
Added the implementation of the GRPCs.
Commit: | a681944 | |
---|---|---|
Author: | rohitcpbot | |
Committer: | Rohit P |
Added the GRPCs
Commit: | fe90bdf | |
---|---|---|
Author: | Drew Kim | |
Committer: | GitHub |
[ENH] GetCollectionSize on SysDB read replica (#3503) ## Description of changes *Summarize the changes made by this PR.* - Improvements & Bug fixes - None - New functionality - Hooks up `SysDB` Go coordinator to a DB read replica - Adds a new RPC, `GetCollectionSize`, to return the total records in a collection, pulling from the reader DB ## Test plan *How are these changes tested?* - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
Commit: | 8c7c435 | |
---|---|---|
Author: | Drew Kim | |
Committer: | Drew Kim |
[ENH] GetCollectionSize on SysDB read replica
Commit: | c8f5728 | |
---|---|---|
Author: | Drew Kim |
idl makefile generates read coordinator
Commit: | 0158292 | |
---|---|---|
Author: | Drew Kim |
[ENH] Expose read replica on SysDB coordinator
Commit: | 537f29a | |
---|---|---|
Author: | Drew Kim | |
Committer: | GitHub |
[ENH] Update compactor to flush total records upon compaction (#3483) ## Description of changes Updates the compactor to calculate the total records per collection and flush to the sysdb upon every compaction. *Summarize the changes made by this PR.* - New functionality - `FlushCollectionCompaction` struct includes `TotalRecordsPostCompaction` - `SysDB` populates the `total_records_post_compaction` column when receiving a flush - `ArrowBlockfileFlusher` contains a new attribute, `total_keys` - `ArrowUnorderedBlockfileWriter` sums the total keys using the `SparseIndexWriter` and returns an `ArrowBlockfileFlusher` with the summed count - `RegisterInput` contains a new attribute, `total_records_post_compaction ` - If `CompactOrchestrator`, when handling `CommitSegmentWriterOutput`, receives a `ChromaSegmentFlusher::RecordSegment`, it reads `total_keys()` and sets it as an attribute on itself. - `ChromaSegmentFlusher::RecordSegment` has `total_keys()` through its `ArrowBlockfileFlusher` - `CompactOrchestrator` sends its `num_records_last_compaction` value to a `RegisterInput` to be flushed to the `SysDB` ## Test plan *How are these changes tested?* - [x] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust - [x] Tested locally and confirmed that compaction correctly updates the column in the `SysDB` ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
Commit: | 080a601 | |
---|---|---|
Author: | Drew Kim | |
Committer: | Drew Kim |
[ENH] Update compactor to flush total records upon compaction
Commit: | dd5efc1 | |
---|---|---|
Author: | Drew Kim | |
Committer: | Drew Kim |
i64 -> u64
Commit: | 6932c73 | |
---|---|---|
Author: | Drew Kim | |
Committer: | GitHub |
[ENH] Add num_records_last_compaction to sysdb (#3463) ## Description of changes *Summarize the changes made by this PR.* - New functionality - Adds `num_records_last_compaction` to distributed sysdb ## Test plan *How are these changes tested?* - [x] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?*
Commit: | 936f58d | |
---|---|---|
Author: | Drew Kim |
i64 -> u64
Commit: | 9ee24fd | |
---|---|---|
Author: | Max Isom | |
Committer: | GitHub |
[ENH] add distributed implementation for database deletion (#3460) ## Description of changes Adds implementation for database deletion API to distributed Chroma. ## Test plan *How are these changes tested?* - [x] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust Added a new Go test. Mostly tested with existing Python tests. ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?* n/a
Commit: | ea0ac3a | |
---|---|---|
Author: | Max Isom | |
Committer: | Max Isom |
Add distributed impl
Commit: | f50a11b | |
---|---|---|
Author: | Drew Kim |
make num records required