Proto commits in MaterializeInc/materialize

These are the commits in which the Protocol Buffers files changed (only the last 100 relevant commits are shown):

Commit:23c205f
Author:Dov Alperin
Committer:GitHub

Store active rollups and GC in state (#32301)
Closes https://github.com/MaterializeInc/database-issues/issues/9193
### Checklist
- [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/))
- [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design.
- [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label.
- [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)).
- [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.

The documentation is generated from this commit.

Commit:a8d9a96
Author:Jan Teske
Committer:Jan Teske

storage: flatten oneshot ingestions commands

Commit:921fea0
Author:Petros Angelatos
Committer:Jan Teske

storage: flatten RunIngestion command Signed-off-by: Petros Angelatos <petrosagg@gmail.com>

Commit:591bbd2
Author:Petros Angelatos
Committer:Jan Teske

storage: flatten RunSinks command Signed-off-by: Petros Angelatos <petrosagg@gmail.com>

Commit:815418e
Author:Petros Angelatos
Committer:Jan Teske

storage: flatten AllowCompaction command Signed-off-by: Petros Angelatos <petrosagg@gmail.com>

Commit:1549c28
Author:Petros Angelatos
Committer:Jan Teske

storage: flatten status updates responses Signed-off-by: Petros Angelatos <petrosagg@gmail.com>

Commit:df971e7
Author:Petros Angelatos
Committer:Jan Teske

storage: flatten frontier upper responses Signed-off-by: Petros Angelatos <petrosagg@gmail.com>

Commit:1d42137
Author:Parker Timmerman
Committer:GitHub

[sql_server] add negative tests to ensure we don't allow some data types (#32249)
We can't support replication of certain datatypes, e.g. `text`, because they don't provide us with the "before" value when replicating. This used to work but I accidentally regressed the behavior in https://github.com/MaterializeInc/materialize/pull/32181. This PR re-adds the check to ensure all columns we're intending to replicate are supported. It also adds an mzcompose testdrive case to exercise these unsupported types.
### Motivation
Progress towards https://github.com/MaterializeInc/database-issues/issues/8762

Commit:c16a125
Author:Parker Timmerman
Committer:GitHub

[sql_server] Support `TEXT COLUMNS` and `EXCLUDE COLUMNS` (#32181)
This PR updates the SQL Server source to support excluding columns or decoding them as text. Specifically, it does the following:
* Updates `SqlServerColumnDesc` to support representing unknown columns. This allows users to replicate tables with a column type we don't recognize and specify said column in `EXCLUDE COLUMNS`.
* Updates `SqlServerRowDecoder` to support decoding from any type to text, and adds some limited testing.
* Updates parsing and purification to include `TEXT COLUMNS` and `EXCLUDE COLUMNS` in both the primary `CREATE SOURCE` statement (for round-tripping) and the generated subsource statements.
### Tips for reviewers
I would start with `sql-server-util/src/desc.rs`, that's where the bulk of the changes are. This file contains the types that we use to represent columns and tables from SQL Server.
### Motivation
Fixes a `TODO(sql_server1)` and adds a known feature. Progress towards https://github.com/MaterializeInc/database-issues/issues/8762

Commit:d37c5be
Author:Dov Alperin
Committer:GitHub

Password auth (#32131)
This is the bones of self managed password auth. This is missing integration tests and documentation which will come next. You can test this locally like so:
```shell
# this starts materialize with auth enabled on the external port but not the internal port
$ bin/environmentd --bazel --reset -- --all-features --unsafe-mode --enable-self-hosted-auth
# start a mz_system session, turn on sql support and create a user
$ psql -U mz_system -h localhost -p 6877 materialize
NOTICE: connected to Materialize v0.139.0-dev.0
  Org ID: 1bd2c405-c638-44cc-b917-6d05dfb832ac
  Region: local/az1
  User: mz_system
  Cluster: mz_system
  Database: materialize
  Schema: public
  Session UUID: fe6d2dbe-3a94-430d-82e1-5e79cb14b91f
Issue a SQL query to get started. Need help?
  View documentation: https://materialize.com/s/docs
  Join our Slack community: https://materialize.com/s/chat
psql (14.17 (Homebrew), server 9.5.0)
Type "help" for help.
materialize=> alter system set enable_self_managed_auth=true;
NOTICE: variable "enable_self_managed_auth" was updated for the system, this will have no effect on the current session
ALTER SYSTEM
materialize=> create role foo with superuser password 'bar';
CREATE ROLE
# Now connect over the port with auth enabled
$ psql -U foo -h localhost -p 6875 materialize
Password for user foo:
```
This begins implementing https://github.com/MaterializeInc/materialize/pull/32005

Commit:b8afb0b
Author:Jan Teske
Committer:Jan Teske

feature flag: enable_create_sockets_v2 This commit adds an `enable_create_sockets_v2` dyncfg and wires it through the `CreateTimely` command to the cluster.

Commit:49c017b
Author:Parker Timmerman
Committer:GitHub

[sql_server] Implement purification for SQL Server Source (#32121)
This PR implements "purification" in the Adapter for a SQL Server source. For folks unfamiliar, purification is a step during object creation where we query external systems for any necessary information, with the goal of making our persisted SQL "pure". During purification of a SQL Server source we do the following:
1. Ensure CDC is enabled for the current database
2. Ensure snapshot isolation is enabled for the current database
3. List all tables that currently have CDC enabled, and their capture instances
   * The specified capture instance for each subsource is then persisted in `PurifiedExportDetails`
Left as a TODO is implementing support for `CREATE TABLE ... FROM SQL SERVER`. Nothing blocks our implementation of this feature, but because it's not released yet I opted out of implementing it to keep the PR small.
### Motivation
Progress towards https://github.com/MaterializeInc/database-issues/issues/8762
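The three purification checks listed in the commit above can be sketched as a small, self-contained Python function. This is only an illustration under stated assumptions, not the Rust code in the Adapter: `UpstreamState`, `purify_sql_server_source`, and the in-memory dictionary of capture instances are hypothetical stand-ins for information that the real implementation obtains by querying SQL Server.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UpstreamState:
    """Hypothetical snapshot of what purification learns from the upstream database."""
    cdc_enabled: bool
    snapshot_isolation_enabled: bool
    # Maps table name -> capture instance, only for tables with CDC enabled.
    capture_instances: Dict[str, str] = field(default_factory=dict)


def purify_sql_server_source(state: UpstreamState, tables: List[str]) -> Dict[str, str]:
    """Run the three checks and return the capture instance to persist per table."""
    # Step 1: CDC must be enabled for the current database.
    if not state.cdc_enabled:
        raise ValueError("CDC is not enabled for the current database")
    # Step 2: snapshot isolation must be enabled for the current database.
    if not state.snapshot_isolation_enabled:
        raise ValueError("snapshot isolation is not enabled for the current database")
    # Step 3: every requested table needs a known capture instance.
    missing = [t for t in tables if t not in state.capture_instances]
    if missing:
        raise ValueError(f"tables without CDC enabled: {missing}")
    return {t: state.capture_instances[t] for t in tables}
```

Failing early with a descriptive error mirrors the stated goal of purification: anything that depends on the external system is resolved once, at object creation time, so that the persisted definition no longer depends on it.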

Commit:cabd31b
Author:Parker Timmerman
Committer:GitHub

[sql_server] Create all `*Source` related types, support `CREATE CONNECTION ... TO SQL SERVER` (#32087)
This PR adds all of the various SQL Server 'Source' and 'Connection' types, and leaves the implementations of Purification and Source Rendering as TODOs. Given the number of new types required, and files that need to be touched, I figured it was easier to make this into its own PR.
The two main types this PR adds are:
1. `SqlServerSource` (analogous to `PostgresSourceConnection`)
   * Implements the `SourceConnection`, `AlterCompatible`, and `SourceRender` traits.
2. `SqlServerConnectionDetails` (analogous to `PostgresConnection`)
   * Implements `IntoInlineConnection` and `AlterCompatible`.
Note: I slightly differed from the naming convention here because to me `PostgresSourceConnection` and `PostgresConnection` were similar enough that I kept getting the two confused.
We also add several types related to these two:
1. `SqlServerSource`
   * `SqlServerSourceExtras` - like the MySQL source, this type is empty but included to conform with other sources.
   * `SqlServerSourceExportDetails` - details of the specific upstream tables we'll be replicating.
   * `GenericSourceConnection::SqlServer` variant - enum wrapper around all `SOURCE` types.
2. `SqlServerConnectionDetails`
   * `Connection::SqlServer` variant - enum wrapper around all `CONNECTION` types.
### Feature Changes
Everything in this PR is still gated behind the `enable_sql_server_source` flag. When this flag is enabled, the only feature related change is `CREATE CONNECTION ... TO SQL SERVER` is now supported! Otherwise purification in the Adapter and implementation of the `SourceRender` trait are left as a TODO which I will include in a follow-up PR.
### Testing
This PR also adds two tests to exercise the new connection behavior, and will eventually be used to exercise CDC:
1. `tests/sql-server-cdc` - testdrive based mzcompose workflows, asserts creating and validating a `CONNECTION` is successful
2. `tests/platform-checks` - a new `SqlServerCdc` check is added to ensure creating and dropping connections works across restarts and upgrades
### Motivation
Progress towards https://github.com/MaterializeInc/database-issues/issues/8762
### Tips for reviewer
Overall a lot of this is just plumbing and cargo culting what already exists for Postgres and MySQL. I would focus on the changes in the `storage-types` crate, that's where the most net new code is.
Co-authored-by: Dennis Felsing <dennis@felsing.org>

Commit:d437817
Author:Moritz Hoffmann
Committer:GitHub

Simplify regexp replace handling (#32053)
Simplify regexp replace handling by using a literal error. Instead of encapsulating a result, replace the function call by a literal error. The hope is that this is semantically equivalent to the existing implementation.
Part of MaterializeInc/database-issues#9138
Signed-off-by: Moritz Hoffmann <mh@materialize.com>

Commit:22b8308
Author:Peter Travers
Committer:GitHub

Merge pull request #32045 from ptravers/pt/7101 Add string_to_array function

Commit:be051a5
Author:Peter Travers
Committer:Peter Travers

Add string_to_array function

Commit:de4665a
Author:Moritz Hoffmann
Committer:GitHub

Turn jsonb stringify functions into separate functions (#32055) Some of the jsonb functions can be used in contexts where the output is either jsonb or strings. Previously, we'd encode this as part of the function call by supplying a `bool` field. This is an abstraction leak and instead it would be better to have separate functions. This change absorbs the `stringify` parameter into different functions. No behavior change expected. Signed-off-by: Moritz Hoffmann <mh@materialize.com>
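The refactor described in the commit above, replacing a `bool` parameter with separate functions, can be illustrated with a minimal sketch. It is written in Python rather than Materialize's Rust, and all three function names are hypothetical; they only show the shape of the change, not the actual jsonb functions touched by the commit.

```python
import json
from typing import Any, Optional


def jsonb_get_flagged(doc: dict, key: str, stringify: bool) -> Any:
    """Before: one function whose output type depends on a bool flag."""
    value = doc.get(key)
    if value is None:
        return None
    if stringify:
        # Strings are returned unquoted; everything else is serialized.
        return value if isinstance(value, str) else json.dumps(value)
    return value


def jsonb_get(doc: dict, key: str) -> Any:
    """After: this variant always returns a JSON value."""
    return doc.get(key)


def jsonb_get_stringify(doc: dict, key: str) -> Optional[str]:
    """After: this variant always returns text (or None when the key is absent)."""
    value = jsonb_get(doc, key)
    if value is None:
        return None
    return value if isinstance(value, str) else json.dumps(value)
```

With the split, each function has a single output type, so callers no longer need to inspect a flag to know what they get back, while the behavior of each former flag value is preserved.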

Commit:d38c022
Author:Parker Timmerman
Committer:GitHub

[sql_server] Add mapping from SQL Server data types to Materialize data types (#32040)
This PR adds types to describe tables and columns from MS SQL Server and how data will map into Materialize. On each new type I tried to document how it fits into the flow of a theoretical SQL Server source, but reviewers please feel free to push back if something isn't obvious. Also, the module comment in `sql-server-util/src/desc.rs` should (hopefully) add some helpful context.
The specific mappings from SQL Server datatypes into Materialize are described [here](https://www.notion.so/SQL-Server-Prototype-1a413f48d37b802fb08ddfa0e139b3c3?pvs=4#1a413f48d37b80c5841ee43478537f79); this PR follows that spec. For testing, this PR relies on some test helpers I added in https://github.com/MaterializeInc/tiberius/pull/1.
Throughout the code you'll see a few comments like `TODO(sql_serverX)`. The idea here is that X defines the milestone by which these need to be solved, roughly:
* `TODO(sql_server1)`: Before private preview
* `TODO(sql_server2)`: Before GA
* `TODO(sql_server3)`: Backlog
### Motivation
Progress towards https://github.com/MaterializeInc/database-issues/issues/8762

Commit:d34391a
Author:Dov Alperin
Committer:GitHub

Implement regexp_matches function (#31935)
Closes https://github.com/MaterializeInc/database-issues/issues/7096

Commit:8cb4f89
Author:Dov Alperin
Committer:GitHub

Implement the postgres 'reverse' func (#31928)
Closes https://github.com/MaterializeInc/database-issues/issues/7093

Commit:90346e1
Author:Parker Timmerman
Committer:GitHub

[copy_from] Support specifying a column map in `COPY ... FROM <url>` (#31580) This PR adds support for specifying a column mapping and filling in default values for `COPY ... FROM <url>` (aka `OneshotSource`). Let's say we ran `CREATE TABLE t1 (c1 text, c2 text, c3 text DEFAULT 'apple')`, so our table has 3 columns, but the file we're copying into Materialize only has 2 columns. After this PR you could run `COPY INTO t1 (c2, c1) FROM <url>` and it would map the first column of the data to `t1.c2`, the second column to `t1.c1`, and it would fill in the default value of `'apple'` for all values in `t1.c3`. We do this by planning a `MapFilterProject`, converting it into a `SafeMfpPlan`, and then in `clusterd` when decoding data evaluating the `SafeMfpPlan` for every row. ### Motivation Fixes https://github.com/MaterializeInc/database-issues/issues/8858 ### Tips for reviewer There are no tests in this PR, they are included as part of https://github.com/MaterializeInc/materialize/pull/31581 ### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. 
- [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.
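The column mapping described in this commit can be sketched as follows. This is a plain-Python illustration, not Materialize's Rust `MapFilterProject`/`SafeMfpPlan` code; the function `apply_column_map` and its parameters are hypothetical names introduced only for this example.

```python
from typing import Any, Dict, List, Sequence


def apply_column_map(
    table_columns: Sequence[str],
    copy_columns: Sequence[str],
    defaults: Dict[str, Any],
    source_row: Sequence[Any],
) -> List[Any]:
    """Reorder one decoded source row into table column order, filling defaults."""
    if len(source_row) != len(copy_columns):
        raise ValueError("source row width does not match the COPY column list")
    # Position of each named column within the incoming row.
    position = {name: i for i, name in enumerate(copy_columns)}
    out = []
    for name in table_columns:
        if name in position:
            out.append(source_row[position[name]])
        else:
            # Column not listed in the COPY statement: use its default
            # (None stands in for NULL when no default is declared).
            out.append(defaults.get(name))
    return out


# Mirrors the example in the commit message: t1 (c1, c2, c3 DEFAULT 'apple')
# loaded with COPY INTO t1 (c2, c1) from a two-column file.
row = apply_column_map(
    table_columns=["c1", "c2", "c3"],
    copy_columns=["c2", "c1"],
    defaults={"c3": "apple"},
    source_row=["first", "second"],
)
print(row)  # ['second', 'first', 'apple']
```

The first file column lands in `c2`, the second in `c1`, and `c3` receives its default, which matches the behavior the commit message describes.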

Commit:0776196
Author:Aljoscha Krettek
Committer:Aljoscha Krettek

storage: add replica_id to source/sink status updates absorb add replica_id column

Commit:8d3abab
Author:Marty Kulma
Committer:GitHub

Add IAM authentication support for MySQL connections (#31689) This adds the ability for users to configure IAM authentication instead of a password when using RDS MySQL. This change adds syntax to `CREATE CONNECTION` for mysql that will accept `AWS CONNECTION`. `AWS CONNECTION` is mutually exclusive with the options 1. `PASSWORD` 2. `SSL MODE 'disabled'` or not specified ### Motivation fixes https://github.com/MaterializeInc/database-issues/issues/6946 ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.

Commit:2ca7726
Author:Moritz Hoffmann
Committer:GitHub

Logical per-dataflow backpressure (#31553) Logical back pressure for dataflows. This PR implements logical backpressure for dataflows. Logical backpressure is a mechanism that holds back time (not data), and reveals time stepwise as the downstream dataflow makes progress. The implementation is conservative in what times it holds back. We cannot hold back all times because joins release resources as time ticks forward, and holding back time longer than necessary can cause regressions against current state. For this reason, we only hold back times until the input's write frontier (upper) at the time of creating the dataflow, but not beyond. This will allow us to apply logical backpressure while a dataflow is catching up, but it will not hold back any times in steady-state. It is important that we do not hold back times in steady-state to ensure we permit maximum compaction of arrangements that we use in a dataflow. Holding back times before the current input's upper can result in slightly less compaction, but we know that the inputs are available up to that point and should soon advance the dataflow's output to the upper of the inputs. ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.
--------- Signed-off-by: Moritz Hoffmann <mh@materialize.com>
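The rule "hold back times only up to the input's upper at dataflow creation" can be illustrated with a small sketch. It assumes totally ordered integer timestamps and a hypothetical `step` parameter controlling how far ahead of the downstream frontier time is revealed; neither the function name nor the parameter comes from the PR, and the real implementation operates on Timely frontiers in Rust.

```python
def revealed_frontier(
    input_frontier: int,
    downstream_frontier: int,
    creation_upper: int,
    step: int,
) -> int:
    """Return the frontier that is revealed to the downstream dataflow.

    While the dataflow is catching up (below `creation_upper`), time is
    revealed at most `step` ahead of downstream progress. Once the revealed
    limit reaches `creation_upper`, nothing is held back any more.
    """
    limit = downstream_frontier + step
    if limit >= creation_upper:
        # Steady state: never hold back times at or beyond the creation upper.
        return input_frontier
    # Catch-up phase: reveal time stepwise, but never beyond the input itself.
    return min(input_frontier, limit)


# Catching up: input is far ahead, downstream has made no progress yet.
print(revealed_frontier(input_frontier=100, downstream_frontier=0,
                        creation_upper=50, step=10))   # 10
# Nearly caught up: the limit reaches the creation upper, so time flows freely.
print(revealed_frontier(input_frontier=100, downstream_frontier=45,
                        creation_upper=50, step=10))   # 100
```

The second call shows why steady-state compaction is unaffected under these assumptions: once downstream progress is within one step of the creation-time upper, the input frontier passes through unchanged.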

Commit:ca11a6c
Author:Moritz Hoffmann
Committer:GitHub

Update Timely/Differential; zero-copy mode (#31480) Update Timely and Differential to their latest versions. This means removing the last holdouts of flatcontainer, and changing where we get columnation-related types from. The big change is that we can enable the Timely zero-copy allocator if we want to. It is disabled by default. ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post. --------- Signed-off-by: Moritz Hoffmann <mh@materialize.com>

Commit:2ffda12
Author:Ben Kirwin
Committer:Dennis Felsing

Add a new batch format variant

Commit:055ba87
Author:Marty Kulma
Committer:GitHub

Remove obsolete sink partition strategy (#31463) Refactors Kafka sink to only use v1 strategy for publishing. This starts to remove the concept of a partitioning strategy version, which was used for migration from old (v0) to new (v1). This is phase 1: - retain ability to parse CREATE SINK ... WITH (PARTITION STRATEGY ...) statements - default to v1 behavior regardless of PARTITION STRATEGY - add a migration to remove partition strategy on upgrade Phase 2 will remove the migration code and the remaining bits. ### Motivation Removing legacy code, issue https://github.com/MaterializeInc/database-issues/issues/7822 ### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. 
--> - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.

Commit:baf05bc
Author:Parker Timmerman
Committer:GitHub

[copy_from] Proper cancelation via `CancelOneshotIngestion` message (#31136) This PR fixes the `TODO(cf1)` related to canceling oneshot ingestions. It adds a `StorageCommand::CancelOneshotIngestion` that `reduce`s/compacts away a corresponding `StorageCommand::RunOneshotIngestion`, much like `ComputeCommand::Peek` and `ComputeCommand::CancelPeek`. We send a `StorageCommand::CancelOneshotIngestion` whenever a user has canceled a `COPY FROM` statement, but also the storage controller will send one whenever a `RunOneshotIngestion` command completes. ### Motivation Fix `TODO(cf1)` related to cancelation ### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.
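The way a cancel command compacts away its matching run command can be sketched as a history reduction. The tuple representation and the function `reduce_history` are hypothetical stand-ins for the Rust `StorageCommand` enum, and dropping the cancel command itself after reduction is an assumption made for this example.

```python
from typing import List, Tuple

# (command kind, ingestion id): a hypothetical stand-in for StorageCommand.
Command = Tuple[str, int]


def reduce_history(history: List[Command]) -> List[Command]:
    """Drop every RunOneshotIngestion that has a matching cancel command."""
    canceled = {cid for kind, cid in history if kind == "CancelOneshotIngestion"}
    reduced = []
    for kind, cid in history:
        if kind == "RunOneshotIngestion" and cid in canceled:
            continue  # compacted away by its cancel command
        if kind == "CancelOneshotIngestion":
            continue  # assumption: the cancel itself is not retained either
        reduced.append((kind, cid))
    return reduced


history = [
    ("RunOneshotIngestion", 1),
    ("RunIngestion", 7),
    ("RunOneshotIngestion", 2),
    ("CancelOneshotIngestion", 1),
]
print(reduce_history(history))
# [('RunIngestion', 7), ('RunOneshotIngestion', 2)]
```

Because the controller also sends a cancel when a oneshot ingestion completes, a reduction like this keeps finished ingestions from being replayed to a restarted replica.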

Commit:b6570e1
Author:Ben Kirwin
Committer:Ben Kirwin

Add collection metadata for the sink's output collection

Commit:3910676
Author:Marty Kulma
Committer:GitHub

mysql: add connection timeout (#31280) Wraps `mysql_async::conn::Conn::new()` in a timeout, which is 60s by default, and can be overridden via `mysql_source_connect_timeout` configuration. ### Motivation This PR adds mysql connection timeout that (somewhat) matches up with the postgresql client - https://github.com/MaterializeInc/database-issues/issues/8295 ### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post. --------- Co-authored-by: Dennis Felsing <dennis@felsing.org>
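Wrapping a connection attempt in a timeout follows a common pattern, sketched here with Python's `asyncio.wait_for`. This is not the `mysql_async` code from the PR; `connect_with_timeout`, `fast_connect`, and `slow_connect` are hypothetical names, and only the 60-second default is taken from the commit message.

```python
import asyncio


async def connect_with_timeout(connect, timeout_secs: float = 60.0):
    """Await `connect()` but give up after `timeout_secs` seconds."""
    try:
        return await asyncio.wait_for(connect(), timeout=timeout_secs)
    except asyncio.TimeoutError:
        raise ConnectionError(
            f"connection attempt timed out after {timeout_secs}s"
        ) from None


async def fast_connect():
    # Stand-in for a server that answers immediately.
    return "connected"


async def slow_connect():
    # Stand-in for a server that is slow to answer.
    await asyncio.sleep(1.0)
    return "connected"


print(asyncio.run(connect_with_timeout(fast_connect, timeout_secs=0.5)))
try:
    asyncio.run(connect_with_timeout(slow_connect, timeout_secs=0.01))
except ConnectionError as err:
    print("failed:", err)
```

Making the timeout a parameter mirrors the PR's choice of exposing it through the `mysql_source_connect_timeout` configuration rather than hard-coding it.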

Commit:ba1b32c
Author:Petros Angelatos
Committer:GitHub

Merge pull request #31245 from petrosagg/flatten-dropped-ids storage: flatten dropped ids responses

Commit:04469bf
Author:Petros Angelatos
Committer:Petros Angelatos

storage: flatten dropped ids responses We want to eventually flatten all commands and responses. This performs the first step of that. Signed-off-by: Petros Angelatos <petrosagg@gmail.com>

Commit:ac3f564
Author:Parker Timmerman
Committer:GitHub

[copy_from] Support the Parquet format (#31173) _Stacked on top of_ https://github.com/MaterializeInc/materialize/pull/31144 This PR adds a new implementation of a `OneshotFormat` that supports reading in Parquet files. The decoding is built on top of the `ArrowReader` implemented in https://github.com/MaterializeInc/materialize/pull/30958. The strategy we use for reading and decoding Parquet files is that the "split work" stage of a oneshot source reads the footer metadata from a Parquet file to determine the [Row Group](https://parquet.apache.org/docs/concepts/) boundaries. The Row Groups are then distributed among Timely workers for fetching and eventual decoding. Note: Through experimentation I found that Row Groups seem to typically be 10s of MB large, which makes them a pretty good unit of parallelization. ### Motivation Fixes https://github.com/MaterializeInc/database-issues/issues/8853 ### Tips for reviewer Review only the final commit, the one titled "start, support Parquet for COPY FROM" ### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. 
- [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.
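Distributing Row Groups among workers can be sketched as follows. The commit message does not say how Row Groups are assigned, so the round-robin policy here is an assumption, and `assign_row_groups` is a hypothetical helper rather than anything in the Materialize codebase.

```python
from typing import Dict, List


def assign_row_groups(num_row_groups: int, num_workers: int) -> Dict[int, List[int]]:
    """Assign Row Group indexes to workers round-robin (assumed policy)."""
    if num_workers <= 0:
        raise ValueError("need at least one worker")
    assignment: Dict[int, List[int]] = {w: [] for w in range(num_workers)}
    for row_group in range(num_row_groups):
        assignment[row_group % num_workers].append(row_group)
    return assignment


# Five Row Groups read from a file footer, split across two workers.
print(assign_row_groups(num_row_groups=5, num_workers=2))
# {0: [0, 2, 4], 1: [1, 3]}
```

Since Row Groups are reported to be tens of MB each, handing out whole Row Groups keeps each unit of work large enough to amortize the fetch while still spreading a single file across workers.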

Commit:ab43e64
Author:Parker Timmerman
Committer:GitHub

[copy_from] AWS source (#31144) _Stacked on top of_: https://github.com/MaterializeInc/materialize/pull/30956 This PR implements a new AwsS3 `OneshotSource` that allows copying in files from S3, e.g.
```
COPY INTO my_table FROM 's3://my-test-bucket' (FORMAT CSV, FILES = ['important.csv']);
```
Along with `FILES = [<files>]` we also support a `PATTERN = <glob>` option which allows copying multiple files all at once. ### Motivation Fixes https://github.com/MaterializeInc/database-issues/issues/8860 Fixes https://github.com/MaterializeInc/database-issues/issues/8855 ### Tips for reviewer Review only the final commit, the one titled "start, implementation of an S3 oneshot source" ### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. 
--> - [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.
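The difference between the `FILES` and `PATTERN` options can be illustrated by filtering a bucket listing. The sketch uses Python's `fnmatch` as a stand-in for glob matching; Materialize's actual glob semantics may differ, and `select_keys` plus the sample object keys are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List, Optional, Sequence


def select_keys(
    keys: Sequence[str],
    files: Optional[Sequence[str]] = None,
    pattern: Optional[str] = None,
) -> List[str]:
    """Pick the object keys to copy, by explicit name list and/or glob."""
    selected = list(keys)
    if files is not None:
        wanted = set(files)
        selected = [k for k in selected if k in wanted]
    if pattern is not None:
        # Assumption: shell-style matching approximates PATTERN = <glob>.
        selected = [k for k in selected if fnmatchcase(k, pattern)]
    return selected


bucket = ["important.csv", "notes.txt", "other.csv"]
print(select_keys(bucket, files=["important.csv"]))  # ['important.csv']
print(select_keys(bucket, pattern="*.csv"))          # ['important.csv', 'other.csv']
```

An explicit list suits a known, small set of objects, while a glob lets one statement pick up many files at once, which is the use case the commit message names for `PATTERN`.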

Commit:a759ee5
Author:Parker Timmerman
Committer:GitHub

[copy_from]: Flesh out implementation of CSV format (#30956) This PR makes the CSV implementation for bulk imports "feature complete". It adds support for specifying things like the delimiter and escape character, as well as support for handling compressed CSV files. ### Motivation Fixes https://github.com/MaterializeInc/database-issues/issues/8902 ### Tips for reviewer While the changes exist in `storage-*` crates, they are more general async-Rust changes and nothing necessarily specific to storage itself. ### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.

Commit:59ba5f2
Author:Ben Kirwin
Committer:GitHub

Merge pull request #31080 from bkirwi/structured-on-disk-2 [persist] Structured file format

Commit:af7048a
Author:Ben Kirwin
Committer:Ben Kirwin

Remove unused per-sink status collection id Status is no longer reported by the sink itself, so the id is no longer needed at the protocol level.

Commit:f3ee8dd
Author:Joseph Koshakow
Committer:GitHub

storage: Export primary ingestion collection (#30991) Previously, when the `force_source_table_syntax` flag was enabled, the primary source ingestion collection was not included in the source exports. This would cause the primary source ingestion collection's upper and since to be stuck at 0, and it would break some existing code. This commit always includes the primary ingestion collection in the source exports. However, when the `force_source_table_syntax` flag is enabled, the source export details are set to `SourceExportDetails::None`. The result is that all source types with the flag enabled behave similarly to how multi-output sources behave with the flag disabled with regard to the primary ingestion collection. Specifically, their uppers and sinces move forward in time and querying them returns an empty result. A downside of this commit is that a source ingestion is always scheduled with the flag enabled, even if there are no table exports. Previously, they would only be scheduled if there were table exports. Works towards resolving #MaterializeInc/database-issues/issues/8620

Commit:076aa7f
Author:Jan Teske
Committer:Jan Teske

storage: don't require backward compat for parameters.proto The protobuf types in this file are only used in the storage protocol and never used across Mz versions.

Commit:a27b29a
Author:Jan Teske
Committer:Jan Teske

storage: make KAFKA_METADATA_FETCH_INTERVAL a dyncfg This commit removes the `KAFKA_DEFAULT_METADATA_FETCH_INTERVAL` legacy config and replaces it with a `KAFKA_METADATA_FETCH_INTERVAL` dyncfg. Like `{PG,MYSQL}_OFFSET_KNOWN_INTERVAL`, this makes it possible to dynamically observe updates to this interval through a `ConfigSet`, apart from removing some boilerplate. The flag also loses its "default" part and is now the only thing determining the probe ticker interval. This is the first step in removing the user-visible option to configure the Kafka source metadata fetch interval and is done here because it simplifies dynamically updating the probe ticker interval in the next commit.

Commit:d8352f6
Author:Ben Kirwin
Committer:Ben Kirwin

Add a new batch format variant

Commit:023af02
Author:Parker Timmerman
Committer:Gabor Gevay

adapter/storage: Cast MySQL `bit` columns to `uint8`, add convenience functions (#31097) This PR changes the MySQL source to support ingesting the `bit` type as `uint8`. It also adds two new Postgres functions, `bit_count(bytea)` and `get_bit(bytea, int32)`, to make working with byte strings easier. ### Motivation Progress towards: https://github.com/MaterializeInc/database-issues/issues/8891 ### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post. --------- Co-authored-by: Dennis Felsing <dennis@felsing.org>

Commit:925f577
Author:Parker Timmerman
Committer:GitHub

adapter/storage: Cast MySQL `bit` columns to `uint8`, add convenience functions (#31097) This PR changes the MySQL source to support ingesting the `bit` type as `uint8`. It also adds two new Postgres functions, `bit_count(bytea)` and `get_bit(bytea, int32)`, to make working with byte strings easier. ### Motivation Progress towards: https://github.com/MaterializeInc/database-issues/issues/8891 ### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post. --------- Co-authored-by: Dennis Felsing <dennis@felsing.org>

Commit:f58b528
Author:Parker Timmerman
Committer:GitHub

[copy_from]: Initial implementation, add `OneshotSource` and `OneshotFormat`, support appending Batches to Tables (#30942) This PR is an initial implementation of `COPY ... FROM <url>`, aka "COPY FROM S3". **Goals for this PR** Note: traditionally we may have written a design doc for this feature, but I would instead like to try a lighter weight approach where we specifically make a decision on the core changes necessary for this feature, and later record those in a Decision Log. Those are: 1. How do we handle appending large amounts of data to a Table? * This PR has implemented this via creating Persist Batches in `clusterd` and then handing them back to `environmentd` for final linking into the Persist shard, **these changes are in the 3rd commit**. A different idea would be to implement "renditions" in Persist, but that is a _much_ larger change. 2. Should this "oneshot ingestion" live in "storage" or "compute"? * I added this implementation to "storage" because it seemed easier to do. 3. (really 2a) Do the current changes to the Storage Controller API/Protocol make sense? * The Storage Controller already has lots of responsibilities, I want to make sure that changes I made resonate with folks and fit into any "north star"-ish visions we have, **these changes are in the 2nd commit**. * Specifically, I could see wanting to fold `StorageCommand::RunOneshotIngestion` into `StorageCommand::RunIngestion`, but given a oneshot ingestion is ephemeral and shouldn't be restarted when `clusterd` crashes, keeping them separate seemed reasonable. **What this PR implements** * A framework for "oneshot sources" and formats via two new traits, `OneshotSource` and `OneshotFormat`. * There is a doc comment on `src/storage-operators/src/oneshot_source.rs` that should explain how these traits are used. If that comment is not clear please let me know! 
* Changes to the Storage protocol for creating a "oneshot ingestion", and having it async respond to the Coordinator/environmentd with `ProtoBatch`s that can be linked into a Table. * Changes to `txn-wal` and the Coordinator to support appending `ProtoBatch`s to a Table instead of just `Vec<Row>`. * Parsing, Planning, and Sequencing changes to support `COPY ... FROM <url>` **Feature Gating** The `COPY ... FROM <url>` feature is currently gated behind a LaunchDarkly flag called `enable_copy_from_remote`, so _all_ of the Storage related code is not reachable, unless this flag is turned on. Only the SQL parser and Table appending changes are reachable without the flag. ### Motivation Progress towards https://github.com/MaterializeInc/database-issues/issues/6575 ### Tips for reviewer I did my best to split this PR into logically separate commits to make them easier to review: 1. Initial implementation of "oneshot ingestions". This commit defines the `OneshotSource` and `OneshotFormat` traits, and implements the dataflow rendering for a "oneshot ingestion". 2. Changes to the Storage Controller to support rendering and sending results of a "oneshot ingestion". ⭐ 3. Changes to the Coordinator and `txn-wal` to support appending Batches to tables. ⭐ 4. Parsing, Planning, and Sequencing changes in the Adapter. 5. Formatting, Linting, etc. ### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. 
--> - [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.

Commit:929ea2b
Author:Parker Timmerman
Committer:GitHub

catalog: move protobufs to new crate (#30781) AFAICT compiling the `mz_catalog` crate takes so long because we generate a lot of code for the different `object_vX.proto` files, and then that generated code uses a lot of proc-macros. This is unfortunate because changes to the catalog crate rarely touch the protobuf files, so it's not great having to pay for this extra compile time. This PR is an MVP for moving the protobuf files to a new `mz_catalog_protos` crate so we don't need to recompile the protobuf definitions every time we make a change to the catalog. Locally I saw these improvements when running `cargo check`:

kind | before | after
-----|--------|------
incremental | 8s | 1s
full | 23s | 5s

While this does result in a compile time improvement, it does add some boilerplate to an already boilerplate laden workflow. ### Motivation Improve compile times ### Tips for reviewer <!-- Leave some tips for your reviewer, like: * The diff is much smaller if viewed with whitespace hidden. * [Some function/module/file] deserves extra attention. * [Some function/module/file] is pure code movement and only needs a skim. Delete this section if no tips. --> ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label.
- [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.

Commit:112a594
Author:Frank McSherry
Committer:GitHub

Rationalize type of `(key, val)` to `row` mapping (#27027) We use a `BTreeMap<usize, usize>` for the "permutation" from concatenated `[key, val]` columns, when it appears it can always be a `Vec<usize>`: each output column must identify an input column. This also hints that this isn't really a permutation as much as a projection. More generally both of these could be `Vec<MirScalarExpr>` if we ever plan to be so bold (and perhaps this is an opportunity to abstract the specifics away, so that everyone's signatures don't change in the future). For example, various `permute` functions could/should be `pre-mfp` where you hand over perhaps a `Vec<MirScalarExpr>` that *could* be column references, or more general expressions. E.g. if we form a key using a cast, we cannot represent the operation that just un-casts it (in some cases) and that would be good enough vs keeping the un-cast column around as a value. ### Motivation <!-- Which of the following best describes the motivation behind this PR? * This PR fixes a recognized bug. [Ensure issue is linked somewhere.] * This PR adds a known-desirable feature. [Ensure issue is linked somewhere.] * This PR fixes a previously unreported bug. [Describe the bug in detail, as if you were filing a bug report.] * This PR adds a feature that has not yet been specified. [Write a brief specification for the feature, including justification for its inclusion in Materialize, as if you were writing the original feature specification.] * This PR refactors existing code. [Describe what was wrong with the existing code, if it is not obvious.] --> ### Tips for reviewer <!-- Leave some tips for your reviewer, like: * The diff is much smaller if viewed with whitespace hidden. * [Some function/module/file] deserves extra attention. * [Some function/module/file] is pure code movement and only needs a skim. Delete this section if no tips. --> ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. 
([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] This PR includes the following [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note): - <!-- Add release notes here or explicitly state that there are no user-facing behavior changes. --> --------- Signed-off-by: Moritz Hoffmann <mh@materialize.com> Co-authored-by: Moritz Hoffmann <mh@materialize.com>
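To make the `BTreeMap<usize, usize>` versus `Vec<usize>` point concrete, here is a minimal sketch of a projection stored as a vector, where output column `i` reads input column `projection[i]` of the concatenated `[key, val]` columns. The function names `project` and `map_to_vec`, and the direction of the map (output column to input column), are assumptions made for illustration and are not taken from the Materialize codebase.

```rust
use std::collections::BTreeMap;

/// Applies a projection: output column `i` is a copy of input column
/// `projection[i]`. Columns may be repeated or dropped, which is why this
/// is a projection rather than a permutation.
fn project<T: Clone>(input: &[T], projection: &[usize]) -> Vec<T> {
    projection.iter().map(|&src| input[src].clone()).collect()
}

/// Converts the map form (assumed here: output column -> input column) into
/// the dense vector form. Returns `None` if the map's keys are not exactly
/// `0..n`, i.e. if some output column fails to identify an input column.
fn map_to_vec(map: &BTreeMap<usize, usize>) -> Option<Vec<usize>> {
    map.iter()
        .enumerate()
        .map(|(expected, (&out, &src))| (out == expected).then_some(src))
        .collect()
}
```

Because every output column has to name an input column, the keys of the map form are always `0..n`, so the vector carries the same information with less bookkeeping.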

Commit:9375fd3
Author:Parker Timmerman
Committer:GitHub

persist: Stabilize Schema Evolution (#30205) Requires https://github.com/MaterializeInc/materialize/pull/30725 to be merged. This PR stabilizes Schema Evolution in Persist which unblocks `ALTER TABLE` work. There are a few changes in this one PR, all geared around handling the slight instability we have with the nullability of columns in Materialized Views. 1. Internally in Persist, all columns are marked as nullable at the Arrow/Parquet level. 2. A one-time migration of durably persisted `arrow::DataType`s in Persist's Schema Registry that allows them to be more nullable to account for the changes from [1]. 3. Deprecation of the existing `SchemaId`s in `Part` and `Run` metadata. We do this by renaming the existing `schema_id` fields to `deprecated_schema_id` and introducing a `schema_id` field with a new tag. Also a dyncfg is added so we can turn off writing to the new `schema_id` field until we're confident all nodes have rolled to a new version. 4. During bootstrapping of the Coordinator, if the nullability of columns for a MatView has changed, we compare and evolve the new schema in Persist. 5. Changed the upgrade check in `catalog-debug` to validate that the `RelationDesc`s as planned by the new version of MZ are compatible with the ones durably recorded in Persist. Also included in this PR are two feature gate changes to disable new features in our tests until > `v0.126`. 1. Builtin Continual Tasks 2. 0dt Enabled Sources Both of these features cause the new version of MZ to issue writes during a 0dt upgrade when it's supposed to be in read-only mode. Because this PR changes the Arrow datatypes in a non-forward compatible way, this causes the 0dt tests to fail since the old version panics when it sees the new batches. ### Motivation Fixes https://github.com/MaterializeInc/database-issues/issues/8660 ### Tips for reviewer All of these changes have been made in separate commits of the PR which ideally makes reviewing easier.
### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.

Commit:b661761
Author:Petros Angelatos
Committer:GitHub

Merge pull request #30710 from petrosagg/delete-legacy-error-handling delete legacy upsert error handling

Commit:2a5ae9a
Author:Petros Angelatos

storage: delete status shard from metadata collection Statuses are now populated by the controller so that field was dead code Signed-off-by: Petros Angelatos <petrosagg@gmail.com>

Commit:959cb9f
Author:Petros Angelatos
Committer:Petros Angelatos

delete legacy upsert error handling This PR removes code that was added during incident 72 (a long time ago). The incident was caused because we introduced a backward incompatible proto serialization of certain upsert errors. We then added the code removed in this PR that both handled the incompatibility and also issued corrections so that the correct serialization ends up in the shards. By now all such legacy error instances have been compacted away and so this code is not needed anymore. Signed-off-by: Petros Angelatos <petrosagg@gmail.com>

Commit:14cc787
Author:Joseph Koshakow
Committer:GitHub

adapter: Create deterministic log index IDs (#30603) This commit adds a new variant of `GlobalId` and `CatalogItemId` for introspection source indexes. The values of these IDs are deterministically derived from the cluster ID and the log variant. Introspection source indexes are a special edge case of items. They are considered system items, but they are the only system item that can be created by the user at any time. All other system items can only be created by the system during the startup of an upgrade. Previously, it was possible to allocate the same System ID to two different objects if something like the following happened: 1. Materialize version `v` is running in read-write mode. 2. Materialize version `v + 1` starts in read-only mode. 3. The next system item ID is `s`. 4. `v + 1` allocates `s` for a new system item (table, view, introspection source, etc.) 5. `v` creates a new user cluster and allocates `s` through `s + n` to the introspection source indexes in that cluster. At this point we have two separate objects with the same Global ID, which is bad. 6. `v + 1` reboots in read-write mode and allocates `s + n + 1` to the new system item. At this point the new system item has received two different IDs, which is also bad. Putting introspection source index IDs in their own namespace and making them deterministic removes this issue and ones like it. Fixes #MaterializeInc/database-issues/issues/8731
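The idea of deriving an ID deterministically from the cluster ID and the log variant can be sketched as a bit-packing function. This is only an illustration: the `ClusterId` enum, the function name `introspection_index_id`, and the bit layout (8-bit cluster kind, 48-bit cluster number, 8-bit log variant) are all assumptions chosen for the example, not the encoding the commit actually uses.

```rust
/// Hypothetical stand-in for a cluster identifier.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ClusterId {
    System(u64),
    User(u64),
}

/// Packs (cluster kind, cluster number, log variant) into one `u64`.
/// Assumed layout: [8 bits kind][48 bits cluster number][8 bits variant].
/// Returns `None` if the cluster number does not fit in 48 bits.
fn introspection_index_id(cluster: ClusterId, log_variant: u8) -> Option<u64> {
    let (kind, number) = match cluster {
        ClusterId::System(n) => (1u64, n),
        ClusterId::User(n) => (2u64, n),
    };
    if number >= (1u64 << 48) {
        return None;
    }
    Some((kind << 56) | (number << 8) | u64::from(log_variant))
}
```

Since the result depends only on its inputs, two processes (for example version `v` and version `v + 1` in the scenario above) compute the same ID for the same index without consulting a shared allocator, and distinct (cluster, variant) pairs never collide.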

Commit:6a70b85
Author:Jan Teske
Committer:Jan Teske

compute: rename flat_plan module This commit renames the `flat_plan` module to `render_plan`, following the rename from `FlatPlan` to `RenderPlan` in the previous commit.

Commit:4496b26
Author:Jan Teske
Committer:Jan Teske

compute: clean up around render plan This commit cleans up all the code around the render plan. It updates comments in both `flat_plan.rs` and rendering, and renames the `FlatPlan` types to better describe their current roles. For example, the plan isn't fully flat anymore, so calling it `FlatPlan` is misleading. Specifically, the following types were renamed: * `FlatPlan` -> `RenderPlan` * `FlatPlanStep` -> `Node` * `FlatPlanNode` -> `Expr`

Commit:c2573e1
Author:Jan Teske
Committer:Jan Teske

compute: extract binding structure from FlatPlan This commit extracts the `FlatPlan` structural invariants previously only defined in documentation into the type system. A `FlatPlan` now consists of a sequence of `BindStage`s that define the bindings in the plan, and a final `LetFreePlan` that's free of bindings.
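As a rough illustration of moving a structural invariant from documentation into the type system, the sketch below models a plan as a list of binding stages followed by a body whose type has no binding variant at all. Only the names `FlatPlan`, `BindStage`, and `LetFreePlan` come from the commit message; every field, the `Binding` struct, and the variants of `LetFreePlan` are hypothetical simplifications.

```rust
/// A plan fragment that cannot contain bindings: the enum simply has no
/// `Let` variant, so the compiler enforces the invariant.
#[derive(Debug, Clone, PartialEq)]
enum LetFreePlan {
    /// Reference to a previously bound id.
    Get(u64),
    Constant(i64),
    Union(Vec<LetFreePlan>),
}

/// One binding: an id and the binding-free plan that computes it.
#[derive(Debug, Clone, PartialEq)]
struct Binding {
    id: u64,
    value: LetFreePlan,
}

/// A group of bindings introduced together.
#[derive(Debug, Clone, PartialEq)]
struct BindStage {
    bindings: Vec<Binding>,
}

/// The whole plan: binding stages first, then a binding-free body.
#[derive(Debug, Clone, PartialEq)]
struct FlatPlan {
    binds: Vec<BindStage>,
    body: LetFreePlan,
}

impl FlatPlan {
    /// Lists all bound ids in stage order.
    fn bound_ids(&self) -> Vec<u64> {
        self.binds
            .iter()
            .flat_map(|stage| stage.bindings.iter().map(|b| b.id))
            .collect()
    }
}
```

With this shape, code that walks the body never has to handle a nested binding, because such a value cannot be constructed.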

Commit:11593f4
Author:Moritz Hoffmann
Committer:GitHub

Sort results on replica, merge on envd (#30558) Sort results on replica, merge on environmentd. Previously, we'd sort data only on environmentd, which would cause it to consume more CPU than necessary. This change moves some of the sorting to clusterd, and only leaves the last merge step on environmentd. The PR selects a minimal approach, and leaves most of the code related to result finishing untouched. It introduces an invariant that peek results must always be sorted according to the finishing; anything else will lead to undefined results. However, there's nothing that enforces the results to be sorted with the same ordering, which is potentially bad. Inside environmentd, it uses a simple heap to combine $k$ sorted runs into a single permutation map. The interfaces to `RowCollection` (`new`, `sorted_view`) now take a `&[ColumnOrder]`, and internally the implementation picks the right comparison function. If the column order slice is empty, it'll skip decoding the rows and directly defer to the tiebreaker. The PR moves the `RowCollection` type into `mz-expr`, which isn't ideal. This is required because the `ColumnOrder` type is defined here, and we'd like to pass it to the constructor of the type. Alternatives would be to have a function here that passes the correct comparison function to `RowCollection`, but that seems to be strictly worse than moving the type. I considered moving the type to `compute-types`, which seems a better fit, but not all uses of `RowCollection` depend on `compute-types`. If this is upsetting, I can think about alternatives. The time complexity of sorting on the cluster is roughly $\frac{n}{k}\cdot\log \frac{n}{k}$, where $n$ is the total number of result records, and $k$ the number of workers. The last merge step then has a time complexity of $n\cdot\log k$ to combine $k$ sorted runs into one. Follow-up items include: * Avoid a single `Bytes` allocation for all rows, and instead keep the individual allocations.
* Assert that all `RowCollections` are sorted equally. * Move the binary heap into an iterator to avoid materializing the sorted view permutation. ### Tips to the reviewer Don't look at individual commits. ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post. --------- Signed-off-by: Moritz Hoffmann <mh@materialize.com>
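The final merge step described above, combining $k$ sorted runs with a heap in $n\cdot\log k$ time, can be sketched with the standard library's `BinaryHeap`. This is a generic illustration, not the `RowCollection` implementation: the function name `merge_sorted_runs` and the representation of the permutation as `(run index, offset)` pairs are assumptions.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merges `k` individually sorted runs without copying their elements.
/// Returns `(run index, offset within run)` pairs in globally sorted order.
/// The heap never holds more than one entry per run, so each of the `n`
/// push/pop operations costs O(log k).
fn merge_sorted_runs<T: Ord>(runs: &[Vec<T>]) -> Vec<(usize, usize)> {
    // `BinaryHeap` is a max-heap; `Reverse` turns it into a min-heap.
    let mut heap = BinaryHeap::new();
    for (run_idx, run) in runs.iter().enumerate() {
        if let Some(first) = run.first() {
            heap.push(Reverse((first, run_idx, 0usize)));
        }
    }

    let total: usize = runs.iter().map(Vec::len).sum();
    let mut order = Vec::with_capacity(total);
    while let Some(Reverse((_, run_idx, offset))) = heap.pop() {
        order.push((run_idx, offset));
        // Refill the heap with the next element of the run we just consumed.
        if let Some(next) = runs[run_idx].get(offset + 1) {
            heap.push(Reverse((next, run_idx, offset + 1)));
        }
    }
    order
}
```

Ties between equal elements are broken by run index and then offset, so the output is deterministic. The sketch assumes every input run is already sorted with the same ordering, which mirrors the invariant the PR introduces but does not enforce.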

Commit:49a980f
Author:Parker Timmerman
Committer:GitHub

network_policies: Add `user_network_policy` ID allocator (#30468) Originally when `NetworkPolicy`s were added we included the ID allocator in `initialize.rs`, so all new environments had this allocator, but the migration for the catalog, `v68_to_v69`, did not add the allocator so existing environments never got it. ### Motivation Fixes: https://github.com/MaterializeInc/database-issues/issues/8748 ### Tips for reviewer The protobuf files didn't actually change, so no need to look at `objects_v72.proto` ### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.

Commit:6ed06d3
Author:Parker Timmerman
Committer:GitHub

proto: unignore breaking changes (#30454) `buf` is a bit aggressive in what it considers breaking. Some changes I merged in https://github.com/MaterializeInc/materialize/pull/30189 were not wire breaking but `buf` thought they were. To prevent needing a force push to main I marked them as temporarily ignored for breaking changes, this PR unignores them. ### Motivation Re-enable protobuf linting on temporarily ignored files. ### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.

Commit:235d12a
Author:Parker Timmerman
Committer:GitHub

adapter: Switch from `GlobalId` to `CatalogItemId` (#30189) This PR switches the Adapter/Catalog from referencing items with a `GlobalId`, to a `CatalogItemId`. Internally the compute and storage layers of Materialize will still reference objects by their `GlobalId`, but conceptually a `GlobalId` refers to a "collection of data", while a `CatalogItemId` refers to the database object. This switch allows us to associate multiple `GlobalId`s (aka "collections of data") with a single Catalog object. The primary motivator for this is supporting live schema migrations, or `ALTER TABLE ... ADD COLUMN ...`. When adding a new column to a table we'll create a new `GlobalId`, this ensures that the schema (aka `RelationDesc`) associated with a `GlobalId` never changes. ### Design The durable Catalog migration to add `CatalogItemId` took place in https://github.com/MaterializeInc/materialize/pull/30163. What took a bit of thought is where to place the associated `GlobalId`s for each object, we have the following requirements: * `Table`: multiple `GlobalId`s * `Source`: single `GlobalId` (maybe eventually multiple?) * `Log`: single `GlobalId` * `View`: single `GlobalId` * `MaterializedView`: single `GlobalId` * `ContinualTask`: single `GlobalId` * `Sink`: single `GlobalId` * `Index`: single `GlobalId` * `Type`: no `GlobalId`s * `Func`: no `GlobalId`s * `Secret`: no `GlobalId`s * `Connection`: no `GlobalId`s `Type`, `Func`, `Secret`, and `Connection` are never referenced by the compute and storage layer so they don't need a `GlobalId`. Meanwhile `Table`s can be evolved so they need to support multiple `GlobalId`s. What complicates this a bit is how we use the `create_sql` for an object as our durable record, while we could come up with some syntax like `CREATE TABLE [<name> WITH u1 u2 u3] ...` as a way to associate multiple `GlobalId`s with an object, this didn't feel great. 
What I landed on was persisting a single `GlobalId` for all objects, and then including an `extra_versions: BTreeMap<Version, GlobalId>` at the durable layer. While not all types need a `GlobalId`, associating a single `GlobalId` for each object makes things easier to reason about. And while I would like to "design away invalid states", trying to associate multiple `GlobalId`s with only tables added a bunch of boilerplate which I didn't think was worth it. But at the in-memory layer is where we enforce only `Table`s can have multiple `GlobalId`s. Instead of sticking a `global_id: GlobalId` field on CatalogEntry, which might feel natural, I put them on the inner `enum CatalogItem` variants. If a caller has just a `CatalogEntry`, they're forced to reason about their item possibly having multiple `GlobalId`s associated with it. But if they match on the inner `CatalogItem` they can get the inner object and generally get the single `GlobalId`, or if it's a table, reason about which `GlobalId` to use. ### Motivation Implements the approach described in: https://github.com/MaterializeInc/materialize/pull/30019 Progress towards: https://github.com/MaterializeInc/database-issues/issues/8233 ### Tips for reviewer I tried splitting up this PR into multiple smaller ones, but the amount of shimming required to convert back and forth between `GlobalId` and `CatalogItemId` made the changes very noisy. As such, I split this single PR into multiple commits where each commit migrates a single logical code path. Commits that require the most attention are annotated with a ⭐, and tagged by the teams most directly impacted. 1. ⭐ [storage] Migrates code paths for `Connection`s and `Secret`s to use `CatalogItemId`. This doesn't require much attention, it's largely just a find and replace since these objects are only ever referenced by the Catalog. * Something to call out though is this does change Protobuf definitions which AFAIK are not durably persisted anywhere. 
But someone from Storage should double check me. 2. ⭐ [adapter] This updates our in-memory objects to store a `GlobalId` per-object, for `Table`s we store multiple `GlobalId`s. To review this commit I would start by looking at `src/catalog/src/memory/objects.rs` to get an understanding of where `GlobalId`s live and how the individual objects were updated. It also updates our durable Catalog transactions to remove the shims introduced in https://github.com/MaterializeInc/materialize/pull/30163. What's subtle here is we also change the `allocate_user_item_ids(...)` API to return _both_ a `CatalogItemId` and `GlobalId` with the same inner value. This ensures that the string representation of a `CatalogItemId` for an object will always be the same as the `GlobalId`, which allows us to sidestep, for now, the issue of updating builtin tables to map between `CatalogItemId` and `GlobalId`. 3. ⭐ [adapter] Migrates the builtin objects codepaths to associate both a `CatalogItemId` and a `GlobalId` with a builtin object. Builtin objects are handled separately from normal objects, hence having to migrate the separate code path. 4. ⭐ [adapter] Changes SQL name resolution to use `CatalogItemId`. It also changes the `ResolvedIds` newtype from a `BTreeSet<GlobalId>` to a `BTreeMap<CatalogItemId, BTreeSet<GlobalId>>`. This way after name resolution we not only know what objects a statement refers to, but also their underlying collections. This also makes the inner field of `ResolvedIds` private and migrates its callsites. 5. ⭐ [adapter] Adds new APIs to the Catalog to get items by their associated `GlobalId`, and adds a new trait called `CatalogCollectionItem`. I would start reviewing this commit by looking at `src/sql/src/catalog.rs` to get an idea of the API surface. * The new `CatalogCollectionItem` is really only necessary to support `ALTER TABLE` and isn't strictly necessary for this change, but I found that it adds context to the change.
The `desc(...)` method is removed from `trait CatalogItem` and moved to `trait CatalogCollectionItem`. You can only get a `CatalogCollectionItem` (and thus `RelationDesc`) if you have a `GlobalId` or a `CatalogItemId` + `Version`. 6. [compute] Updates the `trait OptimizerCatalog` to allow getting an item by a `CatalogItemId`, and updates implementations to use new Catalog APIs. This commit is small and only needs to be skimmed. 7. [adapter] Introduces a new `DependencyIds` field on some Catalog items. `ResolvedIds` are derived from name resolution, meanwhile `DependencyIds` are derived from the `HirRelationExpr` created after planning. * This was added because the `CatalogEntry::uses` method wants to return a list of `CatalogItemId`s, but the `HirRelationExpr` only knows about `GlobalId`s. We could map from `GlobalId` -> `CatalogItemId` when calling the method, but this was a decent refactor on its own, so I opted to cache these dependencies during planning on the few types of objects that need them. 8. [adapter] Migrates the Coordinator startup/bootstrap codepath to use the new APIs introduced in [5]. No new logic is introduced here, it's largely just a find and replace like `entry.id()` to `mat_view.global_id()`. 9. ⭐ [adapter] Updates all create plans (i.e. `PlanCreate*`) to associate a `GlobalId` with each item. 10. [adapter] Updates plans for `COPY` and `INSERT` to use `CatalogItemId` instead of `GlobalId`, since we also want to insert into collections at their latest version. This commit is largely a find and replace and can be skimmed. 11. [adapter] Updates all drop plans (i.e. `PlanDrop*`) to use `CatalogItemId`, this is pretty much a find and replace and can be skimmed. 12. [adapter] Updates all alter plans (i.e. `PlanAlter*`) to use `CatalogItemId`, this is pretty much a find and replace and can be skimmed. 13. [adapter] Updates all code paths for `COMMENT ON` and Webhook Sources to use `CatalogItemId`.
This is a fairly small commit and can definitely be skimmed. 14. ⭐ [adapter/compute] Updates all Peek and Subscribe plans to use `CatalogItemId`. This commit is interesting and could use some extra thought since it's where we map from a catalog item to its collection, and is one of the primary motivators of this work. 15. [adapter] Updates all `SHOW` and `INSPECT` plans to use `CatalogItemId`. This commit is quite small and can be skimmed. 16. [adapter] Updates Catalog Consistency Checks and `PlanValidity` to use `CatalogItemId`. This commit is quite small and can be skimmed. 17. [adapter] Updates `EXPLAIN` plans to use the new `GlobalId`-based Catalog APIs. It's pretty straightforward and probably doesn't require too much attention. 18. [adapter] This commit is a bit of a "grab bag", it updates code paths for auto-routing queries to `mz_catalog_server`, cluster scheduling, and "caught up" checks to map between `CatalogItemId` and `GlobalId`. It could use a bit of extra attention. 19. ⭐ [adapter/persist] Updates `ScalarType`s to use `CatalogItemId` instead of `GlobalId`. * IMO this is one of the sketchiest commits of the entire PR because we durably persist `ScalarType`s in the Persist schema registry. This change should be okay though since the protobuf type of `CatalogItemId` is exactly the same as `GlobalId` (minus the `Explain` variant) so it is wire compatible. 20. [adapter] Migrates all builtin table writes (aka the `pack_*` methods) to use `CatalogItemId`. 21. Deletes the `GlobalId::to_item_id` and `CatalogItemId::to_global_id` shim methods. ### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered.
([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.
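The durable layout described in the Design section (a single `GlobalId` per object plus `extra_versions: BTreeMap<Version, GlobalId>`) and the new shape of `ResolvedIds` (`BTreeMap<CatalogItemId, BTreeSet<GlobalId>>`) can be sketched as follows. The type and field names mirror those quoted in the PR text, but the `DurableItem` struct, the newtype definitions, and the helper functions `global_ids`, `latest_global_id`, and `resolved_ids` are hypothetical simplifications for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::iter::once;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct GlobalId(u64);

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct CatalogItemId(u64);

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version(u64);

/// Hypothetical durable record: every item has one `GlobalId`; items that
/// have been evolved (tables) carry additional versions.
#[derive(Debug, Clone)]
struct DurableItem {
    id: CatalogItemId,
    global_id: GlobalId,
    extra_versions: BTreeMap<Version, GlobalId>,
}

impl DurableItem {
    /// All collections backing this item, oldest first.
    fn global_ids(&self) -> impl Iterator<Item = GlobalId> + '_ {
        once(self.global_id).chain(self.extra_versions.values().copied())
    }

    /// The collection for the most recent version of the item.
    fn latest_global_id(&self) -> GlobalId {
        self.extra_versions
            .values()
            .next_back()
            .copied()
            .unwrap_or(self.global_id)
    }
}

/// Builds the item -> collections mapping that name resolution exposes.
fn resolved_ids(items: &[DurableItem]) -> BTreeMap<CatalogItemId, BTreeSet<GlobalId>> {
    items
        .iter()
        .map(|item| (item.id, item.global_ids().collect()))
        .collect()
}
```

In this sketch an item that was never altered has an empty `extra_versions` map, so `latest_global_id` falls back to the single `GlobalId` it was created with.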

Commit:c10003d
Author:Justin Bradfield
Committer:GitHub

Feature/network policy predefined (#30261) <!-- Describe the contents of the PR briefly but completely. If you write detailed commit messages, it is acceptable to copy/paste them here, or write "see commit messages for details." If there is only one commit in the PR, GitHub will have already added its commit message above. --> ### Motivation - adds a new predefined network policy - removes the allow list based policy system var in favor of a new `network_policy` system var - swaps policy enforcement to use the default policy. stacked on https://github.com/MaterializeInc/materialize/pull/30172 part of https://github.com/MaterializeInc/database-issues/issues/4637 <!-- Which of the following best describes the motivation behind this PR? * This PR fixes a recognized bug. [Ensure issue is linked somewhere.] * This PR adds a known-desirable feature. [Ensure issue is linked somewhere.] * This PR fixes a previously unreported bug. [Describe the bug in detail, as if you were filing a bug report.] * This PR adds a feature that has not yet been specified. [Write a brief specification for the feature, including justification for its inclusion in Materialize, as if you were writing the original feature specification.] * This PR refactors existing code. [Describe what was wrong with the existing code, if it is not obvious.] --> ### Tips for reviewer If you are reviewing this prior to #30172 going in, you should only look at the `add predefined network policy and system var` commit. <!-- Leave some tips for your reviewer, like: * The diff is much smaller if viewed with whitespace hidden. * [Some function/module/file] deserves extra attention. * [Some function/module/file] is pure code movement and only needs a skim. Delete this section if no tips. --> ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. 
([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post. --------- Co-authored-by: Parker Timmerman <parker.timmerman@materialize.com>

Commit:5aa1c98
Author:Gábor E. Gévay
Committer:Parker Timmerman

Merge pull request #30408 from ggevay/refresh_compaction Fix Persist compaction for `REFRESH` MVs

Commit:044183a
Author:Gabor Gevay
Committer:Gabor Gevay

Add objects_needing_compaction to reasons for turning a cluster on

Commit:5954bc7
Author:Gabor Gevay
Committer:Gabor Gevay

Add objects_needing_compaction to reasons for turning a cluster on

Commit:c094caf
Author:Michael Greenberg
Committer:GitHub

[compute] map LIR to dataflow (#29848)

This PR introduces two new introspection sources and two new introspection views. These novel forms of introspection allow us to map LIR operators down to dataflow operators; we can now attribute existing introspection data about dataflows to LIR operators.

# Introspection

The two new sources are `ComputeLog` sources that run per worker.

## `mz_introspection.mz_compute_dataflow_global_ids_per_worker`

Maps dataflow identifiers to the global IDs used internally for things that get built as dataflows.

```
   name    | nullable | type  | comment
-----------+----------+-------+---------
 id        | f        | uint8 | dataflow ID
 worker_id | f        | uint8 |
 global_id | f        | text  |
```

## `mz_introspection.mz_compute_lir_mapping_per_worker`

Tracks attribution information for LIR terms (in terms of `FlatPlan`).

```
       name        | nullable | type  | comment
-------------------+----------+-------+---------
 global_id         | f        | text  |
 lir_id            | f        | uint8 | AST node number
 worker_id         | f        | uint8 |
 operator          | f        | text  | rendered string
 parent_lir_id     | t        | uint8 | parent AST node number
 nesting           | f        | uint2 | nesting (used for indentation)
 operator_id_start | t        | uint8 | first dataflow operator (inclusive)
 operator_id_end   | t        | uint8 | last dataflow operator (exclusive)
```

## Views

We use two introspection views to work with these per-worker sources. It ought to be the case that all workers agree about _this_ metadata (though they may not agree on, say, the amount of memory a dataflow operator is using!). So: these are just views that set `worker_id = 0`.

### `mz_introspection.mz_dataflow_global_ids`

```
   name    | nullable | type  | comment
-----------+----------+-------+---------
 id        | f        | uint8 |
 global_id | f        | text  |
```

### `mz_introspection.mz_lir_mapping`

```
       name        | nullable | type  | comment
-------------------+----------+-------+---------
 global_id         | f        | text  |
 lir_id            | f        | uint8 |
 operator          | f        | text  |
 parent_lir_id     | t        | uint8 |
 nesting           | f        | uint2 |
 operator_id_start | t        | uint8 |
 operator_id_end   | t        | uint8 |
```

# Attributing to LIR

We can see a sample interaction as follows:

```sql
CREATE TABLE t(x INT NOT NULL, y INT, z TEXT);
CREATE VIEW v AS
  SELECT t1.x AS x, t1.z AS z1, t2.z AS z2
  FROM t AS t1, t AS t2
  WHERE t1.x = t2.y;
CREATE INDEX v_idx_x ON v(x);

\! sleep 1

SELECT global_id, lir_id,
       REPEAT(' ', MAX(nesting) * 2) || operator AS operator,
       SUM(duration_ns) AS duration, SUM(count) AS count
FROM mz_introspection.mz_lir_mapping mlm
     LEFT JOIN mz_introspection.mz_compute_operator_durations_histogram mcodh
     ON (mlm.operator_id_start <= mcodh.id AND mcodh.id < mlm.operator_id_end)
GROUP BY global_id, lir_id, operator
ORDER BY global_id, lir_id DESC;
```

which yields an output like:

```
 global_id | lir_id |          operator          | duration | count
-----------+--------+----------------------------+----------+-------
 u2        |      4 | Join::Differential 1 » 3   |  1261568 |    17
 u2        |      3 | Arrange 2                  |   466944 |    16
 u2        |      2 | Get::Collection u1         |    69632 |     7
 u2        |      1 | Arrange 0                  |   417792 |    16
 u2        |      0 | Get::Collection u1         |    73728 |     7
 u3        |      6 | Arrange 5                  |   454656 |    17
 u3        |      5 | Get::PassArrangements u2   |          |
```

### Motivation

* This PR adds a known-desirable feature. The first step in attribution/plan profiling. https://github.com/MaterializeInc/database-issues/issues/6551
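
The join condition in the query above treats `operator_id_start` and `operator_id_end` as a half-open range `[start, end)` of dataflow operator IDs. The following is a minimal Rust sketch of that attribution logic, not code from the PR: the `LirMapping` struct and the `attribute` function are hypothetical names, and the `(operator id, duration)` pairs stand in for rows of the durations histogram.

```rust
/// Hypothetical in-memory row of the LIR mapping (illustration only).
/// Dataflow operators with IDs in [operator_id_start, operator_id_end)
/// belong to the LIR node `lir_id`.
#[derive(Debug, Clone, PartialEq)]
struct LirMapping {
    lir_id: u64,
    // Both bounds are nullable in the schema shown in the commit message.
    operator_id_start: Option<u64>,
    operator_id_end: Option<u64>,
}

/// Sum the durations of all dataflow operators that fall in each LIR node's
/// half-open operator range. `durations` holds (operator id, duration in ns).
/// A node without a range, or without matching operators, yields `None`,
/// mirroring the NULL that `SUM` returns over an unmatched LEFT JOIN row.
fn attribute(mappings: &[LirMapping], durations: &[(u64, u64)]) -> Vec<(u64, Option<u64>)> {
    mappings
        .iter()
        .map(|m| {
            let total = match (m.operator_id_start, m.operator_id_end) {
                (Some(start), Some(end)) => {
                    let matched: Vec<u64> = durations
                        .iter()
                        // Inclusive start, exclusive end, as in the SQL join.
                        .filter(|(id, _)| start <= *id && *id < end)
                        .map(|(_, ns)| *ns)
                        .collect();
                    if matched.is_empty() {
                        None
                    } else {
                        Some(matched.iter().sum())
                    }
                }
                _ => None,
            };
            (m.lir_id, total)
        })
        .collect()
}
```

Because the end bound is exclusive, adjacent LIR nodes can share a boundary ID without any operator being counted twice.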

Commit:6782f23
Author:Moritz Hoffmann
Committer:GitHub

Determine time dependence in controller (#30330) Determine time dependence in the controller. This PR changes how we determine wall-clock time dependence. Previously, we'd use information in the catalog and within cached plans to determine how an object relates to wall-clock time. With this PR, we use the information available in the controller to make the same decision in a centralized place. The benefit is a cleaner code structure. ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post. --------- Signed-off-by: Moritz Hoffmann <mh@materialize.com>

Commit:24d2fc2
Author:Moritz Hoffmann
Committer:GitHub

Improved replica and dataflow expiration (#30162) Support expiration of dataflows depending on wall-clock time and with refresh schedules. This is a partial re-implementation of #29587 to enable more dataflows to participate in expiration. Specifically, it introduces the abstraction of _time dependence_ to describe how a dataflow follows wall-clock time. Using this information, we can then determine how a replica's expiration time relates to a specific dataflow. This allows us to support dataflows that have custom refresh policies. I'm not sold on the names introduced by this PR, but it's the best I came up with. Open to suggestions! The implementation deviates from the existing implementation in some important ways: * We do not panic in the dataflow operator that checks for frontier advancements, but rather retain a capability until the dataflow is shut down. This avoids a race condition where dataflow shutdown happens in parallel with dropping the shutdown token, and it avoids needing to reason about what dataflows produce error streams---some have an error output that immediately advances to the empty frontier. * We do not handle the empty frontier in a special way. Previously, we considered advancing to the empty frontier acceptable. However, this makes it difficult to distinguish a shutdown from a source reading the expiration time. In the first case, the operator should drop its capability; in the second, it must not, for correctness reasons. * We check in the worker thread whether the replica has expired and panic if needed. There are some problems this PR does not address: * Caching the time dependence information in the physical plans seems like a hack. I think a better place would be the controller. Happy to try this in a follow-up PR. * We need a separate kill-switch to disable the feature because as it is implemented, we capture the expiration time in the controller once per replica. 
A second kill-switch would enable us to override the expiration to stabilize the system. Fixes MaterializeInc/database-issues#8688. Fixes MaterializeInc/database-issues#8683. ### Tips for the reviewer Don't look at individual commits, it's a work log and does not have any semantic meaning. ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post. --------- Signed-off-by: Moritz Hoffmann <mh@materialize.com>

Commit:0ee93b1
Author:Justin Bradfield
Committer:Justin Bradfield

add network policy resource to catalog - adds durable and mem network policy types - network_policies and network_policy_rules tables.

Commit:0a29f0e
Author:Joseph Koshakow
Committer:GitHub

catalog: Remove old migrations (#30226)

Commit:e5a1392
Author:Parker Timmerman
Committer:GitHub

catalog: Fixup v68 migration (#30213) This PR updates the v68 Catalog migration to work around https://github.com/MaterializeInc/database-issues/issues/8700. It's a bit sketchy updating the Catalog types in-place like this: if someone has upgraded their staging environment, this change will break it. https://github.com/MaterializeInc/materialize/pull/30202 is also a possible fix for the issue; will chat with @jkosh44 about it. ### Motivation Fixes https://github.com/MaterializeInc/database-issues/issues/8702 ### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.

Commit:4ae7647
Author:Ben Kirwin
Committer:Ben Kirwin

Add RunPart enum and HollowRuns This sets up the basic machinery for shifting runs of ordered parts out to S3. It includes some basic machinery for iterating over batch parts, but leaves much TODO.

Commit:1945b14
Author:Parker Timmerman
Committer:GitHub

alter_table: Durable Catalog Migration (#30163) This PR re-keys everything in the durable Catalog on `CatalogItemId` instead of `GlobalId`, and shims everything above the durable Catalog objects using two new methods `GlobalId::to_item_id()`, and `CatalogItemId::to_global_id()`. Now every item in the Catalog has the following fields: * `id: CatalogItemId`, a stable _external_ identifier (i.e. in Catalog tables like `mz_objects`) that is the same for the entire lifetime of the object. * `global_id: GlobalId`, a stable _internal_ identifier for this object that can be used by storage and compute. * `extra_versions: BTreeMap<Version, GlobalId>`, mapping of versions of an object to the `GlobalId`s used by compute and storage to refer to a specific version. This de-coupling of `CatalogItemId` and `GlobalId` achieves two things: 1. Externally objects have a stable identifier, even as they are `ALTER`-ed. This is required for external tools like dbt and Terraform that track objects by ID. 2. Internally a `GlobalId` always refers to the same `RelationDesc` + Persist Shard, this maintains the concept from the formalism that a `GlobalId` is never re-assigned to a new pTVC. The implementation of `ALTER TABLE ... ADD COLUMN ...` will thus allocate a new `GlobalId` which will immutably refer to that specific version of the table. #### Other Changes Along with `ItemKey` and `ItemValue` I updated the following Catalog types: * `GidMappingValue`: replaced the `id` field with `catalog_id` and `global_id`, used to identify builtin catalog objects. * `ClusterIntrospectionSourceIndexValue`: replaced the `index_id` field with `catalog_id` and `global_id`, used to identify builtin introspection source indexes. * `CommentKey`: replaced `GlobalId` with `CatalogItemId`, used to identify comments on objects. * `SourceReferencesKey`: replaced `GlobalId` with `CatalogItemId`, used to track references between a Source and the subsources/tables that read from it. 
#### Partial Progress Today `CatalogItemId` is 1:1 with `GlobalId`, which allows us to implement the `to_item_id` and `to_global_id` shim methods. Until we support `ALTER TABLE ... ADD COLUMN ...` we can freely convert between the two. This allows us to break this change up among multiple PRs instead of a single massive change. #### Initial Migration Because `CatalogItemId` and `GlobalId` are currently 1:1, we can migrate the raw values of the IDs, e.g. `GlobalId::User(42)` becomes `CatalogItemId::User(42)`, which is exactly what this PR does. ### Motivation Progress towards https://github.com/MaterializeInc/database-issues/issues/8233 Implements changes described in https://github.com/MaterializeInc/materialize/pull/30019 ### Tips for reviewer This PR is split into two commits: 1. The durable catalog migration, and updates to durable Catalog objects. I would appreciate the most thorough reviews on this commit. 2. Shimming all calling code to convert between `CatalogItemId` and `GlobalId`. ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. 
- [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.
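
The 1:1 shim between `GlobalId` and `CatalogItemId` described in the `alter_table` commit above can be sketched roughly as follows. This is a simplified illustration, not the repository's code: the two-variant enums, the `System` variant, and the by-value method signatures are assumptions made for the example; only the method names and the `User(42)` mapping come from the commit message.

```rust
/// Simplified stand-in for the internal identifier used by storage and
/// compute (assumption: the real type has more variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum GlobalId {
    System(u64),
    User(u64),
}

/// Simplified stand-in for the stable external identifier of a catalog item.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum CatalogItemId {
    System(u64),
    User(u64),
}

impl GlobalId {
    /// Shim: while the two ID spaces are 1:1, the raw value carries over.
    fn to_item_id(self) -> CatalogItemId {
        match self {
            GlobalId::System(x) => CatalogItemId::System(x),
            GlobalId::User(x) => CatalogItemId::User(x),
        }
    }
}

impl CatalogItemId {
    /// Inverse shim, valid only while no item has more than one `GlobalId`.
    fn to_global_id(self) -> GlobalId {
        match self {
            CatalogItemId::System(x) => GlobalId::System(x),
            CatalogItemId::User(x) => GlobalId::User(x),
        }
    }
}
```

Under these assumptions, `GlobalId::User(42).to_item_id()` yields `CatalogItemId::User(42)` and the two conversions round-trip. Once a table can own several `GlobalId`s (one per version), the inverse direction stops being well defined, which is presumably why the commit message frames the shims as temporary.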

Commit:af34ea0
Author:Gabor Gevay
Committer:Gabor Gevay

window functions: Fusion for window aggregations Closes #8535

Commit:441b7b2
Author:Parker Timmerman
Committer:GitHub

adapter: Add `CatalogItemId` type (#30148) As it says on the tin, adds a new type called `CatalogItemId` which will be used as a stable identifier for Catalog items. Figured splitting this out into its own PR was a good first step. ### Motivation Progress towards https://github.com/MaterializeInc/database-issues/issues/8233 ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.

Commit:175d2ec
Author:Daniel Harrison
Committer:Daniel Harrison

compute: rename PersistSink to MaterializedViewSink There are now two sinks that write to persist but with different behavior (conflict resolution, only at input times, etc). Rename PersistSink to something that reflects the differences.

Commit:1ae1031
Author:Moritz Hoffmann
Committer:Parker Timmerman

Replica expiration applied in controller (#29996) Apply the replica expiration only once per replica. Capture the value in the replica task and encode it in the create-instance command. We only send this command once per replica. Reconciliation will cause a replica restart if the value changes and the controller restarts. This fixes a bug where we could not apply the configuration localized to a specific replica. - [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post. --------- Signed-off-by: Moritz Hoffmann <mh@materialize.com>

Commit:1d66921
Author:Moritz Hoffmann
Committer:GitHub

Replica expiration applied in controller (#29996) Apply the replica expiration only once per replica. Capture the value in the replica task and encode it in the create-instance command. We only send this command once per replica. Reconciliation will cause a replica restart if the value changes and the controller restarts. This fixes a bug where we could not apply the configuration localized to a specific replica. ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post. --------- Signed-off-by: Moritz Hoffmann <mh@materialize.com>

Commit:1686281
Author:Moritz Hoffmann
Committer:GitHub

Format and lint protobuf (#29997) Formats protobuf using buf, and checks that files are formatted. We're not formatting our protobuf files with the standard formatting (4 vs. 2 spaces of indentation, file structure). Format all files and add a lint to ensure files stay formatted. ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post. --------- Signed-off-by: Moritz Hoffmann <mh@materialize.com> Co-authored-by: Dennis Felsing <dennis@felsing.org>

Commit:05acf95
Author:Moritz Hoffmann
Committer:GitHub

Implement dataflow expiration to limit temporal data retention (#29587) Introduces a new feature to limit data retention in temporal filters by dropping retraction diffs beyond a configured expiration time. Motivation and logic are explained in more detail in the design doc: doc/developer/design/20240919_dataflow_expiration.md. Fixes MaterializeInc/database-issues#7757 ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post. --------- Signed-off-by: Moritz Hoffmann <mh@materialize.com> Co-authored-by: Siddhartha Sahu <siddhartha.sahu@materialize.com> Co-authored-by: Dennis Felsing <dennis@felsing.org>

Commit:4d91d92
Author:Sang Jun Bak
Committer:Sang Jun Bak

storage-controller: Replace keep_n with retention window for replica status history

Commit:b579caa
Author:Joseph Koshakow
Committer:GitHub

catalog: Remove epoch update kind (#29353) This commit removes the `Epoch` catalog update kind. That variant has been replaced with the `FenceToken` variant. Since the epoch and fence token are needed before migrations are run, the removal of `Epoch` had to be split across multiple releases. ### Motivation This PR adds a feature that has not yet been specified. ### Checklist - [X] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [X] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [X] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [X] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [X] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.

Commit:9bffa7b
Author:Nikhil Benesch
Committer:GitHub

Merge pull request #29699 from benesch/0dt-upsert-sources adapter,storage-controller: lay down 0dt scaffolding for sources

Commit:9fd359c
Author:Nikhil Benesch
Committer:Nikhil Benesch

adapter,storage-controller: lay down 0dt scaffolding for sources Add a new feature flag called `enable_0dt_deployment_sources` which controls whether 0dt deployment is attempted for source types that support it. To start, only Kafka sources will support 0dt deployments. The rationale is twofold: * Kafka does not have a notion of replication slots, so it's easy to have multiple processes reading from the same Kafka topic. * Only Kafka sources support the upsert envelope, which is the only source operator that has a meaningful rehydration time. Other source types (e.g., PostgreSQL) effectively already support 0dt upgrades because they can do a cold restart from where they left off nearly instantaneously. This commit does not begin to tackle the hard parts of 0dt deployments for Kafka upsert sources (read-only mode for the persist sink, self-correcting upsert state, etc.). Enabling the `enable_0dt_deployment_sources` flag is not yet expected to work. The idea is just to lay down the scaffolding so that we can start to fill in the missing pieces in parallel. Touches #27413.

Commit:2b2c6ba
Author:Daniel Harrison
Committer:Daniel Harrison

ct: add `::ContinualTask` various adapter enums May as well rip off the band-aid and get it over with.

Commit:5ec5ae5
Author:Cara Silverstein
Committer:GitHub

Remove implicit cast from interval to mz_timestamp (#29579)

Removing because:
* It's not needed, we can support arithmetic like `mz_timestamp + interval` without the implicit cast, similar to how we support arithmetic like `timestamp + interval`
* It doesn't make logical sense to cast an `interval` to an `mz_timestamp` (an epoch time), so is confusing
* Because the implicit cast from `interval` (microseconds precision) to `mz_timestamp` (milliseconds precision) does not preserve uniqueness, it leads to some wonky and/or incorrect behavior, such as returning true for `SELECT '1970-01-01 00:00:01.0001'::timestamptz = INTERVAL '1000200 microseconds'; `

See thread for more details https://materializeinc.slack.com/archives/C063H5S7NKE/p1726236187815369

### Motivation
* This PR fixes a previously unreported bug.

### Tips for reviewer
Is there anything else I need to do when removing a row from the protobuf file, beyond marking the id as reserved?

### Checklist
- [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/))
- [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. -->
- [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label.
- [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. -->
- [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post. **Requires a changelog post that I will write.**
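
The uniqueness problem described in this commit message can be illustrated with plain integer arithmetic. The following is a minimal sketch, not Materialize's cast implementation: `to_millis` is a hypothetical helper, and it assumes the sub-millisecond digits are simply dropped (rounding to the nearest millisecond would give the same result for these two values).

```python
MICROS_PER_MILLI = 1_000


def to_millis(micros: int) -> int:
    """Reduce a microsecond count to millisecond precision.

    Assumption: the extra digits are dropped via floor division.
    """
    return micros // MICROS_PER_MILLI


# '1970-01-01 00:00:01.0001' expressed as microseconds since the epoch.
timestamp_micros = 1_000_100
# INTERVAL '1000200 microseconds'.
interval_micros = 1_000_200

# The two values differ at microsecond precision...
distinct_before = timestamp_micros != interval_micros
# ...but collapse to the same value at millisecond precision.
equal_after = to_millis(timestamp_micros) == to_millis(interval_micros)

print(distinct_before, equal_after)
```

Because both inputs land on the same millisecond, an equality check performed after such a cast cannot tell them apart, which is the behavior the commit message calls out.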

Commit:26741be
Author:Nikhil Benesch
Committer:GitHub

Merge pull request #29611 from chaas/gh-issues-to-discussions Change all references in user-facing error messages to point to discussions instead of issues

Commit:a945f4e
Author:Cara Haas

rename the proto field without changing its ID at @benesch's suggestion

Commit:117460b
Author:Cara Silverstein
Committer:GitHub

Apply suggestion from review Co-authored-by: Joseph Koshakow <koshy44@gmail.com>

Commit:a2feb1c
Author:Daniel Harrison
Committer:Daniel Harrison

ct: add strawman impl of CREATE CONTINUAL TASK

Strawman because:
- I personally find it much easier to start with a crappy thing and incrementally improve it than to iterate on a huge branch forever.
- Allows for more easily collaborating on the remaining work.
- Also to build excitement internally!

A continual task presents as something like a `BEFORE TRIGGER`: it watches some _input_ and whenever it changes at time `T`, executes a SQL txn, writing to some _output_ at the same time `T`. It can also read anything in materialize as a _reference_, most notably including the output.

Only reacting to new inputs (and not the full history) makes a CT's rehydration time independent of the size of the inputs (NB this is not true for references), enabling things like writing UPSERT on top of an append-only shard in SQL (ignore the obvious bug with my upsert impl):

```sql
CREATE CONTINUAL TASK upsert (key INT, val INT) ON INPUT append_only AS (
    DELETE FROM upsert WHERE key IN (SELECT key FROM append_only);
    INSERT INTO upsert SELECT key, max(val) FROM append_only GROUP BY key;
)
```

Unlike a materialized view, the continual task does not update outputs if references later change. This enables things like auditing:

```sql
CREATE CONTINUAL TASK audit_log (count INT8) ON INPUT anomalies AS (
    INSERT INTO audit_log SELECT * FROM anomalies;
)
```

As mentioned above, this is in no way the final form of CTs. There's lots of big open questions left on what the feature should look like as presented to users. However, we'll start shipping it by exposing incrementally less limited (and more powerful) surface areas publicly: e.g. perhaps a RETENTION WINDOW on sources.
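
To make the described firing model concrete, here is a rough simulation of the `upsert` task from the commit message, written as a sketch under stated assumptions rather than as how Materialize executes continual tasks. `apply_upsert_task` and the sample batches are hypothetical; each batch stands for the rows appended to `append_only` at one time `T`, and the task sees only that batch plus the current output.

```python
def apply_upsert_task(output, new_rows):
    """Simulate one firing of the task for the rows appended at a single time T."""
    touched = {key for key, _ in new_rows}
    # DELETE FROM upsert WHERE key IN (SELECT key FROM append_only);
    kept = [(key, val) for key, val in output if key not in touched]
    # INSERT INTO upsert SELECT key, max(val) FROM append_only GROUP BY key;
    best = {}
    for key, val in new_rows:
        best[key] = max(val, best.get(key, val))
    return sorted(kept + list(best.items()))


# Hypothetical input: three batches of (key, val) rows, one per timestamp.
batches = [
    [(1, 10), (2, 20)],
    [(1, 5)],
    [(3, 7), (3, 9)],
]

upsert = []
for batch in batches:
    upsert = apply_upsert_task(upsert, batch)

print(upsert)
```

The work done per firing depends only on the new batch and the output, never on the full history of `append_only`, which mirrors the rehydration argument made in the commit message.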

Commit:3dd23d4
Author:Daniel Harrison
Committer:Daniel Harrison

ct: establish CREATE CONTINUAL TASK plumbing

Commit:0758048
Author:Gabor Gevay
Committer:Gabor Gevay

Window functions: Fuse `Reduce` with `FlatMap UnnestList`

Commit:35b3016
Author:Gábor E. Gévay
Committer:GitHub

Merge pull request #29554 from ggevay/reduce-flatmap-fusion Window functions: `Reduce` - `FlatMap UnnestList` fusion

Commit:dab35b1
Author:Gabor Gevay
Committer:Gabor Gevay

Window functions: Fuse `Reduce` with `FlatMap UnnestList`

Commit:63e6ad1
Author:Sang Jun Bak
Committer:Sang Jun Bak

adapter: add cluster_id to audit log create events

Commit:1364077
Author:Cara Haas
Committer:Cara Haas

Change all references in user-facing error messages to point to discussions instead of issues, or remove the issue reference if it wasn't useful anymore

Commit:9f82913
Author:Nikhil Benesch
Committer:Nikhil Benesch

sql: add a starts_with function Fix #29520.

Commit:75963e5
Author:Cara Silverstein
Committer:GitHub

Support implicit cast from date to mz_timestamp (#29494)

Fixes: https://github.com/MaterializeInc/materialize/issues/29493

### Checklist
- [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/))
- [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. -->
- [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label.
- [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](https://github.com/MaterializeInc/cloud/pull/5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. -->
- [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.

Commit:1d3f195
Author:Roshan Jobanputra
Committer:GitHub

Merge pull request #29383 from rjobanp/kafka-source-tables storage/sources: Implement 'CREATE TABLE .. FROM SOURCE' parsing and planning for Kafka sources

Commit:043e712
Author:Jan Teske
Committer:Jan Teske

storage: truncate replica status history This commit adds support for truncation of the `ReplicaStatusHistory` collection during storage controller startup. In contrast to the other status histories, the replica status history is keyed by two columns (`replica_id`, `process_id`), so we need to modify the truncation code to work with multi-column keys. To this end, a `StatusHistoryDesc` type is introduced that allows the caller to configure the truncation behavior while being generic over the key type.
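
The idea of truncation that is generic over the key type can be sketched as follows. This is an illustration only, not the `StatusHistoryDesc` API: `truncate_history`, the `key_fn` and `keep_per_key` parameters, the `occurred_at` field, and the "keep the most recent N events per key" retention rule are all assumptions made for the example.

```python
from collections import defaultdict


def truncate_history(events, key_fn, keep_per_key):
    """Keep only the newest `keep_per_key` events for each key.

    `key_fn` extracts the key, so single-column and multi-column
    keys go through the same code path.
    """
    by_key = defaultdict(list)
    for event in events:
        by_key[key_fn(event)].append(event)

    retained = []
    for rows in by_key.values():
        rows.sort(key=lambda e: e["occurred_at"])
        retained.extend(rows[-keep_per_key:])
    return sorted(retained, key=lambda e: e["occurred_at"])


# Hypothetical replica status events keyed by (replica_id, process_id).
events = [
    {"replica_id": "r1", "process_id": 0, "occurred_at": 1},
    {"replica_id": "r1", "process_id": 0, "occurred_at": 2},
    {"replica_id": "r1", "process_id": 0, "occurred_at": 3},
    {"replica_id": "r1", "process_id": 1, "occurred_at": 4},
]

kept = truncate_history(
    events,
    key_fn=lambda e: (e["replica_id"], e["process_id"]),
    keep_per_key=2,
)
print([e["occurred_at"] for e in kept])
```

Passing a key extractor instead of a column name is one way to let the same routine serve histories keyed by one column and histories keyed by two.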

Commit:90b0702
Author:Ben Kirwin
Committer:GitHub

Merge pull request #29412 from bkirwi/new-key-lower [persist] Add a structured key lower

Commit:500e4c3
Author:Ben Kirwin
Committer:Ben Kirwin

New part-level key lower field

Commit:a626e50
Author:Nikhil Benesch
Committer:GitHub

Merge pull request #29438 from benesch/mysql-exclude-columns sql: change IGNORE COLUMNS MySQL option to EXCLUDE COLUMNS