-
Notifications
You must be signed in to change notification settings - Fork 704
feat: apply column-aware row encoding & enable schema change #8394
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
…ncoding-for-table Signed-off-by: Bugen Zhao <[email protected]>
| // TODO: If we do some `Delete` after schema change, we cannot ensure the encoded value | ||
| // with the new version of serializer is the same as the old one, even if they can be | ||
| // decoded into the same value. The table is now performing consistency check on the raw | ||
| // bytes, so we need to turn off the check here. We may turn it on if we can compare the | ||
| // decoded row. | ||
| let state_table = | ||
| StateTableInner::from_table_catalog_inconsistent_op(table_catalog, store, vnodes).await; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @hzxa21 😰
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc. @hzxa21 Sanity check is disabled here
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
|
Do existing e2e/unit tests cover schema change cases? |
Oops, I forgot. I'll add some. |
…ncoding-for-table
Signed-off-by: Bugen Zhao <[email protected]>
Codecov Report
@@ Coverage Diff @@
## main #8394 +/- ##
==========================================
- Coverage 71.62% 71.61% -0.02%
==========================================
Files 1135 1135
Lines 185458 185541 +83
==========================================
+ Hits 132835 132875 +40
- Misses 52623 52666 +43
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Done. PTAL 🥺 |
| &mut self, | ||
| buffer: MaterializeBuffer, | ||
| table: &StateTable<S>, | ||
| table: &StateTableInner<S, SD>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we extract a trait for the state table such as StateTableRead and change the type to impl StateTableRead to erase the S and SD here 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be a good idea. We can do this refactor in the future.
Signed-off-by: Bugen Zhao <[email protected]>
fuyufjh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job!
|
|
||
| pub type Result<T> = std::result::Result<T, ValueEncodingError>; | ||
|
|
||
| /// The kind of all possible `ValueRowSerde`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we run fast enough, perhaps we could remove the Basic encoding in 0.18. (Because the storage's side work has break the compatibility anyway. Of course just for this time 🥵)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking of still using the basic value-encoding for the tables that (temporarily) not supports schema change, like internal tables or normal materialized views (except for Index). The reason is that "column ID" does not make much sense for these tables with immutable schema, and we can save more storage costs.
| uint32 read_prefix_len_hint = 7; | ||
| // Whether the table is versioned. If `true`, column-aware row encoding will be used | ||
| // to be compatible with schema changes. | ||
| bool versioned = 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible that a table contains both basic encoded and column-aware encoded rows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't consider backward compatibility, then nope. Currently, we follow the version field of TableCatalog to decide the encoding.
y-wei
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any concerns, but I may be better to have another senior's approval 😀
fuyufjh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Signed-off-by: Bugen Zhao <[email protected]>
| // Decide which serializer to use based on whether the table is versioned or not. | ||
| let row_serde = if versioned { | ||
| ColumnAwareSerde::new(&column_ids, schema.into()).into() | ||
| } else { | ||
| BasicSerde::new(&column_ids, schema.into()).into() | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just found that there's something else we need to update in the storage java binding. cc @wenym1 Is it possible to reuse some code of StorageTable here?
Signed-off-by: Bugen Zhao [email protected]I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
Enable column-aware row encoding for
MaterializeExecutorand allStorageTableusages.StateTablewill decide which encoding format is used according to whetherversionfield is set. Note that this is only set for regular tables.versionedfield toTableDesc.StorageTablewill similarly decide the format by this.ALTER TABLE [ADD|DROP] COLUMNin production.Checklist For Contributors
./risedev check(or alias,./risedev c)Checklist For Reviewers
Documentation
Types of user-facing changes
Please keep the types that apply to your changes, and remove the others.
ALTER TABLE [ADD|DROP] COLUMNRelease note
Initial support for
ALTER TABLE [ADD|DROP] COLUMNon regular tables (without connectors).