Skip to content

Conversation

@y-wei
Copy link
Contributor

@y-wei y-wei commented Dec 28, 2022

I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.

What's changed and what's your intention?

This section will be used as the commit message. Please do not leave this empty!

Please explain IN DETAIL what the changes are in this PR and why they are needed:

  • Summarize your change (mandatory)
    • add one returning field in DML related structs ( INSERT , UPDATE, and DELETE)
    • corresponding executor and query handler will alter output format if returning included.
  • Additional side change
    run_sql will return an extra bool to indicate if the statement is a DML with returning, used by query handler.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • All checks passed in ./risedev check (or alias, ./risedev c)

Types of user-facing changes

Please keep the types that apply to your changes, and remove those that do not apply.

  • SQL commands, functions, and operators

Refer to a related PR or issue link (optional)

#7044

@y-wei y-wei requested a review from BugenZhao December 28, 2022 04:26
@y-wei y-wei changed the title WIP feat: support RETURNING in DML statements feat: WIP support RETURNING in DML statements Dec 28, 2022
let rows_count = match stmt_type {
StatementType::SELECT => None,
let rows_count: Option<i32> = match stmt_type {
StatementType::SELECT | StatementType::INSERT_RETURNING | StatementType::DELETE_RETURNING | StatementType::UPDATE_RETURNING => None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, in Postgres, we will get both returning rows and affected row count. We may fix this in the future.

postgres=# insert into t values (233) returning *;
  v  
-----
 233
(1 row)

INSERT 0 1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any ideas? 😰 cc @BowenXiao1999

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not sure, but this seems do not affect the correctness.

If the protocol allows, I think we can add the values in future. 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have fixed!

@lmatz lmatz added the user-facing-changes Contains changes that are visible to users label Dec 29, 2022
Copy link
Member

@BugenZhao BugenZhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM. Great job!

Comment on lines 42 to 59
pub returning_list: Vec<ExprImpl>,

pub schema: Schema,
pub returning_schema: Option<Schema>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding some brief doc for this? 🥰 Same for other occurrences.

insert into t values (0,1), (1,2) returning *, a, a+b;
logical_plan: |
LogicalProject { exprs: [*VALUES*_0.column_0, *VALUES*_0.column_1, *VALUES*_0.column_0, (*VALUES*_0.column_0 + *VALUES*_0.column_1)] }
└─LogicalInsert { table: t }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll be better if we can add a new field of returning in fmt::Display. Only show it when returning is true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One problem is that current we do not hold returning projection info in LogicalInsert/Delete/Update struct but just a bool indicating if there were returning. Do you think it will be better if we print LogicalInsert { table: t, returning: true }?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LogicalInsert { table: t, returning: true } looks clear 🤣

let rows_count = match stmt_type {
StatementType::SELECT => None,
let rows_count: Option<i32> = match stmt_type {
StatementType::SELECT | StatementType::INSERT_RETURNING | StatementType::DELETE_RETURNING | StatementType::UPDATE_RETURNING => None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.

@BugenZhao BugenZhao marked this pull request as ready for review December 30, 2022 11:30
@BugenZhao BugenZhao changed the title feat: WIP support RETURNING in DML statements feat: support RETURNING in DML statements Dec 30, 2022
@BugenZhao BugenZhao changed the title feat: support RETURNING in DML statements feat: support RETURNING in DML statements Dec 30, 2022
@codecov
Copy link

codecov bot commented Dec 30, 2022

Codecov Report

Merging #7094 (fde984e) into main (ce7c724) will decrease coverage by 0.67%.
The diff coverage is 83.52%.

@@            Coverage Diff             @@
##             main    #7094      +/-   ##
==========================================
- Coverage   72.33%   71.66%   -0.67%     
==========================================
  Files        1071     1083      +12     
  Lines      171769   173589    +1820     
==========================================
+ Hits       124243   124401     +158     
- Misses      47526    49188    +1662     
Flag Coverage Δ
rust 71.66% <83.52%> (-0.67%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/frontend/src/handler/create_table_as.rs 0.00% <0.00%> (ø)
src/frontend/src/handler/query.rs 17.70% <0.00%> (-0.88%) ⬇️
...c/frontend/src/optimizer/plan_node/batch_delete.rs 80.55% <0.00%> (-2.31%) ⬇️
...c/frontend/src/optimizer/plan_node/batch_insert.rs 52.08% <0.00%> (-1.11%) ⬇️
...c/frontend/src/optimizer/plan_node/batch_update.rs 59.18% <0.00%> (-1.24%) ⬇️
src/sqlparser/src/keywords.rs 100.00% <ø> (ø)
src/utils/pgwire/src/pg_protocol.rs 56.16% <ø> (ø)
src/utils/pgwire/src/pg_response.rs 71.30% <25.00%> (-2.57%) ⬇️
src/utils/pgwire/src/pg_message.rs 69.52% <50.00%> (-0.38%) ⬇️
src/batch/src/executor/delete.rs 83.33% <80.00%> (-0.73%) ⬇️
... and 49 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@y-wei
Copy link
Contributor Author

y-wei commented Dec 30, 2022

Have checked and rebased! Ready to be reviewed!

Copy link
Contributor

@BowenXiao1999 BowenXiao1999 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So now, we are returning the columns for the RETURNING clause and no rows count?

Rest LGTM

}
// } else {
// let ret_chunk = DataChunk::new(returning_columns, rows_inserted);
// yield ret_chunk
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

pub fn new(input: PlanRef, table_name: String, table_id: TableId) -> Self {
pub fn new(input: PlanRef, table_name: String, table_id: TableId, returning: bool) -> Self {
let ctx = input.ctx();
// TODO: support `RETURNING`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it is already fully supported, can we remove it?

Comment on lines +146 to +148
| StatementType::INSERT_RETURNING
| StatementType::DELETE_RETURNING
| StatementType::UPDATE_RETURNING
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question, as we treat XXX_RETURNING as a query. Once they are used in the CTE and the CTE is used multiple times in a single query, can we reuse the CTE result? Reuse means we execute the query once, while not reuse means we execute the query multiple times and then the dml will be executed multiple times.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! So we must "share" this Insert and support DAG execution in batch as well? 🥵 Luckily, the parser will reject RETURNING in CTE now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, we need to make sure the dml of the RETURNING in CTE will be executed exactly once. That's the PG behavior.

insert into t values (0,1), (1,2) returning *, a, a+b;
logical_plan: |
LogicalProject { exprs: [*VALUES*_0.column_0, *VALUES*_0.column_1, *VALUES*_0.column_0, (*VALUES*_0.column_0 + *VALUES*_0.column_1)] }
└─LogicalInsert { table: t }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LogicalInsert { table: t, returning: true } looks clear 🤣

@BugenZhao BugenZhao added the type/feature Type: New feature. label Jan 3, 2023
let data_types = self.child.schema().data_types();
let mut builder = DataChunkBuilder::new(data_types, self.chunk_size);
let mut builder = DataChunkBuilder::new(data_types.clone(), self.chunk_size);
let mut updated_builder = DataChunkBuilder::new(data_types.clone(), self.chunk_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some ideas: the Update with returning should behave like a Project, which does not "buffer" or reorganize the chunks with a builder. 🤔 In this sense, should we directly yield every updated chunk in the execution loop?

@y-wei
Copy link
Contributor Author

y-wei commented Jan 5, 2023

Could you please also add some end-to-end tests?

Have added some. Those tests would pass should this change to sqllogictest be merged.

Copy link
Member

@BugenZhao BugenZhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM. Good work!!

Comment on lines 77 to 78
"{} {{ table: {} }}, {{ returning: {} }}",
name, self.table_name, self.returning
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should only be one pair of brackets, like Rust's Debug. Same for other occurrences.

Suggested change
"{} {{ table: {} }}, {{ returning: {} }}",
name, self.table_name, self.returning
"{} {{ table: {}{} }}",
name, self.table_name, if self.returning { ", returning: true" } else { "" }

|| lower_sql.starts_with("with")
|| lower_sql.starts_with("describe")
|| lower_sql.starts_with("explain")
|| lower_sql.contains("returning")
Copy link
Member

@BugenZhao BugenZhao Jan 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this be too hacky? But as we make RETURNING a reserved word and do not support RETURNING in CTE, it seems okay. 🤔 cc @BowenXiao1999

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@BowenXiao1999 BowenXiao1999 Jan 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

directly operating on str is always hacky. For example, I guess insert into t values ('returning') will be bad for this case. starts_with looks no bad. Anyway I don't quite like process str directly before use Parser, but it's seems like non-trival to avoid. let's see whether we can remove all is_query_sql in future (by introduce Parser AST?, not sure).

cc @xxchan not sure whether has better ideas?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reminds me of #6849 (comment) 🤪 So we are really going to introduce parser in pgwire..

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

directly operating on str is always hacky.

+1. For sqllogictest, this is good enough. But in pgwire, we should be more careful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I revert this change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we can revert these changes that is not must-required in this pr, and add some comments to explain that it miss the ability to process Returning keywords.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure!

@BugenZhao
Copy link
Member

Could you please also add some end-to-end tests?

Have added some. Those tests would pass should this change to sqllogictest be merged.

Remember to bump the docker image version and sqllogictest version after you do this. 🥰

######################################################
# !!! CHANGE THIS WHEN YOU WANT TO BUMP CI IMAGE !!! #
# AND ALSO docker-compose.yml #
######################################################
export BUILD_ENV_VERSION=v20221212

&& cargo install --git https://github.com/risinglightdb/sqllogictest-rs --rev 65b122f --bin sqllogictest \

@y-wei
Copy link
Contributor Author

y-wei commented Jan 9, 2023

e2e tests have been commented out (blocked by #7277). This pr should be able to merge!

Copy link
Member

@BugenZhao BugenZhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may also need to bump the sqllogictest library version.

sqllogictest = "0.9"

batch_plan: |
BatchExchange { order: [], dist: Single }
└─BatchInsert { table: t }
└─BatchInsert { table: t }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the trailing space here. Same for Update and Delete.

insert into t values (timestamp '2020-01-01 01:02:03'), (time '03:04:05');
batch_plan: |
BatchExchange { order: [], dist: Single }
<<<<<<< HEAD
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please resolve the conflicts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type/feature Type: New feature. user-facing-changes Contains changes that are visible to users

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants