Skip to content

Conversation

@lmatz
Copy link
Contributor

@lmatz lmatz commented Mar 11, 2022

What's changed and what's your intention?

PLEASE DO NOT LEAVE THIS EMPTY !!!

Please explain IN DETAIL what the changes are in this PR and why they are needed:
As detected by @xiangjinwu in #831,
the previous implementation does not treat null as a value. However, single_value aggregation should accept at most one value no matter the value is null or not. Therefore, we need a struct instead of a function to record the number of inputs as a null of Option has already been interpreted as a null value.

This PR partially reverts the changes made in #825 as RTFn becomes a trait that needs to be implemented again.

Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests

Refer to a related PR or issue link (optional)

#831

@github-actions github-actions bot added the type/fix Type: Bug fix. Only for pull requests. label Mar 11, 2022
@codecov
Copy link

codecov bot commented Mar 11, 2022

Codecov Report

Merging #854 (8e2a3c2) into main (9f2387b) will decrease coverage by 0.05%.
The diff coverage is 78.94%.

@@             Coverage Diff              @@
##               main     #854      +/-   ##
============================================
- Coverage     72.02%   71.96%   -0.06%     
  Complexity     2766     2766              
============================================
  Files           927      927              
  Lines         54086    54139      +53     
  Branches       1787     1787              
============================================
+ Hits          38953    38960       +7     
- Misses        14243    14289      +46     
  Partials        890      890              
Flag Coverage Δ
java 61.22% <ø> (ø)
rust 76.37% <78.94%> (-0.09%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...t/common/src/vector_op/agg/general_distinct_agg.rs 74.67% <14.28%> (-0.99%) ⬇️
rust/common/src/vector_op/agg/aggregator.rs 90.32% <55.55%> (-5.91%) ⬇️
rust/common/src/vector_op/agg/functions.rs 82.92% <90.00%> (+7.92%) ⬆️
rust/common/src/vector_op/agg/general_agg.rs 97.04% <96.77%> (-0.60%) ⬇️
rust/frontend/src/observer/observer_manager.rs 85.71% <0.00%> (-0.96%) ⬇️
rust/common/src/types/ordered_float.rs 25.49% <0.00%> (-0.34%) ⬇️
rust/meta/src/manager/catalog_v2.rs 0.00% <0.00%> (ø)

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@lmatz lmatz requested review from skyzh and xiangjinwu March 11, 2022 07:21
Copy link
Contributor

@xiangjinwu xiangjinwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about impl Aggregator rather than reusing GeneralAgg? The extension in this PR means some state are captured in the GeneralAgg::result, and some in GeneralAgg::f. Can we keep f pure and capture all states explicitly in a struct that impl Aggregator?

@xiangjinwu
Copy link
Contributor

As for 2-phase agg, the sub-counts from local results needs to be summed. An immature idea is to rewrite it like [localCount,first] - [globalSum,first] - Project[ globalSum <= 1 ? first : err ], just like avg is rewritten to [localCount, localSum] - [globalSum0, globalSum1] - Project [ globalSum1 / globalSum0 ]

@lmatz
Copy link
Contributor Author

lmatz commented Mar 11, 2022

What about impl Aggregator rather than reusing GeneralAgg? The extension in this PR means some state are captured in the GeneralAgg::result, and some in GeneralAgg::f. Can we keep f pure and capture all states explicitly in a struct that impl Aggregator?

Let me try it 👌 .

As for 2-phase agg, the sub-counts from local results needs to be summed. An immature idea is to rewrite it like [localCount,first] - [globalSum,first] - Project[ globalSum <= 1 ? first : err ], just like avg is rewritten to [localCount, localSum] - [globalSum0, globalSum1] - Project [ globalSum1 / globalSum0 ]

You are suggesting this can be done purely in the frontend, do I get your point?
Once a local single_value aggregation has a count > 1, then the entire query should stop and return a runtime error as PG does. We can error earlier, i.e. at the local stage, instead of erroring later, i.e. at the global stage? This could be the only reason I think rewriting may be worse in terms of performance, but very close as the number of messages needed to be exchanged is small. (but may also be large as first still needs to be called a lot of times if the query is ill-written?)

Let's hold this PR for a while and seek other ones' opinion.

@lmatz
Copy link
Contributor Author

lmatz commented Mar 11, 2022

@liurenjie1024 @fuyufjh @TennyZhuang @st1page What do you guys think?

@xiangjinwu
Copy link
Contributor

As for 2-phase agg, the sub-counts from local results needs to be summed. An immature idea is to rewrite it like [localCount,first] - [globalSum,first] - Project[ globalSum <= 1 ? first : err ], just like avg is rewritten to [localCount, localSum] - [globalSum0, globalSum1] - Project [ globalSum1 / globalSum0 ]

You are suggesting this can be done purely in the frontend, do I get your point? Once a local single_value aggregation has a count > 1, then the entire query should stop and return a runtime error as PG does. We can error earlier, i.e. at the local stage, instead of erroring later, i.e. at the global stage? This could be the only reason I think rewriting may be worse in terms of performance, but very close as the number of messages needed to be exchanged is small.

Yes, it is mostly frontend. I also prefer to error out earlier at local stage, but unsure if it is always possible. What if there is 1 null at each local node? The local node cannot call it an error yet, and the global one cannot differentiate this case with a correct empty result, without getting count from local.

There may be better solutions. Just to complete the idea above, frontend rewrite will remove this special aggregate function, but it still need backend support for first and error reporting in Project. Actually first can be any or even min, but min runs slower and any is less useful.

@lmatz
Copy link
Contributor Author

lmatz commented Mar 14, 2022

As @fuyufjh suggests, there could be a more general solution that uses other operators, i.e. join, to achieve this instead of a special single_value aggregation function. Since we want to run the streaming tpch q15 query now, we may still need this aggregation function as a temporary workaround. Therefore, the related issue post will be kept.

Since we may remove this altogether later, the PR keeps the current implementation for the moment. And this issue will be investigated further.

@lmatz lmatz merged commit f342dcb into main Mar 14, 2022
@lmatz lmatz deleted the lz/sv branch March 14, 2022 05:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type/fix Type: Bug fix. Only for pull requests.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants