fix(connector): fix start_offset and empty range #7112
if let Some(start_offset) = self.start_offset && start_offset == stop_offset {
    yield Vec::new();
    return Ok(());
} else if stop_offset == 0 {
    yield Vec::new();
    return Ok(());
}
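The two branches above can be read as a single predicate. The following is a minimal sketch, not the PR's code: the function name `is_empty_range` and the `i64` offset type are assumptions made for illustration, and the boolean expression only mirrors the two conditions shown in the diff.

```rust
/// Hypothetical helper: returns true when the diff above would yield an
/// empty batch and stop, i.e. when `start_offset` equals `stop_offset`
/// or when `stop_offset` is 0.
/// The `i64` offset type is an assumption, not taken from the PR.
fn is_empty_range(start_offset: Option<i64>, stop_offset: i64) -> bool {
    start_offset == Some(stop_offset) || stop_offset == 0
}
```

For example, `is_empty_range(Some(5), 5)` and `is_empty_range(None, 0)` are both true, while `is_empty_range(Some(3), 7)` is false.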
It returns nothing. Consider refusing to create a reader in such a case.
If we refuse to create a reader and return an error, there are two options:
- We directly report an error if the range returns nothing. This behavior seems odd because we do not report an error when a table is empty.
- We catch the error and propagate it from bottom to top. Because SplitReaderImpl::create returns an AnyError, we would have to identify the error by comparing strings, and I'm not sure whether that is a good approach.
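One alternative to comparing error strings is to downcast a type-erased error to a dedicated marker type. The sketch below uses only the standard library; `EmptyRangeError`, `create_reader`, and `is_empty_range_error` are hypothetical names, and `Box<dyn Error + Send + Sync>` merely stands in for the AnyError mentioned above. Whether the real error type supports the same kind of downcast is an assumption that would need checking.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical marker error meaning "the requested offset range is empty".
#[derive(Debug, PartialEq)]
struct EmptyRangeError;

impl fmt::Display for EmptyRangeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "offset range is empty")
    }
}

impl Error for EmptyRangeError {}

/// Stand-in for a reader constructor that returns a type-erased error.
fn create_reader(start: i64, stop: i64) -> Result<(), Box<dyn Error + Send + Sync>> {
    if start == stop {
        return Err(Box::new(EmptyRangeError));
    }
    Ok(())
}

/// Recognises the empty-range case by type rather than by message text.
fn is_empty_range_error(err: &(dyn Error + Send + Sync + 'static)) -> bool {
    err.downcast_ref::<EmptyRangeError>().is_some()
}
```

A caller higher up could then match on `is_empty_range_error(&*err)` and leave every other error untouched, which avoids depending on the wording of the message.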
We directly report an error if the range returns nothing.
I vote for this solution. A table is an internal concept that is easy to handle, but querying Kafka has more overhead than querying a table. If we know deterministically that the result is empty, why waste resources on it?
@liurenjie1024 @fuyufjh what do you think?
We create a batch source executor for each split, which means we can't create a source executor if all splits are empty and all of them are filtered out. So for now we directly return an error in this case.
This behavior may cause a problem: if we left join a table with an empty source, e.g. select * from table left join empty_source, directly returning an error will make the whole query fail. But I think this scenario is rare...
If we want the user to see an empty result instead of an error, maybe we can:
- create an empty batch value executor
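To make the left-join concern concrete: a left join against an empty right side still returns every left row, so turning the empty source into an error would hide rows the user expects. The toy function below is only an illustration of that semantics over in-memory tuples; `left_join` and its row shapes are hypothetical and unrelated to the actual executors.

```rust
/// Toy left join on the first tuple field. Left rows without a match
/// are kept and paired with `None`.
fn left_join(
    left: &[(i32, &str)],
    right: &[(i32, &str)],
) -> Vec<(i32, String, Option<String>)> {
    let mut out = Vec::new();
    for &(key, l_val) in left {
        let mut matched = false;
        for &(r_key, r_val) in right {
            if r_key == key {
                out.push((key, l_val.to_string(), Some(r_val.to_string())));
                matched = true;
            }
        }
        if !matched {
            // No matching right row: keep the left row anyway.
            out.push((key, l_val.to_string(), None));
        }
    }
    out
}
```

With a two-row table and an empty source, the result still has two rows, each with `None` on the right side.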
I don't think we should return an error here. It feels like an abuse of error reporting to me, since an empty result is common in queries. I think just returning an empty vec is good enough.
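A minimal sketch of the "just return an empty vec" approach, assuming a simplified synchronous reader. The `Message` type, the `read_batches` function, and the `String` error type are hypothetical; the real reader is a stream that yields batches.

```rust
/// Hypothetical message type standing in for the connector's own.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    offset: i64,
}

/// Returns the batches for the offsets from `start` up to `stop`.
/// An empty range is treated as a successful, empty result and not as an error.
fn read_batches(start: i64, stop: i64) -> Result<Vec<Vec<Message>>, String> {
    if start == stop {
        // Known to be empty without contacting the broker.
        return Ok(vec![Vec::new()]);
    }
    let batch: Vec<Message> = (start..stop).map(|offset| Message { offset }).collect();
    Ok(vec![batch])
}
```

Downstream operators then see one empty batch, which is handled the same way as an empty table.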
Codecov Report
@@ Coverage Diff @@
## main #7112 +/- ##
==========================================
- Coverage 73.15% 73.15% -0.01%
==========================================
Files 1052 1052
Lines 167399 167424 +25
==========================================
+ Hits 122467 122477 +10
- Misses 44932 44947 +15
This PR is included in #7150
I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.
What's changed and what's your intention?
fix:
Checklist
./risedev check (or alias, ./risedev c)
Documentation
If your pull request contains user-facing changes, please specify the types of the changes, and create a release note. Otherwise, please feel free to remove this section.
Types of user-facing changes
Please keep the types that apply to your changes, and remove those that do not apply.
Release note
Please create a release note for your changes. In the release note, focus on the impact on users, and mention the environment or conditions where the impact may occur.
Refer to a related PR or issue link (optional)