-
Notifications
You must be signed in to change notification settings - Fork 5k
Fix GCS resource list operation correctness and performance #3023
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix GCS resource list operation correctness and performance #3023
Conversation
Thanks for providing the pull request. We are going to prioritize soon. |
Is there a way you could write an integration that demonstrate that the implementation is now more efficient and performant? |
@bmuschko I can add an integration test that makes sure that the |
@bmuschko I have updated GCS Maven integration tests with an additional test that triggers the case where repository is missing the root Maven metadata and Gradle falls back to listing files iteratively. The server stub was updated to verify that only requests with a specific prefix are used when listing module resources. |
Bump @bmuschko |
1de802f
to
75afd6a
Compare
@eleventigerssc Thanks for the excellent contribution, I've finished review and everything looks fine. I'm waiting for our CI feedback then I can merge it. The only question is unrelated to this PR, in your previous contribution: https://github.com/eleventigerssc/gradle/blob/4da6ba7cfd72ab0027c890f9cc9e3db1e26310f0/subprojects/resources-gcs/src/main/java/org/gradle/internal/resource/transport/gcp/gcs/GcsClient.java#L162 Is this
|
@blindpirate good point re: double decode. It looks like a plain oversight from my end however we should check why the original GCS support PR had manual |
@eleventigerssc Got it. Thanks for your reply. |
Signed-off-by: Bo Zhang <[email protected]>
@eleventigerssc Merged, thanks! |
@blindpirate awesome, thank you! |
Context
The GCS support added through #2258 has a severe performance/liveness bug that manifests when using large GCS repositories. When
ExternalResourceLister
performs a "list" action the internalGcsClient
will always try to read the full list of GCS bucket objects starting from the root.ExternalResourceLister
expects to receive children of theURI
parent that is provided howeverGcsClient
fails to fulfil the contract by fetching and returning full list of objects for a bucket.When using the google json API we can specify a "prefix" when listing objects on a bucket. This "prefix" can be extracted as the path from an
URI
that is provided to theGcsClient#list
method. This fix allows us to correctly list objects for a specified bucket sub-path and gets rid of the performance issue.Contributor Checklist
<subproject>/src/integTest
) to verify changes from a user perspective<subproject>/src/test
) to verify logic./gradlew <changed-subproject>:check
Gradle Core Team Checklist