Memory optimizations #31941
Conversation
Force-pushed from 9a46be9 to 0cf728c
Do we have a way to lock down the current state of memory usage in such a scenario in a performance test or something similar? I am afraid a wide array of local improvements might not hold up after later refactorings, which could accidentally negate or remove these smaller improvements. If we had a way to lock down the current state, we could then capture the improved state as well.
Our performance tests do have their max daemon heap sizes configured, so it could be possible to fine-tune that number for each perf test so that the test would OOM if memory usage rises. I do see a couple of issues with this approach, though.
A nice goal would be to say that success here means being able to run the 6k project benchmark on CI. We already have gradle/perf-android-extra-large, but the last time I tried to set up CI benchmarks on it, it hung on CI.
I know the Android team has a benchmark suite that captures heap dumps at certain times and diffs them to verify changes in memory usage. That said, I'm not sure we should hold up many GB of memory usage improvements over further testing improvements.
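For readers who want a concrete picture of what "locking down" memory usage could look like, here is a rough, self-contained sketch; it is not existing Gradle test infrastructure, and the scenario hook and heap budget below are invented for illustration only.

```java
// Rough illustration only -- not existing Gradle test infrastructure.
// The scenario hook and the budget value are made up for this example.
public final class RetainedHeapBudgetCheck {

    private static final long BUDGET_BYTES = 23L * 1024 * 1024 * 1024; // assumed budget

    public static void main(String[] args) {
        runScenario(); // placeholder for the real workload (e.g. a benchmark build)

        System.gc(); // best effort; a real test would sample repeatedly
        Runtime rt = Runtime.getRuntime();
        long usedBytes = rt.totalMemory() - rt.freeMemory();

        if (usedBytes > BUDGET_BYTES) {
            throw new AssertionError(
                "Retained heap " + usedBytes + " bytes exceeds budget of " + BUDGET_BYTES + " bytes");
        }
        System.out.println("Used heap within budget: " + usedBytes + " bytes");
    }

    private static void runScenario() {
        // In a real performance test this would execute the benchmark scenario.
    }
}
```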
...untime/base-services/src/main/java/org/gradle/internal/graph/CachingDirectedGraphWalker.java (outdated; resolved)
List<WbResource> result = new ArrayList<>(1);
result.add(new WbResource("/", webAppDirName));
return result;
💭 Collections.singletonList()?
See below
new Facet(Facet.FacetType.installed, "jst.utility", "1.0"),
new Facet(Facet.FacetType.installed, "jst.java", toJavaFacetVersion(project.getExtensions().getByType(JavaPluginExtension.class).getSourceCompatibility()))
);
List<Facet> result = new ArrayList<>(3);
💭 Is something modifying the return value from an anonymous type that only promises some `List`, not anything definitely modifiable? Can we use an immutable list here and avoid the modification?
`EclipseWtpFacet` is part of an extension on the Project. It is a user-modifiable model, and should be mutable.
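For illustration (not code from this PR), here is a small standalone example of the trade-off discussed in this thread: `Collections.singletonList()` returns a fixed-size list, so it is only safe when callers never modify the returned value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Standalone illustration of the trade-off: singletonList() is fixed-size,
// so it only fits places where the returned list is never modified.
public class SingletonListTradeOff {
    public static void main(String[] args) {
        List<String> mutable = new ArrayList<>(1);
        mutable.add("/");
        mutable.add("extra"); // fine: callers may grow a user-modifiable model

        List<String> fixed = Collections.singletonList("/");
        try {
            fixed.add("extra"); // throws: singletonList is unmodifiable
        } catch (UnsupportedOperationException e) {
            System.out.println("singletonList cannot be modified");
        }
    }
}
```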
"module " + moduleName + " {" | ||
); | ||
String firstLine = "module " + moduleName + " {"; | ||
List<String> lines = new ArrayList<>(); |
💅 Could pre-compute the size needed here, too.
I didn't feel this was necessary. This isn't really performance-sensitive at all, and I didn't want to add the maintenance burden.
If it were simpler (didn't have a dynamic size based on filtering), I'd say this would be more worth it.
File commonTools = new File(vsPath, PATH_COMMONTOOLS);
File commonIde = new File(vsPath, PATH_COMMONIDE);
- List<File> paths = Lists.newArrayList(commonTools, commonIde);
+ List<File> paths = new ArrayList<>();
Suggested change:
- List<File> paths = new ArrayList<>();
+ List<File> paths = new ArrayList<>(3);
public void complete() {
    dependents = ImmutableSet.copyOf(dependents);
}
🤔 I think this needs a short explanation. Why should it be called? What is "completed"? Why are we putting the responsibility for "finalizing" the collection as an immutable type onto clients of this class, especially if they can only get an unmodifiable set view of this collection anyway?
AllSchemesAuthentication authentication = getAuthForProxy(httpsProxy);
useCredentials(credentialsProvider, Collections.singleton(authentication));
💭 Can all of this be moved into a new `useCredsForProxy` method, removing the need for `getAuthForProxy` as a separate method?
Done
import java.util.HashSet;
import java.util.Set;

public class ConsumerState {
💅 New top-level class needs javadoc, maybe `final`?
Made final.
Unfortunately I'm not sure I know all that much about this type and its usage. I just moved the code somewhere else.
private void validateMutations(MutationInfo mutations) {
    if (!mutations.getDestroyablePaths().isEmpty()) {
        if (mutations.hasOutputs()) {
            throw new IllegalStateException("Task " + node + " has both outputs and destroyables defined. A task can define either outputs or destroyables, but not both.");
        }
        if (mutations.hasFileInputs()) {
            throw new IllegalStateException("Task " + node + " has both inputs and destroyables defined. A task can define either inputs or destroyables, but not both.");
        }
        if (mutations.hasLocalState()) {
            throw new IllegalStateException("Task " + node + " has both local state and destroyables defined. A task can define either local state or destroyables, but not both.");
        }
    }
}
🤔 Would this be better as a `MutationInfo` method: `mutations.validate(node)`?
That's definitely an option, but I like the idea of having `MutationInfo` as mostly a data class.
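For context, a simplified sketch of the alternative being suggested in this thread, with invented field names and a plain `Object` standing in for the node type; the point is only that validation would live on the `MutationInfo`-like type rather than in the scheduling code.

```java
// Simplified sketch of the suggested alternative; field names are hypothetical.
final class MutationInfoSketch {
    private final boolean hasOutputs;
    private final boolean hasFileInputs;
    private final boolean hasLocalState;
    private final boolean hasDestroyables;

    MutationInfoSketch(boolean hasOutputs, boolean hasFileInputs, boolean hasLocalState, boolean hasDestroyables) {
        this.hasOutputs = hasOutputs;
        this.hasFileInputs = hasFileInputs;
        this.hasLocalState = hasLocalState;
        this.hasDestroyables = hasDestroyables;
    }

    // The scheduler would call mutations.validate(node) instead of owning this logic.
    void validate(Object node) {
        if (!hasDestroyables) {
            return;
        }
        if (hasOutputs) {
            throw new IllegalStateException("Task " + node + " has both outputs and destroyables defined.");
        }
        if (hasFileInputs) {
            throw new IllegalStateException("Task " + node + " has both inputs and destroyables defined.");
        }
        if (hasLocalState) {
            throw new IllegalStateException("Task " + node + " has both local state and destroyables defined.");
        }
    }
}
```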
This allocates an oversized list, potentially leading to unnecessary memory usage.
We know the size of these arrays by default. Size them to avoid wasted memory.
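A minimal before/after illustration of what these two sizing commits describe (the element values are made up); the change is simply sizing the `ArrayList` to the known element count instead of the padded capacity chosen by Guava's `Lists.newArrayList(E...)`.

```java
import com.google.common.collect.Lists;

import java.util.ArrayList;
import java.util.List;

// Before/after illustration only; the string values are made up.
public class PresizedListExample {
    public static void main(String[] args) {
        // Before: the Guava factory pads the backing array beyond the 2 elements given.
        List<String> padded = Lists.newArrayList("commonTools", "commonIde");

        // After: the element count is known up front, so size the list exactly.
        List<String> exact = new ArrayList<>(2);
        exact.add("commonTools");
        exact.add("commonIde");

        System.out.println(padded.size() + " vs " + exact.size()); // 2 vs 2
    }
}
```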
This cache gets retained in the heap for the entirety of the graph walk. For very large graphs, this cache can retain tens of GB of memory. The values of this cache are immutable, yet we store a mutable set within it -- a LinkedHashSet -- which is very memory inefficient. In the Android 6k project benchmark, when running assemble, this cache was found to contain over 124 million LinkedHashMap entries, totaling almost 8GB of retained heap _for the map nodes themselves_, not including the contents of the map. This commit stores an immutable Set instead, ensuring we use a much more space-efficient data structure for this cache.
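As a schematic of this change (hypothetical type parameters, not the actual `CachingDirectedGraphWalker` code): values can still be computed into an ordered mutable set during the walk, but the copy retained in the cache is an `ImmutableSet`.

```java
import com.google.common.collect.ImmutableSet;

import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Schematic only -- not the real CachingDirectedGraphWalker code.
// The copy that stays cached for the rest of the walk is an ImmutableSet,
// avoiding a LinkedHashMap entry object per element.
final class CachedValuesSketch<N, T> {
    private final Map<N, Set<T>> cache = new HashMap<>();

    void store(N node, LinkedHashSet<T> computedValues) {
        cache.put(node, ImmutableSet.copyOf(computedValues));
    }

    Set<T> get(N node) {
        return cache.get(node);
    }
}
```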
Each graph resolved during build dependency resolution is retained in memory while walking the task graph. While this is only a partial graph, without external dependencies, it can still take up a significant amount of memory. We update ResolvedComponentResult to use immutable data structures, which are significantly more memory efficient than mutable data structures. This reduces the amount of memory retained during task dependency graph walking, and later in the build if any build logic accesses a ResolutionResult.
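A hypothetical sketch of this pattern (names invented, not the real `ResolvedComponentResult` API): dependency edges are accumulated through a Guava builder during resolution, and the retained object ends up holding only the compact immutable list.

```java
import com.google.common.collect.ImmutableList;

import java.util.List;

// Hypothetical sketch; class, method, and field names are invented.
// Edges are accumulated through a builder during resolution, and only the
// compact immutable list is retained afterwards.
final class ResolvedComponentSketch {
    private final ImmutableList.Builder<String> dependencyEdges = ImmutableList.builder();
    private List<String> frozenEdges;

    void addDependencyEdge(String edge) {
        dependencyEdges.add(edge);
    }

    void resolutionFinished() {
        frozenEdges = dependencyEdges.build();
    }

    List<String> getDependencyEdges() {
        return frozenEdges;
    }
}
```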
MutationInfo, with the exception of the nodesYetToConsumeOutput state, was generated wholly by the ResolveMutationsNode. After the ResolveMutationsNode executes, the MutationInfo is not mutated further. For very large graphs, this MutationInfo can retain a significant amount of memory -- in part due to the memory-inefficient mutable data structures used to store its state. This commit ensures we calculate the entirety of MutationInfo at once, so that we can store its state in a memory-efficient immutable data structure. This greatly reduces the retained heap from this state.
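A simplified, hypothetical sketch of that idea: every field is computed before construction and stored in immutable collections, so the object never needs to be mutated or compacted afterwards.

```java
import com.google.common.collect.ImmutableSet;

import java.util.Set;

// Simplified, hypothetical sketch: all state is computed up front and stored
// immutably, so nothing is mutated after construction.
final class MutationStateSketch {
    private final ImmutableSet<String> outputPaths;
    private final ImmutableSet<String> destroyablePaths;
    private final boolean hasLocalState;

    MutationStateSketch(Set<String> outputPaths, Set<String> destroyablePaths, boolean hasLocalState) {
        this.outputPaths = ImmutableSet.copyOf(outputPaths);
        this.destroyablePaths = ImmutableSet.copyOf(destroyablePaths);
        this.hasLocalState = hasLocalState;
    }

    Set<String> getOutputPaths() {
        return outputPaths;
    }

    Set<String> getDestroyablePaths() {
        return destroyablePaths;
    }

    boolean hasLocalState() {
        return hasLocalState;
    }
}
```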
Force-pushed from 685f8aa to a170385
Force-pushed from 03406e3 to 0a84ddd
The following builds have passed:
@bot-gradle test this
The following builds have passed:
Gradle is currently not able to execute `assemble` on the Android 6k project benchmark without running out of memory with a 55GB heap. This PR includes changes to help reduce the amount of retained memory during task graph traversal.

While Gradle still cannot execute `assemble` with a 55GB heap after these changes, it does reduce the memory required to execute `assembleDebug` by ~3GB. Before these changes, a 26GB heap was required to complete task graph traversal; after these changes, only a 23GB heap is required.

The majority of the reduced memory usage comes from using immutable collections instead of mutable collections. Immutable collections are much more memory efficient than mutable collections, as collections like HashMap/HashSet and their linked variants allocate a `Node` on the heap for every entry in the collection, while Guava immutable collections store their data in a contiguous array.
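To make the size difference concrete, here is a small check that is not part of the PR; it assumes Guava and the JOL object-layout tool (`org.openjdk.jol:jol-core`) are available, and compares the retained footprint of a `LinkedHashSet` against an `ImmutableSet` holding the same elements.

```java
import com.google.common.collect.ImmutableSet;
import org.openjdk.jol.info.GraphLayout;

import java.util.LinkedHashSet;
import java.util.Set;

// Not part of the PR: a quick sanity check of the claim above. Both sets hold
// the same Integer objects, so the difference in totals is collection overhead.
public class CollectionFootprintCheck {
    public static void main(String[] args) {
        Set<Integer> linked = new LinkedHashSet<>();
        for (int i = 0; i < 100_000; i++) {
            linked.add(i);
        }
        Set<Integer> immutable = ImmutableSet.copyOf(linked);

        System.out.println("LinkedHashSet footprint: " + GraphLayout.parseInstance(linked).totalSize() + " bytes");
        System.out.println("ImmutableSet footprint:  " + GraphLayout.parseInstance(immutable).totalSize() + " bytes");
    }
}
```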
This PR is separated into 5 commits:

- `Lists.newArrayList(E...)`. This likely does not provide much benefit, but this factory method allocates an oversized array for the new array list -- often wasting memory.
- `ResolvedComponentResult`. Migrate to immutable data structures in this type, as it is retained in memory for the entirety of graph traversal. We could look into potentially not keeping this in memory in future changes.
- Make `MutationInfo` immutable. `MutationInfo` on a task graph `Node` stores data related to the files that a node creates, consumes, and destroys. This was functionally immutable in the past. The code has been migrated to use immutable data structures to store this data.
Future things to look at:

- `Configuration`. In the 6k project benchmark, there are over 1.3 million `Configuration` instances. They are quite heavy-weight and the average "empty" configuration is ~5kb.
- `DependencyPredecessorsOnlyNodeSet`. It stores a `TreeSet` of nodes. For this large graph, the overhead of the `TreeSet` itself (only the nodes of the set, not the retained data) is ~5-7GB.

A snapshot of the memory situation this PR worked with:
