Skip to content

Conversation

jvandort
Copy link
Member

@jvandort jvandort commented Jan 6, 2025

Gradle is currently not able to execute assemble on the Android 6k project benchmark without running out of memory with a 55GB heap.

This PR includes changes to help reduce the amount of retained memory during task graph traversal.

While Gradle still cannot execute assemble with a 55GB heap after these changes, it does reduce the required memory to execute assembleDebug by ~3GB. Before these changes, a 26GB heap was required to complete task graph traversal. After these changes, only a 23GB heap is required.

The majority of reduced memory usage was by using immutable collections instead of mutable collections. Immutable collections are much more memory efficient than mutable collections, as collections like (non-linked)/Linked Hash Map/Set allocate a Node on the heap for every entry in the collection, while Guava immutable collections store data in a contiguous array.

This PR is separated into 5 commits:

  1. Avoid Lists.newArrayList(E...). This likely does not cause much benefit, but this factory method allocates an oversized array for the new array list -- often wasting memory.
  2. Size CachePolicy rules to initial expected size. A ResolutionStrategy is one of the largest parts of a Configuration. This marginally reduces the size of a Configuration.
  3. Use ImmutableSet when caching task graph node values. When walking the task graph, we keep an in-memory cache of node dependencies. We use an immutable collection instead of mutable collection as the value to this map, reducing the memory overhead of the cache.
  4. Use immutable data structures for ResolvedComponentResult. Migrate to immutable data structures in this type, as it is retained in memory for the entirety of graph traversal. We could look into potentially not keeping this in memory in future changes.
  5. Make MutationInfo immutable. MutationInfo on a task graph Node stores data related to the files that a node creates, consumes, and destroys. This was functionally immutable in the past. The code has been migrated to use immutable data structures to store this data.

Future things to look at:

  • Reduce the footprint of Configuration. In the 6k project benchmark, there are over 1.3 million Configuration instances. They are quite heavy-weight and the average "empty" configuration is ~5kb.
  • Reduce the footprint of DependencyPredecessorsOnlyNodeSet. It stores a TreeSet of nodes. For this large graph, the overhead of the TreeSet itself (only the nodes of the set, not the retained data) is ~5-7GB.

A snapshot of the memory situation this PR worked with:
image

Reviewing cheatsheet

Before merging the PR, comments starting with

  • ❌ ❓must be fixed
  • 🤔 💅 should be fixed
  • 💭 may be fixed
  • 🎉 celebrate happy things

@jvandort jvandort force-pushed the jvandort/performance/memory-optimizations branch 2 times, most recently from 9a46be9 to 0cf728c Compare January 6, 2025 23:34
@jvandort

This comment has been minimized.

@bot-gradle

This comment has been minimized.

@jvandort

This comment has been minimized.

@bot-gradle

This comment has been minimized.

@bot-gradle

This comment has been minimized.

@bot-gradle

This comment has been minimized.

@jvandort

This comment has been minimized.

@bot-gradle

This comment has been minimized.

@bot-gradle

This comment has been minimized.

@jvandort jvandort marked this pull request as ready for review January 11, 2025 01:04
@jvandort jvandort requested a review from a team as a code owner January 11, 2025 01:04
@jvandort jvandort requested a review from a team January 11, 2025 01:04
@jvandort jvandort requested a review from a team as a code owner January 11, 2025 01:04
@jvandort jvandort requested review from a team January 11, 2025 01:04
@jvandort jvandort requested review from a team as code owners January 11, 2025 01:04
@jvandort jvandort requested review from a team, alllex, donat, mlopatkin and tresat and removed request for a team January 11, 2025 01:04
@jvandort

This comment has been minimized.

@bot-gradle

This comment has been minimized.

@bot-gradle

This comment has been minimized.

@alllex
Copy link
Member

alllex commented Jan 13, 2025

Do we have a way to lock down the current state of memory usage in such a scenario in a performance test or something similar? I am afraid introducing a wide array of local improvements might not hold up after some later refactorings, which would accidentally negate or remove these smaller improvements.

Having a way to lock down the current state, we should then be able to capture the improved state.

@jvandort
Copy link
Member Author

Do we have a way to lock down the current state of memory usage in such a scenario in a performance test or something similar? I am afraid introducing a wide array of local improvements might not hold up after some later refactorings, which would accidentally negate or remove these smaller improvements.

Having a way to lock down the current state, we should then be able to capture the improved state.

Our performance tests do have their max daemon heap sizes configured, so it could be possible to fine-tune that number for each perf test so that if the memory usage rises the test would OOM.

Though I see a couple issues with this

  1. It is quite tedious to run those tests manually and find out how much memory the tests actually need, as it requires a trial and error like process
  2. When Gradle builds do run out of memory, it often surfaces as a hang of the build process and not a strict OOM Error.

A nice goal would be to say a success here is to be able to run the 6k project benchmark on CI. We already have gradle/perf-android-extra-large but when I tried to set up CI benchmarks on it last it hung on CI

@jvandort
Copy link
Member Author

I know the Android team has some benchmark suite that captures heap dumps as certain times and diffs them, verifying changes in memory usage.

Though I'm not sure we should hold up many GB of memory usage improvements over further testing improvements

Comment on lines +207 to +209
List<WbResource> result = new ArrayList<>(1);
result.add(new WbResource("/", webAppDirName));
return result;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 Collections.singletonList()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See below

new Facet(Facet.FacetType.installed, "jst.utility", "1.0"),
new Facet(Facet.FacetType.installed, "jst.java", toJavaFacetVersion(project.getExtensions().getByType(JavaPluginExtension.class).getSourceCompatibility()))
);
List<Facet> result = new ArrayList<>(3);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 Is something modifying the return value from an anonymous type that promises only some List, nothing definitely modifiable? Can we use immutable list here and avoid the modification?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EclipseWtpFacet is part of an extension on the Project. It is a user-modifyable model, and should be mutable

"module " + moduleName + " {"
);
String firstLine = "module " + moduleName + " {";
List<String> lines = new ArrayList<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💅 Could pre-compute the size needed here, too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't feel this was necessary. This isn't really performance sensitive at all and I didn't want to add the maintenance burden

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it were simpler (didnt have a dynamic size based on filtering) I'd say this is more worth it

File commonTools = new File(vsPath, PATH_COMMONTOOLS);
File commonIde = new File(vsPath, PATH_COMMONIDE);
List<File> paths = Lists.newArrayList(commonTools, commonIde);
List<File> paths = new ArrayList<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
List<File> paths = new ArrayList<>();
List<File> paths = new ArrayList<>(3);

Comment on lines +202 to +204
public void complete() {
dependents = ImmutableSet.copyOf(dependents);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 I think this needs a short explanation. Why should it be called? What is "completed"? Why are we putting the responsibility for "finalizing" the collection as an immutable type onto clients of this class, especially if they can only get an unmodifiable set view of this collection anyways.

Comment on lines 193 to 194
AllSchemesAuthentication authentication = getAuthForProxy(httpsProxy);
useCredentials(credentialsProvider, Collections.singleton(authentication));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 Can all of this be moved into a new useCredsForProxy method, removing the need for getAuthForProxy as a separate method?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

import java.util.HashSet;
import java.util.Set;

public class ConsumerState {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💅 New top-level class needs javadoc, maybe final?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made final.

Unfortunately I'm not sure I know all that much about this type and its usage. I just moved the code somewhere else

Comment on lines +186 to +198
private void validateMutations(MutationInfo mutations) {
if (!mutations.getDestroyablePaths().isEmpty()) {
if (mutations.hasOutputs()) {
throw new IllegalStateException("Task " + node + " has both outputs and destroyables defined. A task can define either outputs or destroyables, but not both.");
}
if (mutations.hasFileInputs()) {
throw new IllegalStateException("Task " + node + " has both inputs and destroyables defined. A task can define either inputs or destroyables, but not both.");
}
if (mutations.hasLocalState()) {
throw new IllegalStateException("Task " + node + " has both local state and destroyables defined. A task can define either local state or destroyables, but not both.");
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 Would this be better as a MutationInfo method: mutations.validate(node)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats definitely an option, but I like the idea of having MutationInfo as mostly a data class

This allocates an oversized list, potentially leading to unnecessary memory usage.
We know the size of these arrays by default. Size them to avoid wasted memory
This cache gets retained in heap for the entirety of cache graph walking.
For very large graphs, this cache can retain tens of GB of memory.
The values of this cache are immutable, yet we store a mutable set within it -- a LinkedHashSet -- which is very memory inefficient

In the Android 6k project benchmark project, when running assemble, this cache was found to contain over 124 million LinkedHashMap entries,
totaling almost 8GB of retained heap _for the map nodes themselves_, not including the contents of the map.

This commit stores an immutable Set instead, ensuring we use a much more space efficient data structure for this cache.
Each graph resolved during build dependency resolution is retained in memory while walking the task graph.
While this is only a partial graph, without external dependencies, it can still take up a significant amount of memory.

We update ResolvedComponentResult to use immutable data structures, which are signifncantly more memory efficient
than mutable data structures. This reduces the amount of memory retained during task dependency graph walking, and
later in the build if any build logic accesses a ResolutionResult
MutationInfo, with the exception of the nodesYetToConsumeOutput state, was generated wholly by the ResolveMutationsNode.
After the ResolveMutationsNode executes, the MutationInfo is not mutated further.
For very large graphs, this MutationInfo can retain a significant amount of memory -- in part due to memory inefficient mutable
data structures used to store its state.

This commit ensures we calculate the entirety of MutationInfo at once, so that we can store its state in a memory efficient immutable
data structure. This greately reduces the retained heap from this state.
@jvandort jvandort force-pushed the jvandort/performance/memory-optimizations branch from 685f8aa to a170385 Compare January 15, 2025 16:57
@jvandort jvandort force-pushed the jvandort/performance/memory-optimizations branch from 03406e3 to 0a84ddd Compare January 15, 2025 17:07
@jvandort

This comment has been minimized.

@bot-gradle

This comment has been minimized.

@bot-gradle
Copy link
Collaborator

The following builds have passed:

@jvandort jvandort added this pull request to the merge queue Jan 15, 2025
@bot-gradle bot-gradle added this to the 8.13 RC1 milestone Jan 15, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 15, 2025
@jvandort
Copy link
Member Author

@bot-gradle test this

@bot-gradle

This comment has been minimized.

@bot-gradle
Copy link
Collaborator

The following builds have passed:

@jvandort jvandort added this pull request to the merge queue Jan 15, 2025
Merged via the queue into master with commit 636df91 Jan 15, 2025
26 of 28 checks passed
@jvandort jvandort deleted the jvandort/performance/memory-optimizations branch January 15, 2025 22:19
Copy link

@MrG9090 MrG9090 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants