document how to make ad-hoc tasks cacheable #6691

jnizet · 2018-09-08T22:47:30Z

Context

I was experimenting with the build cache, and couldn't find any information in the gradle documentation on how to make ad-hoc tasks cacheable, although I feel it's a common use-case. I even thought, reading the build cache chapter of the documentation, that this wasn't supported.

But then I googled and found this document. So I thought adding a section with an example in the documentation could be useful.

Contributor Checklist

Review Contribution Guidelines
Make sure that all commits are signed off to indicate that you agree to the terms of Developer Certificate of Origin.
Check "Allow edit from maintainers" option in pull request so that additional changes can be pushed by Gradle team
Provide integration tests (under <subproject>/src/integTest) to verify changes from a user perspective
Provide unit tests (under <subproject>/src/test) to verify logic
Update User Guide, DSL Reference, and Javadoc for public-facing changes
Ensure that tests pass locally: ./gradlew <changed-subproject>:check

Gradle Core Team Checklist

Verify design and implementation
Verify test coverage and CI build status
Verify documentation
Recognize contributor in release notes

Signed-off-by: jnizet <[email protected]>

wolfs · 2018-09-12T09:51:58Z

@jnizet Thank you for your contribution! The example is really good and yes, we should have an example how to make custom tasks cacheable, like, as you a custom Exec task e.g. for Webpack.

I don't think that we should have a guide how to make ad-hoc tasks cacheable. The runtime API for making tasks cacheable is meant to be used for tasks where you cannot change the implementation. For the use-case you described above I think a custom task implementation would be much better. And then we don't need to use the runtime API.

Luckily, we already have a guide which describes how to get a cacheable webpack task: https://guides.gradle.org/running-webpack-with-gradle/#step_4_leverage_gradle_build_cache
The downside is that the guide is currently not correct - the cacheable task will be non-relocatable.

Instead of doing the changes here, could you improve the Webpack guide and link to there from here?

jnizet · 2018-09-12T10:37:51Z

@wolfs I understand your argument.

My main concern is that when looking for how to use the build cache, few people would go to a guide about webpack. But I also understand that I shouldn't encourage the usage of cacheable ad-hoc tasks.

How about the following improvements?

I update the webpack guide to make it relocatable (i.e. add the property names on inputs and outputs, in a separate PR?)
I reword my example in order to avoid talking about ad-hoc tasks, but only about generic tasks like Exec, JavaExec, or other tasks added by plugins which are not cacheable by default
I replace the ad-hoc task by a task extending JavaExec (I could use Exec and execute cat for example, but that wouldn't make the sample portable and the tests would fail on Windows). The text would insist on the fact that, in a real use-case, JavaExec would be some non-modifiable Webpack task coming from a plugin, or would simply be an Exec task executing webpack.

wolfs · 2018-09-12T10:46:37Z

@jnizet Your suggestions sound good. IMO, it would be enough to link to the webpack example from the build cache chapter. If you think that is not enough, then the rest of your suggestions are fine with me, too.

Thanks again!

Signed-off-by: jnizet <[email protected]>

wolfs · 2018-09-12T14:43:47Z

subprojects/docs/src/samples/buildCache/cacheable-bundle/groovy/build.gradle

-    }
+    main = "org.gradle.sample.Bundle"
+    classpath = project.sourceSets.main.runtimeClasspath
+    args projectDir.toPath().relativize(scripts.toPath()).toString(), projectDir.toPath().relativize(bundle.toPath()).toString()


I think we should not promote using of relative paths for the JavaExec task. We have CommandLineArgumentProvider and Exec.argumentProviders.add exactly for that reason. I think we should leverage CommandLineArgumentProvider in this example.

Actually, we need to ignore the path to the executable as well (at least as long as something like #6711 has not been merged). If not, people are forced to install Java in the same directory. So why not declare a custom webpack task which uses Project.javaExec? The other option would be to subclass the Exec task and override the getExecutable() method annotating it with @Internal.

And the final option is to override the executable property via the runtime API by doing something like inputs.property("executable", "webpack").

I would prefer using the custom task class and Project.javaExec.

OK, I'll go with a Webpack task.

For the purpose of this example, it will not be cacheable (of course), and will take two files as arguments, but these arguments won't be inputs, otherwise I will have the same problem as for the JavaExec task: the task will have inputs that I can't remove (AFAIK), and that will be absolute paths, and thus make the task not relocatable.

Given that the task won't even appear in the guide, there is no need to use javaExec, right? I can just use groovy/kotlin code directly in the task as I did in my first attempt.

Actually, to be more realistic, I think it would be more logical to use a very generic NpmTask, that would just take a list of arguments. A specific WebpackTask would probably define inputs and outputs, making the example look weird.

Signed-off-by: jnizet <[email protected]>

wolfs

That looks pretty good already! Thank you!

I added some comments.

Regarding where to place this content: Looking at it, it seems more suited for the build cache guide, since this guide is targeted more at build authors.

Would it be OK for you to add it as a chapter there? If you want to, I can also take care of moving the content you provided over there.

At some point, we wanted to move the build cache guide back to the userguide, so it would only be a different chapter instead of a whole new repository.

WDYT?

wolfs · 2018-09-13T07:02:17Z

subprojects/docs/src/samples/buildCache/non-cacheable-bundle/kotlin/build.gradle.kts

+// Fake NPM task that would normally execute npm with its provided arguments
+open class NpmTask : DefaultTask() {
+
+    lateinit var args: List<String>


A ListProvider<String> would be better.

wolfs · 2018-09-13T07:04:40Z

subprojects/docs/src/docs/userguide/build_cache.adoc

+
+As we have seen, built-in tasks, or tasks provided by plugins, are cacheable if their class is annotated with the `Cacheable` annotation. But what if you want to make cacheable a task whose class is not cacheable? Let's take a concrete example: your build script uses a generic `NpmTask` task to create a JavaScript bundle by delegating to NPM (and running `npm run bundle`). This process is similar to a complex compilation task, but `NpmTask` is too generic to be cacheable by default: it just takes arguments and runs npm with those arguments.
+
+The inputs and outputs of this task are obvious. The input is the directory containing the JavaScript files. The output is the bundle file generated by this task.


What about the configuration files for the bundle task?

wolfs · 2018-09-13T07:05:37Z

subprojects/docs/src/docs/userguide/build_cache.adoc

+include::sample[dir="buildCache/non-cacheable-bundle/kotlin",files="build.gradle.kts[tags=bundle-task]"]
+====
+
+If you're a modern JavaScript developer, you know that bundling can be quite long, and is worth caching. To achieve that, we need to tell Gradle when it's allowed to cache the output of that task, using the link:{javadocPath}/org/gradle/api/tasks/TaskOutputs.html#cacheIf-org.gradle.api.specs.Spec-[TaskOutputs.cacheIf()] method. In this case, it's always allowed. We also need to assign a property name to each input and output, used by the cache to identify them.


Nit: We (try to) use one sentence per line style in the userguide.

wolfs · 2018-09-13T07:07:20Z

subprojects/docs/src/docs/userguide/build_cache.adoc

+
+If you're a modern JavaScript developer, you know that bundling can be quite long, and is worth caching. To achieve that, we need to tell Gradle when it's allowed to cache the output of that task, using the link:{javadocPath}/org/gradle/api/tasks/TaskOutputs.html#cacheIf-org.gradle.api.specs.Spec-[TaskOutputs.cacheIf()] method. In this case, it's always allowed. We also need to assign a property name to each input and output, used by the cache to identify them.
+
+This is sufficient to make the task cacheable on your own machine. However, input files are identified by default by their absolute path. So if the cache needs to be shared between several developers or machines using different paths, that won't work as expected. So we also need to set the "path sensitivity". In this case, the relative path of the input files can be used to identify them.


Let's link to path sensitivity in the build cache guide: https://guides.gradle.org/using-build-cache/#relocatability

wolfs · 2018-09-13T07:09:59Z

subprojects/docs/src/samples/buildCache/cacheable-bundle/kotlin/build.gradle.kts

+// Fake NPM task that would normally execute npm with its provided arguments
+open class NpmTask : DefaultTask() {
+
+    lateinit var args: List<String>


Let's use a ListProperty<String>.

- one sentence per line - NPM config files as input - ListProperty<String> - pass a string rather than a file to outputs.file() - link to guide for relocatability Signed-off-by: jnizet <[email protected]>

jnizet · 2018-09-13T16:48:21Z

Thanks for the review @wolfs.
I made all the requested changes.

Regarding the location of this new section, if you think it should be in the build cache guide rather than here, that's fine with me.

I'm not too familiar with Spock though, so I'd be more comfortable if you moved it there (and rewrote the exemplar test as a Spock integration test).

wolfs

Looks good from my side! I will merge this soon.

for adding documentation for caching. PR: #6691

document how to make ad-hoc tasks cacheable

b40263f

Signed-off-by: jnizet <[email protected]>

eriwen requested review from lptr and wolfs September 10, 2018 16:16

eriwen added in:documentation-infrastructure waiting-for:gradle-team labels Sep 10, 2018

Avoid promoting ad-hoc tasks, and use a JavaExec-based example instead

c03872c

Signed-off-by: jnizet <[email protected]>

wolfs requested changes Sep 12, 2018

View reviewed changes

wolfs added the in:build-cache label Sep 12, 2018

wolfs self-assigned this Sep 12, 2018

wolfs added waiting-for:contributor and removed waiting-for:gradle-team labels Sep 12, 2018

Use a generic NpmTask, which just takes arguments

5a04be0

Signed-off-by: jnizet <[email protected]>

wolfs requested changes Sep 13, 2018

View reviewed changes

Changes after review

fe9d350

- one sentence per line - NPM config files as input - ListProperty<String> - pass a string rather than a file to outputs.file() - link to guide for relocatability Signed-off-by: jnizet <[email protected]>

wolfs approved these changes Sep 19, 2018

View reviewed changes

wolfs merged commit a27ca6a into gradle:master Sep 20, 2018

wolfs added a commit that referenced this pull request Sep 20, 2018

Recognize Jean-Baptiste Nizet in release notes

99f7207

for adding documentation for caching. PR: #6691

wolfs mentioned this pull request Sep 20, 2018

Add sample for cacheable custom task #6836

Merged

lptr added the @build-cache label Apr 16, 2019


		As we have seen, built-in tasks, or tasks provided by plugins, are cacheable if their class is annotated with the `Cacheable` annotation. But what if you want to make cacheable a task whose class is not cacheable? Let's take a concrete example: your build script uses a generic `NpmTask` task to create a JavaScript bundle by delegating to NPM (and running `npm run bundle`). This process is similar to a complex compilation task, but `NpmTask` is too generic to be cacheable by default: it just takes arguments and runs npm with those arguments.

		The inputs and outputs of this task are obvious. The input is the directory containing the JavaScript files. The output is the bundle file generated by this task.


		If you're a modern JavaScript developer, you know that bundling can be quite long, and is worth caching. To achieve that, we need to tell Gradle when it's allowed to cache the output of that task, using the link:{javadocPath}/org/gradle/api/tasks/TaskOutputs.html#cacheIf-org.gradle.api.specs.Spec-[TaskOutputs.cacheIf()] method. In this case, it's always allowed. We also need to assign a property name to each input and output, used by the cache to identify them.

		This is sufficient to make the task cacheable on your own machine. However, input files are identified by default by their absolute path. So if the cache needs to be shared between several developers or machines using different paths, that won't work as expected. So we also need to set the "path sensitivity". In this case, the relative path of the input files can be used to identify them.

document how to make ad-hoc tasks cacheable #6691

document how to make ad-hoc tasks cacheable #6691

Uh oh!

Conversation

jnizet commented Sep 8, 2018 • edited by wolfs Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Context

Contributor Checklist

Gradle Core Team Checklist

Uh oh!

wolfs commented Sep 12, 2018

Uh oh!

jnizet commented Sep 12, 2018

Uh oh!

wolfs commented Sep 12, 2018

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wolfs left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jnizet commented Sep 13, 2018

Uh oh!

wolfs left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

jnizet commented Sep 8, 2018 •

edited by wolfs

Loading