Fixes #27574: Post-hooks for campaigns should be executed even even if pre-hooks are in failure #6611

Merged

fanf merged 1 commit into Normation:branches/rudder/9.0 from fanf:bug_27574/post_hooks_for_campaigns_should_be_executed_even_even_if_pre_hooks_are_in_failure

Oct 7, 2025

Member

fanf commented Sep 22, 2025

https://issues.rudder.io/issues/27574

This commit handles one annoyance and makes one semantic change for campaign hooks:

add clearer logs for warnings and errors, with the corresponding log level and the faulty script
we always go to post-hooks, even if pre-hooks failed, so that the user gets a chance to correct state changed in pre-hooks

Clearer logs

Now, when a campaign hook fails or warns, a corresponding line appears in the webapp logs:

2025-10-03 22:40:58+0200 ERROR hooks.campaigns.1bb22fc5-f41c-46fb-a44b-54b68b8254f2.pre-hooks - Campaign 'test campaign' pre-hooks returned code '24' in 'test-hook.sh'

This needed some adaptation:

also keep hook names in the history from RunHooks.asyncRunHistory
post-process the campaign hook execution history to log at the correct level
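
As a rough illustration of that post-processing, here is a minimal sketch. The thresholds are copied from `HookReturnCode` in this PR; the `Level` type and the `toLog` function are hypothetical names for illustration only (the real code logs through `PureHooksLogger`), and the message format mirrors the log line above.

```scala
object HookLogLevelSketch {
  // Thresholds copied from HookReturnCode as shown in this PR.
  def isError(code: Int): Boolean   = code < 0 || code > 0 && code < 32
  def isWarning(code: Int): Boolean = code >= 32 && code < 64

  // Hypothetical log levels, standing in for the real logger calls.
  sealed trait Level
  case object Error extends Level
  case object Warn  extends Level

  // Walk the (hookName, returnCode) history and keep only the entries
  // that deserve a log line, paired with the level to log them at.
  def toLog(campaign: String, hookType: String, history: List[(String, Int)]): List[(Level, String)] =
    history.collect {
      case (name, code) if isError(code)   =>
        (Error, s"Campaign '$campaign' $hookType returned code '$code' in '$name'")
      case (name, code) if isWarning(code) =>
        (Warn, s"Campaign '$campaign' $hookType returned code '$code' in '$name'")
    }
}
```

Successful hooks (code 0) produce no entry, so only the faulty scripts show up.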

Workflow change: always go to `post-hook`

The idea here is that if we do something in pre-hooks, we may want to undo it in post-hooks.
For that, we must always go to post-hooks, but we need two more pieces of information:

one to let the user know that we reached the post-hooks because something failed and the campaign wasn't executed
one to let the workflow know that after these post-hooks, the final campaign state must be "failure" even if all post-hooks succeeded.

And actually, it's the same bit of information: we just need to set a next-state value in post-hook before we go to them:

if it's from a failure in pre-hooks, the next-state is failure,
if it's from the running state, the next-state is finished.
And we let the user know in the hook with a new environment variable: CAMPAIGN_NEXT_STATE.
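
The transitions described above can be sketched as follows. This is a simplified model, not the actual Rudder types: the state names, the `entryName` values ("finished", "failure") and the helper functions are assumptions made for illustration.

```scala
object NextStateSketch {
  // Hypothetical stand-in for CampaignEventStateType; entryName values are assumed.
  sealed trait StateType { def entryName: String }
  case object FinishedType extends StateType { val entryName = "finished" }
  case object FailureType  extends StateType { val entryName = "failure" }

  // Simplified workflow states: only post-hooks carry the state that follows them.
  sealed trait State
  case object PreHooks extends State
  case object Running  extends State
  final case class PostHooks(nextState: StateType) extends State

  // After pre-hooks: a failure skips the campaign run but still goes to post-hooks.
  def afterPreHooks(preHooksFailed: Boolean): State =
    if (preHooksFailed) PostHooks(FailureType) else Running

  // After a normal run, post-hooks are followed by the finished state.
  def afterRunning: State = PostHooks(FinishedType)

  // Environment exposed to hooks: CAMPAIGN_NEXT_STATE only exists in post-hooks.
  def hookEnv(state: State): Map[String, String] = state match {
    case PostHooks(next) => Map("CAMPAIGN_NEXT_STATE" -> next.entryName)
    case _               => Map.empty
  }
}
```

A post-hook script can then read `CAMPAIGN_NEXT_STATE` to decide whether it is cleaning up after a failed pre-hook or after a normal run.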

From a code point of view, I had to:

change CampaignEventState.PostHooks to store nextState: CampaignEventStateType
change the workflow to go to post-hooks instead of failure after pre-hooks, setting nextState
change serialisation to save the nextState in the database for post-hooks
And add a unit test to demonstrate the behavior

fanf force-pushed the bug_27574/post_hooks_for_campaigns_should_be_executed_even_even_if_pre_hooks_are_in_failure branch 2 times, most recently from 823b2ff to 9806fb6 Compare

October 3, 2025 20:55

Member Author

fanf commented Oct 3, 2025

PR updated with a new commit

fanf commented


webapp/sources/rudder/rudder-core/src/main/scala/com/normation/rudder/hooks/RunHooks.scala

    
              object HookReturnCode {

                def isError(code:   Int): Boolean = code < 0 || code > 0 && code < 32

                def isWarning(code: Int): Boolean = code >= 32 && code < 64

Member Author

fanf Oct 3, 2025

This will go well

fanf commented


webapp/sources/rudder/rudder-core/src/main/scala/com/normation/rudder/hooks/RunHooks.scala

    
                             PureHooksLogger.For(logIdentifier).trace(s"  -> stderr : ${result.stderr}")

                           }

-                      _ <- ZIO.when(result.code >= 32 && result.code <= 64) { // warning

+                      _ <- ZIO.when(HookReturnCode.isWarning(result.code)) { // warning

Member Author

fanf Oct 3, 2025

This is a change that will certainly lead to no problem at all, it's really not the kind of change that leads to errors, at all.
Please double-check the definition of isWarning.
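
One way to do that double check is to compare the inline condition being replaced with the new helper over a range of codes. The sketch below copies both predicates from the diff; the object and value names are only illustrative.

```scala
object WarningBoundarySketch {
  // The inline condition that this diff removes.
  def oldIsWarning(code: Int): Boolean = code >= 32 && code <= 64

  // The helpers as defined in HookReturnCode in this PR.
  def newIsWarning(code: Int): Boolean = code >= 32 && code < 64
  def isError(code: Int): Boolean      = code < 0 || code > 0 && code < 32

  // Codes in [-1, 65] for which the old and new warning checks disagree.
  val changed: List[Int] = (-1 to 65).toList.filter(c => oldIsWarning(c) != newIsWarning(c))
}
```

With these definitions the two checks differ only for code 64: the old condition treated it as a warning, while the new helpers classify it as neither a warning nor an error.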

fanf commented


webapp/sources/rudder/rudder-core/src/main/scala/com/normation/rudder/hooks/RunHooks.scala

    
                    } else if (HookReturnCode.isError(result.code)) { // error

                      ScriptError(path, result.code, result.stdout, result.stderr, msg)

-                    } else if (result.code >= 32 && result.code <= 64) { // warning

+                    } else if (HookReturnCode.isWarning(result.code)) { // warning

Member Author

fanf Oct 3, 2025

This is a change that will certainly lead to no problem at all, it's really not the kind of change that leads to errors, at all.
Please double-check the definitions of isError and isWarning.

fanf commented


webapp/sources/rudder/rudder-core/src/main/scala/com/normation/rudder/hooks/RunHooks.scala

    
                              case HookExecutionHistory.DoNotKeep => (c, Nil)

                              // Noop are not real hook execution, just filter them out from the result

                              case HookExecutionHistory.Keep      => (c, (x :: historyList).filterNot(_ == Noop))

                  val runAllSeq: IOResult[(HookReturnCode, List[(String, HookReturnCode)])] = {

Member Author

fanf Oct 3, 2025

Add the String, which is the command name, to the history of returned codes.

fanf commented


webapp/sources/rudder/rudder-core/src/main/scala/com/normation/rudder/hooks/RunHooks.scala

    
                              history match {

                                case HookExecutionHistory.DoNotKeep => (c, Nil)

                                // Noop are not real hook execution, just filter them out from the result

                                case HookExecutionHistory.Keep      => (c, ((nextHookName, c) :: historyList).filterNot(_._2 == Noop))

Member Author

fanf Oct 3, 2025

We now also keep the last code as the head of the history, because we need the name.
So we could drop the separate first returned value, use a non-empty list with some default value, and check the head, but I think we will want a global value distinct from the last one at some point, perhaps in case of warnings for example.
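
To make the shape of that result concrete, here is a minimal sketch of the accumulation. The `Code` hierarchy is a simplified stand-in for `HookReturnCode`, `runAll` is a hypothetical name, and the hooks are given as precomputed results; unlike the real `RunHooks`, this sketch does not execute anything and does not stop at the first error.

```scala
object HookHistorySketch {
  // Simplified stand-in for HookReturnCode.
  sealed trait Code
  case object Noop                extends Code
  final case class Ok(code: Int)  extends Code
  final case class Err(code: Int) extends Code

  // Fold over (hookName, result) pairs, returning the last code and the
  // named history, most recent first. Noop entries are filtered out
  // because they are not real hook executions.
  def runAll(hooks: List[(String, Code)]): (Code, List[(String, Code)]) =
    hooks.foldLeft((Noop: Code, List.empty[(String, Code)])) {
      case ((_, history), (name, code)) =>
        (code, ((name, code) :: history).filterNot(_._2 == Noop))
    }
}
```

The head of the history carries the name of the last hook that ran, which is what the log line needs.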

fanf commented


...udder-core/src/test/scala/com/normation/rudder/campaign/CampaignOrchestrationLogicTest.scala

    
                                Scheduled,

                                PreHooks(HookResults(Nil)),

                                PreHooks(HookResults(HookResult(e.id.value, 1, "pre-hooks", "", "") :: Nil)),

                                PostHooks(FailureType, HookResults(Nil)),

Member Author

fanf Oct 3, 2025

look, directly from pre-hooks to post-hooks.

fanf commented


...r/rudder-core/src/main/scala/com/normation/rudder/campaigns/CampaignOrchestrationLogic.scala

    
                            case (None, FailureType)     => Failure("pre-hooks were in error and post-hooks completed successfully", "")

                            case (None, s)               => getDefault(s)

                            case (Some(f1), FailureType) => Failure("pre-hooks and post-hooks were in error. Last error:", f1.cause)

                            case (Some(f1), _)           => f1

Member Author

fanf Oct 3, 2025

This logic is just here to put something relevant in the failure message. We don't keep the whole error, so we don't know which hook failed at that point.
In the future, we will have the whole history in the campaign event UI and the user will be able to see what broke.
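
The four cases can be exercised in isolation with a small sketch. The types here are simplified assumptions: `Failure(cause, message)` guesses the field order from the calls in the diff, and `Finished` stands in for whatever `getDefault(s)` returns in the real code.

```scala
object PostHooksOutcomeSketch {
  // Hypothetical stand-ins for the real state types.
  sealed trait StateType
  case object FinishedType extends StateType
  case object FailureType  extends StateType

  sealed trait Final
  case object Finished                                      extends Final
  final case class Failure(cause: String, message: String) extends Final

  // Combine the first post-hook error (if any) with the next state that
  // was recorded before entering post-hooks.
  def finalState(postHookError: Option[Failure], nextState: StateType): Final =
    (postHookError, nextState) match {
      case (None, FailureType)     =>
        Failure("pre-hooks were in error and post-hooks completed successfully", "")
      case (None, _)               => Finished // stands in for getDefault(s)
      case (Some(f1), FailureType) =>
        Failure("pre-hooks and post-hooks were in error. Last error:", f1.cause)
      case (Some(f1), _)           => f1
    }
}
```

In every case where `nextState` is `FailureType`, the result is a failure, whatever the post-hooks returned.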

fanf commented


.../rudder/rudder-core/src/main/scala/com/normation/rudder/campaigns/CampaignHooksService.scala

    
                                     timeHooks1 <- currentTimeMillis

                                     _          <-

-                                       PureHooksLogger.For(loggerName).trace(s"Campaign pre-hooks ran in ${timeHooks1 - timeHooks0} ms")

+                                       PureHooksLogger.For(loggerName).trace(s"Campaign ${hookType.entryName} ran in ${timeHooks1 - timeHooks0} ms")

Member Author

fanf Oct 3, 2025

all that foreach just for better logs :)

fanf commented


.../rudder/rudder-core/src/main/scala/com/normation/rudder/campaigns/CampaignHooksService.scala

    
                                                         ("CAMPAIGN_EVENT_ID", e.id.value),

                                                         ("CAMPAIGN_EVENT_NAME", e.name)

                                                       )

                                                       .optAdd("CAMPAIGN_NEXT_STATE", nextState.map(_.entryName)),

Member Author

fanf Oct 3, 2025

the new parameter, only defined in post-hooks
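
For readers unfamiliar with the `optAdd` helper, a minimal sketch of what such an operation could look like is shown below. This is not the Rudder implementation: the parameter list is modelled as a plain `List[(String, String)]` and the extension class is a hypothetical name.

```scala
object OptAddSketch {
  // Hypothetical extension on a plain list of (name, value) parameters.
  implicit class OptAddOps(private val params: List[(String, String)]) extends AnyVal {
    // Append the parameter only when a value is present; otherwise leave the list unchanged.
    def optAdd(name: String, value: Option[String]): List[(String, String)] =
      value.fold(params)(v => params :+ (name -> v))
  }
}
```

With `nextState` being `None` in pre-hooks, the variable is simply absent from the hook environment there.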

fanf marked this pull request as ready for review

October 3, 2025 21:15

VinceMacBuche approved these changes


clarktsiory reviewed


...r/rudder-core/src/main/scala/com/normation/rudder/campaigns/CampaignOrchestrationLogic.scala

    
                      }

                    case PostHooksType =>

                      val state = event.state.asInstanceOf[PostHooks]

Contributor

clarktsiory Oct 7, 2025

Could this not be avoided by matching on event.state: CampaignEventState instead?

Member Author

fanf Oct 7, 2025

Perhaps, but most of the time it's far too powerful for the workflow engine, and the states' coverage becomes very hard to check.
But I agree that having that is a smell in the encoding that exposes a mismatch. I chose to keep it because it's very obviously Not Good and makes it clear that it needs to be refactored, and I didn't want to massively extend the risks with a big refactoring so late.
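
For reference, the alternative discussed in this thread could look roughly like the sketch below, which matches on the state value instead of casting. The state hierarchy is a simplified assumption (`nextState` is a plain `String` here) and `nextStateOf` is a hypothetical helper, not part of the PR.

```scala
object StateMatchSketch {
  // Simplified stand-in for the CampaignEventState hierarchy.
  sealed trait CampaignEventState
  case object Running                            extends CampaignEventState
  final case class PostHooks(nextState: String) extends CampaignEventState

  // Match on the state itself rather than casting with asInstanceOf:
  // an unexpected state becomes an explicit error value instead of a
  // ClassCastException at runtime.
  def nextStateOf(state: CampaignEventState): Either[String, String] = state match {
    case PostHooks(next) => Right(next)
    case other           => Left(s"expected post-hooks state, got $other")
  }
}
```

The trade-off mentioned above still applies: this makes the mismatch explicit, but every caller now has to handle the error branch.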

...r/rudder-core/src/main/scala/com/normation/rudder/campaigns/CampaignOrchestrationLogic.scala

    
                        val nextState = {

                          val optError = res.results.collectFirst { case r if isError(r) => Failure("post-hooks were in error", r.stderr) }

                          (optError, state.nextState) match {

                            case (None, FailureType)     => Failure("pre-hooks were in error and post-hooks completed successfully", "")

Contributor

clarktsiory Oct 7, 2025

Does FailureType not mean that the post-hooks completed with a "failure" instead of "successfully"?

Member

VinceMacBuche Oct 7, 2025

No: if optError is None, it means that post-hooks ran successfully, but nextState is FailureType, meaning that there was an error in pre-hooks.

Member

VinceMacBuche Oct 7, 2025

The case you are talking about is handled two lines below.

Member Author

fanf Oct 7, 2025

👍

Contributor

Normation-Quality-Assistant commented Oct 7, 2025

This PR is not mergeable to upper versions.
Since it is "Ready for merge" you must merge it by yourself using the following command:
rudder-dev merge https://github.com/Normation/rudder/pull/6611
-- Your faithful QA
Kant merge: "Live your life as though your every act were to become a universal law."
(https://ci.normation.com/jenkins/job/merge-accepted-pr/108127/console)

Normation-Quality-Assistant added the qa: Can't merge label

Member Author

fanf commented Oct 7, 2025

OK, squash merging this PR


Fixes #27574: Post-hooks for campaigns should be executed even even if pre-hooks are in failure

4b4dc4f

fanf force-pushed the bug_27574/post_hooks_for_campaigns_should_be_executed_even_even_if_pre_hooks_are_in_failure branch from 23235cd to 4b4dc4f Compare

October 7, 2025 19:44

fanf merged commit 4b4dc4f into Normation:branches/rudder/9.0

1 check passed


Labels

qa: Can't merge