Skip to content

Conversation

sam-panzer
Copy link
Contributor

State::KeepRunning() can take large amounts of time relative to quick
operations (on the order of 1ns, depending on hardware). For such
sensitive operations, it is recommended to run batches of repeated
operations.

This commit simplifies handling of total_iterations_. Rather than
predecrementing such that total_iterations_ == 1 signals that
KeepRunning() should exit, total_iterations_ == 0 now signals the
intention for the benchmark to exit.

@AppVeyorBot
Copy link

Build benchmark 973 failed (commit d090cf4b56 by @sam-panzer)

// while (state.KeepRunningBatch(1000)) {
// // process 1000 elements
// }
bool KeepRunningBatch(size_t n);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should likely be an int. int64_t if we expect very large batches.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still should be an int or int64_t. So should the iterations tracking really. I'd say leave this, but i'd like the public API not to change very often, even in the signedness of a parameter.

return res;
}

inline bool State::KeepRunning() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BENCHMARK_ALWAYS_INLINE ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@coveralls
Copy link

coveralls commented Jan 31, 2018

Coverage Status

Coverage increased (+0.1%) to 87.048% when pulling 17180dd on sam-panzer:master into df415ad on google:master.

@sam-panzer
Copy link
Contributor Author

I appreciate the quick review! This is my first time using Github in years, so I apologize in advance for any procedural mistakes.

Copy link
Member

@dmah42 dmah42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like someone else to take a look before i merge, but this lgtm.

@EricWF
Copy link
Contributor

EricWF commented Jan 31, 2018

Ide like to take a look at this, but I'm busy until Friday.

@AppVeyorBot
Copy link

Build benchmark 974 failed (commit 6629056c2e by @sam-panzer)

State::KeepRunning() can take large amounts of time relative to quick
operations (on the order of 1ns, depending on hardware). For such
sensitive operations, it is recommended to run batches of repeated
operations.

This commit simplifies handling of total_iterations_. Rather than
predecrementing such that total_iterations_ == 1 signals that
KeepRunning() should exit, total_iterations_ == 0 now signals the
intention for the benchmark to exit.
@AppVeyorBot
Copy link

Build benchmark 975 completed (commit de38336323 by @sam-panzer)

@AppVeyorBot
Copy link

Build benchmark 976 completed (commit fb400d9288 by @sam-panzer)

@EricWF
Copy link
Contributor

EricWF commented Feb 2, 2018

So, this patch actually results in worse codegen for the KeepRunning loop as a result of adding the extra completed_iterations_ variable. Every loop now includes an extra load from memory.

Sample Code:

#include <benchmark/benchmark.h>
using namespace benchmark;
extern "C" void foo();
void BM_TestKeepRunning(State &S) {
  while (S.KeepRunning())
    foo();
}

ASM Before Change:

BM_TestKeepRunning(benchmark::State&): # @BM_TestKeepRunning(benchmark::State&)
  pushq %rbx
  movq %rdi, %rbx
  cmpb $1, (%rbx)
  je .LBB0_3
  jmp .LBB0_2
.LBB0_4: # %while.body
  callq foo
  cmpb $1, (%rbx)
  jne .LBB0_2
.LBB0_3: # %if.end.i
  addq $-1, 8(%rbx)
  jne .LBB0_4
  jmp .LBB0_5
.LBB0_2: # %if.then.i
  movq %rbx, %rdi
  callq benchmark::State::StartKeepRunning()
  addq $-1, 8(%rbx)
  jne .LBB0_4
.LBB0_5: # %while.end
  movq %rbx, %rdi
  popq %rbx
  jmp benchmark::State::FinishKeepRunning() # TAILCALL

ASM After Change:

BM_TestKeepRunning(benchmark::State&): # @BM_TestKeepRunning(benchmark::State&)
  pushq %rbx
  movq %rdi, %rbx
  cmpb $1, (%rbx)
  je .LBB0_3
  jmp .LBB0_2
.LBB0_4: # %while.body
  callq foo
  cmpb $1, (%rbx)
  jne .LBB0_2
.LBB0_3: # %if.end.i
  movq 8(%rbx), %rax # Worse codegen starts here.
  leaq -1(%rax), %rcx
  movq %rcx, 8(%rbx)
  testq %rax, %rax
  jne .LBB0_4
  jmp .LBB0_5
.LBB0_2: # %if.then.i
  movq %rbx, %rdi
  callq benchmark::State::StartKeepRunning()
  jmp .LBB0_3
.LBB0_5: # %while.end
  xorl %esi, %esi
  movq %rbx, %rdi
  popq %rbx
  jmp benchmark::State::FinishKeepRunning(unsigned long) # TAILCALL

The simple benchmark doesn't exactly show a concerning performance difference in terms of "reported iteration time", but I would still like to keep this loop as tight as possible (and an empty benchmark runs fewer iterations than it previously did).

More investigation/suggestions coming later.

@EricWF
Copy link
Contributor

EricWF commented Feb 3, 2018

I'm wondering if it might be better to formulate this as a for loop with special iterators. By using iterators to count we can avoid the problem of non optimizable memory accesses in State.

For example:

for (auto It=State.begin_batch(1000); It != State.end_batch(); ++It) {
  /* Do stuff */
}

Or

for (auto It=State.begin_batch(); It != State.end_batch(); It += 1000) {
 /* Do stuff */
}

@sam-panzer
Copy link
Contributor Author

My goal with this PR is to unify the open-source Google Benchmark API with the internal version. KeepRunningBatch() is missing from the open-source version.

I've taken a cue from the internal version to create a fast path in the common case that the benchmark has already started and will continue running. What do you think?

@AppVeyorBot
Copy link

Build benchmark 980 failed (commit 8065d70394 by @sam-panzer)

@AppVeyorBot
Copy link

Build benchmark 981 failed (commit 91e1d51239 by @sam-panzer)

batch_leftover_ = n - total_iterations_;
total_iterations_ = 0;
if (!error_occurred_) {
if (total_iterations_ >= n) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do we get here? on line 642 if total_iterations_ >= n, we return. if we're here then total_iterations_ must be < n.

i think you can avoid this check and improve the code generation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. total_iterations_ is now set to 0 when State is initialized. StartKeepRunning() can set total_iterations_ to any positive number of iterations. For example, if max_iterations is 10 and KeepRunningBatch() is given a batch size of 100, we should hit this path. I see that I can fold it into the if block just above the call to FinishKeepRunning(), though I wouldn't be surprised if it doesn't change generated code with optimizations enabled. Does this address your comment, or were you looking for something else?

On another front, how much does code generation matter this far down? The fast path should be hit every time within a benchmark except for up to two calls:

  1. The first call, which starts the benchmark
  2. The last call, either due to error or because it's the last iteration/batch.

@AppVeyorBot
Copy link

Build benchmark 982 failed (commit a81b440a8d by @sam-panzer)

// max_iterations > 0. The first iteration is always valid.
--total_iterations_;
return true;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can replace this with something like:

bool State::KeepRunningInternal(int batch) {
  if (BENCHMARK_BUILTIN_EXPECT(total_iterations_ >= n, true)) {
    total_iterations -= n;
    return true;
  }
  if (!started_) {
    StartKeepRunning();
    if (!error_occurred_ && total_iterations_ >= n) {
      total_iterations_ -= n;
      return true;
    }
  }
  if (total_iterations_ > 0) {
    batch_leftover_ = n - total_iterations_;
    total_iterations_ = 0;
    return true;
  }
  FinishKeepRunning();
  return false;
}

bool State::KeepRunning() { return KeepRunningInternal(1); }
bool State::KeepRunningBatch(int n) { return KeepRunningInternal(n); }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll send a PR to implement that.

@AppVeyorBot
Copy link

Build benchmark 983 completed (commit 7cc0b0a728 by @sam-panzer)

@EricWF
Copy link
Contributor

EricWF commented Feb 10, 2018

This LGTM. The codegen looks amazing! Thanks.

@EricWF EricWF merged commit 296ec56 into google:master Feb 10, 2018
@EricWF
Copy link
Contributor

EricWF commented Feb 10, 2018

@dominichamon Woops, sorry I missed your last comment. Could we address it in a follow up commit since I just merged this change :-S

@sam-panzer ^

bool KeepRunning();

// Returns true iff the benchmark should run n more iterations.
// NOTE: A benchmark must not return from the test until KeepRunningBatch()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should note the requirement that n != 0 since we'll enter an infinite loop if a user passes 0. And users are silly, they might attempt to do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll send a PR to assert and note it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixes are in #526.

JBakamovic pushed a commit to JBakamovic/benchmark that referenced this pull request Dec 6, 2018
* Support State::KeepRunningBatch().

State::KeepRunning() can take large amounts of time relative to quick
operations (on the order of 1ns, depending on hardware). For such
sensitive operations, it is recommended to run batches of repeated
operations.

This commit simplifies handling of total_iterations_. Rather than
predecrementing such that total_iterations_ == 1 signals that
KeepRunning() should exit, total_iterations_ == 0 now signals the
intention for the benchmark to exit.

* Create better fast path in State::KeepRunningBatch()

* Replace int parameter with size_t to fix signed mismatch warnings

* Ensure benchmark State has been started even on error.

* Simplify KeepRunningBatch()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants