Use uintXX_t instead of size_t for items/bytes/iterations #507

Maratyszcza · 2018-01-03T08:59:09Z

7a76701 seemingly unintentionally changed the data types for bytes_processed, items_processed, and iterations to size_t. On 32-bit systems, the bytes_processed and items_processed variables easily overflow, and often report fewer items/second for faster code. The fact that overflow chances depends on architecture and even ABI (e.g. x64 vs x32 ABI on x86-64) makes it particularly annoying. iterations counter is made to use uint32_t because it is incremented in the hot loop, and incrementing uint64_t is slow on some 32-bit platforms (e.g. Asm.js).

coveralls · 2018-01-03T09:11:22Z

Coverage remained the same at 86.926% when pulling 7def119 on Maratyszcza:master into e4ccad7 on google:master.

coveralls · 2018-01-03T09:21:14Z

Coverage remained the same at 86.926% when pulling 7def119 on Maratyszcza:master into e4ccad7 on google:master.

AppVeyorBot · 2018-01-03T10:03:39Z

✅ Build benchmark 927 completed (commit da546b801d by @Maratyszcza)

pleroy

The change to size_t in the API was intentional, I believe, it solved many signed/unsigned confusions that existed in the previous code.

pleroy · 2018-01-03T10:20:10Z

include/benchmark/benchmark.h


  BENCHMARK_ALWAYS_INLINE
-  size_t iterations() const { return (max_iterations - total_iterations_) + 1; }
+  uint32_t iterations() const { return (max_iterations - total_iterations_) + 1; }


This change seems unnecessary. On a 32-bit system it has no effect and on a 64-bit system there is no harm in using size_t. Same comment for all the places that touch iterations counters.

With this change, if a benchmark has an overflow bug on 32-bit systems, it will also manifest on a 64-bit system. E.g. state.SetItemsProcessed(1000000 * state.iterations()) would have identical overflow behavior on both 32-bit and 64-bit systems.

And I claim that it's a bad property.

I happen to only ever run benchmarks on 64-bit systems. Why would you force me to have to deal with 32-bit overflows just for a misconceived notion of uniformity?

AppVeyorBot · 2018-01-03T11:07:34Z

✅ Build benchmark 928 completed (commit 7afcca4fc8 by @Maratyszcza)

Maratyszcza · 2018-01-03T21:59:12Z

I changed interface for State.iterations() to return uint64_t, so code that relied on iterations being 64-bit on x86-64 will keep working, and will also work on 32-bit ABIs. In the private classes, I left iterations as uint32_t. It is upper bounded by 1e+9 iterations (kMaxIterations), so this wouldn't break anything on either 32-bit or 64-bit systems.

AppVeyorBot · 2018-01-03T22:59:05Z

✅ Build benchmark 929 completed (commit 8b4f11e1f9 by @Maratyszcza)

coveralls · 2018-01-03T23:02:53Z

Coverage remained the same at 86.926% when pulling b5ba7a9 on Maratyszcza:master into e4ccad7 on google:master.

dmah42 · 2018-01-04T17:41:27Z

for one thing, these should not be explicitly unsigned unless we're representing bitfields or closed-group arithmetic. We're not, so we should be using signed integers if we're going to move away from the standard size_t type (which is unsigned by the standard, but that was an unfortunate decision).

while signed integer overflow is undefined behaviour, this is detectable with ubsan and is safer (in a way) than the wrapping that unsigned integers would give us.

ie, i'm comfortable with the bytes/items processed being intxx_t instead of size_t, but not uintxx_t.

EricWF · 2018-01-04T22:43:09Z

@Maratyszcza Thanks for the patch. Don't worry about ABI breaks. The project is free to break the ABI.

First, I agree with @dominichamon about using signed integer types. And I agree about making this change for things we explicitly want to be 64 bits.

However, I don't know if using explicit 32 bit types makes sense. In these cases, it seems to make the most sense to simply use int, which I think is what we should try to do universally except where 64 bits are needed. What do you think?

coveralls · 2018-01-04T23:35:50Z

Coverage remained the same at 86.926% when pulling be535f8 on Maratyszcza:master into e4ccad7 on google:master.

AppVeyorBot · 2018-01-05T00:08:35Z

✅ Build benchmark 930 completed (commit 08a0561a14 by @Maratyszcza)

Maratyszcza · 2018-01-05T00:37:18Z

@dominichamon I changed types in the interface to int64_t, can we merge now? Internally, I left iterations as uint32_t.

@EricWF Using int would hurt portability. We need a type that can hold values up to 1e+9. uint32_t/int32_t guarantees that, int does not (in fact, INT_MAX is usually just 32767 on 16-bit systems).

coveralls · 2018-01-05T00:49:13Z

Coverage remained the same at 86.926% when pulling be535f8 on Maratyszcza:master into e4ccad7 on google:master.

dmah42 · 2018-01-05T00:59:35Z

include/benchmark/benchmark.h

  bool started_;
  bool finished_;
-  size_t total_iterations_;
+  uint32_t total_iterations_;


i buy your 'int' argument and if we need a guarantee, which we do, then this should be ?int32_t. The Google C++ style guide, interestingly, states that we can assume an int is at least 32 bits, which is a bit surprising.

However, in the same section it also specifically recommends avoiding unsigned unless we need overflow modulo 2^N or representing bit patterns. Given we're doing neither, this should be signed.

So: int32_t please. And then promoting once to int64_t in 'iterations' seems reasonable, though i'd promote the individual parts of the calculation. ie, max_iterations - total_iterations_ could be INT_MAX already if total is 0. the +1 will cause an overflow before you get the chance to promote to int64.

I'm sorry, I don't buy the int vs. int32_t argument.

According to cppreference, int can only be a 16-bit type on LP32 systems, e.g., Win16 or, maybe, some microcontrollers. We have no evidence that the library was ever ported to such a system. If software hasn't been ported, it's not portable, period. The code is choke-full of usages of int so there is no point in an illusory attempt at improving it by using int32_t in a handful of places.

If someone wanted to port the library to Win16 that might be an interesting exercise, but that's not what this PR does.

++pleroy. @Maratyszcza I maintain an STL, and I honestly I don't think we even worry about 16 bit integers. Let's go with the Google style guide and assume that int is at least 32 bits.

AppVeyorBot · 2018-01-05T01:12:45Z

✅ Build benchmark 931 completed (commit cdd596fd8d by @Maratyszcza)

coveralls · 2018-01-05T02:07:04Z

Coverage decreased (-0.06%) to 86.926% when pulling 33a3e03 on Maratyszcza:master into 052421c on google:master.

coveralls · 2018-01-05T02:17:53Z

Coverage decreased (-0.06%) to 86.926% when pulling 33a3e03 on Maratyszcza:master into 052421c on google:master.

AppVeyorBot · 2018-01-05T04:19:03Z

❌ Build benchmark 936 failed (commit 62340aac82 by @Maratyszcza)

7a76701 seemingly unintentionally changed the data types for bytes_processed, items_processed, and iterations to size_t. On 32-bit systems, the bytes_processed and items_processed variables easily overflow, and often report fewer items/second for faster code. The fact that overflow chances depends on architecture and even ABI (e.g. x64 vsi x32 ABI on x86-64) makes it particularly annoying. iterations counter is made to use int32_t internally because it is incremented in the hot loop, and incrementing int64_t is slow on some 32-bit platforms (e.g. Asm.js). However, the public interface State.iterations() is changed to return int64_t.

coveralls · 2018-01-05T05:11:20Z

Coverage decreased (-0.06%) to 86.926% when pulling a01cb72 on Maratyszcza:master into 052421c on google:master.

AppVeyorBot · 2018-01-05T06:03:35Z

✅ Build benchmark 937 completed (commit fefdcd5b80 by @Maratyszcza)

dmah42 · 2018-05-29T12:23:55Z

I think this has been obsoleted by other changes.

googlebot added the cla: yes label Jan 3, 2018

Maratyszcza force-pushed the master branch from f29bd50 to 7def119 Compare January 3, 2018 09:00

pleroy reviewed Jan 3, 2018

View reviewed changes

Maratyszcza force-pushed the master branch from 7def119 to b5ba7a9 Compare January 3, 2018 21:54

Maratyszcza force-pushed the master branch 2 times, most recently from 9082adf to be535f8 Compare January 4, 2018 23:03

dmah42 reviewed Jan 5, 2018

View reviewed changes

Maratyszcza force-pushed the master branch 2 times, most recently from 0e93a18 to 33a3e03 Compare January 5, 2018 01:44

Maratyszcza force-pushed the master branch from 33a3e03 to a01cb72 Compare January 5, 2018 04:58

dmah42 closed this May 29, 2018

Use uintXX_t instead of size_t for items/bytes/iterations #507

Use uintXX_t instead of size_t for items/bytes/iterations #507

Uh oh!

Conversation

Maratyszcza commented Jan 3, 2018

Uh oh!

coveralls commented Jan 3, 2018

Uh oh!

coveralls commented Jan 3, 2018

Uh oh!

AppVeyorBot commented Jan 3, 2018

Uh oh!

pleroy left a comment

Choose a reason for hiding this comment

Uh oh!

pleroy Jan 3, 2018

Choose a reason for hiding this comment

Uh oh!

Maratyszcza Jan 3, 2018

Choose a reason for hiding this comment

Uh oh!

pleroy Jan 3, 2018

Choose a reason for hiding this comment

Uh oh!

AppVeyorBot commented Jan 3, 2018

Uh oh!

Maratyszcza commented Jan 3, 2018

Uh oh!

AppVeyorBot commented Jan 3, 2018

Uh oh!

coveralls commented Jan 3, 2018

Uh oh!

dmah42 commented Jan 4, 2018

Uh oh!

EricWF commented Jan 4, 2018

Uh oh!

coveralls commented Jan 4, 2018

Uh oh!

AppVeyorBot commented Jan 5, 2018

Uh oh!

Maratyszcza commented Jan 5, 2018

Uh oh!

coveralls commented Jan 5, 2018

Uh oh!

dmah42 Jan 5, 2018

Choose a reason for hiding this comment

Uh oh!

pleroy Jan 5, 2018

Choose a reason for hiding this comment

Uh oh!

EricWF Jan 5, 2018

Choose a reason for hiding this comment

Uh oh!

AppVeyorBot commented Jan 5, 2018

Uh oh!

coveralls commented Jan 5, 2018

Uh oh!

coveralls commented Jan 5, 2018

Uh oh!

AppVeyorBot commented Jan 5, 2018

Uh oh!

coveralls commented Jan 5, 2018

Uh oh!

AppVeyorBot commented Jan 5, 2018

Uh oh!

dmah42 commented May 29, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants