Skip to content

demula/mksuid

 
 

Repository files navigation

mksuid

mksuid is a fork with sub millisecond precision of the popular ksuid, an efficient, comprehensive, battle-tested Go library for generating and parsing a specific kind of globally unique identifier called a KSUID.

Install

go get -u github.com/demula/mksuid/v2

What is a mKSUID?

mKSUID is for K-Sortable Unique IDentifier. It is a kind of globally unique identifier similar to a RFC 4122 UUID, built from the ground-up to be "naturally" sorted by generation timestamp without any special type-aware logic.

In short, running a set of mKSUIDs through the UNIX sort command will result in a list ordered by generation time.

Why use mKSUIDs?

There are numerous methods for generating unique identifiers, so why mKSUID?

  1. Naturally ordered by generation time
  2. Collision-free, coordination-free, dependency-free
  3. Highly portable representations
  4. Sub-millisecond precision

Only if #4 is needed to you use this project, for everything else use original ksuid project.

For a follow up read on the topic: A brief history of UUID

1. Naturally Ordered By Generation Time

Unlike the more ubiquitous UUIDv4, a mKSUID contains a timestamp component that allows them to be loosely sorted by generation time. This is not a strong guarantee (an invariant) as it depends on wall clocks, but is still incredibly useful in practice. Both the binary and text representations will sort by creation time without any special sorting logic.

2. Collision-free, Coordination-free, Dependency-free

While RFC 4122 UUIDv1s do include a time component, there aren't enough bytes of randomness to provide strong protection against collisions (duplicates). With such a low amount of entropy, it is feasible for a malicious party to guess generated IDs, creating a problem for systems whose security is, implicitly or explicitly, sensitive to an adversary guessing identifiers.

To fit into a 64-bit number space, Snowflake IDs and its derivatives require coordination to avoid collisions, which significantly increases the deployment complexity and operational burden.

A mKSUID includes 112 bits of pseudorandom data ("entropy"). This number space is 1024 times smaller than the 122 bits used by the well-accepted RFC 4122 UUIDv4 standard. Jointly with the sub-millisecond timestamp component, mKSUID can lower the probability of collisions, to the point of physical infeasibility in some specific practical implementations.

3. Highly Portable Representations

The text and binary representations are lexicographically sortable, which allows them to be dropped into systems which do not natively support mKSUIDs and retain their time-ordered property.

The text representation is an alphanumeric base62 encoding, so it "fits" anywhere alphanumeric strings are accepted. No delimiters are used, so stringified mKSUIDs won't be inadvertently truncated or tokenized when interpreted by software that is designed for human-readable text, a common problem for the text representation of RFC 4122 UUIDs.

4. Sub-millisecond precision

By shifting 16 bits from the payload to the timestamp representation we can deliver a time precision of 0.01ms.

How do mKSUIDs work?

Binary mKSUIDs are 20-bytes: a 48-bit unsigned integer UTC timestamp and a 112-bit randomly generated payload. The timestamp uses big-endian encoding, to support lexicographic sorting. The timestamp epoch is adjusted to Feb 12th 2019, providing over 89 years of life. The payload is generated by a cryptographically-strong pseudorandom number generator.

The text representation is always 27 characters, encoded in alphanumeric base62 that will lexicographically sort by timestamp.

High Performance

This library is designed to be used in code paths that are performance critical. Its code has been tuned to eliminate all non-essential overhead. The KSUID type is derived from a fixed-size array, which eliminates the additional reference chasing and allocation involved in a variable-width type.

The API provides an interface for use in code paths which are sensitive to allocation. For example, the Append method can be used to parse the text representation and replace the contents of a KSUID value without additional heap allocation.

All public package level "pure" functions are concurrency-safe, protected by a global mutex. For hot loops that generate a large amount of KSUIDs from a single Goroutine, the Sequence type is provided to elide the potential contention.

By default, out of an abundance of caution, the cryptographically-secure PRNG is used to generate the random bits of a KSUID. This can be relaxed in extremely performance-critical code using the included FastRander type. FastRander uses the standard PRNG with a seed generated by the cryptographically-secure PRNG.

NOTE: While there is no evidence that FastRander will increase the probability of a collision, it shouldn't be used in scenarios where uniqueness is important to security, as there is an increased chance the generated IDs can be predicted by an adversary.

NOT Battle Tested

There are many trade-offs to this implementation and it is very important that the user understands and mitigate potential collision scenarios.

While adding extra segmentation in the timestamp, this increased substantially the sensitivity to traffic distribution of the collision space. The original second interval meant no matter how concentrated in a specific millisecond the traffic might be, the full 128 bits of random space was available ensuring no collision.

For this implementation we need to consider peaks on specific timestamp for calculating if we might run into potential collisions.

Interval Space Traffic Prob. collision1
1 s 128 bit 1 T 1.44e-15
0.01 ms 112 bit 0.01 T 9.66e-15

As the table above shows, given the same density in the slot, the mKSUD is ~7 times more likely to have a collision than the original implementation. That said, 10 * 10^9 IDs in a 0.01 ms slot is significantly higher that a lot of the traffic that some systems see and even then the probability of collision is still small.

Nevertheless is should be noted that the following scenarios that increases probability of collision and should be avoided where possible:

  • The generator has millisecond accuracy. This means that traffic will pile up on the exact millisecond slot and not extend over the available sub-millisecond slots.

  • Very long batches created on the same slot.

Also data lifetime might be consider as 89 years might pass fast.

Plays Well With Others

Designed to be integrated with other libraries, the KSUID type implements many standard library interfaces, including:

  • Stringer
  • database/sql.Scanner and database/sql/driver.Valuer
  • encoding.BinaryMarshal and encoding.BinaryUnmarshal
  • encoding.TextMarshal and encoding.TextUnmarshal (encoding/json friendly!)

Command Line Tool

This package comes with a command-line tool mksuid, useful for generating mKSUIDs as well as inspecting the internal components of existing mKSUIDs. Machine-friendly output is provided for scripting use cases.

Given a Go build environment, it can be installed with the command:

go install github.com/demula/mksuid/v2/cmd/mksuid

CLI Usage Examples

Generate a mKSUID

$ mksuid
0ujsswThIGTUYm2K8FjOOfXtY1K

Generate 4 mKSUIDs

$ mksuid -n 4
0ujsszwN8NRY24YaXiTIE2VWDTS
0ujsswThIGTUYm2K8FjOOfXtY1K
0ujssxh0cECutqzMgbtXSGnjorm
0ujsszgFvbiEr7CDgE3z8MAUPFt

Inspect the components of a mKSUID

$ mksuid -f inspect 0ujtsYcgvSTl8PAuAdqWYSMnLOv

REPRESENTATION:

  String: 0ujtsYcgvSTl8PAuAdqWYSMnLOv
     Raw: 0669F7EFB5A1CD34B5F99D1154FB6853345C9735

COMPONENTS:

       Time: 2021-05-09 03:00:10.14689 +0200 CEST
  Timestamp: 7052201014689
    Payload: CD34B5F99D1154FB6853345C9735

Generate a mKSUID and inspect its components

$ mksuid -f inspect

REPRESENTATION:

  String: 2a7pbkpmLmlwhDvcQAjfpwHFBOX
     Raw: 1219C838EDA55DCEABF735C32A3150927A558AB5

COMPONENTS:

       Time: 2025-06-04 08:43:46.56421 +0200 CEST
  Timestamp: 19901942656421
    Payload: 5DCEABF735C32A3150927A558AB5

Inspect a mKSUID with template formatted inspection output

$ mksuid -f template -t '{{ .Time }}: {{ .Payload }}' 0ujtsYcgvSTl8PAuAdqWYSMnLOv
2021-05-09 03:00:10.14689 +0200 CEST: CD34B5F99D1154FB6853345C9735

Inspect multiple mKSUIDs with template formatted output

$ mksuid -f template -t '{{ .Time }}: {{ .Payload }}' $(mksuid -n 4)
2025-06-04 08:45:23.68534 +0200 CEST: 671D45E054FFFCA38A4B54EB4B91
2025-06-04 08:45:23.68534 +0200 CEST: 38B7102B33B3FF1CE98E1C047BCF
2025-06-04 08:45:23.68534 +0200 CEST: C55D184E6032222D106EC7CB4B1E
2025-06-04 08:45:23.68534 +0200 CEST: 75C0D93F3C6A9B53D6FE6021F7FD

Generate mKSUIDs and output JSON using template formatting

$ mksuid -f template -t '{ "timestamp": "{{ .Timestamp }}", "payload": "{{ .Payload }}", "mksuid": "{{.String}}"}' -n 4
{ "timestamp": "19901957503430", "payload": "1C12D719DAC325D36BD0FD297839", "mksuid": "2a7q4DwuJHXG98ezsedUpLu08Qj"}
{ "timestamp": "19901957503430", "payload": "C6FAAE9E580507DF929BC5B0EE7D", "mksuid": "2a7q4DwucCJjEUDOMi5k5NyRNaX"}
{ "timestamp": "19901957503430", "payload": "ED907028E1B44B2B4AA7C50C03F4", "mksuid": "2a7q4DwugT5rRVpJlZZPpsMYJvA"}
{ "timestamp": "19901957503430", "payload": "985053DFA36882A969FAA03DB68C", "mksuid": "2a7q4DwuX25jQZZKpRY3ZH2ASD6"}

OrNil functions

There are times when you are sure your mksuid is correct. But you need to get it from bytes or string and pass it it's to the structure. For this, there are OrNil functions that return mksuid.Nil on error and can be called directly in the structure.

Functions:

  • ParseOrNil()
  • FromPartsOrNil()
  • FromBytesOrNil()

An example of using the function without OrNil:

func getPosts(before, after []byte) {
 b, err := mksuid.FromBytes(before)
 if err != nil {
  // handle error
 }

 a, err := mksuid.FromBytes(after)
 if err != nil {
  // handle error
 }

 sortOptions := SortOptions{Before: b, After: a}
}

It is much more convenient to do it like this:

func getPosts(before, after []byte) {
 sortOptions := SortOptions{
  Before: mksuid.FromBytesOrNil(before),
  After:  mksuid.FromBytesOrNil(after),
 }
}

OrNil functions are also used in many other libraries:

Implementations for other languages

Footnotes

License

mksuid source code is available under an MIT License.

Footnotes

  1. Calculated using Birthday problem from thinxer on Github issue #8

About

Milliseconds K-Sortable Globally Unique IDs

Topics

Resources

License

Stars

Watchers

Forks

Languages

  • Go 100.0%