darkgnotic
Contributor

Introduce ZeroEvents, a channel for surfacing realtime "events" for the purposes of production ops and observability.

As a monitoring signal, events differ from (1) metrics, which are periodic and limited to numeric values, and (2) logs, which are textual and generally have a multi-minute processing delay. Events, on the other hand, are designed for low-latency delivery and processing with structured data payloads.

ZeroEvents are published using the CloudEvents specification, with configuration designed to facilitate integration with Knative sink bindings. However, any CloudEvents routing framework can be used to route and handle the events.

The first specific ZeroEvent type is used for publishing the status of replication. ReplicationStatusEvents are emitted:

  • periodically during initial sync
  • when replication begins (i.e. indicating that initial sync has finished)
  • whenever a schema change is processed

The events themselves contain the replicated schema data, which facilitates verification by the operator that the expected tables and columns are being replicated.
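
As an illustration only (not taken from this PR's code), here is a minimal sketch of a ReplicationStatusEvent wrapped in a CloudEvents envelope. The envelope attributes (`specversion`, `id`, `source`, `type`, `time`, `datacontenttype`, `data`) come from the CloudEvents 1.0 specification, and the `type`, `component`, `stage`, and `status` values appear in snippets quoted in the review below. The `tables` payload field, the `source` URI, and the `toCloudEvent` helper are assumptions made up for this example.

```typescript
// Payload fields as they appear in this PR's test snippets. `tables` is a
// hypothetical stand-in for the replicated schema data.
type ReplicationStatusEvent = {
  type: 'zero/events/status/replication/v1';
  component: 'replication';
  stage: string;
  status: string;
  description?: string;
  tables?: {name: string; columns: string[]}[];
};

// Envelope attributes defined by the CloudEvents 1.0 specification.
type CloudEvent<T> = {
  specversion: '1.0';
  id: string;
  source: string;
  type: string;
  time: string;
  datacontenttype: 'application/json';
  data: T;
};

let nextID = 0;

// Hypothetical helper: wraps a status event in a structured-mode envelope.
function toCloudEvent(
  event: ReplicationStatusEvent,
  source: string,
  now: Date = new Date(),
): CloudEvent<ReplicationStatusEvent> {
  return {
    specversion: '1.0',
    id: String(++nextID), // must be unique per source
    source,
    type: event.type,
    time: now.toISOString(),
    datacontenttype: 'application/json',
    data: event,
  };
}

const sample = toCloudEvent(
  {
    type: 'zero/events/status/replication/v1',
    component: 'replication',
    stage: 'Initializing',
    status: 'OK',
    tables: [{name: 'issues', columns: ['id', 'title']}], // made-up schema
  },
  'urn:example:zero-cache', // assumed source URI
  new Date(0),
);
```

Serializing `sample` with `JSON.stringify` yields a body that a CloudEvents-aware router could dispatch on the `type` attribute, while an operator inspects `data.tables` to confirm what is being replicated.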

@darkgnotic darkgnotic requested a review from cesara August 14, 2025 10:24

vercel bot commented Aug 14, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

| Project | Deployment | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| replicache-docs | Ready | Preview | Comment | Aug 14, 2025 10:35am |
| zbugs | Ready | Preview | Comment | Aug 14, 2025 10:35am |

github-actions bot commented Aug 14, 2025

🐰 Bencher Report

Branch: darkgnotic/zero-events
Testbed: Linux

🚨 2 Alerts

| Benchmark | Measure (units) | Result (Δ%) | Baseline | Lower Boundary (Limit %) |
| --- | --- | --- | --- | --- |
| src/client/zero.bench.ts > pk compare > pk = N | Throughput (ops/s x 1e3) | 21.51 (-24.64%) | 28.54 | 24.50 (113.90%) |
| src/client/zero.bench.ts > with filter > Lower rows 500 x 10 columns (numbers) | Throughput (ops/s x 1e3) | 2.35 (-8.12%) | 2.56 | 2.47 (105.16%) |

All benchmark results (Throughput, ops/s x 1e3)

| Benchmark | Result (Δ%) | Baseline | Lower Boundary (Limit %) | Alert |
| --- | --- | --- | --- | --- |
| src/client/custom.bench.ts > big schema | 366.99 (-4.54%) | 384.45 | 353.76 (96.40%) | |
| src/client/zero.bench.ts > basics > All 1000 rows x 10 columns (numbers) | 1.61 (-1.04%) | 1.63 | 1.59 (98.58%) | |
| src/client/zero.bench.ts > pk compare > pk = N | 21.51 (-24.64%) | 28.54 | 24.50 (113.90%) | 🚨 |
| src/client/zero.bench.ts > with filter > Lower rows 500 x 10 columns (numbers) | 2.35 (-8.12%) | 2.56 | 2.47 (105.16%) | 🚨 |

🐰 View full continuous benchmarking report in Bencher

github-actions bot commented Aug 14, 2025

🐰 Bencher Report

Branch: darkgnotic/zero-events
Testbed: Linux

All benchmark results (File Size, KB)

| Benchmark | Result (Δ%) | Baseline | Upper Boundary (Limit %) |
| --- | --- | --- | --- |
| zero-package.tgz | 1,280.67 (+0.37%) | 1,276.00 | 1,301.52 (98.40%) |
| zero.js | 207.51 (0.00%) | 207.51 | 211.66 (98.04%) |
| zero.js.br | 58.11 (0.00%) | 58.11 | 59.28 (98.04%) |
🐰 View full continuous benchmarking report in Bencher

@darkgnotic darkgnotic enabled auto-merge (squash) August 14, 2025 10:34
@darkgnotic darkgnotic merged commit b0f6be9 into main Aug 14, 2025
13 of 15 checks passed
@darkgnotic darkgnotic deleted the darkgnotic/zero-events branch August 14, 2025 10:55
Contributor

@arv arv left a comment

LGTM

"version": "0.0.0",
"private": true,
"type": "module",
"module": "true",
Contributor

Not sure what this is?

Contributor Author

sorry for the lack of context. i wish we could have comments in package.json

eventually i intend to publish this package alone as something like rocicorp/zero-events, so that production / operational code (i.e. cloudzero) can import just the ZeroEvent interfaces and not the full zero package.

Contributor

but "module": "true", is not part of package.json spec as far as I can tell.

Contributor Author

Oh, then it looks like I just cargo-culted it from another (mistaken) package.json 😅

},
"devDependencies": {
"@rocicorp/eslint-config": "^0.7.0",
"@rocicorp/prettier-config": "^0.3.0",
Contributor

vitest?

It does look like this package has no tests, so you should remove the test scripts

export const replicationStatusEventSchema = statusEventSchema.extend({
type: v
.literal('zero/events/status/replication/v1')
.assert(v => v.startsWith(STATUS_EVENT_PREFIX)),
Contributor

I'm not sure what this assert does here. Seems like it is more of a "safety check". You can probably achieve the same thing using typescript satisfies

Contributor Author

Okay, I got something to work using template literals, so we can avoid the somewhat cryptic use of assert.
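
For readers unfamiliar with the technique, a prefix constraint like this can be checked at compile time with a template literal type. The sketch below shows the general approach only; it is not the code that was committed. The value given to `STATUS_EVENT_PREFIX`, the `StatusEventType` alias, and the `isStatusEventType` guard are assumptions for illustration.

```typescript
// Assumed value; the PR only shows the constant's name.
const STATUS_EVENT_PREFIX = 'zero/events/status/';

// Any string that starts with the prefix.
type StatusEventType = `${typeof STATUS_EVENT_PREFIX}${string}`;

// Compiles only if the literal starts with the prefix. Changing it to, say,
// 'zero/events/other/v1' would be a type error rather than a runtime failure.
const REPLICATION_STATUS_TYPE =
  'zero/events/status/replication/v1' satisfies StatusEventType;

// Runtime counterpart, useful for strings that arrive over the wire.
function isStatusEventType(type: string): type is StatusEventType {
  return type.startsWith(STATUS_EVENT_PREFIX);
}
```

The compile-time check replaces the runtime `assert`, so a mistyped event type is caught when the schema is written instead of when it is first parsed.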

void publishFn(lc, event);
}

export async function publishCriticalEvent<E extends ZeroEvent>(
Contributor

is this just the same as publishEvent? What is the usecase?

Contributor Author

Most events are published fire-and-forget in the steady state.

But for errors, particularly when the server is going to be shut down, I expose this method that is intended to be awaited to make sure that the error event gets published before the server shuts down. This is to guarantee that the error makes it out of the server before it dies.
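
The difference between the two entry points can be sketched as follows. This is a simplified illustration, not the PR's implementation: the `ZeroEvent` shape, the in-memory `publishFn`, the `example/...` event types, and `shutdownWithError` are all hypothetical, and the log-context argument (`lc`) from the real signatures is omitted.

```typescript
// Hypothetical minimal event shape, for illustration only.
type ZeroEvent = {type: string; time: string};
type PublishFn = (event: ZeroEvent) => Promise<void>;

const delivered: ZeroEvent[] = [];

// Stand-in sink that records events in memory and never rejects.
const publishFn: PublishFn = async event => {
  delivered.push(event);
};

// Fire-and-forget: the promise is deliberately not awaited, so the caller
// continues immediately. This suits steady-state status updates.
function publishEvent(event: ZeroEvent): void {
  void publishFn(event);
}

// Awaitable: the caller can wait until the sink has accepted the event.
async function publishCriticalEvent(event: ZeroEvent): Promise<void> {
  await publishFn(event);
}

// Hypothetical shutdown path: report the error, then let the process exit.
async function shutdownWithError(err: Error): Promise<void> {
  await publishCriticalEvent({
    type: 'example/error',
    time: new Date().toISOString(),
  });
  // Only now is it safe to terminate, e.g. by rethrowing `err`.
  void err;
}
```

With a real network sink, a process that exits right after `publishEvent` may drop the request in flight, which is the case the awaited variant is meant to cover.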

return errorDetails;
}

const pathUnused = {push: () => {}, pop: () => {}};
Contributor

Path unused can also be an array.

Contributor Author

Hmmm ... can you clarify? I'm looking at this method:

export function isJSONValue(v: unknown, path: Path): v is JSONValue {

Contributor

The usage of isJSONValue usually takes an array. Array implements Path :-)

Contributor Author

Aah, I see. Roger.
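
To spell out the point about arrays, here is a small sketch. It assumes `Path` is a structural type with `push` and `pop` methods, which is what the `pathUnused` object above suggests; the real definition is not shown in this thread. `isJSONLike` is a simplified, hypothetical stand-in for `isJSONValue`.

```typescript
// Assumed shape of Path, inferred from the `pathUnused` object in the diff.
interface Path {
  push(key: string | number): void;
  pop(): void;
}

// Simplified stand-in for isJSONValue: returns false on the first non-JSON
// value and leaves `path` pointing at the offending location.
function isJSONLike(v: unknown, path: Path): boolean {
  switch (typeof v) {
    case 'boolean':
    case 'number':
    case 'string':
      return true;
    case 'object': {
      if (v === null) return true;
      if (Array.isArray(v)) {
        for (let i = 0; i < v.length; i++) {
          path.push(i);
          if (!isJSONLike(v[i], path)) return false;
          path.pop();
        }
        return true;
      }
      const obj = v as Record<string, unknown>;
      for (const key of Object.keys(obj)) {
        path.push(key);
        if (!isJSONLike(obj[key], path)) return false;
        path.pop();
      }
      return true;
    }
    default:
      return false; // undefined, function, symbol, bigint
  }
}

// A plain array satisfies Path structurally and records where validation failed.
const where: (string | number)[] = [];
const ok = isJSONLike({a: {b: () => {}}}, where);

// A no-op object also satisfies Path when the location is not needed.
const pathUnused: Path = {push: () => {}, pop: () => {}};
const alsoOk = isJSONLike({a: [1, 'two', null]}, pathUnused);
```

Passing an array costs little and yields a usable error location (`where` ends up holding the keys leading to the bad value), whereas the no-op object discards that information.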

component: 'replication',
stage: 'Initializing',
status: 'OK',
description: /Copying \d+ upstream tables at version \w+/,
Contributor

TIL, you can use RegExps in these

type: 'zero/events/status/replication/v1',
component: 'replication',
stage: 'Initializing',
status: 'OK',
Contributor

Why doesn't this have the description?

Contributor Author

This is the first status we publish when we start initial sync, before we've touched upstream at all. I couldn't think of any useful information to embellish here ... seems like "Initializing" is enough? It will be followed up with the "Copying ... " event pretty quickly. 🤷

lc,
'Indexing',
`Creating ${indexes.length} indexes`,
5000,
Contributor

What is 5000? and how is it exposed/tested?

Contributor Author

It's the interval between continuous updates. So during initial sync, events will be published every 5 seconds to surface the latest state of initial sync. In the PG case, it's mainly the replica size that will change.

It isn't tested. 🤫
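
A rough sketch of what such an interval-driven publisher could look like is shown below. It is an assumption-laden illustration rather than the PR's class: `StatusPublisher`, `StatusEvent`, and the injected `publish` callback are hypothetical names, and the sketch only refreshes the timestamp on each tick, whereas the real code presumably refreshes state such as the replica size.

```typescript
type StatusEvent = {stage: string; description: string; time: string};

class StatusPublisher {
  private readonly sink: (event: StatusEvent) => void;
  private timer: ReturnType<typeof setInterval> | undefined;

  constructor(sink: (event: StatusEvent) => void) {
    this.sink = sink;
  }

  // Publishes once immediately, then every `interval` ms if interval > 0.
  publish(stage: string, description: string, interval = 0): this {
    this.stop(); // a new status replaces any previous periodic update
    const emit = () =>
      this.sink({stage, description, time: new Date().toISOString()});
    emit();
    if (interval > 0) {
      this.timer = setInterval(emit, interval);
    }
    return this;
  }

  stop(): this {
    if (this.timer !== undefined) {
      clearInterval(this.timer);
      this.timer = undefined;
    }
    return this;
  }
}

const events: StatusEvent[] = [];
const publisher = new StatusPublisher(event => {
  events.push(event);
});

// Mirrors the call in the diff: re-publish every 5 seconds while indexing.
publisher.publish('Indexing', 'Creating 3 indexes', 5000);
publisher.stop(); // cancel the timer so the process can exit
```

Because the sink is injected, a test could drive this with fake timers and assert on the recorded events, which is one way the interval behaviour could be covered.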

publishEvent,
} from '../../observability/events.ts';

const byKeys = ([a]: [string, unknown], [b]: [string, unknown]) =>
Contributor

I'm not sure if this is hot but for hot code do not destructure arrays

Contributor Author

Not super hot ... events should be infrequent. But I'll remember this. Thank you! 🙏
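
For reference, the two styles look like this. The comparator bodies are assumed, since the diff only shows the first line of `byKeys`. Array destructuring is specified in terms of the iterator protocol, so it can be slower than direct index access unless the engine optimizes it away; whether that matters should be measured for the code in question.

```typescript
type Entry = [string, unknown];

// Destructuring form, as in the diff (body assumed).
const byKeysDestructured = ([a]: Entry, [b]: Entry): number =>
  a < b ? -1 : a > b ? 1 : 0;

// Index-access form suggested for hot paths: same ordering, no destructuring.
const byKeysIndexed = (a: Entry, b: Entry): number => {
  const ka = a[0];
  const kb = b[0];
  return ka < kb ? -1 : ka > kb ? 1 : 0;
};

const entries: Entry[] = [
  ['b', 2],
  ['a', 1],
  ['c', 3],
];
const sortedKeys = [...entries].sort(byKeysIndexed).map(entry => entry[0]);
const sortedKeysDestructured = [...entries]
  .sort(byKeysDestructured)
  .map(entry => entry[0]);
```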

interval = 0,
): this {
this.stop();
publishEvent(
Contributor

would it make more sense to make publishEvent a property that is passed in instead of overriding it for tests?

Contributor Author

that would be cleaner ... but it would be a lot of plumbing to get to this object from the top level of the test. i chose the lazy / less invasive route. i think it's "okay" because tests are run serially per process ... but please correct me if i'm mistaken.
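
For comparison, the injected variant the reviewer describes might look roughly like this. All names here (`Publisher`, `defaultPublisher`, `ReplicationReporter`, the `example/...` event type) are hypothetical and only illustrate the trade-off being discussed.

```typescript
type ZeroEvent = {type: string};
type Publisher = (event: ZeroEvent) => void;

// Hypothetical module-level default, standing in for the real publishEvent.
const defaultPublisher: Publisher = () => {};

class ReplicationReporter {
  private readonly publish: Publisher;

  // Production code omits the argument; tests pass a recorder instead of
  // overriding a module-level function.
  constructor(publish: Publisher = defaultPublisher) {
    this.publish = publish;
  }

  report(stage: string): void {
    this.publish({type: `example/status/${stage}`});
  }
}

const seen: ZeroEvent[] = [];
new ReplicationReporter(event => {
  seen.push(event);
}).report('Initializing');
```

The default parameter keeps production call sites unchanged, but every intermediate constructor between the test and this object still has to forward the publisher, which is the plumbing cost mentioned above.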

Contributor Author

@darkgnotic darkgnotic left a comment

Thank you @arv!
