
Conversation

darkgnotic
Contributor

@darkgnotic darkgnotic commented Oct 9, 2025

Improve IVM time-slicing so that slices run in separate iterations of the Node.js event loop, allowing I/O to be processed between slices and preventing connection failures due to unresponsive pings.

The simulator illustrates the behavior of the previous and new time-slicing implementations using a "pipeline" that:

  • fetches ~250K rows from an SQLite db, and
  • yields every 50K rows (roughly every 500ms),

while measuring the delay between a per-second ping and its received response:
[Screenshot: simulator output showing per-second ping delays]

Previous implementation

The previous setTimeout()-based implementation, while achieving fairness across pipelines, was problematic in that the slices of all pipelines ran in a single iteration of the event loop (specifically, the timers phase), incurring an I/O processing delay proportional to the number of pipelines.

For example, with 20 pipelines (i.e. ViewSyncers) in the process, the ping delay climbs to 20 × 500ms = 10s.

[Screenshot: ping delays with the previous setTimeout()-based implementation]

New implementation

The new time-slicing implementation leverages a queue (implemented with a Lock) to run a single time slice per iteration of the event loop:

  • setImmediate() is used to run logic after I/O has been processed (i.e. in the check phase), and
  • the next slice is scheduled by calling setImmediate() again from within that phase, which defers it to the following event loop iteration.
// Lock serializes callers: queue.withLock(fn) runs fn only after all
// previously queued callers have finished (e.g. the Lock from @rocicorp/lock).
const queue = new Lock();

function yieldProcess() {
  // previously ...
  // return new Promise(resolve => setTimeout(resolve, 0));

  // now ...
  return queue.withLock(() => new Promise(setImmediate));
}
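
For illustration, here is a sketch of how a pipeline's processing loop could call yieldProcess(); Row, processRow(), and the 50K-row threshold are placeholders mirroring the simulator description above, not the actual pipeline code.

type Row = Record<string, unknown>;
declare function processRow(row: Row): void;

async function runPipeline(rows: Iterable<Row>): Promise<void> {
  let count = 0;
  for (const row of rows) {
    processRow(row); // CPU-bound IVM work
    if (++count % 50_000 === 0) {
      // Wait for a turn in the queue, then resume in the check phase of a
      // fresh event loop iteration; pending I/O (e.g. ping frames) is
      // processed in between slices.
      await yieldProcess();
    }
  }
}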

As explained in the Node.js documentation for setImmediate():

If an immediate timer is queued from inside an executing callback, that timer will not be triggered until the next event loop iteration.
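
A tiny example of the quoted behavior (assuming nothing beyond Node.js itself): the inner immediate, queued from inside an executing immediate callback, does not run until the next event loop iteration, so pending I/O gets a chance to be processed in between.

setImmediate(() => {
  console.log('slice 1 (check phase, iteration N)');
  setImmediate(() => {
    console.log('slice 2 (check phase, iteration N+1, after I/O)');
  });
});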

The resulting behavior achieves the same fairness and overall completion time, but allows I/O to be processed between slices, capping the ping delay to the duration of a single time slice:

[Screenshot: ping delays with the new setImmediate()-based implementation]

Miscellaneous

Also consolidated the timer and yielding code, and made it such that we yield before the first time slice (e.g. so that we don't run into the same problem when a bunch of ViewSyncers start processing a long advancement).
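
For reference, a minimal sketch of what the consolidated timer could look like; start() and startWithoutYielding() are the names that appear in the review below, but the bodies and return values here are assumptions, not the actual implementation.

class Timer {
  #startTime = 0;

  // Yields once (via the queue above) before the first time slice, so a
  // burst of ViewSyncers starting long advancements cannot starve I/O
  // up front.
  async start(): Promise<number> {
    await yieldProcess();
    return this.startWithoutYielding();
  }

  // Synchronous variant for call sites that cannot await, e.g. inside
  // generators that iterate over Database results.
  startWithoutYielding(): number {
    this.#startTime = performance.now();
    return this.#startTime;
  }
}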

@darkgnotic darkgnotic requested a review from arv October 9, 2025 19:45

vercel bot commented Oct 9, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| replicache-docs | Ready | Preview | Comment | Oct 9, 2025 8:36pm |
| zbugs | Ready | Preview | Comment | Oct 9, 2025 8:36pm |

@darkgnotic darkgnotic changed the title fix(zero-cache): improve ivm time-slicing to avoid I/O starvation fix(zero-cache): improve IVM time-slicing to avoid I/O starvation Oct 9, 2025

github-actions bot commented Oct 9, 2025

🐰 Bencher Report

Branch: darkgnotic/avoid-io-starvation
Testbed: self-hosted

| Benchmark | Result (ops/s) | Δ% vs. Baseline (ops/s) | Lower Boundary (ops/s) | Limit % |
| --- | --- | --- | --- | --- |
| zpg: (pk lookup) select * from track where id = 3163 | 912.44 | -9.09% (1,003.63) | 849.21 | 93.07% |
| zpg: (secondary index lookup) select * from track where album_id = 248 | 1,111.54 | +3.70% (1,071.89) | 969.66 | 87.24% |
| zpg: (table scan) select * from album | 706.59 | -0.86% (712.73) | 609.52 | 86.26% |
| zpg: OR with empty branch and limit | 890.98 | -3.51% (923.36) | 837.47 | 93.99% |
| zpg: OR with empty branch and limit with exists | 728.22 | -4.30% (760.91) | 676.93 | 92.96% |
| zpg: all playlists | 5.57 | +0.75% (5.53) | 5.37 | 96.36% |
| zpg: scan with one depth related | 428.75 | -0.01% (428.80) | 410.74 | 95.80% |
| zql: (pk lookup) select * from track where id = 3163 | 115,312.11 | -6.36% (123,145.04) | 106,318.44 | 92.20% |
| zql: (secondary index lookup) select * from track where album_id = 248 | 2,289.03 | +5.47% (2,170.28) | 1,998.47 | 87.31% |
| zql: (table scan) select * from album | 6,726.54 | +1.59% (6,621.51) | 6,279.81 | 93.36% |
| zql: OR with empty branch and limit | 60,973.26 | +8.36% (56,267.43) | 45,863.32 | 75.22% |
| zql: OR with empty branch and limit with exists | 13,461.00 | +9.02% (12,347.46) | 11,003.57 | 81.74% |
| zql: all playlists | 4.38 | +2.96% (4.26) | 4.01 | 91.51% |
| zql: edit for limited query, inside the bound | 231,134.68 | +4.14% (221,942.63) | 209,047.60 | 90.44% |
| zql: edit for limited query, outside the bound | 237,308.16 | +7.22% (221,335.40) | 188,870.79 | 79.59% |
| zql: push into limited query, inside the bound | 110,745.89 | +0.76% (109,913.92) | 106,918.94 | 96.54% |
| zql: push into limited query, outside the bound | 402,209.91 | -5.57% (425,937.38) | 389,870.69 | 96.93% |
| zql: push into unlimited query | 350,928.90 | +3.13% (340,270.20) | 316,378.96 | 90.15% |
| zql: scan with one depth related | 505.94 | +6.24% (476.25) | 438.74 | 86.72% |
| zqlite: (pk lookup) select * from track where id = 3163 | 42,927.62 | +0.42% (42,746.99) | 40,811.21 | 95.07% |
| zqlite: (secondary index lookup) select * from track where album_id = 248 | 10,188.09 | +0.58% (10,129.00) | 9,849.35 | 96.68% |
| zqlite: (table scan) select * from album | 1,364.97 | +1.87% (1,339.94) | 1,252.98 | 91.80% |
| zqlite: OR with empty branch and limit | 17,909.44 | -0.33% (17,969.34) | 17,365.29 | 96.96% |
| zqlite: OR with empty branch and limit with exists | 5,663.73 | +3.66% (5,463.62) | 5,154.76 | 91.01% |
| zqlite: all playlists | 1.43 | +2.55% (1.40) | 1.32 | 92.21% |
| zqlite: edit for limited query, inside the bound | 121,757.37 | +2.53% (118,747.75) | 111,557.94 | 91.62% |
| zqlite: edit for limited query, outside the bound | 124,235.44 | +2.86% (120,784.83) | 113,264.49 | 91.17% |
| zqlite: push into limited query, inside the bound | 3,811.57 | -0.59% (3,834.36) | 3,690.43 | 96.82% |
| zqlite: push into limited query, outside the bound | 145,646.49 | +0.82% (144,454.98) | 137,132.93 | 94.15% |
| zqlite: push into unlimited query | 130,368.07 | +3.98% (125,373.52) | 113,544.68 | 87.10% |
| zqlite: scan with one depth related | 163.74 | +0.09% (163.59) | 153.14 | 93.52% |
🐰 View full continuous benchmarking report in Bencher


github-actions bot commented Oct 9, 2025

🐰 Bencher Report

Branch: darkgnotic/avoid-io-starvation
Testbed: Linux

| Benchmark | File Size (KB) | Δ% vs. Baseline (KB) | Upper Boundary (KB) | Limit % |
| --- | --- | --- | --- | --- |
| zero-package.tgz | 1,371.10 | +0.02% (1,370.76) | 1,398.18 | 98.06% |
| zero.js | 225.67 | 0.00% (225.67) | 230.19 | 98.04% |
| zero.js.br | 63.05 | 0.00% (63.05) | 64.31 | 98.04% |
🐰 View full continuous benchmarking report in Bencher


github-actions bot commented Oct 9, 2025

🐰 Bencher Report

Branch: darkgnotic/avoid-io-starvation
Testbed: self-hosted

| Benchmark | Result (ops/s × 1e3) | Δ% vs. Baseline (ops/s × 1e3) | Lower Boundary (ops/s × 1e3) | Limit % |
| --- | --- | --- | --- | --- |
| src/client/custom.bench.ts > big schema | 908.58 | +10.62% (821.33) | 655.20 | 72.11% |
| src/client/zero.bench.ts > basics > All 1000 rows x 10 columns (numbers) | 2.86 | +2.13% (2.80) | 2.68 | 93.71% |
| src/client/zero.bench.ts > pk compare > pk = N | 44.18 | +1.05% (43.72) | 41.95 | 94.94% |
| src/client/zero.bench.ts > with filter > Lower rows 500 x 10 columns (numbers) | 4.15 | -0.14% (4.16) | 3.99 | 95.99% |
🐰 View full continuous benchmarking report in Bencher

@aboodman
Contributor

aboodman commented Oct 9, 2025 via email


github-actions bot commented Oct 9, 2025

🐰 Bencher Report

Branch: darkgnotic/avoid-io-starvation
Testbed: self-hosted

| Benchmark | Result (ops/s × 1e3) | Δ% vs. Baseline (ops/s × 1e3) | Lower Boundary (ops/s × 1e3) | Limit % |
| --- | --- | --- | --- | --- |
| src/size-of-value.bench.ts > getSizeOfValue performance > arrays > large array (100 items) | 1,725.09 | +1.87% (1,693.48) | 1,634.51 | 94.75% |
| src/size-of-value.bench.ts > getSizeOfValue performance > arrays > small array (10 items) | 5,413.99 | -0.50% (5,441.31) | 5,212.91 | 96.29% |
| src/size-of-value.bench.ts > getSizeOfValue performance > datasets > large dataset (100x512B) | 85.93 | +9.09% (78.77) | 62.67 | 72.94% |
| src/size-of-value.bench.ts > getSizeOfValue performance > datasets > small dataset (10x256B) | 776.50 | +10.95% (699.89) | 561.91 | 72.36% |
| src/size-of-value.bench.ts > getSizeOfValue performance > objects > nested object | 3,375.24 | +3.48% (3,261.70) | 2,907.58 | 86.14% |
| src/size-of-value.bench.ts > getSizeOfValue performance > objects > structured object (1KB) | 4,129.44 | +6.53% (3,876.26) | 3,290.51 | 79.68% |
| src/size-of-value.bench.ts > getSizeOfValue performance > objects > structured object (256B) | 4,093.43 | +6.11% (3,857.86) | 3,287.61 | 80.31% |
| src/size-of-value.bench.ts > getSizeOfValue performance > primitives > boolean | 7,611.93 | +1.67% (7,487.07) | 7,079.48 | 93.01% |
| src/size-of-value.bench.ts > getSizeOfValue performance > primitives > integer | 7,760.64 | +3.29% (7,513.57) | 7,079.12 | 91.22% |
| src/size-of-value.bench.ts > getSizeOfValue performance > primitives > null | 7,619.74 | +0.81% (7,558.67) | 7,091.04 | 93.06% |
| src/size-of-value.bench.ts > getSizeOfValue performance > primitives > string (100 chars) | 1,148.66 | +2.58% (1,119.72) | 1,069.02 | 93.07% |
🐰 View full continuous benchmarking report in Bencher

@darkgnotic darkgnotic enabled auto-merge October 9, 2025 20:41
@darkgnotic darkgnotic added this pull request to the merge queue Oct 9, 2025
Merged via the queue into main with commit 65f2d92 Oct 9, 2025
24 of 27 checks passed
@darkgnotic darkgnotic deleted the darkgnotic/avoid-io-starvation branch October 9, 2025 21:05
q.id,
q.ast,
timer.start(),
timer.startWithoutYielding(),
Contributor Author

@arv

I think this would have been cleaner / clearer if I was able to await timer.start() here like I do in the other places, but that would then require this generator (and the other generators that iterate over Database results) to be AsyncIterables instead of Iterables.

Would that have been expensive, i.e. to

  • wrap the sqlite Iterables as AsyncIterables and
  • add an await to every iterator call in the #processChanges() loop

when most of the time the iterator fetch is synchronous, or is that essentially negligible?

Contributor

@arv arv Oct 10, 2025

I think you are fine making this an AsyncIterator. The microtask loop does add some overhead, but we are already using yield, so I don't think it should be that bad.

Benchmark it!

Contributor Author

@darkgnotic darkgnotic Oct 10, 2025

FTR I wrote a benchmark that iterates through 10,000 rows in a simple SELECT statement.

// describe/bench come from vitest; Database and createSilentLogContext are
// existing zero-cache test helpers (import paths omitted here).
import {bench, describe} from 'vitest';

describe('sqlite iteration variations', () => {
  const db = new Database(
    createSilentLogContext(),
    '/tmp/zbugs-sync-replica.db',
  );
  const select = db.prepare('SELECT * FROM comment LIMIT 10000');

  bench('synchronous', () => {
    // Plain synchronous iteration over the sqlite statement.
    for (const _ of select.iterate()) {
    }
  });

  bench('synchronous with for await', async () => {
    // Same synchronous iterator, but consumed with for await, which awaits
    // every value even though none of them are Promises.
    for await (const _ of select.iterate()) {
    }
  });

  bench('synchronous with wrapped async iterator', async () => {
    // Wrap the synchronous iterator in an async generator, then consume it.
    const asyncIterator = (async function* () {
      for (const item of select.iterate()) {
        yield item;
      }
    })();
    for await (const _ of asyncIterator) {
    }
  });
});

The plain synchronous iteration is reliably ~10% faster.

 ✓  zero-cache/no-pg  bench/sqlite-iteration.bench.ts > sqlite iteration variations 1858ms
     name                                          hz     min      max     mean      p75      p99     p995     p999     rme  samples
   · synchronous                               108.03  8.6723   9.8659   9.2564   9.3097   9.8659   9.8659   9.8659  ±0.59%       55
   · synchronous with for await               98.1465  9.6024  10.6957  10.1888  10.2620  10.6957  10.6957  10.6957  ±0.50%       50
   · synchronous with wrapped async iterator  98.6003  9.5159  10.6587  10.1420  10.2224  10.6587  10.6587  10.6587  ±0.50%       50

 BENCH  Summary

   zero-cache/no-pg  synchronous - bench/sqlite-iteration.bench.ts > sqlite iteration variations
    1.10x faster than synchronous with wrapped async iterator
    1.10x faster than synchronous with for await

Granted, the overhead may be less significant depending on the cost of the logic inside of the loop, but the cost of the additional awaits and Promises over a plain table iteration is not negligible.

So I'll keep the code as it is (i.e. with sync iterators for row iteration and the occasional await for yielding).

@erikmunson
Contributor

omg you found it!!! huge!!

@arv
Contributor

arv commented Oct 10, 2025

LGTM
