
Conversation

@potatosalad (Contributor) commented Oct 23, 2025

Potential fix for #10322

If I'm understanding correctly what happens when scheduler pollset migration gets triggered: once state->count++ > 10, the "hot fd" is migrated to the scheduler pollset.

However, state->count is shared across every process selecting on the fd, so if there are multiple processes all calling socket:accept at the same time, state->count++ > 10 can be triggered almost immediately.

For things like socket:recv, this generally works as expected, since there is typically a single process calling it in a loop.

However, for socket:accept it is common practice to have multiple processes all trying to accept inbound connections on the same listen socket.

If all of that's correct, then this is my assumption:

  • OTP 27: Direct message delivery from the I/O thread(?)
  • OTP 28: Process signals, sent from the normal scheduler thread(?)

The change here is to add a state->last_select_pid field and reset state->count to 0 when we detect that the calling process has changed between calls to enif_select.
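
Roughly, the idea looks like this (just an illustrative sketch, not the actual erl_check_io.c code; the struct, field names, pid type, and threshold below are placeholders):

/* Sketch only: placeholder names, not the real ERTS definitions. */
typedef long example_pid_t;              /* stands in for the real process id type */

typedef struct {
    int           count;                 /* consecutive selects, used to detect a "hot fd" */
    example_pid_t last_select_pid;       /* new field: who performed the previous select */
} example_fd_state;

#define EXAMPLE_HOT_FD_THRESHOLD 10      /* placeholder for the real limit */

/* Conceptually called on each enif_select for the fd. Returns nonzero
 * when the fd should be migrated to the scheduler pollset. */
static int example_note_select(example_fd_state *state, example_pid_t caller)
{
    if (state->last_select_pid != caller) {
        /* A different process is selecting now (e.g. another acceptor),
         * so restart the hot-fd counting instead of letting many
         * processes trip the threshold together. */
        state->count = 0;
        state->last_select_pid = caller;
    }
    return state->count++ > EXAMPLE_HOT_FD_THRESHOLD;
}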

github-actions bot commented Oct 23, 2025

CT Test Results

    3 files    135 suites   49m 22s ⏱️
1 655 tests 1 598 ✅ 57 💤 0 ❌
2 293 runs  2 217 ✅ 76 💤 0 ❌

Results for commit c89cbee.

♻️ This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run tests locally.

Artifacts

// Erlang/OTP Github Action Bot

@potatosalad marked this pull request as ready for review on October 23, 2025 20:45
@IngelaAndin added the team:VM label (Assigned to OTP team VM) on Oct 27, 2025
@sverker self-assigned this on Oct 27, 2025
@sverker (Contributor) commented Oct 29, 2025

This looks promising. I pushed two minor fixups.
Does this PR fix your regression?

@potatosalad (Contributor Author)

@sverker Yes, we tested the patches today and all the issues we were seeing appear to have been resolved.

@sverker force-pushed the potatosalad/10322-non-global-scheduler-pollset-count branch from 0b47de1 to c89cbee on October 30, 2025 15:45
@sverker changed the base branch from master to maint on October 30, 2025 15:46
@sverker (Contributor) commented Oct 30, 2025

I rebased and squashed it for inclusion into maint (OTP-28.2).

@sverker added the fix and testing labels (testing: currently being tested, tag is used by OTP internal CI) on Oct 30, 2025
@sverker self-requested a review on October 30, 2025 15:49
@potatosalad (Contributor Author)

@sverker Some additional information: we've noticed a latency regression at high CPU utilization (90%+) that seems to be caused by the schedulers slowing down due to the large number of pollset entries per scheduler.

I'll draw some ASCII art in a separate comment...

@potatosalad (Contributor Author) commented Oct 30, 2025

@sverker Alright, so before scheduler pollsets, my understanding is that this is roughly what was happening:

┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐
│ S0      │  │ S1      │  │ S2      │  │ S3      │  │ I/O     │
└─────────┘  └─────────┘  └─────────┘  └─────────┘  │ PS a..h │
                                                    └─────────┘

At high load, all of the pollset operations are still isolated to the I/O event loop and won't uniformly slow down the schedulers when there is a really large number of file descriptors in the pollsets.

With the new scheduler pollset feature, we have this:

┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐
│ S0      │  │ S1      │  │ S2      │  │ S3      │
│ PS a..b │  │ PS c..d │  │ PS e..f │  │ PS g..h │
└─────────┘  └─────────┘  └─────────┘  └─────────┘

First off: is this an accurate view of how this works?

We tested a version of OTP where ERTS_POLL_USE_SCHEDULER_POLLING was defined as (0) to disable the feature entirely, and the latency regression at high CPU went away.
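
For reference, the local change we tested was essentially just something along these lines (approximate; the exact file and location of the macro definition omitted here):

/* Approximate local patch used only for this test: force the feature off. */
#undef  ERTS_POLL_USE_SCHEDULER_POLLING
#define ERTS_POLL_USE_SCHEDULER_POLLING (0)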

If all of this is correct, would it make sense to have a socket option that we could use to selectively prevent individual sockets from being included in the scheduler pollset optimization? We think this might help us in scenarios with a very large number of sockets.

For a small number of sockets, the scheduler pollset optimization should provide more of a benefit, I would think.

Maybe something like:

socket:setopt(Socket, {otp, scheduler_polling}, false).

What do you think?

@sverker (Contributor) commented Nov 3, 2025

There is one scheduler pollset, namely sched_pollset in erl_check_io.c. Only one scheduler at a time waits for events from the pollset, and it does not do so until all outstanding triggered events have been handled. There are some internal docs written about this at https://www.erlang.org/doc/apps/erts/checkio.html.

You can disable the scheduler pollset with erl +IOs false. If you don't get any performance gain from the scheduler pollset, then that seems to be the easy solution.

If an option would be necessary, I would prefer something less implementation specific than {otp, scheduler_polling} if possible.
