This repository was archived by the owner on Jan 25, 2022. It is now read-only.

Document prior art / common finalizer pitfalls #78

@wingo

I think it may be a good idea to mention the historical pitfalls that finalizers have had in practice in other languages, and how this proposal is affected by them. Just off the top of my head:

Reachable finalized objects

In Java, finalizers run as instance methods on the object being finalized. If a clique of finalizable objects becomes unreachable, Java is still able to collect the clique, but it may be that a finalizer for object A could see object B in an already-finalized state, which is often a source of bugs.

This proposal avoids this problem by design: objects are already dead and unreachable when finalizers run. Finalizers will never see a finalized object.
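A minimal sketch of why this holds, written against the FinalizationRegistry API that later shipped in ECMAScript (the name differs from the finalization groups discussed here). The names `openHandles`, `cleanup` and `track` are hypothetical and only there for illustration. The callback receives just a held value that was registered alongside the object, never the object itself, so it has no way to observe a finalized object.

```javascript
// Hypothetical bookkeeping: handle ids that are still considered open.
const openHandles = new Set();

// The cleanup callback gets the held value (here a numeric id), not the
// object that died. Whether and when it runs is up to the engine's GC.
function cleanup(handleId) {
  openHandles.delete(handleId);
}

const registry = new FinalizationRegistry(cleanup);

// Hypothetical helper: remember the id and ask to be told when obj dies.
function track(obj, handleId) {
  openHandles.add(handleId);
  // The held value must not be (or reference) obj itself, otherwise obj
  // would stay reachable and never be collected.
  registry.register(obj, handleId);
  return obj;
}

track({ name: "a" }, 1);
```

Nothing here forces a collection; the sketch only shows what information the callback has to work with.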

Scarce resource management

Scarce resources like file descriptors are often a motivation for finalizers, and indeed they show up in the explainer. The explainer is correct that finalizers are useful as a backstop to release scarce resources. However in practice it's possible that users write code that acquires scarce resources faster than the garbage collector will run; e.g.

for (const name of fileNames) {
  foo(openFile(name));
}

Here it's likely that the GC won't run before all file descriptors are exhausted, if fileNames is long enough.

Of course the correct construct here is something that will release scarce resources (file descriptors, in this case) promptly, i.e.

for (const name of fileNames) {
  let fd = openFile(name);
  try { foo(fd); } finally { closeFile(fd); }
}

However users do all kinds of things, and to be fair sometimes you don't know when you have this situation in testing and instead you see it in production. Although this spec doesn't create this hazard, as host objects may have finalizers of some sort, it will likely exacerbate it.

To compensate, in languages that expose an interface to a garbage collector, it's usual for the implementation of openFile to explicitly run a gc() if allocating a file descriptor fails. I think this spec is going to encourage users to ask for a standardized interface to run GC; it would be nice to avoid this pressure but I don't see a way around it (the pressure, I mean; we can probably avoid giving in, though).
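For concreteness, the compensation described above might look roughly like this sketch. Everything in it is hypothetical: `rawOpen` stands in for a host primitive that throws when descriptors run out, and `gc` for a collector interface that JS does not standardize. Both are passed in as parameters so the sketch makes no claim about any real API, and it assumes `gc()` returns only after descriptors held by dead objects have been released.

```javascript
// Hypothetical factory. rawOpen(name) is assumed to throw when no descriptor
// is free; gc() is assumed to run a collection and release descriptors that
// were only held by unreachable objects before it returns.
function makeOpenFile(rawOpen, gc) {
  return function openFile(name) {
    try {
      return rawOpen(name);
    } catch (err) {
      // Allocation failed: collect once, then retry. If the retry also
      // fails, that error propagates to the caller.
      gc();
      return rawOpen(name);
    }
  };
}
```

Retrying only once keeps the failure mode simple: if a collection doesn't free anything, the caller still sees the original kind of error.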

Concurrent finalization

Finalizers introduce concurrency: in addition to the promises, callbacks, event handlers, and so on which are present in a JS system, there will also be finalizers. The only useful thing a finalizer can do is to modify global state in some way, so finalizers are necessarily concurrent with main-program modifications on that global state.

This proposal avoids two problems that Java experienced. One is that because finalization group callbacks are only invoked in their own microtask, they aren't concurrent with any other microtask. In contrast, in early Java they could be invoked within other "microtasks", from the "main thread". Early Java moved away from this model because if a finalizer needs lock A but the finalizer runs when the main thread has already acquired lock A, you get a deadlock. So Java moved to always run finalizers from dedicated threads, to avoid this issue.

The other problem Java still has but which this proposal does not have is that it runs finalizer "microtasks" from a separate thread, so they need judicious lock discipline to coordinate access to shared global state. This proposal doesn't need locks as such.

However there is the possibility for concurrent mutation during long-running async tasks. For example, consider:

  async function visitAll() {
    for (const fd of File.allFds()) {
      await visit(fd);
    }
  }

Here while the values from File.allFds() are being consumed, the finalizer may be mutating the set of values that are in that backing set. However this problem is shared with other kinds of shared-state concurrency in JS and the specifications cover what happens when e.g. a Set is modified while it is being iterated, so there's no particular hazard here.
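To illustrate the specified behavior, here is a small sketch of a Set being mutated while it is iterated. The `fds` set and the mutation inside the loop are hypothetical stand-ins for a backing set and for a finalizer that happens to run between two steps of the iteration.

```javascript
// Hypothetical backing set of open descriptors.
const fds = new Set([1, 2, 3]);
const visited = [];

for (const fd of fds) {
  visited.push(fd);
  if (fd === 1) {
    // Stand-in for a finalizer running between two steps of the iteration.
    fds.delete(2); // not yet visited: the iterator skips it
    fds.add(4);    // added during iteration: the iterator still reaches it
  }
}

// visited is now [1, 3, 4]
```

The iteration neither throws nor revisits anything; it just reflects the set's contents as of each step.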

Early finalization

Consider:

async function f(x) {
  let fd = x.fd;
  // await some things
  g(fd);
}

When g is called, it could be that GC determined that x was no longer reachable, and that therefore its finalizer could be called. For synchronous functions, this proposal avoids the issue, as finalizers aren't concurrent with microtasks. However for async functions it's perfectly possible. Implementations have their own characteristics but I can imagine that they will push back on any specification of what it means for an object to be reachable.

The situation is of course aggravated in the presence of aggressive inlining and/or tail calls.

I am not sure that we can avoid this one entirely from the specification perspective and I think it's likely that some users will be bitten by this class of bug.

See also, "Finalization Should Not Be Based On Reachability", a small polemic by Hans Boehm: https://www.hboehm.info/misc_slides/ISMMfinalization.pdf, or also https://www.hboehm.info/popl03/web/html/slide_12.html.

Miscellaneous references

[1] Destructors, Finalizers, and Synchronization, Hans Boehm, https://www.hpl.hp.com/techreports/2002/HPL-2002-335.pdf
