Streams

Living Standard — Last Updated


Abstract

This specification provides APIs for creating, composing, and consuming streams of data that map efficiently to low-level I/O primitives.

1. Introduction

This section is non-normative.

Large swathes of the web platform are built on streaming data: that is, data that is created, processed, and consumed in an incremental fashion, without ever reading all of it into memory. The Streams Standard provides a common set of APIs for creating and interfacing with such streaming data, embodied in readable streams, writable streams, and transform streams.

These APIs have been designed to efficiently map to low-level I/O primitives, including specializations for byte streams where appropriate. They allow easy composition of multiple streams into pipe chains, or can be used directly via readers and writers. Finally, they are designed to automatically provide backpressure and queuing.

This standard provides the base stream primitives which other parts of the web platform can use to expose their streaming data. For example, [FETCH] exposes Response bodies as ReadableStream instances. More generally, the platform is full of streaming abstractions waiting to be expressed as streams: multimedia streams, file streams, inter-global communication, and more benefit from being able to process data incrementally instead of buffering it all into memory and processing it in one go. By providing the foundation for these streams to be exposed to developers, the Streams Standard enables use cases that depend on this kind of incremental processing.

Web developers can also use the APIs described here to create their own streams, with the same APIs as those provided by the platform. Other developers can then transparently compose platform-provided streams with those supplied by libraries. In this way, the APIs described here provide a unifying abstraction for all streams, encouraging an ecosystem to grow around these shared and composable interfaces.

2. Model

A chunk is a single piece of data that is written to or read from a stream. It can be of any type; streams can even contain chunks of different types. A chunk will often not be the most atomic unit of data for a given stream; for example a byte stream might contain chunks consisting of 16 KiB Uint8Arrays, instead of single bytes.

2.1. Readable streams

A readable stream represents a source of data, from which you can read. In other words, data comes out of a readable stream. Concretely, a readable stream is an instance of the ReadableStream class.

Although a readable stream can be created with arbitrary behavior, most readable streams wrap a lower-level I/O source, called the underlying source. There are two types of underlying source: push sources and pull sources.

Push sources push data at you, whether or not you are listening for it. They may also provide a mechanism for pausing and resuming the flow of data. An example push source is a TCP socket, where data is constantly being pushed from the OS level, at a rate that can be controlled by changing the TCP window size.

Pull sources require you to request data from them. The data may be available synchronously, e.g. if it is held by the operating system’s in-memory buffers, or asynchronously, e.g. if it has to be read from disk. An example pull source is a file handle, where you seek to specific locations and read specific amounts.

Readable streams are designed to wrap both types of sources behind a single, unified interface. For web developer–created streams, the implementation details of a source are provided by an object with certain methods and properties that is passed to the ReadableStream() constructor.

Chunks are enqueued into the stream by the stream’s underlying source. They can then be read one at a time via the stream’s public interface, in particular by using a readable stream reader acquired using the stream’s getReader() method.
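
As an illustration of the paragraphs above, the following sketch wraps a made-up pull source in a ReadableStream and reads it through a reader. The names makeCounterStream and readCounter, and the numeric chunks, are assumptions chosen for this example; they are not part of the standard.

```javascript
// Hypothetical pull source: hands out the numbers 1..limit, one per pull() call.
function makeCounterStream(limit) {
  let next = 1;
  return new ReadableStream({
    pull(controller) {
      if (next > limit) {
        controller.close(); // no more data: signal the end of the stream
        return;
      }
      controller.enqueue(next++); // the underlying source enqueues a chunk
    }
  });
}

// A consumer: reads chunks one at a time through a reader.
async function readCounter() {
  const reader = makeCounterStream(3).getReader();
  const seen = [];
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    seen.push(value);
  }
  return seen;
}
```

Because the source only produces a chunk when pull() is called, nothing is generated faster than the consumer asks for it.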

Code that reads from a readable stream using its public interface is known as a consumer.

Consumers also have the ability to cancel a readable stream, using its cancel() method. This indicates that the consumer has lost interest in the stream, and will immediately close the stream, throw away any queued chunks, and execute any cancellation mechanism of the underlying source.
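
A minimal sketch of cancellation, assuming a hypothetical underlying source that simply records the reason it is given. The function name cancelDemo and the string values are illustrative only.

```javascript
async function cancelDemo() {
  let cancelReason; // recorded by the hypothetical underlying source below
  const stream = new ReadableStream({
    start(controller) {
      controller.enqueue("queued but never read");
    },
    cancel(reason) {
      // The underlying source's cancellation mechanism runs here.
      cancelReason = reason;
    }
  });

  await stream.cancel("no longer needed");

  // The stream is now closed and its queue was discarded,
  // so a read reports done rather than the queued chunk.
  const { done } = await stream.getReader().read();
  return { cancelReason, done };
}
```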

Consumers can also tee a readable stream using its tee() method. This will lock the stream, making it no longer directly usable; however, it will create two new streams, called branches, which can be consumed independently.
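
The following sketch shows teeing under simple assumptions: a small in-memory stream is split into two branches, and each branch is drained independently. The helper drain and the function teeDemo are names invented for this example.

```javascript
// Reads a stream to completion and returns its chunks.
async function drain(stream) {
  const reader = stream.getReader();
  const chunks = [];
  for (;;) {
    const { value, done } = await reader.read();
    if (done) return chunks;
    chunks.push(value);
  }
}

async function teeDemo() {
  const original = new ReadableStream({
    start(controller) {
      controller.enqueue("x");
      controller.enqueue("y");
      controller.close();
    }
  });

  const [branch1, branch2] = original.tee();
  const lockedAfterTee = original.locked; // the original is no longer directly usable

  // Each branch sees every chunk of the original stream.
  const [first, second] = await Promise.all([drain(branch1), drain(branch2)]);
  return { lockedAfterTee, first, second };
}
```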

For streams representing bytes, an extended version of the readable stream is provided to handle bytes efficiently, in particular by minimizing copies. The underlying source for such a readable stream is called an underlying byte source. A readable stream whose underlying source is an underlying byte source is sometimes called a readable byte stream. Consumers of a readable byte stream can acquire a BYOB reader using the stream’s getReader() method.

2.2. Writable streams

A writable stream represents a destination for data, into which you can write. In other words, data goes into a writable stream. Concretely, a writable stream is an instance of the WritableStream class.

Analogously to readable streams, most writable streams wrap a lower-level I/O sink, called the underlying sink. Writable streams work to abstract away some of the complexity of the underlying sink, by queuing subsequent writes and only delivering them to the underlying sink one by one.

Chunks are written to the stream via its public interface, and are passed one at a time to the stream’s underlying sink. For web developer–created streams, the implementation details of the sink are provided by an object with certain methods that is passed to the WritableStream() constructor.
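
As a rough sketch of this arrangement, the example below passes a sink object to the WritableStream() constructor and writes to it through a writer. The in-memory received array stands in for a real I/O destination and, like the name writeDemo, is an assumption made for illustration.

```javascript
async function writeDemo() {
  const received = []; // hypothetical destination standing in for real I/O

  const stream = new WritableStream({
    // Called once per chunk, never concurrently: the stream queues later writes.
    write(chunk) {
      received.push(chunk);
    }
  });

  const writer = stream.getWriter();
  writer.write("hello");
  writer.write("world");
  await writer.close(); // settles after the queued writes have reached the sink
  return received;
}
```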

Code that writes into a writable stream using its public interface is known as a producer.

Producers also have the ability to abort a writable stream, using its abort() method. This indicates that the producer believes something has gone wrong, and that future writes should be discontinued. It puts the stream in an errored state, even without a signal from the underlying sink, and it discards all writes in the stream’s internal queue.
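
The sketch below illustrates aborting, assuming a deliberately slow hypothetical sink so that a later write is still queued when abort() is called. The names abortDemo, delivered, and secondOutcome are invented for this example.

```javascript
async function abortDemo() {
  const delivered = [];
  let abortReason;

  const stream = new WritableStream({
    write(chunk) {
      delivered.push(chunk);
      // Simulate a slow sink so that later writes wait in the internal queue.
      return new Promise(resolve => setTimeout(resolve, 10));
    },
    abort(reason) {
      abortReason = reason;
    }
  });

  const writer = stream.getWriter();
  writer.write("first").catch(() => {}); // may or may not reach the sink
  const second = writer.write("second").then(() => "written", () => "rejected");

  await writer.abort("giving up");
  return { delivered, abortReason, secondOutcome: await second };
}
```

The queued write is rejected rather than delivered, and the sink's abort() method receives the reason supplied by the producer.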

2.3. Transform streams

A transform stream consists of a pair of streams: a writable stream, known as its writable side, and a readable stream, known as its readable side. In a manner specific to the transform stream in question, writes to the writable side result in new data being made available for reading from the readable side.

Concretely, any object with a writable property and a readable property can serve as a transform stream. However, the standard TransformStream class makes it much easier to create such a pair that is properly entangled. It wraps a transformer, which defines algorithms for the specific transformation to be performed. For web developer–created streams, the implementation details of a transformer are provided by an object with certain methods and properties that is passed to the TransformStream() constructor. Other specifications might use the GenericTransformStream mixin to create classes with the same writable/readable property pair but other custom APIs layered on top.

An identity transform stream is a type of transform stream which forwards all chunks written to its writable side to its readable side, without any changes. This can be useful in a variety of scenarios. By default, the TransformStream constructor will create an identity transform stream, when no transform() method is present on the transformer object.
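
To make the distinction concrete, here is a small sketch that builds one transform stream with a transform() method and one without. The upper-casing transformer and the helper names are assumptions chosen for illustration.

```javascript
// A hypothetical transformer that upper-cases string chunks.
function makeUppercaseTransform() {
  return new TransformStream({
    transform(chunk, controller) {
      controller.enqueue(String(chunk).toUpperCase());
    }
  });
}

// Writes one chunk to the writable side and reads one from the readable side.
async function roundTrip({ writable, readable }, input) {
  const writer = writable.getWriter();
  const reader = readable.getReader();
  writer.write(input);
  const { value } = await reader.read();
  return value;
}

async function transformDemo() {
  const upper = makeUppercaseTransform();
  const identity = new TransformStream(); // no transform(): chunks pass through unchanged
  return [await roundTrip(upper, "abc"), await roundTrip(identity, "abc")];
}
```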

Some examples of potential transform streams include:

2.4. Pipe chains and backpressure

Streams are primarily used by piping them to each other. A readable stream can be piped directly to a writable stream, using its pipeTo() method, or it can be piped through one or more transform streams first, using its pipeThrough() method.

A set of streams piped together in this way is referred to as a pipe chain. In a pipe chain, the original source is the underlying source of the first readable stream in the chain; the ultimate sink is the underlying sink of the final writable stream in the chain.
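
A minimal pipe chain might look like the following sketch. The in-memory source, the upper-casing transform, and the collecting sink are all hypothetical stand-ins for real underlying sources and sinks.

```javascript
async function pipeChainDemo() {
  // Original source: a few in-memory chunks.
  const source = new ReadableStream({
    start(controller) {
      for (const word of ["streams", "are", "composable"]) controller.enqueue(word);
      controller.close();
    }
  });

  // Middle of the chain: a transform stream.
  const shout = new TransformStream({
    transform(chunk, controller) {
      controller.enqueue(chunk.toUpperCase());
    }
  });

  // Ultimate sink: collects whatever arrives.
  const collected = [];
  const sink = new WritableStream({
    write(chunk) {
      collected.push(chunk);
    }
  });

  // pipeThrough() returns the transform's readable side, so calls can be chained.
  await source.pipeThrough(shout).pipeTo(sink);
  return collected;
}
```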

Once a pipe chain is constructed, it will propagate signals regarding how fast chunks should flow through it. If any step in the chain cannot yet accept chunks, it propagates a signal backwards through the pipe chain, until eventually the original source is told to stop producing chunks so fast. This process of normalizing flow from the original source according to how fast the chain can process chunks is called backpressure.

Concretely, the original source is given the controller.desiredSize (or byteController.desiredSize) value, and can then adjust its rate of data flow accordingly. This value is derived from the writer.desiredSize corresponding to the ultimate sink, which gets updated as the ultimate sink finishes writing chunks. The pipeTo() method used to construct the chain automatically ensures this information propagates back through the pipe chain.

When teeing a readable stream, the backpressure signals from its two branches will aggregate, such that if neither branch is read from, a backpressure signal will be sent to the underlying source of the original stream.

Piping locks the readable and writable streams, preventing them from being manipulated for the duration of the pipe operation. This allows the implementation to perform important optimizations, such as directly shuttling data from the underlying source to the underlying sink while bypassing many of the intermediate queues.

2.5. Internal queues and queuing strategies

Both readable and writable streams maintain internal queues, which they use for similar purposes. In the case of a readable stream, the internal queue contains chunks that have been enqueued by the underlying source, but not yet read by the consumer. In the case of a writable stream, the internal queue contains chunks which have been written to the stream by the producer, but not yet processed and acknowledged by the underlying sink.

A queuing strategy is an object that determines how a stream should signal backpressure based on the state of its internal queue. The queuing strategy assigns a size to each chunk, and compares the total size of all chunks in the queue to a specified number, known as the high water mark. The resulting difference, high water mark minus total size, is used to determine the desired size to fill the stream’s queue.

For readable streams, an underlying source can use this desired size as a backpressure signal, slowing down chunk generation so as to try to keep the desired size above or at zero. For writable streams, a producer can behave similarly, avoiding writes that would cause the desired size to go negative.
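
On the writable side, one way a producer can respect this signal is to wait for writer.ready before each write, as in the sketch below. The slow sink, the high water mark of 2, and the name producerDemo are assumptions made for the example.

```javascript
async function producerDemo() {
  const stream = new WritableStream(
    {
      // Hypothetical slow sink: each write takes a few milliseconds.
      write() {
        return new Promise(resolve => setTimeout(resolve, 5));
      }
    },
    new CountQueuingStrategy({ highWaterMark: 2 })
  );

  const writer = stream.getWriter();
  const sizesBeforeWrite = [];

  for (const chunk of ["a", "b", "c", "d"]) {
    await writer.ready; // pending while the stream is applying backpressure
    sizesBeforeWrite.push(writer.desiredSize);
    writer.write(chunk);
  }

  await writer.close();
  return sizesBeforeWrite;
}
```

Since each chunk has size one here and the desired size is positive whenever a write is issued, no write pushes the desired size below zero.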

Concretely, a queuing strategy for web developer–created streams is given by any JavaScript object with a highWaterMark property. For byte streams the highWaterMark always has units of bytes. For other streams the default unit is chunks, but a size() function can be included in the strategy object which returns the size for a given chunk. This permits the highWaterMark to be specified in arbitrary floating-point units.

A simple example of a queuing strategy would be one that assigns a size of one to each chunk, and has a high water mark of three. This would mean that up to three chunks could be enqueued in a readable stream, or three chunks written to a writable stream, before the streams are considered to be applying backpressure.

In JavaScript, such a strategy could be written manually as { highWaterMark: 3, size() { return 1; }}, or using the built-in CountQueuingStrategy class, as new CountQueuingStrategy({ highWaterMark: 3 }).
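
The arithmetic can be observed directly through controller.desiredSize. The sketch below is illustrative only: it enqueues chunks without any consumer attached, and the function names are invented for this example.

```javascript
// Counting strategy: every chunk has size 1, high water mark is 3.
function desiredSizeDemo() {
  const observed = [];
  new ReadableStream(
    {
      start(controller) {
        observed.push(controller.desiredSize); // queue is empty
        for (let i = 0; i < 4; i++) {
          controller.enqueue(i);
          observed.push(controller.desiredSize); // high water mark minus total size
        }
      }
    },
    new CountQueuingStrategy({ highWaterMark: 3 })
  );
  return observed; // [3, 2, 1, 0, -1]
}

// Custom strategy: chunks are strings, sized by their length.
function customSizeDemo() {
  let remaining;
  new ReadableStream(
    {
      start(controller) {
        controller.enqueue("abcd"); // size 4 under the strategy below
        remaining = controller.desiredSize;
      }
    },
    { highWaterMark: 10, size(chunk) { return chunk.length; } }
  );
  return remaining; // 10 - 4 = 6
}
```

Enqueuing past the high water mark is still permitted; the desired size simply becomes zero or negative, which is the signal that backpressure is being applied.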

2.6. Locking

A readable stream reader, or simply reader, is an object that allows direct reading of chunks from a readable stream. Without a reader, a consumer can only perform high-level operations on the readable stream: canceling the stream, or piping the readable stream to a writable stream. A reader is acquired via the stream’s getReader() method.

A readable byte stream has the ability to vend two types of readers: default readers and BYOB readers. BYOB ("bring your own buffer") readers allow reading into a developer-supplied buffer, thus minimizing copies. A non-byte readable stream can only vend default readers. Default readers are instances of the ReadableStreamDefaultReader class, while BYOB readers are instances of ReadableStreamBYOBReader.

Similarly, a writable stream writer, or simply writer, is an object that allows direct writing of chunks to a writable stream. Without a writer, a producer can only perform the high-level operations of aborting the stream or piping a readable stream to the writable stream. Writers are represented by the WritableStreamDefaultWriter class.

Under the covers, these high-level operations actually use a reader or writer themselves.

A given readable or writable stream only has at most one reader or writer at a time. We say in this case the stream is locked, and that the reader or writer is active. This state can be determined using the readableStream.locked or writableStream.locked properties.

A reader or writer also has the capability to release its lock, which makes it no longer active, and allows further readers or writers to be acquired. This is done via the defaultReader.releaseLock(), byobReader.releaseLock(), or writer.releaseLock() method, as appropriate.
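
The locking behavior described in this section can be sketched as follows, using a readable stream with no underlying source. The function name lockingDemo is an assumption for this example.

```javascript
function lockingDemo() {
  const stream = new ReadableStream();
  const states = [stream.locked]; // not locked yet

  const reader = stream.getReader();
  states.push(stream.locked); // locked: the reader is active

  let secondReaderError;
  try {
    stream.getReader(); // a second reader cannot be acquired while locked
  } catch (e) {
    secondReaderError = e;
  }

  reader.releaseLock();
  states.push(stream.locked); // unlocked again; new readers may be acquired

  return { states, threwTypeError: secondReaderError instanceof TypeError };
}
```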

3. Conventions

This specification depends on the Infra Standard. [INFRA]

This specification uses the abstract operation concept from the JavaScript specification for its internal algorithms. This includes treating their return values as completion records, and the use of ! and ? prefixes for unwrapping those completion records. [ECMASCRIPT]

This specification also uses the internal slot concept and notation from the JavaScript specification. (Although, the internal slots are on Web IDL platform objects instead of on JavaScript objects.)

The reasons for the usage of these foreign JavaScript specification conventions are largely historical. We urge you to avoid following our example when writing your own web specifications.

In this specification, all numbers are represented as double-precision 64-bit IEEE 754 floating point values (like the JavaScript Number type or Web IDL unrestricted double type), and all arithmetic operations performed on them must be done in the standard way for such values. This is particularly important for the data structure described in § 8.1 Queue-with-sizes. [IEEE-754]

4. Readable streams

4.1. Using readable streams

The simplest way to consume a readable stream is to pipe it to a writable stream. This ensures that backpressure is respected, and that any errors (whether from reading or writing) are propagated through the chain:

readableStream.pipeTo(writableStream)
  .then(() => console.log("All data successfully written!"))
  .catch(e => console.error("Something went wrong!", e));

If you simply want to be alerted of each new chunk from a readable stream, you can pipe it to a new writable stream that you custom-create for that purpose:

readableStream.pipeTo(new WritableStream({
  write(chunk) {
    console.log("Chunk received", chunk);
  },
  close() {
    console.log("All data successfully read!");
  },
  abort(e) {
    console.error("Something went wrong!", e);
  }
}));

By returning promises from your write() implementation, you can signal backpressure to the readable stream.
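
As a rough illustration, the sketch below pipes a fast hypothetical source into a sink whose write() returns a promise that settles after a short delay, and tracks how far the source gets ahead of the sink. The counters and the name slowSinkDemo are invented for the example.

```javascript
async function slowSinkDemo() {
  let produced = 0;
  let consumed = 0;
  let maxAhead = 0; // largest observed gap between produced and consumed chunks

  const source = new ReadableStream({
    pull(controller) {
      if (produced === 10) {
        controller.close();
        return;
      }
      controller.enqueue(produced++);
    }
  });

  const sink = new WritableStream({
    write(chunk) {
      maxAhead = Math.max(maxAhead, produced - consumed);
      // The returned promise tells the stream when this chunk is done;
      // until then, further chunks are held back.
      return new Promise(resolve =>
        setTimeout(() => {
          consumed++;
          resolve();
        }, 2)
      );
    }
  });

  await source.pipeTo(sink);
  return { produced, consumed, maxAhead };
}
```

With the default queuing strategies, the source is only asked for a few chunks beyond what the sink has finished, rather than being drained all at once.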

Although readable streams will usually be used by piping them to a writable stream, you can also read them directly by acquiring a reader and using its read() method to get successive chunks. For example, this code logs the next chunk in the stream, if available:

const reader = readableStream.getReader();

reader.read().then(
  ({ value, done }) => {
    if (done) {
      console.log("The stream was already closed!");
    } else {
      console.log(value);
    }
  },
  e => console.error("The stream became errored and cannot be read from!", e)
);

This more manual method of reading a stream is mainly useful for library authors building new high-level operations on streams, beyond the provided ones of piping and teeing.
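
For instance, a library might build a small "take the first few chunks" operation on top of a reader. The helper below, takeChunks, is a hypothetical example and not something provided by the standard.

```javascript
// Reads up to `count` chunks, then releases the lock so that
// other consumers can continue from where this one stopped.
async function takeChunks(stream, count) {
  const reader = stream.getReader();
  const taken = [];
  try {
    while (taken.length < count) {
      const { value, done } = await reader.read();
      if (done) break; // the stream ended early
      taken.push(value);
    }
  } finally {
    reader.releaseLock();
  }
  return taken;
}
```

Releasing the lock instead of canceling is a design choice here: it leaves the remaining chunks available to whoever acquires the next reader.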

The above example showed using the readable stream’s default reader. If the stream is a readable byte stream, you can also acquire a BYOB reader for it, which allows more precise control over buffer allocation in order to avoid copies. For example, this code reads the first 1024 bytes from the stream into a single memory buffer:

const reader = readableStream.getReader({ mode: "byob" });

let startingAB = new ArrayBuffer(1024);
const buffer = await readInto(startingAB);
console.log("The first 1024 bytes: ", buffer);

async function readInto(buffer) {
  let offset = 0;

  while (offset < buffer.byteLength) {
    const { value: view, done } =
     await reader.read(new Uint8Array(buffer, offset, buffer.byteLength - offset));
    buffer = view.buffer;
    if (done) {
      break;
    }
    offset += view.byteLength;
  }

  return buffer;
}

An important thing to note here is that the final buffer value is a different object from startingAB, but it (and all intermediate buffers) shares the same backing memory allocation. At each step, the buffer is transferred to a new ArrayBuffer object. The view destructured from each read result is a new Uint8Array, with that ArrayBuffer object as its buffer property, the offset that bytes were written to as its byteOffset property, and the number of bytes that were written as its byteLength property.

Note that this example is mostly educational. For practical purposes, the min option of read() provides an easier and more direct way to read an exact number of bytes:

const reader = readableStream.getReader({ mode: "byob" });
const { value: view, done } = await reader.read(new Uint8Array(1024), { min: 1024 });
console.log("The first 1024 bytes: ", view);

4.2. The ReadableStream class

The ReadableStream class is a concrete instance of the general readable stream concept. It is adaptable to any chunk type, and maintains an internal queue to keep track of data supplied by the underlying source but not yet read by any consumer.

4.2.1. Interface definition

The Web IDL definition for the ReadableStream class is given as follows:

[Exposed=*, Transferable]
interface ReadableStream {
  constructor(optional object underlyingSource, optional QueuingStrategy strategy = {});

  static ReadableStream from(any asyncIterable);

  readonly attribute boolean locked;

  Promise<undefined> cancel(optional any reason);
  ReadableStreamReader getReader(optional ReadableStreamGetReaderOptions options = {});
  ReadableStream pipeThrough(ReadableWritablePair transform, optional StreamPipeOptions options = {});
  Promise<undefined> pipeTo(WritableStream destination, optional StreamPipeOptions options = {});
  sequence<ReadableStream> tee();

  async_iterable<any>(optional ReadableStreamIteratorOptions options = {});
};

typedef (ReadableStreamDefaultReader or ReadableStreamBYOBReader) ReadableStreamReader;

enum ReadableStreamReaderMode { "byob" };

dictionary ReadableStreamGetReaderOptions {
  ReadableStreamReaderMode mode;
};

dictionary ReadableStreamIteratorOptions {
  boolean preventCancel = false;
};

dictionary ReadableWritablePair {
  required ReadableStream readable;
  required WritableStream writable;
};

dictionary StreamPipeOptions {
  boolean preventClose = false;
  boolean preventAbort = false;
  boolean preventCancel = false;
  AbortSignal signal;
};
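
As a usage sketch for the async iterable declaration and the ReadableStreamIteratorOptions dictionary above, the example below iterates a stream with for await, exits early, and then continues reading with a reader. The in-memory stream and the name iterateDemo are assumptions for illustration.

```javascript
async function iterateDemo() {
  const stream = new ReadableStream({
    start(controller) {
      controller.enqueue(1);
      controller.enqueue(2);
      controller.enqueue(3);
      controller.close();
    }
  });

  let first;
  // preventCancel: true keeps the stream open when the loop exits early.
  for await (const chunk of stream.values({ preventCancel: true })) {
    first = chunk;
    break;
  }

  const lockedAfterLoop = stream.locked; // the iterator released its lock
  const { value: next } = await stream.getReader().read();
  return { first, next, lockedAfterLoop };
}
```

Without preventCancel, leaving the loop early would cancel the stream, and the remaining chunks would be discarded.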

4.2.2. Internal slots

Instances of ReadableStream are created with the internal slots described in the following table:

Internal Slot Description (non-normative)
[[controller]] A ReadableStreamDefaultController or ReadableByteStreamController created with the ability to control the state and queue of this stream
[[Detached]] A boolean flag set to true when the stream is transferred
[[disturbed]] A boolean flag set to true when the stream has been read from or canceled
[[reader]] A ReadableStreamDefaultReader or ReadableStreamBYOBReader instance, if the stream is locked to a reader, or undefined if it is not
[[state]] A string containing the stream’s current state, used internally; one of "readable", "closed", or "errored"
[[storedError]] A value indicating how the stream failed, to be given as a failure reason or exception when trying to operate on an errored stream

4.2.3. The underlying source API

The ReadableStream() constructor accepts as its first argument a JavaScript object representing the underlying source. Such objects can contain any of the following properties:

dictionary UnderlyingSource {
  UnderlyingSourceStartCallback start;
  UnderlyingSourcePullCallback pull;
  UnderlyingSourceCancelCallback cancel;
  ReadableStreamType type;
  [EnforceRange] unsigned long long autoAllocateChunkSize;
};

typedef (ReadableStreamDefaultController or ReadableByteStreamController) ReadableStreamController;

callback UnderlyingSourceStartCallback = any (ReadableStreamController controller);
callback UnderlyingSourcePullCallback = Promise<undefined> (ReadableStreamController controller);
callback UnderlyingSourceCancelCallback = Promise<undefined> (optional any reason);

enum ReadableStreamType { "bytes" };

start(controller), of type UnderlyingSourceStartCallback

A function that is called immediately during creation of the ReadableStream.

Typically this is used to adapt a push source by setting up relevant event listeners, as in the example of