1. Introduction
This section is non-normative.
Large swathes of the web platform are built on streaming data: that is, data that is created, processed, and consumed in an incremental fashion, without ever reading all of it into memory. The Streams Standard provides a common set of APIs for creating and interfacing with such streaming data, embodied in readable streams, writable streams, and transform streams.
These APIs have been designed to efficiently map to low-level I/O primitives, including specializations for byte streams where appropriate. They allow easy composition of multiple streams into pipe chains, or can be used directly via readers and writers. Finally, they are designed to automatically provide backpressure and queuing.
This standard provides the base stream primitives which other parts of the web platform can use to expose their streaming data. For example, [FETCH] exposes Response bodies as ReadableStream instances. More generally, the platform is full of streaming abstractions waiting to be expressed as streams: multimedia streams, file streams, inter-global communication, and more benefit from being able to process data incrementally instead of buffering it all into memory and processing it in one go. By providing the foundation for these streams to be exposed to developers, the Streams Standard enables use cases like:
- Video effects: piping a readable video stream through a transform stream that applies effects in real time.
- Decompression: piping a file stream through a transform stream that selectively decompresses files from a .tgz archive, turning them into img elements as the user scrolls through an image gallery.
- Image decoding: piping an HTTP response stream through a transform stream that decodes bytes into bitmap data, and then through another transform that translates bitmaps into PNGs. If installed inside the fetch hook of a service worker, this would allow developers to transparently polyfill new image formats. [SERVICE-WORKERS]
Web developers can also use the APIs described here to create their own streams, with the same APIs as those provided by the platform. Other developers can then transparently compose platform-provided streams with those supplied by libraries. In this way, the APIs described here provide a unifying abstraction for all streams, encouraging an ecosystem to grow around these shared and composable interfaces.
2. Model
A chunk is a single piece of data that is written to or read from a stream. It can be of any type; streams can even contain chunks of different types. A chunk will often not be the most atomic unit of data for a given stream; for example a byte stream might contain chunks consisting of 16 KiB Uint8Arrays, instead of single bytes.
2.1. Readable streams
A readable stream represents a source of data, from which you can read. In other words, data comes out of a readable stream. Concretely, a readable stream is an instance of the ReadableStream class.
Although a readable stream can be created with arbitrary behavior, most readable streams wrap a lower-level I/O source, called the underlying source. There are two types of underlying source: push sources and pull sources.
Push sources push data at you, whether or not you are listening for it. They may also provide a mechanism for pausing and resuming the flow of data. An example push source is a TCP socket, where data is constantly being pushed from the OS level, at a rate that can be controlled by changing the TCP window size.
Pull sources require you to request data from them. The data may be available synchronously, e.g. if it is held by the operating system’s in-memory buffers, or asynchronously, e.g. if it has to be read from disk. An example pull source is a file handle, where you seek to specific locations and read specific amounts.
Readable streams are designed to wrap both types of sources behind a single, unified interface. For web developer–created streams, the implementation details of a source are provided by an object with certain methods and properties that is passed to the ReadableStream() constructor.
Chunks are enqueued into the stream by the stream’s underlying source. They can then be read one at a time via the stream’s public interface, in particular by using a readable stream reader acquired using the stream’s getReader() method.
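The relationship between an underlying source, the stream's queue, and a reader can be sketched as follows. This is a minimal illustration, not part of the standard: makeCounterStream and readAll are hypothetical helper names, and the source is a toy pull source that produces the numbers 1 to 3 and then closes the stream.

```javascript
// Hypothetical pull-style underlying source: enqueues 1..limit, then closes.
function makeCounterStream(limit) {
  let next = 1;
  return new ReadableStream({
    // pull() is called whenever the stream wants more chunks in its queue.
    pull(controller) {
      if (next <= limit) {
        controller.enqueue(next++);
      } else {
        controller.close();
      }
    }
  });
}

// Reads chunks one at a time through a reader until the stream is closed.
async function readAll(stream) {
  const reader = stream.getReader();
  const chunks = [];
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    chunks.push(value);
  }
  return chunks;
}

// A promise that fulfills with [1, 2, 3].
const counterChunks = readAll(makeCounterStream(3));
```

A push source would typically be adapted in start() instead, by registering listeners that call controller.enqueue() as data arrives.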
Code that reads from a readable stream using its public interface is known as a consumer.
Consumers also have the ability to cancel a readable stream, using its cancel() method. This indicates that the consumer has lost interest in the stream, and will immediately close the stream, throw away any queued chunks, and execute any cancellation mechanism of the underlying source.
Consumers can also tee a readable stream using its tee() method. This will lock the stream, making it no longer directly usable; however, it will create two new streams, called branches, which can be consumed independently.
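Teeing can be illustrated with a short sketch. The names teeDemo and collect are hypothetical, and the two-chunk source exists only for the demonstration; the point is that the original stream becomes locked while each branch independently delivers every chunk.

```javascript
async function teeDemo() {
  const source = new ReadableStream({
    start(controller) {
      controller.enqueue("a");
      controller.enqueue("b");
      controller.close();
    }
  });

  const [branch1, branch2] = source.tee();
  const lockedAfterTee = source.locked; // true: the original is no longer directly usable

  // Hypothetical helper that drains a stream into an array.
  async function collect(stream) {
    const chunks = [];
    const reader = stream.getReader();
    for (;;) {
      const { value, done } = await reader.read();
      if (done) return chunks;
      chunks.push(value);
    }
  }

  const [first, second] = await Promise.all([collect(branch1), collect(branch2)]);
  return { lockedAfterTee, first, second };
}
```

Both first and second end up containing "a" and "b", since each branch sees the full sequence of chunks.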
For streams representing bytes, an extended version of the readable stream is provided to handle bytes efficiently, in particular by minimizing copies. The underlying source for such a readable stream is called an underlying byte source. A readable stream whose underlying source is an underlying byte source is sometimes called a readable byte stream. Consumers of a readable byte stream can acquire a BYOB reader using the stream’s getReader() method.
2.2. Writable streams
A writable stream represents a destination for data, into which you can write. In other words, data goes in to a writable stream. Concretely, a writable stream is an instance of the WritableStream class.
Analogously to readable streams, most writable streams wrap a lower-level I/O sink, called the underlying sink. Writable streams work to abstract away some of the complexity of the underlying sink, by queuing subsequent writes and only delivering them to the underlying sink one by one.
Chunks are written to the stream via its public interface, and are passed one at a time to the stream’s underlying sink. For web developer–created streams, the implementation details of the sink are provided by an object with certain methods that is passed to the WritableStream() constructor.
Code that writes into a writable stream using its public interface is known as a producer.
Producers also have the ability to abort a writable stream, using its abort() method. This indicates that the producer believes something has gone wrong, and that future writes should be discontinued. It puts the stream in an errored state, even without a signal from the underlying sink, and it discards all writes in the stream’s internal queue.
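The producer side, including abort(), might look like the following sketch. The name abortDemo and the in-memory sink that records chunks in an array are assumptions made for illustration.

```javascript
async function abortDemo() {
  const delivered = [];
  let abortReason;

  // Underlying sink: receives chunks one at a time, and is told about aborts.
  const writable = new WritableStream({
    write(chunk) {
      delivered.push(chunk);
    },
    abort(reason) {
      abortReason = reason;
    }
  });

  const writer = writable.getWriter();
  await writer.write("first");
  await writer.abort(new Error("producer gave up"));

  // The stream is now errored, so further writes are rejected.
  let laterWriteRejected = false;
  try {
    await writer.write("second");
  } catch {
    laterWriteRejected = true;
  }

  return { delivered, abortMessage: abortReason.message, laterWriteRejected };
}
```

Only "first" reaches the sink; the write attempted after the abort is rejected without being delivered.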
2.3. Transform streams
A transform stream consists of a pair of streams: a writable stream, known as its writable side, and a readable stream, known as its readable side. In a manner specific to the transform stream in question, writes to the writable side result in new data being made available for reading from the readable side.
Concretely, any object with a writable property and a readable property can serve as a transform stream. However, the standard TransformStream class makes it much easier to create such a pair that is properly entangled. It wraps a transformer, which defines algorithms for the specific transformation to be performed. For web developer–created streams, the implementation details of a transformer are provided by an object with certain methods and properties that is passed to the TransformStream() constructor. Other specifications might use the GenericTransformStream mixin to create classes with the same writable/readable property pair but other custom APIs layered on top.
An identity transform stream is a type of transform stream which forwards all chunks written to its writable side to its readable side, without any changes. This can be useful in a variety of scenarios. By default, the TransformStream constructor will create an identity transform stream, when no transform() method is present on the transformer object.
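A minimal sketch of an identity transform stream follows; identityDemo is a hypothetical name. Note that the write is not awaited before reading: with the default queuing strategies, the write may not complete until a consumer asks for the chunk on the readable side.

```javascript
async function identityDemo() {
  // No transformer is supplied, so chunks pass through unchanged.
  const { readable, writable } = new TransformStream();
  const writer = writable.getWriter();
  const reader = readable.getReader();

  const writeDone = writer.write("hello"); // start the write, but do not block on it
  const { value } = await reader.read();   // the same chunk appears on the readable side
  await writeDone;
  return value;
}
```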
Some examples of potential transform streams include:
- A GZIP compressor, to which uncompressed bytes are written and from which compressed bytes are read;
- A video decoder, to which encoded bytes are written and from which uncompressed video frames are read;
- A text decoder, to which bytes are written and from which strings are read;
- A CSV-to-JSON converter, to which strings representing lines of a CSV file are written and from which corresponding JavaScript objects are read.
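As a rough sketch of the last example, a transformer with a transform() method can turn CSV lines into objects. The helper name csvLinesToObjects, the column names, and the sample data are all hypothetical, and the naive split(",") deliberately ignores quoting and escaping.

```javascript
// Hypothetical helper: maps each CSV line to an object keyed by column name.
function csvLinesToObjects(columns) {
  return new TransformStream({
    transform(line, controller) {
      const fields = line.split(","); // naive: no support for quoted fields
      const record = {};
      columns.forEach((name, i) => {
        record[name] = fields[i];
      });
      controller.enqueue(record);
    }
  });
}

async function csvDemo() {
  const lines = new ReadableStream({
    start(controller) {
      controller.enqueue("ada,36");
      controller.enqueue("grace,45");
      controller.close();
    }
  });

  const records = [];
  const reader = lines.pipeThrough(csvLinesToObjects(["name", "age"])).getReader();
  for (;;) {
    const { value, done } = await reader.read();
    if (done) return records;
    records.push(value);
  }
}
```

Strings are written to the writable side and objects are read from the readable side, matching the shape described in the list above.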
2.4. Pipe chains and backpressure
Streams are primarily used by piping them to each other. A readable stream can be piped directly to a writable stream, using its pipeTo() method, or it can be piped through one or more transform streams first, using its pipeThrough() method.
A set of streams piped together in this way is referred to as a pipe chain. In a pipe chain, the original source is the underlying source of the first readable stream in the chain; the ultimate sink is the underlying sink of the final writable stream in the chain.
Once a pipe chain is constructed, it will propagate signals regarding how fast chunks should flow through it. If any step in the chain cannot yet accept chunks, it propagates a signal backwards through the pipe chain, until eventually the original source is told to stop producing chunks so fast. This process of normalizing flow from the original source according to how fast the chain can process chunks is called backpressure.
Concretely, the original source is given the controller.desiredSize (or byteController.desiredSize) value, and can then adjust its rate of data flow accordingly. This value is derived from the writer.desiredSize corresponding to the ultimate sink, which gets updated as the ultimate sink finishes writing chunks. The pipeTo() method used to construct the chain automatically ensures this information propagates back through the pipe chain.
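The writer.desiredSize half of this signal can be observed directly. In the sketch below, the 10 ms delay, the high water mark of 2, and the names delay and writerDesiredSizeDemo are arbitrary choices for illustration.

```javascript
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

async function writerDesiredSizeDemo() {
  // A deliberately slow sink: each write takes about 10 ms to be acknowledged.
  const slowSink = new WritableStream(
    { write() { return delay(10); } },
    { highWaterMark: 2 }
  );
  const writer = slowSink.getWriter();

  const initial = writer.desiredSize;  // 2: the queue is empty
  writer.write("a");
  writer.write("b");
  const whenFull = writer.desiredSize; // 0: the queue has reached the high water mark

  // ready is pending while backpressure applies; it fulfills once the sink
  // has finished a write and there is room in the queue again.
  await writer.ready;
  const afterOneWrite = writer.desiredSize;

  await writer.close();
  return { initial, whenFull, afterOneWrite };
}
```

A pipe performs the equivalent of waiting on writer.ready before reading more from the source, which is how the signal travels backwards through the chain.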
When teeing a readable stream, the backpressure signals from its two branches will aggregate, such that if neither branch is read from, a backpressure signal will be sent to the underlying source of the original stream.
Piping locks the readable and writable streams, preventing them from being manipulated for the duration of the pipe operation. This allows the implementation to perform important optimizations, such as directly shuttling data from the underlying source to the underlying sink while bypassing many of the intermediate queues.
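A small pipe chain, and the locking it implies, can be sketched as follows. The function name pipeChainDemo, the upper-casing transform, and the array-backed sink are assumptions for the example.

```javascript
async function pipeChainDemo() {
  const received = [];

  const source = new ReadableStream({
    start(controller) {
      ["a", "b", "c"].forEach(chunk => controller.enqueue(chunk));
      controller.close();
    }
  });

  const upperCase = new TransformStream({
    transform(chunk, controller) {
      controller.enqueue(chunk.toUpperCase());
    }
  });

  const sink = new WritableStream({
    write(chunk) {
      received.push(chunk);
    }
  });

  // source -> upperCase -> sink
  const finished = source.pipeThrough(upperCase).pipeTo(sink);
  const lockedWhilePiping = source.locked && sink.locked; // both ends are locked

  await finished;
  return { received, lockedWhilePiping, lockedAfterwards: sink.locked };
}
```

Once the pipe completes, the lock on the destination is released, so lockedAfterwards is false.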
2.5. Internal queues and queuing strategies
Both readable and writable streams maintain internal queues, which they use for similar purposes. In the case of a readable stream, the internal queue contains chunks that have been enqueued by the underlying source, but not yet read by the consumer. In the case of a writable stream, the internal queue contains chunks which have been written to the stream by the producer, but not yet processed and acknowledged by the underlying sink.
A queuing strategy is an object that determines how a stream should signal backpressure based on the state of its internal queue. The queuing strategy assigns a size to each chunk, and compares the total size of all chunks in the queue to a specified number, known as the high water mark. The resulting difference, high water mark minus total size, is used to determine the desired size to fill the stream’s queue.
For readable streams, an underlying source can use this desired size as a backpressure signal, slowing down chunk generation so as to try to keep the desired size above or at zero. For writable streams, a producer can behave similarly, avoiding writes that would cause the desired size to go negative.
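The arithmetic can be observed from inside an underlying source. The sketch below assumes a high water mark of 3 and the default chunk size of 1; desiredSizeDemo is a hypothetical name.

```javascript
function desiredSizeDemo() {
  const observed = [];
  new ReadableStream(
    {
      start(controller) {
        observed.push(controller.desiredSize); // 3 - 0 = 3
        controller.enqueue("a");
        observed.push(controller.desiredSize); // 3 - 1 = 2
        controller.enqueue("b");
        controller.enqueue("c");
        observed.push(controller.desiredSize); // 3 - 3 = 0
        controller.enqueue("d");
        observed.push(controller.desiredSize); // 3 - 4 = -1
      }
    },
    { highWaterMark: 3 }
  );
  return observed; // [3, 2, 0, -1]
}
```

Enqueuing past the high water mark is not an error; the negative desired size is simply a signal that a well-behaved source should pause.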
Concretely, a queuing strategy for web developer–created streams is given by any JavaScript object with a highWaterMark property. For byte streams the highWaterMark always has units of bytes. For other streams the default unit is chunks, but a size() function can be included in the strategy object which returns the size for a given chunk. This permits the highWaterMark to be specified in arbitrary floating-point units.
In JavaScript, a chunk-counting strategy could be written manually as a plain object with highWaterMark and size() members, or by using the built-in CountQueuingStrategy class.
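The following sketch compares a hand-written strategy object with the built-in CountQueuingStrategy, and adds a strategy whose size() measures string length. The high water marks of 3 and 10, and the names manualStrategy, builtInStrategy, lengthStrategy, and desiredSizeAfter, are illustrative assumptions rather than values taken from the standard.

```javascript
// Hand-written strategy: every chunk has size 1.
const manualStrategy = { highWaterMark: 3, size() { return 1; } };

// Built-in equivalent.
const builtInStrategy = new CountQueuingStrategy({ highWaterMark: 3 });

// A custom size() changes the unit of the high water mark, here to string length.
const lengthStrategy = { highWaterMark: 10, size(chunk) { return chunk.length; } };

// Hypothetical helper: enqueue the chunks and report the resulting desired size.
function desiredSizeAfter(strategy, chunks) {
  let remaining;
  new ReadableStream(
    {
      start(controller) {
        chunks.forEach(chunk => controller.enqueue(chunk));
        remaining = controller.desiredSize;
      }
    },
    strategy
  );
  return remaining;
}

const strategyResults = [
  desiredSizeAfter(manualStrategy, ["a", "b"]),      // 3 - 2 = 1
  desiredSizeAfter(builtInStrategy, ["a", "b"]),     // 3 - 2 = 1
  desiredSizeAfter(lengthStrategy, ["hello", "hi"])  // 10 - (5 + 2) = 3
];
```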
2.6. Locking
A readable stream reader, or simply reader, is an object that allows direct reading of chunks from a readable stream. Without a reader, a consumer can only perform high-level operations on the readable stream: canceling the stream, or piping the readable stream to a writable stream. A reader is acquired via the stream’s getReader() method.
A readable byte stream has the ability to vend two types of readers: default readers and BYOB readers. BYOB ("bring your own buffer") readers allow reading into a developer-supplied buffer, thus minimizing copies. A non-byte readable stream can only vend default readers. Default readers are instances of the ReadableStreamDefaultReader class, while BYOB readers are instances of ReadableStreamBYOBReader.
Similarly, a writable stream writer, or simply writer, is an object that allows direct writing of chunks to a writable stream. Without a writer, a producer can only perform the high-level operations of aborting the stream or piping a readable stream to the writable stream. Writers are represented by the WritableStreamDefaultWriter class.
Under the covers, these high-level operations actually use a reader or writer themselves.
A given readable or writable stream only has at most one reader or writer at a time. We say in this case the stream is locked, and that the reader or writer is active. This state can be determined using the readableStream.locked or writableStream.locked properties.
A reader or writer also has the capability to release its lock, which makes it no longer active, and allows further readers or writers to be acquired. This is done via the defaultReader.releaseLock(), byobReader.releaseLock(), or writer.releaseLock() method, as appropriate.
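The life cycle of a lock can be sketched with a readable stream; the same pattern applies to writers. The name lockingDemo is hypothetical.

```javascript
function lockingDemo() {
  const stream = new ReadableStream();
  const states = [stream.locked];   // false: no reader yet

  const reader = stream.getReader();
  states.push(stream.locked);       // true: the reader is active

  // A second reader cannot be acquired while the stream is locked.
  let secondReaderFailed = false;
  try {
    stream.getReader();
  } catch (e) {
    secondReaderFailed = e instanceof TypeError;
  }

  reader.releaseLock();
  states.push(stream.locked);       // false: the lock has been released

  stream.getReader();               // acquiring a new reader now succeeds
  states.push(stream.locked);       // true again

  return { states, secondReaderFailed };
}
```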
3. Conventions
This specification depends on the Infra Standard. [INFRA]
This specification uses the abstract operation concept from the JavaScript specification for its internal algorithms. This includes treating their return values as completion records, and the use of ! and ? prefixes for unwrapping those completion records. [ECMASCRIPT]
This specification also uses the internal slot concept and notation from the JavaScript specification. (Although, the internal slots are on Web IDL platform objects instead of on JavaScript objects.)
The reasons for the usage of these foreign JavaScript specification conventions are largely historical. We urge you to avoid following our example when writing your own web specifications.
In this specification, all numbers are represented as double-precision 64-bit IEEE 754 floating point values (like the JavaScript Number type or Web IDL unrestricted double type), and all arithmetic operations performed on them must be done in the standard way for such values. This is particularly important for the data structure described in § 8.1 Queue-with-sizes. [IEEE-754]
4. Readable streams
4.1. Using readable streams

    // Pipe to an existing writable stream and report the outcome.
    readableStream.pipeTo(writableStream)
      .then(() => console.log("All data successfully written!"))
      .catch(e => console.error("Something went wrong!", e));

    // Pipe to a writable stream created on the spot, to observe each chunk.
    readableStream.pipeTo(new WritableStream({
      write(chunk) {
        console.log("Chunk received", chunk);
      },
      close() {
        console.log("All data successfully read!");
      },
      abort(e) {
        console.error("Something went wrong!", e);
      }
    }));

By returning promises from your write() implementation, you can signal backpressure to the readable stream.
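A sketch of this technique: the sink below returns a promise from write() (via an async method), so the next chunk is not delivered until the previous one has been acknowledged. The 5 ms delay, the three-chunk source, and the names delay and slowSinkDemo are assumptions for illustration.

```javascript
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

async function slowSinkDemo() {
  const events = [];
  let next = 1;

  const source = new ReadableStream({
    pull(controller) {
      if (next > 3) {
        controller.close();
        return;
      }
      controller.enqueue(next++);
    }
  });

  const slowSink = new WritableStream({
    // The returned promise tells the stream when this chunk has been handled.
    async write(chunk) {
      events.push(`start ${chunk}`);
      await delay(5);
      events.push(`end ${chunk}`);
    }
  });

  await source.pipeTo(slowSink);
  return events;
}
```

The recorded events never interleave: each "start" is followed by its matching "end" before the next chunk begins, and the source is pulled no faster than the sink drains.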
You can also read from a readable stream directly, by acquiring a reader and using its read() method to get successive chunks. For example, this code logs the next chunk in the stream, if available:

    const reader = readableStream.getReader();

    reader.read().then(
      ({ value, done }) => {
        if (done) {
          console.log("The stream was already closed!");
        } else {
          console.log(value);
        }
      },
      e => console.error("The stream became errored and cannot be read from!", e)
    );

This more manual method of reading a stream is mainly useful for library authors building new high-level operations on streams, beyond the provided ones of piping and teeing.

    const reader = readableStream.getReader({ mode: "byob" });

    let startingAB = new ArrayBuffer(1024);
    const buffer = await readInto(startingAB);
    console.log("The first 1024 bytes: ", buffer);

    async function readInto(buffer) {
      let offset = 0;

      while (offset < buffer.byteLength) {
        const { value: view, done } =
          await reader.read(new Uint8Array(buffer, offset, buffer.byteLength - offset));
        buffer = view.buffer;
        if (done) {
          break;
        }
        offset += view.byteLength;
      }

      return buffer;
    }

An important thing to note here is that the final buffer value is different from the startingAB, but it (and all intermediate buffers) shares the same backing memory allocation. At each step, the buffer is transferred to a new ArrayBuffer object. The view is destructured from the return value of reading a new Uint8Array, with that ArrayBuffer object as its buffer property, the offset that bytes were written to as its byteOffset property, and the number of bytes that were written as its byteLength property.
Note that this example is mostly educational. For practical purposes, the min option of read() provides an easier and more direct way to read an exact number of bytes:

    const reader = readableStream.getReader({ mode: "byob" });
    const { value: view, done } = await reader.read(new Uint8Array(1024), { min: 1024 });
    console.log("The first 1024 bytes: ", view);

4.2. The ReadableStream class
The ReadableStream class is a concrete instance of the general readable stream concept. It is adaptable to any chunk type, and maintains an internal queue to keep track of data supplied by the underlying source but not yet read by any consumer.
4.2.1. Interface definition
The Web IDL definition for the ReadableStream class is given as follows:

    [Exposed=*, Transferable]
    interface ReadableStream {
      constructor(optional object underlyingSource, optional QueuingStrategy strategy = {});

      static ReadableStream from(any asyncIterable);

      readonly attribute boolean locked;

      Promise<undefined> cancel(optional any reason);
      ReadableStreamReader getReader(optional ReadableStreamGetReaderOptions options = {});
      ReadableStream pipeThrough(ReadableWritablePair transform, optional StreamPipeOptions options = {});
      Promise<undefined> pipeTo(WritableStream destination, optional StreamPipeOptions options = {});
      sequence<ReadableStream> tee();

      async_iterable<any>(optional ReadableStreamIteratorOptions options = {});
    };

    typedef (ReadableStreamDefaultReader or ReadableStreamBYOBReader) ReadableStreamReader;

    enum ReadableStreamReaderMode { "byob" };

    dictionary ReadableStreamGetReaderOptions {
      ReadableStreamReaderMode mode;
    };

    dictionary ReadableStreamIteratorOptions {
      boolean preventCancel = false;
    };

    dictionary ReadableWritablePair {
      required ReadableStream readable;
      required WritableStream writable;
    };

    dictionary StreamPipeOptions {
      boolean preventClose = false;
      boolean preventAbort = false;
      boolean preventCancel = false;
      AbortSignal signal;
    };

4.2.2. Internal slots
Instances of ReadableStream are created with the internal slots described in the following table:
Internal Slot | Description (non-normative) |
---|---|
[[controller]] | A ReadableStreamDefaultController or ReadableByteStreamController created with the ability to control the state and queue of this stream |
[[Detached]] | A boolean flag set to true when the stream is transferred |
[[disturbed]] | A boolean flag set to true when the stream has been read from or canceled |
[[reader]] | A ReadableStreamDefaultReader or ReadableStreamBYOBReader instance, if the stream is locked to a reader, or undefined if it is not |
[[state]] | A string containing the stream’s current state, used internally; one of "readable", "closed", or "errored" |
[[storedError]] | A value indicating how the stream failed, to be given as a failure reason or exception when trying to operate on an errored stream |
4.2.3. The underlying source API
The ReadableStream() constructor accepts as its first argument a JavaScript object representing the underlying source. Such objects can contain any of the following properties:

    dictionary UnderlyingSource {
      UnderlyingSourceStartCallback start;
      UnderlyingSourcePullCallback pull;
      UnderlyingSourceCancelCallback cancel;
      ReadableStreamType type;
      [EnforceRange] unsigned long long autoAllocateChunkSize;
    };

    typedef (ReadableStreamDefaultController or ReadableByteStreamController) ReadableStreamController;

    callback UnderlyingSourceStartCallback = any (ReadableStreamController controller);
    callback UnderlyingSourcePullCallback = Promise<undefined> (ReadableStreamController controller);
    callback UnderlyingSourceCancelCallback = Promise<undefined> (optional any reason);

    enum ReadableStreamType { "bytes" };

start(controller), of type UnderlyingSourceStartCallback
A function that is called immediately during creation of the ReadableStream.
Typically this is used to adapt a push source by setting up relevant event listeners, as in the example of