io — Core tools for working with streams¶
Source code: Lib/io.py
Overview¶
The io module provides Python’s main facilities for dealing with various
types of I/O. There are three main types of I/O: text I/O, binary I/O
and raw I/O. These are generic categories, and various backing stores can
be used for each of them. A concrete object belonging to any of these
categories is called a file object. Other common terms are stream
and file-like object.
Independent of its category, each concrete stream object will also have various capabilities: it can be read-only, write-only, or read-write. It can also allow arbitrary random access (seeking forwards or backwards to any location), or only sequential access (for example in the case of a socket or pipe).
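As a minimal sketch of how these capabilities can be queried, the snippet below uses an in-memory BytesIO stream (chosen only for illustration) and the readable(), writable() and seekable() methods that stream objects provide:

```python
import io

# An in-memory binary stream is readable, writable, and seekable.
buf = io.BytesIO(b"abcdef")
caps = (buf.readable(), buf.writable(), buf.seekable())

# Random access: jump to offset 3, then read the remainder.
buf.seek(3)
tail = buf.read()
```

A stream backed by a pipe or socket would typically report seekable() as false, and a seek() call on it would fail.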
All streams are careful about the type of data you give to them. For example
giving a str object to the write() method of a binary stream
will raise a TypeError. So will giving a bytes object to the
write() method of a text stream.
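The following sketch illustrates this type checking with in-memory streams; the variable names are illustrative only:

```python
import io

binary_stream = io.BytesIO()
text_stream = io.StringIO()

# Each stream rejects the other category's data type.
errors = []
for stream, wrong_data in ((binary_stream, "text"), (text_stream, b"bytes")):
    try:
        stream.write(wrong_data)
    except TypeError as exc:
        errors.append(type(exc).__name__)

# The matching types are accepted.
binary_stream.write(b"bytes")
text_stream.write("text")
```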
Changed in version 3.3: Operations that used to raise IOError now raise OSError, since
IOError is now an alias of OSError.
Text I/O¶
Text I/O expects and produces str objects. This means that whenever
the backing store is natively made of bytes (such as in the case of a file),
encoding and decoding of data is performed transparently, as is the optional
translation of platform-specific newline characters.
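To make the decoding and newline translation visible, here is a small sketch that wraps a BytesIO object (standing in for a file on disk) in a TextIOWrapper. The sample bytes are made up for the example:

```python
import io

# UTF-8 encoded bytes with a Windows-style line ending on the first line.
raw_bytes = "café\r\nsecond line\n".encode("utf-8")

# With the default newline=None, line endings are translated to "\n" on read.
text_stream = io.TextIOWrapper(io.BytesIO(raw_bytes), encoding="utf-8")
text = text_stream.read()
```

The result is a str in which the bytes have been decoded and the "\r\n" sequence has become "\n".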
The easiest way to create a text stream is with open(), optionally
specifying an encoding:
f = open("myfile.txt", "r", encoding="utf-8")
In-memory text streams are also available as StringIO objects:
f = io.StringIO("some initial text data")
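A short sketch of typical StringIO usage follows: it reads from the initial contents, appends more text, and retrieves the whole buffer with getvalue():

```python
import io

f = io.StringIO("some initial text data")

# Reading starts at position 0.
first_word = f.read(4)

# Move to the end before writing so the initial text is not overwritten.
f.seek(0, io.SEEK_END)
f.write("\nmore text")

contents = f.getvalue()
```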
Note
When working with a non-blocking stream, be aware that read operations on text I/O objects
might raise a BlockingIOError if the stream cannot perform the operation
immediately.
The text stream API is described in detail in the documentation of
TextIOBase.
Binary I/O¶
Binary I/O (also called buffered I/O) expects
bytes-like objects and produces bytes
objects. No encoding, decoding, or newline translation is performed. This
category of streams can be used for all kinds of non-text data, and also when
manual control over the handling of text data is desired.
The easiest way to create a binary stream is with open() with 'b' in
the mode string:
f = open("myfile.jpg", "rb")
In-memory binary streams are also available as BytesIO objects:
f = io.BytesIO(b"some initial binary data: \x00\x01")
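As an illustrative sketch, a BytesIO object can be read, sought, and written like a binary file, and write() accepts any bytes-like object (a bytearray here):

```python
import io

f = io.BytesIO(b"some initial binary data: \x00\x01")

# Seek relative to the end and read the last two bytes.
f.seek(-2, io.SEEK_END)
last_two = f.read()

# The position is now at the end, so this write appends.
f.write(bytearray(b"\x02"))

contents = f.getvalue()
```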
The binary stream API is described in detail in the docs of
BufferedIOBase.
Other library modules may provide additional ways to create text or binary
streams. See socket.socket.makefile() for example.
Raw I/O¶
Raw I/O (also called unbuffered I/O) is generally used as a low-level building-block for binary and text streams; it is rarely useful to directly manipulate a raw stream from user code. Nevertheless, you can create a raw stream by opening a file in binary mode with buffering disabled:
f = open("myfile.jpg", "rb", buffering=0)
The raw stream API is described in detail in the docs of RawIOBase.
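The sketch below opens the same temporary file three ways to show which kind of object each call to open() returns. The file name is hypothetical, and the class names reflect CPython's standard implementation:

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.bin")
    with open(path, "wb") as f:
        f.write(b"data")

    with open(path, "r", encoding="utf-8") as text_f:
        text_kind = type(text_f).__name__
    with open(path, "rb") as binary_f:
        binary_kind = type(binary_f).__name__
        # The buffered reader wraps an underlying raw stream.
        wraps_raw = isinstance(binary_f.raw, io.RawIOBase)
    with open(path, "rb", buffering=0) as raw_f:
        raw_kind = type(raw_f).__name__
        is_raw = isinstance(raw_f, io.RawIOBase)
```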
Text Encoding¶
The default encoding of TextIOWrapper and open() is
locale-specific (locale.getencoding()).
However, many developers forget to specify the encoding when opening text files encoded in UTF-8 (e.g. JSON, TOML, Markdown, etc.) since most Unix platforms use a UTF-8 locale by default. This causes bugs because the locale encoding is not UTF-8 for most Windows users. For example:
# May not work on Windows when the file contains non-ASCII characters.
with open("README.md") as f:
    long_description = f.read()
Accordingly, it is highly recommended that you specify the encoding
explicitly when opening text files. If you want to use UTF-8, pass
encoding="utf-8". To use the current locale encoding,
encoding="locale" is supported since Python 3.10.
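A minimal sketch of the recommended pattern, using a temporary file whose name and contents are made up for the example:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "README.md")

    # Write and read with an explicit encoding, independent of the locale.
    with open(path, "w", encoding="utf-8") as f:
        f.write("naïve café\n")
    with open(path, encoding="utf-8") as f:
        long_description = f.read()

    # The bytes on disk are UTF-8 regardless of the platform default.
    with open(path, "rb") as f:
        raw = f.read()
```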
See also
- Python UTF-8 Mode
Python UTF-8 Mode can be used to change the default encoding to UTF-8 from locale-specific encoding.
- PEP 686
Python 3.15 will make Python UTF-8 Mode the default.
Opt-in EncodingWarning¶
Added in version 3.10: See PEP 597 for more details.
To find where the default locale encoding is used, you can enable
the -X warn_default_encoding command line option or set the
PYTHONWARNDEFAULTENCODING environment variable, which will
emit an EncodingWarning when the default encoding is used.
If you are providing an API that uses open() or
TextIOWrapper and passes encoding=None as a parameter, you
can use text_encoding() so that callers of the API will emit an
EncodingWarning if they don’t pass an encoding. However,
please consider using UTF-8 by default (i.e. encoding="utf-8") for
new APIs.
High-level Module Interface¶
- io.DEFAULT_BUFFER_SIZE¶
An int containing the default buffer size used by the module’s buffered I/O classes.
open() uses the file’s blksize (as obtained by os.stat()) if possible.
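As a brief sketch, the constant can be inspected directly, and a buffered class can be given a different size through its buffer_size argument. A BytesIO object stands in for a raw stream here purely for illustration:

```python
import io

default_size = io.DEFAULT_BUFFER_SIZE

# Override the default with an explicit (deliberately small) buffer size.
reader = io.BufferedReader(io.BytesIO(b"x" * 100), buffer_size=16)
chunk = reader.read(10)
```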
- io.open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)¶
This is an alias for the builtin open() function.
This function raises an auditing event open with arguments path, mode and flags. The mode and flags arguments may have been modified or inferred from the original call.
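The sketch below checks the alias and records open auditing events with sys.addaudithook(). The hook function, the state dictionary and the file name are hypothetical names introduced for the example. Audit hooks cannot be removed once added, so a flag is used to stop recording:

```python
import io
import os
import sys
import tempfile

same_function = io.open is open

seen = []
state = {"recording": True}

def audit_hook(event, args):
    # Record only "open" events while recording is switched on.
    if state["recording"] and event == "open":
        seen.append(args)

sys.addaudithook(audit_hook)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with io.open(path, "w", encoding="utf-8") as f:
        f.write("hello")

state["recording"] = False

# The first element of each recorded tuple is the path argument.
opened_paths = [args[0] for args in seen]
```

Other code that runs while the hook is recording (including tempfile itself) may also raise open events, so the recorded list can contain more than one entry.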
- io.open_code(path)¶
Opens the provided file with mode 'rb'. This function should be used when the intent is to treat the contents as executable code.
path should be a str and an absolute path.
The behavior of this function may be overridden by an earlier call to PyFile_SetOpenCodeHook(). However, assuming that path is a str and an absolute path, open_code(path) should always behave the same as open(path, 'rb'). Overriding the behavior is intended for additional validation or preprocessing of the file.
Added in version 3.8.
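A minimal sketch of the intended use, assuming no open-code hook has been installed: a small source file (with a made-up name and contents) is written to a temporary directory, read back through open_code() using an absolute path, and then compiled and executed.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # open_code() expects a str that is an absolute path.
    path = os.path.abspath(os.path.join(tmp, "snippet.py"))
    with open(path, "w", encoding="utf-8") as f:
        f.write("answer = 6 * 7\n")

    with io.open_code(path) as code_file:
        source = code_file.read()  # bytes, since the file is opened in 'rb' mode

namespace = {}
exec(compile(source, path, "exec"), namespace)
```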
- io.text_encoding(encoding, stacklevel=2, /)¶
This is a helper function for callables that use open() or TextIOWrapper and have an encoding=None parameter.
This function returns encoding if it is not None. Otherwise, it returns "locale" or "utf-8" depending on UTF-8 Mode.
This function emits an EncodingWarning if sys.flags.warn_default_encoding is true and encoding is None. stacklevel specifies where the warning is emitted. For example:
def read_text(path, encoding=None):
    encoding = io.text_encoding(encoding)  # stacklevel=2
    with open(path, encoding=encoding) as f:
        return f.read()
In this example, an EncodingWarning is emitted for the caller of read_text().
See Text Encoding for more information.
Added in version 3.10.
Changed in version 3.11: text_encoding() returns "utf-8" when UTF-8 mode is enabled and encoding is None.
- exception io.BlockingIOError¶
This is a compatibility alias for the builtin BlockingIOError exception.