io — Core tools for working with streams
Source code: Lib/io.py
Overview
The io module provides Python’s main facilities for dealing with various
types of I/O. There are three main types of I/O: text I/O, binary I/O
and raw I/O. These are generic categories, and various backing stores can
be used for each of them. A concrete object belonging to any of these
categories is called a file object. Other common terms are stream
and file-like object.
Independent of its category, each concrete stream object will also have various capabilities: it can be read-only, write-only, or read-write. It can also allow arbitrary random access (seeking forwards or backwards to any location), or only sequential access (for example in the case of a socket or pipe).
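The capabilities described above can be queried at runtime with readable(), writable() and seekable(). The following is a minimal sketch, assuming a throwaway temporary file created only for illustration; the variable names are not part of the io API.

```python
import io
import os
import tempfile

# An in-memory binary stream is readable, writable and seekable.
mem = io.BytesIO(b"abc")
mem_caps = (mem.readable(), mem.writable(), mem.seekable())

# A regular file opened with mode "rb" is read-only but still seekable.
fd, path = tempfile.mkstemp()  # temporary file used only for this sketch
os.close(fd)
try:
    with open(path, "rb") as f:
        file_caps = (f.readable(), f.writable(), f.seekable())
finally:
    os.remove(path)

print(mem_caps)   # (True, True, True)
print(file_caps)  # (True, False, True)
```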
All streams are careful about the type of data you give to them. For example,
giving a str object to the write() method of a binary stream
will raise a TypeError. So will giving a bytes object to the
write() method of a text stream.
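As a small illustration of this type strictness, the sketch below uses in-memory streams (BytesIO and StringIO) as stand-ins for any binary or text stream; the loop and variable names are illustrative only.

```python
import io

binary = io.BytesIO()
text = io.StringIO()

# Each stream is given the wrong kind of data on purpose.
errors = []
for stream, payload in ((binary, "text"), (text, b"bytes")):
    try:
        stream.write(payload)
    except TypeError as exc:
        errors.append(type(exc).__name__)

print(errors)  # ['TypeError', 'TypeError']
```

Nothing is written to either stream when the call is rejected.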
Changed in version 3.3: Operations that used to raise IOError now raise OSError, since IOError is now an alias of OSError.
Text I/O
Text I/O expects and produces str objects. This means that whenever
the backing store is natively made of bytes (such as in the case of a file),
encoding and decoding of data is made transparently as well as optional
translation of platform-specific newline characters.
The easiest way to create a text stream is with open(), optionally
specifying an encoding:
f = open("myfile.txt", "r", encoding="utf-8")
In-memory text streams are also available as StringIO objects:
f = io.StringIO("some initial text data")
The text stream API is described in detail in the documentation of
TextIOBase.
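To make the transparent decoding and newline translation concrete, here is a minimal sketch that wraps an in-memory byte store in a TextIOWrapper. The sample data is invented for illustration, and the default newline handling (universal newlines on input) is assumed.

```python
import io

# Bytes as they might be stored on disk: UTF-8 encoded, Windows line endings.
raw_bytes = "caf\u00e9\r\nsecond line\r\n".encode("utf-8")

# The text layer decodes the bytes and translates "\r\n" to "\n" on reading.
text = io.TextIOWrapper(io.BytesIO(raw_bytes), encoding="utf-8")
lines = text.readlines()

print(lines)  # ['café\n', 'second line\n']
```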
Binary I/O
Binary I/O (also called buffered I/O) expects
bytes-like objects and produces bytes
objects. No encoding, decoding, or newline translation is performed. This
category of streams can be used for all kinds of non-text data, and also when
manual control over the handling of text data is desired.
The easiest way to create a binary stream is with open() with 'b' in
the mode string:
f = open("myfile.jpg", "rb")
In-memory binary streams are also available as BytesIO objects:
f = io.BytesIO(b"some initial binary data: \x00\x01")
The binary stream API is described in detail in the docs of
BufferedIOBase.
Other library modules may provide additional ways to create text or binary
streams. See socket.socket.makefile() for example.
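The following sketch illustrates the "expects bytes-like objects and produces bytes" contract using BytesIO; the payloads are arbitrary sample values.

```python
import io

buf = io.BytesIO()

# Any bytes-like object is accepted on the way in.
buf.write(b"\x00\x01")
buf.write(bytearray(b"\x02"))
buf.write(memoryview(b"\x03\x04"))

# What comes back out is always a bytes object.
buf.seek(0)
data = buf.read()

print(data)        # b'\x00\x01\x02\x03\x04'
print(type(data))  # <class 'bytes'>
```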
Raw I/O
Raw I/O (also called unbuffered I/O) is generally used as a low-level building-block for binary and text streams; it is rarely useful to directly manipulate a raw stream from user code. Nevertheless, you can create a raw stream by opening a file in binary mode with buffering disabled:
f = open("myfile.jpg", "rb", buffering=0)
The raw stream API is described in detail in the docs of RawIOBase.
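The three categories correspond to different concrete classes returned by open(). The sketch below inspects them, assuming a temporary empty file created only for the demonstration; the variable names are illustrative.

```python
import io
import os
import tempfile

fd, path = tempfile.mkstemp()  # temporary file used only for this sketch
os.close(fd)
try:
    with open(path, "r", encoding="utf-8") as t, \
         open(path, "rb") as b, \
         open(path, "rb", buffering=0) as r:
        kinds = (type(t).__name__, type(b).__name__, type(r).__name__)
        # Each layer wraps the one below it: text -> buffered -> raw.
        text_wraps_buffered = isinstance(t.buffer, io.BufferedReader)
        buffered_wraps_raw = isinstance(b.raw, io.FileIO)
finally:
    os.remove(path)

print(kinds)  # ('TextIOWrapper', 'BufferedReader', 'FileIO')
```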
Text Encoding
The default encoding of TextIOWrapper and open() is
locale-specific (locale.getencoding()).
However, many developers forget to specify the encoding when opening text files encoded in UTF-8 (e.g. JSON, TOML, Markdown, etc…) since most Unix platforms use UTF-8 locale by default. This causes bugs because the locale encoding is not UTF-8 for most Windows users. For example:
# May not work on Windows when the file contains non-ASCII characters.
with open("README.md") as f:
    long_description = f.read()
Accordingly, it is highly recommended that you specify the encoding
explicitly when opening text files. If you want to use UTF-8, pass
encoding="utf-8". To use the current locale encoding,
encoding="locale" is supported since Python 3.10.
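A minimal sketch of the recommended practice follows. It assumes a temporary file and an invented sample string; the point is only that passing encoding="utf-8" on both the write and the read makes the result independent of the platform's locale.

```python
import os
import tempfile

text = "na\u00efve caf\u00e9"  # sample text containing non-ASCII characters

fd, path = tempfile.mkstemp(suffix=".txt")  # temporary file for this sketch
os.close(fd)
try:
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, "r", encoding="utf-8") as f:
        round_trip = f.read()
    with open(path, "rb") as f:
        raw = f.read()  # the bytes actually stored on disk
finally:
    os.remove(path)

print(round_trip == text)            # True
print(raw == text.encode("utf-8"))   # True
```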
See also
- Python UTF-8 Mode
Python UTF-8 Mode can be used to change the default encoding to UTF-8 from locale-specific encoding.
- PEP 686
Python 3.15 will make Python UTF-8 Mode default.
Opt-in EncodingWarning
Added in version 3.10: See PEP 597 for more details.
To find where the default locale encoding is used, you can enable
the -X warn_default_encoding command line option or set the
PYTHONWARNDEFAULTENCODING environment variable, which will
emit an EncodingWarning when the default encoding is used.
If you are providing an API that uses open() or TextIOWrapper and passes
encoding=None as a parameter, you can use text_encoding() so that callers
of the API will emit an EncodingWarning if they don’t pass an encoding. However,
please consider using UTF-8 by default (i.e. encoding="utf-8") for
new APIs.
High-level Module Interface
- io.DEFAULT_BUFFER_SIZE
An int containing the default buffer size used by the module’s buffered I/O classes. open() uses the file’s blksize (as obtained by os.stat()) if possible.
- io.open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
This is an alias for the builtin open() function.
This function raises an auditing event open with arguments path, mode and flags. The mode and flags arguments may have been modified or inferred from the original call.
- io.open_code(path)
Opens the provided file with mode 'rb'. This function should be used when the intent is to treat the contents as executable code.
path should be a str and an absolute path.
The behavior of this function may be overridden by an earlier call to PyFile_SetOpenCodeHook(). However, assuming that path is a str and an absolute path, open_code(path) should always behave the same as open(path, 'rb'). Overriding the behavior is intended for additional validation or preprocessing of the file.
Added in version 3.8.
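As an illustration, the sketch below reads a small source file through open_code() and then compiles and executes it. The temporary file, its contents and the variable names are assumptions made for the example, and no open-code hook is assumed to be installed.

```python
import io
import os
import tempfile

# Create a tiny source file to load (illustrative content only).
fd, path = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "wb") as f:
    f.write(b"answer = 6 * 7\n")

try:
    abs_path = os.path.abspath(path)  # open_code() expects an absolute str path
    with io.open_code(abs_path) as f:
        source = f.read()             # bytes, as with open(abs_path, 'rb')
finally:
    os.remove(path)

namespace = {}
exec(compile(source, abs_path, "exec"), namespace)
print(namespace["answer"])  # 42
```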
- io.text_encoding(encoding, stacklevel=2, /)
This is a helper function for callables that use open() or TextIOWrapper and have an encoding=None parameter.
This function returns encoding if it is not None. Otherwise, it returns "locale" or "utf-8" depending on UTF-8 Mode.
This function emits an EncodingWarning if sys.flags.warn_default_encoding is true and encoding is None. stacklevel specifies where the warning is emitted. For example:
def read_text(path, encoding=None):
    encoding = io.text_encoding(encoding)  # stacklevel=2
    with open(path, encoding=encoding) as f:
        return f.read()
In this example, an EncodingWarning is emitted for the caller of read_text().
See Text Encoding for more information.
Added in version 3.10.
Changed in version 3.11: text_encoding() returns "utf-8" when UTF-8 mode is enabled and encoding is None.
- exception io.BlockingIOError
This is a compatibility alias for the builtin BlockingIOError exception.
- exception io.UnsupportedOperation
An exception inheriting OSError and ValueError that is raised when an unsupported operation is called on a stream.
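Because UnsupportedOperation inherits from both OSError and ValueError, a handler for either base class will catch it. The sketch below triggers it by asking an in-memory stream, which has no file descriptor, for fileno().

```python
import io

stream = io.BytesIO(b"data")  # in-memory: there is no file descriptor

caught = None
try:
    stream.fileno()
except OSError as exc:  # "except ValueError" would catch it as well
    caught = exc

print(type(caught).__name__)           # UnsupportedOperation
print(isinstance(caught, ValueError))  # True
```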
See also
sys contains the standard IO streams: sys.stdin, sys.stdout, and sys.stderr.
Class hierarchy
The implementation of I/O streams is organized as a hierarchy of classes. First abstract base classes (ABCs), which are used to specify the various categories of streams, then concrete classes providing the standard stream implementations.
Note
The abstract base classes also provide default implementations of some
methods in order to help implementation of concrete stream classes. For
example, BufferedIOBase provides unoptimized implementations of
readinto() and readline().
At the top of the I/O hierarchy is the abstract base class IOBase. It
defines the basic interface to a stream. Note, however, that there is no
separation between reading and writing to streams; implementations are allowed
to raise UnsupportedOperation if they do not support a given operation.
The RawIOBase ABC extends IOBase. It deals with the reading
and writing of bytes to a stream. FileIO subclasses RawIOBase
to provide an interface to files in the machine’s file system.
The BufferedIOBase ABC extends IOBase. It deals with
buffering on a raw binary stream (RawIOBase). Its subclasses,
BufferedWriter, BufferedReader, and BufferedRWPair, buffer raw binary
streams that are writable, readable, and both readable and writable,
respectively. BufferedRandom provides a buffered interface to seekable streams.
Another BufferedIOBase subclass, BytesIO, is a stream of
in-memory bytes.
The TextIOBase ABC extends IOBase. It deals with
streams whose bytes represent text, and handles encoding and decoding to and
from strings. TextIOWrapper, which extends TextIOBase, is a buffered text
interface to a buffered raw stream (BufferedIOBase). Finally,
StringIO is an in-memory stream for text.
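The relationships described in the preceding paragraphs can be verified with issubclass(). The mapping below is only an illustrative way to organize the checks; it is not part of the io module.

```python
import io

# Concrete class -> the ABC it is described as belonging to.
hierarchy = {
    io.FileIO: io.RawIOBase,
    io.BufferedReader: io.BufferedIOBase,
    io.BufferedWriter: io.BufferedIOBase,
    io.BufferedRandom: io.BufferedIOBase,
    io.BufferedRWPair: io.BufferedIOBase,
    io.BytesIO: io.BufferedIOBase,
    io.TextIOWrapper: io.TextIOBase,
    io.StringIO: io.TextIOBase,
}

checks = {cls.__name__: issubclass(cls, base) for cls, base in hierarchy.items()}

# Every ABC in turn extends IOBase.
abcs_extend_iobase = all(
    issubclass(abc, io.IOBase)
    for abc in (io.RawIOBase, io.BufferedIOBase, io.TextIOBase)
)

print(all(checks.values()), abcs_extend_iobase)  # True True
```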
Argument names are not part of the specification, and only the arguments of
open() are intended to be used as keyword arguments.
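The following table summarizes the ABCs provided by the io module:
ABC | Inherits | Stub Methods | Mixin Methods and Properties
---|---|---|---
IOBase | | fileno, seek, and truncate | close, closed, __enter__, __exit__, flush, isatty, __iter__, __next__, readable, readline, readlines, seekable, tell, writable, and writelines
RawIOBase | IOBase | readinto and write | Inherited IOBase methods, read, and readall
BufferedIOBase | IOBase | detach, read, read1, and write | Inherited IOBase methods, readinto, and readinto1
TextIOBase | IOBase | detach, read, readline, and write | Inherited IOBase methods, encoding, errors, and newlines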
I/O Base Classes
- class io.IOBase
The abstract base class for all I/O classes.
This class provides empty abstract implementations for many methods that derived classes can override selectively; the default implementations represent a file that cannot be read, written or seeked.
Even though IOBase does not declare read() or write() because their signatures will vary, implementations and clients should consider those methods part of the interface. Also, implementations may raise a ValueError (or UnsupportedOperation) when operations they do not support are called.
The basic type used for binary data read from or written to a file is bytes. Other bytes-like objects are accepted as method arguments too. Text I/O classes work with str data.
Note that calling any method (even inquiries) on a closed stream is undefined. Implementations may raise ValueError in this case.
IOBase (and its subclasses) supports the iterator protocol, meaning that an IOBase object can be iterated over yielding the lines in a stream. Lines are defined slightly differently depending on whether the stream is a binary stream (yielding bytes), or a text stream (yielding character strings). See readline() below.
IOBase is also a context manager and therefore supports the with statement. In this example, file is closed after the with statement’s suite is finished, even if an exception occurs:
with open('spam.txt', 'w') as file:
    file.write('Spam and eggs!')
IOBase provides these data attributes and methods:
- close()
Flush and close this stream. This method has no effect if the file is already closed. Once the file is closed, any operation on the file (e.g. reading or writing) will raise a ValueError.
As a convenience, it is allowed to call this method more than once; only the first call, however, will have an effect.
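A short sketch of this behavior, using StringIO as a stand-in for any stream:

```python
import io

stream = io.StringIO("spam")
stream.close()
stream.close()  # a second call is allowed and has no further effect

read_after_close = None
try:
    stream.read()
except ValueError as exc:
    read_after_close = type(exc).__name__

print(stream.closed)       # True
print(read_after_close)    # ValueError
```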
- closed
True if the stream is closed.
- fileno()
Return the underlying file descriptor (an integer) of the stream if it exists. An OSError is raised if the IO object does not use a file descriptor.
- flush()
Flush the write buffers of the stream if applicable. This does nothing for read-only and non-blocking streams.
- isatty()
Return True if the stream is interactive (i.e., connected to a terminal/tty device).
- readline(size=-1, /)
Read and return one line from the stream. If size is specified, at most size bytes will be read.
The line terminator is always b'\n' for binary files; for text files, the newline argument to open() can be used to select the line terminator(s) recognized.
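The effect of the size argument can be seen in the following sketch, which reads from an in-memory binary stream holding invented sample data.

```python
import io

stream = io.BytesIO(b"first line\nsecond\n")

partial = stream.readline(5)  # stops after 5 bytes, before the terminator
rest = stream.readline()      # continues up to and including b"\n"
second = stream.readline()
eof = stream.readline()       # an empty result signals end of stream

print(partial, rest, second, eof)  # b'first' b' line\n' b'second\n' b''
```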
- readlines(hint=-1, /)
Read and return a list of lines from the stream. hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.
hint values of 0 or less, as well as None, are treated as no hint.
Note that it’s already possible to iterate on file objects using for line in file: ... without calling file.readlines().
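A small sketch of how hint limits the result, using an invented four-line text stream in which every line is three characters long:

```python
import io

data = "ab\ncd\nef\ngh\n"  # four lines of 3 characters each

all_lines = io.StringIO(data).readlines()

# With hint=4, reading stops once the lines collected so far total more
# than 4 characters, which happens after the second line (3 + 3 = 6).
some_lines = io.StringIO(data).readlines(4)

print(len(all_lines))  # 4
print(some_lines)      # ['ab\n', 'cd\n']
```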
- seek(offset, whence=os.SEEK_SET, /)
Change the stream position to the given byte offset, interpreted relative to the position indicated by whence, and return the new absolute position. Values for whence are:
- os.SEEK_SET or 0 – start of the stream (the default); offset should be zero or positive
- os.SEEK_CUR or 1 – current stream position; offset may be negative
- os.SEEK_END or 2 – end of the stream; offset is usually negative
Added in version 3.1: The SEEK_* constants.
Added in version 3.3: Some operating systems could support additional values, like os.SEEK_HOLE or os.SEEK_DATA. The valid values for a file could depend on it being open in text or binary mode.
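The three standard whence values can be exercised on an in-memory binary stream, as in this sketch (the ten-byte payload is arbitrary):

```python
import io
import os

stream = io.BytesIO(b"0123456789")

pos_set = stream.seek(3, os.SEEK_SET)   # absolute position 3
chunk = stream.read(2)                  # reads b"34", position is now 5
pos_cur = stream.seek(-1, os.SEEK_CUR)  # one byte back from 5
pos_end = stream.seek(-2, os.SEEK_END)  # two bytes before the end
tail = stream.read()

print(pos_set, chunk, pos_cur, pos_end, tail)  # 3 b'34' 4 8 b'89'
```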
- seekable()
Return True if the stream supports random access. If False, seek(), tell() and truncate() will raise OSError.
- tell()
Return the current stream position.
- truncate(size=None, /)
Resize the stream to the given size in bytes (or the current position if size is not specified). The current stream position isn’t changed. This resizing can extend or reduce the current file size. In case of extension, the contents of the new file area depend on the platform (on most systems, additional bytes are zero-filled). The new file size is returned.
Changed in version 3.5: Windows will now zero-fill files when extending.
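The sketch below shrinks an in-memory binary stream to show that truncate() returns the new size and leaves the stream position untouched. Only the shrinking case is shown; extension behavior is not demonstrated here.

```python
import io

stream = io.BytesIO(b"hello world")
stream.seek(2)

new_size = stream.truncate(5)        # explicit size
position = stream.tell()             # still 2: the position is not moved
after_explicit = stream.getvalue()

size_at_position = stream.truncate()  # no size: cut at the current position
after_default = stream.getvalue()

print(new_size, position, after_explicit)  # 5 2 b'hello'
print(size_at_position, after_default)     # 2 b'he'
```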
- writable()
Return True if the stream supports writing. If False, write() and truncate() will raise OSError.
- writelines(lines, /)
Write a list of lines to the stream. Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.
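Since no separators are added, the caller supplies them. A minimal sketch with StringIO and invented sample lines:

```python
import io

stream = io.StringIO()

# Without terminators the items simply run together.
stream.writelines(["one", "two"])

# Appending "\n" to each item gives one line per item.
stream.writelines(line + "\n" for line in ["three", "four"])

result = stream.getvalue()
print(repr(result))  # 'onetwothree\nfour\n'
```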
- class io.RawIOBase
Base class for raw binary streams. It inherits from IOBase.
Raw binary streams typically provide low-level access to an underlying OS device or API, and do not try to encapsulate it in high-level primitives (this functionality is done at a higher-level in buffered binary streams and text streams, described later in this page).
RawIOBase provides these methods in addition to those from IOBase:
- read(size=-1, /)
Read up to size bytes from the object and return them. As a convenience, if size is unspecified or -1, all bytes until EOF are returned. Otherwise, only one system call is ever made. Fewer than size bytes may be returned if the operating system call returns fewer than size bytes.
If 0 bytes are returned, and size was not 0, this indicates end of file. If the object is in non-blocking mode and no bytes are available, None is returned.
The default implementation defers to readall() and readinto().
- readall()
Read and return all the bytes from the stream until EOF, using multiple calls to the stream if necessary.
- readinto(b, /)
Read bytes into a pre-allocated, writable bytes-like object b, and return the number of bytes read. For example, b might be a bytearray. If the object is in non-blocking mode and no bytes are available, None is returned.
- write(b, /)
Write the given bytes-like object, b, to the underlying raw stream, and return the number of bytes written. This can be less than the length of b in bytes, depending on specifics of the underlying raw stream, and especially if it is in non-blocking mode. None is returned if the raw stream is set not to block and no single byte could be readily written to it. The caller may release or mutate b after this method returns, so the implementation should only access b during the method call.
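To illustrate how the default read() and readall() build on readinto(), here is a minimal sketch of a read-only raw stream over an in-memory bytes object. The class name BytesRaw and its attributes are hypothetical and exist only for this example; a real raw stream would normally wrap an OS-level resource.

```python
import io


class BytesRaw(io.RawIOBase):
    """Hypothetical read-only raw stream over bytes (illustration only)."""

    def __init__(self, data):
        super().__init__()
        self._data = data
        self._pos = 0

    def readable(self):
        return True

    def readinto(self, b):
        # Copy at most len(b) bytes into the caller's pre-allocated buffer.
        chunk = self._data[self._pos:self._pos + len(b)]
        b[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)  # 0 signals end of stream


raw = BytesRaw(b"abcdefgh")
first = raw.read(3)  # inherited read() calls readinto() once
rest = raw.read()    # size omitted: inherited readall() reads until EOF
eof = raw.read(1)

print(first, rest, eof)  # b'abc' b'defgh' b''
```

Only readable() and readinto() are defined here; read() and readall() come from RawIOBase.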
- class io.BufferedIOBase
Base class for binary streams that support some kind of buffering. It inherits from IOBase.
The main difference with RawIOBase is that methods read(), readinto() and write() will try (respectively) to read as much input as requested or to emit all provided data.
In addition, if the underlying raw stream is in non-blocking mode, when the system returns would block, write() will raise BlockingIOError with BlockingIOError.characters_written and read() will return data read so far or None if no data is available.
Besides, the read() method does not have a default implementation that defers to readinto().
A typical BufferedIOBase implementation should not inherit from a RawIOBase implementation, but wrap one, like BufferedWriter and BufferedReader do.
BufferedIOBase provides or overrides these data attributes and methods in addition to those from IOBase:
- raw
The underlying raw stream (a RawIOBase instance) that BufferedIOBase deals with. This is not part of the