duck.utils.fileio

FileIOStream module.

Provides synchronous and asynchronous file streaming interfaces. Both read large files efficiently in chunks and support the standard seek, tell, and close operations.

Methods that do not need to be async. Even in an async context, the following methods can remain synchronous:

  1. open - Time complexity is O(1)

  2. seek - Time complexity is O(1)

  3. tell - Time complexity is O(1)

In async context, only read, write, and close need to be asynchronous.
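The split described above can be sketched with the standard library alone. The class below is a hypothetical stand-in, not Duck's implementation: the name SketchAsyncStream and its internals are assumptions made for illustration. It keeps open, seek, and tell as plain methods and offloads only read and close to a worker thread through asyncio.to_thread (Python 3.9+).

```python
import asyncio
import os
import tempfile


class SketchAsyncStream:
    """Hypothetical stand-in; not Duck's AsyncFileIOStream."""

    def __init__(self, filepath, mode="rb"):
        # Opening is cheap, so it stays synchronous in this sketch.
        self._file = open(filepath, mode)

    def seek(self, offset, whence=os.SEEK_SET):
        # Only moves the file pointer; no await needed.
        return self._file.seek(offset, whence)

    def tell(self):
        # Only reports the file pointer; no await needed.
        return self._file.tell()

    async def read(self, size=-1):
        # Potentially slow disk I/O runs in a worker thread.
        return await asyncio.to_thread(self._file.read, size)

    async def close(self):
        await asyncio.to_thread(self._file.close)


async def demo():
    # Create a small temporary file to read back.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello world")
    try:
        stream = SketchAsyncStream(tmp.name)
        stream.seek(6)                 # synchronous call
        data = await stream.read()     # awaited call
        position = stream.tell()       # synchronous call
        await stream.close()
    finally:
        os.unlink(tmp.name)
    return data, position


result = asyncio.run(demo())
```

Here result holds the bytes read after the seek together with the final file position, which shows that synchronous and awaited calls can be mixed freely on the same stream.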

Module Contents

Classes

AsyncFileIOStream

Asynchronous file streaming class.

FileIOStream

Synchronous file streaming class that mimics io.IOBase.

Functions

to_async_fileio_stream

Converts fileio_stream to an AsyncFileIOStream if it is not already asynchronous.

API

class duck.utils.fileio.AsyncFileIOStream(*args, **kwargs)

Bases: duck.utils.fileio.FileIOStream

Asynchronous file streaming class.

Provides async-compatible methods for reading and writing files in a non-blocking way using threads via asyncio.to_thread.

Notes:

  • This implementation is compatible with context managers.
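As an illustration of what context-manager compatibility means in an async setting, here is a minimal sketch of the async context manager protocol. SketchStream is a hypothetical stand-in, and the assumption that entering opens the file and exiting closes it is ours; this page does not state what Duck's __aenter__ and __aexit__ do.

```python
import asyncio
import os
import tempfile


class SketchStream:
    """Hypothetical stand-in showing the async context manager protocol."""

    def __init__(self, filepath, mode="rb"):
        self.filepath = filepath
        self.mode = mode
        self._file = None

    async def __aenter__(self):
        # Assumption: entering the context opens the file.
        self._file = await asyncio.to_thread(open, self.filepath, self.mode)
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Assumption: leaving the context always closes the file.
        await asyncio.to_thread(self._file.close)

    async def read(self, size=-1):
        return await asyncio.to_thread(self._file.read, size)

    def is_open(self):
        return self._file is not None and not self._file.closed


async def demo():
    # Create a small temporary file to read back.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"abcdef")
    try:
        async with SketchStream(tmp.name) as stream:
            head = await stream.read(3)
            open_inside = stream.is_open()
        open_after = stream.is_open()
    finally:
        os.unlink(tmp.name)
    return head, open_inside, open_after


outcome = asyncio.run(demo())
```

The file is open only inside the async with block, so the caller does not have to remember a separate close call.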

Initialization

Initialize the FileIOStream object.

Parameters:
  • filepath – Path to the file to be streamed.

  • chunk_size – Maximum number of bytes to read or write at once. Default is 2 MiB (2 * 1024 * 1024 bytes).

  • open_now – Whether to open the file immediately. Defaults to False.

  • mode – File open mode (default: 'rb').

async __aenter__()
async __aexit__(exc_type, exc, tb)
async async_open()

Asynchronously open the file.

async close()

Asynchronously close the file.

async read(size: int = -1) → bytes

Asynchronously read from the file.

Parameters:

size – Maximum number of bytes to read. -1 reads the full content.

Returns:

Data read from file.

Return type:

bytes

async write(data: bytes) → int

Asynchronously write data to the file.

Parameters:

data – Bytes to write.

Returns:

Number of bytes written.

Return type:

int

class duck.utils.fileio.FileIOStream(filepath: str, chunk_size: int = 2 * 1024 * 1024, open_now: bool = False, mode: str = 'rb')

Bases: io.IOBase

Synchronous file streaming class that mimics io.IOBase.

This class provides an interface to stream file contents using standard file operations such as read, write, seek, tell, and close. It is optimized for chunked reading of large files and is designed to be used strictly in synchronous contexts.
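Chunked reading can be sketched as a loop that calls read with a fixed size until it returns empty bytes. The helper iter_chunks is a hypothetical name, not part of Duck; it works with any object that exposes read(size), so io.BytesIO stands in for a real file here. Whether FileIOStream.read itself caps a request at chunk_size is not stated on this page, so the sketch passes the size explicitly.

```python
import io

CHUNK_SIZE = 2 * 1024 * 1024  # same value as the documented default


def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield successive chunks from any object with a read(size) method."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes signals end of file
            break
        yield chunk


# io.BytesIO stands in for a real file; a tiny chunk size keeps the demo small.
source = io.BytesIO(b"0123456789")
chunks = list(iter_chunks(source, chunk_size=4))
```

Memory use stays bounded by chunk_size regardless of file size, which is the reason for reading large files this way.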

Initialization

Initialize the FileIOStream object.

Parameters:
  • filepath – Path to the file to be streamed.

  • chunk_size – Maximum number of bytes to read or write at once. Default is 2 MiB (2 * 1024 * 1024 bytes).

  • open_now – Whether to open the file immediately. Defaults to False.

  • mode – File open mode (default: 'rb').

__del__()

Verify on deletion that the file has been closed; raises a RuntimeError if it is still open.

__slots__

None

close()

Close the file.

is_open() → bool

Check if the file is currently open.

open()

Open the file using the provided mode.

raise_if_in_async_context(message: str)

Raise an error if used inside an async context.
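One common way to detect an async context is to ask asyncio for the running event loop. The sketch below is an assumption about how such a guard could work, not Duck's code: the function names are hypothetical, and RuntimeError is chosen only for illustration because this page does not say which error type is raised.

```python
import asyncio


def in_async_context():
    """Return True when the current thread has a running event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # get_running_loop raises RuntimeError when no loop is running.
        return False
    return True


def sketch_raise_if_in_async_context(message):
    # Hypothetical re-creation; the error type is an assumption.
    if in_async_context():
        raise RuntimeError(message)


async def call_from_coroutine():
    try:
        sketch_raise_if_in_async_context("use the async stream instead")
    except RuntimeError as error:
        return str(error)
    return None


# Outside any event loop the guard passes silently.
sketch_raise_if_in_async_context("not raised here")

# Inside a coroutine the guard raises, and the message is captured.
caught = asyncio.run(call_from_coroutine())
```

A guard like this lets a synchronous class fail fast when called from a coroutine, instead of silently blocking the event loop.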

read(size: int = -1) → bytes

Synchronously read data from the file.

Parameters:

size – Number of bytes to read. -1 reads all.

Returns:

File data.

Return type:

bytes

seek(offset: int, whence: int = os.SEEK_SET)

Move the file pointer to a new location.

tell() → int

Get the current position in the file.

write(data: bytes) → int

Synchronously write data to the file.

Parameters:

data – Data to write.

Returns:

Number of bytes written.

Return type:

int

duck.utils.fileio.to_async_fileio_stream(fileio_stream: FileIOStream) → AsyncFileIOStream

Converts fileio_stream to an AsyncFileIOStream if it is not already asynchronous.