transfer Module

Low-level classes for managing data transfer.

Classes

ADLTransferClient

Client for transferring data to and from Azure Data Lake Store.

This is intended as the underlying class for ADLDownloader and ADLUploader. If necessary, it can be used directly for additional control.

The transfer callable has the function signature fn(adlfs, src, dst, offset, size, buffersize, blocksize, shutdown_event). adlfs is the ADL filesystem instance. src and dst refer to the source and destination of the respective file transfer. offset is the location in src to read size bytes from. buffersize is the number of bytes used for internal buffering before transfer. blocksize is the number of bytes in a chunk to write at one time. The callable should return an integer representing the number of bytes written.

The merge callable has the function signature fn(adlfs, outfile, files, shutdown_event). adlfs is the ADL filesystem instance. outfile is the result of merging files.

For both transfer callables, shutdown_event is optional. In particular, shutdown_event is a threading.Event that is passed to the callable. The event will be set when a shutdown is requested. It is good practice to listen for this.

Internal State

self._fstates (StateManager): This captures the current state of each transferred file.

self._files: Using a tuple of the file source/destination as the key, this dictionary stores the file metadata and all chunk states. The dictionary key is (src, dst) and the value is dict(length, cstates, exception).
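As an illustration of the two callable signatures described above, the following is a minimal sketch rather than the library's own implementation. The names copy_chunk and concat_files are hypothetical, local files stand in for the remote store, and the adlfs and buffersize arguments are accepted only to match the signatures. Both callables check shutdown_event before each unit of work.

```python
import os
import tempfile
import threading


def copy_chunk(adlfs, src, dst, offset, size, buffersize, blocksize,
               shutdown_event=None):
    """Hypothetical transfer callable: copy `size` bytes of `src`, starting
    at `offset`, into `dst`, and return the number of bytes written."""
    written = 0
    # Assumption: plain local files replace the ADL store in this sketch,
    # so `adlfs` and `buffersize` are unused.
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        fin.seek(offset)
        while written < size:
            # Stop early if a shutdown has been requested.
            if shutdown_event is not None and shutdown_event.is_set():
                break
            data = fin.read(min(blocksize, size - written))
            if not data:  # end of file reached before `size` bytes
                break
            fout.write(data)
            written += len(data)
    return written


def concat_files(adlfs, outfile, files, shutdown_event=None):
    """Hypothetical merge callable: concatenate `files` into `outfile`."""
    with open(outfile, "wb") as fout:
        for name in files:
            if shutdown_event is not None and shutdown_event.is_set():
                break
            with open(name, "rb") as fin:
                fout.write(fin.read())


# Usage with local temporary files (no ADL connection involved).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "src.bin")
part = os.path.join(workdir, "part.bin")
merged = os.path.join(workdir, "merged.bin")
with open(src, "wb") as f:
    f.write(b"0123456789")

stop = threading.Event()
written = copy_chunk(None, src, part, offset=2, size=5,
                     buffersize=4, blocksize=2, shutdown_event=stop)
concat_files(None, merged, [part, part], shutdown_event=stop)
```

Checking the event inside the loop, rather than once at the start, lets a long-running chunk stop between blocks when a shutdown is requested.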

Chunk
File
StateManager

Manages state for any hashable object.

When tracking multiple files and their chunks, each file/chunk can be in any valid state for that particular type.

At the simplest level, we need to set and retrieve an object's current state, while only allowing valid states to be used. In addition, we also need to give statistics about a group of objects (are all objects in one state? how many objects are in each available state?).
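The requirements above (set and retrieve a state, reject invalid states, report group statistics) can be sketched as follows. SimpleStateTracker and its method names are hypothetical and chosen for illustration only; they are not the interface of StateManager, and the state names are assumptions.

```python
from collections import Counter


class SimpleStateTracker:
    """Hypothetical stand-in for illustration; not the library's StateManager."""

    def __init__(self, *states):
        self._valid = set(states)
        self._current = {}  # maps each hashable object to its state

    def set(self, obj, state):
        # Only states declared at construction time are accepted.
        if state not in self._valid:
            raise ValueError("invalid state: %r" % (state,))
        self._current[obj] = state

    def get(self, obj):
        return self._current[obj]

    def all_in(self, state):
        # True only when at least one object is tracked and all share `state`.
        return bool(self._current) and all(
            s == state for s in self._current.values())

    def counts(self):
        # Number of objects in each available state, including empty states.
        seen = Counter(self._current.values())
        return {state: seen.get(state, 0) for state in self._valid}


# Chunks are identified here by a hypothetical (filename, offset) tuple.
tracker = SimpleStateTracker("pending", "running", "finished")
tracker.set(("a.txt", 0), "running")
tracker.set(("a.txt", 1024), "finished")
```

Any hashable object can serve as a key, so the same tracker type works for files and for chunks.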