ADLUploader Class

Upload local file(s) using chunks and threads

Launches multiple threads for efficient uploading, with each thread assigned chunks of up to chunksize bytes. The path can be a single file, a directory of files, or a glob pattern.

Inheritance
builtins.object
ADLUploader

Constructor

ADLUploader(adlfs, rpath, lpath, nthreads=None, chunksize=268435456, buffersize=4194304, blocksize=4194304, client=None, run=True, overwrite=False, verbose=False, progress_callback=None, timeout=0)

Parameters

adlfs
ADL filesystem instance
Required
rpath
str
Required

Remote path to upload to; if multiple files, this is the directory root to write within.

lpath
str
Required

Local path. Can be a single file, a directory (in which case, upload recursively), or a glob pattern. Recursive glob patterns using ** are not supported.
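The three accepted forms of lpath can be approximated with the standard library to preview which local files would be covered. This is only a sketch: preview_local_matches is a hypothetical helper, not part of the library, and the uploader's own path resolution may differ in details such as ordering or symlink handling.

```python
import glob
import os


def preview_local_matches(lpath):
    """Approximate which local files an lpath refers to (hypothetical helper)."""
    if "**" in lpath:
        # Recursive glob patterns are documented as unsupported.
        raise ValueError("recursive glob patterns (**) are not supported")
    if os.path.isdir(lpath):
        # A directory is uploaded recursively, so walk the whole tree.
        found = []
        for root, _dirs, files in os.walk(lpath):
            found.extend(os.path.join(root, name) for name in files)
        return sorted(found)
    if os.path.isfile(lpath):
        return [lpath]
    # Otherwise treat the value as a non-recursive glob pattern.
    return sorted(p for p in glob.glob(lpath) if os.path.isfile(p))
```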

nthreads
int[None]
default value: None

Number of threads to use. If None, uses the number of cores.

chunksize
int[2**28]
default value: 268435456

Number of bytes for a chunk. Large files are split into chunks. Files smaller than this number will always be transferred in a single thread.

buffersize
int[2**22]
default value: 4194304

Number of bytes for the internal buffer. The buffer cannot be bigger than a chunk and cannot be smaller than a block.

blocksize
int[2**22]
default value: 4194304

Number of bytes for a block. Within each chunk, we write a smaller block for each API call. This block cannot be bigger than a chunk.
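The relationship between chunksize, buffersize and blocksize can be illustrated with a little arithmetic. The sketch below is not the library's internal code; plan_transfer is a hypothetical helper that only mirrors the documented constraints (block no bigger than buffer, buffer no bigger than chunk) and shows how a file of a given size would divide under the default values.

```python
def plan_transfer(file_size, chunksize=2**28, buffersize=2**22, blocksize=2**22):
    """Return (offset, length, n_blocks) for each chunk; illustrative only."""
    # Documented constraints: block <= buffer <= chunk.
    if not blocksize <= buffersize <= chunksize:
        raise ValueError("expected blocksize <= buffersize <= chunksize")
    chunks = []
    offset = 0
    while offset < file_size:
        # The last chunk may be shorter than chunksize.
        length = min(chunksize, file_size - offset)
        # Ceiling division: number of block-sized writes for this chunk.
        n_blocks = -(-length // blocksize)
        chunks.append((offset, length, n_blocks))
        offset += length
    return chunks
```

Under these assumptions a 600 MiB file yields three chunks (256 MiB, 256 MiB and 88 MiB), while any file smaller than chunksize yields a single chunk and is therefore handled by one thread.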

client
ADLTransferClient[None]
default value: None

Set an instance of ADLTransferClient when finer-grained control over transfer parameters is needed. Ignores nthreads and chunksize set by constructor.

run
bool[True]
default value: True

Whether to begin executing immediately.

overwrite
bool[False]
default value: False

Whether to forcibly overwrite existing files/directories. If False and the remote path is a directory, the upload quits whether or not any files would actually be overwritten. If True, only files with matching names are overwritten.

progress_callback
callable[None]
default value: None

Callback for progress with signature function(current, total), where current is the number of bytes transferred so far and total is the size of the blob, or None if the total size is unknown.
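A callback matching the function(current, total) signature might look like the following sketch. The names make_progress_reporter, label and write are illustrative assumptions; the only thing taken from this page is the two-argument signature and the fact that total may be None.

```python
def make_progress_reporter(label, write=print):
    """Build a callback with the documented (current, total) signature."""
    def report(current, total):
        if total:
            # Total size is known and non-zero, so a percentage can be shown.
            percent = 100.0 * current / total
            write(f"{label}: {current}/{total} bytes ({percent:.1f}%)")
        else:
            # total is None (or 0): report the byte count only.
            write(f"{label}: {current} bytes (total unknown)")
    return report
```

The returned function could then be passed as progress_callback=make_progress_reporter("upload").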

timeout
int(0)
default value: 0

Default value 0 means infinite timeout. Otherwise, the time in seconds before the process stops and raises an exception if the transfer is still in progress.
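A minimal end-to-end sketch of constructing an uploader with the parameters above follows. The import path and the lib.auth and core.AzureDLFileSystem calls come from the azure-datalake-store package rather than from this page, so treat them as assumptions to check against the installed version; upload_directory and its arguments are hypothetical names.

```python
def upload_directory(store_name, tenant_id, client_id, client_secret,
                     local_dir, remote_dir):
    """Upload local_dir to remote_dir and report success (illustrative)."""
    # Imported lazily so this sketch can be loaded without the package installed.
    from azure.datalake.store import core, lib, multithread

    # Assumed service-principal authentication; other auth flows exist.
    token = lib.auth(tenant_id=tenant_id, client_id=client_id,
                     client_secret=client_secret)
    adl = core.AzureDLFileSystem(token, store_name=store_name)

    # run=True starts the transfer immediately and waits for completion.
    uploader = multithread.ADLUploader(
        adl, rpath=remote_dir, lpath=local_dir, nthreads=8,
        chunksize=2**28, buffersize=2**22, blocksize=2**22,
        overwrite=True, run=True, timeout=0)
    return uploader.successful()
```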

Methods

active

Return whether the uploader is active.

clear_saved

Remove references to all persisted uploads.

load

Load list of persisted transfers from disk, for possible resumption.

run

Populate transfer queue and execute uploads.

save

Persist this upload

Saves a copy of this transfer process in its current state to disk. This is done automatically for a running transfer, so that as a chunk is completed, this is reflected. Thus, if a transfer is interrupted, e.g., by user action, the transfer can be restarted at another time. All chunks that were not already completed will be restarted at that time.

See the methods load to retrieve saved transfers and run to resume a stopped transfer.

successful

Return whether the uploader completed successfully.

It will raise AssertionError if the uploader is active.

active

Return whether the uploader is active.

active()

clear_saved

Remove references to all persisted uploads.

static clear_saved()

load

Load list of persisted transfers from disk, for possible resumption.

static load()

Returns

A dictionary of upload instances. The hashes are auto-generated and unique. The state of the chunks (completed, errored, etc.) can be seen in the status attribute. Instances can be resumed with run().
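Resuming persisted transfers, as described above, could be sketched like this. resume_saved_transfers is a hypothetical helper; it takes the uploader class as an argument (in practice ADLUploader) and relies only on the load, run and successful methods documented on this page.

```python
def resume_saved_transfers(uploader_cls, nthreads=None):
    """Resume every persisted transfer and map each hash to its outcome."""
    results = {}
    # load() returns a dictionary of saved instances keyed by hash.
    for key, transfer in uploader_cls.load().items():
        # monitor=True blocks until this transfer finishes.
        transfer.run(nthreads=nthreads, monitor=True)
        # Safe to call now: the transfer is no longer active.
        results[key] = transfer.successful()
    return results
```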

run

Populate transfer queue and execute uploads.

run(nthreads=None, monitor=True)

Parameters

nthreads
int[None]
default value: None

Override default nthreads, if given

monitor
bool[True]
default value: True

To watch and wait (block) until completion.
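When monitor=False the call returns without waiting, so the caller has to poll. A minimal sketch, assuming an uploader that was constructed with run=False; run_without_blocking, poll_seconds and on_tick are illustrative names, not library API.

```python
import time


def run_without_blocking(transfer, poll_seconds=1.0, on_tick=None):
    """Start a transfer with monitor=False, then poll until it finishes."""
    transfer.run(monitor=False)
    while transfer.active():
        if on_tick is not None:
            # Hook for other work, e.g. updating a UI between polls.
            on_tick(transfer)
        time.sleep(poll_seconds)
    # successful() raises AssertionError while active, so call it only now.
    return transfer.successful()
```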

save

Persist this upload

Saves a copy of this transfer process in its current state to disk. This is done automatically for a running transfer, so that as a chunk is completed, this is reflected. Thus, if a transfer is interrupted, e.g., by user action, the transfer can be restarted at another time. All chunks that were not already completed will be restarted at that time.

See the methods load to retrieve saved transfers and run to resume a stopped transfer.

save(keep=True)

Parameters

keep
bool(True)
default value: True

If True, the transfer is saved if some chunks remain to be completed; otherwise, the saved transfer is removed.

successful

Return whether the uploader completed successfully.

It will raise AssertionError if the uploader is active.

successful()

Attributes

hash