Upload files to Azure using a custom client

Wansart, Christian 20 Reputation points
2024-04-16T05:59:11.2166667+00:00

Hello,

we are facing an issue uploading files to Azure with a custom client based on azure-storage-cpplite that was developed years ago. We are currently in the process of renewing the API, but for now we need a workaround. The upload limit of upload_block_blob_from_stream is 256 megabytes, but we have some files of around 2 gigabytes.

When researching the issue I found several pages, including https://learn.microsoft.com/en-us/rest/api/storageservices/understanding-block-blobs--append-blobs--and-page-blobs. It mentions the Append Block endpoints, which seemed to be the right option, but the API caused some headaches. After reading the last paragraph, though, I started to wonder whether append blobs are the right choice at all. We just need to upload bigger files.

After reading the other sources as well, I was wondering whether block lists are what we are looking for.

If so, we need to find a way to actually do the upload. azure-storage-cpplite has the upload_block_blob_from_buffer function, but it accepts only a buffer, not a stream, which is a bad idea when uploading files that are several gigabytes in size. Also, there are examples of using block lists with parallelism, but how does that work if I want to upload one big file rather than several smaller ones?

So, my question is: what is the right choice for uploading big files? Append Blocks or Block Lists?

Azure Blob Storage
C++

Accepted answer
  1. Anand Prakash Yadav 5,765 Reputation points Microsoft Vendor
    2024-04-16T11:22:04.05+00:00

    Hello Wansart, Christian,

    Thank you for posting your query here!

    For uploading large files to Azure Blob Storage, Block Blobs are generally the best choice. Block Blobs are designed to handle large files efficiently by breaking them into blocks. You can upload each block in parallel to increase upload speed, and each block can be retried separately upon failure, saving bandwidth.

    The azure-storage-cpplite library you’re using does have limitations with large files. However, you can work around this by reading your large file in smaller chunks and uploading each chunk with the Put Block API. After all blocks are uploaded, you call the Put Block List API to commit the blob.

    Here’s a high-level overview of the steps:

    · You can decide the chunk size based on your requirements. A common choice is a few megabytes per block; note that a block blob can contain at most 50,000 blocks, so the chunk size must be large enough for the whole file to fit within that limit.

    · Each chunk is uploaded as a separate block via the Put Block API. Each block is identified by a Base64-encoded block ID, which must be unique within the blob, and all block IDs for a given blob must be the same length before encoding.

    · Call the Put Block List API to commit the blob: after all blocks have been uploaded, you provide the list of block IDs to the Put Block List API in the order that they should appear in the blob.
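    The block-ID and chunk bookkeeping above can be sketched in plain C++. Because exact azure-storage-cpplite signatures vary by version, the sketch below deliberately leaves out the HTTP call and only shows the self-contained parts: generating fixed-length, Base64-encoded block IDs (as Put Block requires) and computing how many blocks a given file needs.

    ```cpp
    #include <cstdint>
    #include <iomanip>
    #include <iostream>
    #include <sstream>
    #include <string>

    // Base64-encode a string (standard alphabet, '=' padding).
    std::string base64_encode(const std::string& in) {
        static const char tbl[] =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
        std::string out;
        int val = 0, bits = -6;
        for (unsigned char c : in) {
            val = (val << 8) + c;
            bits += 8;
            while (bits >= 0) {
                out.push_back(tbl[(val >> bits) & 0x3F]);
                bits -= 6;
            }
        }
        if (bits > -6) out.push_back(tbl[((val << 8) >> (bits + 8)) & 0x3F]);
        while (out.size() % 4) out.push_back('=');
        return out;
    }

    // Put Block requires Base64 block IDs that all have the same decoded
    // length, so zero-pad the chunk index to a fixed width before encoding.
    std::string make_block_id(std::size_t index) {
        std::ostringstream oss;
        oss << std::setw(8) << std::setfill('0') << index;
        return base64_encode(oss.str());
    }

    int main() {
        const std::uint64_t file_size  = 2ULL * 1024 * 1024 * 1024; // ~2 GiB
        const std::uint64_t chunk_size = 8ULL * 1024 * 1024;        // 8 MiB blocks
        const std::uint64_t blocks = (file_size + chunk_size - 1) / chunk_size;

        std::cout << "blocks needed: " << blocks << "\n"; // must stay <= 50,000
        std::cout << "first id: " << make_block_id(0) << "\n";
        std::cout << "last id:  " << make_block_id(blocks - 1) << "\n";
    }
    ```

    In a real upload you would send each chunk with its block ID via Put Block, then send the full ordered ID list via Put Block List.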

    As for the upload_block_blob_from_buffer function, it’s not suitable for large files because it requires the entire file to be loaded into memory. This can lead to out-of-memory errors for large files.

    Regarding parallelism, it’s about uploading multiple blocks at the same time. This can significantly increase upload speed, especially for large files. However, it also requires more memory and CPU resources. If you want to upload a single large file, you can still use parallelism by uploading multiple chunks of the file at the same time.
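    The single-file parallel pattern can be sketched as follows. This is an illustration only, not azure-storage-cpplite API: the real Put Block HTTP call is replaced by a stand-in that just records each slice, so the threading structure (workers pulling chunk indices from a shared counter, commit only after every block is done) can be shown without network access.

    ```cpp
    #include <algorithm>
    #include <atomic>
    #include <cassert>
    #include <cstddef>
    #include <string>
    #include <thread>
    #include <vector>

    // Upload one chunk. In real code this would issue a Put Block request
    // with a fixed-length block ID; here it records the slice locally.
    void upload_chunk(const std::string& file, std::size_t chunk_size,
                      std::size_t index, std::vector<std::string>& staged) {
        std::size_t off = index * chunk_size;
        std::size_t len = std::min(chunk_size, file.size() - off);
        staged[index] = file.substr(off, len); // stand-in for Put Block
    }

    // Upload all chunks of one file with `workers` threads, then "commit"
    // by concatenating in order (stand-in for Put Block List).
    std::string parallel_upload(const std::string& file, std::size_t chunk_size,
                                unsigned workers) {
        std::size_t n = (file.size() + chunk_size - 1) / chunk_size;
        std::vector<std::string> staged(n);
        std::atomic<std::size_t> next{0};

        std::vector<std::thread> pool;
        for (unsigned w = 0; w < workers; ++w) {
            pool.emplace_back([&] {
                // Each worker grabs the next unclaimed chunk index.
                for (std::size_t i = next++; i < n; i = next++)
                    upload_chunk(file, chunk_size, i, staged);
            });
        }
        for (auto& t : pool) t.join();  // all blocks uploaded...

        std::string committed;          // ...then commit the list in order
        for (const auto& s : staged) committed += s;
        return committed;
    }

    int main() {
        std::string file(1000, 'x');
        for (std::size_t i = 0; i < file.size(); ++i)
            file[i] = char('a' + i % 26);
        assert(parallel_upload(file, 64, 4) == file);
    }
    ```

    The key point is that block order is fixed by the committed block list, not by upload order, which is what makes uploading chunks of one file concurrently safe.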

    https://inside.covve.com/uploading-block-blobs-larger-than-256-mb-in-azure/

    Note: You can also use Azure Storage Explorer, AzCopy, PowerShell, the Azure CLI, or any programming language to upload large files.

    Upload large amounts of random data in parallel to Azure storage

    How to upload big files to Azure Blob Storage (.NET Core)

    If you want to upload larger files to a file share or blob storage, there is the Azure Storage Data Movement Library. It provides high-performance uploading and downloading of large files; please consider using it for this scenario.

    Choose an Azure solution for data transfer: This article provides an overview of some of the common Azure data transfer solutions. The article also links out to recommended options depending on the network bandwidth in your environment and the size of the data you intend to transfer.

    I hope this helps! Please let me know if the issue persists or if you have any other questions.

    Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you; this can be beneficial to other community members.

    1 person found this answer helpful.

0 additional answers