WebHDFS FileSystem APIs
Azure Data Lake Store (ADLS) is a cloud-scale file system that is compatible with the Hadoop Distributed File System (HDFS) and works with the Hadoop ecosystem. Existing applications or services that use the WebHDFS API can integrate with ADLS.
URL for REST calls
A typical WebHDFS REST URL looks like the following:
http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=<OP>...
To map this URL for a REST call to Data Lake Store, make the following changes:
- Use https instead of http.
- For <HOST>, use the fully qualified account name, like <data_lake_store_name>.azuredatalakestore.net.
- The :<PORT> is optional.
So, a REST endpoint URL for Data Lake Store using the WebHDFS APIs should look like the following:
https://<data_lake_store_name>.azuredatalakestore.net/webhdfs/v1/<PATH>?op=<OP>...
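The mapping above can be sketched as a small helper. This is a minimal illustration, not an official client: the function name `adls_webhdfs_url`, the account name `mydatalakestore`, and the sample path are hypothetical, and the port is simply omitted because it is optional.

```python
from urllib.parse import quote, urlencode


def adls_webhdfs_url(account, path, op, **params):
    """Build a WebHDFS-style REST URL for a Data Lake Store account.

    `account` is the bare store name (hypothetical here); the
    fully qualified host is derived from it as described above.
    """
    host = f"{account}.azuredatalakestore.net"
    # Keep "/" as the path separator and percent-encode everything else.
    encoded_path = quote(path.lstrip("/"), safe="/")
    # `op` always comes first, followed by any extra query parameters.
    query = urlencode({"op": op, **params})
    return f"https://{host}/webhdfs/v1/{encoded_path}?{query}"


# Example with a hypothetical account and path.
print(adls_webhdfs_url("mydatalakestore", "/mytempdir/my file.txt", "OPEN"))
```

Extra keyword arguments become additional query parameters, so `adls_webhdfs_url("mydatalakestore", "/dir", "MKDIRS", permission="755")` would append `&permission=755` after the operation.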
Passing authorization token in the message header
Data Lake Store uses Azure Active Directory to authorize REST calls. All REST calls to Data Lake Store must include an authorization token as part of the message header. For a detailed discussion on how Azure Active Directory uses OAuth, see OAuth2.0 in Azure Active Directory. For instructions on how to request an authorization token, see How do I authenticate using Azure Active Directory.
Note
For a list of common headers and parameters that are required for calls to Data Lake Store, see Common parameters and headers.
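As a rough sketch of attaching the token, the snippet below builds (but does not send) a request using only the Python standard library. It assumes the token has already been obtained from Azure Active Directory and is passed as an OAuth 2.0 bearer token in the `Authorization` header; the helper name `build_request`, the account name, and the token value are placeholders, not part of the original documentation.

```python
import urllib.request


def build_request(url, access_token, method="GET"):
    """Return a Request object carrying the authorization token.

    Assumption: the token is sent as "Authorization: Bearer <token>".
    """
    return urllib.request.Request(
        url,
        method=method,
        headers={"Authorization": f"Bearer {access_token}"},
    )


# Hypothetical account name and placeholder token; nothing is sent here.
req = build_request(
    "https://mydatalakestore.azuredatalakestore.net/webhdfs/v1/?op=LISTSTATUS",
    "PLACEHOLDER_TOKEN",
)
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would issue the call; any other required headers listed in the common parameters reference would need to be added in the same `headers` dictionary.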
WebHDFS compliant APIs for Data Lake Store
The table below lists the WebHDFS APIs that can be used with Data Lake Store. Where applicable, the table also lists deviations from the standard WebHDFS APIs, such as parameters that are not supported or that are supported differently.
Note
Data Lake Store currently supports WebHDFS version 2.7.2.
WebHDFS API with Data Lake Store | Request/Response | Important considerations |
---|---|---|
CREATE | See here | The following request parameters are not supported: - blocksize - This is fixed at 256MB and cannot be changed. - replication - This is handled internally by Data Lake Store. Even if you provide this parameter, it is ignored and no error is returned. - buffersize - This is fixed at 4MB and cannot be changed. |
APPEND | See here | The following request parameters are not supported: - buffersize - This is fixed at 4MB and cannot be changed. |
CONCAT | See here | - |
OPEN | See here | The following request parameters are not supported: - buffersize - This is fixed at 4MB and cannot be changed. |
MKDIRS | See here | - |
RENAME | See here | - |
DELETE | See here | - |
GETFILESTATUS | See here | The following response parameters are supported differently: - type - SYMLINK is not supported, so it is never returned; only FILE and DIRECTORY are returned. |
LISTSTATUS | See here | - |
GETCONTENTSUMMARY | See here | The following response parameters are not supported: - quota - Data Lake Store does not return quota. - spaceQuota - Data Lake Store does not return spaceQuota. |
SETPERMISSION | See here | - |
SETOWNER | See here | - |
MODIFYACLENTRIES | See here | - |
REMOVEACLENTRIES | See here | - |
SETACL | See here | - |
GETACLSTATUS | See here | - |
CHECKACCESS | See here | - |
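
A client ported from standard WebHDFS may still pass some of the request parameters listed as not supported in the table. One possible approach, sketched below, is to strip them on the client side before building the URL. The table only states that `replication` is silently ignored; how the service treats the other unsupported parameters is not described here, so dropping them is a conservative assumption. The names `UNSUPPORTED_REQUEST_PARAMS` and `split_params` are hypothetical.

```python
# Request parameters the table lists as not supported, keyed by operation.
UNSUPPORTED_REQUEST_PARAMS = {
    "CREATE": {"blocksize", "replication", "buffersize"},
    "APPEND": {"buffersize"},
    "OPEN": {"buffersize"},
}


def split_params(op, params):
    """Split query parameters into (kept, dropped) for an operation.

    Parameters the table marks as not supported for `op` end up in
    `dropped`; everything else is kept unchanged.
    """
    unsupported = UNSUPPORTED_REQUEST_PARAMS.get(op.upper(), set())
    kept = {k: v for k, v in params.items() if k.lower() not in unsupported}
    dropped = {k: v for k, v in params.items() if k.lower() in unsupported}
    return kept, dropped


kept, dropped = split_params(
    "CREATE", {"overwrite": "true", "blocksize": "1048576", "replication": "3"}
)
print("kept:", kept)
print("dropped:", dropped)
```

Returning the dropped parameters separately lets the caller log them, which can help when tracking down behavior differences after migrating from an HDFS cluster.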