question

DavidZanter-2017 asked PradeepKommaraju-MSFT answered

Azure Kubernetes Cluster: Errno 122 when attempting to concurrently file lock the same file in AzureFiles from >= 100 nodes

I am seeing the following issue on an AKS cluster with more than 100 nodes: when every node attempts to file-lock the same file on a shared Azure Files mount, the 100th node to request the lock is returned errno 122 (EDQUOT, "Disk quota exceeded"). (I am doing this so that a 100-node computation platform can parse a dataset in parallel.)

   volumes:
   - azureFile:
       readOnly: false
       secretName: azure-secret
       shareName: aksshare 
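
For context, a minimal pod spec that mounts this volume might look like the following (the pod, container, and mount names are illustrative, not from my actual deployment):

    apiVersion: v1
    kind: Pod
    metadata:
      name: lock-test            # illustrative name
    spec:
      containers:
      - name: worker             # illustrative name
        image: ubuntu
        volumeMounts:
        - name: azure
          mountPath: /mnt/aksshare
      volumes:
      - name: azure
        azureFile:
          readOnly: false
          secretName: azure-secret
          shareName: aksshare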

This always happens on exactly the 100th node to request the concurrent file lock, and I have never seen it with fewer than 100 nodes, so I am assuming there is some hard limit on the number of concurrent locks allowed.

Specifically, I was curious whether anyone has seen this, and whether there is a configuration setting that could be increased to allow more concurrent locks.

To simplify the scenario, I wrote a simple C program (shown below) that reproduces the problem.

Other data points:

* I tried having 100 processes on a single Kubernetes node lock the same file on an Azure Files mount, and that worked.
* I also had 100 separate Kubernetes nodes each lock a hostPath file, and that worked. (I would expect that to work, since those are different files on each host, but it was a sanity check.)



Simple repro program:

 #include <stdio.h>
 #include <errno.h>
 #include <fcntl.h>
 #include <unistd.h>

 int main(int argc, char *argv[])
 {
     /* Zero-initialized flock: l_whence/l_start/l_len of 0 locks the whole file. */
     struct flock fltest = {0};
     fltest.l_type = F_RDLCK;
     int fd = open(argv[1], O_RDONLY, 0);
     printf("opened file: %s, fd:%d errno:%d\n", argv[1], fd, errno);
     int irc = fcntl(fd, F_SETLK, &fltest);
     printf("locked file: %s, %d errno %d\n", argv[1], irc, errno);
     if (irc == 0)
     {
         printf("Waiting 30 seconds while holding lock.\n");
         sleep(30);
     }
     fltest.l_type = F_UNLCK;
     irc = fcntl(fd, F_SETLK, &fltest);
     printf("Lock released irc:%d errno %d\n", irc, errno);
     return 0;
 }


azure-kubernetes-service · azure-files

1 Answer

PradeepKommaraju-MSFT answered

Hi @DavidZanter-2017

Thank you for reaching out to the Microsoft Q&A forum.

It looks like you are hitting the maximum number of concurrent requests for Azure Files.
I reviewed several other similar cases as well, and unfortunately these are all hard limits; there is no way to increase them.

I hope you can find another cloud solution for your use case.

Thanks & Regards,
Pradeep



Please don't forget to accept the answer if this resolves your question.

