Question

NiclasWeh-1159 asked:

How to handle more than 100,000 items?

I have an Azure Function which returns a JSON array. This array can contain 50 items or 500,000.

As I just realised, a foreach loop can only handle up to 100,000 items. Is there a way I could split my JSON array up at specific points (like ID = 99,999, ID = 199,999, ...) and handle each "block" with a separate foreach?

azure-logic-apps

1 Answer

SamaraSoucy-MSFT answered:

If you can split it in the Function, then that is going to be much easier.
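
If the Function pre-chunks its output, the response body could simply be an array of arrays, so each inner array stays under the foreach limit. A minimal sketch of that shape (the item structure here is just an assumption for illustration):

 [
     [ { "ID": 1 }, { "ID": 2 } ],
     [ { "ID": 3 }, { "ID": 4 } ]
 ]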

If you need to do it inside the logic app, I would create a few variables: a working array to hold the chunk you are currently working on, an integer set to your chunk size (whether you want to set this to 100k or something a bit smaller), and an array to temporarily hold the remaining items, since you can't reference a variable in the value you are setting it to.

  1. I'd nest my foreach (or use a child logic app) inside an Until loop, which runs until the data array length is 0.

  2. Use the take() function to load the current chunk into the working array.

  3. Call the foreach or child app with that chunk as the input.

  4. Use the skip() function to set the temp array to the remaining data, minus the items just processed. This is necessary because you can't set a variable by referencing it in its own value parameter (see the expression snippet after this list).

  5. Load the temp data back into the main array.
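
Steps 2 and 4 boil down to two SetVariable actions; here they are, extracted from the full definition below for emphasis (runAfter dependencies omitted):

 "Set_Working_Array": {
     "inputs": {
         "name": "workingArray",
         "value": "@take(variables('testArray'), variables('chunkSize'))"
     },
     "type": "SetVariable"
 },
 "Set_Temp_Array": {
     "inputs": {
         "name": "tempArray",
         "value": "@skip(variables('testArray'), variables('chunkSize'))"
     },
     "type": "SetVariable"
 }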

This is what it looks like at a high level:
(screenshot: the chunking workflow in the Logic Apps designer)

I used an extra array variable in my example to hold the dummy data instead of passing it in the request, but here is the JSON definition for the whole workflow:

 {
     "definition": {
         "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
         "actions": {
             "Initialize_chunkSize": {
                 "inputs": {
                     "variables": [
                         {
                             "name": "chunkSize",
                             "type": "integer",
                             "value": 2
                         }
                     ]
                 },
                 "runAfter": {
                     "Initialize_tempArray": [
                         "Succeeded"
                     ]
                 },
                 "type": "InitializeVariable"
             },
             "Initialize_tempArray": {
                 "inputs": {
                     "variables": [
                         {
                             "name": "tempArray",
                             "type": "array"
                         }
                     ]
                 },
                 "runAfter": {
                     "Initialize_workingArray": [
                         "Succeeded"
                     ]
                 },
                 "type": "InitializeVariable"
             },
             "Initialize_testArray": {
                 "inputs": {
                     "variables": [
                         {
                             "name": "testArray",
                             "type": "array",
                             "value": [
                                 1,
                                 2,
                                 3,
                                 4,
                                 5,
                                 6,
                                 7,
                                 8,
                                 9,
                                 10,
                                 11
                             ]
                         }
                     ]
                 },
                 "runAfter": {},
                 "type": "InitializeVariable"
             },
             "Initialize_workingArray": {
                 "inputs": {
                     "variables": [
                         {
                             "name": "workingArray",
                             "type": "array"
                         }
                     ]
                 },
                 "runAfter": {
                     "Initialize_testArray": [
                         "Succeeded"
                     ]
                 },
                 "type": "InitializeVariable"
             },
             "Until": {
                 "actions": {
                     "For_each": {
                         "actions": {
                             "Compose": {
                                 "inputs": "@items('For_each')",
                                 "runAfter": {},
                                 "type": "Compose"
                             }
                         },
                         "foreach": "@variables('workingArray')",
                         "runAfter": {
                             "Set_Working_Array": [
                                 "Succeeded"
                             ]
                         },
                         "type": "Foreach"
                     },
                     "Remove_items_from_main_array": {
                         "inputs": {
                             "name": "testArray",
                             "value": "@variables('tempArray')"
                         },
                         "runAfter": {
                             "Set_Temp_Array": [
                                 "Succeeded"
                             ]
                         },
                         "type": "SetVariable"
                     },
                     "Set_Temp_Array": {
                         "inputs": {
                             "name": "tempArray",
                             "value": "@skip(variables('testArray'),variables('chunkSize'))"
                         },
                         "runAfter": {
                             "For_each": [
                                 "Succeeded"
                             ]
                         },
                         "type": "SetVariable"
                     },
                     "Set_Working_Array": {
                         "inputs": {
                             "name": "workingArray",
                             "value": "@take(variables('testArray'), variables('chunkSize'))"
                         },
                         "runAfter": {},
                         "type": "SetVariable"
                     }
                 },
                 "expression": "@equals(length(variables('testArray')), 0)",
                 "limit": {
                     "count": 60,
                     "timeout": "PT1H"
                 },
                 "runAfter": {
                     "Initialize_chunkSize": [
                         "Succeeded"
                     ]
                 },
                 "type": "Until"
             }
         },
         "contentVersion": "1.0.0.0",
         "outputs": {},
         "parameters": {},
         "triggers": {
             "manual": {
                 "inputs": {
                     "schema": {}
                 },
                 "kind": "Http",
                 "type": "Request"
             }
         }
     },
     "parameters": {}
 }

NiclasWeh-1159 commented:

Thanks for that answer!

As you said, splitting it in the function would be much easier. Splitting it is not the problem for me; I could work that out. But how would I return, say, 5 JSON arrays with 100k items each from a single POST to my function? The request returns only one body, and Logic Apps can't split the returned body up at specific points, can it?

SamaraSoucy-MSFT replied:

There is a "splitOn" property that you can set up in many triggers, like the HTTP trigger, that acts as a forEach on the initial input. It does share the same 100k limit, so it doesn't solve the initial problem. The other option would be to use a nested forEach to unwrap the arrays.
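
A minimal sketch of the nested forEach approach in the workflow definition, assuming the trigger body is an array of arrays (the action names here are illustrative):

 "For_each_chunk": {
     "foreach": "@triggerBody()",
     "actions": {
         "For_each_item": {
             "foreach": "@items('For_each_chunk')",
             "actions": {
                 "Compose": {
                     "inputs": "@items('For_each_item')",
                     "runAfter": {},
                     "type": "Compose"
                 }
             },
             "runAfter": {},
             "type": "Foreach"
         }
     },
     "runAfter": {},
     "type": "Foreach"
 }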

The splitOn setting isn't currently displayed by default in the HTTP trigger settings; I did need to go into code view to add it. You can read more about the splitOn property in the docs: https://docs.microsoft.com/en-us/azure/logic-apps/logic-apps-workflow-actions-triggers#split-on-debatch
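
In code view, adding it to the Request trigger from the example above looks something like this (a sketch; "splitOn": "@triggerBody()" assumes the posted body is the array itself):

 "triggers": {
     "manual": {
         "inputs": {
             "schema": {}
         },
         "kind": "Http",
         "splitOn": "@triggerBody()",
         "type": "Request"
     }
 }

Each top-level item in the posted array then starts its own run of the workflow rather than being iterated by a foreach action.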
