API (JSON) to Parquet via DataFactory

Ryan Abbey 1,171 Reputation points
2021-10-21T21:59:59.103+00:00

I'm trying to investigate options that will allow us to take the response from an API call (ideally in JSON but possibly XML) through the Copy Activity in to a parquet output.. the biggest issue I have is that the JSON is hierarchical so I need it to be able to flatten the JSON

Initially, I've been playing with the JSON directly to see if I can get what I want out of the Copy Activity with intent to pass in a Mapping configuration to meet the file expectations (I've uploaded the Copy activity pipe and sample json, not sure if anything else is required for play)

On initial configuration, the below is the mapping that it gives me... of particular note is the hierarchy for "vehicles" (level 1) and (although not displayed because I can't make the screen small enough) "fleets" (level 2 - i.e. attribute of vehicle)
142579-image.png

The output when run is giving me a single row but my data has 2 vehicles with 1 of those vehicles having 2 fleets.. I have set the Collection Reference to "Fleets" as I want this lower layer (and have tried "[0]", "[*]", "") without it making a difference to output (only ever first row), what should I be setting here to say "all rows"? (If I do the collection reference to "Vehicles" I get two rows (with first Fleet object selected in each) but it must be possible to delve to lower hierarchies if its giving the selection option?? Or is this for multiple level 1 hierarchies only? Is it possible to get to level 2?

142583-inputjson.txt142559-pipe.txt

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,624 questions
{count} vote

Accepted answer
  1. ShaikMaheer-MSFT 37,896 Reputation points Microsoft Employee
    2021-10-23T05:58:58.337+00:00

    Hi @Ryan Abbey ,

    Thank you for posting query on Microsoft Q&A Platform.

    Yes, Its limitation in Copy activity. Copy activity will not able to flatten if you have nested arrays. A workaround for this will be using Flatten transformation in data flows.

    Similar example with nested arrays discussed here. Please check it. I tried flatten transformation on your sample json. Its working fine. I was able to flatten.
    143121-image.png

    Hope this will help. Please let us know if any further queries.

    ------------------------------

    • Please don't forget to click on 130616-image.png or upvote 130671-image.png button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification
    • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators
    4 people found this answer helpful.

0 additional answers

Sort by: Most helpful