question

PaliosJason-9536 asked ·

Malformed records are detected in schema inference.

Hello, I am trying to ingest data from an API. I want to use a data flow to transform
the data and store it in a SQL database.

I successfully copied the JSON response from the API into a storage account (Data Lake Gen2).

However, I cannot get any preview of the data when I map it as a source in the data flow.


The error occurs when I try to preview the data as a source in the data flow:
at Source 'RawJsonResponse': org.apache.spark.SparkException: Job aborted due to stage failure ...
org.apache.spark.SparkException: Malformed records are detected in schema inference. Parse Mode: FAILFAST.

The rest of the trace does not seem to contain any other useful information.


I have simplified the content to test whether the JSON is indeed malformed, but I can't see anything wrong with it.

Contents of the simplified JSON:

[
{"":"389cbb431e39693f8b4e116048461617","id":"9b9d5b64f1b26a76372210a4ae560c6e"}
,{"":"389cbb431e39693f8b4e116048461617","id":"4d99ba3182996825ed9bc227534bfe9d"}
]
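One way to test whether the file itself is malformed, independently of Data Factory, is to run it through a strict JSON parser locally. The following is only a minimal sketch in Python; the function name `check_json_text` and the sample strings are hypothetical and not part of the original post. It relies on the standard `json` module, which by default rejects raw control characters (such as a literal line break) inside a string or key.

```python
import json


def check_json_text(text):
    """Return (ok, message) for a JSON document held in a string.

    Hypothetical helper for illustration; it only reports whether
    Python's strict parser accepts the text.
    """
    try:
        # json.loads is strict by default: a raw newline or tab inside
        # a JSON string or key is reported as an error.
        json.loads(text)
    except json.JSONDecodeError as exc:
        return False, f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return True, "parsed without errors"


# Sample inputs (assumed, modelled on the simplified file in the question).
good = '[{"_": "a", "id": "1"}, {"_": "a", "id": "2"}]'
# The "\n" below becomes a literal line break inside the second key.
bad = '[{"_": "a", "id": "1"}, {"\n": "a", "id": "2"}]'

print(check_json_text(good))
print(check_json_text(bad))
```

If the strict parser accepts the downloaded file, the content is probably not the problem and the source configuration is the next thing to look at.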

In the source settings, I have schema drift enabled.

In the source options, I have selected Array of documents as the Document form. (I have tried the other forms with no success.)

What am I doing wrong?

azure-data-factory

When I took your sample JSON, set the blank key to "name" and used "array of documents" in source options, it worked.




1 Vote 1 ·

Are there perhaps some hidden prerequisites in the storage account before I can load JSON files?

0 Votes 0 ·

This is odd. The key is not blank; it is an underscore (_). I see it is not visible in the post, but even if I change the key, I get the same error.



0 Votes 0 ·

This will probably need a ticket with Azure support so that engineering can look at it, especially since you do not get results after entering a key, whereas I do when I try it.

1 Vote 1 ·

@PaliosJason-9536, as per @MarkKromer-2402's suggestion, I am willing to provide you with a one-time free support ticket if you do not have a support plan. Please let me know if you need this and I will send the details privately.

1 Vote 1 ·

That will not be necessary; I have opened an Azure Support ticket, as my organization does have a support plan.
Thanks for the help.

1 Vote 1 ·
JoshA-7845 answered ·

I can reproduce this with the simplest JSON file. Is there an answer?
It seems Data Factory cannot parse JSON very well.


Kiran-MSFT answered ·

Make sure you choose the right documentForm (singleDocument or arrayOfDocuments in most cases). Most parsing errors occur when this setting does not match the actual layout of the file.
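To see which form a given file actually has before picking the setting, a quick local check can help. This is a rough sketch in Python, not anything Data Factory provides: the function `guess_document_form` is hypothetical, and the third label, `documentPerLine`, is assumed here to be the name of the remaining option (one JSON document per line), since only the first two names appear in this thread.

```python
import json


def guess_document_form(text):
    """Guess which document form a JSON file's text matches.

    Hypothetical helper: returns "arrayOfDocuments", "singleDocument",
    "documentPerLine" (assumed option name) or "unparseable".
    """
    text = text.strip()
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        pass  # not a single JSON value; try line-by-line below
    else:
        # The whole file is one JSON value: a top-level list is an array
        # of documents, anything else is treated as a single document.
        return "arrayOfDocuments" if isinstance(value, list) else "singleDocument"

    # Check whether every non-empty line is a JSON document on its own.
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return "unparseable"
    try:
        for line in lines:
            json.loads(line)
    except json.JSONDecodeError:
        return "unparseable"
    return "documentPerLine"


print(guess_document_form('[{"id": 1}, {"id": 2}]'))
print(guess_document_form('{"id": 1}\n{"id": 2}\n'))
```

If the result is "unparseable", the file is not valid JSON in any of these layouts, so changing the document form alone is unlikely to help.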
