Malformed records are detected in schema inference.

Question

Hello, I am trying to ingest data from an API, and I want to use Data Flow to transform
the data and store it in a SQL database.

I successfully copied the JSON response from the API to a storage account (Data Lake Gen2).

However, I cannot preview the data when I use it as a source in Data Flow.

The preview fails with this error:
at Source 'RawJsonResponse': org.apache.spark.SparkException: Job aborted due to stage failure ...
org.apache.spark.SparkException: Malformed records are detected in schema inference. Parse Mode: FAILFAST.

The rest of the trace does not seem to contain any other useful information.

I have simplified the content to test whether the JSON itself is malformed, but I can't see anything wrong with it.

Contents of the simplified JSON:

[
{"":"389cbb431e39693f8b4e116048461617","id":"9b9d5b64f1b26a76372210a4ae560c6e"}
,{"":"389cbb431e39693f8b4e116048461617","id":"4d99ba3182996825ed9bc227534bfe9d"}
]
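One way to double-check that the simplified file is syntactically valid is to run it through a strict JSON parser outside Data Flow. The sketch below uses Python's standard `json` module on the contents shown above; the helper name `describe_json` is hypothetical and only there for illustration.

```python
import json

# The simplified file contents from the question, copied verbatim.
raw = '''[
{"":"389cbb431e39693f8b4e116048461617","id":"9b9d5b64f1b26a76372210a4ae560c6e"}
,{"":"389cbb431e39693f8b4e116048461617","id":"4d99ba3182996825ed9bc227534bfe9d"}
]'''


def describe_json(text):
    """Parse text with a strict JSON parser and summarize its top-level shape.

    Hypothetical helper for illustration; raises json.JSONDecodeError
    if the text is not valid JSON.
    """
    data = json.loads(text)
    # Treat a single top-level object as a one-record list.
    records = data if isinstance(data, list) else [data]
    # Collect every key used by the object records.
    keys = sorted({k for r in records if isinstance(r, dict) for k in r})
    return {"top_level": type(data).__name__, "records": len(records), "keys": keys}


summary = describe_json(raw)
print(summary)
```

If this runs without raising, the text is valid JSON as far as a standard parser is concerned. That only rules out a syntax error; it does not show how Data Flow's schema inference treats the file. The summary also makes the empty-string key visible, which may be worth a separate test, although nothing in this thread confirms whether it matters.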

In the source settings, I have schema drift enabled.

In the source options, I have selected Array of documents as the Document form. (I have tried the other forms with no success.)

What am I doing wrong?

Answer

I can reproduce this with the simplest JSON file. Is there an answer?
It seems Data Factory cannot parse JSON very well.

Answer

Make sure you choose the right documentForm for the file (singleDocument or arrayOfDocuments in most cases). Most parsing errors of this kind come from a source configuration that does not match the file.
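To see which document form a file actually matches before setting it on the source, a rough local check can help. The sketch below is a heuristic written for this thread, not Data Factory's own logic. The function name `guess_document_form` is hypothetical, and `documentPerLine` is assumed here as the name of the third form (one JSON object per line), which this answer does not mention.

```python
import json


def guess_document_form(text):
    """Guess which document form a JSON text matches.

    Returns 'arrayOfDocuments', 'singleDocument', 'documentPerLine'
    (assumed name for one JSON object per line), or None if nothing fits.
    Heuristic only; hypothetical helper for illustration.
    """
    stripped = text.strip()

    # Case 1: the whole text is one JSON value.
    try:
        data = json.loads(stripped)
    except json.JSONDecodeError:
        data = None
    if isinstance(data, list):
        return "arrayOfDocuments"
    if isinstance(data, dict):
        return "singleDocument"

    # Case 2: every non-empty line is its own JSON object.
    lines = [ln for ln in stripped.splitlines() if ln.strip()]
    try:
        if lines and all(isinstance(json.loads(ln), dict) for ln in lines):
            return "documentPerLine"
    except json.JSONDecodeError:
        pass
    return None


print(guess_document_form('[{"id": 1}, {"id": 2}]'))
print(guess_document_form('{"id": 1}\n{"id": 2}'))
```

A file that starts with `[` and parses as a list corresponds to the array form, so for the file in the question this check agrees with the Array of documents setting that was already selected.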
