
MrtenLindblad-7214 asked EbramTawfik-8746 edited

Azure Synapse: Error handling external file

We run Azure Synapse Serverless on top of Time Series Insights data that is stored as Parquet.
It works well, except that the Time Series Insights service appends to Parquet files for up to 10 minutes at a time.
During those windows, we get an error when querying the data:

 Msg 15813, Level 16, State 1, Line 1 
 Error handling external file: 'Invalid: Parquet magic bytes not found in footer. Either the file is corrupted or this is not a parquet file.'. File/External table name: 'dbo.foo'.

With OPENROWSET:

 Msg 15813, Level 16, State 1, Line 1
 Error handling external file: 'Invalid: Parquet magic bytes not found in footer. Either the file is corrupted or this is not a parquet file.'. File/External table name: 'https://foo.dfs.core.windows.net/env-foo/V=1/PT=Time/Y=2021/M=04/foo.parquet'.

Is there a way to ignore those files gracefully in Synapse?
Thanks.

azure-synapse-analytics

Hello @MrtenLindblad-7214 and welcome to Microsoft Q&A.

Am I correct in understanding that Time Series Insights writes to Parquet, and Synapse reads the Parquet and writes to SQL? Or is it the other way around?


@MrtenLindblad-7214 are you still facing the issue? If so, please provide more details; otherwise, let us know what your resolution was.


1 Answer

EbramTawfik-8746 answered EbramTawfik-8746 edited

@MartinJaffer-MSFT I am getting the same error; however, the file is not corrupted:


 Msg 15813, Level 16, State 1, Line 1
 Error handling external file: 'Invalid metadata in parquet file. Number of rows in metadata does not match actual number of rows in parquet file.'. File/External table name:




I was able to read the file without problems using parquet-dotnet (https://github.com/aloneguid/parquet-dotnet/tree/master). Here is the code:

 // Requires: using System; using System.IO; using System.Linq; using System.Text;
 //           using System.Windows; using Parquet; using Parquet.Data;
 private void btn_read_Click(object sender, RoutedEventArgs e)
 {
     txt_data.Text = "";
     // collect the output in a StringBuilder instead of appending to the text box for every value
     var output = new StringBuilder();

     using (Stream fileStream = File.OpenRead(txt_fileName.Text))
     {
         // open parquet file reader
         using (var parquetReader = new ParquetReader(fileStream))
         {
             // get file schema (available straight after opening parquet reader)
             // however, get only data fields as only they contain data values
             DataField[] dataFields = parquetReader.Schema.GetDataFields();

             // enumerate through row groups in this file
             for (int i = 0; i < parquetReader.RowGroupCount; i++)
             {
                 // create row group reader
                 using (ParquetRowGroupReader groupReader = parquetReader.OpenRowGroupReader(i))
                 {
                     // read all columns inside each row group (you have the option to read
                     // only the required columns if you need to)
                     DataColumn[] columns = dataFields.Select(groupReader.ReadColumn).ToArray();

                     for (int j = 0; j < columns.Length; j++)
                     {
                         output.Append(dataFields[j].Name).Append(": \n");
                         // .Data member contains a typed array of column data you can cast to the type of the column
                         Array data = columns[j].Data;
                         foreach (var item in data)
                         {
                             // nullable columns can contain null entries, so guard the ToString() call
                             output.Append(item?.ToString()).Append("\n");
                         }
                         output.Append("\n\n\n\n");
                     }
                 }
             }
         }
     }

     txt_data.Text = output.ToString();
 }
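
The "magic bytes not found in footer" message quoted in the question refers to the 4-byte ASCII marker `PAR1` that a finished Parquet file carries at both its start and its end. The following is only a minimal sketch of a client-side check for that trailing marker, not something Synapse offers: the class and method names are hypothetical, the in-memory streams stand in for real files, and it does not by itself make a query skip a file.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class ParquetFooterCheck
{
    // A finished Parquet file ends with the 4-byte ASCII marker "PAR1".
    private static readonly byte[] Magic = Encoding.ASCII.GetBytes("PAR1");

    // Hypothetical helper: returns true when the stream ends with the marker.
    public static bool HasFooterMagic(Stream stream)
    {
        // A non-seekable or too-short stream cannot be checked this way.
        if (!stream.CanSeek || stream.Length < Magic.Length)
            return false;

        var tail = new byte[Magic.Length];
        stream.Seek(-Magic.Length, SeekOrigin.End);

        // Stream.Read may return fewer bytes than requested, so loop until the buffer is full.
        int read = 0;
        while (read < tail.Length)
        {
            int n = stream.Read(tail, read, tail.Length - read);
            if (n == 0)
                return false;
            read += n;
        }

        return tail.SequenceEqual(Magic);
    }

    public static void Main()
    {
        // Stand-ins for a finished file and for one that is still being written.
        var complete = new MemoryStream(Encoding.ASCII.GetBytes("PAR1...payload...PAR1"));
        var partial = new MemoryStream(Encoding.ASCII.GetBytes("PAR1...payload"));

        Console.WriteLine(HasFooterMagic(complete)); // True
        Console.WriteLine(HasFooterMagic(partial));  // False
    }
}
```

A passing check only shows that the trailing marker is present; it says nothing about whether the row counts in the metadata match the data, which is the second error reported above.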