Hi,
I have an delta lake table in ADLS, if I create view with delta format, I only see the last version which has 4 records; however if I create the external table (even with delta format), I see all the records from all the versions.
Hi,
I have an delta lake table in ADLS, if I create view with delta format, I only see the last version which has 4 records; however if I create the external table (even with delta format), I see all the records from all the versions.
Could you share a little bit more about how you are creating the tables? If I created the CETAS table with the delta format I got the expected result- only the most recent data. If I created it from with parquet format then I did get all versions of the data.
This is my test script:
CREATE EXTERNAL TABLE dbo.deltaCetasTest
WITH (
LOCATION = 'deltaCetasTest/',
DATA_SOURCE = <source>,
FILE_FORMAT = snappy
)
AS
SELECT *
FROM OPENROWSET(BULK 'abfss://<container>@<account>.dfs.core.windows.net/<path>',
FORMAT = 'delta') AS f;
SELECT * FROM dbo.deltaCetasTest;
Thanks for checking on this, I have sent you email about my subscription id to you by email.
can you please let me know what is the proper way to create external table for delta lake? Thanks!
II'm not seeing anything wrong with your code, and using it I'm still having trouble replicating your issue. I've sent you a private message to gather some information so I can set you up with a one-time free support request. I appreciate your patience while working through this problem.
I use the wizard from azure as below,
IF NOT EXISTS (SELECT * FROM sys.external_file_formats WHERE name = 'SynapseParquetFormat')
CREATE EXTERNAL FILE FORMAT [SynapseParquetFormat]
WITH ( FORMAT_TYPE = DELTA)
GO
IF NOT EXISTS (SELECT * FROM sys.external_data_sources WHERE name = 'lake_vyadls_dfs_core_windows_net')
CREATE EXTERNAL DATA SOURCE [lake_vyadls_dfs_core_windows_net]
WITH (
LOCATION = 'https://vyadls.dfs.core.windows.net/lake',
)
Go
CREATE EXTERNAL TABLE lob.customer (
[id] smallint,
[name] varchar(8000),
[etltime] datetime2(7),
[city] varchar(8000)
)
WITH (
LOCATION = 'customer',
DATA_SOURCE = [lake_vyadls_dfs_core_windows_net],
FILE_FORMAT = [SynapseParquetFormat]
)
GO
7 people are following this question.