Insert/Load values into Single newly added column in already loaded table/history data

Question

Hello, I have table called sales having fields Invoice Number,OrderCategory,Item.Its a incremental load table. Everyday day-1 data from source gets added to this table.

Currently I had to add InvoiceDate column to table.

Now I need to load this column data for all the previous record as well.
There is no single column with unique value, can say unique combination would be Invoice Number,OrderCategory,Item

Please suggest best possible way to load invoice date column.

Answer

Good day,

The simplest option, is first to add the new column and next you have few options to load the data according to many parameters based on the source and target of the data.

Please suggest best possible way to load invoice date column.

1) Load from where?!? What is the source of the data which you want to load?

The answer is very different if the source of the is another existing table or if the data needed to be loaded from CSV file or from Azure Blob, or from excel or from...

Each source of data has different options for ETL actions which can be used for the loading of the data, and we have no information

2) What is the database structure?!? How many rows do you have? What is the type of the new column

The recommended procedure is totally different if you have clustered index , or if you have constraints, or if you have to do some ETL on the source data, or you have 100 or 1 non-clustered index, or if you have in the table 10 rows or 10 trillion rows!

The DDL and the data directly impact the "best" practice for the question, and we have no information

Please try to provide more information regarding the source and target of the data (Do not edit the original question's message which once it had responds - instead add the information as comment to this message so we will be able to discuss your case)

In general here are some common option you can use (but we cannot know which is best based on the information we have). Note that in all cases this is after you added the new column and you have to load the data to the empty column

1) Using direct UPDATE from SELECT queries (fit if the source data is in another table for example)

UPDATE T1
SET T1.C1= T2.C1
FROM Source_Table1 AS T2 -- query can be much more complex here including JOINs from few source tables
WHERE T1.C2= 'Ronen'

Note: One of the most common solution is to load the data to another Heaps table and next use simple query to update the original table.

2) Using OPENROWSET to get the source data and using simple UPDATE queries as above

3) Using bacpac file

4) Using SQL Server Integration Services (SSIS) or Azure-SSIS

5) Using bcb

6) Using Azure data sync

and so on....

Please provide more information so we will be able to chose a way that fits you scenario

Highly advance topic and probably OFF_Topic for your needs! SQL Internals: In some cases SQL Server can add a column in place of a columns which was deleted before. There are multiple parameters that impact this scenario.

Note: I presented a full lecture on the topic of "SQL Server tables physical structure" few times. You might be able to search Google for one of these event which were recorded. There are versions in Hebrew and in English. In these lectures I demonstrate these cases and explain when a new column can come in the place of an old deleted column. I also have some posts on the topic in my blog.

Why this is important?!? Since behind the scenes if the server need to add column and he cannot use a space of a column which was deleted before then in practical the server will probably need to create all the columns from scratch (all the rows actually) and move all the data from the old location on the disk to the new location - this can be huge performance issue!

Understanding the internals of the Data File and checking the physical structure of the table can give us the information of best solution and expected impact.

Unfortunately this information you can get ONLY using un-documented entities like system_internals_partitions and sys.system_internals_partition_columns and more...

Insert/Load values into Single newly added column in already loaded table/history data

1 answer