question

JohnStraumann avatar image
0 Votes"
JohnStraumann asked JohnStraumann commented

SQL Stored Procedure?

Hello all:

I am working on a project and need to create 100000000 records and they have a small data footprint, just 4 columns, 3 uniqueidentifiers, 1 int. 2 of the identifiers need to be read from other tables, and then used in the other table. So for example

TableA has a column taid uniqueidentifier (about 20000 records)
TableB has a column tbid uniqueidentifer (about 5000 records)

I need a row in TableC for every row in TableA and TableB, that will look like this:
taid tbid tcid int

Is it possible to a stored procedure to retrieve the ids from TableA and TableB, store in arrays or something, and then do loops to create TableC?

Thanks!

sql-server-generalsql-server-transact-sqlsql-server-analysis-services
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

ErlandSommarskog avatar image
0 Votes"
ErlandSommarskog answered JohnStraumann commented

If we overlook that this may be more than mouthful, you can do something like:

INSERT table(taid, tbid, tcid, int)
   SELECT a.taid, b.tbid, newid(), row_number() over(ORDER BY (SELECT NULL))
   FROM  tableA a
  CROSS JOIN tableB b

Your description is somewhat vague, so I had to extrapolate a little bit.

· 4
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Thanks for your response and I apologize for being vague, this is for a customer project and they are very secretive, so I am trying to give as much info as possible while keeping info confidential! In your example, "table" is the target table, and the values in brackets are the target table columns? How would I generate a uniqueidentifier for each new record created in table C? I found this example:

declare @guid uniqueidentifier;
set @guid = NEWID();
select @guid as MyGUID;

but how would I put that in a loop as each new record gets a new GUID?

I tried this:
INSERT dbo.TableC(CId, AId, BId,Status)
SELECT NEWID(), a.AId, b.BId, 0 over(ORDER BY (SELECT NULL))
FROM dbo.TableA a
CROSS JOIN dbo.TableB b

but I got an error, "The multi-part identifier "a.AId could not be bound" and "incorrect syntax near over".

Thanks!

0 Votes 0 ·

How would I generate a uniqueidentifier for each new record created in table C? I found this example:

You use the newid() function, which was my intention to show, but there was a typo or two, which I have corrected.

"The multi-part identifier "a.AId could not be bound"

That is because you have a comma where there should be a dot.

"incorrect syntax near over".

That is because substituted row_number in my example with 0, which makes little sense. row_number() will number each row, the the OVER clause specifies how it is to be numbered. ORDER BY (SELECT NULL) explicitly says that you don't care in which order the rows are numbered. I put that in, since I had no information about what values you wanted here.

0 Votes 0 ·

I should add that the challenge here is not to form the query - that is the easy part. The normal procedure in SQL is to work with all data at once in a single statement. However, with the volumes you have indicated, this may not end well here, but you will need to break it up in batches.

I did not want to get into those details, since you don't seem to master basic SQL very well, and batching is more advanced. Dan gave a simple approach to the problem.

In any case, at this stage you should focus on getting a working query that inserts all data in a single statement with a small data set.

1 Vote 1 ·
Show more comments
DanGuzman avatar image
0 Votes"
DanGuzman answered

Adding on to Erland's answer, if inserting 100M at once is too much for your environment to handle, you could run multiple inserts to batch the operation. For example, the WHERE clauses below would select/insert approximately 25M rows each, assuming the tcid uniqueidentifer distribution is random.

 WHERE 
     a.tcid < '00000000-0000-0000-0000-400000000000';
    
 WHERE 
     a.tcid >= '00000000-0000-0000-0000-400000000000'
     AND a.tcid < '00000000-0000-0000-0000-800000000000';
    
 WHERE 
     a.tcid >= '00000000-0000-0000-0000-800000000000'
     AND a.tcid < '00000000-0000-0000-0000-c00000000000';
    
 WHERE 
     a.tcid >= '00000000-0000-0000-0000-c00000000000';
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.