question

ahmedsalah-1628 avatar image
0 Votes"
ahmedsalah-1628 asked ·

How to write query generate unique Id for concate both columns search parts and company id ?

I work on sql server 2012 . I face issue I can't generate unique Id from merge or concave both columns

Search Parts and Company ID and generated Id must be on column Generated ID

to be easier and fast on search

so Please How to generate unique Id from both column Search Parts and Company ID
and if both column repeated value both get same Id
so please How to do that

AS Example

SearchParts CompanyId GeneratedId
A5ghf7598f7GGHYUTYA 3456 901
when concate both columns Search Parts and Company Id generate Id
and take same Id in case of repeated
but main idea generate number unique for both column search parts and company id

every search parts text and CompanyId must be unique and take unique Id on generated Id

 create table #partswithcompany
  (
  SearchParts  nvarchar(200),
  CompanyId  int,
  GeneratedId INT
  )
  insert into #partswithcompany (SearchParts,CompanyId,GeneratedId)
  values
  ('A5ghf7598fdmlcpghjk',1234,NULL),
  ('AKLJGSA7598fdmlcpghjk',5870,NULL),
  ('A5ghfJKKJGHHGghjk',9081818,NULL),
  ('KHJLFFS8fdmlcpghjk',123345,NULL),
  ('A5ghf7598f7GGHYUTYA',3456,NULL),
  ('A5ghfJKKJGHHGghjk',9081818,NULL),
   ('A5ghf7598f7GGHYUTYA',3456,NULL),
   ('A5ghf7598f7GGHYUTYA',3456,NULL)
    
 Expected Result
    
     SearchParts    CompanyId    GeneratedId
     A5ghf7598fdmlcpghjk    1234    5
     AKLJGSA7598fdmlcpghjk    5870    9
     A5ghfJKKJGHHGghjk    9081818    8
     KHJLFFS8fdmlcpghjk    123345    6
     A5ghf7598f7GGHYUTYA    3456    7
     A5ghfJKKJGHHGghjk    9081818    8
     A5ghf7598f7GGHYUTYA    3456    7
     A5ghf7598f7GGHYUTYA    3456    7


sql-server-generalsql-server-transact-sql
10 |1000 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

MelissaMa-msft avatar image
0 Votes"
MelissaMa-msft answered ·

Hi @ahmedsalah-1628

Based on Viorel-1's answer, you could create a trigger or procedure to update the GeneratedId if newly created data is inserted tomorrow.

Please refer below example and check whether it is helpful:

 --insert some new data
   insert into #partswithcompany (SearchParts,CompanyId,GeneratedId)
   values
   ('A5ghf7598fdmlcpghjk',1234,NULL),
   ('AKLJGSA75fukfgjpghjk',5870,NULL),
   ('A5ghfJKKJdfhgetrhsGghjk',818,NULL)
    
 --update GeneratedId with same search parts and company id repeated
 update a
 set a.GeneratedId=b.GeneratedId
 from #partswithcompany A 
 INNER JOIN #partswithcompany B ON A.SearchParts=B.SearchParts AND A.CompanyId = B.CompanyId
 WHERE a.GeneratedId IS NULL and b.GeneratedId is not null
    
 --update GeneratedId with different search parts and company id 
 ; with M as
  (
      select a.*, dense_rank() over (order by a.SearchParts, a.CompanyId) as n
      from #partswithcompany A 
     INNER JOIN #partswithcompany B ON A.SearchParts=B.SearchParts AND A.CompanyId = B.CompanyId
     WHERE a.GeneratedId IS NULL and b.GeneratedId is  null
  )
   update M
  set GeneratedId = (select max(GeneratedId)+n  from #partswithcompany where GeneratedId is not null)

 select * from #partswithcompany

Output:

 SearchParts    CompanyId    GeneratedId
 A5ghf7598fdmlcpghjk    1234    2
 AKLJGSA7598fdmlcpghjk    5870    4
 A5ghfJKKJGHHGghjk    9081818    3
 KHJLFFS8fdmlcpghjk    123345    5
 A5ghf7598f7GGHYUTYA    3456    1
 A5ghfJKKJGHHGghjk    9081818    3
 A5ghf7598f7GGHYUTYA    3456    1
 A5ghf7598f7GGHYUTYA    3456    1
 A5ghf7598fdmlcpghjk    1234    2
 AKLJGSA75fukfgjpghjk    5870    7
 A5ghfJKKJdfhgetrhsGghjk    818    6

Best regards
Melissa


If the answer is helpful, please click "Accept Answer" and upvote it.
Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

· Share
10 |1000 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Viorel-1 avatar image
0 Votes"
Viorel-1 answered ·

See if the next statement satisfies:

 ;
 with N as
 (
     select *, dense_rank() over (order by SearchParts, CompanyId) as n
     from #partswithcompany
 )
 update N
 set GeneratedId = n

If you need just a query, then use and adjust the select part of above statement.

· 1 · Share
10 |1000 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

I need generated Id from both column search parts and company id
calculation
meaning
when pass both text below
A5ghf7598f7GGHYUTYA 3456
generate id

when pass these two text again generate same number
also i need to do some thing not me to generate id run query
are there are any thing automtive generate id

suppose today i put on database
A5ghf7598f7GGHYUTYA 3456
tomorrow
A5ghf7598f7GGHYUTYA 3456
then both must get same id
so How to do that

0 Votes 0 ·
TomPhillips-1744 avatar image
0 Votes"
TomPhillips-1744 answered ·

The simplest thing is to just use an IDENTITY field.

 create table #partswithcompany
   (
   SearchParts  nvarchar(200),
   CompanyId  int,
   GeneratedId INT IDENTITY(1,1)
   )
   insert into #partswithcompany (SearchParts,CompanyId)
   values
   ('A5ghf7598fdmlcpghjk',1234),
   ('AKLJGSA7598fdmlcpghjk',5870),
   ('A5ghfJKKJGHHGghjk',9081818),
   ('KHJLFFS8fdmlcpghjk',123345),
   ('A5ghf7598f7GGHYUTYA',3456),
   ('A5ghfJKKJGHHGghjk',9081818),
    ('A5ghf7598f7GGHYUTYA',3456),
    ('A5ghf7598f7GGHYUTYA',3456)
    
 SELECT *
 FROM #partswithcompany
· 1 · Share
10 |1000 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

so if i have

both values two time as
('A5ghf7598f7GGHYUTYA',3456)
('A5ghf7598f7GGHYUTYA',3456)
so How to get same generated id two times
if i have same search parts and company id repeated

0 Votes 0 ·
ErlandSommarskog avatar image
0 Votes"
ErlandSommarskog answered ·

suppose today i put on database
A5ghf7598f7GGHYUTYA 3456
tomorrow
A5ghf7598f7GGHYUTYA 3456
then both must get same id
so How to do that

Viorel's solution is good for first round.

When you get new data, you first lookup if you have the existing pairs of SearchParts and CompanyID to get the ID. For entirely new rows, you use Viorel's query, adding the current max Id.

· Share
10 |1000 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Viorel-1 avatar image
1 Vote"
Viorel-1 answered ·

If you need this as presumptive method of increasing the performance of queries drastically (using a single index on GeneratedId), then maybe you can also find why the normal queries and indices (on SearchParts and CompanyId) do not give good results.

To make an non-unique integer from two columns and to implement a hash index you can also consider CHECKSUM or BINARY_CHECKSUM (https://docs.microsoft.com/en-us/sql/t-sql/functions/checksum-transact-sql):

 drop table if exists #partswithcompany
    
 create table #partswithcompany
 (
     SearchParts nvarchar(200),
     CompanyId  int,
     GeneratedId as checksum(SearchParts, CompanyId)
 )
    
 -- TODO: create indices
    
 insert into #partswithcompany (SearchParts, CompanyId) values
 ('A5ghf7598fdmlcpghjk',1234),
 ('AKLJGSA7598fdmlcpghjk',5870),
 ('A5ghfJKKJGHHGghjk',9081818),
 ('KHJLFFS8fdmlcpghjk',123345),
 ('A5ghf7598f7GGHYUTYA',3456),
 ('A5ghfJKKJGHHGghjk',9081818),
 ('A5ghf7598f7GGHYUTYA',3456),
 ('A5ghf7598f7GGHYUTYA',3456)
    
 select * from #partswithcompany
    
 -- sample search by Parts and Company
    
 declare @parts as nvarchar(200) = 'A5ghf7598f7GGHYUTYA'
 declare @company as int = 3456
    
 declare @generated_id int = checksum(@parts, @company)
    
 select *
 from #partswithcompany
 where GeneratedId = @generated_id
 and SearchParts = @parts
 and CompanyId = @company




· 5 · Share
10 |1000 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

but it sometimes generated number with minus
i need all number generated positive

0 Votes 0 ·
Viorel-1 avatar image Viorel-1 ahmedsalah-1628 ·

Try iif(checksum(...)=-2147483648, 0, abs(checksum(0))).

Or maybe checksum(...) & 0x7FFFFFFF.

But this probably reduces the benefits of hash indices; it is not clear why negative numbers are scary.

Or use bigint: cast(checksum(…) as bigint) + 2147483648.



0 Votes 0 ·

only last comment
using checksum is best for performance
or
using hashbyte

0 Votes 0 ·
Show more comments

Beware the checksum is likely to give duplicates, since 1) it is a fairly simple-minded XOR algorithm. 2) it is a 32-bit number.

I once did the calculation for the probability that 100000 random 32-bit number would include at least one duplicate. It is, as I recall, about 0.3. The likelihood that one million 32-bit random numbers would be unique is around 10E-50.

In another thread, Ahmed indicates that he has 40 million rows, so checksum seems like very bad idea for a unique number. checksum can be useful if you have a lot of columns, and you want a first sieve, and you are prepared to handle the duplicates.

The likelihood for collisions with hashbytes is a lot less and you can decide to ignore the risk. Then again, I don't see the point with adding a 64-byte hash for only two columns.

0 Votes 0 ·