question

vsslasd-3371 avatar image
0 Votes"
vsslasd-3371 asked vsslasd-3371 commented

TSQL Bell Curve Part 2

Hello, I asked a question a few weeks ago on retrieving data for TSQL Bell Curve, and very much appreciate the answer.

Here is the original question:

https://docs.microsoft.com/en-us/answers/questions/268142/prep-data-formula-for-a-bell-curve.html

Here was the response:


 declare @D table (ID varchar(7), ProjDate Date, ProjAmt numeric(19,5))
            
  insert Into @D(ID, ProjDate, ProjAmt) values
  ('C2147',  '2021-01-01', 6.60869565217391),
  ('C2147',  '2021-01-04', 6.60869565217391),
  ('C2147',  '2021-01-11', 6.60869565217391),
  ('C2147',  '2021-01-18', 6.60869565217391),
  ('C2147',  '2021-01-25', 6.60869565217391),
  ('C2147',  '2021-02-01', 6.60869565217391),
  ('C2147',  '2021-02-08', 6.60869565217391),
  ('C2147',  '2021-02-15', 6.60869565217391),
  ('C2147',  '2021-02-22', 6.60869565217391),
  ('C2147',  '2021-03-01', 6.60869565217391),
  ('C2147',  '2021-03-08', 6.60869565217391),
  ('C2147',  '2021-03-15', 6.60869565217391),
  ('C2147',  '2021-03-22', 6.60869565217391),
  ('C2147',  '2021-03-29', 6.60869565217391),
  ('C2147',  '2021-04-05', 6.60869565217391),
  ('C2147',  '2021-04-12', 6.60869565217391),
  ('C2147',  '2021-04-19', 6.60869565217391),
  ('C2147',  '2021-04-26', 6.60869565217391),
  ('C2147',  '2021-05-03', 6.60869565217391),
  ('C2147',  '2021-05-10', 6.60869565217391),
  ('C2147',  '2021-05-17', 6.60869565217391),
  ('C2147',  '2021-05-24', 6.60869565217391),
  ('C2147',  '2021-05-31', 6.60869565217391)
        
  ---    
        
  declare @mindate as date = (select min(ProjDate) from @d)
        
  ;
  with E1 as
  (
      select *
      from @D
      cross apply (values (datediff(day, @mindate, ProjDate))) t1(x)
  ),
  E2 as
  (
      select cast(min(x) as float) as a, cast(max(x) as float) as c
      from E1
  ),
  E3 as
  (
      select distinct E2.*, cast(percentile_disc( 0.75 ) within group (order by x) over () as float) as b
      from E1, E2
  ),
  E4 as
  (
      select *,
          power(x - a, alpha - 1 ) * power(c - x, beta - 1) / 
              (dbo.BetaApprox(alpha, beta) * power(c - a, alpha + beta - 1)) as val
      from E1, E3
      cross apply (values (1 + 4 * (b - a) / (c - a), 1 + 4 * (c - b) / (c - a))) t1(alpha, beta)
  )
  select ID, ProjDate, val * 100 / max(val) over (partition by null) as ProjAmt, replicate('*', val * 1000)
  from E4
  order by ProjDate
    
 --Helper Function
 create or alter function dbo.BetaApprox( @x float, @y float ) 
  returns float
  as begin
      return sqrt(2 * pi()) * power(@x, @x - 0.5) * power(@y, @y - 0.5) / 
          power(@x + @y, @x + @y - 0.5)
  end

My question is: What would be the appropriate parameter to make the density of the Bell Curve Sharper or more Broad?

How can I account for changing a variable to change the density of the bell curve, so that it is more steep or more gradual ?

Thank you so very much

sql-server-transact-sql
· 1
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Hi @vsslasd-3371

Could you please validate all the latest answers so far and provide any update?

Best regards
Melissa

0 Votes 0 ·
Viorel-1 avatar image
0 Votes"
Viorel-1 answered vsslasd-3371 commented

It is not clear why you used the intermediate query as a base for this question. You already have another query which treats the sums and partitions. Here is the adjustment of that query, which includes the @gamma parameter to control the shape:

 declare @D table (ID varchar(7), ProjDate Date, ProjAmt numeric(19,5))
            
 insert Into @D(ID, ProjDate, ProjAmt) values
 ('C2147',  '2021-01-01', 6.60869565217391),
 ('C2147',  '2021-01-04', 6.60869565217391),
 ('C2147',  '2021-01-11', 6.60869565217391),
 ('C2147',  '2021-01-18', 6.60869565217391),
 ('C2147',  '2021-01-25', 6.60869565217391),
 ('C2147',  '2021-02-01', 6.60869565217391),
 ('C2147',  '2021-02-08', 6.60869565217391),
 ('C2147',  '2021-02-15', 6.60869565217391),
 ('C2147',  '2021-02-22', 6.60869565217391),
 ('C2147',  '2021-03-01', 6.60869565217391),
 ('C2147',  '2021-03-08', 6.60869565217391),
 ('C2147',  '2021-03-15', 6.60869565217391),
 ('C2147',  '2021-03-22', 6.60869565217391),
 ('C2147',  '2021-03-29', 6.60869565217391),
 ('C2147',  '2021-04-05', 6.60869565217391),
 ('C2147',  '2021-04-12', 6.60869565217391),
 ('C2147',  '2021-04-19', 6.60869565217391),
 ('C2147',  '2021-04-26', 6.60869565217391),
 ('C2147',  '2021-05-03', 6.60869565217391),
 ('C2147',  '2021-05-10', 6.60869565217391),
 ('C2147',  '2021-05-17', 6.60869565217391),
 ('C2147',  '2021-05-24', 6.60869565217391),
 ('C2147',  '2021-05-31', 6.60869565217391),
        
 ('Z0000',  '2020-01-01', 9.8),
 ('Z0000',  '2020-01-04', 9.8),
 ('Z0000',  '2020-01-11', 9.8),
 ('Z0000',  '2020-01-18', 9.8),
 ('Z0000',  '2020-01-25', 9.8),
 ('Z0000',  '2020-02-01', 9.8),
 ('Z0000',  '2020-02-08', 9.8),
 ('Z0000',  '2020-02-15', 9.8),
 ('Z0000',  '2020-02-22', 9.8),
 ('Z0000',  '2020-03-01', 9.8),
 ('Z0000',  '2020-03-08', 9.8),
 ('Z0000',  '2020-03-15', 9.8),
 ('Z0000',  '2020-03-22', 9.8),
 ('Z0000',  '2020-03-29', 9.8),
 ('Z0000',  '2020-04-05', 9.8),
 ('Z0000',  '2020-04-12', 9.8),
 ('Z0000',  '2020-04-19', 9.8),
 ('Z0000',  '2020-04-26', 9.8),
 ('Z0000',  '2020-05-03', 9.8),
 ('Z0000',  '2020-05-10', 9.8),
 ('Z0000',  '2020-05-17', 9.8),
 ('Z0000',  '2020-05-24', 9.8),
 ('Z0000',  '2020-05-31', 9.8)
        
 ---    
    
 declare @gamma as float = 4
        
 ;
 with X as
 (
     select d.*, datediff(day, 0, ProjDate) x
     from @D d 
 ),
 SA as
 (
     select ID, sum(ProjAmt) as sumamt
     from @D
     group by ID
 ),
 AC as
 (
     select ID, cast(min(x) as float) as a, cast(max(x) as float) as c
     from X
     group by ID
 ),
 B as
 (
     select distinct ID, cast(percentile_cont( 0.75 ) within group (order by x) over (partition by ID) as float) as b
     from X
 ),
 R as
 (
     select X.ID, X.ProjDate, X.ProjAmt,
         power(x - a, alpha - 1 ) * power(c - x, beta - 1) / 
                 (dbo.BetaApprox(alpha, beta) * power(c - a, alpha + beta - 1)) as val
     from X
     inner join AC on AC.ID = X.ID
     inner join B on B.ID = X.ID
     cross apply (values (1 + @gamma * (b - a) / (c - a), 1 + @gamma * (c - b) / (c - a))) t(alpha, beta)
 ),
 SV as
 (
     select ID, sum(val) as sumval
     from R
     group by ID
 ),
 F as
 (
     select R.ID, R.ProjDate, R.ProjAmt, val * sumamt / sumval as NewProjAmt, sumamt
     from R
     inner join SV on SV.ID = R.ID
     inner join SA on SA.ID = R.ID
 )
 select F.ID, F.ProjDate, F.ProjAmt as OldProjAmt, NewProjAmt, replicate( '*', NewProjAmt), sumamt as OldSum, sum(NewProjAmt) over (partition by ID) as NewSum
 from F
 order by ID, ProjDate


· 1
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

thank you !!!

0 Votes 0 ·
MelissaMa-msft avatar image
0 Votes"
MelissaMa-msft answered

Hi @vsslasd-3371

Based on Viorel's anwer, please refer below which the sum of the Project Amount in the code does add up correctly to the amount of 6.60869565217391 * 23.

 declare @gamma as float = 4        
  ;
  with E1 as
  (
      select *, datediff(day, 0, ProjDate) as x
      from @D
  ),
  E2 as
  (
      select cast(min(x) as float) as a, cast(max(x) as float) as c
      from E1
  ),
  E3 as
  (
      select distinct E2.*, cast(percentile_disc( 0.75 ) within group (order by x) over () as float) as b
      from E1, E2
  ),
  E4 as
  (
      select *,
          power(x - a, alpha - 1 ) * power(c - x, beta - 1) / 
              (dbo.BetaApprox(alpha, beta) * power(c - a, alpha + beta - 1)) as val
      from E1, E3
      cross apply (values (1 + @gamma * (b - a) / (c - a), 1 + @gamma * (c - b) / (c - a))) t(alpha, beta)
  )
  select ID, ProjDate, ProjAmt*32*val/sum(val) over (partition by null) ProjAmt, replicate('*', val * 1000)
  from E4

Output:
71925-bell.png

Best regards
Melissa


If the answer is helpful, please click "Accept Answer" and upvote it.
Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.



bell.png (23.5 KiB)
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Viorel-1 avatar image
0 Votes"
Viorel-1 answered vsslasd-3371 commented

According to various articles, try an apparatus that uses @gamma – a value that controls the shape:

 declare @D table (ID varchar(7), ProjDate Date, ProjAmt numeric(19,5))
                
 insert Into @D(ID, ProjDate, ProjAmt) values
 ('C2147',  '2021-01-01', 6.60869565217391),
 ('C2147',  '2021-01-04', 6.60869565217391),
 ('C2147',  '2021-01-11', 6.60869565217391),
 ('C2147',  '2021-01-18', 6.60869565217391),
 ('C2147',  '2021-01-25', 6.60869565217391),
 ('C2147',  '2021-02-01', 6.60869565217391),
 ('C2147',  '2021-02-08', 6.60869565217391),
 ('C2147',  '2021-02-15', 6.60869565217391),
 ('C2147',  '2021-02-22', 6.60869565217391),
 ('C2147',  '2021-03-01', 6.60869565217391),
 ('C2147',  '2021-03-08', 6.60869565217391),
 ('C2147',  '2021-03-15', 6.60869565217391),
 ('C2147',  '2021-03-22', 6.60869565217391),
 ('C2147',  '2021-03-29', 6.60869565217391),
 ('C2147',  '2021-04-05', 6.60869565217391),
 ('C2147',  '2021-04-12', 6.60869565217391),
 ('C2147',  '2021-04-19', 6.60869565217391),
 ('C2147',  '2021-04-26', 6.60869565217391),
 ('C2147',  '2021-05-03', 6.60869565217391),
 ('C2147',  '2021-05-10', 6.60869565217391),
 ('C2147',  '2021-05-17', 6.60869565217391),
 ('C2147',  '2021-05-24', 6.60869565217391),
 ('C2147',  '2021-05-31', 6.60869565217391)
          
 ---    
    
 declare @gamma as float = 4
          
 ;
 with E1 as
 (
     select *, datediff(day, 0, ProjDate) as x
     from @D
 ),
 E2 as
 (
     select cast(min(x) as float) as a, cast(max(x) as float) as c
     from E1
 ),
 E3 as
 (
     select distinct E2.*, cast(percentile_disc( 0.75 ) within group (order by x) over () as float) as b
     from E1, E2
 ),
 E4 as
 (
     select *,
         power(x - a, alpha - 1 ) * power(c - x, beta - 1) / 
             (dbo.BetaApprox(alpha, beta) * power(c - a, alpha + beta - 1)) as val
     from E1, E3
     cross apply (values (1 + @gamma * (b - a) / (c - a), 1 + @gamma * (c - b) / (c - a))) t(alpha, beta)
 )
 select ID, ProjDate, val * 100 / max(val) over (partition by null) as ProjAmt, replicate('*', val * 1000)
 from E4
 order by ProjDate

When @gamma is 4, you should see results that are similar to previous fixed-shape approach. Change it to obtain different results. (For example, try @gamma = 1, 2, 7, etc.).

The method can be also applied to extended script that deals with partitions.

· 1
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Thank you for your input -
This is not quite right, since the sum of the Project Amount in the code does not add up correctly to the amount of 6.60869565217391 * 23.

0 Votes 0 ·
MelissaMa-msft avatar image
0 Votes"
MelissaMa-msft answered

Hi @vsslasd-3371

Welcome to Microsoft Q&A!

Please provide your expected output of density of the Bell Curve Sharper or more Broad.

Only according to your statement and function provided, I did some tests with changing the number inside percentile_disc() and the calculation inside function dbo.BetaApprox.

I found that both of them could affect the density of the Bell Curve.

One: Changing the number inside percentile_disc() .
For example ,percentile_disc(0.5).

Two: Changing the calculation inside function dbo.BetaApprox.
For example: return sqrt(0.5 pi()) power(@x, @x - 0.5) * power(@y, @y - 0.5) /
power(@x + @y, @x + @y - 0.5)

You could change above numbers one by one and choose the best one.

Best regards
Melissa


If the answer is helpful, please click "Accept Answer" and upvote it.
Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.