SQL Server Deadlock on subresource PERMISSIONS when sp_executesql with #temptables

JayaT 1 Reputation point
2021-12-03T21:24:31.107+00:00

Hi,
We are seeing a high volume of deadlocks since we introduced Vault process that gives permissions on schema.
Here the process1 runs the following GRANT statements
GRANT SELECT, INSERT, UPDATE, DELETE, EXECUTE ON SCHEMA :: sharedproxy TO [v-tk-ca7baef9-f8e0---xHwo6QQTzfhZKx7y7zqM-1638452722]
Process 1 holds a SCH_M lock on the resource 'SECURITY CACHE' under the transaction name 'SEC Cache Coherency'

process 2 runs a stored proc mar_dev.dbo.spDD_MarriageView_Search which has dynamic sql that uses sp_executesql to insert into a #temp table.
This has never been a problem before. Now from the deadlock graph we can see that sp_executesql holds a SCH_M lock on METADATA: database_id = 39 PERMISSIONS(class = 0, major_id = 0) under the transactionname 'read permissions'
This is interfering with SECURITY CACHE from process 1.

There is no documentation around xml nodes transaction name any where.
Could someone please throw some light on why sp_executesql is accessing PERMISSIONS for a simple insert into #temp table.
Any ideas to resolve this deadlocks?

<deadlock>  
  <victim-list>  
    <victimProcess id="process2c98d755468" />  
  </victim-list>  
  <process-list>  
    <process id="process2c98d755468" taskpriority="0" logused="0" waitresource="METADATA: database_id = 39 SECURITY_CACHE($hash = 0x27:0x0), lockPartitionId = 0" waittime="4309" ownerId="17175321444" transactionname="SEC Cache Coherency" lasttranstarted="2021-12-02T08:45:23.043" XDES="0x2cea0dd1a40" lockMode="Sch-M" schedulerid="1" kpid="10016" status="suspended" spid="101" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2021-12-02T08:45:23.040" lastbatchcompleted="2021-12-02T08:45:23.040" lastattention="1900-01-01T00:00:00.040" clientapp="mar-vault-appdb" hostname="myhost-65" hostpid="0" loginname="sp-vault-hash-d" isolationlevel="read committed (2)" xactid="17175320772" currentdb="39" currentdbname="mar_dev" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056">  
      <executionStack>  
        <frame procname="adhoc" line="1" sqlhandle="0x01002700b2bb6835407e560ed102000000000000000000000000000000000000000000000000000000000000">  
GRANT SELECT, INSERT, UPDATE, DELETE, EXECUTE  ON SCHEMA :: sharedproxy   TO [v-tk-ca7baef9-f8e0---xHwo6QQTzfhZKx7y7zqM-1638452722]    </frame>  
      </executionStack>  
      <inputbuf>  
GRANT SELECT, INSERT, UPDATE, DELETE, EXECUTE  ON SCHEMA :: sharedproxy   TO [v-tk-ca7baef9-f8e0---xHwo6QQTzfhZKx7y7zqM-1638452722]   </inputbuf>  
    </process>  
    <process id="process2c98d755088" taskpriority="0" logused="0" waitresource="METADATA: database_id = 39 PERMISSIONS(class = 0, major_id = 0), lockPartitionId = 0" waittime="4309" ownerId="17175321432" transactionname="read permissions" lasttranstarted="2021-12-02T08:45:23.043" XDES="0x2ca8a173a40" lockMode="Sch-S" schedulerid="1" kpid="832" status="suspended" spid="450" sbid="0" ecid="0" priority="0" trancount="0" lastbatchstarted="2021-12-02T08:45:22.987" lastbatchcompleted="2021-12-02T08:45:22.987" lastattention="1900-01-01T00:00:00.987" clientapp="DataSvc" hostname="ase-rate-59" hostpid="5404" loginname="dm\sp-rate-d" isolationlevel="read committed (2)" xactid="17175321385" currentdb="39" currentdbname="mar_dev" lockTimeout="4294967295" clientoption1="536870944" clientoption2="128056">  
      <executionStack>  
        <frame procname="adhoc" line="2" stmtstart="12" stmtend="562" sqlhandle="0x020000004f75a605535205b81f671cce718aff7a9a3bd5fb0000000000000000000000000000000000000000">  
unknown    </frame>  
        <frame procname="mssqlsystemresource.sys.sp_executesql" line="1" stmtstart="-1" sqlhandle="0x0400ff7f427f99d9010000000000000000000000000000000000000000000000000000000000000000000000">  
sp_executesql    </frame>  
        <frame procname="mar_dev.dbo.spDD_MarriageView_Search" line="59" stmtstart="3458" stmtend="3502" sqlhandle="0x030027004f2a4e475121f000bca4000001000000000000000000000000000000000000000000000000000000">  
EXEC sp_executesql @sq    </frame>  
      </executionStack>  
      <inputbuf>  
Proc [Database Id = 39 Object Id = 1196304975]   </inputbuf>  
    </process>  
  </process-list>  
  <resource-list>  
    <metadatalock subresource="SECURITY_CACHE" classid="$hash = 0x27:0x0" dbid="39" lockPartition="0" id="lock2d0eaa62f80" mode="Sch-M">  
      <owner-list>  
        <owner id="process2c98d755088" mode="Sch-M" />  
      </owner-list>  
      <waiter-list>  
        <waiter id="process2c98d755468" mode="Sch-M" requestType="wait" />  
      </waiter-list>  
    </metadatalock>  
    <metadatalock subresource="PERMISSIONS" classid="class = 0, major_id = 0" dbid="39" lockPartition="0" id="lock2d050969700" mode="Sch-M">  
      <owner-list>  
        <owner id="process2c98d755468" mode="Sch-M" />  
      </owner-list>  
      <waiter-list>  
        <waiter id="process2c98d755088" mode="Sch-S" requestType="wait" />  
      </waiter-list>  
    </metadatalock>  
  </resource-list>  
</deadlock>  

154789-blog-post.xml

SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
12,813 questions
{count} votes

10 answers

Sort by: Most helpful
  1. Erland Sommarskog 101.8K Reputation points MVP
    2021-12-03T22:35:00+00:00

    Not really your plain-vanilla standard deadlock.

    To start somewhere, what does "SELECT @@version" return?

    Since the deadlock trace does not include the statement for the dynamic SQL, I don't know what it is doing. You say that it is inserting into a temp table, but supposedly it is reading the data from permanent tables, and in such case there needs to be a permission check on these tables.

    I don't really have much suggestions on how to fix this, but this Vault software must be a crazy with running all these GRANT statements. I'm not familiar with it, so it is difficult to give advice.

    I have a recollection of have seen questions about deadlocks involving permissions before, but I don't recall the outcome.

    0 comments No comments

  2. Erland Sommarskog 101.8K Reputation points MVP
    2021-12-04T19:20:33.423+00:00

    I will have to admit that I don't know the internals of these operations.

    However, so much is clear, for every query there has to be a permission check, with one exception: the query is inside a stored procedure, and the table accessed have the same owner as the stored procedure. And that exception does not apply here, because the dynamic SQL is a stored procedure of its own without any owner, so ownership chaining does not apply.

    Obviously, since permissions are stored in the database, SQL Server needs to read them just like any other data. I don't know the details about the security cache, but it makes sense that permissions are cached in some structure that is faster to read than the system tables. But in this case this leads to deadlock.

    Looking at the procedure, I think it is possible to rewrite it to not use dynamic SQL, but instead use a static query with OPTION (RECOMPILE). I am not sure that it is worth the effort, though. I would be more inclined to gain control over that process that spits out all those GRANT.

    0 comments No comments

  3. JayaT 1 Reputation point
    2021-12-04T22:09:30.307+00:00

    @Erland Sommarskog

    That makes sense.
    I am trying to see if there is a way to check the locks that are held on the 'PERMISSIONS' subresource. The data from sys.dm_tran_locks is very transient in nature and not able to capture the data that I need. Extended events 'lock_acquried' gives information about the resource 'METADATA' but not subresource.
    I want to prove the theory that subresource 'PERMISSIONS' is incompatible with 'SECURITY CACHE' and hence the deadlock.
    Unfortunately, there is not a whole lot of documentation on the specifics around 'METADATA' locks.

    The vault app is designed to create credentials on the fly. The user & login is dropped as the session ends and this is a secure way of running app instances.
    But as we have implemented this in large scale, we are seeing issues with the magnitude of grants being generated.

    Clearly, we can't create indexes around system tables to help alleviate this. We didn't foresee that 'SECURITY CACHE' would be a source of contention with GRANT statements.
    Any logical way, you could think of, to implement this, perhaps some optimizations that could be made on the sql server to facilitate this ?

    0 comments No comments

  4. JayaT 1 Reputation point
    2021-12-04T18:57:49.753+00:00

    @Erland Sommarskog

    Thank you for your response.

    We are running SQL Server 2016 SP2(CU17) on a windows server 2016 box. Patch level 13.0.5888.11

    The stored proc is joining #temp table with permanent tables to get some data. Although the deadlock graph does not show exact query that is used to check for permissions.
    Does every query that runs against a table check for PERMISSIONS subresource? If yes, Is there a way to find out the query that is being run when checking the PERMISSION sub resource?

    Here is the stored proc code.

    CREATE PROCEDURE dbo.spDD_marriageView_Search (  
     @tIds  dbo.udt_id READONLY,   
     @sIds  dbo.udt_id READONLY,  
        @schIds                        dbo.udt_id READONLY,  
        @paymentDateFrom                    DATETIME = NULL,  
        @fullPaymentOnly                    BIT = NULL,  
        @asOfDate DATETIME = NULL  
    )  
    AS  
      
      
    BEGIN  
     DECLARE @sql        NVARCHAR(MAX),  
             @paramlist  NVARCHAR(4000)  
      
        CREATE TABLE #sR  
        (  
         Id INT NOT NULL PRIMARY KEY,  
         TId INT NOT NULL,  
         SchId INT NOT NULL,  
        )  
          
     SET @sql = '  
        INSERT INTO #sR  
     SELECT   
         fl.fl_ongoing_calc_id AS Id,  
         tq.tq_tr_transaction_id AS TId,  
         fl.fl_se_schedule_id AS SchId,  
     FROM dbo.sched_detail fl  
         INNER JOIN dbo.schedule se ON fl.fl_se_schedule_id = se.se_schedule_id  
         INNER JOIN dbo.tran_q tq ON se.se_tq_tran_quote_id = tq.tq_tran_quote_id'  
      
     IF EXISTS(SELECT 1 FROM @tIds)  
     BEGIN  
     CREATE TABLE #tIds (ID INT NOT NULL, UNIQUE(ID))  
     INSERT INTO #tIds SELECT id FROM @tIds  
      
     SET @sql = @sql + ' INNER JOIN #tIds tids ON tids.id = tq.tq_tr_transaction_id'  
     END  
      
     IF EXISTS(SELECT 1 FROM @schIds)  
     BEGIN  
     CREATE TABLE #schIds (ID INT NOT NULL, UNIQUE(ID))  
     INSERT INTO #schIds SELECT id FROM @schIds  
      
         SET @sql = @sql + ' INNER JOIN #schIds sdids ON sdids.id = fl.fl_ongoing_calc_id'  
     END  
      
         
        SET @sql = @sql + ' WHERE 1=1'  
      
        IF @paymentDateFrom IS NOT NULL  
        BEGIN   
            SET @sql = @sql + ' AND fl.fl_payment_dt >= @paymentDateFrom'  
        END  
          
        IF @asOfDate IS NOT NULL  
        BEGIN   
            SET @sql = @sql + ' AND (fl.fl_from_dt <= @asOfDate OR fl.fl_float_from_dt <= @asOfDate) AND (fl.fl_to_dt >= @asOfDate OR fl.fl_float_to_dt >= @asOfDate)'  
        END  
      
     SELECT @paramlist = '@paymentDateFrom DATETIME,  
      @asOfDate DATETIME'  
      
     EXEC sp_executesql  @sql,   
                            @paramlist,  
                         @paymentDateFrom,  
                         @asOfDate  
      
         
            SELECT * FROM #sR  
         
      
        IF EXISTS(SELECT 1 FROM @tIds)  
     BEGIN  
     DROP TABLE #tIds  
     END  
      
     IF EXISTS(SELECT 1 FROM @sIds)  
     BEGIN  
     DROP TABLE #sIds  
     END  
      
        IF EXISTS(SELECT 1 FROM @schIds)  
        BEGIN  
     DROP TABLE #schIds  
     END  
      
        DROP TABLE #sR  
    END  
    
    0 comments No comments

  5. Erland Sommarskog 101.8K Reputation points MVP
    2021-12-04T22:39:14.257+00:00

    I think the deadlock graph makes it quite clear that the locks are incompatible.

    If you want to look at it more closely, you can use my beta_lockinfo.

    As you say, the locks are short-lived, but you can do this. In one window run

    CREATE USER nisse WITHOUT LOGIN
    CREATE USER katrin WITHOUT LOGIN
    GRANT SELECT ON dbo.sometable TO katrin
    go
    BEGIN TRANSACTION
    GRANT SELECT, UPDATE, DELETE, INSERT ON SCHEMA::dbo TO nisse
    

    In a second window, run:

    EXECUTE AS USER = 'katrin'
    go
    SELECT * FROM dbo.sometbale
    go
    REVERT
    

    This will block. Then in a third window run beta_lockinfo to look at the locks. !! in the blklvl column means that this is a lead blocker. 1 means that this the directly blocked process.

    This was the easy part. The next to study is the locks on the SECURITY CACHE, that can be more difficult. One option is to set up the deadlock, which will be unresolved for five seconds. beta_lockinfo will display DD in the blklvl column in this case.

    As for the actual problem, I'm thinking: can't you add this temporary user to a role with the required permissions instead? That role would include the user being blocked. (Or is the user running the SP also one of these temporary users?)

    If that is not feasible, I would deal with this deadlock problem on the Vault side. One option is to use SET DEADLOCK_PRIORTY LOW, so that this process always becomes the deadlock victim. Then again, that seems to happen already. Also, you will need to wait five seconds until the deadlock is resolved.

    A better approach is that the Vault process issues SET LOCK_TIMEOUT 0, so if it can't get a lock, it gets error 1222. You trap that error and wait for, say, 100 ms and try again. By backing out directly, you don't disturb the other process.

    You could of course also put the retry logic in the procedure that runs the dynamic SQL, but I suspect that this is not the only one that runs into deadlock, so I'm thinking there are fewer places to change on the Vault side.

    0 comments No comments