• greenplum源码解析 全局死锁检测ReadMe


    来源

    本文内容来自src/backend/utils/gdd/ReadMe
    内容涉及较多基础相关知识,可见知识回顾:
    greenplum 常规锁管理器-1
    postgres 死锁检测机制-1

    Global Deadlock Detection 全局死锁检测

    背景来源于本地死锁检测,在分布式数据库下引发的全局性死锁检测,以下是一个典型的本地
    死锁SQL

    Let’s begin with a simple case using a similar syntax of isolation2 test framework:

    A: begin;
    A: update foo set val=val where id=1;
    
    B: begin;
    B: update foo set val=val where id=2;
    B&: update foo set val=val where id=1;
    
    A&: update foo set val=val where id=2;
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    In this case there are two concurrent transactions, A and B. At first A holds
    a Xid lock on the tuple with `id=1` and B holds a Xid lock with `id=2` without
    blocking, but then B gets blocked by A when trying to update the tuple with
    `id=1` and A gets blocked by B when trying to update the tuple with `id=2`, so
    a deadlock is formed. And in this case the tuple `id=1` and the tuple `id=2` is
    on the same segment, so the deadlock just happens locally.
    
    该案例涉及两个并发事务A和B,起初A持有元组 id = 1的xid lock,B持有元组id = 2 的xid lock,此阶段不
    阻塞;后B尝试更新 id = 1的元组时被事务A阻塞,A尝试更新 id = 2的元组时被B阻塞,即形成死锁。由于更新
    操作只涉及一个segment,因此称为本地死锁。
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10

    Local Deadlock Detection

    On a single segment this deadlock can be automatically detected. The detection 
    is based on a cycle detection algorithm:
    
    - each transaction is a vertex;
    - each waiting relation is a directed edge;
    - if there is a directed cycle in the graph then there is deadlock.
    
    In above case there are two vertices, A and B, and there are two directed
    edges, `A==>B` and `B==>A`, obviously these two edges form a directed cycle,
    so the deadlock can be detected.
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10

    在这里插入图片描述

    Global Deadlock

    However on a distributed cluster (e.g. Greenplum DB) rows with different ids
    are distributed to different segments, suppose rows with `id=1` are on seg 0
    and rows with `id=2` are on seg 1, then each segment only see a part of the
    graph <This forms a global / distributed deadlock.>.
    
    在分布式集群如greenplum DB,具有不同行id的元组数据将会分布至不同的segment上,如 id = 1 的元组
    在seg0 上,id = 2的元组在 seg1上,如发生上述现象即为死锁。
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7

    在这里插入图片描述

    Global Deadlock Prevention

    To prevent global deadlock, in Greenplum DB an exclusive table lock is held
    for `UPDATE` and `DELETE` commands, so concurrent updates are actually
    disabled.
    
    为了预防死锁,在gp中设定在执行 update 和 delete 操作时获取排他锁,这样便不允许并发更新
    
    This policy does prevent the global deadlock, but the cost is bad performance
    on concurrent updates.
    
    该方法的确能预防死锁,但是会付出大代价且数据库的并发性能严重降低
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10

    Global Deadlock Detection

    If we can gather all these edges to one segment then we can form the same
    graph as the [local one](#local-deadlock-detection) and detect the deadlock.
    
    如果能收集一个segment上的所有边,便可得到该segment上的等待图,进而检测死锁
    
    However if we make the detection based on these directly gathered edges then
    some fake deadlocks can be detected. A transaction can be waiting for at most
    one other transaction on any single host, but it can be waiting for multiple
    other transactions globally in a distributed system. Some waiting relations
    are automatically changed if the holder object changed, but some are not. An
    example of unchanged waiting relations is waiting for a Xid lock(it can be only
    released after the holding transaction is over). An example of changed waiting
    relations is waiting for a tuple lock(the tuple lock can be released before the
    holding transaction is over). For concrete examples, please refer the test cases.
    
    但是,如果我们对这些直接收集的边进行检测,可能会检测到一些假死锁。 一个事务最多可以在任何单个主机
    上等待另一个事务,但它可以在分布式系统中全局等待多个其他事务。 如果持有者对象发生变化,一些等待关
    系会自动改变,但有些则不会。 等待关系不变的一个例子是等待一个Xid锁(它只能在持有事务结束后释放)。
    更改等待关系的一个示例是等待元组锁(可以在持有事务结束之前释放元组锁)。 具体例子请参考测试用例。
    
    A proper global deadlock detector must be designed with these distributed
    system specific characteristics in consideration. The basic requirements are
    that if there is a deadlock then find it out and break it, and make sure don't
    make false alarms.
    
    设计适当的全局死锁检测器必须考虑这些分布式系统特定特性的情况。 基本要求是,如果有死锁,就找出并打破
    它,并确保不要误报。
    
    Our global deadlock detection algorithm is based on the local version, and has
    extra logic to distinguish the false deadlocks from the real ones.
    
    死锁检测算法基于本地死锁检测,并结合额外的逻辑区真死锁和假思索
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32

    Abstraction and Definition

    - **vertex**: each transaction is a vertex;
    - **edge**: if transaction A is waiting for transaction B then there is a
      directed edge from vertex A to vertex B;
    - **out edge**: an edge from A to B is an out edge of A;
    - **in edge**: an edge from A to B is an in edge of B;
    - **out degree**: the number of a vertex's out edges is its out degree;
    - **in degree**: the number of a vertex's in edges is its in degree;
    - **solid edge**: if an edge from A to B can not be released until B
      COMMIT/ABORT, then this edge is a solid edge;
    - **dotted edge**: if an edge from A to B can be released when or before B's
      current query is done, then this edge is a dotted edge;
    - **local graph**: each segment has a local graph formed by all the local
      edges on this segment;
    - **global graph**: edges on all the segment form a global graph;
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14

    相关参考:greenplum源码解析 全局死锁检测GDD-1

    判断规则

    ### Rules and Policies
    
    The detection is based on one basic assumption:
    
    - if some vertices (transactions) formed a cycle (deadlock), then they and
      their edges (waiting relations) will no longer change their status;
    
    And we know some facts:
    
    - an edge can be either a dotted edge or a solid edge, no other possibilities;
    // 约束边必定是虚边或者是软边,无其他情况
    According to these, we have below deductions:
    
    - if the status of a vertex or an edge is possible to change, then this
      vertex/edge is not part of a cycle (yet);
    - if a vertex has no out edge on any segment then its status is possible to
      change;
      - --> `Rule 1: delete a vertex if its global out degree is 0`;
    - if a vertex has no in edge on any segment then it's not part of a cycle;
      - --> `Rule 2: delete a vertex if its global in degree is 0`;
    - if a vertex has no out edge on segment X but has an out edge on segment Y:
      - its solid in edges on X is not possible to change;
      - its dotted in edges on X are possible to change;
        - --> `Rule 3: delete a vertex's dotted in edges on a segment if its local
          out degree is 0`;
        
     规则1:如果在全局等待图中某节点出度为0,则可移除
     规则2:如果在全局等待图中某节点入度为0,则可移除
     规则3:规则12不满足下,若节点处于虚边,且其出度为0,则可以移除
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29

    死锁检测算法
    总结:
    1)首先消除全局等待图中无出度的节点<分布式事务>
    2) 然后消除全局等待图中无入度的节点;
    3)紧接着消除虚边且无出度的节点(遍历所有的segment 本地等待图);
    4)上述步骤完成至无节点或边消除退出循环;
    5)如果存在等待环且该事务是有效的分布式事务,则选择其中一个/若干个事务进行回滚以打破死锁;
    算法如下

    ### Algorithm
    
    With all these 3 rules we can have an algorithm as this:
    
    ```python
    # reduce edges and vertices
    while true:
        for vertex in vertices_with_zero_global_out_degree():
            # delete vertex and its solid/dotted in edges on all segments
            delete_vertex(vertex)
    
        for vertex in vertices_with_zero_global_in_degree():
            # delete vertex and its solid/dotted out edges on all segments
            delete_vertex(vertex)
    
        for segment in all_segments():
            for vertex in vertices_with_zero_local_out_degree(segment):
                # delete vertex's dotted in edges on segment,
                # related vertices' global in/out degrees are also updated
                delete_dotted_in_edges_on_segment(segment, vertex)
    
        if nothing_deleted_in_current_loop():
            # no more vertex or edge to delete
            break
    
    # detect for deadlock cycle
    if count_vertices_not_deleted() > 0:
        # detected at least one cycle
        # the cycles are only reliable if all the contained vertices
        # are valid transactions on master
        if all_vertices_are_valid():
            # choose a vertex and cancel it
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32

    ###案例说明

    ### Case Analysis
    
    Let's run the new algorithm on an actual case to understand how does it work.
    
    The testing cluster has 3 segments: seg -1, seg 0 and seg 1. Master is on seg
    -1.  A table `t1` is created as below:
    
    ```sql
    create table t1 (id int, val int);
    insert into t1 values (1,1), (2,2), (3,3), (4,4);
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10

    Its data distribution is as below:

    select gp_segment_id, * from t1 order by id;
     gp_segment_id | id | val
    ---------------+----+-----
                 0 |  1 |   1
                 1 |  2 |   2
                 0 |  3 |   3
                 1 |  4 |   4
    (4 rows)
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8

    The sessions are like below:

    C: begin;
    C: update t1 set val=30 where id=2;  
    
    A: begin;
    A: update t1 set val=10 where val=3;
    
    B: begin;
    B: update t1 set val=20 where id=4;
    B&: update t1 set val=20 where val=3 or id=2;
    -- B ==> A: xid lock on seg 0
    -- B ==> C: xid lock on seg 1
    
    A&: update t1 set val=10 where val=2;
    -- A ~~> B: tuple lock on seg 1
    
    D: begin;
    D&: update t1 set val=40 where id=4;
    -- D ==> B: xid lock on seg 1
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18

    根据上述SQL不难画出本地等待图进而构建全局等待图,如下:
    With the collected edges we form below original graphs on each segment:

    +-----------------------+    +-------------------------------------+
    |         seg 0         |    |                seg 1                |
    |                       |    | +-----+                             |
    | +-----+       +-----+ |    | |  A  | ~~~~> +-----+       +-----+ |
    | |     |       |     | |    | +-----+       |     |       |     | |
    | |  B  | ====> |  A  | |    |               |  B  | ====> |  C  | |
    | |     |       |     | |    | +-----+       |     |       |     | |
    | +-----+       +-----+ |    | |  D  | ====> +-----+       +-----+ |
    |                       |    | +-----+                             |
    +-----------------------+    +-------------------------------------+
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10

    Edges are reduced in this order:

    • B1 ==> C1: as C’s global out degree is 0 (Rule 1);
    • D1 ==> B1: as D’s global in degree is 0 (Rule 2);
    • A1 ~~> B1: as B1’s local out degree is 0 and it’s a dotted edge (Rule 3);
    • B0 ==> A0: as A’ global out degree is 0 (Rule 1);

    步骤1:移除seg1节点上的事务C,其全局等待图出度为 0;   规则1
    步骤2:移除seg1节点上的事务D,其全局等待图入度为 0;   规则2
    步骤3:移除seg1节点上的事务B,虚边且入度为0,       规则3
    步骤4:移除seg0节点上的事务A,此时全局等待图中出度为0.  规则1

    As all edges are removed, there is no deadlock cycle.

  • 相关阅读:
    idea Springboot 教师标识管理系统开发mysql数据库web结构java编程计算机网页源码maven项目
    记 linux 系统编译好的exp提权提示无gcc
    The DAO众筹事件与以太坊分叉
    电动汽车有序无序充放电的优化调度附matlab代码
    现代C++、STL、QTL的使用
    【MM32F5270开发板试用】基于MindSDK测试MM32F5270开发板IIC
    第7章:存储系统
    Python | GUI | Tkinter - 2. 组件使用
    记录本地Nginx发布vue项目
    SpringBoot整合JWT、实现登录和拦截
  • 原文地址:https://blog.csdn.net/qq_52668274/article/details/126686980