1 2 MSMProp2, a simplified but functionally equivalent version of MSMProp1 3 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 4 5 Julian Seward, OpenWorks Ltd, 19 August 2008 6 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 7 8 Note that this file does NOT describe the state machine used in the 9 svn://svn.valgrind.org/branches/YARD version of Helgrind. That state 10 machine is different again from any previously described machine. 11 12 See the file README_YARD.txt for more details on YARD. 13 14 ---------------------- 15 16 In early 2008 Konstantin Serebryany proposed "MSMProp1", a memory 17 state machine for data race detection. It is described at 18 http://code.google.com/p/data-race-test/wiki/MSMProp1 19 20 Implementation experiences show MSMProp1 is useful, but difficult to 21 implement efficiently. In particular keeping the memory usage under 22 control is complex and difficult. 23 24 This note points out a key simplification of MSMProp1, which makes it 25 easier to implement without changing the functionality. 26 27 28 The idea 29 ~~~~~~~~ 30 31 The core of the idea pertains to the "Condition" entry for MSMProp1 32 state machine rules E5 and E6(r). These are, respectively: 33 34 HB(SS, currS) and its negation 35 ! HB(SS, currS). 36 37 Here, SS is a set of segments, and currS is a single segment. Each 38 segment contains a vector timestamp. The expression "HB(SS, currS)" 39 is intended to denote 40 41 for each segment S in SS . happens_before(S,currS) 42 43 where happens_before(S,T) means that S's vector timestamp is ordered 44 before-or-equal to T's vector timestamp. 45 46 In words, the expression 47 48 for each segment S in SS . happens_before(S,currS) 49 50 is equivalent to saying that currS has a timestamp which is 51 greater-than-equal to the timestamps of all the segments in SS. 52 53 The key observation is that this is equivalent to 54 55 happens_before( JOIN(SS), currS ) 56 57 where JOIN is the lattice-theoretic "max" or "least upper bound" 58 operation on vector clocks. Given the definition of HB, 59 happens_before and (binary) JOIN, this is easy to prove. 60 61 62 The consequences 63 ~~~~~~~~~~~~~~~~ 64 65 With that observation in place, it is a short step to observe that 66 storing segment sets in MSMProp1 is unnecessary. Instead of 67 storing a segment set in each shadow value, just store and 68 update a single vector timestamp. The following two equivalences 69 hold: 70 71 MSMProp1 MSMProp2 72 73 adding a segment S join-ing S's vector timestamp 74 to the segment-set to the current vector timestamp 75 76 HB(SS,currS) happens_before( 77 currS's timestamp, 78 current vector timestamp ) 79 80 Once it is no longer necessary to represent segment sets, it then 81 also becomes unnecessary to represent segments. This constitutes 82 a significant simplication to the implementation. 83 84 85 The resulting state machine, MSMProp2 86 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 87 88 MSMProp2 is isomorphic to MSMProp1, with the following changes: 89 90 States are New, Read(VTS,LS), Write(VTS,LS) 91 92 where LS is a lockset (as before) and VTS is a vector timestamp. 93 94 For a thread T with current lockset 'currLS' and current VTS 'currVTS' 95 making a memory access, the new rules are 96 97 Name Old-State Op Guard New-State Race-If 98 99 E1 New rd True Read(currVTS,currLS) False 100 101 E2 New wr True Write(currVTS,currLS) False 102 103 E3 Read(oldVTS,oldLS) rd True Read(newVTS,newLS) False 104 105 E4 Read(oldVTS,oldLS) wr True Write(newVTS,newLS) #newLS == 0 106 && !hb(oldVTS,currVTS) 107 108 E5 Write(oldVTS,oldLS) rd hb(oldVTS, Read(currVTS,currLS) False 109 currVTS) 110 111 E6r Write(oldVTS,oldLS) rd !hb(oldVTS, Write(newVTS,newLS) #newLS == 0 112 currVTS) && !hb(oldVTS,currVTS) 113 114 E6w Write(oldVTS,oldLS) wr True Write(newVTS,newLS) #newLS == 0 115 && !hb(oldVTS,currVTS) 116 117 where newVTS = join2(oldVTS,currVTS) 118 119 newLS = if hb(oldVTS,currVTS) 120 then currLS 121 else intersect(oldLS,currLS) 122 123 hb(vts1, vts2) = vts1 happens before or is equal to vts2 124 125 126 Interpretation of the states 127 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 128 129 I always found the state names in MSMProp1 confusing. Both MSMProp1 130 and MSMProp2 are easier to understand if the states Read and Write are 131 renamed, like this: 132 133 old name new name 134 135 Read WriteConstraint 136 Write AllConstraint 137 138 The effect of a state Read(VTS,LS) is to constrain all later-observed 139 writes so that either (1) the writing thread holds at least one lock 140 in common with LS, or (2) those writes must happen-after VTS. If 141 neither of those two conditions hold, a race is reported. 142 143 Hence a Read state places a constraint on writes. 144 145 The effect of a state Write(VTS,LS) is similar, but it applies to all 146 later-observed accesses: either (1) the accessing thread holds at 147 least one lock in common with LS, or (2) those accesses must 148 happen-after VTS. If neither of those two conditions hold, a race is 149 reported. 150 151 Hence a Write state places a constraint on all accesses. 152 153 If we ignore the LS component of these states, the intuitive 154 interpretation of the VTS component is that it states the earliest 155 vector-time that the next write / access may safely happen. 156 157