// Copyright 2015 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

// Garbage collector: stack barriers
//
// Stack barriers enable the garbage collector to determine how much
// of a goroutine stack has changed between when a stack is scanned
// during the concurrent scan phase and when it is re-scanned during
// the stop-the-world mark termination phase. Mark termination only
// needs to re-scan the changed part, so for deep stacks this can
// significantly reduce GC pause time compared to the alternative of
// re-scanning whole stacks. The deeper the stacks, the more stack
// barriers help.
//
// When stacks are scanned during the concurrent scan phase, the stack
// scan installs stack barriers by selecting stack frames and
// overwriting the saved return PCs (or link registers) of these
// frames with the PC of a "stack barrier trampoline". Later, when a
// selected frame returns, it "returns" to this trampoline instead of
// returning to its actual caller. The trampoline records that the
// stack has unwound past this frame and jumps to the original return
// PC recorded when the stack barrier was installed. Mark termination
// re-scans only as far as the first frame that hasn't hit a stack
// barrier and then removes any un-hit stack barriers.
//
// This scheme is very lightweight. No special code is required in the
// mutator to record stack unwinding and the trampoline is only a few
// assembly instructions.
//
// Book-keeping
// ------------
//
// The primary cost of stack barriers is book-keeping: the runtime has
// to record the locations of all stack barriers and the original
// return PCs in order to return to the correct caller when a stack
// barrier is hit and so it can remove un-hit stack barriers. In order
// to minimize this cost, the Go runtime places stack barriers in
// exponentially-spaced frames, starting 1K past the current frame.
// The book-keeping structure hence grows logarithmically with the
// size of the stack and mark termination re-scans at most twice as
// much stack as necessary.
//
// The runtime reserves space for this book-keeping structure at the
// top of the stack allocation itself (just above the outermost
// frame). This is necessary because the regular memory allocator can
// itself grow the stack, and hence can't be used when allocating
// stack-related structures.
//
// For debugging, the runtime also supports installing stack barriers
// at every frame. However, this requires significantly more
// book-keeping space.
//
// Correctness
// -----------
//
// The runtime and the compiler cooperate to ensure that all objects
// reachable from the stack as of mark termination are marked.
// Anything unchanged since the concurrent scan phase will be marked
// because it is marked by the concurrent scan. After the concurrent
// scan, there are three possible classes of stack modifications that
// must be tracked:
//
// 1) Mutator writes below the lowest un-hit stack barrier. This
// includes all writes performed by an executing function to its own
// stack frame. This part of the stack will be re-scanned by mark
// termination, which will mark any objects made reachable from
// modifications to this part of the stack.
//
// 2) Mutator writes above the lowest un-hit stack barrier. It's
// possible for a mutator to modify the stack above the lowest un-hit
// stack barrier if a higher frame has passed down a pointer to a
// stack variable in its frame. This is called an "up-pointer". The
// compiler ensures that writes through up-pointers have an
// accompanying write barrier (it simply doesn't distinguish between
// writes through up-pointers and writes through heap pointers). This
// write barrier marks any object made reachable from modifications to
// this part of the stack.
//
// 3) Runtime writes to the stack. Various runtime operations such as
// sends to unbuffered channels can write to arbitrary parts of the
// stack, including above the lowest un-hit stack barrier. We solve
// this in two ways. In many cases, the runtime can perform an
// explicit write barrier operation like in case 2. However, in the
// case of bulk memory move (typedmemmove), the runtime doesn't
// necessarily have ready access to a pointer bitmap for the memory
// being copied, so it simply unwinds any stack barriers below the
// destination.
//
// Gotchas
// -------
//
// Anything that inspects or manipulates the stack potentially needs
// to understand stack barriers. The most obvious case is that
// gentraceback needs to use the original return PC when it encounters
// the stack barrier trampoline. Anything that unwinds the stack such
// as panic/recover must unwind stack barriers in tandem with
// unwinding the stack.
//
// Stack barriers require that any goroutine whose stack has been
// scanned must execute write barriers. Go solves this by simply
// enabling write barriers globally during the concurrent scan phase.
// However, traditionally, write barriers are not enabled during this
// phase.
//
// Synchronization
// ---------------
//
// For the most part, accessing and modifying stack barriers is
// synchronized around GC safe points. Installing stack barriers
// forces the G to a safe point, while all other operations that
// modify stack barriers run on the G and prevent it from reaching a
// safe point.
//
// Subtlety arises when a G may be tracebacked when *not* at a safe
// point. This happens during sigprof. For this, each G has a "stack
// barrier lock" (see gcLockStackBarriers, gcUnlockStackBarriers).
// Operations that manipulate stack barriers acquire this lock, while
// sigprof tries to acquire it and simply skips the traceback if it
// can't acquire it. There is one exception for performance and
// complexity reasons: hitting a stack barrier manipulates the stack
// barrier list without acquiring the stack barrier lock. For this,
// gentraceback performs a special fix up if the traceback starts in
// the stack barrier function.
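The install/hit protocol described above can be sketched with a toy model. This is illustration only: the real stkbar records a stack address rather than a slice index, the trampoline is a few assembly instructions, and the names `toyStkbar`, `install`, and `hit` are hypothetical helpers, not runtime code.

```go
package main

import "fmt"

// Stand-in for the trampoline's PC; the runtime uses stackBarrierPC,
// the address of the assembly trampoline.
const stackBarrierPC uintptr = 0xdeadbeef

// toyStkbar plays the role of the runtime's stkbar bookkeeping entry.
type toyStkbar struct {
	slot   int     // which saved-PC slot was overwritten
	origPC uintptr // the original return PC
}

// install saves the return PC in slot and overwrites it with the
// trampoline PC, as gcInstallStackBarrier does on the real stack.
func install(stack []uintptr, slot int) toyStkbar {
	b := toyStkbar{slot: slot, origPC: stack[slot]}
	stack[slot] = stackBarrierPC
	return b
}

// hit restores the original PC and reports where the trampoline
// should jump, as happens when a selected frame returns.
func hit(stack []uintptr, b toyStkbar) uintptr {
	stack[b.slot] = b.origPC
	return b.origPC
}

func main() {
	stack := []uintptr{0x100, 0x200, 0x300} // fake saved return PCs
	b := install(stack, 1)
	fmt.Printf("after install: %#x\n", stack)
	fmt.Printf("trampoline jumps to %#x\n", hit(stack, b))
}
```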

package runtime

import (
	"runtime/internal/atomic"
	"runtime/internal/sys"
	"unsafe"
)

const debugStackBarrier = false

// firstStackBarrierOffset is the approximate byte offset at
// which to place the first stack barrier from the current SP.
// This is a lower bound on how much stack will have to be
// re-scanned during mark termination. Subsequent barriers are
// placed at firstStackBarrierOffset * 2^n offsets.
//
// For debugging, this can be set to 0, which will install a
// stack barrier at every frame. If you do this, you may also
// have to raise _StackMin, since the stack barrier
// bookkeeping will use a large amount of each stack.
var firstStackBarrierOffset = 1024

// gcMaxStackBarriers returns the maximum number of stack barriers
// that can be installed in a stack of stackSize bytes.
func gcMaxStackBarriers(stackSize int) (n int) {
	if debug.gcstackbarrieroff > 0 {
		return 0
	}

	if firstStackBarrierOffset == 0 {
		// Special debugging case for inserting stack barriers
		// at every frame. Steal half of the stack for the
		// []stkbar. Technically, if the stack were to consist
		// solely of return PCs we would need two thirds of
		// the stack, but stealing that much breaks things and
		// this doesn't happen in practice.
		return stackSize / 2 / int(unsafe.Sizeof(stkbar{}))
	}

	offset := firstStackBarrierOffset
	for offset < stackSize {
		n++
		offset *= 2
	}
	return n + 1
}
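The counting loop above grows only logarithmically with stack size. A standalone sketch of the same arithmetic (`maxBarriers` is a hypothetical helper for illustration; the runtime version also handles the gcstackbarrieroff and firstStackBarrierOffset == 0 cases):

```go
package main

import "fmt"

// maxBarriers reproduces the doubling loop of gcMaxStackBarriers:
// one barrier per doubling of the offset below stackSize, plus one
// more for the final interval.
func maxBarriers(stackSize, firstOffset int) (n int) {
	for offset := firstOffset; offset < stackSize; offset *= 2 {
		n++
	}
	return n + 1
}

func main() {
	// With a 1K first offset, even a 1 MiB stack needs only 11
	// barriers: the count is logarithmic in the stack size.
	for _, size := range []int{1 << 10, 32 << 10, 1 << 20} {
		fmt.Printf("stack %8d bytes -> at most %d barriers\n",
			size, maxBarriers(size, 1024))
	}
}
```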

// gcInstallStackBarrier installs a stack barrier over the return PC of frame.
//go:nowritebarrier
func gcInstallStackBarrier(gp *g, frame *stkframe) bool {
	if frame.lr == 0 {
		if debugStackBarrier {
			print("not installing stack barrier with no LR, goid=", gp.goid, "\n")
		}
		return false
	}

	if frame.fn.entry == cgocallback_gofuncPC {
		// cgocallback_gofunc doesn't return to its LR;
		// instead, its return path puts LR in g.sched.pc and
		// switches back to the system stack on which
		// cgocallback_gofunc was originally called. We can't
		// have a stack barrier in g.sched.pc, so don't
		// install one in this frame.
		if debugStackBarrier {
			print("not installing stack barrier over LR of cgocallback_gofunc, goid=", gp.goid, "\n")
		}
		return false
	}

	// Save the return PC and overwrite it with stackBarrier.
	var lrUintptr uintptr
	if usesLR {
		lrUintptr = frame.sp
	} else {
		lrUintptr = frame.fp - sys.RegSize
	}
	lrPtr := (*sys.Uintreg)(unsafe.Pointer(lrUintptr))
	if debugStackBarrier {
		print("install stack barrier at ", hex(lrUintptr), " over ", hex(*lrPtr), ", goid=", gp.goid, "\n")
		if uintptr(*lrPtr) != frame.lr {
			print("frame.lr=", hex(frame.lr))
			throw("frame.lr differs from stack LR")
		}
	}

	gp.stkbar = gp.stkbar[:len(gp.stkbar)+1]
	stkbar := &gp.stkbar[len(gp.stkbar)-1]
	stkbar.savedLRPtr = lrUintptr
	stkbar.savedLRVal = uintptr(*lrPtr)
	*lrPtr = sys.Uintreg(stackBarrierPC)
	return true
}

// gcRemoveStackBarriers removes all stack barriers installed in gp's stack.
//
// gp's stack barriers must be locked.
//
//go:nowritebarrier
func gcRemoveStackBarriers(gp *g) {
	if debugStackBarrier && gp.stkbarPos != 0 {
		print("hit ", gp.stkbarPos, " stack barriers, goid=", gp.goid, "\n")
	}

	// Remove stack barriers that we didn't hit.
	for _, stkbar := range gp.stkbar[gp.stkbarPos:] {
		gcRemoveStackBarrier(gp, stkbar)
	}

	// Clear recorded stack barriers so copystack doesn't try to
	// adjust them.
	gp.stkbarPos = 0
	gp.stkbar = gp.stkbar[:0]
}

// gcRemoveStackBarrier removes a single stack barrier. It is the
// inverse operation of gcInstallStackBarrier.
//
// This is nosplit to ensure gp's stack does not move.
//
//go:nowritebarrier
//go:nosplit
func gcRemoveStackBarrier(gp *g, stkbar stkbar) {
	if debugStackBarrier {
		print("remove stack barrier at ", hex(stkbar.savedLRPtr), " with ", hex(stkbar.savedLRVal), ", goid=", gp.goid, "\n")
	}
	lrPtr := (*sys.Uintreg)(unsafe.Pointer(stkbar.savedLRPtr))
	if val := *lrPtr; val != sys.Uintreg(stackBarrierPC) {
		printlock()
		print("at *", hex(stkbar.savedLRPtr), " expected stack barrier PC ", hex(stackBarrierPC), ", found ", hex(val), ", goid=", gp.goid, "\n")
		print("gp.stkbar=")
		gcPrintStkbars(gp, -1)
		print(", gp.stack=[", hex(gp.stack.lo), ",", hex(gp.stack.hi), ")\n")
		throw("stack barrier lost")
	}
	*lrPtr = sys.Uintreg(stkbar.savedLRVal)
}

// gcTryRemoveAllStackBarriers tries to remove stack barriers from all
// Gs in gps. It is best-effort and efficient. If it can't remove
// barriers from a G immediately, it will simply skip it.
func gcTryRemoveAllStackBarriers(gps []*g) {
	for _, gp := range gps {
	retry:
		for {
			switch s := readgstatus(gp); s {
			default:
				break retry

			case _Grunnable, _Gsyscall, _Gwaiting:
				if !castogscanstatus(gp, s, s|_Gscan) {
					continue
				}
				gcLockStackBarriers(gp)
				gcRemoveStackBarriers(gp)
				gcUnlockStackBarriers(gp)
				restartg(gp)
				break retry
			}
		}
	}
}

// gcPrintStkbars prints the stack barriers of gp for debugging. It
// places a "@@@" marker at gp.stkbarPos. If marker >= 0, it will also
// place a "==>" marker before the marker'th entry.
func gcPrintStkbars(gp *g, marker int) {
	print("[")
	for i, s := range gp.stkbar {
		if i > 0 {
			print(" ")
		}
		if i == int(gp.stkbarPos) {
			print("@@@ ")
		}
		if i == marker {
			print("==> ")
		}
		print("*", hex(s.savedLRPtr), "=", hex(s.savedLRVal))
	}
	if int(gp.stkbarPos) == len(gp.stkbar) {
		print(" @@@")
	}
	if marker == len(gp.stkbar) {
		print(" ==>")
	}
	print("]")
}

// gcUnwindBarriers marks all stack barriers up to the frame containing
// sp as hit and removes them. This is used during stack unwinding for
// panic/recover and by heapBitsBulkBarrier to force stack re-scanning
// when its destination is on the stack.
//
// This is nosplit to ensure gp's stack does not move.
//
//go:nosplit
func gcUnwindBarriers(gp *g, sp uintptr) {
	gcLockStackBarriers(gp)
	// On LR machines, if there is a stack barrier on the return
	// from the frame containing sp, this will mark it as hit even
	// though it isn't, but it's okay to be conservative.
	before := gp.stkbarPos
	for int(gp.stkbarPos) < len(gp.stkbar) && gp.stkbar[gp.stkbarPos].savedLRPtr < sp {
		gcRemoveStackBarrier(gp, gp.stkbar[gp.stkbarPos])
		gp.stkbarPos++
	}
	gcUnlockStackBarriers(gp)
	if debugStackBarrier && gp.stkbarPos != before {
		print("skip barriers below ", hex(sp), " in goid=", gp.goid, ": ")
		// We skipped barriers between the "==>" marker
		// (before) and the "@@@" marker (gp.stkbarPos).
		gcPrintStkbars(gp, int(before))
		print("\n")
	}
}
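The unwinding loop above can be modeled on a plain sorted slice (illustrative types only: the real stkbar slice is sorted by stack address, and gcRemoveStackBarrier restores each saved PC as the position advances):

```go
package main

import "fmt"

// bar mirrors the two fields of the runtime's stkbar entry.
type bar struct{ savedLRPtr, savedLRVal uintptr }

// unwindBelow models gcUnwindBarriers' loop: advance the hit
// position past every barrier whose saved-LR slot lies below sp.
func unwindBelow(bars []bar, pos int, sp uintptr) int {
	for pos < len(bars) && bars[pos].savedLRPtr < sp {
		pos++ // in the runtime, gcRemoveStackBarrier runs here
	}
	return pos
}

func main() {
	bars := []bar{{0x1000, 0xa}, {0x2000, 0xb}, {0x4000, 0xc}}
	// Unwinding to sp=0x3000 hits the two barriers below it.
	fmt.Println("barriers hit:", unwindBelow(bars, 0, 0x3000))
}
```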

// nextBarrierPC returns the original return PC of the next stack barrier.
// Used by getcallerpc, so it must be nosplit.
//go:nosplit
func nextBarrierPC() uintptr {
	gp := getg()
	return gp.stkbar[gp.stkbarPos].savedLRVal
}

// setNextBarrierPC sets the return PC of the next stack barrier.
// Used by setcallerpc, so it must be nosplit.
//go:nosplit
func setNextBarrierPC(pc uintptr) {
	gp := getg()
	gcLockStackBarriers(gp)
	gp.stkbar[gp.stkbarPos].savedLRVal = pc
	gcUnlockStackBarriers(gp)
}

// gcLockStackBarriers synchronizes with tracebacks of gp's stack
// during sigprof for installation or removal of stack barriers. It
// blocks until any current sigprof is done tracebacking gp's stack
// and then disallows profiling tracebacks of gp's stack.
//
// This is necessary because a sigprof during barrier installation or
// removal could observe inconsistencies between the stkbar array and
// the stack itself and crash.
//
//go:nosplit
func gcLockStackBarriers(gp *g) {
	// Disable preemption so scanstack cannot run while the caller
	// is manipulating the stack barriers.
	acquirem()
	for !atomic.Cas(&gp.stackLock, 0, 1) {
		osyield()
	}
}

//go:nosplit
func gcTryLockStackBarriers(gp *g) bool {
	mp := acquirem()
	result := atomic.Cas(&gp.stackLock, 0, 1)
	if !result {
		releasem(mp)
	}
	return result
}

func gcUnlockStackBarriers(gp *g) {
	atomic.Store(&gp.stackLock, 0)
	releasem(getg().m)
}
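The stackLock protocol used by gcLockStackBarriers and gcTryLockStackBarriers can be sketched as a user-level analogue with sync/atomic (the runtime uses its own runtime/internal/atomic package, osyield, and acquirem/releasem, none of which are available outside the runtime):

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// stackLock is a user-level stand-in for the g.stackLock word.
type stackLock struct{ v uint32 }

// lock spins until it wins the CAS, like gcLockStackBarriers.
func (l *stackLock) lock() {
	for !atomic.CompareAndSwapUint32(&l.v, 0, 1) {
		runtime.Gosched() // stand-in for osyield()
	}
}

// tryLock makes a single CAS attempt, like gcTryLockStackBarriers:
// a sigprof-style caller simply skips its traceback on failure.
func (l *stackLock) tryLock() bool {
	return atomic.CompareAndSwapUint32(&l.v, 0, 1)
}

func (l *stackLock) unlock() { atomic.StoreUint32(&l.v, 0) }

func main() {
	var l stackLock
	l.lock()
	fmt.Println("tryLock while held:", l.tryLock())
	l.unlock()
	fmt.Println("tryLock after unlock:", l.tryLock())
}
```

The asymmetry matters: barrier manipulation must block (it cannot proceed with a profiler mid-traceback), while the profiler can afford to drop a sample, so it never blocks.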
    394