This is a living document and at times it will be out of date. It is
intended to articulate how programming in the Go runtime differs from
writing normal Go. It focuses on pervasive concepts rather than
details of particular interfaces.

Scheduler structures
====================

The scheduler manages three types of resources that pervade the
runtime: Gs, Ms, and Ps. It's important to understand these even if
you're not working on the scheduler.

Gs, Ms, Ps
----------

A "G" is simply a goroutine. It's represented by type `g`. When a
goroutine exits, its `g` object is returned to a pool of free `g`s and
can later be reused for some other goroutine.

An "M" is an OS thread that can be executing user Go code, runtime
code, a system call, or be idle. It's represented by type `m`. There
can be any number of Ms at a time since any number of threads may be
blocked in system calls.

Finally, a "P" represents the resources required to execute user Go
code, such as scheduler and memory allocator state. It's represented
by type `p`. There are exactly `GOMAXPROCS` Ps. A P can be thought of
like a CPU in the OS scheduler and the contents of the `p` type like
per-CPU state. This is a good place to put state that needs to be
sharded for efficiency, but doesn't need to be per-thread or
per-goroutine.

The scheduler's job is to match up a G (the code to execute), an M
(where to execute it), and a P (the rights and resources to execute
it). When an M stops executing user Go code, for example by entering a
system call, it returns its P to the idle P pool. In order to resume
executing user Go code, for example on return from a system call, it
must acquire a P from the idle pool.

All `g`, `m`, and `p` objects are heap allocated, but are never freed,
so their memory remains type stable. As a result, the runtime can
avoid write barriers in the depths of the scheduler.

`getg()` and `getg().m.curg`
----------------------------

To get the current user `g`, use `getg().m.curg`.

`getg()` alone returns the current `g`, but when executing on the
system or signal stacks, this will return the current M's "g0" or
"gsignal", respectively. This is usually not what you want.

To determine if you're running on the user stack or the system stack,
use `getg() == getg().m.curg`.

Stacks
======

Every non-dead G has a *user stack* associated with it, which is what
user Go code executes on. User stacks start small (e.g., 2K) and grow
or shrink dynamically.
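Stack growth is invisible to user code, but its effect is easy to observe. The sketch below is a user-level illustration, not runtime code: it recurses deeply enough on a fresh goroutine that the stack must grow far past its initial size. The function name `depth`, the padding size, and the recursion depth are arbitrary choices made for the example.

```go
package main

import "fmt"

// depth recurses n times and returns n. Each frame carries a small
// local buffer so the total stack demand is well beyond a few KB.
func depth(n int) int {
	var pad [128]byte
	pad[n%len(pad)] = byte(n)
	if n == 0 {
		return int(pad[0]) // pad[0] was just set to byte(0)
	}
	return depth(n-1) + 1
}

func main() {
	done := make(chan int)
	// Run on a new goroutine, which starts with a small stack.
	go func() { done <- depth(100000) }()
	fmt.Println(<-done) // prints 100000
}
```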

Every M has a *system stack* associated with it (also known as the M's
"g0" stack because it's implemented as a stub G) and, on Unix
platforms, a *signal stack* (also known as the M's "gsignal" stack).
System and signal stacks cannot grow, but are large enough to execute
runtime and cgo code (8K in a pure Go binary; system-allocated in a
cgo binary).

Runtime code often temporarily switches to the system stack using
`systemstack`, `mcall`, or `asmcgocall` to perform tasks that must not
be preempted, that must not grow the user stack, or that switch user
goroutines. Code running on the system stack is implicitly
non-preemptible and the garbage collector does not scan system stacks.
While running on the system stack, the current user stack is not used
for execution.

nosplit functions
-----------------

Most functions start with a prologue that inspects the stack pointer
and the current G's stack bound and calls `morestack` if the stack
needs to grow.

Functions can be marked `//go:nosplit` (or `NOSPLIT` in assembly) to
indicate that they should not get this prologue. This has several
uses:

- Functions that must run on the user stack, but must not call into
  stack growth, for example because this would cause a deadlock, or
  because they have untyped words on the stack.

- Functions that must not be preempted on entry.

- Functions that may run without a valid G. For example, functions
  that run in early runtime start-up, or that may be entered from C
  code such as cgo callbacks or the signal handler.

Splittable functions ensure there's some amount of space on the stack
for nosplit functions to run in and the linker checks that any static
chain of nosplit function calls cannot exceed this bound.

Any function with a `//go:nosplit` annotation should explain why it is
nosplit in its documentation comment.
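As a sketch of that documentation convention, the following shows the shape such a comment might take. The function `add` and the reason given in its comment are hypothetical and exist only to illustrate the layout; real runtime functions state their actual reason.

```go
package main

import "fmt"

// add returns a + b.
//
// add is nosplit only to demonstrate the directive and the comment
// convention; a real function would state its actual reason here,
// such as running without a valid G.
//
//go:nosplit
func add(a, b int) int {
	return a + b
}

func main() {
	fmt.Println(add(2, 3)) // prints 5
}
```

Keeping the explanation in the doc comment and the directive on the last line before `func` means the reason travels with the annotation when the code is moved or refactored.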

Error handling and reporting
============================

Errors that can reasonably be recovered from in user code should use
`panic` like usual. However, there are some situations where `panic`
will cause an immediate fatal error, such as when called on the system
stack or when called during `mallocgc`.
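Seen from the user side, a recoverable runtime error arrives as a panic whose value implements `runtime.Error`. The sketch below shows one way user code might recover from such a panic; `safeIndex` is a hypothetical helper, and the out-of-range index is just one example of a recoverable runtime error.

```go
package main

import (
	"fmt"
	"runtime"
)

// safeIndex returns s[i], converting the runtime's out-of-range panic
// into an ordinary error value.
func safeIndex(s []int, i int) (v int, err error) {
	defer func() {
		if r := recover(); r != nil {
			re, ok := r.(runtime.Error)
			if !ok {
				panic(r) // not a runtime error; propagate it
			}
			err = re
		}
	}()
	return s[i], nil
}

func main() {
	_, err := safeIndex([]int{1, 2, 3}, 5)
	fmt.Println("recovered:", err != nil) // prints "recovered: true"
}
```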

Most errors in the runtime are not recoverable. For these, use
`throw`, which dumps the traceback and immediately terminates the
process. In general, `throw` should be passed a string constant to
avoid allocating in perilous situations. By convention, additional
details are printed before `throw` using `print` or `println` and the
messages are prefixed with "runtime:".

For unrecoverable errors where user code is expected to be at fault for the
failure (such as racing map writes), use `fatal`.

For runtime error debugging, it may be useful to run with `GOTRACEBACK=system`
or `GOTRACEBACK=crash`. The output of `panic` and `fatal` is as described by
`GOTRACEBACK`. The output of `throw` always includes runtime frames, metadata
and all goroutines regardless of `GOTRACEBACK` (i.e., equivalent to
`GOTRACEBACK=system`). Whether `throw` crashes or not is still controlled by
`GOTRACEBACK`.

Synchronization
===============

The runtime has multiple synchronization mechanisms. They differ in
semantics and, in particular, in whether they interact with the
goroutine scheduler or the OS scheduler.

The simplest is `mutex`, which is manipulated using `lock` and
`unlock`. This should be used to protect shared structures for short
periods. Blocking on a `mutex` directly blocks the M, without
interacting with the Go scheduler. This means it is safe to use from
the lowest levels of the runtime, but also prevents any associated G
and P from being rescheduled. `rwmutex` is similar.

For one-shot notifications, use `note`, which provides `notesleep` and
`notewakeup`. Unlike traditional UNIX `sleep`/`wakeup`, `note`s are
race-free, so `notesleep` returns immediately if the `notewakeup` has
already happened. A `note` can be reset after use with `noteclear`,
which must not race with a sleep or wakeup. Like `mutex`, blocking on
a `note` blocks the M. However, there are different ways to sleep on a
`note`: `notesleep` also prevents rescheduling of any associated G and
P, while `notetsleepg` acts like a blocking system call that allows
the P to be reused to run another G. This is still less efficient than
blocking the G directly since it consumes an M.
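The race-free, one-shot semantics of a `note` can be modelled in user code. The sketch below is only an analogy: it is built on a channel, so sleeping blocks the goroutine through the Go scheduler rather than blocking the M as a real `note` does. The type and method names are hypothetical stand-ins for `notesleep`, `notewakeup`, and `noteclear`.

```go
package main

import "fmt"

// note is a user-level model of a one-shot notification.
type note struct {
	ch chan struct{}
}

func newNote() *note { return &note{ch: make(chan struct{})} }

// wakeup signals the note. In this model a second wakeup without an
// intervening clear panics, because the channel is already closed.
func (n *note) wakeup() { close(n.ch) }

// sleep blocks until wakeup has been called. If wakeup already
// happened, the channel is closed and sleep returns immediately.
func (n *note) sleep() { <-n.ch }

// clear resets the note for reuse. As with the real thing, it must
// not race with a sleep or wakeup.
func (n *note) clear() { n.ch = make(chan struct{}) }

func main() {
	n := newNote()
	n.wakeup()
	n.sleep() // returns immediately: the wakeup was not lost

	n.clear()
	go n.wakeup()
	n.sleep() // blocks until the other goroutine wakes us
	fmt.Println("woken twice")
}
```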

To interact directly with the goroutine scheduler, use `gopark` and
`goready`. `gopark` parks the current goroutine (putting it in the
"waiting" state and removing it from the scheduler's run queue) and
schedules another goroutine on the current M/P. `goready` puts a
parked goroutine back in the "runnable" state and adds it to the run
queue.

In summary,

<table>
<tr><th></th><th colspan="3">Blocks</th></tr>
<tr><th>Interface</th><th>G</th><th>M</th><th>P</th></tr>
<tr><td>(rw)mutex</td><td>Y</td><td>Y</td><td>Y</td></tr>
<tr><td>note</td><td>Y</td><td>Y</td><td>Y/N</td></tr>
<tr><td>park</td><td>Y</td><td>N</td><td>N</td></tr>
</table>

Atomics
=======

The runtime uses its own atomics package at `internal/runtime/atomic`.
This corresponds to `sync/atomic`, but functions have different names
for historical reasons and there are a few additional functions needed
by the runtime.

In general, we think hard about the uses of atomics in the runtime and
try to avoid unnecessary atomic operations. If access to a variable is
sometimes protected by another synchronization mechanism, the
already-protected accesses generally don't need to be atomic. There
are several reasons for this:

1. Using non-atomic or atomic access where appropriate makes the code
   more self-documenting. Atomic access to a variable implies there's
   somewhere else that may concurrently access the variable.

2. Non-atomic access allows for automatic race detection. The runtime
   doesn't currently have a race detector, but it may in the future.
   Atomic access defeats the race detector, while non-atomic access
   allows the race detector to check your assumptions.

3. Non-atomic access may improve performance.

Of course, any non-atomic access to a shared variable should be
documented to explain how that access is protected.

Some common patterns that mix atomic and non-atomic access are:

* Read-mostly variables where updates are protected by a lock. Within
  the locked region, reads do not need to be atomic, but the write
  does. Outside the locked region, reads need to be atomic.

* Reads that only happen during STW, where no writes can happen during
  STW, do not need to be atomic.
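The first pattern can be sketched in user-level Go with `sync/atomic` and `sync.Mutex` standing in for the runtime's own atomics package and `mutex`. The `config` type and its `limit` field are hypothetical; only the access discipline is the point.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// config models a read-mostly variable whose updates are lock-protected.
type config struct {
	mu sync.Mutex
	// limit is written only with mu held, using an atomic store.
	// Readers holding mu may read it directly; all others must use
	// an atomic load.
	limit uint32
}

// get may be called without mu, so the read must be atomic.
func (c *config) get() uint32 {
	return atomic.LoadUint32(&c.limit)
}

// raise increases limit to at least v and returns the resulting value.
func (c *config) raise(v uint32) uint32 {
	c.mu.Lock()
	defer c.mu.Unlock()
	// Non-atomic read: mu is held and every writer holds mu.
	if c.limit < v {
		// Atomic write: lock-free readers in get may run concurrently.
		atomic.StoreUint32(&c.limit, v)
	}
	return c.limit
}

func main() {
	var c config
	c.raise(10)
	c.raise(5)
	fmt.Println(c.get()) // prints 10
}
```

The field comment carries the documentation the text above asks for: it says exactly which accesses may be non-atomic and what protects them.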

That said, the advice from the Go memory model stands: "Don't be
[too] clever." The performance of the runtime matters, but its
robustness matters more.

Unmanaged memory
================

In general, the runtime tries to use regular heap allocation. However,
in some cases the runtime must allocate objects outside of the garbage
collected heap, in *unmanaged memory*. This is necessary if the
objects are part of the memory manager itself or if they must be
allocated in situations where the caller may not have a P.

There are three mechanisms for allocating unmanaged memory:

* sysAlloc obtains memory directly from the OS. This comes in whole
  multiples of the system page size, but it can be freed with sysFree.

* persistentalloc combines multiple smaller allocations into a single
  sysAlloc to avoid fragmentation. However, there is no way to free
  persistentalloced objects (hence the name).

* fixalloc is a SLAB-style allocator that allocates objects of a fixed
  size. fixalloced objects can be freed, but this memory can only be
  reused by the same fixalloc pool, so it can only be reused for
  objects of the same type.
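The reuse rule for a fixed-size allocator can be illustrated with a small user-level model. This sketch is not the runtime's fixalloc: it draws its chunks from the ordinary garbage-collected heap via `make`, and the `node` and `pool` types, the chunk size, and the method names are all assumptions made for the example.

```go
package main

import "fmt"

// node is the single object type served by the pool.
type node struct {
	next  *node // free-list link; meaningful only while the node is free
	value [4]int64
}

// pool hands out nodes from chunks and recycles freed nodes through
// a free list, so freed memory is only ever reused for another node.
type pool struct {
	free  *node
	chunk []node // remainder of the chunk currently being carved up
}

func (p *pool) alloc() *node {
	if n := p.free; n != nil {
		p.free = n.next
		*n = node{} // hand out a zeroed object
		return n
	}
	if len(p.chunk) == 0 {
		p.chunk = make([]node, 64)
	}
	n := &p.chunk[0]
	p.chunk = p.chunk[1:]
	return n
}

func (p *pool) release(n *node) {
	n.next = p.free
	p.free = n
}

func main() {
	var p pool
	a := p.alloc()
	p.release(a)
	b := p.alloc()
	fmt.Println(a == b) // prints true: the freed slot was reused
}
```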

In general, types that are allocated using any of these should be
marked as not in heap by embedding `internal/runtime/sys.NotInHeap`.

Objects that are allocated in unmanaged memory **must not** contain
heap pointers unless the following rules are also obeyed:

1. Any pointers from unmanaged memory to the heap must be garbage
   collection roots. More specifically, any pointer must either be
   accessible through a global variable or be added as an explicit
   garbage collection root in `runtime.markroot`.

2. If the memory is reused, the heap pointers must be zero-initialized
   before they become visible as GC roots. Otherwise, the GC may
   observe stale heap pointers. See "Zero-initialization versus
   zeroing".

Zero-initialization versus zeroing
==================================

There are two types of zeroing in the runtime, depending on whether
the memory is already initialized to a type-safe state.

If memory is not in a type-safe state, meaning it potentially contains
"garbage" because it was just allocated and it is being initialized
for first use, then it must be *zero-initialized* using
`memclrNoHeapPointers` or non-pointer writes. This does not perform
write barriers.

If memory is already in a type-safe state and is simply being set to
the zero value, this must be done using regular writes, `typedmemclr`,
or `memclrHasPointers`. This performs write barriers.

Linkname conventions
====================

```
//go:linkname localname [importpath.name]
```

`//go:linkname` specifies the symbol name (`importpath.name`) used to
reference a local identifier (`localname`). The target symbol name is an
arbitrary ELF/macho/etc symbol name, but by convention we typically use a
package-prefixed symbol name to keep things organized.

The full generality of `//go:linkname` is very flexible, so as a convention to
simplify things, we define three standard forms of `//go:linkname` directives.

When possible, always prefer to use the linkname "handshake" described below.

"Push linkname"
---------------

A "push" linkname gives a local _definition_ a final symbol name in a different
package. This effectively "pushes" the symbol to the other package.

```
//go:linkname foo otherpkg.foo
func foo() {
    // impl
}
```

The other package needs a _declaration_ to use the symbol from Go, or it can
directly reference the symbol in assembly. Typically this is an "export
linkname" declaration (below).

"Pull linkname"
---------------

A "pull" linkname gives references to a local _declaration_ a final symbol name
in a different package. This effectively "pulls" the symbol from the other
package.

```
//go:linkname foo otherpkg.foo
func foo()
```

The other package simply needs to define the symbol, but typically this is an
"export linkname" definition (below).

"Export linkname"
-----------------

The second argument to `//go:linkname` is the target symbol name. If it is
omitted, the toolchain uses the default symbol name. In other words, this is a
linkname to itself. This seems to be a no-op, but it is used to mean that this
symbol is "exported" for use with another linkname.

```
//go:linkname foo
func foo() {
    // impl
}
```

When applied to a definition, an export linkname indicates that another package
has a pull linkname targeting this symbol. This has a few effects:

- The compiler generates ABI wrappers for ABI0 and/or ABIInternal, so a
  symbol defined in Go can be referenced from assembly in another package, or
  vice versa.
- The linker will allow pull linknames to this symbol even with
  `-checklinkname=true` (see "Handshake" section below).

```
//go:linkname foo
func foo()
```

When applied to a declaration, an export linkname indicates that another package
has a push linkname targeting this symbol. Other than documentation, the only
effect this has on the toolchain is that the compiler will not require a `.s`
file in the package (normally the compiler requires a `.s` file when there are
function declarations without a body).

Handshake
---------

We always prefer to use push linknames rather than pull linknames. With a push
linkname, the package with the definition is aware it is publishing an API to
another package. On the other hand, with a pull linkname, the definition
package may be completely unaware of the dependency and may unintentionally
break users.

The preferred form for a linkname is to use a push linkname in the defining
package, and an export linkname in the receiving package. The latter is not
strictly required, but serves as documentation. By convention, the receiving
package gives the symbol a name that includes the source package (here,
`runtime_foo`), which further aids documentation.

```
package runtime

//go:linkname foo otherpkg.runtime_foo
func foo() {
    // impl
}
```

```
package otherpkg

//go:linkname runtime_foo
func runtime_foo()
```

As of Go 1.23, the linker forbids pull linknames of symbols in the standard
library unless they participate in a handshake. Since many third-party packages
already have pull linknames to standard library functions, for backwards
compatibility, standard library symbols that are the target of external pull
linknames must use an export linkname to signal to the linker that pull
linknames are acceptable.

```
package runtime

//go:linkname fastrand
func fastrand() {
    // impl
}
```

Note that linker enforcement can be disabled with the `-checklinkname=false`
flag.

Variables
---------

All of the examples above use `//go:linkname` on functions. It is also possible
to use it on global variables, though this is much less common.

Variables don't have a clear distinction between definition and declaration. As
a rule, only one side should have a non-zero initial value. That side is the
"definition" and the other is the "declaration".

Both sides should have the same type, including size. If one side is larger
than the other, however, the linker allocates space for the larger size.

Runtime-only compiler directives
================================

In addition to the "//go:" directives documented in "go doc compile",
the compiler supports additional directives only in the runtime.

go:systemstack
--------------

`go:systemstack` indicates that a function must run on the system
stack. This is checked dynamically by a special function prologue.

go:nowritebarrier
-----------------

`go:nowritebarrier` directs the compiler to emit an error if the
following function contains any write barriers. (It *does not*
suppress the generation of write barriers; it is simply an assertion.)

Usually you want `go:nowritebarrierrec`. `go:nowritebarrier` is
primarily useful in situations where it's "nice" not to have write
barriers, but not required for correctness.

go:nowritebarrierrec and go:yeswritebarrierrec
----------------------------------------------

`go:nowritebarrierrec` directs the compiler to emit an error if the
following function or any function it calls recursively, up to a
`go:yeswritebarrierrec`, contains a write barrier.

Logically, the compiler floods the call graph starting from each
`go:nowritebarrierrec` function and produces an error if it encounters
a function containing a write barrier. This flood stops at
`go:yeswritebarrierrec` functions.

`go:nowritebarrierrec` is used in the implementation of the write
barrier to prevent infinite loops.

Both directives are used in the scheduler. The write barrier requires
an active P (`getg().m.p != nil`) and scheduler code often runs
without an active P. In this case, `go:nowritebarrierrec` is used on
functions that release the P or may run without a P and
`go:yeswritebarrierrec` is used when code re-acquires an active P.
Since these are function-level annotations, code that releases or
acquires a P may need to be split across two functions.

go:uintptrkeepalive
-------------------

The `//go:uintptrkeepalive` directive must be followed by a function declaration.

It specifies that the function's uintptr arguments may be pointer values that
have been converted to uintptr and must be kept alive for the duration of the
call, even though from the types alone it would appear that the object is no
longer needed during the call.

This directive is similar to `//go:uintptrescapes`, but it does not force
arguments to escape. Since stack growth does not understand these arguments,
this directive must be used with `//go:nosplit` (in the marked function and all
transitive calls) to prevent stack growth.

The conversion from pointer to uintptr must appear in the argument list of any
call to this function. This directive is used for some low-level system call
implementations.

Execution tracer
================

The execution tracer is a way for users to see what their goroutines are doing,
but execution traces are also useful for runtime hacking.

Using execution traces to debug runtime problems
------------------------------------------------

Execution traces contain a wealth of information about what the runtime is
doing. They contain all goroutine scheduling actions, data about time spent in
the scheduler (P running without a G), data about time spent in the garbage
collector, and more. Use `go tool trace` or [gotraceui](https://gotraceui.dev)
to inspect traces.

Traces are especially useful for debugging latency issues, particularly if you
can catch the problem in the act. Consider using the flight recorder to help
with this.

Turn on CPU profiling when you take a trace. This will put the CPU profiling
samples as timestamped events into the trace, allowing you to see execution with
greater detail. If you see CPU profiling sample events appear at a rate that does
not match the sample rate, consider that the OS or platform might be taking away
CPU time from the process, and that you might not be debugging a Go issue.
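One way to take a trace with CPU profiling enabled is through the public `runtime/trace` and `runtime/pprof` packages. The following is a minimal sketch under that assumption; `traceWithCPUProfile` is a hypothetical helper, and it keeps the output in memory where a real session would more likely write the trace to a file for `go tool trace`.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"runtime/trace"
)

// traceWithCPUProfile runs work with both the execution tracer and the
// CPU profiler active, and returns the raw trace data.
func traceWithCPUProfile(work func()) ([]byte, error) {
	var traceBuf, profBuf bytes.Buffer
	if err := trace.Start(&traceBuf); err != nil {
		return nil, err
	}
	if err := pprof.StartCPUProfile(&profBuf); err != nil {
		trace.Stop()
		return nil, err
	}
	work()
	pprof.StopCPUProfile()
	trace.Stop() // returns once the trace has been fully written
	return traceBuf.Bytes(), nil
}

func main() {
	data, err := traceWithCPUProfile(func() {
		sum := 0
		for i := 0; i < 50000000; i++ {
			sum += i
		}
		_ = sum
	})
	if err != nil {
		fmt.Println("tracing failed:", err)
		return
	}
	fmt.Println("trace bytes:", len(data))
}
```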

If you're really stuck on a problem, adding new instrumentation with the tracer
might help, especially if it's helpful to see events in relation to other
scheduling events. See the next section on modifying the execution tracer.
However, consider using `debuglog` for additional instrumentation first, as that
is far easier to get started with.

Notes on modifying the execution tracer
---------------------------------------

The execution tracer lives in the files whose names start with "trace."
The parser for the execution trace format lives in the `internal/trace` package.

If you plan on adding new trace events, consider starting with a [trace
experiment](../internal/trace/tracev2/EXPERIMENTS.md).

If you plan to add new trace instrumentation to the runtime, read the comment
at the top of [trace.go](./trace.go), especially the invariants.

debuglog
========

`debuglog` is a powerful runtime-only debugging tool. Think of it as an
ultra-low-overhead `println` that works just about anywhere in the runtime.
These properties are invaluable when debugging subtle problems in tricky parts
of the codebase. `println` can often perturb code enough to stop data races from
happening, while `debuglog` perturbs execution far less.

`debuglog` accumulates log messages in a ring buffer on each M and, on certain
kinds of crashes, dumps out the contents ordered by timestamp. Some messages
might be lost if the ring buffer gets full, in which case consider increasing the
size, or just work with a partial log.
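The behaviour described above (per-M ring buffers that drop the oldest messages when full and are merged by timestamp when dumped) can be modelled in a few lines of user-level Go. This sketch is only a model, not how `debuglog` is implemented: the `entry` and `ring` types, whole-message slots, and caller-supplied timestamps are simplifying assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

type entry struct {
	when int64 // timestamp supplied by the caller in this model
	msg  string
}

// ring is a fixed-capacity log that overwrites its oldest entry when full.
type ring struct {
	buf  []entry
	next int // slot to write next
	n    int // number of valid entries, at most len(buf)
}

func newRing(capacity int) *ring { return &ring{buf: make([]entry, capacity)} }

func (r *ring) log(when int64, msg string) {
	r.buf[r.next] = entry{when, msg}
	r.next = (r.next + 1) % len(r.buf)
	if r.n < len(r.buf) {
		r.n++
	}
}

// merge collects the surviving entries of every ring, ordered by timestamp.
func merge(rings ...*ring) []entry {
	var all []entry
	for _, r := range rings {
		start := (r.next - r.n + len(r.buf)) % len(r.buf)
		for i := 0; i < r.n; i++ {
			all = append(all, r.buf[(start+i)%len(r.buf)])
		}
	}
	sort.SliceStable(all, func(i, j int) bool { return all[i].when < all[j].when })
	return all
}

func main() {
	a, b := newRing(2), newRing(2)
	a.log(1, "a1") // overwritten below: ring a only holds two entries
	a.log(3, "a3")
	a.log(5, "a5")
	b.log(4, "b4")
	for _, e := range merge(a, b) {
		fmt.Println(e.when, e.msg) // prints "3 a3", "4 b4", "5 a5"
	}
}
```

The lost `a1` entry corresponds to the partial log mentioned above: a larger capacity would have kept it.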

1. Add `debuglog` instrumentation to the runtime. Don't forget to call `end`!
   Example: `dlog().s("hello world").u32(5).end()`
2. By default, `debuglog` only dumps its contents in certain kinds of crashes.
   Consider adding more calls to `printDebugLog` if you're not getting any output.
3. Build the program you wish to debug with the `debuglog` build tag.

`debuglog` is lower level than execution traces, and much easier to set up.