doc/asm: explain coordination with garbage collector

rsc · rsc · commit 202bf8d94dfe · 2014-10-28T15:51:06.000-04:00
Also a few other minor changes. Fixes golang#8712. LGTM=r R=r CC=golang-codereviews https://golang.org/cl/164150043
diff --git a/doc/asm.html b/doc/asm.html
@@ -117,6 +117,9 @@ <h3 id="symbols">Symbols</h3>
 <p>
 The <code>SB</code> pseudo-register can be thought of as the origin of memory, so the symbol <code>foo(SB)</code>
 is the name <code>foo</code> as an address in memory.
+This form is used to name global functions and data.
+Adding <code>&lt;&gt;</code> to the name, as in <code>foo&lt;&gt;(SB)</code>, makes the name
+visible only in the current source file, like a top-level <code>static</code> declaration in a C file.
 </p>
 
 <p>
@@ -128,8 +131,11 @@ <h3 id="symbols">Symbols</h3>
 When referring to a function argument this way, it is conventional to place the name
 at the beginning, as in <code>first_arg+0(FP)</code> and <code>second_arg+8(FP)</code>.
 Some of the assemblers enforce this convention, rejecting plain <code>0(FP)</code> and <code>8(FP)</code>.
-For assembly functions with Go prototypes, <code>go vet</code> will check that the argument names
+For assembly functions with Go prototypes, <code>go</code> <code>vet</code> will check that the argument names
 and offsets match.
+On 32-bit systems, the low and high 32 bits of a 64-bit value are distinguished by adding
+a <code>_lo</code> or <code>_hi</code> suffix to the name, as in <code>arg_lo+0(FP)</code> or <code>arg_hi+4(FP)</code>.
+If a Go prototype does not name its result, the expected assembly name is <code>ret</code>.
 </p>
 
 <p>
@@ -206,6 +212,8 @@ <h3 id="directives">Directives</h3>
 and is called with 8 bytes of argument, which live on the caller's frame.
 If <code>NOSPLIT</code> is not specified for the <code>TEXT</code>,
 the argument size must be provided.
+For assembly functions with Go prototypes, <code>go</code> <code>vet</code> will check that the
+argument size is correct.
 </p>
 
 <p>
@@ -216,19 +224,20 @@ <h3 id="directives">Directives</h3>
 </p>
 
 <p>
-For <code>DATA</code> directives, the symbol is followed by a slash and the number
-of bytes the memory associated with the symbol occupies.
-The arguments are optional flags and the data itself.
-For instance,
-</p>
+Global data symbols are defined by a sequence of initializing
+<code>DATA</code> directives followed by a <code>GLOBL</code> directive.
+Each <code>DATA</code> directive initializes a section of the
+corresponding memory.
+The memory not explicitly initialized is zeroed.
+The general form of the <code>DATA</code> directive is
 
 <pre>
-DATA  runtime·isplan9(SB)/4, $1
+DATA	symbol+offset(SB)/width, value
 </pre>
 
 <p>
-declares the local symbol <code>runtime·isplan9</code> of size 4 and value 1.
-Again the symbol has the middle dot and is offset from <code>SB</code>.
+which initializes the symbol memory at the given offset and width with the given value.
+The <code>DATA</code> directives for a given symbol must be written with increasing offsets.
 </p>
 
 <p>
@@ -237,15 +246,26 @@ <h3 id="directives">Directives</h3>
 which will have initial value all zeros unless a <code>DATA</code> directive
 has initialized it.
 The <code>GLOBL</code> directive must follow any corresponding <code>DATA</code> directives.
-This example
+</p>
+
+<p>
+For example,
 </p>
 
 <pre>
-GLOBL runtime·tlsoffset(SB),$4
+DATA divtab&lt;&gt;+0x00(SB)/4, $0xf4f8fcff
+DATA divtab&lt;&gt;+0x04(SB)/4, $0xe6eaedf0
+...
+DATA divtab&lt;&gt;+0x3c(SB)/4, $0x81828384
+GLOBL divtab&lt;&gt;(SB), RODATA, $64
+
+GLOBL runtime·tlsoffset(SB), NOPTR, $4
 </pre>
 
 <p>
-declares <code>runtime·tlsoffset</code> to have size 4.
+declares and initializes <code>divtab&lt;&gt;</code>, a read-only 64-byte table of 4-byte integer values,
+and declares <code>runtime·tlsoffset</code>, a 4-byte, implicitly zeroed variable that
+contains no pointers.
 </p>
 
 <p>
@@ -299,6 +319,80 @@ <h3 id="directives">Directives</h3>
 </li>
 </ul>
 
+<h3 id="runtime">Runtime Coordination</h3>
+
+<p>
+For garbage collection to run correctly, the runtime must know the
+location of pointers in all global data and in most stack frames.
+The Go compiler emits this information when compiling Go source files,
+but assembly programs must define it explicitly.
+</p>
+
+<p>
+A data symbol marked with the <code>NOPTR</code> flag (see above)
+is treated as containing no pointers to runtime-allocated data.
+A data symbol with the <code>RODATA</code> flag
+is allocated in read-only memory and is therefore treated
+as implicitly marked <code>NOPTR</code>.
+A data symbol with a total size smaller than a pointer
+is also treated as implicitly marked <code>NOPTR</code>.
+It is not possible to define a symbol containing pointers in an assembly source file;
+such a symbol must be defined in a Go source file instead.
+Assembly source can still refer to the symbol by name
+even without <code>DATA</code> and <code>GLOBL</code> directives.
+A good general rule of thumb is to define all non-<code>RODATA</code>
+symbols in Go instead of in assembly.
+</p>
+
+<p>
+Each function also needs annotations giving the location of
+live pointers in its arguments, results, and local stack frame.
+For an assembly function with no pointer results and
+either no local stack frame or no function calls,
+the only requirement is to define a Go prototype for the function
+in a Go source file in the same package.
+For more complex situations, explicit annotation is needed.
+These annotations use pseudo-instructions defined in the standard
+<code>#include</code> file <code>funcdata.h</code>.
+</p>
+
+<p>
+If a function has no arguments and no results,
+the pointer information can be omitted.
+This is indicated by an argument size annotation of <code>$<i>n</i>-0</code>
+on the <code>TEXT</code> instruction.
+Otherwise, pointer information must be provided by
+a Go prototype for the function in a Go source file,
+even for assembly functions not called directly from Go.
+(The prototype will also let <code>go</code> <code>vet</code> check the argument references.)
+At the start of the function, the arguments are assumed
+to be initialized but the results are assumed uninitialized.
+If the results will hold live pointers during a call instruction,
+the function should start by zeroing the results and then 
+executing the pseudo-instruction <code>GO_RESULTS_INITIALIZED</code>.
+This instruction records that the results are now initialized
+and should be scanned during stack movement and garbage collection.
+It is typically easier to arrange that assembly functions do not
+return pointers or do not contain call instructions;
+no assembly functions in the standard library use
+<code>GO_RESULTS_INITIALIZED</code>.
+</p>
+
+<p>
+If a function has no local stack frame,
+the pointer information can be omitted.
+This is indicated by a local frame size annotation of <code>$0-<i>n</i></code>
+on the <code>TEXT</code> instruction.
+The pointer information can also be omitted if the
+function contains no call instructions.
+Otherwise, the local stack frame must not contain pointers,
+and the assembly must confirm this fact by executing the 
+pseudo-instruction <code>NO_LOCAL_POINTERS</code>.
+Because stack resizing is implemented by moving the stack,
+the stack pointer may change during any function call:
+even pointers to stack data must not be kept in local variables.
+</p>
+
 <h2 id="architectures">Architecture-specific details</h2>
 
 <p>
@@ -434,13 +528,10 @@ <h3 id="unsupported_opcodes">Unsupported opcodes</h3>
 // so actually
 // void atomicload64(uint64 *res, uint64 volatile *addr);
 TEXT runtime·atomicload64(SB), NOSPLIT, $0-8
-	MOVL	4(SP), BX
-	MOVL	8(SP), AX
-	// MOVQ (%EAX), %MM0
-	BYTE $0x0f; BYTE $0x6f; BYTE $0x00
-	// MOVQ %MM0, 0(%EBX)
-	BYTE $0x0f; BYTE $0x7f; BYTE $0x03
-	// EMMS
-	BYTE $0x0F; BYTE $0x77
+	MOVL	ptr+0(FP), AX
+	LEAL	ret_lo+4(FP), BX
+	BYTE $0x0f; BYTE $0x6f; BYTE $0x00	// MOVQ (%EAX), %MM0
+	BYTE $0x0f; BYTE $0x7f; BYTE $0x03	// MOVQ %MM0, 0(%EBX)
+	BYTE $0x0F; BYTE $0x77			// EMMS
 	RET
 </pre>
diff --git a/src/runtime/funcdata.h b/src/runtime/funcdata.h
@@ -28,6 +28,9 @@
 // defines the pointer map for the function's arguments.
 // GO_ARGS should be the first instruction in a function that uses it.
 // It can be omitted if there are no arguments at all.
+// GO_ARGS is inserted implicitly by the linker for any function
+// that also has a Go prototype and therefore is usually not necessary
+// to write explicitly.
 #define GO_ARGS	FUNCDATA $FUNCDATA_ArgsPointerMaps, go_args_stackmap(SB)
 
 // GO_RESULTS_INITIALIZED indicates that the assembly function