See "C? Go? Cgo!" for an introduction to using cgo:
http://golang.org/doc/articles/c_go_cgo.html
package documentation

Implementation details.

Cgo provides a way for Go programs to call C code linked into the same
address space. This comment explains the operation of cgo.

Cgo reads a set of Go source files and looks for statements saying
import "C". If the import has a doc comment, that comment is
taken as literal C code to be used as a preamble to any C code
generated by cgo. A typical preamble #includes necessary definitions:

    // #include <stdio.h>
    import "C"

For more details about the usage of cgo, see the documentation
comment at the top of this file.

Cgo scans the Go source files that import "C" for uses of that
package, such as C.puts. It collects all such identifiers. The next
step is to determine the kind of each name. In C.xxx the xxx might refer
to a type, a function, a constant, or a global variable. Cgo must
decide which.
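
The collection step can be sketched with the standard go/parser and
go/ast packages. This is a minimal illustration, not cgo's own code: the
helper name cNames is hypothetical, and the sketch does not check that
the file actually imports "C" or that the name C is not shadowed.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
)

// cNames parses one Go source file and returns the sorted, de-duplicated
// xxx names that appear in C.xxx selector expressions.
func cNames(src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "x.go", src, 0)
	if err != nil {
		return nil, err
	}
	seen := map[string]bool{}
	ast.Inspect(f, func(n ast.Node) bool {
		sel, ok := n.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		// Record the selected name when the qualifier is the identifier C.
		if id, ok := sel.X.(*ast.Ident); ok && id.Name == "C" {
			seen[sel.Sel.Name] = true
		}
		return true
	})
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names)
	return names, nil
}

func main() {
	src := "package p\n\nimport \"C\"\n\nfunc hello() { C.puts(C.CString(\"hi\")) }\n"
	names, err := cNames(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

Parsing alone is enough here because no type information is needed to
spot the selector expressions; deciding what each name means is the
harder problem discussed next.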

The obvious thing for cgo to do is to process the preamble, expanding
#includes and processing the corresponding C code. That would require
a full C parser and type checker that was also aware of any extensions
known to the system compiler (for example, all the GNU C extensions) as
well as the system-specific header locations and system-specific
pre-#defined macros. This is certainly possible to do, but it is an
enormous amount of work.

Cgo takes a different approach. It determines the meaning of C
identifiers not by parsing C code but by feeding carefully constructed
programs into the system C compiler and interpreting the generated
error messages, debug information, and object files. In practice,
parsing these is significantly less work and more robust than parsing
C source.

Cgo first invokes gcc -E -dM on the preamble, in order to find out
about simple #defines for constants and the like. These are recorded
for later use.

Next, cgo needs to identify the kind of each identifier. For the
identifiers C.foo and C.bar, cgo generates this C program:

    void __cgo__f__(void) {
        foo;
        enum { _cgo_enum_0 = foo };
        bar;
        enum { _cgo_enum_1 = bar };
    }

This program will not compile, but cgo can look at the error messages
to infer the kind of each identifier. The line number given in the
error tells cgo which identifier is involved.

An error like "unexpected type name" or "useless type name in empty
declaration" or "declaration does not declare anything" tells cgo that
the identifier is a type.

An error like "statement with no effect" or "expression result unused"
tells cgo that the identifier is not a type, but not whether it is a
constant, function, or global variable.

An error like "not an integer constant" tells cgo that the identifier
is not a constant. If it is also not a type, it must be a function or
global variable. For now, those can be treated the same.
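
The three rules above amount to a small decision table. The sketch below
is illustrative only: the kind type and classify function are
hypothetical names, the matched substrings are the ones quoted above
(real compiler wording varies), and treating "not a type, with no
constant error" as a constant is an inference from the text rather than
something it states outright.

```go
package main

import (
	"fmt"
	"strings"
)

// kind is what the compiler diagnostics reveal about one identifier.
type kind int

const (
	kindUnknown   kind = iota
	kindType           // the identifier names a type
	kindConst          // not a type, and no "not an integer constant" error (inferred)
	kindFuncOrVar      // not a type and not a constant
)

// classify folds the diagnostics reported for one identifier into a kind.
func classify(diags []string) kind {
	isType, notType, notConst := false, false, false
	for _, d := range diags {
		switch {
		case strings.Contains(d, "unexpected type name"),
			strings.Contains(d, "useless type name in empty declaration"),
			strings.Contains(d, "declaration does not declare anything"):
			isType = true
		case strings.Contains(d, "statement with no effect"),
			strings.Contains(d, "expression result unused"):
			notType = true
		case strings.Contains(d, "not an integer constant"):
			notConst = true
		}
	}
	switch {
	case isType:
		return kindType
	case notType && notConst:
		return kindFuncOrVar
	case notType:
		return kindConst
	}
	return kindUnknown
}

func main() {
	diags := []string{
		"error: statement with no effect",
		"error: enumerator value is not an integer constant",
	}
	fmt.Println(classify(diags) == kindFuncOrVar)
}
```

Grouping the diagnostics by the line number they report, as described
above, would be the step that produces each per-identifier slice.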

Next, cgo must learn the details of each type, variable, function, or
constant. It can do this by reading object files. If cgo has decided
that t1 is a type, v2 and v3 are variables or functions, and c4, c5,
and c6 are constants, it generates:

    typeof(t1) *__cgo__1;
    typeof(v2) *__cgo__2;
    typeof(v3) *__cgo__3;
    typeof(c4) *__cgo__4;
    enum { __cgo_enum__4 = c4 };
    typeof(c5) *__cgo__5;
    enum { __cgo_enum__5 = c5 };
    typeof(c6) *__cgo__6;
    enum { __cgo_enum__6 = c6 };

    long long __cgo_debug_data[] = {
        ...
    };

and again invokes the system C compiler, to produce an object file
containing debug information. Cgo parses the DWARF debug information
for __cgo__N to learn the type of each identifier. (The types also
distinguish functions from global variables.) If using a standard gcc,
cgo can parse the DWARF debug information for the __cgo_enum__N to
learn the identifier's value. The LLVM-based gcc on OS X emits
incomplete DWARF information for enums; in that case cgo reads the
constant values from the __cgo_debug_data from the object file's data
segment.

At this point cgo knows the meaning of each C.xxx well enough to start
the translation process.

[The rest of this comment refers to 6g and 6c, the Go and C compilers
that are part of the amd64 port of the gc Go toolchain. Everything here
applies to another architecture's compilers as well.]

Given the input Go files x.go and y.go, cgo generates these source
files:

    x.cgo1.go       # for 6g
    y.cgo1.go       # for 6g
    _cgo_gotypes.go # for 6g
    _cgo_defun.c    # for 6c
    x.cgo2.c        # for gcc
    y.cgo2.c        # for gcc
    _cgo_export.c   # for gcc
    _cgo_main.c     # for gcc

The file x.cgo1.go is a copy of x.go with the import "C" removed and
references to C.xxx replaced with names like _Cfunc_xxx or _Ctype_xxx.
The definitions of those identifiers, written as Go functions, types,
or variables, are provided in _cgo_gotypes.go.

Here is a _cgo_gotypes.go containing definitions for C.flush (provided
in the preamble) and C.puts (from stdio):

    type _Ctype_char int8
    type _Ctype_int int32
    type _Ctype_void [0]byte

    func _Cfunc_CString(string) *_Ctype_char
    func _Cfunc_flush() _Ctype_void
    func _Cfunc_puts(*_Ctype_char) _Ctype_int

For functions, cgo only writes an external declaration in the Go
output. The implementation is in a combination of C for 6c (meaning
any gc-toolchain compiler) and C for gcc.

The 6c file contains the definitions of the functions. They all have
similar bodies that invoke runtime·cgocall to make a switch from the
Go runtime world to the system C (GCC-based) world.

For example, here is the definition of _Cfunc_puts:

    void _cgo_be59f0f25121_Cfunc_puts(void*);

    void
    ·_Cfunc_puts(struct{uint8 x[1];}p)
    {
        runtime·cgocall(_cgo_be59f0f25121_Cfunc_puts, &p);
    }

The hexadecimal number is a hash of cgo's input, chosen to be
deterministic yet unlikely to collide with other uses. The actual
function _cgo_be59f0f25121_Cfunc_puts is implemented in a C source
file compiled by gcc, the file x.cgo2.c:

    void
    _cgo_be59f0f25121_Cfunc_puts(void *v)
    {
        struct {
            char* p0;
            int r;
        } __attribute__((__packed__)) *a = v;
        a->r = puts((void*)a->p0);
    }

It extracts the arguments from the pointer to _Cfunc_puts's argument
frame, invokes the system C function (in this case, puts), stores the
result in the frame, and returns.

Once the _cgo_export.c and *.cgo2.c files have been compiled with gcc,
they need to be linked into the final binary, along with the libraries
they might depend on (in the case of puts, stdio). 6l has been
extended to understand basic ELF files, but it does not understand ELF
in the full complexity that modern C libraries embrace, so it cannot
in general generate direct references to the system libraries.

Instead, the build process generates an object file using dynamic
linkage to the desired libraries. The main function is provided by
_cgo_main.c:

    int main() { return 0; }
    void crosscall2(void(*fn)(void*, int), void *a, int c) { }
    void _cgo_allocate(void *a, int c) { }
    void _cgo_panic(void *a, int c) { }

The extra functions here are stubs to satisfy the references in the C
code generated for gcc. The build process links this stub, along with
_cgo_export.c and *.cgo2.c, into a dynamic executable and then lets
cgo examine the executable. Cgo records the list of shared library
references and resolved names and writes them into a new file
_cgo_import.c, which looks like:

    #pragma cgo_dynamic_linker "/lib64/ld-linux-x86-64.so.2"
    #pragma cgo_import_dynamic puts puts#GLIBC_2.2.5 "libc.so.6"
    #pragma cgo_import_dynamic __libc_start_main __libc_start_main#GLIBC_2.2.5 "libc.so.6"
    #pragma cgo_import_dynamic stdout stdout#GLIBC_2.2.5 "libc.so.6"
    #pragma cgo_import_dynamic fflush fflush#GLIBC_2.2.5 "libc.so.6"
    #pragma cgo_import_dynamic _ _ "libpthread.so.0"
    #pragma cgo_import_dynamic _ _ "libc.so.6"

In the end, the compiled Go package, which will eventually be
presented to 6l as part of a larger program, contains:

    _go_.6        # 6g-compiled object for _cgo_gotypes.go *.cgo1.go
    _cgo_defun.6  # 6c-compiled object for _cgo_defun.c
    _all.o        # gcc-compiled object for _cgo_export.c, *.cgo2.c
    _cgo_import.6 # 6c-compiled object for _cgo_import.c

The final program will be a dynamic executable, so that 6l can avoid
needing to process arbitrary .o files. It only needs to process the .o
files generated from C files that cgo writes, and those are much more
limited in the ELF or other features that they use.

In essence, the _cgo_import.6 file includes the extra linking
directives that 6l is not sophisticated enough to derive from _all.o
on its own. Similarly, the _all.o uses dynamic references to real
system object code because 6l is not sophisticated enough to process
the real code.

The main benefits of this system are that 6l remains relatively simple
(it does not need to implement a complete ELF and Mach-O linker) and
that gcc is not needed after the package is compiled. For example,
package net uses cgo for access to name resolution functions provided
by libc. Although gcc is needed to compile package net, gcc is not
needed to link programs that import package net.

When using cgo, Go must not assume that it owns all details of the
process. In particular it needs to coordinate with C in the use of
threads and thread-local storage. The runtime package, in its own
(6c-compiled) C code, declares a few uninitialized (default bss)
variables:

    void (*libcgo_thread_start)(void*);

Any package using cgo imports "runtime/cgo", which provides
initializations for these variables. It sets iscgo to 1, initcgo to a
gcc-compiled function that can be called early during program startup,
and libcgo_thread_start to a gcc-compiled function that can be used to
create a new thread, in place of the runtime's usual direct system
calls.

[NOTE: From here down is planned but not yet implemented.]

Internal and External Linking

The text above describes "internal" linking, in which 6l parses and
links host object files (ELF, Mach-O, PE, and so on) into the final
executable itself. Keeping 6l simple means we cannot possibly
implement the full semantics of the host linker, so the kinds of
objects that can be linked directly into the binary are limited (other
code can only be used as a dynamic library). On the other hand, when
using internal linking, 6l can generate Go binaries by itself.

In order to allow linking arbitrary object files without requiring
dynamic libraries, cgo will soon support an "external" linking mode
too. In external linking mode, 6l does not process any host object
files. Instead, it collects all the Go code and writes a single go.o
object file containing it. Then it invokes the host linker (usually
gcc) to combine the go.o object file and any supporting non-Go code
into a final executable. External linking avoids the dynamic library
requirement but introduces a requirement that the host linker be
present to create such a binary.

Most builds both compile source code and invoke the linker to create a
binary. When cgo is involved, the compile step already requires gcc, so
it is not problematic for the link step to require gcc too.

An important exception is builds using a pre-compiled copy of the
standard library. In particular, package net uses cgo on most systems,
and we want to preserve the ability to compile pure Go code that
imports net without requiring gcc to be present at link time. (In this
case, the dynamic library requirement is less significant, because the
only library involved is libc.so, which can usually be assumed
present.)

This conflict between functionality and the gcc requirement means we
must support both internal and external linking, depending on the
circumstances: if net is the only cgo-using package, then internal
linking is probably fine, but if other packages are involved, so that there
are dependencies on libraries beyond libc, external linking is likely
to work better. The compilation of a package records the relevant
information to support both linking modes, leaving the decision
to be made when linking the final binary.

In either linking mode, package-specific directives must be passed
through to 6l. These are communicated by writing #pragma directives
in a C source file compiled by 6c. The directives are copied into the
.6 object file and then processed by the linker.

#pragma cgo_import_dynamic <local> [<remote> ["<library>"]]

    In internal linking mode, allow an unresolved reference to
    <local>, assuming it will be resolved by a dynamic library
    symbol. The optional <remote> specifies the symbol's name and
    possibly version in the dynamic library, and the optional "<library>"
    names the specific library where the symbol should be found.

    In the <remote>, # or @ can be used to introduce a symbol version.

    Examples:
    #pragma cgo_import_dynamic puts
    #pragma cgo_import_dynamic puts puts#GLIBC_2.2.5
    #pragma cgo_import_dynamic puts puts#GLIBC_2.2.5 "libc.so.6"

    A side effect of the cgo_import_dynamic directive with a
    library is to make the final binary depend on that dynamic
    library. To get the dependency without importing any specific
    symbols, use _ for local and remote.

    Example:
    #pragma cgo_import_dynamic _ _ "libc.so.6"

    For compatibility with current versions of SWIG,
    #pragma dynimport is an alias for #pragma cgo_import_dynamic.
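
To make the directive grammar concrete, here is a minimal parsing
sketch. It is not the linker's implementation: dynImport and
parseDynImport are hypothetical names, defaulting <remote> to <local>
when it is omitted is an assumption, and splitting on whitespace means
library names containing spaces are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// dynImport holds the fields of one cgo_import_dynamic directive.
type dynImport struct {
	Local, Remote, Version, Library string
}

// parseDynImport parses a line of the form
//
//	#pragma cgo_import_dynamic <local> [<remote> ["<library>"]]
func parseDynImport(line string) (dynImport, error) {
	f := strings.Fields(line)
	if len(f) < 3 || len(f) > 5 || f[0] != "#pragma" || f[1] != "cgo_import_dynamic" {
		return dynImport{}, fmt.Errorf("malformed directive: %q", line)
	}
	// Assumption: an omitted <remote> means the same name as <local>.
	d := dynImport{Local: f[2], Remote: f[2]}
	if len(f) >= 4 {
		d.Remote = f[3]
	}
	// '#' or '@' introduces a symbol version inside <remote>.
	if i := strings.IndexAny(d.Remote, "#@"); i >= 0 {
		d.Remote, d.Version = d.Remote[:i], d.Remote[i+1:]
	}
	if len(f) == 5 {
		lib := f[4]
		if len(lib) < 2 || lib[0] != '"' || lib[len(lib)-1] != '"' {
			return dynImport{}, fmt.Errorf("library must be quoted: %q", lib)
		}
		d.Library = lib[1 : len(lib)-1]
	}
	return d, nil
}

func main() {
	d, err := parseDynImport(`#pragma cgo_import_dynamic puts puts#GLIBC_2.2.5 "libc.so.6"`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", d)
}
```

The "_ _" form described above parses the same way, with both Local and
Remote set to "_" and only Library carrying information.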

#pragma cgo_dynamic_linker "<path>"

    In internal linking mode, use "<path>" as the dynamic linker
    in the final binary. This directive is only needed from one
    package when constructing a binary; by convention it is
    supplied by runtime/cgo.

    Example:
    #pragma cgo_dynamic_linker "/lib/ld-linux.so.2"

#pragma cgo_export_dynamic <local> <remote>

    In internal linking mode, put the Go symbol
    named <local> into the program's exported symbol table as
    <remote>, so that C code can refer to it by that name. This
    mechanism makes it possible for C code to call back into Go or
    to share Go's data.

    For compatibility with current versions of SWIG,
    #pragma dynexport is an alias for #pragma cgo_export_dynamic.

#pragma cgo_import_static <local>

    In external linking mode, allow unresolved references to
    <local> in the go.o object file prepared for the host linker,
    under the assumption that <local> will be supplied by the
    other object files that will be linked with go.o.

    Example:
    #pragma cgo_import_static puts_wrapper

#pragma cgo_export_static <local> <remote>

    In external linking mode, put the Go symbol
    named <local> into the program's exported symbol table as
    <remote>, so that C code can refer to it by that name. This
    mechanism makes it possible for C code to call back into Go or
    to share Go's data.

#pragma cgo_ldflag "<arg>"

    In external linking mode, invoke the host linker (usually gcc)
    with "<arg>" as a command-line argument following the .o files.
    Note that the arguments are for "gcc", not "ld".

    Examples:
    #pragma cgo_ldflag "-lpthread"
    #pragma cgo_ldflag "-L/usr/local/sqlite3/lib"

A package compiled with cgo will include directives for both
internal and external linking; the linker will select the appropriate
subset for the chosen linking mode.
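
The selection described here can be pictured as a lookup from directive
name to linking mode. The sketch below is only an illustration: the
usedIn table restates the modes given in the directive descriptions
above, and selectDirectives is a hypothetical helper, not part of 6l.

```go
package main

import "fmt"

// usedIn maps each directive described above to the linking mode
// that consumes it.
var usedIn = map[string]string{
	"cgo_import_dynamic": "internal",
	"cgo_dynamic_linker": "internal",
	"cgo_export_dynamic": "internal",
	"cgo_import_static":  "external",
	"cgo_export_static":  "external",
	"cgo_ldflag":         "external",
}

// selectDirectives returns, in their original order, the directive
// names that apply to the given mode ("internal" or "external").
func selectDirectives(names []string, mode string) []string {
	var out []string
	for _, n := range names {
		if usedIn[n] == mode {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	all := []string{"cgo_import_dynamic", "cgo_import_static", "cgo_ldflag"}
	fmt.Println(selectDirectives(all, "internal"))
	fmt.Println(selectDirectives(all, "external"))
}
```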

As a simple example, consider a package that uses cgo to call C.sin.
The following code will be generated by cgo:

    type _Ctype_double float64
    func _Cfunc_sin(_Ctype_double) _Ctype_double

    #pragma cgo_import_dynamic sin sin#GLIBC_2.2.5 "libm.so.6"

    #pragma cgo_import_static _cgo_gcc_Cfunc_sin
    #pragma cgo_ldflag "-lm"

    void _cgo_gcc_Cfunc_sin(void*);

    void
    ·_Cfunc_sin(struct{uint8 x[16];}p)
    {
        runtime·cgocall(_cgo_gcc_Cfunc_sin, &p);
    }

    // compiled by gcc, into foo.cgo2.o

    void
    _cgo_gcc_Cfunc_sin(void *v)
    {
        struct {
            double p0;
            double r;
        } __attribute__((__packed__)) *a = v;
        a->r = sin(a->p0);
    }

What happens at link time depends on whether the final binary is linked
using the internal or external mode. If other packages are compiled in
"external only" mode, then the final link will be an external one.
Otherwise the link will be an internal one.

The directives in the 6c-compiled file are used according to the kind
of final link used.

In internal mode, 6l itself processes all the host object files, in
particular foo.cgo2.o. To do so, it uses the cgo_import_dynamic and
cgo_dynamic_linker directives to learn that the otherwise undefined
reference to sin in foo.cgo2.o should be rewritten to refer to the
symbol sin with version GLIBC_2.2.5 from the dynamic library
"libm.so.6", and the binary should request "/lib/ld-linux.so.2" as its
runtime dynamic linker.

In external mode, 6l does not process any host object files, in
particular foo.cgo2.o. It links together the 6g- and 6c-generated
object files, along with any other Go code, into a go.o file. While
doing that, 6l will discover that there is no definition for
_cgo_gcc_Cfunc_sin, referred to by the 6c-compiled source file. This
is okay, because 6l also processes the cgo_import_static directive and
knows that _cgo_gcc_Cfunc_sin is expected to be supplied by a host
object file, so 6l does not treat the missing symbol as an error when
creating go.o. Indeed, the definition for _cgo_gcc_Cfunc_sin will be
provided to the host linker by foo.cgo2.o, which in turn will need the
symbol 'sin'. 6l also processes the cgo_ldflag directives, so that it
knows that the eventual host link command must include the -lm
argument, so that the host linker will be able to find 'sin' in the
math library.

6l Command Line Interface

The go command and any other Go-aware build systems invoke 6l
to link a collection of packages into a single binary. By default, 6l will
present the same interface it does today: a plain 6l invocation
produces a file named 6.out, even if 6l does so by invoking the host
linker in external linking mode.

By default, 6l will decide the linking mode as follows: if the only
packages using cgo are those on a whitelist of standard library
packages (net, os/user, runtime/cgo), 6l will use internal linking
mode. Otherwise, there are non-standard cgo packages involved, and 6l
will use external linking mode. The first rule means that a build of
the godoc binary, which uses net but no other cgo, can run without
needing gcc available. The second rule means that a build of a
cgo-wrapped library like sqlite3 can generate a standalone executable
instead of needing to refer to a dynamic library. The specific choice
can be overridden using a command line flag: 6l -cgolink=internal or
6l -cgolink=external.
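
The default rule and its override can be sketched as a small function.
This is an illustration of the rule as stated, not 6l's code: linkMode
and internalOK are hypothetical names, and the override parameter stands
in for the value of the -cgolink flag (empty when the flag is absent).

```go
package main

import "fmt"

// internalOK lists the standard library packages named above whose
// use of cgo does not force external linking.
var internalOK = map[string]bool{
	"net":         true,
	"os/user":     true,
	"runtime/cgo": true,
}

// linkMode returns "internal" or "external" for a build whose
// cgo-using packages are listed in cgoPkgs.
func linkMode(cgoPkgs []string, override string) string {
	if override == "internal" || override == "external" {
		return override
	}
	for _, p := range cgoPkgs {
		if !internalOK[p] {
			return "external"
		}
	}
	return "internal"
}

func main() {
	fmt.Println(linkMode([]string{"net", "runtime/cgo"}, ""))
	fmt.Println(linkMode([]string{"net", "example/sqlite3"}, ""))
}
```

The package path example/sqlite3 is made up for the example; any package
outside the list has the same effect.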

In an external link, 6l will create a temporary directory, write any
host object files found in package archives to that directory (renamed
to avoid conflicts), write the go.o file to that directory, and invoke
the host linker. The default value for the host linker is $CC, split
into fields, or else "gcc". The specific host linker command line can
be overridden using a command line flag: 6l -hostld='gcc -ggdb'
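
The choice of host linker command can be sketched as follows. The
function name hostLinker is hypothetical, and splitting the -hostld
value into fields the same way as $CC is an assumption suggested by the
'gcc -ggdb' example rather than something the text specifies.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// hostLinker returns the leading words of the host link command.
// hostld is the value of the -hostld flag ("" if unset) and cc is the
// value of the CC environment variable.
func hostLinker(hostld, cc string) []string {
	if f := strings.Fields(hostld); len(f) > 0 {
		return f
	}
	if f := strings.Fields(cc); len(f) > 0 {
		return f
	}
	return []string{"gcc"}
}

func main() {
	// With no flag given, fall back to $CC and then to "gcc".
	fmt.Println(hostLinker("", os.Getenv("CC")))
}
```

Passing the environment value in as a parameter keeps the function easy
to exercise without changing the process environment.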

These defaults mean that Go-aware build systems can ignore the linking
changes and keep running plain '6l' and get reasonable results, but
they can also control the linking details if desired.