Go Tools: The Compiler — Part 1 Assembly Language and Go

https://medium.com/martinomburajr/go-tools-the-compiler-part-1-assembly-language-and-go-ffc42cbf579d

Martin Ombura Jr.

The Go compiler is at the heart of Go’s build process: it takes source code and generates executables from it. The compiler is available to anyone who wants to tinker with it through the **go tool compile** command.

Assembly and the Assembler

Assembly is a programming language meant to be understood by humans, and is often characterized as a low-level programming language. For some languages, the compiler generates assembly language as its output. In most cases assembly is the penultimate level in the hierarchy of programming abstractions: the last human-readable step before you are speaking directly to the machine.

Assembly cannot be executed directly by the host machine. It needs an assembler (which is just another program) to convert it into machine code. Assembly differs from machine code in that it is not binary, it cannot be run by the CPU as-is, and it is meant to be “human readable”. Typically, as you descend the hierarchy of programming abstractions, e.g. from high-level programming languages to low-level ones, more emphasis is placed on the actual characteristics of the architecture you’re running on. This arguably makes assembly highly relevant when pursuing performance or accessing low-level hardware functionality, such as in embedded systems.

Low Level Golang Primitives

Assembly gives the runtime access to functionality such as context switching and finer control over the stack, which allows for efficient communication of data in primitives such as channels. Another interesting use case is the math/big package in the Go standard library.

How Assembly is used in Go

Compiler Architecture

In a talk given by Rob Pike, he walked through some of the compiler’s architectural changes from early versions of the Go language to the modern-day implementation. The figure below (Figure 1) showcases a variety of steps compilers can take to transform code into linked programs.

The top row is the canonical way that toolchains built around assembly compile code. Code is compiled into assembly, and that assembly code is then linked. gcc is an example of a compiler that does this. The red dotted line marks the binary representation of pseudo-instructions that are generated at that point. The subsequent rows represent how [**Plan 9**](https://en.wikipedia.org/wiki/Plan_9_from_Bell_Labs) architectures go about creating executable binaries from code.

As of Go 1.3, in a bid to rid the Go standard library of C code, the Go language designers opted for a trade-off that sacrificed the speed of the compile step itself in exchange for faster builds overall.

In the newer architecture (the bottom two rows, which show Go’s compilation process), the compiler takes on what the traditional compiler would be doing (shown in the first row) as well as the role of the assembler. This is important: the compilation phase now handles not only the high-level code (Go) and the generation of assembly, it also generates real instructions in the form of an intermediate representation known as **obj**. **obj** generates real instructions for the linker and is also agnostic to the type of assembler.

Where does Golang fit in all this?

Assembly is included in a small set of Go packages. Some of these are **runtime**, **syscall**, **math**, **crypto** and **reflect**.

1. math package

The **math** and **math/big** packages are among the few packages in the standard library that contain assembly code your program calls into for better performance. This matters for computationally heavy work such as arithmetic on big numbers, i.e. numbers that do not fit in an int64: greater than 9,223,372,036,854,775,807 or less than -9,223,372,036,854,775,808.

In the words of Rob Pike, sometimes using assembler allows you to come up with something better than what the compiler could come up with on its own. math/big also has assembly functions for arithmetic operations on vectors that are more efficient to compute in assembly than in Go or C. On some architectures, certain polynomial coefficients and constants such as pi are referenced in assembly, as is the computation of inverse trigonometric functions (**arccos, arcsin, arctan**) and hyperbolic functions (**sinh, cosh, tanh**, etc.).
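To make the int64 limit concrete, here is a minimal sketch that steps one past the largest int64 using math/big. The helper name `nextAfterMaxInt64` is made up for this illustration. Whether a given operation ends up in an assembly routine depends on the architecture and Go version, so treat this as a demonstration of the API rather than of the assembly path.

```go
package main

import (
	"fmt"
	"math"
	"math/big"
)

// nextAfterMaxInt64 (hypothetical helper) returns math.MaxInt64 + 1.
// The result does not fit in an int64, so it is held in a big.Int.
func nextAfterMaxInt64() *big.Int {
	limit := big.NewInt(math.MaxInt64)
	return new(big.Int).Add(limit, big.NewInt(1))
}

func main() {
	fmt.Println(nextAfterMaxInt64())
}
```

The same addition on a plain int64 variable would silently wrap around to a negative number, which is the case math/big exists to avoid.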

2. crypto package

As hardware evolves and becomes capable of performing cryptographic operations very quickly, the software needs to catch up. The designers of Go chose to write some of the **crypto** functionality in assembly (fast cryptography can be hardware dependent) rather than as regular Go code in the standard library. Here are some examples:

  1. The **crypto/aes** package contains assembly code for various CPU architectures to optimize the encryption of blocks.
  2. The **crypto/elliptic** package contains assembly code for fast computation on elliptic curves over 256-bit prime fields.
  3. The **crypto/md5** package has assembly for different architectures for the computation of MD5 hashes.
  4. The **crypto/sha256** and **crypto/sha512** packages use assembly to optimize the computation of their SHA-256 and SHA-512 hashes.
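From the caller's side none of this assembly is visible. The sketch below hashes a message with crypto/sha256. The helper name `hexDigest` is an assumption made for this example, and whether an assembly implementation is actually selected depends on the CPU and Go version.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hexDigest (hypothetical helper) returns the SHA-256 digest of data
// as a lowercase hex string.
func hexDigest(data []byte) string {
	sum := sha256.Sum256(data) // sum is a [32]byte array
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(hexDigest([]byte("Hello world")))
}
```

The public API is identical on every platform. The architecture-specific code sits behind it.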

3. reflect package

The reflect package is Go’s go-to for functionality that involves reflection. Reflection is best defined by this answer from StackOverflow:

The ability to inspect the code in the system and see object types is not reflection, but rather Type Introspection. Reflection is then the ability to make modifications at runtime by making use of introspection. The distinction is necessary here as some languages support introspection, but do not support reflection. One such example is C++

Reflection tends to be slow, or computationally expensive. Some low-level pieces implemented in assembly can speed it up.
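The distinction drawn in the quote can be sketched with the reflect package. Below, `describe` only inspects a value (introspection), while `setInt` modifies one at runtime (reflection in the stricter sense). Both helper names are invented for this example.

```go
package main

import (
	"fmt"
	"reflect"
)

// describe (hypothetical helper) only inspects: it reports the kind of v.
func describe(v interface{}) string {
	return reflect.TypeOf(v).Kind().String()
}

// setInt (hypothetical helper) modifies the int that target points to.
// It returns false when target is not a non-nil pointer to an int.
func setInt(target interface{}, n int64) bool {
	v := reflect.ValueOf(target)
	if v.Kind() != reflect.Ptr || v.IsNil() {
		return false
	}
	elem := v.Elem()
	if !elem.CanSet() || elem.Kind() != reflect.Int {
		return false
	}
	elem.SetInt(n)
	return true
}

func main() {
	x := 1
	fmt.Println(describe(x)) // introspection
	setInt(&x, 42)           // modification at runtime
	fmt.Println(x)
}
```

Passing `x` instead of `&x` to `setInt` returns false, because a copy of the value is not settable.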

4. runtime/cgo package

The cgo tool enables calling C code from Go. Some of the cross-calling that happens between the two languages can be optimized with assembly. One technique is to reuse registers efficiently between callee and caller, saving computation and memory usage. Some assembly optimizations can also be done for code built with gcc. In other cases Go’s take on assembly is simply used to standardize the calling convention across all the different CPU platforms.

5. runtime/atomic and sync/atomic package

Some low-level synchronization primitives are implemented in assembly. This can be seen in both the **runtime/atomic** and **sync/atomic** packages.
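As a rough illustration of what these primitives are for, the sketch below increments a shared counter from several goroutines using sync/atomic. The function name `countConcurrently` is an assumption for this example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countConcurrently (hypothetical helper) starts `workers` goroutines that
// each add 1 to a shared counter `perWorker` times, then returns the total.
func countConcurrently(workers, perWorker int) int64 {
	var total int64
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				atomic.AddInt64(&total, 1) // atomic read-modify-write
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&total)
}

func main() {
	fmt.Println(countConcurrently(8, 1000))
}
```

Replacing `atomic.AddInt64` with a plain `total++` would introduce a data race, and the final count could come up short.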

6. syscall package

It makes sense that calls into the kernel go through assembly. On Linux-based GOOS targets, the actual **Syscall** function itself is implemented in assembly. Even retrieving core OS information, such as the time of day, is done in assembly. Perhaps there is value in optimizing that function, as a lot of tools need the OS time regularly and often.

Generating Assembly in Go

With all this talk of Assembly in Go, let’s actually generate some Assembly from Go code, using the go tools. So we’ll write an extremely simple Hello World application. I’m keeping it simple because the output tends to be very verbose.

    // main.go
    package main

    func main() {
        print("Hello world")
    }

    // Generates obj file as main.o
    // go tool compile main.go

This generates an obj file (which we spoke about earlier). It is simply binary, but you can inspect it if you are curious.

    // Generates assembly, and sends it to a new main.asm
    go tool compile -S main.go > main.asm

See link for output

This will send the output generated by the **-S** flag into a main.asm file so you can inspect it. Note how verbose the output is, as well as all the underlying instructions.
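If the Hello World output is hard to navigate, a smaller target can help. The sketch below is an assumption of mine rather than part of the original walkthrough: it defines a tiny `add` function and marks it `//go:noinline` so the compiler keeps it as a separate symbol you can search for in the listing. The file name `add.go` is hypothetical.

```go
// add.go (hypothetical file name)
package main

// add is kept out of line so it appears as its own symbol
// in the assembly listing.
//
//go:noinline
func add(a, b int) int {
	return a + b
}

func main() {
	println(add(2, 3))
}
```

Running `go tool compile -S add.go`, or `go build -gcflags=-S add.go` if the former complains on your Go version, should print a listing in which you can search for `add`. The exact instructions vary by architecture and compiler version.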

To see the full contents of the generated files, check out this GitHub repo.

Conclusion

Assembly is important and has been around for a long time. As the need for performance and closer access to low-level hardware grows, its value is undeniable.

If you have never seen Assembly language, here’s a code snippet from one of my 2nd Year Computer Science assignments. This code is written for the MIPS architecture and checks whether a given number is a palindrome (i.e. reads the same forward and in reverse). Note that Assembly varies based on the CPU architecture.

    #OMBMAR001
    .data
    isPalinMess: .asciiz "It is a palindrome \n"
    noPalinMess: .asciiz "It is not a palindrome \n"
    startMess: .asciiz "Enter a number: \n"
    temp1: .word 0 # entered number
    decima: .word 10 # decimal base
    .text
    main:
    #START MESSAGE
    la $a0,startMess
    li $v0, 4
    syscall
    #ENTER NUMBER
    li $v0, 5
    syscall
    sw $v0, temp1 #STORE NUMBER TO MEMORY
    ################################ SETUP ########################
    move $v1, $v0 #move v0 to v1
    move $t2, $zero # set t2 = 0
    lw $t3, decima # set t3 = 10
    move $t0, $zero # set t0 = 0
    move $t4, $v1 # copy of int1
    jal reverse
    j end
    #######################REVERSE: REVERSES PALINDROME#######################################
    reverse: #reverse the number
    bgtz $t4, reverseFunction # while 0 < int1 (in $t4) do reverse step
    j compare
    ################# REVERSEFUNCTION: REVERSES THE INTEGER #########################
    reverseFunction:
    div $t4, $t3 # no/10
    mfhi $t6 # temp variable for mod rest
    mult $t2, $t3 # rev * 10
    mflo $t2 # temp variable for mult result
    add $t2, $t2, $t6 # rev + mod
    div $t4, $t3 # divide int1 copy by 10
    mflo $t4 # temp value for t4
    j reverse
    ########### COMPARE: COMPARES THE TWO INTEGERS ########################
    compare:
    beq $t2, $v1, pal
    j nopal
    ############### PAL : IF PALINDROME PRINT MESSAGE #####################
    pal:
    la $a0, isPalinMess # load is palindrome message
    li $v0, 4
    syscall
    j end
    ############### IF NOT PALINDROME: PRINT MESSAGE ###############################
    nopal:
    la $a0, noPalinMess # load no palindrome message
    li $v0, 4
    syscall
    j end
    end:
    li $v0,10
    syscall #Exit
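For comparison, here is my reading of the same algorithm as a minimal Go sketch: reverse the decimal digits with repeated divide-and-remainder by 10, then compare with the original. The function names `reverseDigits` and `isPalindrome` are chosen for this example and do not appear in the assignment.

```go
package main

import "fmt"

// reverseDigits mirrors the reverse loop in the MIPS code: while n > 0,
// append the last digit of n to rev and drop it from n.
func reverseDigits(n int) int {
	rev := 0
	for n > 0 {
		rev = rev*10 + n%10 // rev * 10 + (n mod 10)
		n /= 10             // divide the working copy by 10
	}
	return rev
}

// isPalindrome mirrors the compare step: the number is a palindrome
// when its reversed digits equal the original value.
func isPalindrome(n int) bool {
	return reverseDigits(n) == n
}

func main() {
	for _, n := range []int{121, 123, 0} {
		fmt.Println(n, isPalindrome(n))
	}
}
```

As in the assembly version, a negative input never enters the loop, so it compares against 0 and is reported as not a palindrome.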
