String formatting in Go
Uday Hiwarale, May 24

String formatting, or string interpolation, is an important concept in any language. Printf is probably the most common way to format a variable of any type into a string. We will see many variants of it in this tutorial.


Printing to the standard output in Go is very easy. You just import the fmt package (which stands for format, though Go developers like to call it fumpt) and call functions like Print to write to the standard output, such as a terminal or command prompt.

We have a couple of different functions like Print, Printf and Println which can achieve this. Let’s focus on Print and Printf first.

fmt.Print(args…) (n int, e error)

**Print** is a variadic function, which means it can accept multiple arguments. It concatenates the arguments, adding a space between two adjacent arguments only when neither of them is a string, and writes the result to the standard output without a trailing newline.

Print returns the number of bytes written and any error that occurred while writing, but we generally ignore these return values.

fmt.Println(args…) (n int, e error)

**Println** is also a variadic function. It always separates the arguments with spaces and writes the result to the standard output with a trailing newline.

fmt.Sprint(args…) string & fmt.Sprintln(args…) string

Sprint and Print work in a similar fashion. Sprint does everything Print does, but instead of writing the resulting string to the standard output, it returns it.

As you may have guessed by now, Sprintln works the same way as Println, except that it returns the resulting string.
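The spacing rules described above can be checked with a minimal sketch like the one below. The helper names joinWithSprint and joinWithSprintln are assumptions made for illustration; only the fmt functions come from the standard library.

```go
package main

import "fmt"

// joinWithSprint is a hypothetical helper used only for this sketch.
func joinWithSprint() string {
	// 1 and 2 are both non-strings, so Sprint puts a space between them.
	// No space is added next to the string operands "a" and "b".
	return fmt.Sprint("a", 1, 2, "b")
}

// joinWithSprintln is a hypothetical helper used only for this sketch.
func joinWithSprintln() string {
	// Sprintln always separates operands with spaces and appends a newline.
	return fmt.Sprintln("a", 1, 2, "b")
}

func main() {
	fmt.Print("a", 1, 2, "b") // writes a1 2b without a trailing newline
	fmt.Println()
	fmt.Println("a", 1, 2, "b") // writes a 1 2 b followed by a newline

	fmt.Printf("%q\n", joinWithSprint())   // "a1 2b"
	fmt.Printf("%q\n", joinWithSprintln()) // "a 1 2 b\n"
}
```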

In most cases, you will be using **Println** because it’s simple and you don’t need to do complex concatenation, as it handles that by itself. But what if we need to do string interpolation?

Before we move forward, let’s understand scientific notation first. If a floating point number is too long to write out, like **0.00000014325**, we use scientific notation, which is written in the form **_m_ × 10^n** (m times ten raised to the power of n). Here m (called the mantissa) can be an integer or a floating point number in the shortest possible format, and n is a positive or negative integer (called the exponent). Read more about scientific notation here on Wikipedia. Sometimes the "× 10^" part is replaced with e or E for simplicity (this e only marks the exponent and has nothing to do with _e_ in the exponential function). Here are a few examples of scientific notation.

  1. 125000 => 125 * 10^3
  2. 7.5 => 7.5 * 10^0
  3. 9.815670 => 9.81567 * 10^0
  4. 0.0081 => 8.1e-3 // same as: 8.1 * 10^-3
  5. -56.11 => -5.611E1 // same as: -5.611 * 10^1
  6. 0.00000014325 => 1.4325 * 10^-7

It is also possible to write a floating point number in a binary format, brilliantly explained here by Abhishalini.

What is string interpolation?

A string literal can have placeholders, which are also called verbs. These placeholders can be replaced by actual values using the string formatting functions provided by the fmt package.

Verbs in Go take the form %x, where a formatting verb must start with % and x is the formatting operator. Let’s see a quick demo using the Printf formatting function provided by the fmt package.
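A minimal sketch of such a demo follows. The helper name describe and the sample values are assumptions for illustration. The helper uses fmt.Sprintf, the string-returning sibling of Printf, so that the result can be inspected.

```go
package main

import "fmt"

// describe is a hypothetical helper; its name and the sample values
// are assumptions made for this sketch.
func describe(name string, age int) string {
	// Each %v is replaced, in order, by the next argument.
	return fmt.Sprintf("%v is %v years old", name, age)
}

func main() {
	// Printf writes straight to the standard output. The \n is needed
	// because Printf does not add a newline on its own.
	fmt.Printf("%v is %v years old\n", "Gopher", 10) // Gopher is 10 years old
	fmt.Println(describe("Gopher", 10))              // Gopher is 10 years old
}
```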

fmt.Printf expects its first argument to be a format string with some placeholders. The remaining arguments are substituted into the placeholders in order. For instance, in a call such as fmt.Printf("Hello %v", name), %v is a placeholder that is replaced by the value as it appears (we will discuss this later). The number of arguments after the format string should be equal to the number of placeholders used. Printf does not end with a trailing newline (and there is no _Printfln_ function) and has the same return values as the Print function.

Note: fmt.Sprintf is similar to fmt.Printf, but instead of writing to the standard output, it returns the formatted string.

Formatting verbs (placeholders)

In the sections above, we covered the formatting functions, and there is not much more to them. Now we turn our attention to formatting verbs: there are too many to remember, but they can save a lot of time in debugging.

➭ %v (default format)

The %v formatting verb prints a value in the default format that Go provides for its type. Let’s see a couple of examples to find out what the default formats are.
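As a rough illustration, the sketch below passes values of several types through %v. The helper name defaultFormat is an assumption for this example.

```go
package main

import "fmt"

// defaultFormat is a hypothetical helper that formats any value with %v.
func defaultFormat(v interface{}) string {
	return fmt.Sprintf("%v", v)
}

func main() {
	fmt.Println(defaultFormat(true))                     // true
	fmt.Println(defaultFormat(42))                       // 42
	fmt.Println(defaultFormat(0x2A))                     // 42
	fmt.Println(defaultFormat(3.5))                      // 3.5
	fmt.Println(defaultFormat("hello"))                  // hello
	fmt.Println(defaultFormat('A'))                      // 65
	fmt.Println(defaultFormat([]int{1, 2, 3}))           // [1 2 3]
	fmt.Println(defaultFormat(map[string]int{"a": 1}))   // map[a:1]
	fmt.Println(defaultFormat(struct{ X, Y int }{1, 2})) // {1 2}
}
```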

%v has no format of its own; Go picks another formatting verb based on the type of the input value. In the descriptions of the later verbs, we will see which verb %v uses for each kind of value.

There is another variant of %v, **%#v**, which formats the value as it would appear in Go source code. This is also called the Go-syntax representation.

Here, # is called a flag. Flags come between the **%** sign and the actual verb letter. There are other flags in Go, which we will see in later topics.

If you need to display field names in a struct or a pointer to a struct, you can use the %+v verb (the **_+_** flag).
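The following sketch compares %v, %+v and %#v. The point type and the helper names are assumptions for illustration, and the package name in the %#v output depends on the package the type is declared in.

```go
package main

import "fmt"

// point is a hypothetical struct used only for illustration.
type point struct {
	X, Y int
}

// withFieldNames formats a value with the + flag (hypothetical helper).
func withFieldNames(v interface{}) string {
	return fmt.Sprintf("%+v", v)
}

// goSyntax formats a value with the # flag (hypothetical helper).
func goSyntax(v interface{}) string {
	return fmt.Sprintf("%#v", v)
}

func main() {
	p := point{X: 1, Y: 2}

	fmt.Printf("%v\n", p)           // {1 2}
	fmt.Println(withFieldNames(p))  // {X:1 Y:2}
	fmt.Println(withFieldNames(&p)) // &{X:1 Y:2}
	fmt.Println(goSyntax(p))        // main.point{X:1, Y:2} when built as package main

	fmt.Println(goSyntax("hi"))        // "hi"
	fmt.Println(goSyntax([]int{1, 2})) // []int{1, 2}
}
```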

➭ %T (Type format)

The %T formatting verb prints the data type of a value in standard Go syntax.
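A short sketch of %T applied to a handful of common types is shown here. The helper name typeOf is an assumption for this example.

```go
package main

import "fmt"

// typeOf is a hypothetical helper that reports a value's type using %T.
func typeOf(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	x := 10

	fmt.Println(typeOf(42))               // int
	fmt.Println(typeOf(3.5))              // float64
	fmt.Println(typeOf("go"))             // string
	fmt.Println(typeOf(true))             // bool
	fmt.Println(typeOf('a'))              // int32
	fmt.Println(typeOf(byte(1)))          // uint8
	fmt.Println(typeOf([]string{"a"}))    // []string
	fmt.Println(typeOf(map[string]int{})) // map[string]int
	fmt.Println(typeOf(&x))               // *int
}
```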

➭ %t (boolean format)

%t formats a boolean value as the word true or false.

Note: This is the formatting verb used by %v for a boolean value.
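For instance, a tiny sketch (the helper name status is an assumption for this example):

```go
package main

import "fmt"

// status is a hypothetical helper that embeds a boolean in a message.
func status(ok bool) string {
	return fmt.Sprintf("ready: %t", ok)
}

func main() {
	fmt.Println(status(true))  // ready: true
	fmt.Println(status(3 < 2)) // ready: false
}
```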

➭ %d (base10 format)

%d formats an integer in base 10, even if it was written as a hexadecimal or octal literal in the source code. You can use it with int, int8, int16, int32, int64 and their unsigned variants. This means you can also see the numeric value of a character, since rune is an alias for int32 (explained more in the %c section).

You can optionally use %+d to always print the sign of a numeric value, so that positive numbers get a leading +. If you want to leave a space in place of the sign for positive numbers, use % d; negative numbers still print their minus sign.

You can also use a width to add extra padding to the output, as explained in the decimal format section.

Note: %d is the formatting verb used by %v for integers (however the literal was written) and runes (characters).
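The variants of %d described above can be sketched as follows. The helper names are assumptions for illustration.

```go
package main

import "fmt"

// The helpers below are hypothetical and exist only for this sketch.
func decimal(n int) string { return fmt.Sprintf("%d", n) }
func signed(n int) string  { return fmt.Sprintf("%+d", n) }
func spaced(n int) string  { return fmt.Sprintf("% d", n) }
func padded(n int) string  { return fmt.Sprintf("%5d", n) }

func main() {
	fmt.Println(decimal(0xFF)) // 255 (hexadecimal literal)
	fmt.Println(decimal(017))  // 15 (octal literal)
	fmt.Printf("%d\n", 'A')    // 65 (rune value)

	fmt.Println(signed(7))  // +7
	fmt.Println(signed(-7)) // -7

	fmt.Printf("[%s]\n", spaced(7))  // [ 7]
	fmt.Printf("[%s]\n", spaced(-7)) // [-7]
	fmt.Printf("[%s]\n", padded(42)) // [   42]
}
```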

➭ %b (base2/binary number format)

If you want to format an integer into a binary number, then use %b verb.
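A quick sketch of %b is given below. The helper name binary is an assumption, and the %08b line additionally uses a zero-padded width of 8.

```go
package main

import "fmt"

// binary is a hypothetical helper that formats an integer in base 2.
func binary(n int) string {
	return fmt.Sprintf("%b", n)
}

func main() {
	fmt.Println(binary(5))   // 101
	fmt.Println(binary(255)) // 11111111
	fmt.Printf("%08b\n", 5)  // 00000101
}
```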

➭ %x (hexadecimal format) and %o (octal format)

Any integer can be represented in base16 (hexadecimal) or base8 (octal) format. %x is used for hexadecimal conversion while %o is used for octal conversion. %X is a variant of %x that you can use when you want the letters in the hexadecimal representation to be uppercase.

%#x or %#X adds a leading 0x or 0X to the formatted output to denote a hexadecimal number, while %#o adds a leading 0 to denote an octal number.
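These verbs and the # flag can be tried with a sketch like this one. The helper names are assumptions for illustration.

```go
package main

import "fmt"

// The helpers below are hypothetical and exist only for this sketch.
func hexLower(n int) string    { return fmt.Sprintf("%x", n) }
func hexUpper(n int) string    { return fmt.Sprintf("%X", n) }
func hexPrefixed(n int) string { return fmt.Sprintf("%#x", n) }
func octal(n int) string       { return fmt.Sprintf("%o", n) }
func octalPrefixed(n int) string {
	return fmt.Sprintf("%#o", n)
}

func main() {
	fmt.Println(hexLower(255))    // ff
	fmt.Println(hexUpper(255))    // FF
	fmt.Println(hexPrefixed(255)) // 0xff
	fmt.Printf("%#X\n", 255)      // 0XFF
	fmt.Println(octal(8))         // 10
	fmt.Println(octalPrefixed(8)) // 010
}
```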

%x and %X can also convert a string to its hexadecimal representation. Since a string in Go is a read-only slice of bytes, the hexadecimal representation of a string is the hexadecimal representation of each byte in that slice. Since the maximum value of a byte is hexadecimal FF (255 in decimal), we should expect two hexadecimal characters per byte.

Let’s use the characters J, K, and lowercase beta (β) from this Unicode list. Their hexadecimal byte values are 4a, 4b, and ceb2 respectively.
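A minimal sketch with those three characters follows. The helper names are assumptions, and the "% x" form, which puts a space between bytes, is an extra taken from the fmt package rather than from the text above.

```go
package main

import "fmt"

// hexBytes is a hypothetical helper that hex-encodes every byte of s.
func hexBytes(s string) string {
	return fmt.Sprintf("%x", s)
}

// hexBytesSpaced does the same but separates the bytes with spaces.
func hexBytesSpaced(s string) string {
	return fmt.Sprintf("% x", s)
}

func main() {
	s := "JKβ" // J and K take one byte each, β takes two bytes in UTF-8

	fmt.Println(hexBytes(s))       // 4a4bceb2
	fmt.Println(hexBytesSpaced(s)) // 4a 4b ce b2
	fmt.Printf("%X\n", s)          // 4A4BCEB2
}
```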

➭ %f (decimal format) and %e (scientific notation format)

%f is useful when you want to format a floating point number (**_float32_** or **_float64_**) in decimal notation with a custom width and precision. Width is the minimum number of characters the output takes (the output is left-padded with spaces when the width is more than the characters needed to represent the number), while precision is the number of digits shown after the decimal point. %f can be used in the variant below to customize width and precision.

  1. %**w**.**p**f => **w** is width and **p** is precision

Both w and p are optional, while the dot . is necessary to separate them when both are used (and also when only p is given, as in %.2f). When w is not provided, Go uses just enough width to show the number. When p is not provided, Go uses the default precision of 6.

You can also use %F in place of %f, as both of them do the same thing. You can use the + flag to always display the sign of the number; for example, %+.2f formats 3.14159 as +3.14. You can use the # flag to always print the decimal point; for example, %#.0f formats 3.0 as 3. while %.0f prints just 3.
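Width, precision and flags for %f can be explored with a sketch like the following. The helper names and the sample value are assumptions for illustration.

```go
package main

import "fmt"

// The helpers below are hypothetical and exist only for this sketch.
func fixed(f float64) string          { return fmt.Sprintf("%f", f) }
func twoDecimals(f float64) string    { return fmt.Sprintf("%.2f", f) }
func paddedDecimals(f float64) string { return fmt.Sprintf("%8.2f", f) }
func alwaysPoint(f float64) string    { return fmt.Sprintf("%#.0f", f) }

func main() {
	fmt.Println(fixed(3.14159))                   // 3.141590 (default precision 6)
	fmt.Println(twoDecimals(3.14159))             // 3.14
	fmt.Printf("[%s]\n", paddedDecimals(3.14159)) // [    3.14]
	fmt.Printf("%08.2f\n", 3.14159)               // 00003.14
	fmt.Printf("%+.2f\n", 3.14159)                // +3.14
	fmt.Printf("%.0f\n", 3.0)                     // 3
	fmt.Println(alwaysPoint(3.0))                 // 3.
}
```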

When a floating point number is too large or too small to display conveniently, we can use scientific notation as discussed earlier. You can use %e or %E to format a floating point number in scientific notation. These verbs accept width, precision and flags in the same way as %f.
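As a small sketch of %e and %E (the helper names and sample values are assumptions for illustration):

```go
package main

import "fmt"

// scientific is a hypothetical helper using the default precision of 6.
func scientific(f float64) string {
	return fmt.Sprintf("%e", f)
}

// scientificShort is a hypothetical helper that keeps 2 digits after the point.
func scientificShort(f float64) string {
	return fmt.Sprintf("%.2e", f)
}

func main() {
	fmt.Println(scientific(123456.789))      // 1.234568e+05
	fmt.Println(scientificShort(123456.789)) // 1.23e+05
	fmt.Printf("%E\n", 123456.789)           // 1.234568E+05
	fmt.Println(scientific(0.00000014325))   // 1.432500e-07
}
```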

➭ %g (%f ↔️ %e fallback)

%g or %G formats a floating point number in scientific notation (like %e or %E) when its exponent is large, and in plain decimal notation (like %f) otherwise. Go switches to scientific notation when the exponent is less than -4 or greater than or equal to the precision.

%g works a little differently from %f when it comes to precision. Precision in %g is the maximum number of significant digits displayed in the output. For example, formatting 123456.789 with %.3g keeps 3 significant digits, so Go falls back to scientific notation and prints 1.23e+05, while %.6g keeps 6 significant digits, which is enough to print the integer part in plain decimal notation as 123457.
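The behaviour of %g can be sketched as follows. The helper names and sample values are assumptions for illustration. Without an explicit precision, %g uses the smallest number of digits that represents the value uniquely.

```go
package main

import "fmt"

// The helpers below are hypothetical and exist only for this sketch.
func general(f float64) string     { return fmt.Sprintf("%g", f) }
func threeDigits(f float64) string { return fmt.Sprintf("%.3g", f) }
func sixDigits(f float64) string   { return fmt.Sprintf("%.6g", f) }

func main() {
	fmt.Println(general(123456.789))     // 123456.789
	fmt.Println(general(1234567.0))      // 1.234567e+06
	fmt.Println(general(0.00000014325))  // 1.4325e-07
	fmt.Println(threeDigits(123456.789)) // 1.23e+05
	fmt.Println(sixDigits(123456.789))   // 123457
	fmt.Printf("%v\n", 123456.789)       // 123456.789, same as %g
}
```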

Note: %v uses the %g formatting verb to format floating point numbers.

➭ %c (character format) and %U (Unicode format)

As we have seen in the string data type tutorial, a string is a slice of bytes, but a character is represented by the rune data type. A rune is an alias for **int32** and holds the Unicode code point of a character; inside a string, that code point is stored as one or more bytes in UTF-8 encoding (please read the tutorial, it makes this very easy to understand). When you have a rune and you want to see the character it represents, you can use the %c formatting verb.

First, we need to create a rune in order to use %c. Let’s use the Greek capital letter Theta (Θ), as it can’t be represented in the ASCII character set. As you can see from here, the decimal value of this character is 920 (code point) and its hexadecimal value is 0398, while its encoding sequence (code units) is 11001110 10011000 in binary and ce 98 in hexadecimal. Since Go source code is encoded in UTF-8, we can store this character in a string (using double quotes) and in a rune (using single quotes).
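A minimal sketch of this experiment is shown below, with a rune r and a string s that both hold Theta. The helper name codeUnits is an assumption for illustration.

```go
package main

import "fmt"

// codeUnits is a hypothetical helper that returns each byte of s in hexadecimal.
func codeUnits(s string) []string {
	units := make([]string, 0, len(s))
	for i := 0; i < len(s); i++ {
		units = append(units, fmt.Sprintf("%x", s[i]))
	}
	return units
}

// character is a hypothetical helper that formats a rune with %c.
func character(r rune) string {
	return fmt.Sprintf("%c", r)
}

func main() {
	r := 'Θ' // rune holding the code point
	s := "Θ" // string holding the UTF-8 bytes

	fmt.Printf("%d\n", r)             // 920
	fmt.Println(character(r))         // Θ
	fmt.Println(len(s))               // 2
	fmt.Println(codeUnits(s))         // [ce 98]
	fmt.Printf("%b %b\n", s[0], s[1]) // 11001110 10011000
}
```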

Suppose r is a rune and s is a string, both holding the Theta character. Using the %d formatting verb on r prints its decimal value (920). Since s is a string, which is a slice of bytes, we can obtain the UTF-8 code units of the Theta character with a for loop over its bytes. Then, using the %c formatting verb, we can format the rune r (which holds the code point as an int32) as the character it represents (list here).

%U is used to format a Unicode character to its Unicode notation format. Here is a good answer to why we use **U+** for Unicode notation.

If you want to display the original character in the output after the Unicode notation, you can use %#U.
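Both forms can be seen in a short sketch (the helper names are assumptions for illustration):

```go
package main

import "fmt"

// unicodeNotation is a hypothetical helper that formats a rune with %U.
func unicodeNotation(r rune) string {
	return fmt.Sprintf("%U", r)
}

// unicodeWithChar is a hypothetical helper that formats a rune with %#U.
func unicodeWithChar(r rune) string {
	return fmt.Sprintf("%#U", r)
}

func main() {
	fmt.Println(unicodeNotation('Θ')) // U+0398
	fmt.Println(unicodeWithChar('Θ')) // U+0398 'Θ'
	fmt.Println(unicodeNotation('J')) // U+004A
}
```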

➭ %s (string format) and %q (escaped string format)

The %s formatting verb formats a string or a slice of bytes as a plain string. The formatted string is not escaped, so if the input string or slice of bytes contains double quote characters (byte value 34), they appear unchanged in the output.

%q works in the same way as %s, but it wraps the output in double quotes and escapes any double quotes or characters with special meaning (like a backslash) inside it. Escaping a string is very useful, as explained here on StackOverflow.
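The difference between %s and %q can be sketched like this. The helper names and sample inputs are assumptions for illustration.

```go
package main

import "fmt"

// plain is a hypothetical helper that formats a string or []byte with %s.
func plain(v interface{}) string {
	return fmt.Sprintf("%s", v)
}

// quoted is a hypothetical helper that formats a string or []byte with %q.
func quoted(v interface{}) string {
	return fmt.Sprintf("%q", v)
}

func main() {
	fmt.Println(plain("say \"hi\""))               // say "hi"
	fmt.Println(plain([]byte{34, 104, 105, 34}))   // "hi" (34 is the double quote byte)
	fmt.Println(quoted("say \"hi\""))              // "say \"hi\""
	fmt.Println(quoted(`C:\temp`))                 // "C:\\temp"
	fmt.Println(quoted([]byte("hi")))              // "hi"
}
```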

If you need ASCII-only output, you can use %+q, which converts any non-ASCII character to its Unicode escape sequence (written in ASCII characters). For example, \u0398 represents the Theta character.
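A small sketch comparing %q with %+q on the Theta character follows. The helper name asciiQuoted is an assumption for illustration.

```go
package main

import "fmt"

// asciiQuoted is a hypothetical helper that formats a string with %+q.
func asciiQuoted(s string) string {
	return fmt.Sprintf("%+q", s)
}

func main() {
	fmt.Printf("%q\n", "Θ")         // "Θ"
	fmt.Println(asciiQuoted("Θ"))   // "\u0398"
	fmt.Println(asciiQuoted("JKβ")) // "JK\u03b2"
}
```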

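Note: _%v_ uses the _%s_ formatting verb to format strings. A slice of bytes, however, is printed by _%v_ as a list of decimal values such as [104 105], so use _%s_ explicitly when you want its text.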

➭ %p (pointer address format)

The %p formatting verb formats the memory address held by a pointer in lowercase base16 notation with a leading 0x. A slice is also a valid input for %p, in which case it formats the address of the 0th element of the slice in **base16** notation. %p also formats a channel in **base16** notation, as a channel is itself a pointer.

Note: _%v_ uses the _%p_ formatting verb to format a channel or a pointer, except that a top-level pointer to a struct, array, slice or map is printed as & followed by the value. %#p removes the leading 0x from the **base16** notation of the memory address.
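The sketch below illustrates these points. The helper names are assumptions, and since real addresses change from run to run, no concrete address is shown.

```go
package main

import "fmt"

// address is a hypothetical helper that formats a pointer, slice or channel with %p.
func address(v interface{}) string {
	return fmt.Sprintf("%p", v)
}

// bareAddress is a hypothetical helper that drops the leading 0x using %#p.
func bareAddress(v interface{}) string {
	return fmt.Sprintf("%#p", v)
}

func main() {
	x := 42
	nums := []int{1, 2, 3}
	ch := make(chan int)

	fmt.Println(address(&x))                        // an address starting with 0x
	fmt.Println(address(nums) == address(&nums[0])) // true
	fmt.Println(address(ch))                        // an address starting with 0x
	fmt.Println(fmt.Sprintf("%v", &x) == address(&x)) // true for a pointer to an int
	fmt.Println("0x"+bareAddress(&x) == address(&x))  // true
}
```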

Other important notes

  1. If a verb does not support a flag, the flag is ignored.
  2. If you want a literal % in the output and you do not want the fmt package to interpret it as the start of a verb, escape it with another %. Hence fmt.Printf("%% is %d", 100) will output % is 100.
  3. If you have a custom type and you want to control how its values are printed by the %v formatting verb (instead of Go’s default interpretation of the type), your type must implement the String method, that is, the fmt.Stringer interface.
  4. As we know, if the width in %f is greater than the length of the output, the output is padded on the left with spaces. If you want the padding on the right instead, use the - flag, as in %-8.2f.
  5. There are other important functions provided by the fmt package, which you can read about in the official fmt package documentation.
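Notes 2, 3 and 4 can be sketched together as follows. The celsius type and the helper names are assumptions made for illustration.

```go
package main

import "fmt"

// celsius is a hypothetical custom type used only for this sketch.
type celsius float64

// String makes celsius satisfy fmt.Stringer, so %v and %s call it.
// Converting to float64 first avoids calling String recursively.
func (c celsius) String() string {
	return fmt.Sprintf("%.1f°C", float64(c))
}

// label is a hypothetical helper that formats a celsius value with %v.
func label(c celsius) string {
	return fmt.Sprintf("%v", c)
}

// leftAligned is a hypothetical helper that pads on the right using the - flag.
func leftAligned(f float64) string {
	return fmt.Sprintf("[%-8.2f]", f)
}

func main() {
	fmt.Println(label(celsius(21.5))) // 21.5°C
	fmt.Printf("[%8.2f]\n", 3.14159)  // [    3.14]
	fmt.Println(leftAligned(3.14159)) // [3.14    ]
	fmt.Printf("%d%% done\n", 100)    // 100% done
}
```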