Skip to content

Latest commit

 

History

History

strings

String

A string is a ubiquitous data structure, typically a built-in data type in programming languages. However, beneath the surface, strings are essentially slices of characters that enable textual data storage and manipulation.

Implementation

In Go, strings are a data type. Behind the scenes strings are an immutable slice of bytes. Since Go is a UTF-8 compliant language, each character in Go can take up to 4 bytes of storage.

The strings package provides several useful convenience functions. Examples include:

When a iterating every character in a string in Go using the range keyword, every element becomes a rune which is an alias for the type int32. If the code being written works with many single-character strings, it is better to define variables and function parameters as rune rather than convert them many times. The following code shows how to iterate through a string.

package main

import "fmt"

/*
 main outputs the rune (int32) value of each character:

 Char #0 "a" has value 97
 Char #1 "A" has value 65
 Char #2 "和" has value 21644
 Char #5 "平" has value 24179
 Char #8 "😊" has value 128522
*/
func main() {
	for i, r := range "aA𓅚😊" {
		fmt.Printf("Char #%d %q has value %d\n", i, string(r), r)
	}
}

A very common tool to use for manipulating strings in Go is the fmt.Sprintf function. This is specially useful when converting many values into a string.

package main

import "fmt"

func main() {
	number := 1
	value := 1.1
	name := "foo"

	output := fmt.Sprintf("%d %f %s", number, value, name)
	fmt.Println(output) // 1 1.100000 foo
}

Regular Expressions

Unlike many other programming languages, in Go regular expressions are guaranteed to have O(n) time complexity where n is the length of the input, making them a viable and practical option for pattern matching in a string.

Here is an example of how you can find a pattern using regular expressions in Go. Given a string return the string if it contains a fish word. A fish word is a word that starts with fi optionally followed by other character(s) and ends with sh. Examples include {fish, finish}.

package main

import (
	"fmt"
	"regexp"
)

var fishPattern = regexp.MustCompile(`(?i).*fi\w*sh\b`)

// main outputs [selfish][shellfish][fish][finish][Finnish]
func main() {
	inputs := []string{"shift", "selfish", "shellfish", "fish dish", "finish", "Finnish"}

	for _, input := range inputs {
		matches := fishPattern.FindAllString(input, -1)
		if len(matches) > 0 {
			fmt.Print(matches)
		}
	}
}

Complexity

Since strings are slices of bytes, the time complexity of string operations should be similar to arrays. Reading a character at a given index is O(1), but since strings are immutable modifying them involves creating a new string making it a O(n) operation. Go standard library includes strings.Builder for more efficient string building.

The space complexity to store a string depends on the type of characters. This following example shows how we can index a string and print the hexadecimal value of every byte in it.

package main

import "fmt"

// main Outputs 41 f0 93 85 9a f0 9f 98 8a.
func main() {
	input := "A𓅚😊"
	for i := 0; i < len(input); i++ {
		fmt.Printf("%x ", input[i])
	}
}

The output of the above code indicates that 9 bytes are used to store the 3 input characters. 1 byte for the first character and 4 bytes for each of the remaining two.

Application

Strings store words, characters, sentences, etc.

Rehearsal