When using range to iterate over a string, the type differs between the value returned by range and accessing the rune directly using the index. Why?

by MR

June 13, 2023

When I iterate over a string to access the runes, I have two options:

s := "AB"
for i, v := range s {
    // access the copy of the rune via value
    fmt.Printf("%T", v) // prints int32
    // access the rune via its index directly through s
    fmt.Printf("%T", s[i]) // prints uint8
}

I understand that rune is an alias for int32 and byte is an alias for uint8, so in one case I get a rune and in the other I get a byte. But why?

For context: in this case it’s not a problem, because ASCII characters fit in a uint8. But when the string contains an emoji, for example, a uint8 is too small, so the value I get from s[i] is wrong.

>Solution :

Because they’re different operations that do different things. Ranging over a string iterates over runes, while indexed access on a string accesses bytes. Strings are stored as UTF-8, a variable-width encoding, so there is no constant-time access to a given rune in the middle. To be clear: s[1] is not the second rune, code point, or character of s; it’s the second byte.
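A minimal sketch of the difference, using a string that contains an emoji. The helper runeOffsets is a hypothetical name introduced only for this illustration; it records the byte offset that range reports for each rune.

```go
package main

import "fmt"

// runeOffsets is a hypothetical helper that records the byte offset
// at which range reports each rune of s.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	s := "A😀B" // 'A' and 'B' take 1 byte each, the emoji takes 4

	fmt.Println(len(s))         // 6: len counts bytes, not runes
	fmt.Println(runeOffsets(s)) // [0 1 5]: range steps past the emoji's remaining bytes

	for i, r := range s {
		// r is the decoded rune (int32); s[i] is only the first byte (uint8)
		// of that rune's UTF-8 encoding.
		fmt.Printf("offset %d: rune %q (%T), s[%d] = %#x (%T)\n", i, r, r, i, s[i], s[i])
	}
}
```

Note that the index produced by range is a byte offset, not a rune count, which is why it jumps from 1 to 5 here.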

If you want to iterate over bytes, you can range over []byte(s). If you want random access to runes, you can use []rune(s) (better to convert once and index into the result multiple times; each conversion copies the whole string, so repeating it inside a loop is how you end up on Accidentally Quadratic). Otherwise, figure out how to do what you want in terms of the functions in the strings package.
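Both conversions might look like the sketch below. The function reverseRunes is a hypothetical example of work that needs random access to runes; it converts once and then indexes into the slice as often as needed.

```go
package main

import "fmt"

// reverseRunes is a hypothetical example that needs random access to runes.
// It performs a single O(n) conversion and then indexes into the slice.
func reverseRunes(s string) string {
	runes := []rune(s)
	for i, j := 0, len(runes)-1; i < j; i, j = i+1, j-1 {
		runes[i], runes[j] = runes[j], runes[i]
	}
	return string(runes)
}

func main() {
	s := "A😀B"

	// Iterate over the raw bytes: b has type byte (uint8).
	for i, b := range []byte(s) {
		fmt.Printf("byte %d: %#x\n", i, b)
	}

	// Random access to runes: runes[1] is the second code point.
	runes := []rune(s)
	fmt.Println(string(runes[1])) // 😀

	fmt.Println(reverseRunes(s)) // B😀A
}
```

Calling []rune(s) inside the loop body instead of once before it would repeat the full conversion on every iteration, which is the quadratic behavior the answer warns about.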