How to combine two code points to get one?

I know that the Unicode code point for Á is U+00C1. I have read in many forums and articles that I can also make an Á by combining the characters ´ (Unicode: U+00B4) and A (Unicode: U+0041).

My question is simple: how do I do that? I decided to try it in Go, but it’s perfectly fine if someone knows how to do it in Python (or some other programming language).

Here is what I tried.

A in binary is: 01000001

´ in binary is: 10110100

Together they take 15 bits, so I need the 3-byte UTF-8 format (1110xxxx 10xxxxxx 10xxxxxx).

Filling the x positions with the bits of A followed by the bits of ´ gives: 11100100 10000110 10110100.

Then I converted the resulting three bytes back into hexadecimal values: E4 86 B4.

However, when I tried to write it in code, I got a completely different character. In other words, my solution is not working as I expected.

package main

import (
    "fmt"
)

func main() {
    r := "\xE4\x86\xB4"

    fmt.Println(r) // prints 䆴 instead of Á
}

Solution:

The ´ (U+00B4, ACUTE ACCENT) character you used is a spacing character, not a combining character as Unicode defines it, so it is displayed as its own glyph next to the A:

>>> "A\u00b4"
'A´'
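One way to check this yourself is Python's standard unicodedata module. The sketch below assumes that "combining" means a general category starting with M (Mn, Mc or Me); the helper name is_mark is only illustrative.

```python
import unicodedata

def is_mark(ch):
    # Mark categories (Mn, Mc, Me) cover the characters that attach
    # to a preceding base character instead of standing alone.
    return unicodedata.category(ch).startswith("M")

acute = "\u00b4"  # the character from the question

print(unicodedata.name(acute))      # ACUTE ACCENT
print(unicodedata.category(acute))  # Sk (Symbol, modifier)
print(is_mark(acute))               # False
```

Running the same helper on any other character shows whether it can be expected to combine with the character before it.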

If we use ◌́ (U+0301, COMBINING ACUTE ACCENT) instead, we can simply place it directly after a base character like A and get the expected output:

>>> "A\u0301"
'Á'
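Note that this string still contains two code points; it is only rendered as one glyph. If you really need the single code point U+00C1, Unicode normalization to NFC should do it. A minimal sketch using the standard unicodedata module (the variable names are just for illustration):

```python
import unicodedata

decomposed = "A\u0301"  # two code points: A + COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))  # 2 1
print(composed == "\u00c1")            # True

# NFD goes the other way and splits Á back into A + U+0301.
print(unicodedata.normalize("NFD", "\u00c1") == decomposed)  # True
```

In Go, the equivalent functionality lives outside the standard library, in the golang.org/x/text/unicode/norm package.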

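Unless I’m misunderstanding what you mean, no binary manipulation or trickery is necessary here. Combining happens at the level of code points: each code point keeps its own UTF-8 encoding, and the combining mark simply follows the base character. Packing the bits of two code points into one 3-byte sequence instead encodes a single, unrelated code point (U+41B4, 䆴).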
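To see what the bytes from the question actually encode, and what the UTF-8 bytes look like for both spellings of Á, something like the following should work (a small sketch; the variable names are mine):

```python
# The three bytes built by hand in the question decode to one code point.
wrong = b"\xe4\x86\xb4".decode("utf-8")
print(f"U+{ord(wrong):04X}")  # U+41B4

# The precomposed character is a single code point encoded in two bytes.
precomposed = "\u00c1".encode("utf-8")
print(precomposed.hex())      # c381

# The combining sequence is A (one byte) followed by U+0301 (two bytes).
sequence = "A\u0301".encode("utf-8")
print(sequence.hex())         # 41cc81
```

So in the Go program, a string literal such as "A\u0301" (or "\u00C1") is enough; the compiler takes care of the UTF-8 encoding.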
