encodeURI (or encodeURIComponent) returns different encodings in different parts of my code

(disclaimer: this is the first time I post a question on SO, so I apologize in advance if I did anything wrong)

I have a URI pointing to an image, with this structure:

(stuff…)/acryagl_violencia física/(more stuff…).jpg

I tried to encode it but I get two different results in two different script files and I don’t see the reason why.

// Script one:
`stuff.../${ encodeURIComponent(element.article_id_thumbnail) }/...stuff`
// I get 'acryagl_violencia%20fi%CC%81sica', which does NOT work

// Script two (and Chrome console):
`stuff.../${ encodeURIComponent(element.id) }/...stuff`
// I get 'acryagl_violencia%20f%C3%ADsica', which DOES work

// Notice the difference is on the 'í' from 'física'

According to https://www.url-encode-decode.com/, both strings should decode to the same text, which seems strange to me. I am totally lost on this one.

In case it helps, this is a React + Vite project, although I don’t see how this could be related to the bundler. I am also testing everything on Chrome.

I fixed it by manually encoding the í character, but there should be a better fix.
Has anyone faced this problem before?

>Solution :

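The code works fine; it’s the source data that is inconsistent. The two strings look identical but contain different code points: one stores the 'í' as the single precomposed code point U+00ED (percent-encoded as %C3%AD), while the other stores it as a plain 'i' followed by the combining acute accent U+0301 (percent-encoded as i%CC%81).

You might need to call normalize (String.prototype.normalize) somewhere in your content handling pipeline to make them consistent.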

console.log(
 decodeURI('%C3%AD').normalize('NFKD')
 ===
 decodeURI('i%CC%81')
); // true
// both are two (same) codepoints;
// first was decomposed from single codepoint

console.log(
 decodeURI('%C3%AD')
 ===
 decodeURI('i%CC%81').normalize('NFKC')
); // true
// both are same single codepoint;
// second was composed into it from two codepoints
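
Applied to the URLs in the question, one possible fix is to normalize each value before encoding it. The following is only a minimal sketch: `encodeSegment` is a hypothetical helper name, and the two sample strings are reconstructed from the encodings quoted in the question rather than taken from the real data.

```javascript
// Hypothetical helper: normalize to NFC before percent-encoding, so that
// composed and decomposed inputs produce the same URL segment.
function encodeSegment(value) {
  return encodeURIComponent(String(value).normalize('NFC'));
}

// Sample inputs (assumed): the same visible text in two Unicode forms.
const composed = 'acryagl_violencia f\u00EDsica';    // 'í' as one code point (U+00ED)
const decomposed = 'acryagl_violencia fi\u0301sica'; // 'i' + combining acute (U+0301)

console.log(composed === decomposed);   // false
console.log(encodeSegment(composed));   // acryagl_violencia%20f%C3%ADsica
console.log(encodeSegment(decomposed)); // acryagl_violencia%20f%C3%ADsica
```

NFC is used here instead of NFKC because the compatibility forms also rewrite other characters (for example, they turn the 'ﬁ' ligature into 'fi'), which may not be wanted in identifiers. Whichever form you pick, it has to match the form used by the file names on the server.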
