(disclaimer: this is the first time I post a question on SO, so I apologize in advance if I did anything wrong)
I have a URI pointing to an image with this structure:
(stuff…)/acryagl_violencia física/(more stuff…).jpg
I tried to encode it but I get two different results in two different script files and I don’t see the reason why.
// Script one:
`stuff.../${ encodeURIComponent(element.article_id_thumbnail) }/...stuff`
// I get 'acryagl_violencia%20fi%CC%81sica', which does NOT work
// Script two (and Chrome console):
`stuff.../${ encodeURIComponent(element.id) }/...stuff`
// I get 'acryagl_violencia%20f%C3%ADsica', which DOES work
// Notice the difference is on the 'í' from 'física'
According to https://www.url-encode-decode.com/, both strings decode to the same text, which confuses me. I am totally lost on this one.
In case it helps, this is a React + Vite project, although I don’t see how this could be related to the bundler. I am also testing everything on Chrome.
I fixed it by manually encoding the í character, but there should be a better fix.
Has anyone faced this problem before?
>Solution :
The code works well, it’s the source data that seem to be inconsistent:
- your first string contains letter "i" followed by combining acute accent (U+0301)
- your second string contains latin small letter i with acute (U+00ED)
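To check which form a given string is in, you can dump its code points. This is only a minimal sketch: codePoints is a hypothetical helper name, and the two literals are stand-ins for whatever your data actually contains.

```javascript
// Hypothetical helper: list the Unicode code points of a string as U+XXXX.
// Array.from iterates by code point, not by UTF-16 code unit.
function codePoints(str) {
  return Array.from(str, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

// Stand-ins for the two variants of "física" described above.
const decomposed = 'fi\u0301sica';  // "i" followed by combining acute accent
const precomposed = 'f\u00EDsica';  // single "í" code point

console.log(codePoints(decomposed).join(' '));
console.log(codePoints(precomposed).join(' '));
console.log(decomposed === precomposed);            // false
console.log(decomposed.length, precomposed.length); // 7 6
```

The two strings look identical when rendered, which is why the difference only shows up after percent-encoding.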
You might need to use String.prototype.normalize() somewhere in your content handling pipeline to make them consistent.
console.log(
decodeURI('%C3%AD').normalize('NFKD')
===
decodeURI('i%CC%81')
); // true
// both sides are the same two code points;
// the left side was decomposed from a single code point
console.log(
decodeURI('%C3%AD')
===
decodeURI('i%CC%81').normalize('NFKC')
); // true
// both sides are the same single code point;
// the right side was composed into it from two code points
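In your case that could mean normalizing right before encoding. A rough sketch, assuming the server expects the precomposed form (your working URL suggests it does); encodeSegment is a hypothetical helper name, and the literals stand in for element.article_id_thumbnail and element.id. It uses 'NFC' rather than 'NFKC': both compose "i" + U+0301 into U+00ED, but NFC leaves compatibility characters alone.

```javascript
// Hypothetical helper: compose to NFC first, then percent-encode.
function encodeSegment(value) {
  return encodeURIComponent(String(value).normalize('NFC'));
}

// Stand-ins for the two values from the question.
const fromScriptOne = 'acryagl_violencia fi\u0301sica'; // decomposed "í"
const fromScriptTwo = 'acryagl_violencia f\u00EDsica';  // precomposed "í"

console.log(encodeSegment(fromScriptOne)); // acryagl_violencia%20f%C3%ADsica
console.log(encodeSegment(fromScriptTwo)); // acryagl_violencia%20f%C3%ADsica
```

If you can, it is probably cleaner to normalize once where the data enters your app, so every script sees the same form.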