
\uD83D\uDCCC keeps showing up in code I've inherited. What does this Unicode sequence do?

I’ve been reading about code injection using Unicode sequences and have been using a tool from Dotnetsafer to locate such sequences in a codebase I’ve inherited. The sequence \uD83D\uDCCC keeps coming up.

An example:

appears as: [588]                             __builder5.AddMarkupContent(51, "??");
actual    : [588]                             __builder5.AddMarkupContent(51, "\uD83D\uDCCC");

What is this sequence? Why would the code be injecting it into HTML?

EDIT 1: I’ve looked up the sequence and the only thing remotely useful that I’ve found is https://unicode.scarfboy.com/?s=D83D+DCCC

>Solution :

Those are the UTF-16 code units that encode the Unicode character U+1F4CC (the pushpin emoji 📌).
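To make the mapping concrete, here is a minimal sketch of the standard UTF-16 surrogate-pair arithmetic. It is written in Python rather than the C# from the question, and the function name `decode_surrogate_pair` is made up for illustration.

```python
import unicodedata


def decode_surrogate_pair(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one Unicode code point.

    Hypothetical helper for illustration; not part of any library.
    """
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError(f"not a high surrogate: {high:#06x}")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError(f"not a low surrogate: {low:#06x}")
    # Each surrogate carries 10 payload bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


code_point = decode_surrogate_pair(0xD83D, 0xDCCC)
print(f"U+{code_point:04X}", unicodedata.name(chr(code_point)))  # U+1F4CC PUSHPIN
```

Working it by hand: 0xD83D - 0xD800 = 0x3D, shifted left by 10 bits gives 0xF400; 0xDCCC - 0xDC00 = 0xCC; and 0x10000 + 0xF400 + 0xCC = 0x1F4CC.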

How could you have found out?

  1. Look up U+D83D and U+DCCC and find out that they are not actual Unicode characters but a high and a low surrogate respectively, which UTF-16 uses in pairs to encode code points above U+FFFF.
  2. Google for "D83D DCCC" and find this page which explicitly lists those as the UTF-16 encoding of the pushpin emoji.

Actually, come to think of it, you could just skip step #1 😉
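If the scanner reports many such hits, the lookup can also be scripted instead of searched for by hand. The following is a rough sketch, assuming the flagged source contains literal `\uXXXX\uXXXX` escapes like the line in the question; the names `PAIR` and `describe_escaped_pairs` are made up for illustration.

```python
import re
import unicodedata

# Literal "\uXXXX\uXXXX" text where the first unit is a high surrogate
# (D800-DBFF) and the second is a low surrogate (DC00-DFFF).
PAIR = re.compile(r"\\u(D[89AB][0-9A-F]{2})\\u(D[C-F][0-9A-F]{2})", re.IGNORECASE)


def describe_escaped_pairs(source: str) -> list[tuple[str, str, str]]:
    """Return (escape text, code point, Unicode name) for each escaped pair."""
    results = []
    for match in PAIR.finditer(source):
        high, low = (int(group, 16) for group in match.groups())
        code_point = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
        name = unicodedata.name(chr(code_point), "<unnamed>")
        results.append((match.group(0), f"U+{code_point:04X}", name))
    return results


line = r'__builder5.AddMarkupContent(51, "\uD83D\uDCCC");'
print(describe_escaped_pairs(line))
# [('\\uD83D\\uDCCC', 'U+1F4CC', 'PUSHPIN')]
```

Running this over each flagged line shows at a glance which hits are ordinary emoji and which might deserve a closer look.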
