How to take a specific set of characters out of a string and save them to an array or list?

I have a string with Unicode character codes inside of it, and I am trying to extract each code from the string and save it to a list or array.

This is the overall string:

"test &#128311; test &#128153; test &#128313;"

I want the following list:

1. &#128311; 2. &#128153; 3. &#128313;

Right now I am trying the following:

string[] emojiSeparators = new string[] { "&#", ";" };
string[] resultEmojis;

resultEmojis = noHtmlEmoji.Split(
  emojiSeparators, StringSplitOptions.RemoveEmptyEntries);

But the resulting array also contains the word "test" from between the codes.

I only want the Unicode codes saved to my list, so that I can iterate over them and process each one.

>Solution :

I suggest matching with the help of a regular expression:

using System.Linq;
using System.Text.RegularExpressions;

...

string[] resultEmojis = Regex
  .Matches(noHtmlEmoji, @"&#[1-9][0-9]{5}(?=;)")
  .Cast<Match>()
  .Select(match => match.Value)
  .ToArray();

Pattern &#[1-9][0-9]{5}(?=;) explained:

&#       - &# characters
[1-9]    - digit in 1..9 range
[0-9]{5} - 5 digits in 0..9 range
(?=;)    - lookahead: a ; must follow, but it is not included in the match
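As a minimal sketch of how the pattern behaves end to end, the program below runs it over a sample string. It assumes the emoji are stored as six-digit decimal references such as &#128311; (the sample value is an illustration, not the asker's real data), and the class name EmojiEntityDemo and the helper ExtractEntities are hypothetical names introduced here.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class EmojiEntityDemo
{
    // Returns every "&#NNNNNN" reference (six digits, without the trailing ';')
    // found in the input, using the pattern from the answer above.
    public static string[] ExtractEntities(string input) =>
        Regex.Matches(input, @"&#[1-9][0-9]{5}(?=;)")
             .Cast<Match>()
             .Select(match => match.Value)
             .ToArray();

    public static void Main()
    {
        // Assumed sample input: emoji stored as decimal numeric references.
        string noHtmlEmoji = "test &#128311; test &#128153; test &#128313;";

        foreach (string entity in ExtractEntities(noHtmlEmoji))
        {
            // Drop the leading "&#" and parse the remaining digits as a code point.
            int codePoint = int.Parse(entity.Substring(2));
            Console.WriteLine($"{entity} -> U+{codePoint:X}");
        }
    }
}
```

The words "test" never appear in the result because Regex.Matches returns only the spans that match the pattern, whereas Split returns everything left between the separators. Note that the pattern requires exactly six digits, so shorter references such as &#9829; would not be matched.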
