C# convert byte array to strings split by specific bytes


I am sorry if this is a dumb question, but I can’t really figure this out, and I bet it is much simpler than I think.

I have a byte[] array which contains several Unicode strings. Each char clearly takes 2 bytes, and each string is followed by a 00 00 delimiter, with one extra 00 00 pair marking the end of it all.

When I try to use UnicodeEncoding.Unicode.GetString(myBuffer) I do get the first string, but once the delimiter bytes are reached I start getting garbage all around.

Right now I am parsing byte by byte and then concatenating things, but I am sure there has to be a better way to do this.

I was wondering if I should try to find the position of the delimiter bytes and then limit the GetString method to that length. But if so, how do you find the position of 2 specific bytes in a byte array?

The example byte array looks like this:

Hex View
 
00000000  73 00 74 00 72 00 31 00  00 00 73 00 74 00 72 00  s.t.r.1...s.t.r.
00000010  32 00 00 00 73 00 74 00  72 00 33 00 00 00 00 00  2...s.t.r.3.....

>Solution :

So your buffer is valid little-endian UTF-16 data. Those "double 00 bytes" are just the NUL character, \0, which takes one 2-byte code unit like any other char here.

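Encoding.Unicode.GetString(myBuffer) will actually decode the whole buffer correctly, but the result will have embedded NUL characters in it delimiting each substring. Which is fine, because \0 is just like any other character in a .NET string. This isn’t C. The garbage you see is most likely just how your console or debugger renders those NULs.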
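To make that concrete, here is a minimal sketch that decodes the sample buffer and shows where the NULs end up. The myBuffer contents are copied from the hex dump in the question; the '|' marker is only an illustration choice so the NULs become visible when printed.

```csharp
using System;
using System.Text;

// The 32 bytes from the hex dump in the question.
byte[] myBuffer =
{
    0x73, 0x00, 0x74, 0x00, 0x72, 0x00, 0x31, 0x00, 0x00, 0x00, // "str1" + NUL
    0x73, 0x00, 0x74, 0x00, 0x72, 0x00, 0x32, 0x00, 0x00, 0x00, // "str2" + NUL
    0x73, 0x00, 0x74, 0x00, 0x72, 0x00, 0x33, 0x00, 0x00, 0x00, // "str3" + NUL
    0x00, 0x00                                                  // final NUL
};

string decoded = Encoding.Unicode.GetString(myBuffer);

// 32 bytes of UTF-16 become 16 chars; decoding does not stop at the first NUL.
Console.WriteLine(decoded.Length); // 16

// Swap each NUL for a visible marker, since '\0' usually prints as nothing or as junk.
string visible = decoded.Replace('\0', '|');
Console.WriteLine(visible); // str1|str2|str3||
```

The two trailing markers correspond to the delimiter after str3 plus the final terminator.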

If you split by \0 after decoding, you can get all the substrings, removing empty entries to get rid of those final NULs:

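var decoded = Encoding.Unicode.GetString(myBuffer);
// Split(char, StringSplitOptions) needs .NET Core 2.0 or later; on .NET Framework
// use decoded.Split(new[] { '\0' }, StringSplitOptions.RemoveEmptyEntries) instead.
foreach (var str in decoded.Split('\0', StringSplitOptions.RemoveEmptyEntries))
    Console.WriteLine(str);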

Alternatively, you can search for the first NUL if you want:

var index = decoded.IndexOf('\0');
// IndexOf returns -1 when there is no NUL; Substring(0, -1) would throw.
var firstStr = index >= 0 ? decoded.Substring(0, index) : decoded;

And so on.
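If you would rather locate the delimiter on the raw bytes, as the question asks, that works too. The sketch below is one way to do it, assuming the data is little-endian UTF-16 with no byte order mark; SplitUtf16Nul is a hypothetical helper name, not a framework API. It steps two bytes at a time, because a naive search for any two adjacent zero bytes would match across code units (in the sample, the high byte of '1' at offset 7 is followed by the low byte of the NUL at offset 8).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// The 32 bytes from the hex dump in the question.
byte[] myBuffer =
{
    0x73, 0x00, 0x74, 0x00, 0x72, 0x00, 0x31, 0x00, 0x00, 0x00, // "str1" + NUL
    0x73, 0x00, 0x74, 0x00, 0x72, 0x00, 0x32, 0x00, 0x00, 0x00, // "str2" + NUL
    0x73, 0x00, 0x74, 0x00, 0x72, 0x00, 0x33, 0x00, 0x00, 0x00, // "str3" + NUL
    0x00, 0x00                                                  // final NUL
};

List<string> parts = SplitUtf16Nul(myBuffer);
foreach (string part in parts)
    Console.WriteLine(part); // str1, str2, str3 on separate lines

// Hypothetical helper: splits a NUL-delimited UTF-16LE buffer without decoding it whole.
static List<string> SplitUtf16Nul(byte[] buffer)
{
    var result = new List<string>();
    int start = 0; // byte offset where the current string begins

    // Advance one 2-byte code unit at a time so matches stay aligned.
    for (int i = 0; i + 1 < buffer.Length; i += 2)
    {
        if (buffer[i] != 0 || buffer[i + 1] != 0)
            continue; // not a NUL code unit

        if (i == start)
            break; // empty string: this NUL is the final terminator

        // Decode only the bytes between the previous delimiter and this one.
        result.Add(Encoding.Unicode.GetString(buffer, start, i - start));
        start = i + 2;
    }

    // Bytes after the last NUL with no terminator are ignored in this sketch.
    return result;
}
```

For buffers of this size, decoding everything and calling Split is simpler; the byte-level scan mainly helps when you want to stop at the terminator and ignore whatever follows it in a larger buffer.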
