Format a `Debug` value into a utf-16 string

Suppose I have a value of a type that implements Debug, and I want to encode the result of formatting that value as UTF-16. One way to do this is to use format! and then convert the resulting String to UTF-16:

use std::fmt::Debug;

#[derive(Debug)]
pub struct User {
    name: String,
    some_ids: [u8; 16],
    // more fields, etc. Quite a few of them, actually
}

pub fn display_to_u16(user: &User) -> Vec<u16> {
    let asutf8 = format!("{user:#?}");
    asutf8.encode_utf16().collect()
}

But it seems wasteful not to write the result directly as a UTF-16 string, or more precisely as a vector of UTF-16 code units. Is there any way to format a value directly as a UTF-16 string?


Note: The real requirement is to work with an abstract dyn Debug type as part of an impl of tracing_subscriber::field::Visit::record_debug. The User type serves as an example only. It is not feasible to "simply" implement a different serialization scheme. Working with the Debug trait is an integral part of the question.

Solution:

The format!() macro always returns a String. However, the write!() macro is the generic version that accepts anything that implements std::fmt::Write (or std::io::Write). So if there’s a type that implements one of those traits and encodes to UTF-16 on the fly, then that’d be your best bet.

I took a quick look and didn’t find a 3rd-party crate that provides this, but it’s easy enough to implement yourself:

use std::fmt::{self, Write};

// Accumulates formatted output as UTF-16 code units.
struct Utf16Writer(Vec<u16>);

impl Write for Utf16Writer {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        // Re-encode each UTF-8 chunk as it arrives.
        self.0.extend(s.encode_utf16());
        Ok(())
    }

    fn write_char(&mut self, c: char) -> fmt::Result {
        // A char needs at most two UTF-16 code units.
        self.0.extend_from_slice(c.encode_utf16(&mut [0; 2]));
        Ok(())
    }
}

And then you can use it like so:

use std::fmt::Write;

pub fn display_to_u16(user: &User) -> Vec<u16> {
    let mut writer = Utf16Writer(Vec::new());
    write!(writer, "{user:#?}").unwrap();
    writer.0
}
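
Since the question mentions working with dyn Debug inside record_debug, the same writer can be driven by a trait object. The following is a minimal sketch under that assumption: the writer is repeated so the snippet stands alone, and debug_to_utf16 is a hypothetical helper name, not part of tracing_subscriber or any other crate.

```rust
use std::fmt::{self, Debug, Write};

/// Collects formatted output as UTF-16 code units (same idea as the writer above).
struct Utf16Writer(Vec<u16>);

impl Write for Utf16Writer {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        self.0.extend(s.encode_utf16());
        Ok(())
    }
}

/// Hypothetical helper: formats any `Debug` value, including a trait object,
/// into UTF-16 without building an intermediate `String`.
pub fn debug_to_utf16(value: &dyn Debug) -> Vec<u16> {
    let mut writer = Utf16Writer(Vec::new());
    // Our `write_str` never fails, so an error here could only come from
    // the value's own `Debug` impl.
    write!(writer, "{value:#?}").expect("Debug impl returned an error");
    writer.0
}
```

Inside a record_debug implementation, the value argument could presumably be passed to such a helper as-is, since it is already a &dyn Debug.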

You can assess the performance yourself, but I’m skeptical that this will make it faster (though it may use less memory overall). The formatting machinery hands every piece of output to the writer as a str, so intermediate calls still need to encode the relevant data into UTF-8 (if it isn’t already) before passing it on. You’re not really saving any encoding steps.
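
To see how the formatter feeds a writer, a small probe can count the str pieces it receives. This is only an illustrative sketch: CountingWriter and count_pieces are hypothetical names, and the number of calls depends on the value and on the standard library's internals, so it should not be relied on.

```rust
use std::fmt::{self, Debug, Write};

/// Hypothetical probe: records how many `&str` pieces the formatter hands over
/// and how many UTF-8 bytes they contain in total.
#[derive(Default)]
struct CountingWriter {
    calls: usize,
    bytes: usize,
}

impl Write for CountingWriter {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        self.calls += 1;
        self.bytes += s.len();
        Ok(())
    }
}

/// Returns (number of `write_str` calls, total UTF-8 bytes) for a
/// pretty-printed `Debug` value.
fn count_pieces(value: &dyn Debug) -> (usize, usize) {
    let mut probe = CountingWriter::default();
    write!(probe, "{value:#?}").expect("Debug impl returned an error");
    (probe.calls, probe.bytes)
}
```

The byte total matches the length of the String that format! would have produced, which shows that all of the output passes through UTF-8 before a UTF-16 writer can re-encode it.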
