Relation of endianness to assembly conversion of size in C

Please note that the below is adapted from Problem 3.4 of Bryant and O’Hallaron’s text (CSAPP3e). I have stripped away everything but my essential question.

Context: we are looking at an x86-64/Linux/gcc combination in which ints are 4 bytes and chars are signed (and, of course, 1 byte). We want to write the assembly corresponding to the conversion of an int to a char, which, at a high level, we know is done by truncation.

They present the following solution:

movl (%rdi), %eax            // Read 4 bytes
movb %al, (%rsi)             // Store low-order byte

My question is whether we can change the movl to a movb since, after all, we only use one byte in the end. My concern is that the read might be endian-dependent, and that we might somehow get the high-order bits if our processor/OS is in little-endian mode. Is this concern justified, or would my change work no matter what?

I would try this out, but 1) I am on a Mac with Apple silicon and 2) even if my change worked, I couldn’t be sure whether this sort of thing was implementation-dependent.

Solution:

You’re right to be concerned about endianness for this kind of operation, but in this case, your alternative approach would fail on big-endian machines, not on little-endian ones.

x86 is little endian, which means the low-order eight bits of a 32-bit integer are stored in the first (lowest address) byte of that integer, so

movb (%rdi), %al             // Read low-order byte
movb %al, (%rsi)             // Store low-order byte

will do the truncation you want on x86. On a big-endian machine, however, the equivalent operation would read the high-order eight bits of the 32-bit integer, because there the byte at the lowest address is the most significant one. (I would give an example, but I no longer remember the assembly language for any big-endian architecture well enough to rattle one off.)
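In place of big-endian assembly, the difference can be sketched in C by writing the same 32-bit value into a byte buffer in each byte order and looking at the byte at offset 0, which is all a one-byte load from the integer's address would see. This is only an illustration: `store_le32`, `store_be32`, and `byte_at_lowest_address` are hypothetical helper names, not anything from CS:APP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: lay out v in memory as a little-endian machine would
   (least significant byte at the lowest address). */
static void store_le32(uint8_t buf[4], uint32_t v) {
    assert(buf != NULL);
    buf[0] = (uint8_t)(v);
    buf[1] = (uint8_t)(v >> 8);
    buf[2] = (uint8_t)(v >> 16);
    buf[3] = (uint8_t)(v >> 24);
}

/* Hypothetical helper: lay out v as a big-endian machine would
   (most significant byte at the lowest address). */
static void store_be32(uint8_t buf[4], uint32_t v) {
    assert(buf != NULL);
    buf[0] = (uint8_t)(v >> 24);
    buf[1] = (uint8_t)(v >> 16);
    buf[2] = (uint8_t)(v >> 8);
    buf[3] = (uint8_t)(v);
}

/* Models a one-byte load from the integer's own address. */
static uint8_t byte_at_lowest_address(const uint8_t buf[4]) {
    assert(buf != NULL);
    return buf[0];
}
```

For the value 0x11223344, the little-endian buffer starts with 0x44 (the low-order byte, which is the truncation result), while the big-endian buffer starts with 0x11.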

The virtue of doing it the way CS:APP does it is that the same construct will work correctly on both big- and little-endian architectures. Of course, if you’re programming in assembly language you have to rewrite the code anyway, but it’s one fewer thing to worry about while you’re doing that.
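The same contrast can be checked on whatever machine you have, including Apple silicon, with a small C sketch. The helper names here are hypothetical, and the result is returned as `uint8_t` rather than a signed char only to avoid the implementation-defined narrowing of out-of-range values to a signed type. `low_byte_by_value` mirrors the load-the-whole-int-then-keep-the-low-byte approach, and `byte_at_address` mirrors the one-byte load.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: truncate by value, in the spirit of
   movl (%rdi), %eax followed by using %al. Arithmetic on the
   value does not depend on how the bytes sit in memory. */
static uint8_t low_byte_by_value(const int32_t *src) {
    assert(src != NULL);
    return (uint8_t)((uint32_t)*src & 0xFFu);
}

/* Hypothetical helper: read only the byte at the lowest address,
   as a one-byte load from the int's address would. */
static uint8_t byte_at_address(const int32_t *src) {
    uint8_t b;
    assert(src != NULL);
    memcpy(&b, src, 1);
    return b;
}

/* Returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void) {
    const int32_t probe = 1;
    return byte_at_address(&probe) == 1;
}
```

On a little-endian host the two helpers agree for every input. On a big-endian host only `low_byte_by_value` would return the truncated value, which is the portability point made above.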
