C code compiles into incorrect instructions

#include<stdio.h>
#include<stdlib.h>

int main(int argc, char** argv) {
  char *str5=malloc(10);
  *str5="xxxxx\0";
  printf("%s\n",str5);
  return 0;
}

compiles into the following assembly (using "gcc source.c"):

=> 0x555555555179 <main+4>:     sub    rsp,0x20
   0x55555555517d <main+8>:     mov    DWORD PTR [rbp-0x14],edi
   0x555555555180 <main+11>:    mov    QWORD PTR [rbp-0x20],rsi
   0x555555555184 <main+15>:    mov    edi,0xa
   0x555555555189 <main+20>:    call   0x555555555040 <malloc@plt>
   0x55555555518e <main+25>:    mov    QWORD PTR [rbp-0x8],rax
   0x555555555192 <main+29>:    lea    rax,[rip+0xe6b]        # 0x555555556004
   0x555555555199 <main+36>:    mov    edx,eax
   0x55555555519b <main+38>:    mov    rax,QWORD PTR [rbp-0x8]
   0x55555555519f <main+42>:    mov    BYTE PTR [rax],dl
   0x5555555551a1 <main+44>:    mov    rax,QWORD PTR [rbp-0x8]
   0x5555555551a5 <main+48>:    mov    rdi,rax
   0x5555555551a8 <main+51>:    call   0x555555555030 <puts@plt>
   0x5555555551ad <main+56>:    mov    eax,0x0
   0x5555555551b2 <main+61>:    leave
   0x5555555551b3 <main+62>:    ret
   0x5555555551b4 <_fini>:      sub    rsp,0x8
   0x5555555551b8 <_fini+4>:    add    rsp,0x8
   0x5555555551bc <_fini+8>:    ret

What actually happens:
So, at main+42, it stores $dl (the lowest byte of the ADDRESS of my string constant "xxxxx\0") at the address $rax points to, which is the memory my variable str5 points to. This ends up being just a garbage character, and the rest of the string constant is never copied. The compiler emits the following warning, which is probably pertinent, but I don't understand how:
source.c:6:14: warning: assignment to ‘char’ from ‘char *’ makes integer from pointer without a cast [-Wint-conversion]

What I’m expecting to happen:
The string constant "xxxxx\0" is loaded into the address pointed to by str5. Then, the string pointed to by str5 is printed to the screen.
Why does gcc compile this code this way?

Solution:

In C, *str5="xxxxx\0"; does not copy the string.

This statement actually:

  • takes the address of the string literal "xxxxx\0"
  • converts this address to char (which is an integer type), keeping only its lowest byte with gcc on x86-64
  • assigns this integer value to the first character of the memory allocated by malloc

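When you print it, the first character is that integer value shown as a character. The rest of the allocated bytes are never initialized, so printf (compiled here into a call to puts) keeps reading them while looking for a null terminator, and this invokes undefined behaviour. This is also what the warning is telling you: the right-hand side is a char * but the left-hand side, *str5, is a single char.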

If you want to copy the string literal, you need to #include <string.h> and call strcpy(str5, "xxxxx"); (the string literal already has a null terminating character, so you do not need to add "\0" yourself).
So the compiler is right: it is generating code for the program you actually wrote.
