Why does my regex fail on a bytes object converted to a string?


My regex works when I run it against a string variable, but when I convert a bytes object to a string and run the same regex on that string, re.findall returns an empty list. Here is my code.


#include <iostream>
using namespace std;

string a = "ABC";

int main() {
  cout << "Hello World!";
  return 0;
}

import subprocess as subs
import re

file = "simple.cpp"

full_ast = subs.run(["clang -Xclang -ast-dump %s" % file], shell=True, stdout=subs.PIPE)

s = ("test | |-UsingDirectiveDecl 0x16de688 <line:58:3, col:24> col:24 Namespace 0x16de588 '__debug' test\n"
 "test |-UsingDirectiveDecl 0x1e840b8 <simple.cpp:2:1, col:17> col:17 Namespace 0x1378e98 'std' test")

pattern = r"UsingDirectiveDecl\s0x[a-f0-9]{7}\s+<simple\.cpp:[0-9]+:[0-9]+,\s[a-zA-Z]+:[0-9]+>\s[a-zA-Z]+:[0-9]+\sNamespace\s0x[a-f0-9]{7}\s'[^']*'"

s_full_ast = str(full_ast.stdout)
namespace_s = re.findall(pattern, s) # Switch between s and s_full_ast

I want to know why it’s not working and how I can fix it. Any help is much appreciated.

Solution:

You aren’t creating the string you think you are:

>>> str(b'foo')
"b'foo'"  # not 'foo'

You want to decode the bytes value instead.

>>> b'foo'.decode()
'foo'
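
To see why the pattern stops matching, here is a small sketch that runs the question's pattern against both conversions. The bytes in raw are a made-up stand-in for clang's output, not a real dump. Because the sample contains both kinds of quote character, str() produces a repr that escapes every single quote with a backslash, so the \s' part of the pattern no longer finds a quote right after the space. That is probably what happens with the real dump as well, but treat it as an assumption rather than something verified against clang.

```python
import re

# Hypothetical sample standing in for the bytes clang writes to stdout.
raw = (
    b"|-UsingDirectiveDecl 0x1e840b8 <simple.cpp:2:1, col:17> col:17 "
    b"Namespace 0x1378e98 'std'\n"
    b"`-StringLiteral 0x1e84100 <col:12> 'const char[4]' lvalue \"ABC\"\n"
)

# The pattern from the question, unchanged.
pattern = (
    r"UsingDirectiveDecl\s0x[a-f0-9]{7}\s+<simple\.cpp:[0-9]+:[0-9]+,"
    r"\s[a-zA-Z]+:[0-9]+>\s[a-zA-Z]+:[0-9]+\sNamespace\s0x[a-f0-9]{7}\s'[^']*'"
)

# str() returns the repr of the bytes object: a leading b', escaped quotes,
# and a literal backslash followed by n instead of a real newline.
as_repr = str(raw)

# decode() returns the actual text.
as_text = raw.decode("utf-8")

repr_matches = re.findall(pattern, as_repr)
text_matches = re.findall(pattern, as_text)

print(repr_matches)
print(text_matches)
```

Only the decoded text gives the regex the characters it was written for.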

subprocess can do this for you if you pass text=True (available since Python 3.7).

>>> subprocess.run("echo foo", shell=True, stdout=subprocess.PIPE).stdout
b'foo\n'
>>> subprocess.run("echo foo", shell=True, stdout=subprocess.PIPE, text=True).stdout
'foo\n'

(You may also need to supply the encoding argument to specify what encoding should be used to convert the bytes written by your command to a str.)
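
Putting it together, here is a sketch of the text=True approach with an explicit encoding. The child process is a stand-in that prints one line shaped like the AST dump (an assumption, so the example runs without clang), and the shortened pattern is only for illustration. In the question, the command would be the clang invocation instead.

```python
import re
import subprocess
import sys

# Stand-in for the clang command: a child Python process that prints one
# line shaped like the AST dump. This is an assumption for portability.
child_code = (
    "print(\"|-UsingDirectiveDecl 0x1e840b8 <simple.cpp:2:1, col:17> "
    "col:17 Namespace 0x1378e98 'std'\")"
)

result = subprocess.run(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    text=True,          # decode stdout to str instead of returning bytes
    encoding="utf-8",   # explicit encoding rather than the locale default
)

output = result.stdout  # already a str, no str() or decode() needed

# Simplified illustrative pattern that captures just the namespace name.
namespaces = re.findall(r"Namespace\s0x[a-f0-9]+\s'([^']*)'", output)
print(namespaces)
```

Passing the command as a list without shell=True also avoids quoting problems if the file name ever contains spaces.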
