Keep only letters and spaces from a log file with Python

I have a log file that looks like this:

kernel: apparmor = "STATUS" operation = "profile_load" profile = "unconfined" name = "nvidia_modprobe" comm = "apparmor_parser"
kernel: audit: apparmor = "STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe comm="apparmor_parser"
kernel: audit: apparmor = "STATUS" operation="profile_load" profile = "unconfined" 
kernel: audit: apparmor = "STATUS" operation= "profile_load"

I read it as a multiline string, and I want to keep only letters and spaces so that it looks like this:

kernel apparmor  STATUS operation  profile_load profile  unconfined name  nvidia_modprobe comm  apparmor_parser
kernel audit apparmor  STATUS operation profile_load profile  unconfined name nvidia_modprobe comm apparmor_parser
........

I can do it with

unwanted_chars = ":.,/"
# str.replace() matches the whole substring, so remove one character at a time
for ch in unwanted_chars:
    log = log.replace(ch, "")

but I don’t want to have to add all possible characters.
I was thinking of something with isalpha and isspace, or some regex.

>Solution :

You could use re.sub to do the substitution with a regular expression. This lets you define a negated character class, i.e. replace anything that is not in the ranges you list. The code below removes every character that is not an uppercase or lowercase letter, a digit from 0 to 9, or whitespace (\s also matches newlines, so the line breaks are preserved). Note that underscores are removed too, so profile_load becomes profileload.

import re


data = '''kernel: apparmor = "STATUS" operation = "profile_load" profile = "unconfined" name = "nvidia_modprobe" comm = "apparmor_parser"
kernel: audit: apparmor = "STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe comm="apparmor_parser"
kernel: audit: apparmor = "STATUS" operation="profile_load" profile = "unconfined" 
kernel: audit: apparmor = "STATUS" operation= "profile_load"'''

print(re.sub(r"[^A-Za-z0-9\s]", "", data))

OUTPUT

kernel apparmor  STATUS operation  profileload profile  unconfined name  nvidiamodprobe comm  apparmorparser
kernel audit apparmor  STATUS operationprofileload profileunconfined namenvidiamodprobe commapparmorparser
kernel audit apparmor  STATUS operationprofileload profile  unconfined 
kernel audit apparmor  STATUS operation profileload
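
The output above differs from the requested one in two ways: underscores are dropped, and words such as operation="profile_load" are merged because the characters between them are deleted outright. The following is a minimal sketch of two variants, not part of the original answer. The function names clean_log and keep_letters_and_spaces are hypothetical, and the choice to keep underscores and collapse repeated spaces is an assumption based on the desired output shown in the question.

```python
import re


def clean_log(text):
    """Regex variant: keep letters, underscores, and whitespace (assumed goal)."""
    # Replace every other character with a space instead of deleting it,
    # so that neighbouring words are not glued together.
    spaced = re.sub(r"[^A-Za-z_\s]", " ", text)
    # Collapse runs of spaces and tabs into one space; newlines are untouched.
    collapsed = re.sub(r"[ \t]+", " ", spaced)
    # Trim leading and trailing spaces on each line.
    return "\n".join(line.strip() for line in collapsed.split("\n"))


def keep_letters_and_spaces(text):
    """Non-regex variant using str.isalpha and str.isspace, as the question suggests."""
    # This deletes digits, punctuation, and underscores. str.isalpha() also
    # accepts non-ASCII letters, unlike the A-Za-z range used above.
    return "".join(ch for ch in text if ch.isalpha() or ch.isspace())


data = '''kernel: apparmor = "STATUS" operation = "profile_load"
kernel: audit: apparmor = "STATUS" operation="profile_load"'''

print(clean_log(data))
print(keep_letters_and_spaces(data))
```

Substituting a space rather than an empty string is what keeps operation and profile_load apart in the second line. If digits should survive as well, adding 0-9 inside the character class should be enough.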