I am trying to convert a text into a dataframe using Python.
sample_text: 'This is \nsample text\n\nName|age\n--|--\n1.abc|45\n2.xyz|34'
Final Desired output:
The steps I am following to achieve the above output are listed below:
- Break the text into multiple rows and assign it to a variable: I have tried using print() to process this text, formatted_text = print('This is \nsample text\n\nName|age\n--|--\n1.abc|45\n2.xyz|34'), but the result can't be assigned because print() returns None (NoneType), so I get an error here.
Desired output after this step:
This is
sample text
Name|age
--|--
1.abc|45
2.xyz|34
- Use the line-broken text stored in the variable above, read as CSV with separator |, to create a dataframe: I have been thinking of processing this as pd.read_csv(formatted_text, sep='|', skipinitialspace=True)
Desired_output after this step:
I tried explaining this problem earlier in an SO post, but I guess I wasn't able to explain it well and it got closed. I hope I have explained my issue better this time. It may be a silly task, but I have been stuck on it for a long time and would appreciate any help.
>Solution :
A possible solution:
from io import StringIO
import pandas as pd
text = 'This is \nsample text\n\nName|age\n--|--\n1.abc|45\n2.xyz|34'
pd.read_csv(StringIO(text), lineterminator='\n', engine='c', header=None)
To split the columns:
(pd.read_csv(StringIO(text), lineterminator='\n', engine='c', header=None)[0]
.str.split(r'|', expand=True))
Output (of the first read_csv call, before splitting the columns):
0
0 This is
1 sample text
2 Name|age
3 --|--
4 1.abc|45
5 2.xyz|34
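The output above still contains the introductory lines and the --|-- separator row. If the goal is a dataframe with Name and age columns, one way to get there is sketched below. It rests on two assumptions that are not stated in the question: the table begins at the first line containing |, and any line made up only of dashes, pipes, and spaces is a separator that can be dropped. Note that the string already contains real line breaks, so no print() call is needed to "break" it.

```python
from io import StringIO

import pandas as pd

text = 'This is \nsample text\n\nName|age\n--|--\n1.abc|45\n2.xyz|34'

# The '\n' characters are already line breaks; splitlines() yields the rows.
lines = text.splitlines()

# Assumption: the table starts at the first line that contains '|'.
start = next(i for i, line in enumerate(lines) if '|' in line)

# Assumption: lines consisting only of '-', '|' and spaces are separators.
# Blank lines are dropped by the same test.
table_lines = [line for line in lines[start:] if set(line) - set('-| ')]

# Join the remaining rows and let pandas parse them with '|' as the separator.
df = pd.read_csv(StringIO('\n'.join(table_lines)), sep='|')
print(df)
```

With this approach the first kept line (Name|age) becomes the header, so the age column is parsed as numbers rather than strings.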
