Split into columns using pandas

I have a data file with 4 visible columns that I'm trying to split into columns using pandas, but I get this error: ParserError: Error tokenizing data. C error: Expected 3 fields in line 3, saw 8

This is my data:

0.001155672            259,439      branch-instructions                                         
 0.001155672          1,266,239      instructions              #    1.10  insn per cycle         
 0.001155672             24,148      cache-references                                            
 0.001155672             11,586      cache-misses              #   47.979 % of all cache refs    
 0.001155672          1,150,999      cpu-cycles                                                  
 0.001155672              8,888      branch-misses             #    3.43% of all branches        
 0.002370509            381,074      branch-instructions                                         
 0.002370509          1,908,560      instructions              #    1.12  insn per cycle         
 0.002370509             29,034      cache-references                                            
 0.002370509             15,362      cache-misses              #   52.910 % of all cache refs    

I've tried data = pd.read_table('repg.txt', sep='\s+', header=None, thousands=','), and also passing delim_whitespace=True.

Solution:

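The rows do not all have the same number of whitespace-separated tokens: on some lines the annotation after # adds extra tokens, so the parser sees 8 fields where it expects 3. Passing comment='#' to pd.read_table makes pandas ignore everything from # to the end of each line: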

import pandas as pd
data = pd.read_table('repg.txt', sep=r'\s+', header=None, thousands=',', comment='#')
print(data)

# Output
          0        1                    2
0  0.001156   259439  branch-instructions
1  0.001156  1266239         instructions
2  0.001156    24148     cache-references
3  0.001156    11586         cache-misses
4  0.001156  1150999           cpu-cycles
5  0.001156     8888        branch-misses
6  0.002371   381074  branch-instructions
7  0.002371  1908560         instructions
8  0.002371    29034     cache-references
9  0.002371    15362         cache-misses
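
For anyone who wants to try this without the original file, here is a minimal self-contained sketch. It assumes a few lines copied from the sample data above and reads them from an in-memory buffer instead of repg.txt. The column names time, count and event are hypothetical labels chosen for illustration; they do not appear in the original data.

```python
import io

import pandas as pd

# Three lines copied from the sample data in the question;
# io.StringIO stands in for the file 'repg.txt'.
sample = """\
 0.001155672            259,439      branch-instructions
 0.001155672          1,266,239      instructions              #    1.10  insn per cycle
 0.001155672             11,586      cache-misses              #   47.979 % of all cache refs
"""

data = pd.read_table(
    io.StringIO(sample),
    sep=r"\s+",        # split on runs of whitespace
    header=None,       # the data has no header row
    thousands=",",     # parse '1,266,239' as the integer 1266239
    comment="#",       # ignore everything from '#' to the end of the line
    names=["time", "count", "event"],  # hypothetical column names
)

print(data)
print(data.dtypes)
```

Because thousands=',' is set, the second column should come back as integers rather than strings, which can be checked with data.dtypes.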