Read txt file with pandas into dataframe

I want to read a txt file with Dota 2 MMRs for different players. It has the following form:

      1) "103757918"
      2) "1"
      3) "107361667"
      4) "1"
      5) "108464725"
      6) "1"
      7) "110818765"
      8) "1"
      9) "111436016"
     10) "1"
     11) "113518306"
     12) "1"
     13) "118896321"
     14) "1"
     15) "119780733"
     16) "1"
     17) "120360801"
     18) "1"
     19) "120870684"
     20) "1"
     21) "122616345"
     22) "1"
     23) "124393917"
     24) "1"
     25) "124487030"

Each account_id (e.g. 103757918) is followed by the mmr of that player (e.g. 1). How can I read this into a pandas dataframe with two columns, account_id and mmr?

I don’t need the index numbers.

Solution:

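Read the data as you normally would, take every other row starting from the first (the account_ids) and every other row starting from the second (the mmrs), then concat the two slices side by side. Afterwards you can rename the columns to whatever you want.

Slicing the data this way assumes that the first value is always the account_id, followed by the mmr. Note that the last account_id in your sample data (line 25) has no mmr after it, so its mmr is null (NaN) in the result.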

import pandas as pd

df = pd.read_csv('data.txt', sep=r'\s+', header=None)
pd.concat([df[1][::2].reset_index(drop=True),
           df[1][1::2].reset_index(drop=True)], axis=1)

Here is a working example based on your sample data:

s = '''1) "103757918"
2) "1"
3) "107361667"
4) "1"
5) "108464725"
6) "1"
7) "110818765"
8) "1"
9) "111436016"
10) "1"
11) "113518306"
12) "1"
13) "118896321"
14) "1"
15) "119780733"
16) "1"
17) "120360801"
18) "1"
19) "120870684"
20) "1"
21) "122616345"
22) "1"
23) "124393917"
24) "1"
25) "124487030"'''


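from io import StringIO

import pandas as pd

# Column 0 holds the "1)" index labels; column 1 holds the quoted values.
df = pd.read_csv(StringIO(s), sep=r'\s+', header=None)
# Even rows are account_ids, odd rows are mmrs; reset both to a 0..n index so they align.
data = pd.concat([df[1][::2].reset_index(drop=True),
                  df[1][1::2].reset_index(drop=True)], axis=1)

data.columns = ['account_id', 'mmr']
print(data)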

    account_id  mmr
0    103757918  1.0
1    107361667  1.0
2    108464725  1.0
3    110818765  1.0
4    111436016  1.0
5    113518306  1.0
6    118896321  1.0
7    119780733  1.0
8    120360801  1.0
9    120870684  1.0
10   122616345  1.0
11   124393917  1.0
12   124487030  NaN
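
Because of the NaN in the last row, the mmr column above comes out as float (1.0 instead of 1). If you would rather keep mmr as integers, one possible variation is to pull the quoted values out with a regular expression and build the frame with pandas' nullable Int64 dtype. This is only a sketch: the helper name read_pairs and the shortened sample string are made up for illustration, and it still assumes the file strictly alternates account_id, mmr.

```python
import re
from io import StringIO

import pandas as pd

# Shortened, hypothetical sample in the same layout as the question's file.
sample = '''1) "103757918"
2) "1"
3) "107361667"
4) "1"
5) "124487030"'''


def read_pairs(handle):
    """Build an (account_id, mmr) frame from alternating quoted lines."""
    # Grab the quoted value on each line, ignoring the "N)" index prefix.
    values = re.findall(r'"([^"]*)"', handle.read())
    ids = [int(v) for v in values[::2]]
    mmrs = [int(v) for v in values[1::2]]
    # Pad with None when the last account_id has no mmr after it.
    mmrs += [None] * (len(ids) - len(mmrs))
    # Nullable Int64 keeps mmr an integer column even with a missing value.
    return pd.DataFrame({'account_id': ids,
                         'mmr': pd.array(mmrs, dtype='Int64')})


result = read_pairs(StringIO(sample))
print(result)
```

For a real file you could pass an open file object instead of the StringIO wrapper, e.g. with open('data.txt') as f: read_pairs(f).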