Home Split strings in a Series, convert to array and average the values

Questions

Split strings in a Series, convert to array and average the values

July 30, 2024

I have a Pandas Series that has these unique values:

array(['17', '19', '21', '20', '22', '23', '12', '13', '15', '24', '25',
       '18', '16', '14', '26', '11', '10', '12/16', '27', '10/14',
       '16/22', '16/21', '13/17', '14/19', '11/15', '10/15', '15/21',
       '13/19', '13/18', '32', '28', '12/15', '29', '42', '30', '31',
       '34', '46', '11/14', '18/25', '19/26', '17/24', '19/24', '17/23',
       '13/16', '11/16', '15/20', '36', '17/25', '19/25', '17/22',
       '18/26', '39', '41', '35', '50', '9/13', '33', '10/13', '9/12',
       '93/37', '14/20', '10/16', '14/18', '16/23', '37', '9/11', '37/94',
       '20/54', '22/31', '22/30', '23/33', '44', '40', '50/95', '38',
       '16/24', '15/23', '15/22', '18/23', '16/20', '37/98', '19/27',
       '38/88', '23/31', '14/22', '45', '39/117', '28/76', '33/82',
       '15/19', '23/30', '47', '46/115', '14/21', '17/18', '25/50',
       '12/18', '12/17', '21/28', '20/27', '26/58', '22/67', '22/47',
       '25/51', '35/83', '39/86', '31/72', '24/56', '30/80', '32/85',
       '42/106', '40/99', '30/51', '21/43', '52', '56', '25/53', '34/83',
       '30/71', '27/64', '35/111', '26/62', '32/84', '39/95', '18/24',
       '22/29', '42/97', '48', '55', '58', '39/99', '49', '43', '40/103',
       '22/46', '54/133', '25/54', '36/83', '29/72', '28/67', '35/109',
       '25/62', '14/17', '42/110', '52/119', '20/60', '46/105', '25/56',
       '27/65', '25/74', '21/49', '29/71', '26/59', '27/62'], dtype=object)

The ones that have the ‘/’, I want to split these into arrays and then average their values. One simpler but a flawed approach is to simply extract the first value:
master_data["Cmb MPG"].str.split('/').str[0].astype('int8')

However, what I truly require is the two values being averaged.
I have tried several commands and this one:

np.array(master_data["Cmb MPG"].str.split('/')).astype('int8').mean()

Should ideally do the job, but I get a ValueError followed by a TypeError:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'list'

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
Cell In[88], line 1
----> 1 np.array(master_data["Cmb MPG"].str.split('/')).astype('int8')

ValueError: setting an array element with a sequence.

The slice() method returns a Series but it won’t proceed either with the splitting of strings.

What is required is:

'18/25' ---> [18, 25] ---> 22 (rounded)

>Solution :

I would use extractall and groupby.mean:

s = pd.Series(['10', '12/16', '27', '10/14', '16/22', '16/21', '13/17'])

out = (s.str.extractall(r'(\d+)')[0].astype(int).groupby(level=0).mean()
        .round().astype(int)
       )

You could also go with split and mean, but this generates a more expensive intermediate and will not scale as well if you have many items (1/2/3/4/5):

out = (s.str.split('/', expand=True).astype(float).mean(axis=1)
        .round().astype(int)
      )

Output:

0    10
1    14
2    27
3    12
4    19
5    18
6    15
dtype: int64

pandas

byMR

Published July 30, 2024

Add a comment

How can I be better in html css

byMR

July 30, 2024

Questions

Writing json to sqlite3 but with a condition to remove a person from sqlite3 if person is not in json file

byMR

July 30, 2024

Questions

Trying to change max-width + display: grid in media query

byMR

July 30, 2024

Questions

GUI Not Displaying All Components (C++)

byMR

July 30, 2024

Questions

Convert Array of single dynamic-key objects into a single object representing all dynamic key entries

byMR

July 31, 2024

Questions

Promise resolved runs before main process finished

byMR

July 31, 2024

Split strings in a Series, convert to array and average the values

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

How can I be better in html css

Writing json to sqlite3 but with a condition to remove a person from sqlite3 if person is not in json file

Trying to change max-width + display: grid in media query

GUI Not Displaying All Components (C++)

Convert Array of single dynamic-key objects into a single object representing all dynamic key entries

Promise resolved runs before main process finished

Keep Up to Date with the Most Important News

Split strings in a Series, convert to array and average the values

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

How can I be better in html css

Writing json to sqlite3 but with a condition to remove a person from sqlite3 if person is not in json file

Trying to change max-width + display: grid in media query

GUI Not Displaying All Components (C++)

Convert Array of single dynamic-key objects into a single object representing all dynamic key entries

Promise resolved runs before main process finished

Discover more from Dev solutions