How to make a new column with conditions on another column?

December 7, 2021

I would like to create a cat_month column in my expeditions dataframe. This column would contain the mountain category (small, medium or large), assigned from the height in the highpoint_metres column using quartiles (for example, small = height below the first quartile), but I can’t manage to do it.

Data:

import pandas as pd
expeditions = pd.read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-09-22/expeditions.csv")

What I’ve tried:

peaks[cat_monts] = 
for peak_id in expeditions : 
 if "highpoint_metres" < 6226.5 : # 1st quartile
  return "petite montagne"
elif 6226.5<"highpoint_metres" <7031.25:
  return "moyenne montagne"
else : 
 return "grande montagne"

>Solution :

Use np.select, which accepts a list of conditions, a list of their corresponding values, and a default ("else") value.
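As a minimal sketch of that call signature, here is np.select on three made-up heights (illustrative values, not taken from the dataset). The first height satisfies both conditions and receives the label of the first one, which shows that the order of the conditions matters.

```python
import numpy as np

# Made-up heights in metres, for illustration only
heights = np.array([5800.0, 6500.0, 7400.0])

labels = np.select(
    [heights < 6226.5, heights < 7031.25],    # conditions, checked in order
    ["petite montagne", "moyenne montagne"],  # one value per condition
    default="grande montagne",                # used when no condition matches
)
print(labels.tolist())
# ['petite montagne', 'moyenne montagne', 'grande montagne']
```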

The conditions are evaluated in order and the first match wins, so the narrower condition must come first:

import numpy as np

# Order matters: a height below 6226.5 is also below 7031.25
conditions = {
    'petite montagne': expeditions['highpoint_metres'] < 6226.5,
    'moyenne montagne': expeditions['highpoint_metres'] < 7031.25,
}
expeditions['cat_month'] = np.select(list(conditions.values()), list(conditions.keys()), default='grande montagne')

Output:

      expedition_id  ...  highpoint_metres  ...         cat_month
0         ANN260101  ...            7937.0  ...   grande montagne
1         ANN269301  ...            7937.0  ...   grande montagne
2         ANN273101  ...            7937.0  ...   grande montagne
3         ANN278301  ...            7000.0  ...  moyenne montagne
4         ANN279301  ...            7160.0  ...   grande montagne
...             ...  ...               ...  ...               ...
10359     PUMO19101  ...            7138.0  ...   grande montagne
10360     PUMO19102  ...            7138.0  ...   grande montagne
10361     PUTH19101  ...            6350.0  ...  moyenne montagne
10362     RATC19101  ...            6600.0  ...  moyenne montagne
10363     SANK19101  ...            6452.0  ...  moyenne montagne
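The thresholds can also be computed from the data instead of being hard-coded. The sketch below is one possible way to do that, not part of the original answer. It uses a small hypothetical dataframe `df` in place of expeditions, and it assumes the upper cutoff is the third quartile, since the question only defines the lower one. It also blanks out rows with a missing height, which would otherwise fail every comparison and silently receive the default label.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the expeditions dataframe (made-up heights)
df = pd.DataFrame(
    {"highpoint_metres": [5500.0, 6000.0, 6500.0, 7000.0, 7500.0, 8000.0, np.nan]}
)

heights = df["highpoint_metres"]
# First and third quartiles; quantile() skips missing values
q1, q3 = heights.quantile([0.25, 0.75])

df["cat_month"] = np.select(
    [heights < q1, heights < q3],             # narrower condition first
    ["petite montagne", "moyenne montagne"],
    default="grande montagne",
)
# A missing height matches no condition, so reset those rows to missing
df.loc[heights.isna(), "cat_month"] = np.nan

print(df)
```

With real data, replacing `df` by expeditions should give the same kind of result, with the cutoffs following the data if the file is updated.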