I want to add a new column to a DataFrame based on the values in other columns.
If columns c1-c3 contain only one unique value in a row, that value should go into c4.
If they contain two different values, "both" should go into c4.
NaN should not be counted as a valid value. Only c2 and c3 contain a few NaNs.
Minimal example:
df = pd.DataFrame({
    "c1": ["left", "right", "right", "left", "left", "right"],
    "c2": ["left", "right", "right", "right", "NaN", "right"],
    "c3": ["NaN", "NaN", "left", "NaN", "left", "right"]
})
Required df:
answerdf = pd.DataFrame({
    "c1": ["left", "right", "right", "left", "left", "right"],
    "c2": ["left", "right", "right", "right", "NaN", "right"],
    "c3": ["NaN", "NaN", "left", "NaN", "left", "right"],
    "c4": ["left", "right", "both", "both", "left", "right"]
})
>Solution :
import pandas as pd
import numpy as np
df = pd.DataFrame({
"c1": ["left", "right", "right", "left", "left", "right"],
"c2": ["left", "right", "right", "right", np.nan, "right"],
"c3": [np.nan, np.nan, "left", np.nan, "left", "right"]
})
def worker(row):
    if "left" in row.values and "right" in row.values:
        return "both"
    if "left" in row.values:
        return "left"
    if "right" in row.values:
        return "right"
    return np.nan
df["c4"] = df[["c1", "c2", "c3"]].apply(worker, axis=1)
The function returns NaN when a row contains neither "left" nor "right". Spelling out each case explicitly may also make it easier to understand.
Output:
      c1     c2     c3     c4
0   left   left    NaN   left
1  right  right    NaN  right
2  right  right   left   both
3   left  right    NaN   both
4   left    NaN   left   left
5  right  right  right  right
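The solution above hard-codes "left" and "right" and builds the frame with np.nan, while the question's frame stores the literal string "NaN". If the columns could hold other labels, or the "NaN" strings are really present in the data, something like the following might work. It is only a sketch: it assumes every "NaN" string stands for a missing value, and the helper name summarize is made up for illustration.

```python
import numpy as np
import pandas as pd

# The frame from the question, where missing entries are the string "NaN".
df = pd.DataFrame({
    "c1": ["left", "right", "right", "left", "left", "right"],
    "c2": ["left", "right", "right", "right", "NaN", "right"],
    "c3": ["NaN", "NaN", "left", "NaN", "left", "right"],
})

cols = ["c1", "c2", "c3"]

# Assumption: "NaN" strings mean "missing", so convert them to real NaN.
df[cols] = df[cols].replace("NaN", np.nan)


def summarize(row):  # hypothetical helper name
    # Distinct non-missing values in this row.
    values = row.dropna().unique()
    if len(values) == 0:
        return np.nan      # nothing valid in the row
    if len(values) == 1:
        return values[0]   # a single unique value is passed through
    return "both"          # two or more different values


df["c4"] = df[cols].apply(summarize, axis=1)
print(df)
```

Because the function only counts distinct non-missing values, it does not need to know the labels in advance. Note that a row with three different values would also get "both", which may or may not be what you want.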