I'm trying to loop through a tuple that is stored in a column of my data frame. For the first ID in a group I want to select the first item of the Group tuple, and for the second ID the second item. For the remaining IDs in the same group I would like to cycle back through the tuple.
If the group changes, I would like to repeat the process with the new group. I'm also fine with splitting the data into separate data frames and unioning the results back together later.
import pandas as pd

# Note: each Group value here is a string that looks like a tuple
df = pd.DataFrame({'ID': [1, 2, 3, 4, 5, 6],
                   'Group': ["('Cat','Dog')",
                             "('Cat','Dog')",
                             "('Cat','Dog')",
                             "('Cat','Dog')",
                             "('Bird','Dog')",
                             "('Bird','Dog')",
                             ]
                   })
ID | Group |
---|---|
1 | ('Cat', 'Dog') |
2 | ('Cat', 'Dog') |
3 | ('Cat', 'Dog') |
4 | ('Cat', 'Dog') |
5 | ('Bird', 'Dog') |
6 | ('Bird', 'Dog') |
Desired output:

ID | Group |
---|---|
1 | Cat |
2 | Dog |
3 | Cat |
4 | Dog |
5 | Bird |
6 | Dog |
>Solution :
Assuming a column of tuples:
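# (ID - 1) % 2 is 0 for odd IDs and 1 for even IDs; inside transform,
# s.name is that key, so s.str[s.name] picks that position from each tuple
df['Group'] = (df.groupby(df['ID'].sub(1).mod(2))['Group']
                 .transform(lambda s: s.str[s.name])
              )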
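To make the grouping key easier to follow, here is a small sketch that computes the same key and applies it row by row instead of through `groupby`. It assumes, like the snippet above, that IDs are consecutive, that every group starts on an odd ID, and that each tuple has two items. The names `key` and `picked` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Group": [("Cat", "Dog")] * 4 + [("Bird", "Dog")] * 2,
})

# 0 for odd IDs, 1 for even IDs: the position to take from each tuple.
key = df["ID"].sub(1).mod(2)
print(key.tolist())  # [0, 1, 0, 1, 0, 1]

# Index each row's tuple with that row's key.
picked = [t[k] for t, k in zip(df["Group"], key)]
print(picked)  # ['Cat', 'Dog', 'Cat', 'Dog', 'Bird', 'Dog']
```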
If you have strings:
from ast import literal_eval

# literal_eval parses each "('Cat','Dog')" string into a real tuple first
df['Group'] = (df['Group'].apply(literal_eval)
                 .groupby(df['ID'].sub(1).mod(2))
                 .transform(lambda s: s.str[s.name])
              )
Output:
   ID Group
0   1   Cat
1   2   Dog
2   3   Cat
3   4   Dog
4   5  Bird
5   6   Dog
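Note that the answer above keys on the parity of ID, so it relies on IDs being consecutive, on each group starting at an odd ID, and on two-item tuples. If those assumptions might not hold, one possible alternative (a sketch, not part of the original answer) is to count each row's position within its group with `cumcount` and take it modulo the tuple length. The column name `Picked` is hypothetical.

```python
from ast import literal_eval

import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Group": ["('Cat','Dog')"] * 4 + ["('Bird','Dog')"] * 2,
})

# Parse the tuple-like strings into real tuples.
tuples = df["Group"].apply(literal_eval)

# Position of each row within its group (0, 1, 2, ...), restarting per group.
pos = df.groupby("Group").cumcount()

# Cycle through the tuple: position modulo the tuple's own length.
df["Picked"] = [t[i % len(t)] for t, i in zip(tuples, pos)]
print(df[["ID", "Picked"]])
```

Because the index wraps with `len(t)`, this also works for tuples with more than two items. It treats all rows sharing the same Group value as one group, even if they are not adjacent.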