Here’s an example of DF:
EC1 EC2 CDC L1 L2 L3 L4 L5 L6 VNF
0 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [1, 0]
1 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1]
2 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [-1, 0]
3 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, -1]
4 [0, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [1, 0]
5 [0, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [0, 1]
6 [1, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [-1, 0]
How can I delete the rows where df['VNF'] is [-1, 0] or [0, -1] and df['EC1'], df['EC2'] and df['CDC'] all have a 0 at the same index position as the -1 in df['VNF']?
The expected result would be:
EC1 EC2 CDC L1 L2 L3 L4 L5 L6 VNF
0 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [1, 0]
1 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1]
2 [0, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [1, 0]
3 [0, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [0, 1]
4 [1, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [-1, 0]
Here’s the constructor for the DataFrame:
import pandas as pd

data = {'EC1': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [1, 0]],
        'EC2': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
        'CDC': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 1], [0, 1], [0, 1]],
        'L1': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
        'L2': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
        'L3': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
        'L4': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
        'L5': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 1], [0, 1], [0, 1]],
        'L6': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 1], [0, 1], [0, 1]],
        'VNF': [[1, 0], [0, 1], [-1, 0], [0, -1], [1, 0], [0, 1], [-1, 0]]}
df = pd.DataFrame(data)
>Solution :
You can explode every column of df (exploding several columns at once needs pandas 1.3 or later), then identify the elements that satisfy the first condition (the "VNF" values of a row sum to -1) and the second condition (the "VNF" element is -1 while the "EC1", "EC2" and "CDC" elements at the same position are 0), and filter out the elements that satisfy both to create temp. Since each cell must contain two elements, count how many elements remain per index with transform('count'), keep only the rows that still have both elements, then groupby the index and aggregate back to lists:
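# Give every list element its own row; the original index label is repeated.
exploded = df.explode(df.columns.tolist())
# First condition: the two "VNF" values of a row sum to -1, i.e. [-1, 0] or [0, -1].
first_cond = exploded.groupby(level=0)['VNF'].transform('sum').eq(-1)
# Second condition: "VNF" is -1 and "EC1", "EC2", "CDC" are 0 at the same position.
second_cond = exploded['VNF'].eq(-1) & exploded['EC1'].eq(0) & exploded['EC2'].eq(0) & exploded['CDC'].eq(0)
# Drop the elements that satisfy both conditions.
temp = exploded[~(first_cond & second_cond)]
# Keep only rows that still have both elements, then rebuild the lists.
out = temp[temp.groupby(level=0)['VNF'].transform('count').gt(1)].groupby(level=0).agg(list).reset_index(drop=True)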
Output:
EC1 EC2 CDC L1 L2 L3 L4 L5 L6 VNF
0 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [1, 0]
1 [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1]
2 [0, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [1, 0]
3 [0, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [0, 1]
4 [1, 0] [0, 0] [0, 1] [0, 0] [0, 0] [0, 0] [0, 0] [0, 1] [0, 1] [-1, 0]
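As a cross-check, the same rule can also be written as a row-wise boolean mask. This is a different technique from the explode approach, not a restatement of it, and only a minimal sketch: the helper name should_drop and the constant CHECK_COLS are made up for illustration, and the frame below keeps only the four columns the rule actually reads (the L1 to L6 columns are left out for brevity).

```python
import pandas as pd

# Reduced copy of the question's data: only the columns the rule looks at.
df = pd.DataFrame({
    "EC1": [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [1, 0]],
    "EC2": [[0, 0] for _ in range(7)],
    "CDC": [[0, 0], [0, 0], [0, 0], [0, 0], [0, 1], [0, 1], [0, 1]],
    "VNF": [[1, 0], [0, 1], [-1, 0], [0, -1], [1, 0], [0, 1], [-1, 0]],
})

# Hypothetical name: columns that must hold 0 at the position of the -1.
CHECK_COLS = ["EC1", "EC2", "CDC"]


def should_drop(row):
    """True when VNF is [-1, 0] or [0, -1] and every CHECK_COLS
    cell is 0 at the position where VNF holds the -1."""
    vnf = row["VNF"]
    if sorted(vnf) != [-1, 0]:
        return False
    pos = vnf.index(-1)  # position of the -1 inside the VNF pair
    return all(row[col][pos] == 0 for col in CHECK_COLS)


# One boolean per row; rows flagged True are removed.
mask = df.apply(should_drop, axis=1)
out = df[~mask].reset_index(drop=True)
print(out)
```

On this data the mask flags the rows at index 2 and 3 and keeps the last row, because EC1 is 1 at the position of the -1 there. Row-wise apply is usually slower than the vectorised explode version on large frames, so treat it as a readable check rather than a replacement.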