I have the following data frame:
import pandas as pd
df = pd.DataFrame([([40.33, 40.34, 40.22],[-71.11, -71.21, -71.14],[12, 45, 10]), ([41.23, 41.40, 41.22],[-72.01, -72.01, -72.01],[11, 23, 15]), ([43.33, 43.34],[-70.11, -70.21],[12, 40]), ([41.23, 41.40], [-72.01, -72.01, -72.01], [11, 23, 15])], columns=['long', 'lat', 'accuracy'])
long lat accuracy
[40.33, 40.34, 40.22] [-71.11, -71.21, -71.14] [12, 45, 10]
[41.23, 41.40, 41.22] [-72.01, -72.01, -72.01] [11, 23, 15]
[43.33, 43.34] [-70.11, -70.21] [12, 40]
[41.23, 41.40] [-72.01, -72.01, -72.01] [11, 23, 15]
...
Each cell contains a list of numbers. For each row, I want to check whether the lists in all three columns have the same length. What is the best way to do this and return a new column named sanity that is TRUE if all lists in the row have the same size, and FALSE if at least one list has a different size from the rest?
The expected output is:
long lat accuracy sanity
[40.33, 40.34, 40.22] [-71.11, -71.21, -71.14] [12, 45, 10] TRUE
[41.23, 41.40, 41.22] [-72.01, -72.01, -72.01] [11, 23, 15] TRUE
[43.33, 43.34] [-70.11, -70.21] [12, 40] TRUE
[41.23, 41.40] [-72.01, -72.01, -72.01] [11, 23, 15] FALSE
>Solution :
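You can approach this with applymap (to get the length of each list) and nunique (to count the distinct lengths in each row); a row is consistent when that count equals 1. Note that in pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map, which is called the same way: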
df["sanity"] = df.applymap(len).nunique(axis=1).eq(1)
print(df)
# Output:
long lat accuracy sanity
0 [40.33, 40.34, 40.22] [-71.11, -71.21, -71.14] [12, 45, 10] True
1 [41.23, 41.4, 41.22] [-72.01, -72.01, -72.01] [11, 23, 15] True
2 [43.33, 43.34] [-70.11, -70.21] [12, 40] True
3 [41.23, 41.4] [-72.01, -72.01, -72.01] [11, 23, 15] False
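To see what the one-liner is doing, here is a rough step-by-step sketch of the same idea. It is not the only way to write it: it assumes the list columns are exactly long, lat and accuracy (list_cols is just a helper name introduced here), and it uses Series.map(len) per column instead of applymap so that it does not depend on the pandas version.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ([40.33, 40.34, 40.22], [-71.11, -71.21, -71.14], [12, 45, 10]),
        ([41.23, 41.40, 41.22], [-72.01, -72.01, -72.01], [11, 23, 15]),
        ([43.33, 43.34], [-70.11, -70.21], [12, 40]),
        ([41.23, 41.40], [-72.01, -72.01, -72.01], [11, 23, 15]),
    ],
    columns=["long", "lat", "accuracy"],
)

# Assumption: only these columns hold lists (helper name, not from the question).
list_cols = ["long", "lat", "accuracy"]

# Step 1: replace every list with its length, column by column.
lengths = df[list_cols].apply(lambda col: col.map(len))

# Step 2: count how many distinct lengths appear in each row.
distinct_lengths = lengths.nunique(axis=1)

# Step 3: a row is consistent when all lists share one length.
df["sanity"] = distinct_lengths.eq(1)

print(df)
```

Selecting list_cols explicitly also means the snippet can be re-run after the sanity column exists, since len() is never applied to the boolean values in that column.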