IndexingError: Unalignable boolean Series provided as indexer, how to fix?

I am currently working on an Amazon products dataset and want to fill the NaNs in the column named "amazon_category_and_sub_category". I want to fill them with the mode of the category for each manufacturer:

modes = X_train.groupby(by="manufacturer")["amazon_category_and_sub_category"].apply(lambda x: np.nan if pd.Series.mode(x).size == 0 else pd.Series.mode(x)[0])

I calculate these modes from the X_train values, and now I want to do the same thing for X_test. As I understand it, I should reuse the modes from X_train. Before I do that, I need to check whether the test sample contains a manufacturer that is not in the training data:

nans_test = X_test["amazon_category_and_sub_category"].isna()

nans_test = X_test.loc[nans_test, "manufacturer"].isin(modes.index)

After that, when I try to set values using the nans_test mask:

X_test.loc[nans_test, "amazon_category_and_sub_category"] = modes[X_test.loc[nans_test, ["manufacturer"]]].to_numpy()

I get an error:

IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).

Can you please explain why this is happening and how to fix it?

UPD: I want to first fill the NaNs with modes where possible, and later choose a value for the NaNs that remain.

I checked the indices of both X_test and nans_test, and they look the same.
I tried to google the error, but it seems each case has its own specific bug in the code.

>Solution :

I think you need to compute both conditions on the full X_test, chain them with & (bitwise AND), and use Series.map for the mapping:

m1 = X_test["amazon_category_and_sub_category"].isna()
m2 = X_test["manufacturer"].isin(modes.index)

nans_test = m1 & m2

X_test.loc[nans_test, "amazon_category_and_sub_category"] = X_test.loc[nans_test, "manufacturer"].map(modes)
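
As for why the error happens: in the question, the second mask is built from X_test.loc[nans_test, "manufacturer"], which contains only the rows that were NaN, so its index is a subset of X_test.index and .loc cannot align it. Here is a minimal sketch of the difference. The toy data and the "unknown" placeholder for the remaining NaNs are assumptions for illustration only; the column names come from the question.

```python
import numpy as np
import pandas as pd

col = "amazon_category_and_sub_category"

# Hypothetical toy data, not the real dataset.
X_train = pd.DataFrame({
    "manufacturer": ["A", "A", "A", "B", "B"],
    col: ["toys", "toys", "games", "books", np.nan],
})
X_test = pd.DataFrame({
    "manufacturer": ["A", "B", "C", "A"],
    col: [np.nan, np.nan, np.nan, "games"],
})

# Most frequent category per manufacturer, learned from the training data only.
modes = X_train.groupby("manufacturer")[col].agg(
    lambda s: s.mode().iloc[0] if not s.mode().empty else np.nan
)

m1 = X_test[col].isna()

# The mask from the question: computed on the NaN rows only, so it is
# shorter than X_test and its index does not match X_test.index.
bad_mask = X_test.loc[m1, "manufacturer"].isin(modes.index)
same_index = bad_mask.index.equals(X_test.index)

# The aligned version: both conditions cover every row of X_test.
m2 = X_test["manufacturer"].isin(modes.index)
nans_test = m1 & m2

# Fill from the training modes by looking up each row's manufacturer.
X_test.loc[nans_test, col] = X_test.loc[nans_test, "manufacturer"].map(modes)

# Rows with an unseen manufacturer are still NaN; "unknown" is an assumed
# placeholder, any other fallback value could be used instead.
X_test[col] = X_test[col].fillna("unknown")
```

Because map looks each manufacturer up in modes by label, the result keeps the index of the selected rows, so no to_numpy() conversion is needed on the right-hand side.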
