I am using pandas to check whether one dataframe is contained within another. The method .isin() is only helpful (i.e., returns True) when the labels match (ref: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.isin.html), but I want to go further than this and include cases where the labels don't match.
Example: df1:
+----+----+----+----+----+
| 3 | 4 | 5 | 6 | 7 |
+----+----+----+----+----+
| 11 | 13 | 10 | 15 | 12 |
+----+----+----+----+----+
| 8 | 2 | 9 | 0 | 1 |
+----+----+----+----+----+
| 14 | 23 | 31 | 21 | 19 |
+----+----+----+----+----+
df2:
+----+----+
| 13 | 10 |
+----+----+
| 2 | 9 |
+----+----+
I want the output to be True, since df2 is contained in df1.
Any ideas on how to do that using pandas?
>Solution :
You can use numpy's sliding_window_view:
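from numpy.lib.stride_tricks import sliding_window_view as swv  # requires NumPy >= 1.20
# compare every df2-shaped window of df1 with df2, then check whether any window matches in full
(swv(df1, df2.shape) == df2.to_numpy()).all((-2, -1)).any()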
Output: True
Intermediate:
(swv(df1, df2.shape)==df2.to_numpy()).all((-2, -1))
array([[False, False, False, False],
       [False,  True, False, False],  # df2 is found at position 1,1
       [False, False, False, False]])
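
For completeness, here is a self-contained sketch of the same approach that rebuilds the two example frames and also reports where the match starts. The helper name find_block is made up for illustration (it is not a pandas or NumPy function), and the shape guard is an added assumption: sliding_window_view raises a ValueError when the window is larger than the array, so that case is treated as "no match".

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def find_block(outer: pd.DataFrame, inner: pd.DataFrame) -> np.ndarray:
    """Return the (row, col) start positions where `inner` occurs as a
    contiguous block of values in `outer`, ignoring all labels.

    `find_block` is a hypothetical helper written for this example.
    """
    big = outer.to_numpy()
    small = inner.to_numpy()
    # Assumption: a block larger than the outer frame simply cannot match.
    if small.shape[0] > big.shape[0] or small.shape[1] > big.shape[1]:
        return np.empty((0, 2), dtype=int)
    # One window of shape `small.shape` per possible top-left corner.
    windows = sliding_window_view(big, small.shape)
    # A window is a hit only if every element equals the matching element of `small`.
    hits = (windows == small).all(axis=(-2, -1))
    return np.argwhere(hits)


df1 = pd.DataFrame([[3, 4, 5, 6, 7],
                    [11, 13, 10, 15, 12],
                    [8, 2, 9, 0, 1],
                    [14, 23, 31, 21, 19]])
df2 = pd.DataFrame([[13, 10],
                    [2, 9]])

positions = find_block(df1, df2)
print(len(positions) > 0)   # True
print(positions.tolist())   # [[1, 1]]
```

Note that the comparison uses ==, so NaN values never match each other; frames with missing data would need extra handling.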