How to multiply all columns with each other

I have a pandas DataFrame and want to add new features to it, like this:

Say I have features X_1, X_2, X_3 and X_4. I want to add the pairwise products X_1 * X_2, X_1 * X_3, X_1 * X_4, and similarly X_2 * X_3, X_2 * X_4 and X_3 * X_4. I want to add these as new columns, not replace the original features.

How do I do that?

Solution:

from itertools import combinations

for c1, c2 in combinations(df.columns, r=2):
    df[f"{c1} * {c2}"] = df[c1] * df[c2]

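Take every 2-element combination of the columns (r=2), multiply the two columns, and assign the product to a new column named after the pair. The original columns stay in place, and combinations reads the column list once up front, so adding columns inside the loop does not affect the iteration.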

Example run:

In [66]: df
Out[66]:
   x1  y1  x2  y2
0  20   5  22  10
1  25   8  27   2

In [67]: from itertools import combinations

In [68]: for c1, c2 in combinations(df.columns, r=2):
    ...:     df[f"{c1} * {c2}"] = df[c1] * df[c2]
    ...:

In [69]: df
Out[69]:
   x1  y1  x2  y2  x1 * y1  x1 * x2  x1 * y2  y1 * x2  y1 * y2  x2 * y2
0  20   5  22  10      100      440      200      110       50      220
1  25   8  27   2      200      675       50      216       16       54
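
If you would rather not modify df in place, the same idea can be wrapped in a function that builds all the products first and concatenates them once. This is only a sketch: the function name add_pairwise_products and its sep parameter are made up for illustration, and it assumes all columns are numeric. Building the products in one go also avoids inserting columns one at a time, which can trigger a fragmentation PerformanceWarning in recent pandas versions when there are many columns.

```python
from itertools import combinations

import pandas as pd


def add_pairwise_products(df: pd.DataFrame, sep: str = " * ") -> pd.DataFrame:
    """Return a new frame: the original columns plus one product column per pair.

    Hypothetical helper for illustration; assumes every column is numeric.
    """
    products = {
        f"{c1}{sep}{c2}": df[c1] * df[c2]
        for c1, c2 in combinations(df.columns, r=2)
    }
    # Concatenate once; the input frame is left untouched.
    return pd.concat([df, pd.DataFrame(products, index=df.index)], axis=1)


df = pd.DataFrame({"x1": [20, 25], "y1": [5, 8], "x2": [22, 27], "y2": [10, 2]})
result = add_pairwise_products(df)
print(result)
```

Here df keeps its four original columns and result holds the ten columns shown in the example run above.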

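Another way is sklearn.preprocessing.PolynomialFeatures with interaction_only=True. By default it returns a NumPy array, so the column names have to be rebuilt: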

In [74]: df
Out[74]:
   x1  y1  x2  y2
0  20   5  22  10
1  25   8  27   2

In [75]: from sklearn.preprocessing import PolynomialFeatures

In [76]: poly = PolynomialFeatures(degree=2,
                                   interaction_only=True, 
                                   include_bias=False)

In [77]: poly.fit_transform(df)
Out[77]:
array([[ 20.,   5.,  22.,  10., 100., 440., 200., 110.,  50., 220.],
       [ 25.,   8.,  27.,   2., 200., 675.,  50., 216.,  16.,  54.]])

In [78]: new_columns = df.columns.tolist() + [*map(" * ".join,
                                                   combinations(df.columns, r=2))]

In [79]: df = pd.DataFrame(poly.fit_transform(df), columns=new_columns)

In [80]: df
Out[80]:
     x1   y1    x2    y2  x1 * y1  x1 * x2  x1 * y2  y1 * x2  y1 * y2  x2 * y2
0  20.0  5.0  22.0  10.0    100.0    440.0    200.0    110.0     50.0    220.0
1  25.0  8.0  27.0   2.0    200.0    675.0     50.0    216.0     16.0     54.0