Change Value of a Dataframe Column Based on a Filter with specific parameters

I’m looking at this but I have no idea how to formulate it:
Change Value of a Dataframe Column Based on a Filter

I need to change the values in medianIncome: any value of 0.4999 or lower should become 0.4999, and any value of 15.0001 or higher should become 15.0001.

Here’s sample data:

           id  longitude_x  latitude  ocean_proximity  longitude_y  state  medianHouseValue  housingMedianAge  totalBedrooms  totalRooms  households  population  medianIncome
0           1      -122.23     37.88         NEAR BAY      -122.23     CA           452.603              45.0          131.0       884.0       130.0       323.0       83252.0
1         396      -122.34     37.88         NEAR BAY      -122.23     CA           350.004              41.0          930.0      3063.0       926.0      2560.0       17375.0
2         398      -122.29     37.88         NEAR BAY      -122.23     CA           216.703              54.0          263.0      1211.0       230.0       525.0       38672.0
3         401      -122.28     37.88         NEAR BAY      -122.23     CA           261.303              55.0          333.0      1845.0       335.0       772.0       42614.0
4         424      -122.26     37.88         NEAR BAY      -122.23     CA           391.803              53.0          418.0      2553.0       404.0       898.0       62425.0
...       ...          ...       ...              ...          ...    ...               ...               ...            ...         ...         ...         ...           ...
929044   9476      -123.38     39.37           INLAND      -121.24     CA           124.601              20.0          813.0      3947.0       732.0      1902.0       26424.0
929045   9494      -123.75     39.37           INLAND      -121.24     CA           151.403              20.0          299.0      1377.0       282.0       830.0       32500.0
929046  10065      -121.03     39.37           INLAND      -121.24     CA            85.000              15.0          327.0      1338.0       310.0      1174.0       26341.0
929047  10074      -120.10     39.37           INLAND      -121.24     CA           117.301              34.0          411.0      2328.0       373.0      1016.0       45208.0
929048  21558      -121.24     39.37           INLAND      -121.24     CA            89.401              18.0          616.0      2787.0       532.0      1387.0       23886.0

It shows:

np.where((df['x'] > 0) & (df['y'] < 10), 1, 0)

So I’m at:

np.where(housing['medianIncome'] > 15.0001

And I’m stuck on the rest. I can only use pandas and numpy, and I’m not able to use lambda.

I’m expecting an outcome that won’t give an error. As of yet, I don’t have an outcome.

Solution:

Use Series.clip:

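import pandas as pd

housing = pd.DataFrame({'medianIncome': [20, 5, 0.07]})

# Clamp to [0.4999, 15.0001]: smaller values become 0.4999, larger values become 15.0001
housing['medianIncome'] = housing['medianIncome'].clip(lower=0.4999, upper=15.0001)

print(housing)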
   medianIncome
0       15.0001
1        5.0000
2        0.4999
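
For comparison, the same clamping can also be written with nested np.where calls, which is the form the question started from. This is only a minimal sketch on the same three sample values; the names lower, upper and income are introduced here for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

housing = pd.DataFrame({'medianIncome': [20, 5, 0.07]})

# Illustrative names for the two bounds from the question
lower, upper = 0.4999, 15.0001

income = housing['medianIncome']

# Outer where caps values above `upper`; inner where raises values below `lower`;
# everything in between passes through unchanged.
housing['medianIncome'] = np.where(income > upper, upper,
                                   np.where(income < lower, lower, income))

print(housing)
```

Series.clip is shorter and states the intent more directly, so the nested form is mainly useful for seeing how the np.where attempt in the question could be completed.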

Alternatively, use numpy.select if you need to set other values depending on the conditions:

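import numpy as np

# Rebuild the sample so the conditions see the original values (20, 5, 0.07)
housing = pd.DataFrame({'medianIncome': [20, 5, 0.07]})

# First condition -> 0, second condition -> 1, otherwise keep the existing value
housing['medianIncome'] = np.select([housing['medianIncome'].lt(0.4999),
                                     housing['medianIncome'].gt(15.0001)],
                                    [0, 1],
                                    default=housing['medianIncome'])

print(housing)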
   medianIncome
0           1.0
1           5.0
2           0.0
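
If the goal is still to clamp to the two thresholds rather than to assign other values, the same np.select pattern can take the bounds themselves as the choices. This is a sketch under that assumption. It uses le/ge to match the "or lower" / "or higher" wording in the question and compares the result against Series.clip. The names conditions, choices and expected are illustrative only.

```python
import numpy as np
import pandas as pd

housing = pd.DataFrame({'medianIncome': [20, 5, 0.07]})

# Keep an independent copy of the original values for the comparison below
income = housing['medianIncome'].copy()

conditions = [income.le(0.4999), income.ge(15.0001)]
choices = [0.4999, 15.0001]

# Rows matching neither condition keep their original value via `default`
housing['medianIncome'] = np.select(conditions, choices, default=income)

# Reference result from Series.clip on the same data
expected = income.clip(lower=0.4999, upper=15.0001)

print(housing)
print((housing['medianIncome'] == expected).all())
```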