Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

R: Adding a column to a data frame based on a column in another data frame causes several duplicate rows

I want to take the state name from one data frame, and put it in another based on the county fips code. Below is a snippet of the two

> head(arr)
  state_abb fips_state_county_code
1        CO                   8001
2        CO                   8001
3        CO                   8001
4        CO                   8001
5        CO                   8001
6        CO                   8001

> head(countyname)
    county       countyname
245   8001     Adams County
246   8003   Alamosa County
247   8005  Arapahoe County
248   8007 Archuleta County
249   8009      Baca County
250   8011      Bent County

When I use left_join, I get a ton of duplicates

xcountyname= left_join(arr, countyname, by =c("county" = "fips_state_county_code")) 

> head(xcountyname)
  county   countyname state_abb
1   8001 Adams County        CO
2   8001 Adams County        CO
3   8001 Adams County        CO
4   8001 Adams County        CO
5   8001 Adams County        CO
6   8001 Adams County        CO

It should only be adding the state column, but it has now gone from 535 rows to over 71k rows. Is there a better way to do this so that the state only fills in the existing rows of the xcountyname data frame?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Edit: arr is a subset of another dataset, that’s why there are duplicates. But it still stands that I only want the rows with the state name to transfer with the existing fips codes in xcountyname

>Solution :

You only want one state abb for each fips, so use distinct()

left_join(
  distinct(arr %>% select(state_abb, county=fips_state_county_code)),
  countyname
)
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading