Add a row in data.frame by counting row numbers of another csv with names stored in the data.frame using dplyr

April 4, 2023

I have a data frame of plant Latin names and a folder, GBIF_data, that holds the downloaded GBIF data as CSV files named after the Latin names in the data frame. I want to mutate a new column that stores how many records were downloaded from GBIF for each plant Latin name. Here is the code:

read.csv("data.csv") %>%
  mutate(OCCURRENCES = nrow(read.delim(CSVPATH))) #csv files downloaded from GBIF use tab as delimiter so here read.delim should be used

The data frame looks like this. I show only the CSVPATH column, which is built by prepending the folder path to the plant Latin name and replacing the spaces in the name with underscores; columns that are not relevant to the question have been omitted:

   CSVPATH                                                                            
 ../GBIF_data/Lycopodium_cernuum.csv          
 ../GBIF_data/Lycopodium_japonicum.csv        
 ../GBIF_data/Lycopodiastrum_casuarinoides.csv
 ../GBIF_data/Selaginella_uncinata.csv        
 ../GBIF_data/Selaginella_doederleinii.csv    
 ../GBIF_data/Equisetum_ramosissimum.csv      
 ../GBIF_data/Ophioglossum_reticulatum.csv    
 ../GBIF_data/Osmunda_vachellii.csv           
 ../GBIF_data/Lygodium_japonicum.csv          
 ../GBIF_data/Lygodium_microphyllum.csv

The CSV files stored in the GBIF_data folder are named the same way, with the spaces in the Latin name replaced by an underscore (_). When I ran the code, it reported this error:

Error in `mutate()`:
! Problem while computing `OCCURRENCES = nrow(read.delim(CSVPATH))`.
Caused by error in `h()`:
! error in evaluating the argument 'x' in selecting a method for function 'nrow': invalid 'description' argument

Why does dplyr::mutate not work in this situation? It successfully built CSVPATH from the Latin names with string operations, but it fails when reading another CSV file and counting its rows.

Thanks in advance!

>Solution :

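We need rowwise here because read.delim is not vectorized, i.e. it reads only a single file at a time. Without it, mutate passes the entire CSVPATH column to read.delim in one call, and a vector of several paths is not a valid file description, hence the error.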

library(dplyr)
read.csv("data.csv") %>%
  rowwise() %>%
  mutate(OCCURRENCES = nrow(read.delim(CSVPATH))) %>%
  ungroup()
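To see why the original call fails, here is a minimal base R sketch. The temporary directory and the two file names are hypothetical stand-ins for the GBIF downloads, not files from the question. Passing both paths at once should raise the same kind of "invalid 'description' argument" error, while reading one path per call works.

```r
# Hypothetical stand-ins for the GBIF downloads: two small tab-delimited files
tmp <- tempfile("gbif_demo_")
dir.create(tmp)
paths <- file.path(tmp, c("Species_a.csv", "Species_b.csv"))
write.table(data.frame(id = 1:3), paths[1], sep = "\t",
            row.names = FALSE, quote = FALSE)
write.table(data.frame(id = 1:5), paths[2], sep = "\t",
            row.names = FALSE, quote = FALSE)

# Passing the whole vector of paths, as an ungrouped mutate() does, fails;
# capture the error message instead of stopping the script
msg <- tryCatch(
  nrow(read.delim(paths)),
  error = function(e) conditionMessage(e)
)
print(msg)

# Reading one path per call works and gives one count per file
counts <- vapply(paths, function(p) nrow(read.delim(p)),
                 integer(1), USE.NAMES = FALSE)
print(counts)
```

rowwise() gives the same effect inside mutate: each row is evaluated separately, so read.delim only ever sees a single path.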

Another option is map_int from purrr, which calls the function once per path and returns an integer vector

library(purrr)
read.csv('data.csv') %>%
   mutate(OCCURRENCES = map_int(CSVPATH, ~ read.delim(.x) %>% nrow()))
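If some species in the data frame might not have a downloaded file yet, the map_int approach can be wrapped in a small helper. The sketch below is only an illustration: count_rows, the temporary directory, and the file names are assumptions, not part of the original question. It returns NA for a missing file instead of stopping with an error.

```r
library(dplyr)
library(purrr)

# Hypothetical helper: number of data rows in a tab-delimited file,
# or NA if the file does not exist
count_rows <- function(path) {
  if (!file.exists(path)) {
    return(NA_integer_)
  }
  nrow(read.delim(path))
}

# Demo data: one existing file with four records and one missing file
tmp <- tempfile("gbif_demo_")
dir.create(tmp)
existing <- file.path(tmp, "Species_a.csv")
write.table(data.frame(id = 1:4), existing, sep = "\t",
            row.names = FALSE, quote = FALSE)

df <- data.frame(
  CSVPATH = c(existing, file.path(tmp, "Species_missing.csv")),
  stringsAsFactors = FALSE
)

# map_int() calls count_rows() once per path
result <- df %>%
  mutate(OCCURRENCES = map_int(CSVPATH, count_rows))
print(result$OCCURRENCES)
```

Whether NA or 0 is the better value for a missing file depends on how the counts are used later; NA keeps "not downloaded" distinct from "downloaded with no records".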

byMR

Published April 04, 2023
