R: how to shorten a value to just one number inside the value

Chromosome_name             Start Position
CHR_HSCHR7_2_CTG6           142857940
CHR_HSCHR19LRC_PGF2_CTG3_1  54316049

I have just started to use R.
I have a data frame of chromosome names, and I want to replace each long name with just the chromosome number.
e.g. CHR_HSCHR19LRC_PGF2_CTG3_1 would become "19".
I need to replace the long name with the number that comes directly after the characters "HSCHR".
How would I do this?

I tried entering the replacement value manually:
gsub(".*HSCHR19", "19", dataframe)

But this takes far too long for a list of >100 values. I would like to find a way to do this automatically.

Solution:

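You can use sub() with a capture group. The pattern matches the whole string, captures the run of digits that directly follows "CHR", and replaces the entire match with just that captured group (here Chromosome_name is the character vector of names, i.e. the data frame column):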

sub('^.*CHR(\\d+).*$', '\\1', Chromosome_name)
#> [1] "7"  "19"
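
To apply the same idea to a column of a data frame, something like the following sketch should work. The data frame `df` and the new column name `Chromosome` are made up for illustration and simply mirror the two sample rows from the question. The pattern is anchored on "HSCHR" instead of "CHR", on the assumption that every name contains that prefix.

```r
# Hypothetical data frame mirroring the sample rows in the question
df <- data.frame(
  Chromosome_name = c("CHR_HSCHR7_2_CTG6", "CHR_HSCHR19LRC_PGF2_CTG3_1"),
  Start_Position  = c(142857940, 54316049),
  stringsAsFactors = FALSE
)

# Match the whole string, capture the digits right after "HSCHR",
# and replace the match with the captured group only
df$Chromosome <- sub("^.*HSCHR(\\d+).*$", "\\1", df$Chromosome_name)

df$Chromosome
#> [1] "7"  "19"
```

sub() is vectorised, so a single call handles every row. A name that does not match the pattern is returned unchanged, so it is worth checking the result for leftover long names. To overwrite the original column instead of adding a new one, assign the result back to `df$Chromosome_name`.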