Suppose that I have a dataset traffic with a column Traffic_count that displays the traffic count for each traffic counting station:
| Traffic_counting_station_ID | Traffic_count |
|---|---|
| 1 | 24.592 |
| 2 | 65.500 |
| 3 | 4.976 |
The problem is that Traffic_count is interpreted as a float type while the values should represent integer numbers. As an example, when I generate a new column Traffic_count_TimesTen which is formulated as traffic$Traffic_count*10, the resulting table is:
| Traffic_counting_station_ID | Traffic_count | Traffic_count_TimesTen |
|---|---|---|
| 1 | 24.592 | 245.92 |
| 2 | 65.500 | 655.00 |
| 3 | 4.976 | 49.76 |
When I apply traffic$Traffic_count <- as.integer(traffic$Traffic_count), the decimals are simply truncated, so the values for Traffic_count become 24, 65 and 4 respectively.
Applying traffic$Traffic_count <- as.numeric(gsub(".","",traffic$Traffic_count)) to remove the decimal point results in NA for every value.
How can I convert the values of Traffic_count to integer numbers so that the values in Traffic_count are regarded as 24592, 65500, 4976 and the values in Traffic_count_TimesTen are regarded as 245920, 655000 and 49760?
>Solution :
Your gsub solution is almost correct. Try the following:
traffic$Traffic_count <- as.numeric(gsub("\\.","",traffic$Traffic_count))
Explanation
The . is a special character in regex that matches any single character, so gsub(".", "", x) replaces every character with "" and leaves an empty string, which as.numeric turns into NA. If you explicitly want to match a literal dot, you need to put a \ in front of it. But since R itself interprets a single \ inside a string as the start of an escape sequence, the backslash has to be escaped with a second backslash, hence \\.
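To make the difference concrete, here is a minimal sketch that compares the unescaped pattern, the escaped pattern, and gsub's fixed = TRUE argument, which treats the pattern as a plain string. The data frame below is made up for illustration and assumes Traffic_count is stored as text. That assumption matters: if the column has already been parsed as numeric, 65.500 is held as 65.5, and removing the dot would give 655 instead of 65500. In that case, scaling by 1000 is one possible workaround, assuming every value really has three digits after the separator.

```r
# Hypothetical example data; the counts are assumed to be stored as text.
traffic <- data.frame(
  Traffic_counting_station_ID = 1:3,
  Traffic_count = c("24.592", "65.500", "4.976"),
  stringsAsFactors = FALSE
)

# Unescaped "." matches every character, so only empty strings remain
# and as.numeric() returns NA for each of them.
unescaped <- as.numeric(gsub(".", "", traffic$Traffic_count))

# Escaped dot: only the literal "." is removed.
escaped <- as.integer(gsub("\\.", "", traffic$Traffic_count))

# fixed = TRUE treats the pattern as a plain string, so no escaping is needed.
literal <- as.integer(gsub(".", "", traffic$Traffic_count, fixed = TRUE))

# If the column is already numeric, the trailing zeros of 65.500 are gone.
# Scaling is one way around that (assumes exactly three digits after the dot).
numeric_counts <- c(24.592, 65.500, 4.976)
scaled <- as.integer(round(numeric_counts * 1000))
```

With the values converted this way, traffic$Traffic_count * 10 gives the expected 245920, 655000 and 49760. If the dot is a thousands separator in the source file, it may be cleaner to handle it when reading the data, for example by importing the column as character first.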