Clean special character, numeric and character

I have a variable like the one below in my data frame:
df$emp_length (10+ years, <1 year, 8 years)
I need to clean this variable for better analysis. For example, I want to compare it with other categorical or numerical variables. What is the best way to separate this variable into multiple columns?
I am thinking of splitting it on the space, something like below:
df$emp_length = c("10+", "<1", "8")
df$years = c("years", "years", "years")
Also, I would like to know whether a number with special characters like + and < will be treated as numeric in R, or whether I have to separate the special characters from the numbers.
I want the emp_length variable to be numeric and the years variable to be character.
Please help!
1 Answer
One can use tidyr::extract to first separate emp_length into 2 columns. Then replace any symbol (anything other than 0-9) with "" in the column holding the number, and convert that column to numeric.
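On the question of whether values such as 10+ or <1 are treated as numeric: they are not. A minimal base R sketch, using the three values from the question as assumed input, shows that direct conversion yields NA for any string containing + or <, while stripping the non-digit characters first gives usable numbers.

```r
# Values as they would look once the unit ("years") has been split off
x <- c("10+", "<1", "8")

# Direct conversion: "10+" and "<1" are not valid numbers, so they become NA
# (R also emits a coercion warning, suppressed here)
direct <- suppressWarnings(as.numeric(x))

# Remove everything that is not a digit, then convert
cleaned <- as.numeric(gsub("[^0-9]", "", x))

print(direct)
print(cleaned)
```

Note that stripping the symbols discards the distinction between "10" and "10+" (and between "1" and "<1"), so keep the original column if that distinction matters for the analysis.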
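Option#1: Keep the symbol with the number
library(tidyverse)
# Split into the number (with an optional + or <) and the unit.
# Backslashes must be doubled inside an R string.
df <- df %>% extract(emp_length, c("emp_length", "years"),
                     regex = "([[:digit:]+<]+)\\s+(\\w+)")
df
#   emp_length years
# 1        10+ years
# 2         <1  year
# 3          8 years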
Option#2: Just the number, with the column converted to numeric
library(tidyverse)
df <- df %>%
  extract(emp_length, c("emp_length", "years"), regex = "([[:digit:]+<]+)\\s+(\\w+)") %>%
  # Drop every non-digit character, then convert to numeric
  mutate(emp_length = as.numeric(gsub("[^0-9]", "", emp_length)))
df
#   emp_length years
# 1         10 years
# 2          1  year
# 3          8 years
Data:
df <- data.frame(emp_length = c("10+ years", "<1 year", "8 years"),
stringsAsFactors = FALSE)
I don't want to create a separate column for the special characters. The problem is that the value <1 year (which starts with the < symbol) has been changed to NA and 1 instead of 1 and year. How can I fix this?
– Krishna
Jun 30 at 13:04
Output: num [1:39717] 10 NA 10 10 1 3 8 9 4 NA ...
– Krishna
Jun 30 at 13:14
I just noticed that there is a space between the less-than symbol < and the 1. The actual data is < 1 year. I think that is the problem. How can I fix this?
– Krishna
Jun 30 at 13:40
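One possible way to handle the "< 1 year" form, offered as a sketch and not necessarily the fix that was used in this thread: with the original pattern, the first group captures only "<" and the second captures "1", which is why the result is NA and 1. Allowing optional whitespace between the symbol and the digits keeps "< 1" together. The sample data below is assumed from the comment above, and df_clean is a hypothetical name.

```r
library(tidyr)

# Sample data assumed from the comment: note the space in "< 1 year"
df <- data.frame(emp_length = c("10+ years", "< 1 year", "8 years"),
                 stringsAsFactors = FALSE)

# Group 1: optional "<", optional whitespace, digits, optional "+"
# Group 2: the unit word ("year" or "years")
df_clean <- tidyr::extract(df, emp_length, c("emp_length", "years"),
                           regex = "([<]?\\s*[[:digit:]]+[+]?)\\s+(\\w+)")

# Drop every non-digit character (symbols and the embedded space), then convert
df_clean$emp_length <- as.numeric(gsub("[^0-9]", "", df_clean$emp_length))

print(df_clean)
```

Rows that do not match the pattern at all (for example a value with no number in it) come back as NA in both columns, so it is worth checking for unexpected NA values after running this on the full data.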
Perfect! thank you
– Krishna
Jul 1 at 7:18
I did the same. thank you
– Krishna
2 days ago
What value are you looking for for 10+ and <1?
– MKR
Jun 30 at 10:01