R - Create multiple new columns using conditional statements


R - Create multiple new columns using conditional statements



R - Create multiple new columns using conditional statements



I am wondering if there is a way to create multiple columns based on a condition(s).



For example below, I have a dataframe with data and I want to create two columns based on the based on ccy. One column is a gbp conversion rate of the ccy and the other is a cad conversion.



If I pipe the mutates I can get it to work but there is repetition (and in my real problem, I have a complex list of ifelse so repeating the code for every column would create a lot of repetition).


df <- structure(list(product = c('option', 'forward', 'forward', 'option'),
ccy = c('usd', 'usd', 'eur', 'usd'),
amount = c(1000, 2000, 1000, 5000)),
.Names = c('product', 'ccy', 'amount'),
row.names = c(NA, 4L),
class = "data.frame")
df
product ccy amount
1 option usd 1000
2 forward usd 2000
3 forward eur 1000
4 option usd 5000

df %>% mutate(gbp_amount =
ifelse(ccy == 'usd', round(amount / 1.8, 2),
ifelse(ccy == 'eur', round(amount / 1.3, 2),
'not_converted'))) %>%
mutate(cad_amount =
ifelse(ccy == 'usd', round(amount / 0.85, 2),
ifelse(ccy == 'eur', round(amount / .7, 2),
'not_converted')))

product ccy amount gbp_amount cad_amount
1 option usd 1000 555.56 1176.47
2 forward usd 2000 1111.11 2352.94
3 forward eur 1000 769.23 1428.57
4 option usd 5000 2777.78 5882.35



Is there a way to create multiple columns based on a single if condition?



E.g., something like this pseudo code...


df %>% ifelse(df$ccy == 'usd',
(mutate(gbp_amount = round(amount / 1.8, 2)),
mutate(cad_amount = round(amount / 0.85, 2))),
ifelse(df$ccy == 'eur',
(mutate(gbp_amount = round(amount / 1.3, 2)),
mutate(cad_amount = round(amount / 0.7, 2))),
'not_converted'))




3 Answers
3



You can use SQL-alike joins if you have many "equals" conditions.



I am using the data.table syntax but you could do this dplyr too:


data.table


dplyr


library(data.table)

setDT(df)

# add a row which cannot be found ("joined") to demonstrate missing rates
df <- rbind(df, data.table(product = "option", ccy = "aud", amount = 3000))
df

lookup <- data.table(ccy = c("usd", "eur"),
gbp_rate = c( 1.8, 1.3),
cad_rate = c( 0.85, 0.7))
lookup
# ccy gbp_rate cad_rate
# 1: usd 1.8 0.85
# 2: eur 1.3 0.70

df[lookup, `:=`(gbp_amount = round(amount / gbp_rate, 2),
cad_amount = round(amount / cad_rate, 2)),
on = "ccy"]
df
# product ccy amount gbp_amount cad_amount
# 1: option usd 1000 555.56 1176.47
# 2: forward usd 2000 1111.11 2352.94
# 3: forward eur 1000 769.23 1428.57
# 4: option usd 5000 2777.78 5882.35
# 5: option aud 3000 NA NA



You have to sort the result as you like and mark the lookup errors (missing conversion rates) with another value than NA if you want (but not with the string "not_converted" like in your question since this would mix up the data type of the column - double vs character).


NA


"not_converted"



Consider building a rates data set and merge with your original, avoiding nested ifelse:


ifelse


rates_df <- data.frame(ccy = c('usd', 'eur'),
type = c('gbp', 'gbp', 'cad', 'cad'),
rate = c(1.8, 1.3, 0.85, 0.7),
stringsAsFactors = FALSE)
rates_df

df %>%
inner_join(rates_df, by="ccy") %>%
mutate(gbp_amount = ifelse(type=="gbp", round(amount / rate, 2), 0),
cad_amount = ifelse(type=="cad", round(amount / rate, 2), 0)) %>%
select(product, ccy, matches("amount")) %>%
group_by(product, ccy, amount) %>%
summarise_all(sum)

# # A tibble: 4 x 5
# # Groups: product, ccy [?]
# product ccy amount gbp_amount cad_amount
# <chr> <chr> <dbl> <dbl> <dbl>
# 1 forward eur 1000 769.23 1428.57
# 2 forward usd 2000 1111.11 2352.94
# 3 option usd 1000 555.56 1176.47
# 4 option usd 5000 2777.78 5882.35



You'll have to use a for-loop if you want to do multiple actions. @R Yoda's solution is probably better though. Like he said, I'd use NA rather than a character string so you aren't mixing data types in a vector, otherwise it'll default to character.


for-loop


for (i in 1:nrow(df)) {
if(df$ccy[i] == "usd") {
df$gbp_amount[i] <- round(df$amount[i] / 1.8, 2);
df$cad_amount[i] <- round(df$amount[i] / 0.85, 2);
} else {
NA
}
if(df$ccy[i] == "eur") {
df$gbp_amount[i] <- round(df$amount[i] / 1.3, 2);
df$cad_amount[i] <- round(df$amount[i] / 0.7, 2);
} else {
NA
}
}





You just converted a vectorized process to rowwise iteration!
– Parfait
Jun 29 at 21:27





Heresy, I know! User asked for a single if statement though, careful what you wish for.
– Anonymous coward
Jun 29 at 21:51


if






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

.

How to generate build files in next js

Delphi Android file open failure with API 26