R - How to perform arithmetic operations on some, but not all variables in rows, omitting NA

Question

Suppose I have a data.frame:

a <- c(1,2,3,4,5)
b <- c(0,1,NA,3,4)
c <- c(9,10,11,NA,13)

df <- data.frame(a,b,c)

I managed to write a user-defined function that sums selected variables row by row, ignoring NA (here I sum all the variables, but imagine a large data.frame where I only need to add up a few of them):

sum.df.na.rm <- function(x) {
    rowSums(df[,x], na.rm = TRUE)
}

df$d <- sum.df.na.rm(c("a","b","c"))

> df
  a  b  c  d
  1  0  9 10
  2  1 10 13
  3 NA 11 14
  4  3 NA  7
  5  4 13 22

Now suppose I want to subtract b from a and add c, but still ignoring NA. I can do:

df$bneg <- df$b * (-1)
df$e <- sum.df.na.rm(c("a","bneg","c"))

> df
  a  b  c  d bneg  e
  1  0  9 10    0 10
  2  1 10 13   -1 11
  3 NA 11 14   NA 14
  4  3 NA  7   -3  1
  5  4 13 22   -4 14

But creating a new column that multiplies b by (-1), just so that it gets subtracted inside sum.df.na.rm, seems very inefficient to me.

How would you do this without using the bneg intermediate variable?

r dataframe

iraserd Feb 20 '14 at 10:32

1 answer

Thomas · Accepted Answer · 2014-02-20T10:42:34+0000

You can define your own infix operators that treat NA as 0:

> `%+%` <- function(e1, e2) {e1[is.na(e1)] <- 0; e2[is.na(e2)] <- 0; return(e1 + e2)}
> `%-%` <- function(e1, e2) {e1[is.na(e1)] <- 0; e2[is.na(e2)] <- 0; return(e1 - e2)}
> within(df, e <- a %-% b %+% c)
  a  b  c  e
1 1  0  9 10
2 2  1 10 11
3 3 NA 11 14
4 4  3 NA  1
5 5  4 13 14
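
If you would rather stay close to the rowSums approach from the question, another option is to give each column a sign and sum the signed values. This is only a minimal sketch: signed_row_sum and its arguments (data, cols, signs) are hypothetical names made up for illustration, not part of base R or of the answer above.

```r
# Hypothetical helper: row-wise sum of selected columns, each multiplied by
# a sign (+1 to add, -1 to subtract), skipping NA values.
signed_row_sum <- function(data, cols, signs) {
  stopifnot(length(cols) == length(signs))
  # Work on a numeric matrix so sweep() and rowSums() behave predictably.
  m <- as.matrix(data[, cols, drop = FALSE])
  # Multiply column j by signs[j], then sum across each row, ignoring NA.
  rowSums(sweep(m, 2, signs, `*`), na.rm = TRUE)
}

df <- data.frame(a = c(1, 2, 3, 4, 5),
                 b = c(0, 1, NA, 3, 4),
                 c = c(9, 10, 11, NA, 13))

# a - b + c, without an intermediate bneg column
df$e <- signed_row_sum(df, c("a", "b", "c"), c(1, -1, 1))
```

With the example data this should give the same e column as above (10, 11, 14, 1, 14). As with rowSums(..., na.rm = TRUE) in the question, a row in which every selected value is NA comes out as 0 rather than NA.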
