Flatten / denormalize the result of the aggregate function in R

I am new to R, and I am trying to use aggregate to run some time series functions on a data frame, for every subject and for every metric in my dataset. This works well, but the result is not in a format that is easy to use. I would like to convert the results back to the same format as the original data frame.

Using the iris dataset as an example:

# Split into two data frames, one for metrics, the other for grouping
iris_species = subset(iris, select=Species)
iris_metrics = subset(iris, select=-Species)
# Compute diff for each metric with respect to its species
iris_diff = aggregate(iris_metrics, iris_species, diff)

I am using diff only to illustrate that my function produces a time series, so for each group I get a series (possibly of varying length) and definitely more than a single aggregate value (such as a mean).

I would like to convert the result, which appears to be a matrix with list-valued cells, back to the original “flat” data frame format.

I am mostly curious how to manage this with the results from aggregate, but I would be fine with solutions that do everything in plyr or reshape.

+5
4 answers

As you know, aggregate works on one column at a time. It expects a single value per group, and odd things happen if the function returns vectors of length other than 1.

You can split the data with by instead (the result has fewer rows than iris) and then put the pieces back together:
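
A minimal sketch of that behavior, using only base R and the built-in iris data. The names agg_mean and agg_diff are made up for illustration. With a function that returns one value per group the result is an ordinary flat data frame, while diff produces matrix-valued columns:

```r
# One value per group: each metric is a plain numeric column
agg_mean <- aggregate(iris[1:4], iris["Species"], mean)
str(agg_mean)

# Many values per group: each metric becomes a matrix stored in one column
agg_diff <- aggregate(iris[1:4], iris["Species"], diff)
is.matrix(agg_diff$Sepal.Length)
dim(agg_diff$Sepal.Length)
```

The data frame still reports five variables, which is why the result looks compact but is awkward to work with.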

b <- by(iris_metrics, iris_species, FUN=function(x) diff(as.matrix(x)))
do.call(rbind, lapply(names(b), function(x) data.frame(Species=x, b[[x]])))

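diff(as.matrix(x)) computes the differences down each column of a matrix (diff does not do this for a data frame). As a result, each Species ends up with one fewer row than it has in iris.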
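
To see what diff does with a matrix, here is a small sketch on made-up data (the matrix m and its values are purely illustrative):

```r
# Toy matrix with two named columns and four rows
m <- cbind(a = c(1, 4, 9, 16), b = c(10, 20, 40, 80))

# diff() on a matrix differences consecutive rows, column by column
d <- diff(m)
d
dim(d)   # one fewer row than m, same number of columns
```

The same thing happens within each Species above, which is why the combined result is shorter than iris.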

+2

Another option is data.table:

require(data.table)
dt <- data.table(iris, key="Species")
dt.out <- dt[, lapply(.SD, diff), by=Species]

With plyr, you can do the following: split by Species and apply diff to each metric column.

require(plyr)
ddply(iris, .(Species), function(x) do.call(cbind, lapply(x[,1:4], diff)))
+2

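If you want a result with the same number of rows as the input, use ave. Because diff returns a vector that is one element shorter, pad it with NA (here, at the start of each group).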

iris_diff = lapply(iris_metrics, 
        function(xx) ave(xx, iris_species, FUN=function(x) c(NA, diff(x) ) )  )
str(iris_diff)
#--------------
List of 4
 $ Sepal.Length: num [1:150] NA -0.2 -0.2 -0.1 0.4 ...
 $ Sepal.Width : num [1:150] NA -0.5 0.2 -0.1 0.5 0.3 -0.5 0 -0.5 0.2 ...
 $ Petal.Length: num [1:150] NA 0 -0.1 0.2 -0.1 ...
 $ Petal.Width : num [1:150] NA 0 0 0 0 0.2 -0.1 -0.1 0 -0.1 ...

That is a list, so to get a data.frame that includes the Species column, combine the two:

iris_diff <- data.frame( Species= iris_species, iris_diff)
str(iris_diff)
#------
'data.frame':   150 obs. of  5 variables:
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Sepal.Length: num  NA -0.2 -0.2 -0.1 0.4 ...
 $ Sepal.Width : num  NA -0.5 0.2 -0.1 0.5 0.3 -0.5 0 -0.5 0.2 ...
 $ Petal.Length: num  NA 0 -0.1 0.2 -0.1 ...
 $ Petal.Width : num  NA 0 0 0 0 0.2 -0.1 -0.1 0 -0.1 ...
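
The padding idea behind the ave approach can be checked on a small made-up vector (x, g, and padded are hypothetical names used only for this sketch):

```r
# Six values in two groups of three
x <- c(1, 3, 6, 10, 20, 50)
g <- c("a", "a", "a", "b", "b", "b")

# Within each group, take differences and pad with a leading NA
# so the result has the same length as the input
padded <- ave(x, g, FUN = function(v) c(NA, diff(v)))
padded
length(padded) == length(x)
```

Because the output lines up element by element with the input, it can be bound straight back onto the grouping column.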
+1

The main issue is that aggregate returns a matrix for each of "Sepal.Length", "Sepal.Width", and so on.

> str(iris_diff)
'data.frame':   3 obs. of  5 variables:
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 2 3
 $ Sepal.Length: num [1:3, 1:49] -0.2 -0.6 -0.5 -0.2 0.5 ...
 $ Sepal.Width : num [1:3, 1:49] -0.5 0 -0.6 0.2 -0.1 0.3 -0.1 -0.8 -0.1 0.5 ...
 $ Petal.Length: num [1:3, 1:49] 0 -0.2 -0.9 -0.1 0.4 ...
 $ Petal.Width : num [1:3, 1:49] 0 0.1 -0.6 0 0 0.2 0 -0.2 -0.3 0 ...

In other words, the data effectively spans 197 columns (Species plus 49 differences for each of the 4 metrics).

To convert "iris_diff" into a data.frame with 197 columns (credit to @James, here on SO):

do.call(data.frame, iris_diff)

To check, here are the first lines of the str output:

> str(do.call(data.frame, iris_diff))
'data.frame':   3 obs. of  197 variables:
 $ Species        : Factor w/ 3 levels "setosa","versicolor",..: 1 2 3
 $ Sepal.Length.1 : num  -0.2 -0.6 -0.5
 $ Sepal.Length.2 : num  -0.2 0.5 1.3
 $ Sepal.Length.3 : num  -0.1 -1.4 -0.8
 $ Sepal.Length.4 : num  0.4 1 0.2
 $ Sepal.Length.5 : num  0.4 -0.8 1.1
 $ Sepal.Length.6 : num  -0.8 0.6 -2.7
 $ Sepal.Length.7 : num  0.4 -1.4 2.4
 $ Sepal.Length.8 : num  -0.6 1.7 -0.6
 $ Sepal.Length.9 : num  0.5 -1.4 0.5
 $ Sepal.Length.10: num  0.5 -0.2 -0.7
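
If the goal is the original "flat" layout (one row per difference, one column per metric) rather than 197 columns, one possible base R sketch is to unroll each matrix column with as.vector. This assumes every group yields the same number of values, as it does for iris; the name flat is made up for illustration.

```r
iris_diff <- aggregate(iris[1:4], iris["Species"], diff)

# Each metric is a 3 x 49 matrix; as.vector() unrolls it column by column,
# so the Species labels cycle once per matrix column
n_steps <- ncol(iris_diff$Sepal.Length)
flat <- data.frame(
  Species = rep(iris_diff$Species, times = n_steps),
  lapply(iris_diff[-1], as.vector)
)

# Group the rows by Species again; ties keep their original (time) order
flat <- flat[order(flat$Species), ]
str(flat)
```

This would not work as written if the groups produced series of different lengths, since aggregate could not simplify them into a matrix.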
+1
