I solved this by computing the sample covariance matrix iteratively, so only a subset of the data has to be in memory at any point in time. Reading just a subset can be done with readLines: you open a file connection and read from it repeatedly. The algorithm looks something like this (it is a two-stage algorithm):
Calculate the mean of each column (assuming the columns are the variables):
1. Open a file connection (con = file(...))
2. Read 1000 lines (readLines(con, n = 1000))
3. Calculate the sum per column
4. Add it to a running total (sos_column = sos_column + new_sos)
5. Repeat steps 2-4 until the end of the file.
6. Divide by the number of rows to get the mean.
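The first stage can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: it assumes a headerless, comma-separated file of numbers, and the demo data, `path`, `chunk_size`, `col_sum` and `n_rows` are hypothetical names introduced here. It accumulates plain column sums, since the mean is the column total divided by the row count.

```r
# Hypothetical demo data: a small headerless CSV written to a temp file.
set.seed(1)
x <- matrix(rnorm(5000 * 3), ncol = 3)
path <- tempfile(fileext = ".csv")
write.table(x, path, sep = ",", row.names = FALSE, col.names = FALSE)

chunk_size <- 1000             # lines read per iteration
con <- file(path, open = "r")  # file() creates and opens the connection
col_sum <- NULL                # running per-column totals
n_rows <- 0                    # running row count

repeat {
  lines <- readLines(con, n = chunk_size)
  if (length(lines) == 0) break  # nothing left to read
  # Parse the chunk into a numeric matrix (assumes comma-separated numbers).
  chunk <- as.matrix(read.csv(text = lines, header = FALSE))
  col_sum <- if (is.null(col_sum)) colSums(chunk) else col_sum + colSums(chunk)
  n_rows <- n_rows + nrow(chunk)
}
close(con)

col_means <- col_sum / n_rows
```

Only one chunk of parsed rows is held at a time; `col_sum` and `n_rows` are the only state carried between iterations.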
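Calculate the covariance matrix:
1. Open a file connection (con = file(...))
2. Read 1000 lines (readLines(con, n = 1000))
3. Subtract the column means and calculate all cross-products using crossprod
4. Add these cross-products to a running total
5. Repeat steps 2-4 until the end of the file.
6. Divide by the number of rows minus 1 to get the covariance.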
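Both passes together might look like the sketch below. It makes the same assumptions as before (headerless, comma-separated numeric file), and `fold_chunks` is a hypothetical helper written for this example, not an existing R function. Each chunk is centered with the column means from the first pass before `crossprod` is applied, so the accumulated total divided by the number of rows minus 1 should match what `cov` gives on the full data.

```r
# Hypothetical demo data written to a temp file.
set.seed(1)
x <- matrix(rnorm(5000 * 3), ncol = 3)
path <- tempfile(fileext = ".csv")
write.table(x, path, sep = ",", row.names = FALSE, col.names = FALSE)

# Hypothetical helper: apply f to an accumulator and each successive chunk.
fold_chunks <- function(path, init, f, chunk_size = 1000) {
  con <- file(path, open = "r")
  on.exit(close(con))
  acc <- init
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break  # nothing left to read
    chunk <- as.matrix(read.csv(text = lines, header = FALSE))
    acc <- f(acc, chunk)
  }
  acc
}

# Stage 1: column sums and row count give the column means.
s1 <- fold_chunks(path, list(sum = 0, n = 0), function(a, ch) {
  list(sum = a$sum + colSums(ch), n = a$n + nrow(ch))
})
col_means <- s1$sum / s1$n

# Stage 2: accumulate cross-products of the centered chunks.
cp <- fold_chunks(path, 0, function(a, ch) {
  a + crossprod(sweep(ch, 2, col_means))  # sweep() subtracts the means
})
your_covmat <- cp / (s1$n - 1)
```

Centering before taking cross-products is a design choice: the uncentered form (accumulate `crossprod(chunk)` and correct with the means afterwards) needs only one pass but is more prone to cancellation error when means are large relative to the spread.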
Once you have the covariance matrix, call princomp with covmat = your_covmat, and princomp will skip computing the covariance matrix itself.
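As a small illustration of that call, the sketch below uses `cov(USArrests)` as a stand-in for a covariance matrix accumulated chunk by chunk; the built-in dataset is only an assumption made to keep the example self-contained.

```r
# Stand-in for the covariance matrix built up from the file chunks.
your_covmat <- cov(USArrests)

# Only the covariance matrix is passed; no raw data is needed.
pca <- princomp(covmat = your_covmat)

pca$sdev      # standard deviations of the principal components
pca$loadings  # eigenvectors of the covariance matrix

# No data was supplied, so no scores are stored on the result.
is.null(pca$scores)
```

If scores are needed, they would take one more pass over the file, multiplying each centered chunk by the loadings.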
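This way, the datasets you can process can be much larger than your available RAM. During the iterations, memory use is roughly that of one chunk (for example, 1000 rows); afterwards, it is limited to the covariance matrix (nvar * nvar doubles).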
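A rough back-of-the-envelope estimate of those two footprints, assuming 8 bytes per double and a hypothetical `nvar` of 2000 variables (the raw text lines and parsing overhead of each chunk are not counted):

```r
nvar <- 2000L          # hypothetical number of variables
chunk_rows <- 1000L    # rows held in memory per iteration
bytes_per_double <- 8

# One parsed chunk: chunk_rows x nvar doubles.
chunk_mb <- chunk_rows * nvar * bytes_per_double / 1024^2

# The accumulated covariance matrix: nvar x nvar doubles.
covmat_mb <- nvar^2 * bytes_per_double / 1024^2

round(c(chunk = chunk_mb, covmat = covmat_mb), 1)
```

With these numbers the chunk takes about 15 MB and the covariance matrix about 30 MB, regardless of how many rows the file has.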