Sqldf and maintainability of the R code base

Question

If you are building the core code base for an entire organization in R, is it acceptable to use the sqldf package as the standard approach for data-munging tasks? Or is it better to rely on native R syntax wherever possible? By building on sqldf, one introduces a substantial amount of a second syntax, SQL, into the R code base.

I ask this question with particular regard to maintainability and style. I searched the existing R style guides and found nothing on the subject.

EDIT: To clarify the workflow I'm concerned about, consider a data processing script that uses sqldf as follows:

library(sqldf)

# Total trips per cluster
gclust_group <- sqldf("SELECT clust, SUM(trips) AS trips2
                       FROM gclust
                       GROUP BY clust")

# Attach cluster centers and trip totals to each row of highestd
gclust_group2 <- sqldf("SELECT g.*, h.Longitude, h.Latitude, h.withinss, s.trips2
                        FROM highestd g
                        LEFT JOIN centers h
                        ON g.clust = h.clust
                        LEFT JOIN gclust_group s
                        ON g.clust = s.clust")

script . ( , Hadoop PIG, PIG script). SQL, .
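
For comparison, here is a rough sketch of how the same two steps might look in base R, using aggregate() and merge(). The three small data frames are made-up stand-ins for gclust, highestd, and centers (the "station" column and all values are assumptions, since the real data is not shown). Note that merge() sorts by the key and moves it to the first column, so row and column order can differ from the sqldf result.

```r
# Hypothetical toy data standing in for the real tables (assumption).
gclust   <- data.frame(clust = c(1, 1, 2, 3), trips = c(10, 5, 7, 2))
highestd <- data.frame(clust = c(1, 2, 4), station = c("a", "b", "c"))
centers  <- data.frame(clust     = c(1, 2, 3),
                       Longitude = c(-73.9, -74.0, -73.8),
                       Latitude  = c(40.7, 40.6, 40.8),
                       withinss  = c(0.5, 0.7, 0.2))

# Step 1: SELECT clust, SUM(trips) AS trips2 ... GROUP BY clust
gclust_group <- aggregate(trips ~ clust, data = gclust, FUN = sum)
names(gclust_group)[names(gclust_group) == "trips"] <- "trips2"

# Step 2: two LEFT JOINs on clust; all.x = TRUE keeps every row of the left table
center_cols   <- centers[, c("clust", "Longitude", "Latitude", "withinss")]
gclust_group2 <- merge(highestd, center_cols, by = "clust", all.x = TRUE)
gclust_group2 <- merge(gclust_group2, gclust_group, by = "clust", all.x = TRUE)

print(gclust_group2)
```

Whether this reads better than the SQL is exactly the style question being asked: the base R version avoids a second language, but the join logic is spread across several calls rather than stated in one query.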

r sqldf data-munging

Ben Rollert 08 . '14 4:39

Spacedman · Accepted Answer · 2014-02-08T12:26:25+0000

. , . . .

, sqldf dplyr, R-, Rcpp .

- sqldf dplyr, , , . , , 100 , dplyr? , .

sqldf dplyr ( RCS, ?) , .

, , R- , .
