I have several files to load, and I want to combine them into one data frame. I am currently using textConnection, but it is very slow. This is what my data looks like when I load it into R:
"1995200008,10,1995,5190.61,73300"
"1995200010,1,1995,6776.44,42652"
"1995200011,11,1995,2315.83,4169"
"1995200014,6,1995,9846.79,2113"
"1995200017,8,1995,3978.93,2449"
"1995200018,6,1995,3582.69,2449"
"1995200022,7,1995,10409.18,2859"
I cannot use read.csv directly on the files because the data is retrieved from Hadoop through a library. The double quotes are part of the data.
Here is the code I'm using:
tmp <- hdfs.read.text.file(filename)
tmp1 <- read.table(textConnection(tmp), sep = ",")
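For anyone who wants to reproduce the slow step without a Hadoop cluster, here is a minimal sketch. It assumes that hdfs.read.text.file returns a character vector with one comma-separated record per element, and it uses a small hand-written sample in place of the real files. The sample also assumes the records are bare strings; if the double quotes are literally embedded in each string, each element would need escaped quotes (\") around the record. The timing call is only there so that alternatives can be compared on the same input.

```r
# Hypothetical stand-in for hdfs.read.text.file(filename):
# a character vector with one comma-separated record per element.
tmp <- c(
  "1995200008,10,1995,5190.61,73300",
  "1995200010,1,1995,6776.44,42652",
  "1995200011,11,1995,2315.83,4169"
)

# Same parsing approach as in the question, with the connection
# closed explicitly so it does not linger after the call.
con <- textConnection(tmp)
timing <- system.time(
  tmp1 <- read.table(con, sep = ",")
)
close(con)

print(timing)  # elapsed time for the parse; grows with the input size
str(tmp1)      # five columns named V1 to V5
```

On a sample this small the call returns almost instantly; the slowdown only shows up when tmp holds a large number of lines.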
Does anyone know a method that will work faster?