How to check that a data.table key works correctly, and why might it not?

I am not sure whether this is a bug or my mistake: the data.table key does not work for a table that I read from a file with UTF-8 encoding ( link ).

library(data.table)
names <- data.table(name = unique(read.table(file = "boys_ru.txt", header = FALSE, sep = "\n", quote = "", stringsAsFactors = FALSE)$V1), sex = 1)
setkey(names, name)

data.table doesn't seem to recognize the key properly. names[""] returns nothing, while names[name == ""] works fine:

> names[name == ""]
     name sex
1:    1

If I create a table myself, everything works fine too:

dt1 <- data.table(name = rep("", 5), sex = rep(1, 5))
setkey(dt1, name)

I do not know what to do, because this prevents me from joining this table with another table of 10M rows on the name field. Interestingly, merge.data.frame works as expected with the names table (but it is too slow). sessionInfo():

R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C         LC_TIME=C            LC_COLLATE=C         LC_MONETARY=C        LC_MESSAGES=C       
 [7] LC_PAPER=C           LC_NAME=C            LC_ADDRESS=C         LC_TELEPHONE=C       LC_MEASUREMENT=C     LC_IDENTIFICATION=C 
1 answer

The problem is solved by passing the encoding explicitly: read.table(..., encoding = "UTF-8"). With that, the data.table key works. Thanks to @Arun.
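A minimal sketch of why the encoding argument can matter, using base R only. It assumes (this is my reading, not something confirmed above) that the keyed lookup was sensitive to the declared encoding mark of each string, so a value read without a mark and the same value typed in a UTF-8 session did not match even though their bytes were identical. The Cyrillic value and the helper mark_utf8 are hypothetical and for illustration only.

```r
# Sketch only: base R, no packages. The Cyrillic value is illustrative and
# is not taken from boys_ru.txt.
utf8_name <- "\u0410\u0440\u0442\u0451\u043c"   # the \u escapes mark it as "UTF-8"

# Simulate a string read without encoding = "UTF-8": same bytes, no mark.
unmarked <- utf8_name
Encoding(unmarked) <- "unknown"

# Hypothetical helper: declare that the bytes are UTF-8 without re-encoding
# them, which is how the encoding argument of read.table treats its input.
mark_utf8 <- function(x) {
  Encoding(x) <- "UTF-8"
  x
}

# Diagnostics: the bytes agree, only the declared encoding differs.
same_bytes <- identical(charToRaw(unmarked), charToRaw(utf8_name))
marks <- c(before = Encoding(unmarked), after = Encoding(mark_utf8(unmarked)))
print(same_bytes)
print(marks)
```

On a table that has already been read, something like table(Encoding(names$name)) should show whether the key column carries "unknown" marks. If it does, re-marking the column and calling setkey(names, name) again is one thing to try, as an alternative to re-reading the file with encoding = "UTF-8".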
