Another approach, from start to finish.
Make reproducible data:
dat <- read.table(header = TRUE, text = "SNP Geno Allele
marker1 G1 AA
marker2 G1 TT
marker3 G1 TT
marker1 G2 CC
marker2 G2 AA
marker3 G2 TT
marker1 G3 GG
marker2 G3 AA
marker3 G3 TT")
Updated: extract the Allele column, split each value into single characters, then turn those characters into two columns of a data frame.
Explicitly:
dat1 <- data.frame(t(matrix(
unlist(strsplit(as.vector(dat$Allele), split = "")),
ncol = length(dat$Allele), nrow = 2)))
Or, following @joran's suggestion:
dat1 <- data.frame(do.call(rbind, strsplit(as.vector(dat$Allele), split = "")))
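As a quick check that the two constructions produce the same matrix before it is wrapped in `data.frame`, here is a minimal sketch. The vector `alleles` and the names `parts`, `m1` and `m2` are illustrative only and not part of the original data; a heterozygous value ("CG") is included to show that character order is preserved.

```r
# Illustrative input, not the original data
alleles <- c("AA", "TT", "CG")

# strsplit() with split = "" returns a list with one character vector per element
parts <- strsplit(alleles, split = "")

# Explicit version: fill a 2-row matrix column by column, then transpose
m1 <- t(matrix(unlist(parts), nrow = 2, ncol = length(alleles)))

# do.call version: bind the list elements together as rows
m2 <- do.call(rbind, parts)

all(m1 == m2)  # TRUE
```

Both give one row per allele string and one column per character, so either can be passed to `data.frame()`.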
Then add names to the new columns:
names(dat1) <- c("Allele1", "Allele2")
Attach the two new columns to the SNP and Geno columns of the original data frame, as @user1317221 suggests:
dat3 <- cbind(dat$SNP, dat$Geno, dat1)
  dat$SNP dat$Geno Allele1 Allele2
1 marker1       G1       A       A
2 marker2       G1       T       T
3 marker3       G1       T       T
4 marker1       G2       C       C
5 marker2       G2       A       A
6 marker3       G2       T       T
7 marker1       G3       G       G
8 marker2       G3       A       A
9 marker3       G3       T       T
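If the `dat$SNP` and `dat$Geno` headers are unwanted, one possible variation (a sketch, not the method shown above) is to subset the original data frame by column name and build the allele columns with `substr`. It assumes every Allele value is exactly two characters long; the three-row `dat` here is a shortened stand-in for the data above.

```r
# Shortened stand-in for the example data
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "SNP Geno Allele
marker1 G1 AA
marker2 G1 TT
marker1 G2 CC")

# Subsetting with dat[c(...)] keeps the original column names;
# substr() takes the first and second character of each allele string
dat3 <- cbind(dat[c("SNP", "Geno")],
              Allele1 = substr(dat$Allele, 1, 1),
              Allele2 = substr(dat$Allele, 2, 2))

names(dat3)
```

The resulting columns are named SNP, Geno, Allele1 and Allele2, so no renaming step is needed afterwards.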