Let's say I parsed a website using the code below:
library(XML)
url.df_1 = htmlTreeParse("http://www.appannie.com/app/android/com.king.candycrushsaga/", useInternalNodes = T)
If I run the code below,
xpathSApply(url.df_1, "//div[@class='app_content_section']/h3", function(x) c(xmlValue(x), xmlAttrs(x)[["href"]]))
I get the following:
[1] "Description" "What new"
[3] "Permissions" "More Apps by King.com All Apps »"
[5] "Customers Also Viewed" "Customers Also Installed"
Now I'm only interested in the "Customers Also Installed" part. But when I run the code below,
xpathSApply(url.df_1, "//div[@class='app_content_section']/ul/li/a", function(x) c(xmlValue(x), xmlAttrs(x)[["href"]]))
it returns all the apps listed under "More Apps by King.com All Apps »", "Customers Also Viewed" and "Customers Also Installed".
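To make the problem reproducible without hitting the site, here is a minimal sketch. The HTML string is a made-up, simplified stand-in for the page (the exact markup, the `example.viewed.one` entry and the `html`, `doc` and `all_links` names are my assumptions, not the real page), but it shows the same behaviour: the query matches the links in every section.

```r
library(XML)

# Hypothetical, simplified stand-in for the page; the real markup may differ.
html <- '
<html><body>
  <div class="app_content_section">
    <h3>Customers Also Viewed</h3>
    <ul>
      <li><a href="/app/android/example.viewed.one/">Viewed App One</a></li>
    </ul>
  </div>
  <div class="app_content_section">
    <h3>Customers Also Installed</h3>
    <ul>
      <li><a href="/app/android/xmas.candy.free/">Christmas Candy Free</a></li>
      <li><a href="/app/android/com.terrypaton.unity.pogz2/">Pogz 2</a></li>
    </ul>
  </div>
</body></html>'

# Parse the string instead of a URL (asText = TRUE).
doc <- htmlParse(html, asText = TRUE)

# Same query as above: one column per <a>, taken from both sections.
all_links <- xpathSApply(
  doc,
  "//div[@class='app_content_section']/ul/li/a",
  function(x) c(xmlValue(x), xmlAttrs(x)[["href"]])
)
print(all_links)
```

With this stand-in, the result has one column per link across both sections, whereas I only want the columns that belong to "Customers Also Installed".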
So I tried
xpathSApply(url.df_1, "//div[h3='Customers Also Installed']", function(x) c(xmlValue(x), xmlAttrs(x)[["href"]]))
but it didn't work. Then I tried
xpathSApply(url.df_1, "//div[contains(.,'Customers Also Installed')]",xmlValue)
but that doesn't work either. The output should look something like this:
[,1]
[1,] "Christmas Candy Free\n Daniel Development\n "
[2,] "/app/android/xmas.candy.free/"
[,2]
[1,] "Jewel Candy Maker\n Nutty Apps\n "
[2,] "/app/android/com.candy.maker.jewel.nuttyapps/"
[,3]
[1,] "Pogz 2\n Terry Paton\n "
[2,] "/app/android/com.terrypaton.unity.pogz2/"
Any guidance would be greatly appreciated!