I am trying to do fuzzy matching in R, where several data fields can contribute to a match.
For instance:
try_to_match <- c('seoul korea', 'bisbane', 'korea', 'australia brisbane')
locations <- data.frame(name=c('seoul', 'brisbane'),
                        country=c('south korea', 'australia'))
I want to match the user-entered locations in try_to_match against the data frame locations.
There are similar questions about fuzzy matching in R on SO, and most answers suggest agrep. However, I cannot find anything that covers fuzzy matching when there are several words to match.
For example, if I only match against locations$name, "bisbane" matches "brisbane" as I expect. However, I get no matches for the searches that include a country, because locations$name does not have a country in it.
sapply(try_to_match, agrep, locations$name, value=TRUE)
# $`seoul korea`
# character(0)
# $bisbane
# [1] "brisbane"
# $korea
# character(0)
# $`australia brisbane`
# character(0)
So I figured I should include the country in the string I match against:
sapply(try_to_match, agrep, paste(locations$name, locations$country), value=TRUE)
# $`seoul korea`
# character(0)
# $bisbane
# [1] "brisbane australia"
# $korea
# [1] "seoul south korea"
# $`australia brisbane`
# character(0)
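However, "seoul korea" does not match "seoul south korea", apparently because of the extra word in between. Likewise, "australia brisbane" does not match "brisbane australia", because the words are in a different order. ("korea" does match "seoul south korea", which is what I want.)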
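To see how far off the queries that return character(0) are, the edit distances can be inspected directly with adist from base R. This is only a diagnostic sketch on the same data; the names targets and d are made up for illustration, and the threshold line reflects my reading of the agrep documentation (default max.distance = 0.1, with unit costs).

```r
try_to_match <- c('seoul korea', 'bisbane', 'korea', 'australia brisbane')
targets <- c('seoul south korea', 'brisbane australia')

# partial = TRUE measures the distance from each query to the
# best-matching substring of each target, which is the kind of
# distance agrep uses by default
d <- adist(try_to_match, targets, partial = TRUE)
dimnames(d) <- list(try_to_match, targets)
print(d)

# Assumed default agrep threshold: 10% of the pattern length, rounded up
threshold <- ceiling(0.1 * nchar(try_to_match))
names(threshold) <- try_to_match
print(threshold)
```

A query matches a target only when its entry in d is at or below the threshold for that query, so multi-word queries with an inserted or reordered word fall outside it quickly.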
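So, my question is: how can I do fuzzy matching when there are several words to match, possibly in a different order?
Any suggestions?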
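One direction that might work is to match word by word instead of matching whole strings. The sketch below splits each query and each target into words, checks every query word against the target's words with agrepl, and scores a target by the fraction of query words that found a match. The helper word_score and the variables targets, scores and best are hypothetical names, not part of any package, and the scoring rule is an assumption rather than an established method.

```r
try_to_match <- c('seoul korea', 'bisbane', 'korea', 'australia brisbane')
locations <- data.frame(name = c('seoul', 'brisbane'),
                        country = c('south korea', 'australia'),
                        stringsAsFactors = FALSE)

# Hypothetical helper: fraction of the words in `query` that fuzzily
# match at least one word in `target` (word order is ignored)
word_score <- function(query, target, max.distance = 0.1) {
  q_words <- strsplit(query, "\\s+")[[1]]
  t_words <- strsplit(target, "\\s+")[[1]]
  hits <- vapply(q_words, function(w) {
    any(agrepl(w, t_words, max.distance = max.distance))
  }, logical(1))
  mean(hits)
}

targets <- paste(locations$name, locations$country)

# Rows are targets, columns are queries
scores <- sapply(try_to_match, function(q) {
  sapply(targets, word_score, query = q)
})
print(scores)

# Pick the highest-scoring target for each query
best <- targets[apply(scores, 2, which.max)]
print(best)
```

Because agrepl also does partial matching, very short query words would match almost anything, so a real version would probably need a minimum word length or a tie-breaking rule. which.max also returns the first target when all scores are zero, so a cutoff on the score would be needed to report "no match".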
(I am aware of geonames; I am looking for a way to do this in R.)