Fuzzy match with multiple words

I am trying to do fuzzy matching in R, where several data fields can contribute to a match.

For instance:

try_to_match <- c('seoul korea', 'bisbane', 'korea', 'australia brisbane')
locations <- data.frame(name=c('seoul', 'brisbane'),
                        country=c('south korea', 'australia'))

I want to match the user-entered locations in try_to_match against the data frame locations.

There are similar questions about fuzzy matching in R on SO, and most of them recommend agrep. However, I cannot find any that cover fuzzy matching when there are multiple words to match.

For example, if I only match against locations$name, I get a match for "bisbane" on "brisbane", as I expect. However, I do not get matches for the searches that include a country, because locations$name does not have a country in it.

sapply(try_to_match, agrep, locations$name, value=TRUE)
# $`seoul korea`
# character(0)    
# $bisbane
# [1] "brisbane"    
# $korea
# character(0)
# $`australia brisbane`
# character(0)

So I assume I should include the country in the match:

sapply(try_to_match, agrep, paste(locations$name, locations$country), value=TRUE)
# $`seoul korea`
# character(0)    
# $bisbane
# [1] "brisbane australia"    
# $korea
# [1] "seoul south korea"    
# $`australia brisbane`
# character(0)
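To put numbers on the results above, the edit distances can be inspected directly with base R's adist. This is only a rough sketch: the variable names (candidates, d_full, d_partial, allowed) are made up for illustration, and the allowed-edits line assumes the rounding rule described in ?agrep for a fractional max.distance.

```r
try_to_match <- c("seoul korea", "bisbane", "korea", "australia brisbane")
candidates   <- c("seoul south korea", "brisbane australia")

# Full-string generalized Levenshtein distance
# (rows: queries, columns: candidates).
d_full <- adist(try_to_match, candidates)
dimnames(d_full) <- list(try_to_match, candidates)

# Distance to the best-matching substring of each candidate; per ?adist
# this corresponds to the distance agrep() uses by default.
d_partial <- adist(try_to_match, candidates, partial = TRUE)
dimnames(d_partial) <- list(try_to_match, candidates)

# Assumption: the default max.distance = 0.1 is a fraction of the pattern
# length, rounded up to a whole number of edits.
allowed <- ceiling(0.1 * nchar(try_to_match))

print(d_full)
print(d_partial)
print(allowed)
```

Comparing d_partial with allowed row by row should show which queries fall within the default tolerance and which do not.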

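However, "seoul korea" does not match "seoul south korea", because of the extra word "south" in between. Likewise, "australia brisbane" does not match "brisbane australia", because the words are in a different order (which is understandable). (On the other hand, "korea" on its own does match "seoul south korea", so single words are fine.)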
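One direction that sidesteps both the extra word and the word order is to compare word by word instead of whole strings. The following is a minimal sketch, not a tested solution: word_score and best_match are hypothetical helper names, the scoring rule (fraction of query words that agrep finds in some candidate word) is an assumption, and agrep's default max.distance is kept.

```r
try_to_match <- c("seoul korea", "bisbane", "korea", "australia brisbane")
locations <- data.frame(name    = c("seoul", "brisbane"),
                        country = c("south korea", "australia"),
                        stringsAsFactors = FALSE)
candidates <- paste(locations$name, locations$country)

# Hypothetical helper: fraction of the query's words that fuzzily match
# at least one word of the candidate, regardless of word order.
word_score <- function(query, candidate) {
  q_words <- strsplit(query, "\\s+")[[1]]
  c_words <- strsplit(candidate, "\\s+")[[1]]
  hits <- vapply(q_words,
                 function(w) length(agrep(w, c_words)) > 0,
                 logical(1))
  mean(hits)
}

# Hypothetical helper: best-scoring candidate, or NA if no word matches.
best_match <- function(query, candidates) {
  scores <- vapply(candidates,
                   function(cand) word_score(query, cand),
                   numeric(1))
  if (max(scores) == 0) NA_character_ else candidates[which.max(scores)]
}

result <- vapply(try_to_match, best_match, character(1),
                 candidates = candidates)
print(result)
```

Because agrep does approximate substring matching, very short query words could match many candidate words, so a real version would probably need a minimum word length or a stricter distance.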

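So my question is: how can I do a fuzzy match when there are multiple words to match, which may be incomplete or in a different order?

Any ideas?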

(I know of services such as geonames, but I would like to do this in R.)
