How to use Ruby regex to capture non-English words?

I am trying to match "words" with Ruby 1.8.7.

My current regular expression for matching a word is:

/[a-zA-Z]\'*\-*/

It only matches English words. Is there a way to match non-English UTF-8 characters as well?

1 answer

Even the Ruby 1.8.x regex engine supports UTF-8. You just need the right expression, which takes a little more than a plain /\w/; note the /u flag on both patterns:

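s = "résumé and some other words"
puts s[/[a-z]+/u]  # ASCII-only class: stops at the first accented letter
puts s[/\w+/u]     # with /u on 1.8.x, \w also matches multibyte UTF-8 characters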

This prints:

r
résumé
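
For anyone reading this on Ruby 1.9 or later rather than 1.8.7: there, `\w` matches only ASCII word characters, so the `/\w+/u` approach no longer picks up accented letters. Here is a minimal sketch, assuming Ruby 1.9+, that uses the Unicode letter property `\p{L}` and also allows the apostrophes and hyphens the question's pattern was aiming for. The names `word` and `words` are just for illustration.

```ruby
# encoding: utf-8
# Sketch for Ruby 1.9+ only (assumption: not applicable to 1.8.7,
# whose regex engine does not support \p{...} properties).
s = "résumé and some other words"

# One or more Unicode letters, optionally joined by a single
# apostrophe or hyphen (e.g. "don't", "well-known").
word = /\p{L}+(?:['-]\p{L}+)*/

# scan returns every non-overlapping match as an array of strings.
words = s.scan(word)
puts words.inspect
```

Because the inner group is non-capturing, `scan` returns whole matches rather than group contents, so `words` holds the five words of the sample string, with "résumé" intact.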