PHP find emoji [update existing code]

I am trying to detect emoji in my PHP code and prevent users from entering them.

The code I have is:

if(preg_match('/\xEE[\x80-\xBF][\x80-\xBF]|\xEF[\x81-\x83][\x80-\xBF]/', $value) > 0)
{
    //warning...
}

But it does not work for all emoji. Any ideas?

+5
4 answers
if(preg_match('/\xEE[\x80-\xBF][\x80-\xBF]|\xEF[\x81-\x83][\x80-\xBF]/', $value) > 0)

You really want to match Unicode at the character level rather than trying to keep track of UTF-8 byte sequences. Use the u modifier to have your UTF-8 string processed character by character.
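To illustrate the difference the u modifier makes, here is a minimal sketch. It assumes PHP 7.0 or later (for the "\u{...}" string escape), and the sample character is chosen only for illustration:

```php
<?php
// U+1F300 (CYCLONE) is encoded as four bytes in UTF-8.
$char = "\u{1F300}";

// Without /u the subject is matched byte by byte, so "." is a single byte
// and cannot span the whole four-byte string.
$asBytes = preg_match('/^.$/', $char);

// With /u the subject is treated as UTF-8, so "." is one whole character.
$asChars = preg_match('/^.$/u', $char);

// \x{...} escapes for code points above FF are only accepted in /u mode.
$inBlock = preg_match('/[\x{1F300}-\x{1F5FF}]/u', $char);

var_dump($asBytes, $asChars, $inBlock);
```

The first call does not match, while the two calls with the u modifier do.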

Emoji are encoded in the block U+1F300 to U+1F5FF. However:

  • many characters from the Japanese carriers' emoji sets are actually mapped to existing Unicode symbols, such as card suits, zodiac signs, and some arrows. Do you count these symbols as emoji now?

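  • there are still systems that do not use the standardized Unicode code points for emoji and rely on ranges in the Private Use Area instead. iOS 4, for example, uses the Softbank set.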

For example, to match both the emoji block and the Private Use Area:

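function unichr($i) {
    // Encode code point $i as a UTF-8 string via little-endian UCS-4.
    return iconv('UCS-4LE', 'UTF-8', pack('V', $i));
}

// The u modifier makes PCRE treat pattern and subject as UTF-8 characters.
if (preg_match('/['.
    unichr(0x1F300).'-'.unichr(0x1F5FF).
    unichr(0xE000).'-'.unichr(0xF8FF).
']/u', $value)) {
    ...
}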
+10

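From Wikipedia:

The core emoji set as of Unicode 6.0 consists of 722 characters, of which 114 map to sequences of one or more characters in the pre-6.0 Unicode standard, and the remaining 608 map to sequences of one or more characters introduced in Unicode 6.0. [4] There is no block specifically set aside for emoji: the new symbols were encoded in several different blocks (some newly created), and there is a Unicode data file called EmojiSources.txt that includes mappings to and from the Japanese vendors' legacy character sets.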

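The mapping file has 722 entries, one for each of the 722 emoji.

Because the characters are spread over several blocks, there is no single range that matches every emoji.

Instead, each code point has to appear in the pattern, like this:

\x{1F30F}

1F30F is the Unicode code point of a globe symbol.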
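The idea of listing individual code points can be sketched as follows. The $codePoints array is a tiny hypothetical sample, not the real 722-entry list (which would have to be extracted from EmojiSources.txt), and buildEmojiPattern is a name made up for this illustration. It assumes PHP 7.0 or later and covers single code points only, not multi-character sequences:

```php
<?php
// Build a character class such as /[\x{1F30F}\x{2660}]/u from code points.
function buildEmojiPattern(array $codePoints): string
{
    $escaped = array_map(
        function (int $cp): string {
            // Single quotes keep the backslash literal: \x{1F30F}
            return sprintf('\x{%X}', $cp);
        },
        $codePoints
    );
    return '/[' . implode('', $escaped) . ']/u';
}

// Hypothetical sample: U+1F30F (the code point shown above) and U+2660 (spade suit).
$codePoints = [0x1F30F, 0x2660];
$pattern = buildEmojiPattern($codePoints);

$hasEmoji = preg_match($pattern, "hello \u{1F30F}") === 1;
$plain    = preg_match($pattern, 'hello world') === 1;
var_dump($pattern, $hasEmoji, $plain);
```

Generating the pattern from a data list keeps the list itself as the only thing to maintain when new characters are added.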

+2

The right approach is to detect code points that are in the Miscellaneous_Symbols_And_Pictographs block. In Perl you would write

/\p{Assigned}/ && /\p{block=Miscellaneous_Symbols_And_Pictographs}/

or

/\P{Cn}/ && /\p{Miscellaneous_Symbols_And_Pictographs}/

or

/(?=\p{Assigned})\p{Miscellaneous_Symbols_And_Pictographs}/

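Unfortunately PCRE, which PHP uses, has only limited support for Unicode properties. It does not support block properties; only general categories and Unicode script properties are available.

So in PHP the block has to be spelled out as a range of Unicode code points: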

/(?=\P{Cn})[\x{1F300}-\x{1F5FF}]/

That looks like a maintenance nightmare full of magic numbers.
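One way to tame the magic numbers is to give each range a name. This is only a sketch, assuming PHP 7.1 or later: the constant and function names are invented here, and the two ranges are simply the ones mentioned in the answers above (the emoji block and the Private Use Area), not a complete definition of emoji:

```php
<?php
// Ranges taken from the answers above; the names are illustrative only.
const EMOJI_RANGES = [
    'Miscellaneous Symbols and Pictographs' => [0x1F300, 0x1F5FF],
    'Private Use Area'                      => [0xE000, 0xF8FF],
];

// Turn the named ranges into one pattern.
// The (?=\P{Cn}) lookahead skips unassigned code points inside the ranges.
function emojiPattern(array $ranges): string
{
    $class = '';
    foreach ($ranges as [$first, $last]) {
        $class .= sprintf('\x{%X}-\x{%X}', $first, $last);
    }
    return '/(?=\P{Cn})[' . $class . ']/u';
}

// Returns true when $value contains a character from any listed range.
function containsEmoji(string $value): bool
{
    return preg_match(emojiPattern(EMOJI_RANGES), $value) === 1;
}

var_dump(containsEmoji("hi \u{1F300}"), containsEmoji('plain text'));
```

Adding another block then means adding one labeled entry to the array rather than editing the regular expression by hand.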

+1

Here is what I came up with today. It is probably not a very good solution to this problem, but at least it works ;)

if(iconv('Windows-1250', 'UTF-8', iconv('UTF-8', 'Windows-1250', $value)) != $value)
-2
