Least used Unicode delimiter

I am trying to mark my text with a delimiter in certain places, to be used later for parsing. I want to use the delimiter character that occurs least often in real text. I'm looking at the character "\2", i.e. U+0002. Is it safe enough to use? What other suggestions are there? The text is Unicode and will contain both English and non-English characters.

I want to use a character that PHP can still split on with explode().

Edit:

I also want to be able to display this piece of text on screen (in the browser), with the separator invisible to the user. I can certainly use str_replace() to strip visible separators, but if there is a good invisible separator, that processing would not be needed.

1 answer

If it is only for internal representation (i.e. not for interchange or storage), you can use a noncharacter code point such as U+FFFF. Java, for example, uses it as the sentinel that a CharacterIterator returns when iteration is done (CharacterIterator.DONE).
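
For the PHP side, here is a minimal sketch of how that could look. It assumes PHP 7 or later (for the `\u{...}` escape) and UTF-8 encoded text; the sample segments and variable names are made up for illustration. Since browsers are not guaranteed to render U+FFFF invisibly, the sketch still strips it with str_replace() before display.

```php
<?php
// Assumption: all text is UTF-8. U+FFFF is then the three bytes EF BF BF.
$delimiter = "\u{FFFF}"; // Unicode escape, available since PHP 7.0

// Hypothetical sample segments mixing English and non-English text.
$segments = ['Hello', 'Привет', '你好'];

// Mark the text: join the segments with the noncharacter delimiter.
$marked = implode($delimiter, $segments);

// explode() works on bytes, but in valid UTF-8 the byte sequence of one
// character never matches inside another, so the split is still exact.
$parts = explode($delimiter, $marked);

// Strip the delimiter before sending the text to the browser.
$display = str_replace($delimiter, '', $marked);

var_dump($parts === $segments);
echo $display, PHP_EOL;
```

This only holds as long as the input itself never contains U+FFFF, which is why it is suited to internal use and not to data you exchange with other systems.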
