Printing the number of characters in a UTF-8 string

For instance:

local a = "Lua"
local u = ""
print(a:len(), u:len())

output:

3   6

How can I print the number of characters in a UTF-8 string?

4 answers

If you need to work with Unicode / UTF-8 in Lua versions before 5.3, you need an external library, because Lua's built-in string functions operate on 8-bit bytes. One such library is slnunicode. Sample code for calculating the length of a string:

local unicode = require "unicode"
local utf8 = unicode.utf8

local a = "Lua"
local u = ""
print(utf8.len(a), utf8.len(u)) --> 3    3

In Lua 5.3, you can use utf8.len to get the length of a UTF-8 string:

local a = "Lua"
local u = ""
print(utf8.len(a), utf8.len(u))

Output: 3 3
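
To make the byte/character difference visible, here is a minimal sketch for Lua 5.3 or later. The sample string is my own assumption (three Cyrillic letters written with \u{...} escapes, two bytes each), not the string from the question. It also shows what utf8.len does with invalid input: it returns a false value plus the position of the first invalid byte.

```lua
-- Requires Lua 5.3+ (built-in utf8 library and \u{...} string escapes).
-- Assumed sample string: three Cyrillic letters, each encoded as two bytes.
local u = "\u{41B}\u{443}\u{430}"

print(#u)           -- byte count: 6
print(utf8.len(u))  -- character (code point) count: 3

-- On invalid UTF-8, utf8.len returns a false value plus the position
-- of the first invalid byte, so check the result before using it.
local n, bad_pos = utf8.len("Lua\xFF")
if not n then
  print("invalid byte at position " .. bad_pos)  -- position 4
end
```

Note that utf8.len counts code points, so a letter followed by a combining accent counts as two.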




Another option is to write your own wrappers for UTF-8 string functions and let the OS functions do the heavy lifting. It depends on which OS you use: I did it on OS X, and it works. Windows would be similar. Of course, this opens another can of worms if you just want to run the script from the command line; it depends on your application.
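
If all you need from a home-made UTF-8 function is the length, a pure-Lua version may be enough and avoids the OS dependency entirely. This is a different technique from the OS-function approach described above: it counts every byte that is not a UTF-8 continuation byte (0x80 to 0xBF). The helper name utf8_length and the sample string are assumptions for illustration, and the function assumes its input is already valid UTF-8.

```lua
-- Hypothetical helper: counts UTF-8 characters by counting the bytes that
-- are NOT continuation bytes (continuation bytes are 128-191 decimal).
-- It performs no validation, so the input must already be valid UTF-8.
local function utf8_length(s)
  -- gsub returns the number of matches as its second result
  local _, count = s:gsub("[^\128-\191]", "")
  return count
end

-- Assumed sample: three two-byte characters written as decimal byte escapes.
local u = "\208\155\209\131\208\176"
print(#u, utf8_length(u))          -- 6 bytes, 3 characters
print(utf8_length("Lua"))          -- 3
```

This should also run on Lua 5.1 and 5.2, since it only uses string.gsub and decimal escapes.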
