Check whether a file contains multibyte characters

I have subtitle files in UTF-8. Occasionally these files contain a few stray multibyte characters that cause problems in some applications.

How can I check on Linux whether a specific file contains any multibyte characters, and how can I find them?

2 answers

You can use the file command:

chalet16$ echo test > a.txt
chalet16$ echo testก > b.txt  # a Thai character
chalet16$ file *.txt
a.txt: ASCII text
b.txt: UTF-8 Unicode text
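
file only reports the overall encoding; it does not show where the characters are. To locate them, a minimal sketch along these lines may help. It assumes a grep that supports POSIX character classes, and find_non_ascii is a hypothetical helper name, not a standard command. Forcing the C locale makes grep work byte by byte, so every byte of a multibyte UTF-8 sequence falls outside the printable ASCII class. Note that it also flags unusual ASCII control characters, not only multibyte ones.

```shell
# find_non_ascii FILE  (hypothetical helper, not a standard tool)
# Prints "line_number:line" for every line that contains a byte which is
# neither printable ASCII nor whitespace. In the C locale, all bytes of a
# multibyte UTF-8 character (0x80 and above) fall into that category.
# Exit status is 0 if at least one such line was found, 1 otherwise.
find_non_ascii() {
    LC_ALL=C grep -n '[^[:print:][:space:]]' "$1"
}
```

Running find_non_ascii b.txt on the file above should list line 1, while find_non_ascii a.txt should print nothing and return a non-zero status.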

You can use the file command or chardet.
