Read a UNIX-encoded file with C#

I have a C# program that we use to replace some values with others that are later used as parameters. For example, "NAME1" is replaced by &1, "NAME2" by &2, and so on.

The problem is that the replacement data is in a text file with a UNIX encoding, and special characters such as í are read as a square (invalid char), even in memory. Because of official specifications that are beyond my control, the file cannot be changed, and I have no choice but to read it as it is.

I tried reading it with most of the 130 encodings that C# offers me:

EncodingInfo[] info = System.Text.Encoding.GetEncodings();
string text;
for (int a = 0; a < info.Length; ++a)
{
      text = File.ReadAllText(fn, info[a].GetEncoding());
      File.WriteAllText(fn + a, text, info[a].GetEncoding());
}

fn is the path of the file to read. I checked all the generated files (about 130 of them); none of them had the character written correctly, and I could not find anything on the Internet.

SOLUTION:

It seems this code finally reads the text correctly. I also had to use the same encoding for the writing part:

System.Text.Encoding encoding = System.Text.Encoding.GetEncodings()[41].GetEncoding();

String text = File.ReadAllText(fn, encoding); // get file text 

// DO ALL THE STUFF I HAD TO

File.WriteAllText(fn, text, encoding); // write back with the same encoding

/* ALL THESE ENCODINGS APPARENTLY WORKED FOR ME WITH ALL THE WEIRD CHARS I WAS ABLE TO WRITE :P
    System.Text.Encoding.GetEncodings()[115].GetEncoding();     //Latin 9 (ISO)
    System.Text.Encoding.GetEncodings()[108].GetEncoding();     //Baltic (ISO)
    System.Text.Encoding.GetEncodings()[107].GetEncoding();     //Latin 3 (ISO)
    System.Text.Encoding.GetEncodings()[106].GetEncoding();     //Central European (ISO)
    System.Text.Encoding.GetEncodings()[105].GetEncoding();     //Western European (ISO)
    System.Text.Encoding.GetEncodings()[49].GetEncoding();      //Vietnamese (Windows)
    System.Text.Encoding.GetEncodings()[45].GetEncoding();      //Turkish (Windows)
    System.Text.Encoding.GetEncodings()[41].GetEncoding();      //Central European (Windows)   <-- Used this one
    */
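
The indices into GetEncodings() depend on the order in which the runtime happens to list its encodings, so they may differ between machines or framework versions. A minimal sketch that asks for the encoding by code page number instead, assuming the file really is Windows-1250 (the "Central European (Windows)" entry used above). The fallback path "input.txt" and the single Replace call are placeholders, not part of the original program:

```csharp
using System;
using System.IO;
using System.Text;

class ReplaceValues
{
    static void Main(string[] args)
    {
        // Hypothetical path; the original post only calls it fn.
        string fn = args.Length > 0 ? args[0] : "input.txt";

        // On .NET Core / .NET 5+ the legacy Windows and ISO code pages are only
        // available after registering this provider. On .NET Framework the code
        // pages are built in and this line can be removed.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // 1250 = Central European (Windows).
        Encoding encoding = Encoding.GetEncoding(1250);

        string text = File.ReadAllText(fn, encoding);

        // Placeholder for the real replacements.
        text = text.Replace("NAME1", "&1");

        // Write back with the same encoding so the bytes for í etc. are unchanged.
        File.WriteAllText(fn, text, encoding);
    }
}
```

Requesting the code page by number (or by name, e.g. "windows-1250") keeps the choice stable even if the list returned by GetEncodings() changes.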

Many thanks for your help.

Noman (1)

1 answer

You need to find out the correct encoding first. Try

file -i on the file. This prints MIME type information for the file, which also includes the character set encoding. There is a man page for it as well.

Or try enca

It can guess and even convert between encodings. Just take a look at the manual page.
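
If those tools are not at hand, a rough check is also possible from C# by looking at the raw bytes. This is only a sketch, under the assumption that the candidates are UTF-8 and a single-byte code page: í is the two bytes C3 AD in UTF-8, but the single byte ED in ISO-8859-1, ISO-8859-15 and Windows-1250. The class and method names are made up for illustration:

```csharp
using System;
using System.IO;
using System.Text;

class SniffEncoding
{
    // True if the bytes decode as UTF-8 without any invalid sequence.
    // Note that a pure ASCII file also counts as valid UTF-8.
    static bool IsValidUtf8(byte[] bytes)
    {
        var strict = new UTF8Encoding(
            encoderShouldEmitUTF8Identifier: false,
            throwOnInvalidBytes: true);
        try
        {
            strict.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("usage: SniffEncoding <file>");
            return;
        }

        byte[] bytes = File.ReadAllBytes(args[0]);

        Console.WriteLine(IsValidUtf8(bytes)
            ? "valid UTF-8"
            : "not UTF-8, probably a single-byte code page");

        // Print the non-ASCII bytes so they can be compared against code page tables.
        foreach (byte b in bytes)
        {
            if (b >= 0x80)
            {
                Console.Write("{0:X2} ", b);
            }
        }
        Console.WriteLine();
    }
}
```

This cannot tell the single-byte code pages apart from each other; for that, the printed byte values still have to be compared with the characters the file is expected to contain.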
