Regular expression needed to convert lst file to csv

I have a file (rating.lst) downloaded from the IMDb interfaces. Its content is in the following format:

Distribution   Votes      Rating  Title
0000001222     297339     8.4     Reservoir Dogs (1992)
0000001223     64504      8.4     The Third Man (1949)
0000000115     48173      8.4     Jodaeiye Nader az Simin (2011)
0000001232     324564     8.4     The Prestige (2006)
0000001222     301527     8.4     The Green Mile (1999)

My goal is to convert this file to a CSV file (comma separated) with the following desired result (example for 1 line):

Distribution   Votes      Rating  Title
0000001222,    301527,    8.4,    The Green Mile (1999)

I use a text editor that supports regular expression search and replace. I am not sure what regular expression is needed to achieve the desired result above. Can someone please help me with this? Thanks in advance.

4 answers

Find: ^(.+?)\s+(.+?)\s+(.+?)\s+(.+?)$
Replace with: \1,\2,\3,"\4" (in Notepad++)

Distribution,Votes,Rating,"Title"
0000001222,297339,8.4,"Reservoir Dogs (1992)"
0000001223,64504,8.4,"The Third Man (1949)"
0000000115,48173,8.4,"Jodaeiye Nader az Simin (2011)"
0000001232,324564,8.4,"The Prestige (2006)"
0000001222,301527,8.4,"The Green Mile (1999)"
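For readers who want to try the same substitution outside an editor, here is a minimal Python sketch. It assumes Python's `re` module rather than Notepad++, so `re.MULTILINE` is needed to make `^` and `$` anchor at every line, and the `sample` string is only a stand-in for the contents of rating.lst.

```python
import re

# Sample rows copied from the question; a real run would read rating.lst instead.
sample = """Distribution   Votes      Rating  Title
0000001222     297339     8.4     Reservoir Dogs (1992)
0000001222     301527     8.4     The Green Mile (1999)
"""

# Same pattern as above: three lazily matched columns, then the rest of the line.
pattern = re.compile(r"^(.+?)\s+(.+?)\s+(.+?)\s+(.+?)$", re.MULTILINE)

# Quote the title so that commas inside it do not create extra columns.
converted = pattern.sub(r'\1,\2,\3,"\4"', sample)
print(converted)
```

The lazy `.+?` groups stop at the first run of whitespace, so only the last group can contain spaces, which is what lets the title survive intact.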

The title is wrapped in "" because a title can itself contain commas, for example Avatar, the Last Airbender.

  • Press F8 to open the Replace dialog
  • Enable the regular expression option
  • Find what: ^([[:digit:]]{10})[[:space:]]+([[:digit:]]+)[[:space:]]+([[:digit:]]{1,2}\.[[:digit:]])[[:space:]]+(.*)$
  • Replace with: \1,\2,\3,"\4"
  • Click "Replace All"
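The stricter pattern from these steps can also be checked in a few lines of Python. This is only a sketch under some assumptions: Python's `re` module does not support POSIX classes such as `[[:digit:]]`, so `\d` and `[ \t]` are substituted here, and `convert_line` is a hypothetical helper name, not something from the answer.

```python
import re

# \d and [ \t] stand in for [[:digit:]] and [[:space:]],
# which Python's re module does not support.
ROW = re.compile(r"^(\d{10})[ \t]+(\d+)[ \t]+(\d{1,2}\.\d)[ \t]+(.*)$")

def convert_line(line):
    """Return the CSV form of a data row, or None if the line is not a data row."""
    m = ROW.match(line)
    if m is None:
        return None
    dist, votes, rating, title = m.groups()
    return f'{dist},{votes},{rating},"{title}"'

print(convert_line("0000001223     64504      8.4     The Third Man (1949)"))
print(convert_line("Distribution   Votes      Rating  Title"))
```

Because the first group requires exactly ten digits, the header row does not match and is left for separate handling; that strictness is the trade-off compared with a pattern that accepts any non-space text.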

If your editor uses the regular expression syntax in which groups are written as \( and \), use:

Find: ^\([0-9]+\)[ \t]+\([0-9]+\)[ \t]+\([^ \t]+\)[ \t]+\(.*\)
Replace with: \1,\2,\3,"\4"

My bad: this is a C# program. I will leave it here as an alternative solution.

The IgnorePatternWhitespace option allows whitespace and # comments inside the pattern.

This produces data that can be written to a CSV file. Note: CSV files do not contain the extra whitespace shown in your example.

using System;
using System.Linq;
using System.Text.RegularExpressions;

string data =@"Distribution   Votes      Rating  Title
0000001222     297339     8.4     Reservoir Dogs (1992)
0000001223     64504      8.4     The Third Man (1949)
0000000115     48173      8.4     Jodaeiye Nader az Simin (2011)
0000001232     324564     8.4     The Prestige (2006)
0000001222     301527     8.4     The Green Mile (1999)
";

string pattern = @"
^                     # Always start at the Beginning of line
(                     # Grouping
   (?<Value>[^\s]+)     # Place all text into Value named capture
   (?:\s+)              # Match but don't capture 1 to many spaces
){3}                  # 3 groups of data
(?<Value>[^\n\r]+)    # Append final to value named capture group of the match
";

var result = Regex.Matches(data, pattern, RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace)
                  .OfType<Match>()
                  .Select (mt => string.Join(",", mt.Groups["Value"].Captures
                                                                    .OfType<Capture>()
                                                                    .Select (c => c.Value))
                                                                    );

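// Print one CSV row per line; passing the sequence itself would print only its type name.
Console.WriteLine(string.Join(Environment.NewLine, result));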

/* output
Distribution,Votes,Rating,Title
0000001222,297339,8.4,Reservoir Dogs (1992)
0000001223,64504,8.4,The Third Man (1949)
0000000115,48173,8.4,Jodaeiye Nader az Simin (2011)
0000001232,324564,8.4,The Prestige (2006)
0000001222,301527,8.4,The Green Mile (1999)
*/
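As a further alternative, the same conversion can be sketched in Python without a regular expression, by splitting each line on whitespace at most three times and letting the standard `csv` module decide when quoting is needed. The function name `lst_to_csv` and the last sample row (a made-up title containing a comma) are illustrative assumptions, not data from rating.lst.

```python
import csv
import io

def lst_to_csv(text):
    """Split each non-empty line into 4 columns and return CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # maxsplit=3 keeps the spaces inside the title intact
        writer.writerow(line.strip().split(None, 3))
    return out.getvalue()

sample = (
    "Distribution   Votes      Rating  Title\n"
    "0000001222     301527     8.4     The Green Mile (1999)\n"
    # Hypothetical row, added only to show how a comma in the title is handled.
    "0000000000     1          1.0     Title, With Comma (2000)\n"
)
print(lst_to_csv(sample))
```

With the default settings, `csv.writer` quotes a field only when it contains the delimiter, a quote character, or a line break, so ordinary titles stay unquoted while the comma-containing one is wrapped in double quotes.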