How to make UNIX diff ignore duplicate lines at different positions?

I have two CSV files of about 134 MB.

All I want to do is get the "diff" of the two files, except that the position of a line should not matter.

In other words, let's say I have:

abc,123
def,456

and

def,456
ghi,789

I do not want to be told about def,456. It is in a different position in the second file, but I want it to be considered not different.

Just doing diff file1 file2 > outputfile does not work. Which command should I use for this? I know this is trivial in PHP, but I'm running out of memory there. I would rather use UNIX command-line tools. diff may not even be the right tool for this.

+3
2 answers

Sort both files first, then diff the sorted copies:

sort file1 > sorted_1
sort file2 > sorted_2

diff sorted_1 sorted_2
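
As a rough sketch of how this plays out on the sample data from the question (the temporary directory and the LC_ALL=C setting are my own additions for illustration, not part of the answer above):

```shell
#!/bin/sh
# Recreate the two sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf 'abc,123\ndef,456\n' > "$workdir/file1"
printf 'def,456\nghi,789\n' > "$workdir/file2"

# Sort each file so identical lines end up at matching positions.
# LC_ALL=C forces plain byte ordering, so both files use the same collation.
LC_ALL=C sort "$workdir/file1" > "$workdir/sorted_1"
LC_ALL=C sort "$workdir/file2" > "$workdir/sorted_2"

# diff exits with status 1 when the files differ, which is expected here.
result=$(diff "$workdir/sorted_1" "$workdir/sorted_2" || true)
printf '%s\n' "$result"

# Remove the scratch directory.
rm -rf "$workdir"
```

With this data, only abc,123 and ghi,789 should be reported; def,456 is present in both sorted files and is not mentioned. On files of around 134 MB, GNU sort falls back to temporary files on disk when the input does not fit in memory, so it should avoid the memory problem described in the question.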
+2

, diff, . , , :

1
2
3

3
1
2

. , ( http://code.google.com/p/csvfix/ - ).

, , , .

0
