List differences in two files with awk

Say I have two files:

File1:

1|abc
2|cde
3|pkr

File2:

1|abc
2|cde
4|lkg

How can I list the true difference between the two files using awk? If the second file is a subset of the first file, I can do the following:

awk -F"|" 'NR==FNR{a[$1]=$2;next} !($1 in a)' file{1,2}

But that would only give me:

4|lkg

I would like to get the following output, since this is the true difference:

3|pkr
4|lkg

Difference Criteria:

  • Field 1 is present in file1, but not in file2.
  • Field 1 is present in file2, but not in file1.
  • Field 1 is present in both files, but field 2 has a different value.

A bit of background:

File1 and file2 are exports of tables from different databases. Each file has two fields separated by a pipe. Field 1 is always unique. Field 2 may be the same.



Using awk:

$ cat f1
a|1
b|2
c|1
$ cat f2
b|2
c|1
d|0
$ awk '{ h[$0] = ! h[$0] } END { for (k in h) if (h[k]) print k }' f1 f2
a|1
d|0
$
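
The one-liner above compares whole lines. A key-based variant that follows the three criteria from the question (key only in file1, key only in file2, key in both with a different field 2) could look like the sketch below. It assumes field 1 is unique within each file, as the question states, and that the first file is not empty (the NR==FNR idiom relies on that). The helper name keydiff is made up for illustration, and the output is piped through sort because awk does not guarantee the order of "for (k in a)".

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '1|abc' '2|cde' '3|pkr' > "$tmp/file1"
printf '%s\n' '1|abc' '2|cde' '4|lkg' > "$tmp/file2"

# keydiff is a hypothetical helper name, not a standard tool.
keydiff() {
  awk -F'|' '
    NR == FNR { a[$1] = $0; next }                # first file: remember each line by its key
    !($1 in a) { print; next }                    # key exists only in the second file
    (a[$1] "") != ($0 "") { print a[$1]; print }  # key in both, lines differ: print both versions
    { delete a[$1] }                              # key has been handled
    END { for (k in a) print a[k] }               # leftovers exist only in the first file
  ' "$1" "$2" | sort
}

keydiff "$tmp/file1" "$tmp/file2"
```

With the sample files from the question this prints 3|pkr and 4|lkg. When a key is present in both files with different values, both versions of the line are printed.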

Without awk, using comm:

$ comm -3 <(sort file1) <(sort file2)

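If a line, say a|1, appears a different number of times in file1 and file2, comm still reports a|1, because the extra copies in one file have no match in the other. If a|1 should count as common no matter how often it repeats, pass -u to sort: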

$ comm -3 <(sort -u file1) <(sort -u file2)
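
One detail worth knowing: comm -3 prints lines unique to the second file with a leading tab. The sketch below strips that tab with tr, which assumes the data itself contains no tab characters. It uses temporary sorted files instead of process substitution so that it also runs in a plain POSIX sh; the helper name symdiff and the sample data are made up for illustration.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'a|1' 'a|1' 'b|2' 'c|1' > "$tmp/f1"   # a|1 is duplicated on purpose
printf '%s\n' 'b|2' 'c|1' 'd|0' > "$tmp/f2"

# symdiff is a hypothetical helper name, not a standard tool.
symdiff() {
  # comm needs sorted input; LC_ALL=C keeps sort and comm on the same ordering.
  LC_ALL=C sort -u "$1" > "$tmp/left.sorted"
  LC_ALL=C sort -u "$2" > "$tmp/right.sorted"
  # -3 hides lines common to both files; tr removes the tab that comm
  # puts in front of lines unique to the second file.
  LC_ALL=C comm -3 "$tmp/left.sorted" "$tmp/right.sorted" | tr -d '\t'
}

symdiff "$tmp/f1" "$tmp/f2"
```

Because of sort -u, the duplicated a|1 is reported only once, next to d|0.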
diff file1 file2 | perl -lne 'if(/^[<>]/){s/^..//g;print}'

Below is a test:

> cat file1
a|1
b|2
c|1
> cat file2
b|2
c|1
d|0
> diff file1 file2 | perl -lne 'if(/^[<>]/){s/^..//g;print}'
a|1
d|0
> 
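
If perl is not available, the same filtering can be sketched with sed. diff marks lines found only in the first file with "< " and lines found only in the second with "> ", so it is enough to strip that two-character marker and print only the lines where it was present. Note that diff compares files in order, so this assumes both files list their common lines in the same sequence (sort them first otherwise). The helper name difflines is made up for illustration.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'a|1' 'b|2' 'c|1' > "$tmp/file1"
printf '%s\n' 'b|2' 'c|1' 'd|0' > "$tmp/file2"

# difflines is a hypothetical helper name, not a standard tool.
difflines() {
  # sed -n suppresses default output; the p flag prints only lines
  # where the leading "< " or "> " marker was actually removed.
  diff "$1" "$2" | sed -n 's/^[<>] //p'
}

difflines "$tmp/file1" "$tmp/file2"
```

For the test files above this prints a|1 and d|0, matching the perl version.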