Merge two files based on one column with awk
I am trying to merge two tab-delimited files of unequal lengths. I need to merge the files based on column 1 and put the values from the 3rd column of each file into a new file. If either file is missing an id (an uncommon value), that id should get a blank value in the new file.
file1:
id1  2199   082
id2  0909   20909
id3  8002   8030
id4  28080  80828

file2:
id1  988   00808
id2  808   80808
id4  8080  2525
id6  838   3800

merged file:
id1  082    00808
id2  20909  80808
id3  8030
id4  80828  2525
id6         3800
I went through many forums and posts, and so far I have this:
awk -F\t 'NR==FNR{a[$1]=$1; b[$1]=$1; next} {$2=a[$1]; $3=b[$1]}1'
But it does not yield the right result. Can anyone please suggest a fix? Thanks a lot!
$ awk -F'\t' 'NR==FNR{a[$1]=$3; next} {a[$1]; b[$1]=$3} END{for (id in a) print id,a[id],b[id]}' OFS='\t' file1 file2 | sort
id1  082    00808
id2  20909  80808
id3  8030
id4  80828  2525
id6         3800
How it works
This script uses two variables. For every line in file1, the associative array a has a key corresponding to the id and a value taken from the third field. For every id in file2, a also has a key (but not necessarily a value). For file2, the array b has a key for every id, with the corresponding value taken from the third column.
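To make the roles of the two arrays concrete, the sketch below recreates the sample files from the question in a temporary directory and prints what a and b hold for each id once both files have been read. The temporary directory, the shell variables dir and arrays, and the a=[...] b=[...] output format are assumptions made for illustration, and mktemp -d is assumed to be available on the system.

```shell
# Recreate the sample inputs from the question (tab-delimited).
# The temporary directory is an assumption, used to avoid clobbering real files.
dir=$(mktemp -d)
printf 'id1\t2199\t082\nid2\t0909\t20909\nid3\t8002\t8030\nid4\t28080\t80828\n' > "$dir/file1"
printf 'id1\t988\t00808\nid2\t808\t80808\nid4\t8080\t2525\nid6\t838\t3800\n' > "$dir/file2"

# Same logic as the answer, but the END block prints both arrays per id.
arrays=$(awk -F'\t' '
    NR==FNR { a[$1] = $3; next }      # file1: id -> third field
            { a[$1]; b[$1] = $3 }     # file2: make sure a has the key, store third field in b
    END     { for (id in a) printf "%s a=[%s] b=[%s]\n", id, a[id], b[id] }
' "$dir/file1" "$dir/file2" | sort)

printf '%s\n' "$arrays"
rm -r "$dir"
```

An id that appears only in file1 (id3) shows an empty b, and an id that appears only in file2 (id6) shows an empty a, which is what produces the blank columns in the merged file.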
-F'\t'
This sets the field separator on input to a tab. Note that \t must be quoted to protect it from the shell.
NR==FNR{a[$1]=$3; next}
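This sets associative array a for the first file. NR==FNR is true only while the first file is being read, and next skips the remaining commands for those lines.
a[$1]; b[$1]=$3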
This sets associative array b for the second file. It also makes sure that array a has a key for every id in file2.
END{for (id in a) print id,a[id],b[id]}
This prints out the results.
OFS='\t'
This sets the output field separator to a tab.
sort
The awk construct for (key in array) is not guaranteed to return the keys in any particular order. We use sort to put the output in ascending order by id.
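As a rough end-to-end check, the following sketch runs the command on the sample data from the question and compares the result with the expected merged file. The temporary directory and the shell variables dir, merged and expected are assumptions made for illustration, and mktemp -d is assumed to be available.

```shell
# Recreate the sample inputs from the question (tab-delimited).
dir=$(mktemp -d)
printf 'id1\t2199\t082\nid2\t0909\t20909\nid3\t8002\t8030\nid4\t28080\t80828\n' > "$dir/file1"
printf 'id1\t988\t00808\nid2\t808\t80808\nid4\t8080\t2525\nid6\t838\t3800\n' > "$dir/file2"

# Run the merge; sort fixes the order, since "for (id in a)" has no guaranteed order.
merged=$(awk -F'\t' 'NR==FNR{a[$1]=$3; next} {a[$1]; b[$1]=$3} END{for (id in a) print id,a[id],b[id]}' \
    OFS='\t' "$dir/file1" "$dir/file2" | sort)

# Expected merged file: missing values are left as empty tab-delimited fields.
expected=$(printf 'id1\t082\t00808\nid2\t20909\t80808\nid3\t8030\t\nid4\t80828\t2525\nid6\t\t3800\n')

if [ "$merged" = "$expected" ]; then
    echo "merged output matches"
else
    echo "merged output differs"
fi
rm -r "$dir"
```

Note that the blank entries are real empty fields: the id3 line ends with a trailing tab and the id6 line has two consecutive tabs, so every output line still has three tab-delimited columns.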