Joining entries based off of column using awk/join

I have two pipe-delimited files. A given column1+column2 combination may have a match in both files, or one file may have the entry while the other does not. Assume the match key I am going off of is $1"-"$2, using a pipe '|' as the FS.



file1



1111|AAA|foo|50
1111|BBB|foo|30
2222|BBB|foo|10


file2



1111|AAA|bar|10
1111|CCC|bar|20
3333|AAA|bar|40


The desired output for the first entry would be the following (I have this working):



1111|AAA|50|10


For the second entry of file1 (if there is no matching column1+column2 in the other file, the missing value is replaced with 0, and vice versa):



1111|BBB|30|0


And for a key (column1+column2) that exists in file2 but not in file1 (this is the expected output for entry 3 of file2):



3333|AAA|0|40


So the overall desired output lists ALL unique keys (column1+column2) appearing in either file, with the 3rd output column holding the value from column 4 of file1 (or 0 if the key doesn't exist in file1) and the 4th output column holding the value from column 4 of file2 (or 0 if the key doesn't exist in file2).
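
Putting those rules together, the complete desired output for the sample files above would presumably be (row order not important):

1111|AAA|50|10
1111|BBB|30|0
1111|CCC|0|20
2222|BBB|10|0
3333|AAA|0|40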



I have done a lot of research and tried many things, but values are not output when the column1+column2 pair exists in file2 but not in file1 when using the following:



join -t"|" -e0 -a1 -a2 -o 1.2,1.3,1.5,2.5 <(<file1 awk -F"|" '{print $1"-"$2"|"$0}' | sort -k1,1) <(<file2 awk -F"|" '{print $1"-"$2"|"$0}' | sort -k1,1)


The above gives me the expected output when there is a column1+column2 match in file1 but not in file2, appending a 0 for the missing match... How can I get this to work for ALL scenarios?



The above command uses process substitution to prepend a new key, column1-column2, as column 1 in both files, and then joins on that new key. -e0 adds a 0 if the key exists in file1 but not in file2. How can I get it to also cover the case where the new key (column1-column2) exists in file2 but NOT in file1?
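
For reference, a sketch of the keyed intermediate stream the process substitution produces (assuming the awk body is '{print $1"-"$2"|"$0}' as reconstructed above). The join key becomes field 1 and the original columns become fields 2-5, which is why the -o list references 1.2, 1.3, 1.5 and 2.5:

<file1 awk -F"|" '{print $1"-"$2"|"$0}' | sort -k1,1
1111-AAA|1111|AAA|foo|50
1111-BBB|1111|BBB|foo|30
2222-BBB|2222|BBB|foo|10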







asked Mar 6 at 22:30 by EDubman

  • It would be a lot easier to do this with sqlite.
    – Red Cricket
    Mar 7 at 6:33

3 Answers

With your approach you have to use join twice (or change your approach to do it with a single join invocation):



  • print the common lines and the unpairable lines from file1 with join -t'|' -e0 -a1 -o 1.2,1.3,1.5,2.5 <(<file1 awk -F'|' '{print $1"-"$2"|"$0}' | sort -t'|' -k1,1) <(<file2 awk -F'|' '{print $1"-"$2"|"$0}' | sort -t'|' -k1,1)

  • print the unpairable lines from file2 with join -t'|' -e0 -v2 -o 2.2,2.3,1.5,2.5 <(<file1 awk -F'|' '{print $1"-"$2"|"$0}' | sort -t'|' -k1,1) <(<file2 awk -F'|' '{print $1"-"$2"|"$0}' | sort -t'|' -k1,1) (a sketch combining the two invocations follows right after this list)
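
A minimal sketch combining the two invocations (the brace group and the final sort are my addition; the awk body is the reconstruction shown above):

{
join -t'|' -e0 -a1 -o 1.2,1.3,1.5,2.5 \
    <(<file1 awk -F'|' '{print $1"-"$2"|"$0}' | sort -t'|' -k1,1) \
    <(<file2 awk -F'|' '{print $1"-"$2"|"$0}' | sort -t'|' -k1,1)
join -t'|' -e0 -v2 -o 2.2,2.3,1.5,2.5 \
    <(<file1 awk -F'|' '{print $1"-"$2"|"$0}' | sort -t'|' -k1,1) \
    <(<file2 awk -F'|' '{print $1"-"$2"|"$0}' | sort -t'|' -k1,1)
} | sort -t'|' -k1,1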

You can do the same with a single awk invocation, storing $4 in two arrays indexed by e.g. $1"|"$2 and then, in the END block, iterating over the indices of each array, comparing them and printing accordingly:



awk -F'|' 'NR==FNR{z[$1"|"$2]=$4;next}{x[$1"|"$2]=$4}
END{for (j in x) if (!(j in z)) print j, "0", x[j];
for (i in z) if (i in x) print i, z[i], x[i]; else print i, z[i], "0"}
' OFS="|" file1 file2
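
A usage note (my addition): awk's for (k in array) iteration order is unspecified, so if a stable ordering matters you can pipe the result through sort, e.g. append | sort -t'|' -k1,1 to the command above.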





answered Mar 6 at 23:24 by don_crissti (accepted), edited Mar 7 at 11:22

  • This looks great! I had attempted using a single awk expression but just could not get it down correctly. I will be verifying this answer tomorrow and update accordingly. Much appreciated
    – EDubman
    Mar 7 at 5:48










  • Both options work great! Thank you
    – EDubman
    Mar 7 at 21:49

The following replaces the first | in both files with @ (use a character that does not occur elsewhere in the file), performs the join and then changes the @ back to the original |. This way, we create a new |-delimited join field consisting of columns 1 and 2 from the original files.



join -t'|' -e0 -a1 -a2 -o0,1.3,2.3 \
    <( sed 's/|/@/' file1 | sort ) \
    <( sed 's/|/@/' file2 | sort ) |
tr '@' '|'


In the output field specification (-o), a zero represents the join field and column 3 in either file is actually column 4 from the original data.



For the given input files, this generates



1111|AAA|50|10
1111|BBB|30|0
1111|CCC|0|20
2222|BBB|10|0
3333|AAA|0|40





answered Mar 7 at 6:40 by Kusalananda

  • Yup; with unknown input I usually resort to low ascii chars which are almost guaranteed to never occur in a text file but that is on gnu setups...
    – don_crissti
    Mar 7 at 11:22
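
Building on that comment, a variant of the sketch above that uses a control character (ASCII 0x01, assumed never to occur in the data) as the temporary delimiter instead of @, relying on GNU sed's \xHH escape:

join -t'|' -e0 -a1 -a2 -o0,1.3,2.3 \
    <( sed 's/|/\x01/' file1 | sort ) \
    <( sed 's/|/\x01/' file2 | sort ) |
tr '\001' '|'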

















Another awk approach:



awk -F'|' 'NR==FNR{f1[$1FS$2]=$NF;next} {f2[$1FS$2]=$NF}
END{for (x in f1) {print x,f1[x],f2[x]?f2[x]:0; delete f2[x]};
for (y in f2) print y, 0, f2[y]}
' file[12] OFS='|'


Explanation:




  • NR==FNR{f1[$1FS$2]=$NF;next}: this runs only for file1; using the key combination $1FS$2 it stores the last-column value $NF in the array f1 (FS expands to |, awk's field separator).


  • {f2[$1FS$2]=$NF}: same as above, but this runs only for file2.


  • for (x in f1) {print x,f1[x],f2[x]?f2[x]:0; delete f2[x]}: loop over the array f1 and print the key (x), its value from file1 f1[x], and, if the same key also exists in file2, its file2 value, otherwise 0 (using the ternary condition f2[x]?f2[x]:0); after that, the record for the same key is also deleted from file2's array with delete f2[x].


  • for (y in f2) print y, 0, f2[y]: by now the array f2 only holds records that exist in file2 alone, so we print their key (y), 0 because the key does not exist in file1, and their file2 value f2[y].
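
For the sample files this should print the five expected rows, though in an unspecified order (awk's for (k in array) iteration order is not defined), e.g.:

1111|AAA|50|10
1111|BBB|30|0
2222|BBB|10|0
1111|CCC|0|20
3333|AAA|0|40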





answered Mar 7 at 6:18 by αғsнιη, edited Mar 7 at 7:22