Comparing CSV files to look up column 1, then check values in column 2
OK, I will try to explain what I need to do as clearly as possible.
Basically I have two CSV files, as per the examples below:
File 1:
Column 1, Column 2
abc , 123
def , 234
adf , 567
File 2:
Column 1, Column 2
abc , 123
def , 234
adf , 578
I need to write either a shell script or simple command that will do the following:
- Sort both files by column 1
- Row by row, do the following:
- Using column 1 in file 1, search for this value in column 1 in file 2.
- if found, compare the value in column 2 in file 1 against the value in column 2 of file 2
- if it matches, write column 1, column 2 and "Validated" in column 3 to a separate file
- if it does not match, write column 1, column 2 and "Failed" to a separate file
This results in two output files: the first containing every row where column 1 was found and column 2 matches, and a second containing rows where the column 1 lookup failed, or where column 1 was found but column 2 did not match. Essentially, column 1 is used as the key to check column 2.
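Something like the following is the shape of pipeline I have in mind, though I do not have a working version; this is only an untested sketch, and file1.csv, file2.csv, validated.csv and failed.csv are placeholder names:
sort file1.csv > f1.sorted
sort file2.csv > f2.sorted
# join on column 1; -a 1 also keeps rows from file 1 whose key is absent from file 2
join -t, -a 1 f1.sorted f2.sorted | awk -F' *, *' '
    $2 == $3 { print $1 ", " $2 ", Validated" > "validated.csv" }
    $2 != $3 { print $1 ", " $2 ", Failed"    > "failed.csv"    }'
# (header rows would need to be skipped or handled separately)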
shell-script text-processing columns file-comparison
asked Mar 21 '17 at 17:08 by Veyron
edited Mar 21 '17 at 18:08 by G-Man
Your question would be clearer if you (1) showed exactly what you want the output files to look like for your example input, and (2) described (and illustrated) what you want to happen if a value in column 1 of file 1 is not present in file 2. Also, have you made any attempt to solve this yourself? Done any research? What have you found? What have you tried?
– G-Man, Mar 21 '17 at 18:10
Hi, apologies, yes, I have been looking into this extensively. I first tried the following, but it did not give the desired result, so I had to play around with the columns etc.: awk -F, 'NR==FNR{a[$1,$3]++; next} (a[$1,$2])' file1.txt file2.txt. I have also been looking at whether I should be doing this in SQL, but that means importing one of the files into SQL and running inner joins, which with 150 million lines can take a while. I am still investigating and will obviously share anything I find that works on this post.
– Veyron, Mar 22 '17 at 14:24
2 Answers
Given the following input files:
$ cat in1 in2
Column 1, Column 2
abc , 123
def , 234
adf , 567
Column 1, Column 2
abc , 123
def , 234
adf , 578
First, we sort them; we can then stitch them together into one file:
$ sort in1 > in1.sorted; sort in2 > in2.sorted; paste in{1,2}.sorted
Column 1, Column 2 Column 1, Column 2
abc , 123 abc , 123
adf , 567 adf , 578
def , 234 def , 234
awk will help us here, but the commas get in our way; we can get rid of them with sed first:
$ paste in{1,2}.sorted | sed 's/,//g'
Column 1 Column 2 Column 1 Column 2
abc 123 abc 123
adf 567 adf 578
def 234 def 234
And then we can dump that through a quick awk:
$ paste in{1,2}.sorted | sed 's/,//g' | awk '$2 == $4 {print $1,"Validated"}; $2 != $4 {print $1,"Failed"}'
Column Failed
abc Validated
adf Failed
def Validated
This can also be done in raw awk, with the advantage of being able to strip out the header rows and not rely on the same data being in the same order, thus removing the need for sorting:
$ awk 'FNR != 1 && NR == FNR {data[$1]=$3} FNR != 1 && NR != FNR {if (data[$1]==$3) {print $1, "Validated"} else {print $1, "Failed"}}' in{1,2}
abc Validated
adf Failed
def Validated
This relies on a few magic awk built-in variables and tricks related to them:
- NR - the total number of records processed
- FNR - the number of records processed so far in the current file
- FNR != 1 - skips the first row of each file (does not treat the headers as data)
- NR != FNR - true only after the first file has been completely read and we have started reading subsequent files; this allows us to prepopulate the data array for testing once we start chewing on the second file
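If the two separate output files from the original question are needed, the same idea extends naturally. The following is only an untested sketch along those lines (validated.txt and failed.txt are placeholder names):
$ awk 'FNR != 1 && NR == FNR {data[$1] = $3; next}
       FNR != 1 {
           # second file: compare against the stored value for this key
           if ($1 in data && data[$1] == $3)
               print $1", "$3", Validated" > "validated.txt"
           else
               print $1", "$3", Failed" > "failed.txt"
       }' in{1,2}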
answered Mar 21 '17 at 17:25 by DopeGhoti
edited Mar 21 '17 at 22:50
Here is a modified version as a one-liner: paste <(sort in1) <(sort in2) | tr -d , | awk '$2 == $4 {print $1,"Validated"}; $2 != $4 {print $1,"Failed"}'
– hschou, Mar 21 '17 at 22:23
Thank you. I am working on the final lists to compare; I will give this a try, let you know the end results, and see whether I need to tweak anything. Will update you on progress.
– Veyron, Mar 22 '17 at 14:28
OK, so I did a test with this. The first file is the complete source list of 15 million lines; the second file is a subset of what has been checked, ~22,000 lines. The end result is that only about 50 lines were validated. I am looking into another method and will post here shortly.
– Veyron, Mar 22 '17 at 15:48
I think I have this sorted out now with the following, in case anyone else reads this and needs it. Thanks again.
FNR == NR {
    for (i = 2; i <= NF; i++) a[i,$1] = $i;
    b[$1];
    next;
}
($1 in b) {    # check if row in file2 existed in file1
    for (i = 2; i <= NF; i++) {
        if (a[i,$1] == $i)
            printf("%s->col%d: %s vs %s: Valid\n", $1, i-1, a[i,$1], $i);
        else
            printf("%s->col%d: %s vs %s: Failure\n", $1, i-1, a[i,$1], $i);
    }
    delete b[$1];    # delete entries which are processed
}
END {
    for (left in b)    # look at which keys didn't match
        for (i = 2; i <= NF; i++)
            printf("%s->col%d: %s vs (blank): Not Equal\n", left, i-1, a[i,left]);
}
answered Mar 23 '17 at 11:31 by Veyron
Further to my email: although I thought I had an answer, I have tested this with a couple of CSV files and it does not work. I have spent hours going through this but cannot seem to work out where the awk statement is wrong. Any help would be appreciated; thanks in advance.
– Veyron, Mar 27 '17 at 4:28