Compare 2 fields from different files and only return matches regardless of order

I have 2 files
File1.txt:

Column1 | Column2
username2 | timestamp
username1 | timestamp
username4 | timestamp

File2.txt:

Column1 | Column2
username1 | timestamp
username3 | timestamp
username2 | timestamp

I want to write the Column1 values that appear in both files to a new file, showing only the Column1 content. The values in Column1 are not always in the same order in File1.txt and File2.txt, and some entries will be missing from either file.

Output-File3.txt:

Column1
username1
username2

edited Jun 7 at 15:20

asked Jun 7 at 14:20

Don Draper

Post the expected result.
– RomanPerekhrest
Jun 7 at 14:27

3 Answers

accepted

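Using awk:

awk -F ' *[|] *' 'NR==FNR{a[$1];next}($1 in a)' file1 file2

The array a is filled with the first-column values of file1. While the second file is parsed, only lines whose first column matches an entry in the array are printed. The pipe is written as [|] so that it is matched literally instead of being treated as regex alternation.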
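As a minimal sketch of this two-file idiom, assuming the sample data from the question with a single leading space on each line (as mentioned in the comments on another answer) and a hypothetical scratch directory, the first field can be trimmed before it is used as a key, and only that field printed:

```shell
# Hypothetical scratch directory and sample data modelled on the question.
tmp=$(mktemp -d)
printf '%s\n' ' Column1 | Column2' ' username2 | timestamp' \
    ' username1 | timestamp' ' username4 | timestamp' > "$tmp/File1.txt"
printf '%s\n' ' Column1 | Column2' ' username1 | timestamp' \
    ' username3 | timestamp' ' username2 | timestamp' > "$tmp/File2.txt"

# Split on the literal pipe, then strip blanks around the first field.
# First file: remember each key. Second file: print keys already seen.
matches=$(awk -F'[|]' '
    { gsub(/^[ \t]+|[ \t]+$/, "", $1) }
    NR == FNR { seen[$1]; next }
    $1 in seen { print $1 }
' "$tmp/File1.txt" "$tmp/File2.txt")

printf '%s\n' "$matches" > "$tmp/Output-File3.txt"
cat "$tmp/Output-File3.txt"
```

The matches come out in the order in which they appear in the second file, and the header line is kept because Column1 is present in both files.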

answered Jun 7 at 14:28

oliv


Thanks Oliv, although that gave me everything: matches and differences.
– Don Draper
Jun 7 at 15:02

One minor drawback: if userX appears in file1, and it appears twice in file2, you'll see both instances in the output. You might want: $1 in a {print $1; delete a[$1]} to only print the first one.
– glenn jackman
Jun 7 at 16:05

In my case, there are no duplicates.
– Don Draper
Jun 7 at 17:46

@Don, if this is the best answer, accept it. Read "What should I do when someone answers my question?"
– glenn jackman
Jun 7 at 18:04
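The variation glenn jackman suggests in the comments can be sketched as follows, assuming hypothetical sample files in which username1 appears twice in the second file. Deleting the key after its first hit suppresses later repeats:

```shell
# Hypothetical scratch directory; username1 is repeated in file2 on purpose.
tmp=$(mktemp -d)
printf '%s\n' 'username2 | timestamp' 'username1 | timestamp' > "$tmp/file1"
printf '%s\n' 'username1 | timestamp' 'username3 | timestamp' \
    'username1 | timestamp' 'username2 | timestamp' > "$tmp/file2"

# Remember file1 keys; on the first match in file2, print the key and
# remove it from the array so that a repeated key is not printed again.
deduped=$(awk -F' *[|] *' '
    NR == FNR { a[$1]; next }
    $1 in a { print $1; delete a[$1] }
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$deduped"
```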


Extract column 1 from both files, sort it, then find the duplicated lines:

cut -d" " -f1 File1.txt File2.txt | sort | uniq -d
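A sketch of the same pipeline that tolerates leading blanks, assuming the question's sample data with one leading space per line and a hypothetical helper named first_column. Each file is de-duplicated on its own first, so that uniq -d reports only names present in both files:

```shell
# Hypothetical scratch directory and sample data modelled on the question.
tmp=$(mktemp -d)
printf '%s\n' ' Column1 | Column2' ' username2 | timestamp' \
    ' username1 | timestamp' ' username4 | timestamp' > "$tmp/File1.txt"
printf '%s\n' ' Column1 | Column2' ' username1 | timestamp' \
    ' username3 | timestamp' ' username2 | timestamp' > "$tmp/File2.txt"

# Strip leading blanks, keep the first space-delimited field,
# and drop repeats within a single file.
first_column() {
    sed 's/^[[:blank:]]*//' "$1" | cut -d' ' -f1 | sort -u
}

# A name printed by uniq -d occurred once in each file.
common=$( { first_column "$tmp/File1.txt"; first_column "$tmp/File2.txt"; } | sort | uniq -d )

printf '%s\n' "$common"
```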

answered Jun 7 at 15:23

glenn jackman


That's almost it, I reckon. However, there are single spaces in front of the first-column text, and those are being missed. How would you incorporate that?
– Don Draper
Jun 7 at 15:38

Then cut is the wrong tool. Stick with awk.
– glenn jackman
Jun 7 at 16:04

Could I use cut to remove the first space?
– Don Draper
Jun 7 at 17:47

No, you'd use sed 's/^[[:blank:]]*//' to remove leading whitespace. But at this point, stringing several commands together, you'd be better off with a single awk program.
– glenn jackman
Jun 7 at 18:02


Just sort both files before comparing them:

sort f1 > f1s
sort f2 > f2s
diff f1s f2s
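Note that diff reports the lines that differ. As a hedged alternative built on the same sort-first idea, and assuming the hypothetical sample files and helper name below, comm -12 prints only the lines common to two sorted inputs. It is applied to the first column only, because whole lines would match only when the timestamps also agree:

```shell
# Hypothetical scratch directory and sample data modelled on the question.
tmp=$(mktemp -d)
printf '%s\n' 'Column1 | Column2' 'username2 | timestamp' \
    'username1 | timestamp' 'username4 | timestamp' > "$tmp/f1"
printf '%s\n' 'Column1 | Column2' 'username1 | timestamp' \
    'username3 | timestamp' 'username2 | timestamp' > "$tmp/f2"

# Extract the first pipe-delimited column, trim blanks, and sort in the
# C locale so that sort and comm agree on the ordering.
first_column_sorted() {
    awk -F'[|]' '{ gsub(/^[ \t]+|[ \t]+$/, "", $1); print $1 }' "$1" | LC_ALL=C sort -u
}

first_column_sorted "$tmp/f1" > "$tmp/f1s"
first_column_sorted "$tmp/f2" > "$tmp/f2s"

# -12 suppresses the lines unique to either file, leaving the common ones.
common=$(LC_ALL=C comm -12 "$tmp/f1s" "$tmp/f2s")

printf '%s\n' "$common"
```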

answered Jun 7 at 14:24

schily
