find the difference between two files and print it in the proper format

I have two text files, "account.txt" and "customer.txt".

**account.txt**

876251251
716126181
888281211
666615211
787878787
111212134

**customer.txt**

876251251
716126181
792342108
792332668
666615211
760332429
791952441
676702288

I need to compare "account.txt" with "customer.txt".

All account numbers from account.txt should be present in customer.txt. If customer.txt is missing any account number, I want to print all the account numbers that are missing from customer.txt.

I also want to print any extra customer numbers in customer.txt that are not present in account.txt.

Output should be:

Missing Account Number:
888281211
787878787
111212134

Extra Customer Number:
792342108
792332668
760332429
791952441
676702288

Is this possible on Linux? I started with the command below, but it only handles the first case, not the second. I also need to print the output in the format above.

comm -23 account.txt customer.txt

Note: The files may contain non-numeric strings or empty lines, which should be discarded from the comparison. Only valid numbers need to be compared.

edited Dec 14 at 19:42

asked Dec 14 at 19:22

user5447339


linux text-processing


3 Answers

Another simple option would be to use comm; it just needs sorted input, so give it clean input by filtering for "valid account numbers" (the whole line consists only of 9 digits), then pipe to sort before redirecting to a new file:

grep -Ex '[[:digit:]]{9}' account.txt | sort > account.txt.sorted
grep -Ex '[[:digit:]]{9}' customer.txt | sort > customer.txt.sorted

... then use comm as you indicated:

echo 'Missing Account Number:'; comm -23 account.txt.sorted customer.txt.sorted
echo
echo 'Extra Customer Number:'; comm -13 account.txt.sorted customer.txt.sorted

Given sample inputs of:

account.txt

garbage
876251251
716126181
888281211
666615211
666615211extra
787878787
111212134
extra

customer.txt

garbage
876251251
876251251extra
716126181
792342108
792332668
666615211
760332429
791952441
676702288
junk

The resultant output is:

Missing Account Number:
111212134
787878787
888281211

Extra Customer Number:
676702288
760332429
791952441
792332668
792342108
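The steps above can also be wrapped in a single shell function so that one call prints both sections. This is only a sketch: the function name compare_numbers and the temporary directory layout are assumptions made for illustration, and sort -u is used instead of plain sort so that repeated numbers are listed once.

```shell
#!/bin/sh
# Sketch: wrap the grep | sort | comm steps in one function.
# compare_numbers is a hypothetical name, not part of any tool.
compare_numbers() {
    acct=$1
    cust=$2
    tmp=$(mktemp -d) || return 1

    # Keep only lines that are exactly 9 digits, then sort for comm.
    grep -Ex '[[:digit:]]{9}' "$acct" | sort -u > "$tmp/a"
    grep -Ex '[[:digit:]]{9}' "$cust" | sort -u > "$tmp/c"

    echo 'Missing Account Number:'
    comm -23 "$tmp/a" "$tmp/c"     # lines only in the account list
    echo
    echo 'Extra Customer Number:'
    comm -13 "$tmp/a" "$tmp/c"     # lines only in the customer list

    rm -rf "$tmp"
}

# Demo with the sample data from this answer.
demo=$(mktemp -d)
printf '%s\n' garbage 876251251 716126181 888281211 666615211 \
    666615211extra 787878787 111212134 extra > "$demo/account.txt"
printf '%s\n' garbage 876251251 876251251extra 716126181 792342108 \
    792332668 666615211 760332429 791952441 676702288 junk > "$demo/customer.txt"

result=$(compare_numbers "$demo/account.txt" "$demo/customer.txt")
printf '%s\n' "$result"
rm -rf "$demo"
```

Redirecting the call, for example compare_numbers account.txt customer.txt > report.txt (report.txt being a made-up name), would store the report instead of printing it.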

answered Dec 14 at 20:21

Jeff Schaller


I tried the first two grep commands, but each of the two new files contains only one item. Any idea what the reason could be? Also, can we remove the limitation that numbers must be 9 digits? They can be longer or shorter as well.
– user5447339
Dec 15 at 0:29

It looks like the first grep command returns only one number, which is why the sorted file contains only one number. I'm not sure why grep outputs just one number, though.
– user5447339
Dec 15 at 0:34

I assumed that account numbers were as you showed -- 9 digits, and by themselves on a line.
– Jeff Schaller
Dec 15 at 0:37

Some sample input would help; how do you distinguish an account number from not-an-account-number?
– Jeff Schaller
Dec 15 at 0:37

If it is a number, then it is a valid account or customer number. It can have more or fewer than 9 digits as well. Also, the grep -Ex '[[:digit:]]{9}' command returns only one number, but my .txt file contains a lot of numbers...
– user5447339
Dec 15 at 2:07
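A possible adaptation for the relaxed requirement in these comments is sketched below: it accepts digit-only lines of any length. The thread never establishes why grep returned a single line; one possibility, stated here purely as an assumption, is Windows-style CRLF line endings, since a trailing carriage return keeps -x from matching. The helper name clean_numbers and the sample data are made up for illustration.

```shell
#!/bin/sh
# Sketch: accept digit-only lines of any length and tolerate CRLF line endings.
# clean_numbers is a hypothetical helper name.
clean_numbers() {
    # tr deletes carriage returns; grep -x keeps lines made only of digits.
    tr -d '\r' < "$1" | grep -Ex '[[:digit:]]+' | sort -u
}

work=$(mktemp -d)
# Made-up sample data: mixed lengths, CRLF endings, junk, and a blank line.
printf '12345\r\n#REF!\r\n\r\n9876543210\r\n42\r\n' > "$work/account.txt"
printf '42\r\nabc\r\n777\r\n12345\r\n' > "$work/customer.txt"

clean_numbers "$work/account.txt"  > "$work/account.sorted"
clean_numbers "$work/customer.txt" > "$work/customer.sorted"

missing=$(comm -23 "$work/account.sorted" "$work/customer.sorted")
extra=$(comm -13 "$work/account.sorted" "$work/customer.sorted")
printf 'Missing Account Number:\n%s\n\nExtra Customer Number:\n%s\n' "$missing" "$extra"
rm -rf "$work"
```

The sort is deliberately lexical rather than numeric, because comm expects its inputs in the same collation order that plain sort produces.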


Yes, it is possible, maybe easiest with diff.

$ diff account.txt customer.txt
1c1
< **account.txt**
---
> **customer.txt**
5c5,6
< 888281211
---
> 792342108
> 792332668
7,8c8,10
< 787878787
< 111212134
---
> 760332429
> 791952441
> 676702288

$ diff account.txt customer.txt|grep '^<'
< **account.txt**
< 888281211
< 787878787
< 111212134

$ diff account.txt customer.txt|grep '^>'
> **customer.txt**
> 792342108
> 792332668
> 760332429
> 791952441
> 676702288

The following shell script, diff-script, is more polished.

#!/bin/bash

# assuming 9-digit account and customer numbers

sort account.txt | uniq > account.srt
sort customer.txt | uniq > customer.srt

diff account.srt customer.srt > diff.txt

echo 'only in account.srt:' > result.txt
< diff.txt grep -E '^< [0-9]{9}$' | sed 's/^< //' >> result.txt

echo 'only in customer.srt:' >> result.txt
< diff.txt grep -E '^> [0-9]{9}$' | sed 's/^> //' >> result.txt

echo "The result is in the file 'result.txt'"
echo "You can read it with 'less result.txt'"

Demo example,

$ ./diff-script
The result is in the file 'result.txt'
You can read it with 'less result.txt'

$ cat result.txt 
only in account.srt:
111212134
787878787
888281211
only in customer.srt:
676702288
760332429
791952441
792332668
792342108

edited Dec 14 at 20:44

answered Dec 14 at 19:42

sudodus


is there any way to do everything in one command and print in the above format?
– user5447339
Dec 14 at 19:44

@user5447339, I'll fine-tune my answer, please wait a while :-)
– sudodus
Dec 14 at 19:44

Sure. Also, how can I compare only valid numbers? It should ignore all empty lines and any non-numeric strings in the files. If you could also add a description of how the above commands work, it would help me a lot in understanding them.
– user5447339
Dec 14 at 19:47

Please specify valid numbers! Are all nine-digit-numbers valid?
– sudodus
Dec 14 at 19:49

Sometimes there are strings like #REF! or other strings on a line of their own, so those should be discarded from the comparison. Sometimes the file also has empty lines followed by more numbers, so those should be discarded as well.
– user5447339
Dec 14 at 19:51
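To illustrate how the diff-based commands work and how junk such as #REF! or empty lines could be discarded first, here is a small sketch. The sample data and the choice to accept digit-only lines of any length are assumptions for illustration, not something the answer itself specifies.

```shell
#!/bin/sh
# Sketch: the diff-based approach with a filter applied before sorting.
work=$(mktemp -d)
# Made-up sample data containing a junk string and empty lines.
printf '%s\n' 876251251 '#REF!' '' 888281211 666615211 > "$work/account.txt"
printf '%s\n' 876251251 666615211 '' junk 792342108 > "$work/customer.txt"

# Keep digit-only lines, then sort and de-duplicate.
grep -Ex '[0-9]+' "$work/account.txt"  | sort -u > "$work/account.srt"
grep -Ex '[0-9]+' "$work/customer.txt" | sort -u > "$work/customer.srt"

# diff exits with status 1 when the files differ, which is expected here.
diff "$work/account.srt" "$work/customer.srt" > "$work/diff.txt" || true

# Lines starting with "< " exist only in the first file, "> " only in the second.
only_account=$(sed -n 's/^< //p' "$work/diff.txt")
only_customer=$(sed -n 's/^> //p' "$work/diff.txt")

printf 'Missing Account Number:\n%s\n\nExtra Customer Number:\n%s\n' \
    "$only_account" "$only_customer"
rm -rf "$work"
```

The sed -n 's/...//p' form both selects the marked lines and strips the marker, so a separate grep step is not needed in this variant.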


For this job I would choose awk. The code below considers only lines that consist of exactly 9 digits. Empty lines, lines with more or fewer than 9 digits, and lines containing letters are ignored.

$ cat account
876251251

716126181
888281211
asdferfggggg
666615211
787878787
123456789123
111212134

$ cat customer
876251251
716126181
eeeeeeeee
792342108
792332668
666615211
760332429

791952441
676702288

$ awk '/^[0-9]{9}$/{a[$0]++;b[$0]="found only in " FILENAME}END{for (i in a) if (a[i]==1) print i,b[i]}' account customer |sort -k2
111212134 found only in account
787878787 found only in account
888281211 found only in account
676702288 found only in customer
760332429 found only in customer
791952441 found only in customer
792332668 found only in customer
792342108 found only in customer
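If the output should follow the two-section layout from the question, an awk variant along these lines might work. It is a sketch under stated assumptions: digit-only lines of any length count as valid (so 123456789123 is no longer ignored), numbers are reported in input order, and the array names acct, cust, aorder and corder are invented for this example.

```shell
#!/bin/sh
# Sketch: a single awk call printing the headers the question asks for.
work=$(mktemp -d)
printf '%s\n' 876251251 '' 716126181 888281211 asdferfggggg 666615211 \
    787878787 123456789123 111212134 > "$work/account"
printf '%s\n' 876251251 716126181 eeeeeeeee 792342108 792332668 666615211 \
    760332429 '' 791952441 676702288 > "$work/customer"

result=$(awk '
    # Skip blank lines and anything that is not made only of digits.
    !/^[0-9]+$/ { next }
    # First file: remember each account number once, in input order.
    FILENAME == ARGV[1] {
        if (!($0 in acct)) { acct[$0] = 1; aorder[++na] = $0 }
        next
    }
    # Second file: remember each customer number once, in input order.
    { if (!($0 in cust)) { cust[$0] = 1; corder[++nc] = $0 } }
    END {
        print "Missing Account Number:"
        for (i = 1; i <= na; i++) if (!(aorder[i] in cust)) print aorder[i]
        print ""
        print "Extra Customer Number:"
        for (i = 1; i <= nc; i++) if (!(corder[i] in acct)) print corder[i]
    }
' "$work/account" "$work/customer")
printf '%s\n' "$result"
rm -rf "$work"
```

Keeping a separate order array avoids the unspecified iteration order of for (i in a), so no sort step is required afterwards.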

answered Dec 14 at 21:08

George Vasiliou
