find the difference between two files and print it in the proper format

I have two text files, "account.txt" and "customer.txt".
**account.txt**
876251251
716126181
888281211
666615211
787878787
111212134
**customer.txt**
876251251
716126181
792342108
792332668
666615211
760332429
791952441
676702288
I need to compare "account.txt" with "customer.txt".
- All account numbers from account.txt should be present in customer.txt. If customer.txt is missing any account number, we should print all the account numbers that are missing from customer.txt.
- I also want to print all the extra customer numbers in customer.txt that are not present in account.txt.
Output should be:
Missing Account Number:
888281211
787878787
111212134
Extra Customer Number:
792342108
792332668
760332429
791952441
676702288
Is this possible to do in Linux? I started off like this, but it only handles the first case, not the second. I also need to print the output in the above format.
comm -23 account.txt customer.txt
Note: The files may contain some strings or empty lines, and those should be discarded from the comparison. Only valid numbers need to be compared.
linux text-processing
edited Dec 14 at 19:42
asked Dec 14 at 19:22
user5447339
3 Answers
Another simple option would be to use comm; it just needs sorted input, so give it clean input by filtering for "valid account numbers" (the whole line consists only of 9 digits), then pipe to sort before redirecting to a new file:
grep -Ex '[[:digit:]]{9}' account.txt | sort > account.txt.sorted
grep -Ex '[[:digit:]]{9}' customer.txt | sort > customer.txt.sorted
... then use comm as you indicated:
echo 'Missing Account Number:'; comm -23 account.txt.sorted customer.txt.sorted;
echo 'Extra Customer Number:'; comm -13 account.txt.sorted customer.txt.sorted;
Given sample inputs of:
account.txt
garbage
876251251
716126181
888281211
666615211
666615211extra
787878787
111212134
extra
customer.txt
garbage
876251251
876251251extra
716126181
792342108
792332668
666615211
760332429
791952441
676702288
junk
The resultant output is:
Missing Account Number:
111212134
787878787
888281211
Extra Customer Number:
676702288
760332429
791952441
792332668
792342108
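The two steps above can also be wrapped into one function so that no `.sorted` files are left behind. The following is a minimal sketch rather than part of the original answer: the function names `valid_numbers` and `compare_accounts` are made up for illustration, and it uses `sort -u` (which also drops duplicate numbers) and `LC_ALL=C` so that `sort` and `comm` agree on ordering.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of "$1" that consist of exactly
# 9 digits, sorted and de-duplicated.
valid_numbers() {
    grep -Ex '[[:digit:]]{9}' "$1" | LC_ALL=C sort -u
}

# Hypothetical wrapper: compare_accounts ACCOUNT_FILE CUSTOMER_FILE
# The body runs in a subshell, so the variable and the cleanup stay local.
compare_accounts() (
    tmpdir=$(mktemp -d) || exit 1
    valid_numbers "$1" > "$tmpdir/account"
    valid_numbers "$2" > "$tmpdir/customer"
    echo 'Missing Account Number:'
    # Column 1 only: lines unique to the account list
    LC_ALL=C comm -23 "$tmpdir/account" "$tmpdir/customer"
    echo 'Extra Customer Number:'
    # Column 2 only: lines unique to the customer list
    LC_ALL=C comm -13 "$tmpdir/account" "$tmpdir/customer"
    rm -rf "$tmpdir"
)
```

It would be called as `compare_accounts account.txt customer.txt`. As in the answer, the numbers come out in sorted order rather than in the order they appear in the input files.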
I tried the first two grep commands, and in the two new files generated I only see one item. Any idea what could be the reason? Also, can we remove the limitation that numbers can only be 9 digits? They can be longer or shorter as well.
– user5447339
Dec 15 at 0:29
It looks like the first grep command is only returning one number as output, and that's why the sorted file has only one number in it. I'm not sure why the grep command gives only one number, though.
– user5447339
Dec 15 at 0:34
I assumed that account numbers were as you showed -- 9 digits, and by themselves on a line.
– Jeff Schaller
Dec 15 at 0:37
Some sample input would help; how do you distinguish an account number from not-an-account-number?
– Jeff Schaller
Dec 15 at 0:37
If it is a number, then it is a valid account or customer number. It can be more or fewer than 9 digits as well. Also, the grep -Ex '[[:digit:]]{9}' command is only returning one number in the output, but in my .txt file I have a lot of numbers...
– user5447339
Dec 15 at 2:07
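Following the clarification in the comments that any all-digit line counts as a valid number, the 9-digit filter can be relaxed to "one or more digits". The sketch below also strips carriage returns first, on the unconfirmed assumption that the files were saved with Windows (CRLF) line endings. That would be one possible reason for `grep -x` matching almost nothing, but the thread never establishes the actual cause. The function name `clean_numbers` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print every line of "$1" that consists only of
# digits (any length), sorted and de-duplicated.
# tr -d '\r' removes carriage returns; this assumes (not confirmed in the
# thread) that the input may have CRLF line endings.
clean_numbers() {
    tr -d '\r' < "$1" | grep -Ex '[[:digit:]]+' | LC_ALL=C sort -u
}

# Usage, mirroring the answer above:
#   clean_numbers account.txt  > account.txt.sorted
#   clean_numbers customer.txt > customer.txt.sorted
```

With numbers of different lengths the sort order is lexical ("12345" sorts before "99"), which is what `comm` expects as long as it is also run with `LC_ALL=C`.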
Yes, it is possible, maybe easiest with diff.
$ diff account.txt customer.txt
1c1
< **account.txt**
---
> **customer.txt**
5c5,6
< 888281211
---
> 792342108
> 792332668
7,8c8,10
< 787878787
< 111212134
---
> 760332429
> 791952441
> 676702288
$ diff account.txt customer.txt|grep '^<'
< **account.txt**
< 888281211
< 787878787
< 111212134
$ diff account.txt customer.txt|grep '^>'
> **customer.txt**
> 792342108
> 792332668
> 760332429
> 791952441
> 676702288
The following shell script, diff-script, is more polished.
#!/bin/bash
# assuming 9-digit account and customer numbers
sort account.txt | uniq > account.srt
sort customer.txt | uniq > customer.srt
diff account.srt customer.srt > diff.txt
echo 'only in account.srt:' > result.txt
< diff.txt grep -E '^< [0-9]{9}$' | sed 's/^< //' >> result.txt
echo 'only in customer.srt:' >> result.txt
< diff.txt grep -E '^> [0-9]{9}$' | sed 's/^> //' >> result.txt
echo "The result is in the file 'result.txt'"
echo "You can read it with 'less result.txt'"
Demo example,
$ ./diff-script
The result is in the file 'result.txt'
You can read it with 'less result.txt'
$ cat result.txt
only in account.srt:
111212134
787878787
888281211
only in customer.srt:
676702288
760332429
791952441
792332668
792342108
is there any way to do everything in one command and print in the above format?
– user5447339
Dec 14 at 19:44
@user5447339, I'll fine-tune my answer, please wait a while :-)
– sudodus
Dec 14 at 19:44
Sure. Also, how can I compare only valid numbers, ignoring all the empty lines and the stray strings in the files? If you could add some description of how the above commands work, that would help me a lot in understanding them.
– user5447339
Dec 14 at 19:47
Please specify valid numbers! Are all nine-digit-numbers valid?
– sudodus
Dec 14 at 19:49
Sometimes I have strings like #REF! or some other strings on a line of their own, so those should be discarded from the comparison. Sometimes there are also empty lines in the file followed by some numbers, so those should be discarded as well.
– user5447339
Dec 14 at 19:51
For this job I would choose awk. The code below only considers valid lines of exactly 9 digits. Empty lines, lines with more or fewer than 9 digits, and lines with letters are ignored.
$ cat account
876251251
716126181
888281211
asdferfggggg
666615211
787878787
123456789123
111212134
$ cat customer
876251251
716126181
eeeeeeeee
792342108
792332668
666615211
760332429
791952441
676702288
$ awk '/^[0-9]{9}$/{a[$0]++;b[$0]="found only in " FILENAME}END{for (i in a) {if (a[i]==1) print i,b[i]}}' account customer |sort -k2
111212134 found only in account
787878787 found only in account
888281211 found only in account
676702288 found only in customer
760332429 found only in customer
791952441 found only in customer
792332668 found only in customer
792342108 found only in customer
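To get the two-section layout requested in the question rather than one tagged line per number, the same idea can be written so that awk remembers which file each number came from. This is a sketch, not part of the original answer: it accepts digit-only lines of any length (per the asker's later comments), keeps the numbers in the order they appear in each file, and the function name `report_diff` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper: report_diff ACCOUNT_FILE CUSTOMER_FILE
report_diff() {
    awk '
        # Skip blank lines and anything that is not purely digits
        !/^[0-9]+$/ { next }

        # Lines from the first file: remember each account number once, in order
        FILENAME == ARGV[1] {
            if (!($0 in acct)) { acct[$0] = 1; aorder[++na] = $0 }
            next
        }

        # Lines from the second file: remember each customer number once, in order
        {
            if (!($0 in cust)) { cust[$0] = 1; corder[++nc] = $0 }
        }

        END {
            print "Missing Account Number:"
            for (i = 1; i <= na; i++)
                if (!(aorder[i] in cust)) print aorder[i]
            print "Extra Customer Number:"
            for (i = 1; i <= nc; i++)
                if (!(corder[i] in acct)) print corder[i]
        }
    ' "$1" "$2"
}
```

It would be called as `report_diff account customer`. Note that with the sample `account` file above, the 12-digit line 123456789123 would be reported as a missing account number, because this sketch treats any digit-only line as valid.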
Answer using comm: answered Dec 14 at 20:21 by Jeff Schaller
Answer using diff: answered Dec 14 at 19:42 by sudodus, edited Dec 14 at 20:44
96116
96116
is there any way to do everything in one command and print in the above format?
– user5447339
Dec 14 at 19:44
@user5447339, I'll fine-tune my answer, please wait a while :-)
– sudodus
Dec 14 at 19:44
sure.. also how can I compare only valid numbers.. Ignore all the empty lines in the file and some strings as well which are there.. Also if you can add some description on how the above commands are working then it will help me a lot as well in understanding..
– user5447339
Dec 14 at 19:47
Please specify valid numbers! Are all nine-digit-numbers valid?
– sudodus
Dec 14 at 19:49
Sometimes I have some strings like#REF!or some other strings in a new line so it should discard those from the comparisons.. Also sometimes it has empty lines in the file followed by some numbers so it should discard those as well..
– user5447339
Dec 14 at 19:51
|
show 1 more comment
is there any way to do everything in one command and print in the above format?
– user5447339
Dec 14 at 19:44
@user5447339, I'll fine-tune my answer, please wait a while :-)
– sudodus
Dec 14 at 19:44
sure.. also how can I compare only valid numbers.. Ignore all the empty lines in the file and some strings as well which are there.. Also if you can add some description on how the above commands are working then it will help me a lot as well in understanding..
– user5447339
Dec 14 at 19:47
Please specify valid numbers! Are all nine-digit-numbers valid?
– sudodus
Dec 14 at 19:49
Sometimes I have some strings like#REF!or some other strings in a new line so it should discard those from the comparisons.. Also sometimes it has empty lines in the file followed by some numbers so it should discard those as well..
– user5447339
Dec 14 at 19:51
is there any way to do everything in one command and print in the above format?
– user5447339
Dec 14 at 19:44
is there any way to do everything in one command and print in the above format?
– user5447339
Dec 14 at 19:44
@user5447339, I'll fine-tune my answer, please wait a while :-)
– sudodus
Dec 14 at 19:44
@user5447339, I'll fine-tune my answer, please wait a while :-)
– sudodus
Dec 14 at 19:44
sure.. also how can I compare only valid numbers.. Ignore all the empty lines in the file and some strings as well which are there.. Also if you can add some description on how the above commands are working then it will help me a lot as well in understanding..
– user5447339
Dec 14 at 19:47
sure.. also how can I compare only valid numbers.. Ignore all the empty lines in the file and some strings as well which are there.. Also if you can add some description on how the above commands are working then it will help me a lot as well in understanding..
– user5447339
Dec 14 at 19:47
Please specify valid numbers! Are all nine-digit-numbers valid?
– sudodus
Dec 14 at 19:49
Please specify valid numbers! Are all nine-digit-numbers valid?
– sudodus
Dec 14 at 19:49
Sometimes I have some strings like
#REF! or some other strings in a new line so it should discard those from the comparisons.. Also sometimes it has empty lines in the file followed by some numbers so it should discard those as well..– user5447339
Dec 14 at 19:51
Sometimes I have some strings like
#REF! or some other strings in a new line so it should discard those from the comparisons.. Also sometimes it has empty lines in the file followed by some numbers so it should discard those as well..– user5447339
Dec 14 at 19:51
|
show 1 more comment
For this job i would choose awk. Bellow code runs only for valid data of 9 numbers in the row. Empty lines, lines with numbers more or less than 9, and lines with letters are ignored.
$ cat account
876251251
716126181
888281211
asdferfggggg
666615211
787878787
123456789123
111212134
$ cat customer
876251251
716126181
eeeeeeeee
792342108
792332668
666615211
760332429
791952441
676702288
$ awk '/^[0-9]{9}$/{a[$0]++;b[$0]="found only in " FILENAME}END{for (i in a) if (a[i]==1) print i,b[i]}' account customer |sort -k2
111212134 found only in account
787878787 found only in account
888281211 found only in account
676702288 found only in customer
760332429 found only in customer
791952441 found only in customer
792332668 found only in customer
792342108 found only in customer
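The output above lists the differences but not under the two headings the question asks for. A minimal sketch of one way to get that layout follows. It assumes, as above, that a valid number is a line of exactly nine digits, and the function name compare_accounts is hypothetical. Unlike the a[i]==1 test, it tracks each file separately, so a number repeated within one file should still be reported once, and numbers are printed in the order they first appear.

```shell
#!/bin/sh
# compare_accounts ACCOUNT_FILE CUSTOMER_FILE   (hypothetical helper name)
# Prints the numbers missing from CUSTOMER_FILE, then the extra ones in it.
compare_accounts() {
    awk '
        # Skip anything that is not exactly nine digits (blank lines, #REF!, ...).
        !/^[0-9]+$/ || length($0) != 9 { next }

        # First file: remember each account number once, in input order.
        FILENAME == ARGV[1] {
            if (!($0 in acct)) { acct[$0] = 1; aorder[++na] = $0 }
            next
        }

        # Second file: remember each customer number once, in input order.
        !($0 in cust) { cust[$0] = 1; corder[++nc] = $0 }

        END {
            print "Missing Account Number:"
            for (i = 1; i <= na; i++) if (!(aorder[i] in cust)) print aorder[i]
            print "Extra Customer Number:"
            for (i = 1; i <= nc; i++) if (!(corder[i] in acct)) print corder[i]
        }
    ' "$1" "$2"
}
```

It would be called as compare_accounts account.txt customer.txt. The length check stands in for the {9} interval expression so that the sketch does not depend on interval support in the awk at hand.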
answered Dec 14 at 21:08
George Vasiliou