How can I merge two text files together?

I have a text file like this
sp|QBWMM1-2|PDCI_Mouse(741-770) 8864=mil (25/2.20/2.50)
sp|Pm345|Hisf_Mouse(613-640) 776=mil (25/2.20/2.50)
sp|P0065-2|Hila_Mouse(344-393)6543=mil (25/2.20/2.50)
sp|Q90081|Rira_Mouse(47-72) 7365=mil (25/2.20/2.50)
sp|QQQQQ1|Ubs_Mouse(162-190) 22=mil (25/2.20/2.50)
I have another text file like this
sp|QBWMM1-2|PDCI_Mouse complex subunit alpha OS=Mouse OX=90009 PE=1 SV=1
sp|Pm345|Hisf_Mouse Heat shock 70 kDa protein 1A OS=Mouse OX=90009 PE=1 SV=1
sp|O15012|SCCC_Mouse Protein transport protein Sec16A OS=MOUSE OX=90009 PE=1 SV=4
sp|P0065-2|Hila_Mouse Filamin-A OS=MOUSE OX=90009 PE=1 SV=4
sp|Q90081|Rira_Mouse Alpha-actin OS=Mouse OX=90009 PE=1 SV=2
sp|QQQQQ1|Ubs_Mouse Tubulin alpha8 chain OS=Mouse OX=90009 PE=1 SV=1
sp|QQQQQ2|Ubs_Mouse Plasta-3 OS=Mouse OX=90009 PE=1 SV=4
I want to merge them: keep the shared part once and append the rest of both lines. For example, the following lines from text 1 and text 2 are similar:
sp|QBWMM1-2|PDCI_Mouse(741-770) 8864=mil (25/2.20/2.50)
sp|QBWMM1-2|PDCI_Mouse complex subunit alpha OS=Mouse OX=90009 PE=1 SV=1
The first part is the most important: the lines are considered similar because they share the string sp|QBWMM1-2|PDCI_Mouse.
I then want to put them together, so the result becomes
sp|QBWMM1-2|PDCI_Mouse complex subunit alpha OS=Mouse OX=90009 PE=1 SV=1 (741-770) 8864=mil (25/2.20/2.50)
So the output could be like this:
sp|QBWMM1-2|PDCI_Mouse complex subunit alpha OS=Mouse OX=90009 PE=1 SV=1 (741-770) 8864=mil (25/2.20/2.50)
sp|Pm345|Hisf_Mouse Heat shock 70 kDa protein 1A OS=Mouse OX=90009 PE=1 SV=1 (613-640) 776=mil (25/2.20/2.50)
sp|O15012|SCCC_Mouse Protein transport protein Sec16A OS=MOUSE OX=90009 PE=1 SV=4
sp|P0065-2|Hila_Mouse Filamin-A OS=MOUSE OX=90009 PE=1 SV=4 (344-393)6543=mil (25/2.20/2.50)
sp|Q90081|Rira_Mouse Alpha-actin OS=Mouse OX=90009 PE=1 SV=2 (47-72) 7365=mil (25/2.20/2.50)
sp|QQQQQ1|Ubs_Mouse Tubulin alpha8 chain OS=Mouse OX=90009 PE=1 SV=1 (162-190) 22=mil (25/2.20/2.50)
sp|QQQQQ2|Ubs_Mouse Plasta-3 OS=Mouse OX=90009 PE=1 SV=4
text-processing join
1
What is sublime? Sublime Text? Then why the bash tag? I don't know "sublime", but probably you want to search for a tutorial on regular expressions.
– Sparhawk
Jan 4 at 2:46
@Sparhawk you are right, I updated my question to make it more comprehensive
– Learner
Jan 4 at 3:05
2
You have totally changed the question with that edit! But the answer is to use join. Hopefully you can use that to search for a tutorial, but it's fairly straightforward. Good luck.
– Sparhawk
Jan 4 at 3:07
join -a 1 text1.txt text2.txt does not give me the answer but thanks for your point
– Learner
Jan 4 at 3:19
edited Jan 4 at 18:50 by Jeff Schaller
asked Jan 4 at 2:31 by Learner
1 Answer
If file1 is your first text file, and file2 is your second text file, this should work in Bash:
join -a 1 <(sort -k1,1 file2) <(sed -E -e 's/([^[:space:]])\(/\1 (/' file1 | sort -k1,1)
join is called with two file arguments. Each one is the result of a process substitution, written <(...) in the command above, which Bash exposes to join as an anonymous pipe or FIFO.
The second argument file contains a modified version of file1, having a space inserted before the first ( on each line so as to create a proper space-delimited join field; this modification is done by sed. The contents of both argument files are sorted based on the join field (i.e. field 1), as required by join. The -a 1 option ensures that non-pairable lines in file2 (i.e. the first argument file) are also output.
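The same idea can be tried end to end on a cut-down copy of the sample data. The following is only a sketch under a few assumptions: it uses temporary files instead of process substitution so that it also runs in a plain POSIX shell, it simplifies the sed expression to "insert a space before the first ( on the line", and it pins the collation to the C locale so that sort and join agree on the key order. The temporary directory and the variable name merged are illustrative, not part of the original answer.

```shell
#!/bin/sh
# Sketch only: file names and the temporary directory are illustrative.
tmp=$(mktemp -d)

# First file: the key runs straight into "(" with no separating space.
cat > "$tmp/file1" <<'EOF'
sp|QBWMM1-2|PDCI_Mouse(741-770) 8864=mil (25/2.20/2.50)
sp|Pm345|Hisf_Mouse(613-640) 776=mil (25/2.20/2.50)
EOF

# Second file: the key is followed by a space and a description.
cat > "$tmp/file2" <<'EOF'
sp|QBWMM1-2|PDCI_Mouse complex subunit alpha OS=Mouse OX=90009 PE=1 SV=1
sp|Pm345|Hisf_Mouse Heat shock 70 kDa protein 1A OS=Mouse OX=90009 PE=1 SV=1
sp|O15012|SCCC_Mouse Protein transport protein Sec16A OS=MOUSE OX=90009 PE=1 SV=4
EOF

# Use one fixed collation so that sort and join agree on key order.
LC_ALL=C
export LC_ALL

# Insert a space before the first "(" so the key becomes field 1, then sort.
sed 's/(/ (/' "$tmp/file1" | sort -k1,1 > "$tmp/file1.sorted"
sort -k1,1 "$tmp/file2" > "$tmp/file2.sorted"

# -a 1 also prints lines of the first argument (file2) that have no partner.
merged=$(join -a 1 "$tmp/file2.sorted" "$tmp/file1.sorted")
printf '%s\n' "$merged"

rm -rf "$tmp"
```

Note that the merged lines come out sorted by key rather than in the original order of file2, because join requires sorted input. If the original order matters, a different tool (for example an awk lookup table) would be needed.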
edited Jan 4 at 7:51
answered Jan 4 at 5:46 by ozzy