How can I copy the names of DNA sequences from a phylogenetic tree file and append the species names to them?
























From a file that looks like this:




(AJirio:0.00230,(AJama.1.1:0.00171,(AJkago.1:0.00057,AJtok:0.00033)1.00 :0.00080)0.94:0.00085,Atab.1.1.1:0.27697);




I need to produce this file:




AJirio"AJirio" 
AJama.1.1"AJama"
AJkago.1"AJkago"
AJtok"AJtok"
Atab.1.1.1"Atab"



So basically, I want to extract the name of each DNA sequence from the phylogenetic tree and append the species name (AJirio, AJama, ...) to it in quotation marks.

















      asked Mar 6 at 14:28









      DavidB





















          1 Answer



          accepted










          Two approaches:



          Awk approach:



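          awk -v RS=',' -F':' '{
              sub(/^\(+/, ""); dna = $1;        # strip leading "(", keep the full sequence name
              gsub(/[^a-zA-Z]/, "", $1);        # letters only: the species name
              printf "%s\042%s\042\n", dna, $1  # \042 is a double quote
          }' file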



          sed approach:



          sed -En 's/\(*(([a-zA-Z]+)[^:]*):[^,]+/\1"\2"/g; s/,/\n/gp' file



          The output:



          AJirio"AJirio"
          AJama.1.1"AJama"
          AJkago.1"AJkago"
          AJtok"AJtok"
          Atab.1.1.1"Atab"
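
          To try the awk approach without creating a file first, here is a minimal sketch that wraps it in a shell function and feeds it the sample tree from the question. The function name `label_leaves` and the `tree` variable are made up for illustration, and the sketch assumes that sequence names contain no commas or colons.

```shell
#!/bin/sh
# label_leaves is a hypothetical helper name, not part of any tool.
# It reads a Newick tree on stdin and prints: <sequence name>"<species name>"
label_leaves() {
    awk -v RS=',' -F':' '{
        sub(/^\(+/, "")                  # drop any run of leading "("
        dna = $1                         # full sequence name, e.g. AJama.1.1
        species = dna
        gsub(/[^a-zA-Z]/, "", species)   # keep letters only, e.g. AJama
        printf "%s\"%s\"\n", dna, species
    }'
}

# Sample tree copied from the question.
tree='(AJirio:0.00230,(AJama.1.1:0.00171,(AJkago.1:0.00057,AJtok:0.00033)1.00 :0.00080)0.94:0.00085,Atab.1.1.1:0.27697);'
printf '%s\n' "$tree" | label_leaves
```

          Splitting on commas works here because every comma in the tree is followed either by a sequence name or by opening parentheses, so each record starts with the name once those parentheses are stripped.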



























          • Thank you! It works on this file, but most of my files have more parentheses. For example, ((((AJirio.1.1 becomes (((AJirio.1.1"AJirio", whereas I would like AJirio.1.1"AJirio". Do you know how to modify your script to do this?
            – DavidB
            Mar 6 at 15:39











          • @DavidB, yes, use my update
            – RomanPerekhrest
            Mar 6 at 15:55










          • Great, thank you very much
            – DavidB
            Mar 6 at 16:00










          • @DavidB, you're welcome
            – RomanPerekhrest
            Mar 6 at 16:02
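
          For the deeper nesting raised in the comments, a quick way to check the behaviour is to run a tree with several opening parentheses through a small pipeline. The sketch below is an alternative built from `tr` and `sed`, not the answer's own code. The function name `label_leaves_nested` and the sample tree are invented for illustration, and the species name is assumed to be the leading run of letters in the sequence name.

```shell
#!/bin/sh
# label_leaves_nested is a hypothetical helper name.
# Assumption: the species name is the leading run of letters of each sequence name.
label_leaves_nested() {
    tr ',' '\n' |                         # one sequence per line
    sed -e 's/^(*//' \
        -e 's/:.*//' \
        -e '/^$/d' \
        -e 's/^\([A-Za-z]*\).*/&"\1"/'
    # sed steps: strip every leading "(", cut from the first ":" onwards,
    # skip empty lines, then append the quoted species name.
}

# Invented tree with four levels of nesting, for illustration only.
nested='((((AJirio.1.1:0.1,AJtok:0.2)0.9:0.3,AJama.1.1:0.2)0.8:0.1,AJkago.1:0.3)1.00:0.2,Atab.1.1.1:0.4);'
printf '%s\n' "$nested" | label_leaves_nested
```

          The first sed expression removes the whole run of opening parentheses rather than a single one, which is the behaviour asked for in the comment.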
















































          edited Mar 6 at 15:55

























          answered Mar 6 at 14:44









          RomanPerekhrest
























           
