Copy the names of DNA sequences in a phylogenetic tree file and add the names of the species to them?

From a file that looks like this:

(AJirio:0.00230,(AJama.1.1:0.00171,(AJkago.1:0.00057,AJtok:0.00033)1.00 :0.00080)0.94:0.00085,Atab.1.1.1:0.27697);

I need to obtain this file:

AJirio"AJirio"
AJama.1.1"AJama"
AJkago.1"AJkago"
AJtok"AJtok"
Atab.1.1.1"Atab"

So basically, I want to extract the names of the DNA sequences from a phylogenetic tree and append the species name (AJirio, AJama, ...) in quotation marks to each of them.

asked Mar 6 at 14:28
DavidB

1 Answer (accepted)

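Two approaches:

Awk approach:

    awk -v RS=',' -F':' '{
        sub(/\(*/, ""); dna = $1;          # strip leading "(" and keep the full sequence name
        gsub(/[^a-zA-Z]/, "", $1);         # keep only the letters: the species name
        printf "%s\042%s\042\n", dna, $1   # \042 is a double quote
    }' file

sed approach (GNU sed, since it uses \n in the replacement):

    sed -En 's/\(*(([a-zA-Z]+)[^:]*):[^,]+/\1"\2"/g; s/,/\n/gp' file

The output:

    AJirio"AJirio"
    AJama.1.1"AJama"
    AJkago.1"AJkago"
    AJtok"AJtok"
    Atab.1.1.1"Atab"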
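
If splitting on commas feels fragile, another option is to pull out each leaf label directly. The sketch below is not part of the original answer. `tree_labels` is a hypothetical helper name, and the code assumes that every sequence name starts with a letter, directly follows a `(` or a `,`, and ends at a `:`. It also relies on `grep -o` and `sed -E`, which GNU and BSD tools provide.

```shell
#!/bin/sh
# Hypothetical helper: print every leaf label followed by its leading
# run of letters in double quotes, one label per line.
# Reads the named files, or standard input when no file is given.
tree_labels() {
    # 1. grep -o prints each "(label:" or ",label:" fragment on its own line.
    # 2. The first sed drops the leading delimiter and the trailing colon.
    # 3. The second sed appends the alphabetic prefix in quotes.
    grep -ohE '[(,][A-Za-z][^:,()]*:' "$@" |
        sed -E 's/^[(,]//; s/:$//' |
        sed -E 's/^([A-Za-z]+).*$/&"\1"/'
}

# Example: the tree from the question, fed on standard input.
printf '%s\n' '(AJirio:0.00230,(AJama.1.1:0.00171,(AJkago.1:0.00057,AJtok:0.00033)1.00 :0.00080)0.94:0.00085,Atab.1.1.1:0.27697);' |
    tree_labels
```

Because the pattern only accepts a name that comes right after `(` or `,`, branch support values such as `)0.94:` are skipped, and any number of stacked opening parentheses is ignored.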





• Thank you! It works on this file, but in most of my files I have more parentheses. For example, ((((AJirio.1.1 becomes (((AJirio.1.1"AJirio", and I would like to have AJirio.1.1"AJirio". Do you know how to modify your script to do this?
  – DavidB
  Mar 6 at 15:39

• @DavidB, yes, use my update
  – RomanPerekhrest
  Mar 6 at 15:55

• Great, thank you very much
  – DavidB
  Mar 6 at 16:00

• @DavidB, you're welcome
  – RomanPerekhrest
  Mar 6 at 16:02
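
To check the case raised in the comments, here is a small sketch that feeds a tree with several opening parentheses through the awk command. The input tree is made up for illustration, and it is an assumption that the "update" mentioned above refers to the `\(*` pattern, which removes any number of leading parentheses rather than just one.

```shell
#!/bin/sh
# Hypothetical input: four opening parentheses before the first name.
tree='((((AJirio.1.1:0.00230,AJtok:0.00033)1.00:0.00080)0.94:0.00085,Atab.1.1.1:0.27697));'

result=$(printf '%s\n' "$tree" | awk -v RS=',' -F':' '{
    sub(/\(*/, ""); dna = $1;          # remove every leading "(" from the record
    gsub(/[^a-zA-Z]/, "", $1);         # reduce the name to its letters
    printf "%s\042%s\042\n", dna, $1   # name, then the species in double quotes
}')

printf '%s\n' "$result"
```

The first line printed is AJirio.1.1"AJirio", with no parenthesis left in front of it.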










edited Mar 6 at 15:55
answered Mar 6 at 14:44
RomanPerekhrest