rearrange primer3 boulder IO output

I'm trying to rearrange primer3_core output.



For example:



SEQUENCE_ID=ID_1
PRIMER_LEFT_0_SEQUENCE=ACGTGTAGCGGTTCAGACG
PRIMER_RIGHT_0_SEQUENCE=ACCATGCATGATCCATCCAGG
PRIMER_LEFT_1_SEQUENCE=CACAGCCACAGCAGCACAC
PRIMER_RIGHT_1_SEQUENCE=ATGCAGGTGATCAAGTTACGCC
=
SEQUENCE_ID=ID_2
PRIMER_LEFT_0_SEQUENCE=CACAGCCACAGCAGCACAC
PRIMER_RIGHT_0_SEQUENCE=GCAGGTGATCAAGTTACGCCATT
=


Each ID may produce a different number of primers, anywhere from 0 to 20.

The desired output would look like this:



ID_1 ACGTGTAGCGGTTCAGACG
ID_1 ACCATGCATGATCCATCCAGG
ID_1 CACAGCCACAGCAGCACAC
ID_1 ATGCAGGTGATCAAGTTACGCC
ID_2 CACAGCCACAGCAGCACAC
ID_2 GCAGGTGATCAAGTTACGCCATT






text-processing scripting bioinformatics






edited Oct 2 '17 at 15:17 by terdon♦

asked Oct 2 '17 at 13:53 by agp5432







  • By the way, we also have a dedicated Bioinformatics site. Come check it out!
    – terdon♦
    Oct 2 '17 at 15:17

  • If you're happy with one or several of the answers, upvote them. If one is solving your issue, accepting it would be the best way of saying "Thank You!" :-)
    – Kusalananda
    Oct 5 '17 at 21:56













3 Answers
awk -F= '$0 ~ "^SEQUENCE" { SEQ = $2 } $0 !~ "^SEQUENCE" && $2 != "" { print SEQ" "$2 }' filename


Use awk with = as the field delimiter. When a line begins with SEQUENCE, set the SEQ variable to the second field. For every other line with a non-empty second field (this skips the bare = record terminators), print SEQ followed by that second field.
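
The same idea can be wrapped in a small shell function. This is only a sketch under some assumptions: the name `primers_by_id` is made up for illustration, and the stricter key pattern assumes that unfiltered primer3 output also carries other tags (for example PRIMER_LEFT_0_TM) that should not be printed. An ID with zero primers simply produces no lines.

```shell
# primers_by_id is a hypothetical helper name, not part of primer3.
# It reads Boulder-IO records from the given files, or from stdin if none are given.
primers_by_id() {
    awk -F= '
        # remember the ID of the current record
        $1 == "SEQUENCE_ID" { id = $2; next }
        # print only left/right primer sequence lines, prefixed with that ID
        $1 ~ /^PRIMER_(LEFT|RIGHT)_[0-9]+_SEQUENCE$/ { print id, $2 }
    ' "$@"
}
```

It could then be called as `primers_by_id file`, or placed at the end of a pipeline that starts with primer3_core.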

















answered Oct 2 '17 at 14:22 by Raman Sailopal






















Awk approach:

awk -F'=' '/^SEQUENCE_ID/{ s = $2 } /^PRIMER/{ print s, $2 }' file

The output:

ID_1 ACGTGTAGCGGTTCAGACG
ID_1 ACCATGCATGATCCATCCAGG
ID_1 CACAGCCACAGCAGCACAC
ID_1 ATGCAGGTGATCAAGTTACGCC
ID_2 CACAGCCACAGCAGCACAC
ID_2 GCAGGTGATCAAGTTACGCCATT














edited May 6 at 7:20

answered Oct 2 '17 at 15:08 by RomanPerekhrest




















Using a sed script:

# delete lines starting with '='
/^=/d

# handle sequence ID lines
/^SEQUENCE_ID=/{
    # remove everything up to and including the '='
    s///
    # put the sequence ID in the hold space
    h
    # delete the pattern space and continue with next line
    d
}

# handle primer lines
/^PRIMER.*=/{
    # remove everything up to and including the '='
    s///
    # append a newline and the sequence ID from the hold space to the pattern space
    G
    # swap the two bits of the pattern space around, replacing the newline with a space
    s/^\(.*\)\n\(.*\)$/\2 \1/
}

Testing it:

$ sed -f script.sed file
ID_1 ACGTGTAGCGGTTCAGACG
ID_1 ACCATGCATGATCCATCCAGG
ID_1 CACAGCCACAGCAGCACAC
ID_1 ATGCAGGTGATCAAGTTACGCC
ID_2 CACAGCCACAGCAGCACAC
ID_2 GCAGGTGATCAAGTTACGCCATT

Without a separate script file:

$ sed -e '/^=/d' -e '/^SEQUENCE_ID=/{s///;h;d;}' -e '/^PRIMER.*=/{s///;G;s/^\(.*\)\n\(.*\)$/\2 \1/;}' file
ID_1 ACGTGTAGCGGTTCAGACG
ID_1 ACCATGCATGATCCATCCAGG
ID_1 CACAGCCACAGCAGCACAC
ID_1 ATGCAGGTGATCAAGTTACGCC
ID_2 CACAGCCACAGCAGCACAC
ID_2 GCAGGTGATCAAGTTACGCCATT

Shorter variant:

$ sed -n -e '/^SEQUENCE_ID=/{s///;h;}' -e '/^PRIMER.*=/{s///;G;s/^\(.*\)\n\(.*\)$/\2 \1/p;}' file
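
To see what the hold-space step does before the final swap, the pattern space can be dumped with sed's `l` command right after `G`. This is only an illustrative sketch: the function name `show_pattern_space` and the two-line sample input are made up for the demonstration, and the prefix is removed with an explicit `s/^[^=]*=//` rather than the empty-regex form.

```shell
# show_pattern_space is a hypothetical name used only for this illustration.
show_pattern_space() {
    # one ID line followed by one primer line
    printf 'SEQUENCE_ID=ID_1\nPRIMER_LEFT_0_SEQUENCE=ACGT\n' |
    sed -n \
        -e '/^SEQUENCE_ID=/{s/^[^=]*=//;h;}' \
        -e '/^PRIMER.*=/{s/^[^=]*=//;G;l;}'
}

show_pattern_space
```

For this sample it prints `ACGT\nID_1$`: the primer sequence, an embedded newline, then the ID appended from the hold space. That is the string the final substitution rearranges into `ID_1 ACGT`.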














edited Oct 2 '17 at 17:23

answered Oct 2 '17 at 17:14 by Kusalananda



























                                 
