rearrange primer3 boulder IO output
I'm trying to rearrange primer3_core output.
For example:
SEQUENCE_ID=ID_1
PRIMER_LEFT_0_SEQUENCE=ACGTGTAGCGGTTCAGACG
PRIMER_RIGHT_0_SEQUENCE=ACCATGCATGATCCATCCAGG
PRIMER_LEFT_1_SEQUENCE=CACAGCCACAGCAGCACAC
PRIMER_RIGHT_1_SEQUENCE=ATGCAGGTGATCAAGTTACGCC
=
SEQUENCE_ID=ID_2
PRIMER_LEFT_0_SEQUENCE=CACAGCCACAGCAGCACAC
PRIMER_RIGHT_0_SEQUENCE=GCAGGTGATCAAGTTACGCCATT
=
Each ID may produce a different number of primers, anywhere from 0 to 20.
The desired output would look like this:
ID_1 ACGTGTAGCGGTTCAGACG
ID_1 ACCATGCATGATCCATCCAGG
ID_1 CACAGCCACAGCAGCACAC
ID_1 ATGCAGGTGATCAAGTTACGCC
ID_2 CACAGCCACAGCAGCACAC
ID_2 GCAGGTGATCAAGTTACGCCATT
text-processing scripting bioinformatics
Comments:
By the way, we also have a dedicated Bioinformatics site. Come check it out!
- terdon, Oct 2 '17 at 15:17
If you're happy with one or several of the answers, upvote them. If one is solving your issue, accepting it would be the best way of saying "Thank You!" :-)
- Kusalananda, Oct 5 '17 at 21:56
asked Oct 2 '17 at 13:53 by agp5432
edited Oct 2 '17 at 15:17 by terdon
3 Answers
awk -F= '$0 ~ "^SEQUENCE" { SEQ=$2 } $0 !~ "^SEQUENCE" && $2 != "" { print SEQ" "$2 }' filename
Use awk with = as the field delimiter. When a line begins with SEQUENCE, set the SEQ variable to the second field. For every other line that has a non-empty second field, print SEQ followed by that field. The non-empty check skips the lone = lines that terminate each record, which would otherwise be printed as an ID with nothing after it.
Awk approach:
awk -F'=' '/^SEQUENCE_ID/{ s = $2 } /^PRIMER/{ print s, $2 }' file
The output:
ID_1 ACGTGTAGCGGTTCAGACG
ID_1 ACCATGCATGATCCATCCAGG
ID_1 CACAGCCACAGCAGCACAC
ID_1 ATGCAGGTGATCAAGTTACGCC
ID_2 CACAGCCACAGCAGCACAC
ID_2 GCAGGTGATCAAGTTACGCCATT
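The question notes that a record may yield anywhere from 0 to 20 primers. The following is a minimal sketch, assuming the Boulder-IO layout shown in the question, that wraps the same awk idea in a shell function and feeds it a record with no primers. The function name primers_by_id is hypothetical, and the stricter pattern that only accepts tags ending in _SEQUENCE is an assumption intended to skip any other PRIMER_ tags that may appear in fuller output.

```shell
#!/bin/sh
# primers_by_id is a hypothetical helper name, not part of primer3.
# It reads Boulder-IO text on stdin and prints "ID SEQUENCE" pairs.
primers_by_id() {
    awk -F'=' '
        # remember the ID of the record currently being read
        /^SEQUENCE_ID=/ { id = $2 }
        # print one line per primer sequence tag (assumed tag shape)
        /^PRIMER_.*_SEQUENCE=/ { print id, $2 }
    '
}

# Sample input: ID_1 has one primer pair, ID_2 has no primers at all.
primers_by_id <<'EOF'
SEQUENCE_ID=ID_1
PRIMER_LEFT_0_SEQUENCE=ACGTGTAGCGGTTCAGACG
PRIMER_RIGHT_0_SEQUENCE=ACCATGCATGATCCATCCAGG
=
SEQUENCE_ID=ID_2
=
EOF
```

With this input, only the two ID_1 lines are printed. An ID without primers produces no output lines, because nothing is printed until a primer line is seen.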
Using a sed script:
# delete lines starting with '='
/^=/d
# handle sequence ID lines
/^SEQUENCE_ID=/{
    # remove everything up to and including the '='
    s///
    # put the sequence ID in the hold space
    h
    # delete the pattern space and continue with next line
    d
}
# handle primer lines
/^PRIMER.*=/{
    # remove everything up to and including the '='
    s///
    # append a newline and the sequence ID from the hold space to the pattern space
    G
    # swap the two bits of the pattern space around, deleting the newline
    s/^\(.*\)\n\(.*\)$/\2 \1/
}
Testing it:
$ sed -f script.sed file
ID_1 ACGTGTAGCGGTTCAGACG
ID_1 ACCATGCATGATCCATCCAGG
ID_1 CACAGCCACAGCAGCACAC
ID_1 ATGCAGGTGATCAAGTTACGCC
ID_2 CACAGCCACAGCAGCACAC
ID_2 GCAGGTGATCAAGTTACGCCATT
Without a separate script file:
$ sed -e '/^=/d' -e '/^SEQUENCE_ID=/{s///;h;d;}' -e '/^PRIMER.*=/{s///;G;s/^\(.*\)\n\(.*\)$/\2 \1/;}' file
ID_1 ACGTGTAGCGGTTCAGACG
ID_1 ACCATGCATGATCCATCCAGG
ID_1 CACAGCCACAGCAGCACAC
ID_1 ATGCAGGTGATCAAGTTACGCC
ID_2 CACAGCCACAGCAGCACAC
ID_2 GCAGGTGATCAAGTTACGCCATT
Shorter variant:
$ sed -n -e '/^SEQUENCE_ID=/{s///;h;}' -e '/^PRIMER.*=/{s///;G;s/^\(.*\)\n\(.*\)$/\2 \1/p;}' file
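To make the hold-space steps concrete, here is a small sketch that runs the shorter variant on a two-line sample. The wrapper name primers_by_id_sed is hypothetical and exists only so the command can be reused. The braces and backslashes are written out in full, on the assumption that they are what the sed approach requires.

```shell
#!/bin/sh
# primers_by_id_sed is a hypothetical wrapper name used for illustration.
# For each primer line it prints "ID SEQUENCE", using the hold space
# to carry the most recent sequence ID forward.
primers_by_id_sed() {
    sed -n -e '/^SEQUENCE_ID=/{s///;h;}' \
           -e '/^PRIMER.*=/{s///;G;s/^\(.*\)\n\(.*\)$/\2 \1/p;}'
}

# Two-line sample: one ID line followed by one primer line.
printf '%s\n' 'SEQUENCE_ID=ID_1' 'PRIMER_LEFT_0_SEQUENCE=ACGT' | primers_by_id_sed
```

On the ID line, the empty s/// reuses the address pattern and leaves ID_1, which h copies to the hold space. On the primer line, the pattern space becomes ACGT, G appends a newline plus ID_1, and the final substitution swaps the two halves and prints ID_1 ACGT.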
First awk answer: answered Oct 2 '17 at 14:22 by Raman Sailopal
Second awk answer: answered Oct 2 '17 at 15:08 by RomanPerekhrest, edited May 6 at 7:20
sed answer: answered Oct 2 '17 at 17:14 by Kusalananda, edited Oct 2 '17 at 17:23