Extract the pattern of Directory in GNU parallel
I am running a command-line program on multiple sample folders. Each folder contains *.fastq.gz files.
Below is an example of one folder.
Sample_EC_only/EC_only_S1_L005_I1_001.fastq.gz
Sample_EC_only/EC_only_S1_L005_R1_001.fastq.gz
Sample_EC_only/EC_only_S1_L005_R2_001.fastq.gz
Sample_EC_only/EC_only_S1_L006_I1_001.fastq.gz
Sample_EC_only/EC_only_S1_L006_R1_001.fastq.gz
I am trying to run this with GNU parallel for several programs, but I am having trouble extracting the "ID" of the folder.
parallel -j $NSLOTS --xapply \
" echo {1} \
/home/rob2056/software/cellranger-2.2.0/cellranger count --id = "basename {1}" \
--transcriptome=$ref_data \
--fastqs={1} \
" ::: $TMPDIR/FASTQ/Sample*
I want to extract, for example, "Sample_EC_only" from the folder path inside GNU parallel. --fastqs gets the path correctly from {1}, but I am having trouble with the --id option. I have tried various ways to extract a pattern from the path in {1}, but none of them work.
The --id parameter needs a name extracted from the path in {1} so that it can create an output directory.
Each {1} looks like this (shown for one sample only):
/tmp/FASTQ/Sample_EC_only
linux gnu-parallel
While GNU Parallel will work on multiline commands, it is recommended to define a bash function and call that instead. gnu.org/software/parallel/… In your case the " characters around basename {1} will confuse the average reader because they do not pair with each other: the first one closes the " opened on the line before, and the second one opens a string that is closed by the " before :::
– Ole Tange
Sep 11 at 20:53
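The function-based approach recommended in this comment could be sketched as follows. This is only a minimal sketch, not the asker's actual pipeline: the function name run_sample is hypothetical, the cellranger command is echoed rather than executed, and the folder name is taken with the shell expansion ${dir##*/} so that no nested quoting is needed.

```shell
# Hypothetical helper: prints the cellranger command for one sample folder.
# Remove the leading "echo" to actually run the command.
run_sample() {
    local dir=$1
    local id=${dir##*/}    # strip everything up to the last "/"
    echo cellranger count --id="$id" \
        --transcriptome="$ref_data" \
        --fastqs="$dir"
}

# Intended call, left commented out so the sketch has no side effects.
# The function and the variable are exported so that the shells started
# by GNU Parallel can see them (export -f is a bash feature):
# export -f run_sample; export ref_data
# parallel -j "$NSLOTS" run_sample ::: "$TMPDIR"/FASTQ/Sample*
```

Calling run_sample on a single folder by hand first is a cheap way to check the composed command line before handing the function to parallel.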
edited Sep 10 at 20:03 by Rui F Ribeiro
asked Sep 10 at 20:01 by Ron
1 Answer (accepted)
If I understand you correctly, all you are looking for is {1/} instead of {1}. It is the "basename" of the argument. See man parallel_tutorial and the discussion of --rpl, where we have that replacement strings are implemented as
--rpl '{/} s:.*/::'
and "The positional replacement strings can also be modified using / etc."
So {1/} is like removing all characters up to the final /.
You can create your own replacement shorthand strings using --rpl followed by a string which begins with a tag ({/} in the example above), then a perl expression, such as the substitute command above (s:pattern:replacement:).
I'm not sure what is allowed as tags, but we can use the tutorial example {..} for a positional tag, i.e. one that can be combined with an argument number. The perl expression to remove everything up to the last / followed by the word "Sample_" would be: s:.*/Sample_::
so you need to add before --xapply the arguments
--rpl '{..} s:.*/Sample_::'
and then use --id={1..} to apply this replacement to arg 1.
If, for example, you want to remove the word up to the first underscore _, rather than a fixed word Sample, you can use a pattern such as
--rpl '{..} s:.*/[^_]*_::'
The final command should look something like this:
parallel -j $NSLOTS --rpl '{..} s:.*/Sample_::' --xapply \
" echo {1} \
/home/rob2056/software/cellranger-2.2.0/cellranger count --id={1/} \
--id2={1..} \
--transcriptome=$ref_data \
--fastqs={1} \
" ::: $TMPDIR/FASTQ/Sample*
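To see what each substitution expression from this answer leaves behind, it can be tried outside GNU Parallel. The sketch below is an approximation under stated assumptions: the helper apply_rpl is hypothetical, and it uses sed rather than perl, which happens to accept these particular s:pattern:replacement: expressions unchanged. It does not exercise --rpl itself.

```shell
# Hypothetical helper: applies one substitution expression to one path.
# sed stands in for the perl expression that --rpl would evaluate.
apply_rpl() {
    printf '%s\n' "$2" | sed -e "$1"
}

path=/tmp/FASTQ/Sample_EC_only

apply_rpl 's:.*/::' "$path"           # basename-style expression
apply_rpl 's:.*/Sample_::' "$path"    # custom tag with the fixed prefix "Sample_"
apply_rpl 's:.*/[^_]*_::' "$path"     # variant that strips up to the first "_"
```

For the example path this prints Sample_EC_only, then EC_only twice. With GNU Parallel installed, adding --dry-run to the real command prints the composed command lines without executing them, which is a direct way to check the replacement strings themselves.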
Also, what would be the easiest way to extract "EC_only" from the pattern?
– Ron
Sep 11 at 22:45
See my updated answer.
– meuh
Sep 12 at 7:54
I want to add another parameter to my command, e.g. id2, which takes the "EC_only" pattern from {1} (the folder name in my case). I tried the method you suggested, but I am not sure whether it works: --id2= / s:.*/Sample_::
– Ron
Sep 13 at 22:43
I added what I think you want as a final command at the end of my answer above. You need to add a --rpl which specifies the tag and the substitution command, then use the tag later on in the command.
– meuh
Sep 14 at 15:36
The tag does not work: --id2={1..}. The command won't run (I included the --rpl '{..} s:.*/Sample_::' in parallel).
– Ron
Sep 14 at 19:53
edited Sep 14 at 15:34
answered Sep 11 at 18:04 by meuh