Extract the pattern of Directory in GNU parallel

I am running a command-line program on multiple sample folders. Each folder contains files matching *fastq.gz.

Below is an example of one folder:



Sample_EC_only/EC_only_S1_L005_I1_001.fastq.gz
Sample_EC_only/EC_only_S1_L005_R1_001.fastq.gz
Sample_EC_only/EC_only_S1_L005_R2_001.fastq.gz
Sample_EC_only/EC_only_S1_L006_I1_001.fastq.gz
Sample_EC_only/EC_only_S1_L006_R1_001.fastq.gz



I am trying to run this with GNU parallel for several programs, but I am having trouble extracting the "ID" of the folder.

parallel -j $NSLOTS --xapply \
" echo {1}
/home/rob2056/software/cellranger-2.2.0/cellranger count --id = "basename {1}"
--transcriptome=$ref_data
--fastqs={1}
" ::: $TMPDIR/FASTQ/Sample*


I want to extract, for example, "Sample_EC_only" from the folder path inside GNU parallel. --fastqs gets the path correctly from {1}, but I cannot get the --id option to work. I have tried various ways of extracting a pattern from the paths in {1}, but none of them work.

The --id parameter needs a name extracted from the path in {1} so that it can create an output directory.

Each {1} is a path such as the following (shown for one sample only):



/tmp/FASTQ/Sample_EC_only










linux gnu-parallel






edited Sep 10 at 20:03 by Rui F Ribeiro
asked Sep 10 at 20:01 by Ron











  • While GNU Parallel will work on multiline commands, it is recommended to define a bash function and call that instead. gnu.org/software/parallel/… In your case the " around basename {1} will confuse the average reader because they do not match each other: the first one ends the " from the line before, and the second one is ended by the " before :::.
    – Ole Tange
    Sep 11 at 20:53


























1 Answer
accepted










If I understand you correctly, all you are looking for is {1/} instead of {1}. It is the "basename" of the argument. See man parallel_tutorial and its discussion of --rpl, which says that the replacement strings are implemented as

 --rpl '{/} s:.*/::'

and that the positional replacement strings can also be modified using / etc.
So {1/} is like removing all characters up to the final /.
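
To see what that substitution does to a single path, the same s:.*/:: expression can be tried outside GNU parallel. This is only a sketch: it uses sed instead of the perl that parallel runs internally (the expression happens to be valid in both), and strip_dir is a made-up helper name, not something parallel provides.

```shell
# strip_dir is a hypothetical helper, not a GNU parallel feature.
# It applies the same substitution as the basename replacement string:
# delete everything up to and including the last slash.
strip_dir() {
    printf '%s\n' "$1" | sed 's:.*/::'
}

strip_dir /tmp/FASTQ/Sample_EC_only   # prints Sample_EC_only
```

Because .* is greedy, the match runs to the last slash in the path, which is why only the final directory name is left.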




You can create your own shorthand replacement strings using --rpl followed by a string which begins with a tag ({/} in the example above), then a perl expression, such as the substitute command above (s:pattern:replacement:).



I'm not sure what is allowed as tags, but we can use the tutorial example {..} for a positional tag, i.e. one that can be used with a number. The perl expression to remove everything up to the last / followed by the word "Sample_" would be s:.*/Sample_::, so you need to add these arguments before --xapply:

--rpl '{..} s:.*/Sample_::'

and then use --id={1..} to apply this replacement to arg 1.
If, for example, you want to remove everything up to the first underscore _ of the directory name, rather than the fixed word Sample, you can use a pattern such as

--rpl '{..} s:.*/[^_]*_::'
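
The difference between the two patterns is easiest to see on concrete paths. The sketch below runs both substitutions through sed as a stand-in for the perl inside parallel. The helper names and the Run7_EC_only directory are invented for illustration.

```shell
# Hypothetical helpers that apply the two substitutions with sed.
strip_fixed_prefix() {
    # removes the directory part plus the literal prefix "Sample_"
    printf '%s\n' "$1" | sed 's:.*/Sample_::'
}

strip_to_first_underscore() {
    # removes the directory part plus whatever precedes the first "_"
    printf '%s\n' "$1" | sed 's:.*/[^_]*_::'
}

strip_fixed_prefix /tmp/FASTQ/Sample_EC_only          # prints EC_only
strip_to_first_underscore /tmp/FASTQ/Sample_EC_only   # prints EC_only
strip_fixed_prefix /tmp/FASTQ/Run7_EC_only            # no match: path printed unchanged
strip_to_first_underscore /tmp/FASTQ/Run7_EC_only     # prints EC_only
```

The fixed-prefix version leaves a path untouched when the directory does not start with Sample_, so the more general pattern is probably the safer choice if the naming varies.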



The final command should look something like this:



parallel -j $NSLOTS --rpl '{..} s:.*/Sample_::' --xapply \
" echo {1}
/home/rob2056/software/cellranger-2.2.0/cellranger count --id={1/}
--id2={1..}
--transcriptome=$ref_data
--fastqs={1}
" ::: $TMPDIR/FASTQ/Sample*

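Another route, along the lines of the comment under the question that recommends a shell function, is to extract the names with ordinary parameter expansion and let parallel call the function. This is a rough sketch, assuming bash. run_sample is a made-up name, the ref_data value is a placeholder, and the function only prints the cellranger command instead of running it.

```shell
# run_sample is a hypothetical wrapper; it prints the command (dry run)
# instead of executing cellranger.
run_sample() {
    dir=${1%/}            # drop a trailing slash, if present
    id=${dir##*/}         # directory name, e.g. Sample_EC_only
    short=${id#Sample_}   # name without the Sample_ prefix, e.g. EC_only
    echo "cellranger count --id=$id --transcriptome=$ref_data --fastqs=$dir"
    echo "short name: $short"
}

ref_data=/path/to/refdata   # placeholder value, not from the original post
run_sample /tmp/FASTQ/Sample_EC_only
```

In bash, something like export -f run_sample; export ref_data; parallel -j $NSLOTS run_sample ::: $TMPDIR/FASTQ/Sample* should then call it once per directory, which avoids nested quotes inside the command string.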




edited Sep 14 at 15:34
answered Sep 11 at 18:04 by meuh











  • Also, what would be the easiest way to extract "EC_only" from the pattern?
    – Ron
    Sep 11 at 22:45










  • See my updated answer.
    – meuh
    Sep 12 at 7:54










  • I want to add another parameter to my command, e.g. id2, which takes the "EC_only" pattern from {1} (the folder name in my case). I tried the method you suggested, but I am not sure whether it works: --id2= {/} s:.*/Sample_::
    – Ron
    Sep 13 at 22:43










  • I added what I think you want as a final command at the end of my answer above. You need to add a --rpl which specifies the tag and the substitution command, then use the tag later on in the command.
    – meuh
    Sep 14 at 15:36










  • The tag does not work: --id2={1..}. The command won't run. (I included --rpl '{..} s:.*/Sample_::' in the parallel call.)
    – Ron
    Sep 14 at 19:53
















