Submitting HPC jobs within an HPC job
I have a large script that relies on input arguments (parsed with getopts). One of these arguments is a directory containing files (all named *.bam). The script has two parts:
- Part 1: based on the input *.bam files, calculate one specific number. To be clear, the result is a single number, NOT one number per file.
- Part 2: using the number found in Part 1, perform a series of operations on each *.bam file.
Originally, Part 1 was computationally very quick, so my setup was:
- Run the script from the terminal: bash script.sh
- Within script.sh, for Part 2, submit an HPC job for each file
However, now that I need to analyse many more files than originally planned, I am realising that Part 1 will also be computationally heavy, so I need to run it on the HPC as well.
So my question is:
- Is it possible to submit an HPC job which itself submits jobs?
- In other words, can I submit script.sh as a job and still have it submit jobs in its Part 2?
To be clear, here is an example of what my script might look like:
#!/usr/bin/bash

# PART 0: accept all input arguments
USAGE() {
    echo "Usage: bash $0 [-b <in-bam-files-dir>] [-o <out-dir>] [-c <chromlen>]" 1>&2
    exit 1
}

if (($# == 0)); then
    USAGE
fi

# Use getopts to accept each argument
while getopts ":b:o:c:h" opt
do
    case $opt in
        b ) BAMFILES=$OPTARG
            ;;
        o ) OUTDIR=$OPTARG
            ;;
        c ) CHROMLEN=$OPTARG
            ;;
        h ) USAGE
            ;;
        \? ) echo "Invalid option: -$OPTARG, exiting" >&2
            exit 1
            ;;
        : ) echo "Option -$OPTARG requires an argument" >&2
            exit 1
            ;;
    esac
done

# PART 1: calculate the unique number
NUMBER=0
for i in "$BAMFILES"/*.bam
do
    : # ... make some calculations on each file to obtain a number ...
      # ... keep only the smallest number found and assign its value to NUMBER ...
done
echo "Final number is $NUMBER"

# PART 2: using the $NUMBER found above, submit a job for each *.bam file
for i in "$BAMFILES"/*.bam
do
    SAMPLE=$(basename "$i" .bam)   # derive the sample name from the BAM file name
    if [ ! -f "$OUTDIR/$SAMPLE.bw" ]
    then
        command="command -options -b $NUMBER $i"
        echo "$command" | qsub -V -cwd -o "$OUTDIR" -e "$OUTDIR" -l tmem=6G -l h_vmem=6G -l h_rt=3600 -N "result_$SAMPLE"
    fi
done
Tags: jobs, getopts, cluster-ssh, qsub
asked Aug 21 at 16:55 by m93
edited Aug 21 at 23:27 by Jeff Schaller
1 Answer (accepted)
The answer is "it depends". Your HPC cluster may be set up so that its execute nodes are able to submit jobs, but this isn't a requirement. A quick question to your local HPC admin should give you a definitive answer. Or you could try a quick script that does nothing but submit a second job and see if it works.
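To illustrate that second suggestion, here is a minimal sketch of such a probe, assuming a Grid Engine-style scheduler (the question already uses qsub with -V and -cwd); the script name test_submit.sh and the job names are purely illustrative:

#!/usr/bin/bash
# test_submit.sh - submitted as a job itself, it only tries to submit one
# trivial inner job. If the inner qsub call succeeds, the execute nodes on
# this cluster are allowed to submit jobs.
echo "outer job running on $(hostname)"
if echo 'echo "inner job ran on $(hostname)"' | qsub -V -cwd -N inner_test
then
    echo "inner submission accepted - execute nodes can submit jobs"
else
    echo "inner submission failed - execute nodes probably cannot submit jobs" >&2
fi

Submit the probe the same way the question submits its per-file jobs, for example echo "bash test_submit.sh" | qsub -V -cwd -N outer_test, then check qstat and the job output files. If inner_test gets queued and runs, submitting script.sh itself as a job, with its Part 2 qsub loop unchanged, should work the same way on this cluster.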
answered Aug 21 at 17:00 by Doug O'Neal