Running multiple copies of the same file in parallel with different inputs using a shell script
Suppose I have a file "Analysis.C" which takes a data file as input. The data files are named "a.00001.txt" through "a.01000.txt". One way to loop over all the files is to write a shell script that uses sed to change the input file name in "Analysis.C" on each iteration, from 00001 to 01000. However, that processes one input file at a time.
What I want is to run multiple instances of "Analysis.C" in parallel, each taking a different input (the constraint being the number of cores I can spare on my PC, I suppose), all executing at the same time. How do I do that?
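For concreteness, the serial sed-based loop described above might look like the sketch below. The stand-in contents of Analysis.C and the commented-out run step are assumptions; substitute however the macro is actually compiled or run (e.g. via ROOT).

```shell
# Stand-in for the real macro, so the sketch is runnable end to end.
printf 'TFile f("a.00001.txt");\n' > Analysis.C

# One file at a time: rewrite the input name with sed, then run the copy.
for i in $(seq -w 1 1000); do
    sed "s/a\.[0-9][0-9]*\.txt/a.0$i.txt/" Analysis.C > "Analysis_$i.C"
    # run "Analysis_$i.C" here (e.g. root -b -q "Analysis_$i.C")
done
```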
linux shell-script shell gnu-parallel
asked Aug 31 at 5:54 by Diptanil Roy
2 Answers
Accepted answer:
With GNU Parallel you can do this:
parallel analysis.C ::: *.txt
Or, if you have really many .txt files:
printf '%s\0' *.txt | parallel -0 analysis.C
By default it runs one job per CPU thread. This can be adjusted with -j20 for 20 jobs in parallel.
Unlike the moreutils parallel solution, you can post-process the output: the output is serialized, so you will never see output from two jobs mixed together.
GNU Parallel is a general parallelizer and makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to.
If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU.
GNU Parallel instead spawns a new process when one finishes, keeping the CPUs active and thus saving time.
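If GNU Parallel is not available, the same keep-the-CPUs-busy scheduling can be approximated with plain xargs -P. A sketch (here echo stands in for whatever command actually runs Analysis.C on one data file):

```shell
# Feed one filename per line to xargs; -P4 keeps up to 4 jobs running,
# starting a new one as soon as one finishes. Replace echo with the
# actual per-file analysis command.
seq -f 'a.%05g.txt' 1 8 | xargs -n1 -P4 sh -c 'echo "processing $0"'
```

Note that unlike GNU Parallel, xargs does not serialize output, so lines from concurrent jobs may interleave.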
Installation
For security reasons you should install GNU Parallel with your package manager, but if GNU Parallel is not packaged for your distribution, you can do a personal installation, which does not require root access. It can be done in 10 seconds by doing this:
(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash
For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README
Learn more
See more examples: http://www.gnu.org/software/parallel/man.html
Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html
Read the book: https://doi.org/10.5281/zenodo.1146014
Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel
answered Aug 31 at 7:10 by Ole Tange
See the parallel command (from the moreutils package in many distros). From the man page:
parallel runs the specified command, passing it a single one of the specified arguments. This is repeated for each argument. Jobs may be run in parallel. The default is to run one job per CPU.
So:
parallel analysis.C -- a.0????.txt
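If neither GNU Parallel nor the moreutils parallel is installed, plain shell job control gives a cruder version of the same idea. A sketch (echo stands in for the real analysis command; waiting batch-wise is less efficient than true job slots, since the whole batch must finish before the next one starts):

```shell
N=4   # number of jobs to run at once
i=0
for f in a.0*.txt; do
    echo "processing $f" &         # replace echo with the real analysis command
    i=$((i+1))
    [ $((i % N)) -eq 0 ] && wait   # wait for the whole batch before starting more
done
wait                               # wait for any stragglers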
edited Aug 31 at 6:33
answered Aug 31 at 6:24 by xenoid