Running multiple copies of the same file in parallel with different inputs using a shell script













Suppose I have a file "Analysis.C" which takes a data file as input. The data files are named "a.00001.txt" through "a.01000.txt". One way to loop over all the files is to write a shell script that uses sed to change the input file name in "Analysis.C" on each iteration from 00001 to 01000. However, this processes only one input file at a time.



What I want is to run multiple instances of "Analysis.C" in parallel, each taking a different input file and all executing at the same time (the limit, I suppose, being the number of cores I can spare on my PC). How do I do that?
















      linux shell-script shell gnu-parallel
















      asked Aug 31 at 5:54









      Diptanil Roy





















          2 Answers



                  accepted


















                  With GNU Parallel you can do this:



                  parallel analysis.C ::: *.txt


                  Or, if there are too many .txt files to fit on one command line:



                  printf '%s\0' *.txt | parallel -0 analysis.C


                  By default it runs one job per CPU thread. This can be adjusted with -j, for example -j20 for 20 jobs in parallel.



                  Unlike the parallel from moreutils, GNU Parallel lets you post-process the output: the output is serialized, so you will never see output from two jobs mixed together.



                  GNU Parallel is a general parallelizer and makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to.



                  If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU:



                  [Image: Simple scheduling]



                  GNU Parallel instead spawns a new process when one finishes - keeping the CPUs active and thus saving time:



                  [Image: GNU Parallel scheduling]
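
The "start a new job as soon as one finishes" idea can be tried even without GNU Parallel installed. What follows is only a rough sketch that swaps in xargs -P (an extension found in GNU and BSD xargs), so it is not GNU Parallel itself and has none of its output handling. The script analysis.sh is a hypothetical stand-in for the real analysis, and the eight small input files are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: keep at most 4 jobs running, starting the next one as soon as a
# slot frees up. Uses xargs -P as a stand-in; this is NOT GNU Parallel.

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up input files, named like the ones in the question.
for i in 1 2 3 4 5 6 7 8; do
    printf 'data %s\n' "$i" > "$(printf 'a.%05d.txt' "$i")"
done

# Hypothetical stand-in for the real analysis: takes one file name as its
# only argument and writes the file's byte count to <name>.out.
cat > analysis.sh <<'EOF'
#!/bin/sh
wc -c < "$1" > "$1.out"
EOF

# NUL-separated file names, one file per invocation, at most 4 at a time.
printf '%s\0' a.*.txt | xargs -0 -n 1 -P 4 sh ./analysis.sh

# Count the result files that were produced.
ls a.*.txt.out | wc -l
```

Changing the number after -P plays the same role as -j in GNU Parallel: it caps how many jobs run at once.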



                  Installation



                  For security reasons you should install GNU Parallel with your package manager, but if GNU Parallel is not packaged for your distribution, you can do a personal installation, which does not require root access. It can be done in 10 seconds by doing this:



                  (wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash


                  For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README



                  Learn more



                  See more examples: http://www.gnu.org/software/parallel/man.html



                  Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1



                  Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html



                  Read the book: https://doi.org/10.5281/zenodo.1146014



                  Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel

















                  answered Aug 31 at 7:10









                  Ole Tange





































                          See the parallel command (from the moreutils package in many distros). From the man page:




                          parallel runs the specified command, passing it a single one of the specified
                          arguments. This is repeated for each argument. Jobs may be run in parallel. The default is to run one job per CPU.




                          So:



                          parallel analysis.C -- a.0????.txt
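
The pattern a.0????.txt is narrower than the *.txt used in the other answer, which matters if the directory holds other .txt files. A small sketch of the difference, assuming a scratch directory with three data files and one unrelated file (notes.txt is a made-up name):

```shell
#!/usr/bin/env bash
# Sketch: compare what the two glob patterns expand to.

globdir=$(mktemp -d)
cd "$globdir" || exit 1

# Hypothetical directory contents: three data files plus an unrelated file.
for i in 1 2 1000; do
    : > "$(printf 'a.%05d.txt' "$i")"
done
: > notes.txt

# Each ? matches exactly one character, so only the data files match.
narrow=$(printf '%s\n' a.0????.txt)
# * matches any run of characters, so notes.txt is picked up as well.
broad=$(printf '%s\n' *.txt)

printf 'a.0????.txt matches:\n%s\n' "$narrow"
printf '*.txt matches:\n%s\n' "$broad"
```

In both cases the shell expands the pattern before parallel starts, so parallel only ever sees the resulting list of file names.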














                          edited Aug 31 at 6:33

























                          answered Aug 31 at 6:24









                          xenoid




























                               
