GNU parallel vs & (I mean background) vs xargs -P
I'm confused about the difference or advantage (if any) of running a set of tasks in a .sh script using GNU parallel, e.g. Ole Tange's answer:

    parallel ./pngout -s0 {} R{} ::: *.png

rather than, say, looping through them and putting them in the background with &, e.g. frostschutz's answer:

    # copied from the link for illustration
    for stuff in things
    do
        ( something
          with
          stuff ) &
    done
    wait # for all the something with stuff

In short, are they just syntactically different or practically different? And if practically different, when should I use each?
shell-script background-process xargs gnu-parallel
edited Sep 11 '17 at 10:58 by Jeff Schaller
asked Dec 12 '13 at 0:08 by Stephen Henderson
1 Answer (accepted)
Putting multiple jobs in the background is a good way of using the multiple cores of a single machine. parallel, however, allows you to spread jobs across multiple servers of your network. From man parallel:
> GNU parallel is a shell tool for executing jobs in parallel using one or more computers. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables.
Even when running on a single computer, parallel gives you far greater control over how your jobs are parallelized. Take this example from the man page:
> To convert *.wav to *.mp3 using LAME running one process per CPU core run:

    parallel lame {} -o {.}.mp3 ::: *.wav

OK, you could do the same with

    for i in *.wav; do lame "$i" -o "${i%.wav}.mp3" & done

However, that is longer and more cumbersome and, more importantly, will launch as many jobs as there are .wav files. If you run this on a few thousand files, it is likely to bring a normal laptop to its knees. parallel, on the other hand, will launch one job per CPU core and keep everything nice and tidy.
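
If GNU parallel is not available, the plain loop can be throttled by hand. The following is a minimal sketch, not something from the answer: it starts jobs in batches and waits for each batch before starting the next. The names max_jobs, convert_one and run_batched are made up for illustration, and convert_one only prints the output name that lame would be given, stripping the .wav suffix with the ${1%.wav} expansion.

```shell
#!/bin/sh
# Sketch: throttle a plain background loop by running jobs in batches.
# max_jobs, convert_one and run_batched are illustrative names only.

max_jobs=4

# Hypothetical stand-in for `lame "$1" -o "${1%.wav}.mp3"`:
# it only prints the output name it would write.
convert_one() {
    echo "${1%.wav}.mp3"
}

run_batched() {
    running=0
    for f in "$@"; do
        convert_one "$f" &
        running=$((running + 1))
        # Once a full batch is in flight, wait for all of it to finish.
        if [ "$running" -ge "$max_jobs" ]; then
            wait
            running=0
        fi
    done
    wait    # collect the last, possibly partial, batch
}

# Jobs finish in no guaranteed order, so sort the output for display.
result=$(run_batched a.wav b.wav c.wav d.wav e.wav | sort)
echo "$result"
```

Batching leaves cores idle while the slowest job of a batch finishes, whereas parallel (or xargs -P) starts a new job as soon as a slot frees up.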
Basically, parallel offers you the ability to fine-tune how your jobs are run and how much of the available resources they should use. If you really want to see the power of this tool, go through its manual or, at the very least, the examples it offers.
Simple backgrounding really has nowhere near the level of sophistication to be compared to parallel. As for how parallel differs from xargs, the GNU crowd give a nice breakdown in the parallel documentation. Some of the more salient points are:
- xargs deals badly with special characters (such as space, ' and ").
- xargs can run a given number of jobs in parallel, but has no support for running number-of-cpu-cores jobs in parallel.
- xargs has no support for grouping the output, therefore output may run together, e.g. the first half of a line is from one process and the last half of the line is from another process.
- xargs has no support for keeping the order of the output, therefore if running jobs in parallel using xargs the output of the second job cannot be postponed till the first job is done.
- xargs has no support for running jobs on remote computers.
- xargs has no support for context replace, so you will have to create the arguments.
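
As a rough comparison for the points above, here is a sketch of an xargs invocation that copes with awkward file names and sizes the job count to the machine. It assumes a find and xargs that support -print0, -0 and -P (the GNU and BSD versions do, though POSIX does not require them) and a getconf that knows _NPROCESSORS_ONLN. The echo command stands in for the real converter, and the temporary file names are invented for the demo.

```shell
#!/bin/sh
# Sketch only: -print0, -0 and -P are common extensions, not POSIX.

# Number of online cores, falling back to 2 if getconf cannot tell us.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 2)

# Invented test files, including a space and a quote in the names.
tmpdir=$(mktemp -d)
touch "$tmpdir/one.wav" "$tmpdir/two words.wav" "$tmpdir/it's.wav"

# NUL-delimited names bypass xargs' own quote and whitespace parsing.
# `echo converted` is a stand-in for the real converter.
out=$(find "$tmpdir" -name '*.wav' -print0 |
      xargs -0 -P "$cores" -I {} echo "converted {}" | sort)
printf '%s\n' "$out"

rm -rf "$tmpdir"
```

The output is sorted afterwards only to make the demo stable; xargs itself neither groups nor orders the output of concurrent jobs, as the list notes.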
edited Aug 10 at 19:12 by Ole Tange
answered Dec 12 '13 at 4:09 by terdon
Comments:
That's a good answer, thx. It sort of confirms what I guessed. I hate the parallel syntax, yet another new brand of keyboard-faceroll to memorise. But I guess the auto balancing across cores/jobs is worth it...?
- Stephen Henderson, Dec 12 '13 at 8:02
Have a look at sem, which is part of the GNU Parallel package. That might suit your syntax requirements better.
- Ole Tange, Dec 12 '13 at 10:53
@OleTange thx, good call
- Stephen Henderson, Dec 12 '13 at 11:37
> xargs has no support for context replace, so you will have to create the arguments.
What does this mean? Isn't it xargs -I %?
- raine, Feb 18 '16 at 11:00
It's true that parallel is more powerful than xargs, but that comparison is rather biased. For example, xargs supports null-terminated strings as input to avoid problems with spaces and quotes, and can also use -d to emulate parallel (even mentioned in the comparison!). xargs -I is sufficient context replacement for most simple cases, and I usually know the number of cores on the machine. I never experienced a problem with ungrouped output.
- Sam Brightman, Aug 26 '16 at 10:10