Combine large amounts of files inside a directory
I have a large number of files within a single directory, currently 10,804 files. The number of files can fluctuate between 5 and 100,000.
I am looking for a way to combine every 250 separate files into one large file, with the remainder in a smaller file.
For example, with 1,200 files I want 4 files containing 250 files each and 1 file containing the remaining 200.
I am using the bash shell.
bash cat
edited Feb 28 at 2:48 by Isaac
asked Feb 8 at 19:23 by gizmo
You want to concatenate groups of the files into big files, or you want to move groups of the files into subdirectories?
– L. Scott Johnson
Feb 8 at 19:37
... or are you looking at creating tar archives of the files?
– Kusalananda
Feb 8 at 19:40
I want to cat groups of files into big files
– gizmo
Feb 8 at 19:43
The overall goal is to have small files to send to the printer. cat * creates a massive file that overflows the print buffers, and sending individual files takes way too long for processing at the printer.
– gizmo
Feb 8 at 19:44
thanks everyone for the help, I was able to find a solution that worked for my needs.
– gizmo
Feb 11 at 16:18
4 Answers
Simply:
    #!/bin/bash
    block_size=10   # files per output file; set to 250 for the case in the question
    self=${0##*/}   # this script's own name, so it is not merged into the output
    count=0         # files written to the current output file so far
    i=1             # number of the current output file
    for j in *; do
        # Only regular files; skip this script and output from this or an earlier run.
        if [ -f "$j" ] && [[ "$j" != outfile* && "$j" != "$self" ]]; then
            cat -- "$j" >> "outfile$i"
            count=$((count + 1))
            if [ "$count" -eq "$block_size" ]; then
                count=0
                i=$((i + 1))
            fi
        fi
    done
Note:
The files are merged in alphabetical order, with the remainder ending up in the last outfile. The script should be placed in, and run from, the directory that contains the files.
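To sanity-check the result, the number of full blocks and the size of the remainder can be worked out with shell arithmetic. This is only a small sketch: batch_plan is a hypothetical helper name, and the default of 250 is the block size from the question.

```shell
# batch_plan TOTAL [SIZE]: print how many full blocks and how many
# left-over files to expect. (Hypothetical helper, for illustration only.)
batch_plan() {
    total=$1
    size=${2:-250}
    echo "$((total / size)) full, $((total % size)) left over"
}

batch_plan 1200     # prints: 4 full, 200 left over
batch_plan 10804    # prints: 43 full, 54 left over
```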
edited Feb 12 at 7:30
answered Feb 9 at 11:44 by Akhil J
Hi I tried this and I am getting the following error : files_count= 10805 + block_size=250 + blocks_count=43 + seq 1 1 43 seq: not found + [ ! -f outfilelast ] + exit 0 $
– gizmo
Feb 11 at 13:04
I updated the answer. Check again. Place the script inside the same folder, or modify the script carefully.
– Akhil J
Feb 12 at 7:30
You can write a straightforward loop to do this with an array and ${x:s:l} parameter expansion:
    files=(*)
    for (( i = 0; i < ${#files[@]}; i += 250 ))
    do
        cat -- "${files[@]:$i:250}" > "file$i.combined"
    done
Here we collect all the (non-hidden) files in . into an array files (file names sorted lexically), and loop, counting from 0 up to however many files there are in steps of 250. For each batch of 250, we expand the filenames (0-249, 250-499, etc.) as arguments to cat and put the output into file0.combined, file250.combined, and so on.
This is just Bash's version of a traditional C-style for loop. Since you're going to have to loop for each separate cat anyway, there's not much point overcomplicating things.
You'll end up with several .combined files at the end. Because the filenames were already expanded, those won't be included in the concatenations again, but if you ran the command a second time they would be. If that's a concern, you could put them somewhere else, delete them afterwards, or, if it's going straight to the printer, even just pipe to lp.
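A minimal sketch of the "put them somewhere else" option, assuming Bash and that writing to a separate output directory is acceptable. The function name combine_batches, its arguments, and the zero-padded part-NNNN.combined naming are illustrative choices rather than anything from the loop above.

```shell
#!/bin/bash
# combine_batches SRC_DIR OUT_DIR [BATCH_SIZE]   (hypothetical helper)
# Concatenates the regular files in SRC_DIR, in glob (lexical) order, into
# OUT_DIR/part-0000.combined, OUT_DIR/part-0001.combined, and so on.
combine_batches() {
    local src=$1 out=$2 size=${3:-250}
    local files=() f name i part=0

    mkdir -p -- "$out" || return 1

    # Collect regular files only, so subdirectories are never passed to cat.
    for f in "$src"/*; do
        [ -f "$f" ] && files+=("$f")
    done

    # Same slicing idea as above: take up to "size" names starting at index i.
    for (( i = 0; i < ${#files[@]}; i += size )); do
        printf -v name 'part-%04d.combined' "$part"
        cat -- "${files[@]:i:size}" > "$out/$name" || return 1
        part=$(( part + 1 ))
    done
}
```

Called as, say, combine_batches . ../combined 250, it leaves the source directory untouched, so a second run simply overwrites the earlier parts instead of concatenating them again. OUT_DIR should not be the same directory as SRC_DIR.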
edited Feb 9 at 12:01 by Stéphane Chazelas
answered Feb 8 at 19:44 by Michael Homer
Hi I'm getting a syntax error - syntax error: got <&, expecting Word
– gizmo
Feb 8 at 19:49
What did you write when you got that error? Did it give a location? What shell are you running?
– Michael Homer
Feb 8 at 19:52
it appears to not like the files=(*).... shell is #!/bin/bash
– gizmo
Feb 8 at 19:55
I'm not sure how there's a syntax error on that line where the error is about something that isn't on that line. Can you post 1) the full error report and 2) the full line on which it reports that error?
– Michael Homer
Feb 8 at 19:57
got a different error after trying again ... ./bcnbd430.sh 15: syntax error: got (, expecting Newline line 15 -> files=(*)
– gizmo
Feb 8 at 19:58
I tried the method below, which splits a single file (filename) into pieces of 250 lines each:
    for ((i=1;i<=1200;i++)); do j=$((i + 249)); sed -n "${i},${j}p" filename > "individual_$i"; i=$j; done
answered Feb 9 at 10:04 by Praveen Kumar BS
Hi, I tried this and it doesn't like the for loop for ((i=1;i<=15050;i++)); I get the following error - ./bcnbd430.sh 16: syntax error: got <&, expecting Word
– gizmo
Feb 11 at 13:15
Assuming you are okay with combining them in the order find finds them:
    find . -maxdepth 1 -type f -print0 |
    xargs -0 -L 250 sh -c 'cat "$@" >/tmp/combined-${1##*/}' sh
For a directory containing files named file-1 up to file-739 (as an example), this would create files in /tmp called combined-file-1, combined-file-251, and combined-file-501, where the bit after combined- is the name of the first file in that combined file.
It does this by calling cat to concatenate files in batches of at most 250 files at a time, in an in-line shell script executed repeatedly by xargs (the ${1##*/} in that script removes any directory path from the pathname of the first file in the current batch). The xargs utility gets the filenames as nul-terminated strings from find. The find utility looks in the current directory (only) and outputs all pathnames therein that correspond to regular files.
You would then print the /tmp/combined-* files.
To only process files with a particular suffix, like .txt, use -name '*.txt' in the find command, before -print0.
The -print0 action of find and the -0 option of xargs are non-standard but commonly implemented.
answered Feb 9 at 10:44
Kusalananda
Hi I tried this and I get the following: + DIRPATH=/E=/NBIPTEST/unprocessed/ + find . -maxdepth 1 -type f -print0 + xargs -0 -L 250 sh -c cat "$@" >$unprocessed/combined-${1##*/} sh xargs: Unknown option "-0" Usage: xargs [-l#][-L #] [-irepl][-I repl] [-n#] [-tpx] [-s#] [-eeof][-E eof] [cmd [args ...]] Unknown option "-maxdepth" Usage: find directory ... expression + exit 0
– gizmo
Feb 11 at 13:06
After removing the code that flags as an error I am left with find . -type f | xargs -L 250 sh -c 'cat "$@" >$unprocessed/combined-${1##*/}' sh and this freezes when running
– gizmo
Feb 11 at 13:10
@gizmo You can't just delete bits of a command without understanding what the new command does!!! What Unix are you on?
– Kusalananda
Feb 11 at 13:11
I agree, but trying to isolate where the first error is coming from, and seeing if there was a way to fix it before I commented back here
– gizmo
Feb 11 at 13:42