How do I create loops to join groups of files within a directory?

I have lots of very large files that take a long time to join manually. They look something like this:
- file_a1.txt
- file_a2.txt
- file_a3.txt
- file_b1.txt
- file_b2.txt
- file_b3.txt
- file_c1.txt
- file_c2.txt
- file_c3.txt
and so on. How do I join all the a's together, all the b's together, and then all the c's, using a loop?
files scripting
for x in {a,b,c}; do cat "file_${x}"* > "file_$x_combined"`; done
– jordanm
Sep 13 at 4:17
@jordanm: (1) Instead of "file_${x}"*, you could say simply "file_$x"*. (2) Instead of "file_$x_combined", you must say "file_${x}_combined", "file_$x"_combined, or something equivalent. Your current code tries to evaluate a variable $x_combined. (3) Once point #2 is fixed, your code will write files file_a_combined, file_b_combined, and file_c_combined. Then, if the command is run again, it will include those files in the wildcard. In situations like this, it's wise to give the new file a name that doesn't match the pattern of the input files.
– G-Man
Sep 19 at 20:13
P.S. There is an extra backtick (`) in your comment.
– G-Man
Sep 19 at 20:13
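Point (2) in the comment above can be checked with a minimal sketch. It assumes only a POSIX-style shell; the variable names wrong and right are made up for this illustration.

```shell
x=a
unset x_combined    # make sure no such variable exists for this demo

wrong="file_$x_combined"     # the shell reads the variable name as x_combined
right="file_${x}_combined"   # braces end the variable name after x

printf '%s\n' "$wrong" "$right"
# prints:
#   file_
#   file_a_combined
```

The braces are only needed when the character following the variable name could itself be part of a name (a letter, digit, or underscore).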
edited Sep 13 at 6:58
Kusalananda
asked Sep 13 at 3:46
Danielle
1 Answer
accepted
I'm assuming that by "joining" you mean concatenating, and not "performing a relational JOIN operation between them using the join command". Furthermore, I assume that the filenames follow the pattern file_XN.txt, where X is some single letter and N is some number, and that all files are located in the current directory.
Concatenation of files is done using the cat command, so you would have to call cat with all the "a-files", then the "b-files", etc., while writing the output to some appropriately named file.
    for filename in ./file_[a-z]*.txt; do
        # extract the letter
        letter=${filename#*_}      # ./file_XN.txt --> XN.txt
        letter=${letter%%[0-9]*}   # XN.txt --> X (%% also strips multi-digit N)
        cat "$filename" >>"combined_$letter.txt"
    done
This would loop over all files and extract the letter from the filename. The letter would then be used to construct the output filename and the contents of the current file would be appended to that output file.
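The data would be appended in the order in which the shell expands the filename pattern, which is sorted lexicographically (the same order in which ls lists the files by default). Note that this places file_a10.txt before file_a2.txt.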
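If the numeric part can have more than one digit and numeric order matters, one possible sketch is shown here. It assumes GNU sort (for the -z and -V options) and an xargs that supports -0; the function name cat_in_version_order is hypothetical, and the function expects at least one matching file to exist.

```shell
# Hypothetical helper: concatenate the files of one group in natural
# (version) order, so that file_a2.txt comes before file_a10.txt.
# Assumes GNU sort (-z, -V) and an xargs that supports -0.
# The body runs in a subshell, so $letter does not leak to the caller.
cat_in_version_order() (
    letter=$1
    # NUL-separated names survive spaces and newlines in filenames
    printf '%s\0' ./file_"$letter"[0-9]*.txt |
        sort -zV |
        xargs -0 cat >"combined_$letter.txt"
)
```

Calling cat_in_version_order a would then write combined_a.txt from file_a1.txt, file_a2.txt, ..., file_a10.txt in that order.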
Another, shorter way of doing it, if you're using bash and know exactly which letters you want to go through (a through t here):
    for letter in {a..t}; do
        cat "file_$letter"*.txt >"combined_$letter.txt"
    done
Here, instead of looping over the filenames, we loop over the letters and for each letter we concatenate all files for that letter in one go.
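The two approaches can also be combined: discover the letters from the filenames, then concatenate each group with a single cat call. The following is only a sketch under the same file_XN.txt assumption; the function name combine_groups is hypothetical. Because it writes with > rather than >>, running it a second time regenerates the output files instead of appending the data again.

```shell
# Hypothetical helper: for every letter X that has file_XN.txt files in the
# current directory, write them all to combined_X.txt in a single cat call.
# The body runs in a subshell, so its variables do not leak to the caller.
combine_groups() (
    prev=
    for filename in ./file_[a-z][0-9]*.txt; do
        [ -e "$filename" ] || continue        # the pattern matched nothing
        letter=${filename#*_}                 # ./file_XN.txt --> XN.txt
        letter=${letter%%[0-9]*}              # XN.txt --> X
        [ "$letter" = "$prev" ] && continue   # this group was just written
        prev=$letter
        # ">" truncates, so a repeated run does not duplicate the data
        cat ./file_"$letter"[0-9]*.txt >"combined_$letter.txt"
    done
)
```

The prev check relies on the sorted glob expansion keeping the files of one group next to each other; if a group is encountered again later, it is simply rewritten with the same contents.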
edited Sep 13 at 7:39
answered Sep 13 at 7:17
Kusalananda