split file into N pieces with same name but different target directories

I want to split sourcefile.txt, which contains 10000 lines (and grows every day), into 30 equal files. I have directories named prog1 to prog30, and I would like to save the pieces into these directories under the same filename: /prog1/myfile.txt, /prog2/myfile.txt, and so on up to /prog30/myfile.txt.
Here is my bash script, divide.sh, which runs in the prog directory:
#!/bin/bash
programpath=/home/mywebsite/project/a1/
array=/prog1/
totalline=$(wc -l < ./sourcefile.txt)
divide="$(( $totalline / 30 ))"
split --lines=$divide $./prog1/myfile.txt
exit 1
fi
Tags: text-processing, files, split
edited Nov 16 '17 at 4:58 by αғsнιη
asked Oct 30 '17 at 11:14 by danone
6 Answers
Accepted answer:
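Sed version for fun:
lines=$(wc -l <sourcefile.txt)
perfile=$(( (lines+29)/30 )) # lines per file, rounded up; see https://www.rfc-editor.org/rfc/rfc968.txt
last=0
# The loop prints one "first,last w file" command per output file; sed reads
# that script from stdin (-f-), and -n suppresses its normal output.
sed -nf- sourcefile.txt <<EOD
$(while (( last < lines )); do
    mkdir -p prog$((last/perfile+1))
    echo $((last+1)),$((last+perfile)) w prog$((last/perfile+1))/myfile.txt
    : $((last+=perfile))
done)
EOD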
answered Oct 30 '17 at 14:33 by jthill
Surprisingly, this solution worked on a file with 1,500,000 lines. I tried two files, one with 10,000 lines and one with 1,500,000 lines. The split approach was fine for fewer than 10,000 lines, but it went wrong after about 35,000. Be careful with the last line of a piece: split cut it in the middle of the line. I spent nearly a day trying to find out what was wrong with the files that split had saved; after I deleted the last line, which was cut mid-sentence, it worked. The sed answer, on the other hand, cut the 1.5M-line file perfectly without losing any lines. I highly recommend sed for large files. I didn't try the awk solution.
– danone, Oct 31 '17 at 13:30
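To see what the sed answer actually feeds to sed, the generated script can be printed on its own. This is only a sketch: gen_sed_script is a made-up name, and the piece count is a parameter here rather than the fixed 30 used in the answer.

```shell
#!/bin/bash
# gen_sed_script LINES N: hypothetical helper that prints the sed commands
# the heredoc loop generates for a file of LINES lines split into N pieces.
gen_sed_script() {
    local lines=$1 n=$2 perfile last=0
    perfile=$(( (lines + n - 1) / n ))   # lines per piece, rounded up
    while (( last < lines )); do
        # "A,B w FILE" tells sed to write lines A through B to FILE.
        echo "$((last + 1)),$((last + perfile)) w prog$((last / perfile + 1))/myfile.txt"
        (( last += perfile ))
    done
}

gen_sed_script 10 3
# prints:
# 1,4 w prog1/myfile.txt
# 5,8 w prog2/myfile.txt
# 9,12 w prog3/myfile.txt
```

The last range may run past the end of the file (9,12 for a 10-line file); sed simply stops at the last line, so nothing is lost.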
#!/bin/bash
# assuming the file is in the same folder as the script
INPUT=large_file.txt
# assuming the folder called "output" is in the same folder
# as the script and contains folders that follow the pattern
# prog01 prog02 ... prog30
# create them with: mkdir -p output/prog{01..30}
OUTPUT_FOLDER=output
OUTPUT_FILE_FORMAT=myfile
# split
# -n l/30 -> 30 files, without cutting any line in half
#            (plain -n 30 splits by bytes and can end a file mid-line)
# --numeric-suffixes=1 -> suffixes start from 01
# $OUTPUT_FILE_FORMAT -> output file names start with this prefix
split -n l/30 --numeric-suffixes=1 "$INPUT" "$OUTPUT_FILE_FORMAT"
# move all files to their respective directories
for i in {01..30}
do
    mv "$OUTPUT_FILE_FORMAT$i" "$OUTPUT_FOLDER/prog$i/myfile.txt"
done
echo "done :)"
exit
The split command is more than enough for this task. However, this solution requires your folder names to start from prog01 rather than prog1.
edited Oct 30 '17 at 13:32 by αғsнιη
answered Oct 30 '17 at 11:53 by ashishk
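The prog01 requirement comes only from split's zero-padded numeric suffixes, so one way around it is to strip the padding when building the directory name. A minimal sketch, assuming bash and output files named myfile01 to myfile30; the helper names dir_for_suffix and move_pieces are hypothetical:

```shell
#!/bin/bash
# dir_for_suffix SUFFIX: hypothetical helper that maps a zero-padded split
# suffix such as 07 to the unpadded directory name prog7.
dir_for_suffix() {
    # 10# forces base 10; without it bash rejects 08 and 09 as invalid octal.
    echo "prog$(( 10#$1 ))"
}

# move_pieces: moves myfile01..myfile30 from the current directory into
# prog1..prog30 (the directories are assumed to exist already).
move_pieces() {
    local i
    for i in {01..30}; do
        [ -e "myfile$i" ] && mv "myfile$i" "$(dir_for_suffix "$i")/myfile.txt"
    done
    return 0
}
```

With this, the directories can keep the prog1 to prog30 names from the question.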
An awk-only solution (N here is 30 files):
awk 'BEGIN { cmd="wc -l <sourcefile.txt"; cmd | getline l; close(cmd); l=int((l+29)/30) }
NR%l==1 { trgt=sprintf("prog%d",((++c))) } { print > (trgt "/myfile.txt") }' sourcefile.txt
Or let the shell count the lines in sourcefile.txt and pass the lines-per-file value to awk, as suggested by jthill:
awk 'NR%l==1 { trgt=sprintf("prog%d",((++c))) } { print > (trgt "/myfile.txt") }' \
l=$(( ($(wc -l <sourcefile.txt)+29)/30 )) sourcefile.txt
edited Nov 16 '17 at 4:59
answered Oct 30 '17 at 13:52 by αғsнιη
I think I like this one best. I'd do the lines-per-file calc on the command line, though: awk 'NR%perfile==1...' perfile=$(( ($(wc -l <sourcefile.txt)+29)/30 )) sourcefile.txt since awk's piping is, erm, awkward ...
– jthill, Oct 30 '17 at 14:48
@jthill Thanks, very good, I have moved your suggested command into my answer, thank you ^_^
– αғsнιη, Nov 16 '17 at 4:53
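To try the awk approach on a small sample before pointing it at the real file, it can be wrapped in a function that takes the number of pieces as a parameter. This is a sketch under assumptions, not the answerer's code: split_with_awk is a hypothetical name, it creates the progN directories itself, and it tests (NR-1)%l==0 instead of NR%l==1 so that it also works when there is only one line per piece.

```shell
#!/bin/bash
# split_with_awk SOURCE N: hypothetical wrapper around the awk one-liner,
# with the number of pieces as a parameter instead of a hard-coded 30.
split_with_awk() {
    local src=$1 n=$2 lines perfile i
    lines=$(wc -l < "$src")
    perfile=$(( (lines + n - 1) / n ))      # lines per piece, rounded up
    for (( i = 1; i <= n; i++ )); do mkdir -p "prog$i"; done
    # (NR-1)%l==0 is true on the first line of each piece, so the target
    # file name advances there; every line is then printed to that target.
    awk -v l="$perfile" '(NR-1)%l==0 { trgt = sprintf("prog%d/myfile.txt", ++c) }
                         { print > trgt }' "$src"
}
```

For example, running split_with_awk sourcefile.txt 3 on a 10-line file puts lines 1-4 in prog1, 5-8 in prog2 and 9-10 in prog3.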
split + bash solution:
lines=$(echo "t=$(wc -l < ./sourcefile.txt); d=30; if(t%d) t/d+1 else t/d" | bc)
split -l "$lines" --numeric-suffixes=1 ./sourcefile.txt "myfile.txt"
for f in myfile.txt[0-9]*; do
    dir_n="prog$((10#${f#*txt}))" # constructing directory name; 10# keeps 08 and 09 from being read as octal
    mv "$f" "$dir_n/myfile.txt"
done
This assumes that you already have folders called prog1 to prog30 (as you mentioned).
lines - the integer number of lines per output file
t - the total number of lines in ./sourcefile.txt
d=30 - the divisor
--numeric-suffixes=1 - split option that makes the numeric suffixes start at 1
edited Oct 30 '17 at 12:10
answered Oct 30 '17 at 11:53 by RomanPerekhrest
Steps
1. Count the lines in the file:
lines=$(wc -l < "$file")
2. Get the number of lines per output file (bash integer division rounds down, so add 29 before dividing by 30 to round up):
linesPerFile=$(( (lines + 29) / 30 ))
3. Use split to divide the file:
split -l "$linesPerFile" -d --additional-suffix=-filename.extension "$file"
Expected result
x00-filename.extension, x01-filename.extension ... xN-filename.extension
Wrap it in a loop to process more than one file at a time
#!/bin/bash
# pathToWorkingDir must be set to the directory to search
while IFS= read -r -d '' file
do
    lines=$(wc -l < "$file")
    linesPerFile=$(( (lines + 29) / 30 ))
    # write the pieces next to each source file so that files with the
    # same name in different directories do not overwrite each other
    if split -l "$linesPerFile" -d --additional-suffix=-filename.extension "$file" "${file%/*}/x"; then
        echo "$file split correctly"
    else
        echo "there was a problem splitting $file"
        exit 1 # we exit with an error code
    fi
done < <(find "$pathToWorkingDir" -type f -name "filename.extension" -print0)
exit 0 # if all processed fine we exit with a success code
edited Oct 30 '17 at 13:34 by αғsнιη
answered Oct 30 '17 at 11:51 by Kramer
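A quick arithmetic check of the rounding, as a sketch: bash's $(( )) division discards the remainder, so dividing 10000 lines by 30 gives 333 lines per file, and split -l 333 then needs a 31st file for the 10 lines left over. Rounding up to 334 keeps it at 30 files. The helper name count_pieces is made up for illustration.

```shell
#!/bin/bash
# count_pieces TOTAL PER_FILE: hypothetical helper; prints how many output
# files "split -l PER_FILE" produces for a file of TOTAL lines.
count_pieces() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

total=10000
down=$(( total / 30 ))            # 333: remainder discarded
up=$(( (total + 29) / 30 ))       # 334: rounded up
echo "rounded down: $down lines each -> $(count_pieces "$total" "$down") files"
echo "rounded up:   $up lines each -> $(count_pieces "$total" "$up") files"
# prints:
# rounded down: 333 lines each -> 31 files
# rounded up:   334 lines each -> 30 files
```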
Parallelized with GNU Parallel:
parallel -j30 -a sourcefile.txt --pipepart --block -1 cat '>'prog{#}/myfile.txt
This runs 30 jobs in parallel, splits sourcefile.txt into one part per job (i.e. 30 parts), and passes each part to cat, which saves it into prog{#}/myfile.txt, where {#} is the job number.
answered Nov 17 '17 at 0:18 by Ole Tange
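Whichever answer is used, the result can be checked mechanically for the problem reported in the comments (a piece that ends in the middle of a line, or lines going missing). A minimal sketch, assuming the pieces are in prog1 to progN under the current directory and that the source file itself ends with a newline; check_pieces is a hypothetical name:

```shell
#!/bin/bash
# check_pieces SOURCE N: hypothetical helper, not taken from any answer.
# Succeeds only if every prog1..progN/myfile.txt ends with a newline and the
# pieces, concatenated in order, are byte-identical to SOURCE.
check_pieces() {
    local src=$1 n=$2 i piece
    for (( i = 1; i <= n; i++ )); do
        piece="prog$i/myfile.txt"
        [ -s "$piece" ] || { echo "missing or empty: $piece" >&2; return 1; }
        # tail -c 1 yields the last byte; if that byte is a newline, the
        # command substitution strips it and the string is empty.
        [ -z "$(tail -c 1 "$piece")" ] || { echo "cut mid-line: $piece" >&2; return 1; }
    done
    for (( i = 1; i <= n; i++ )); do cat "prog$i/myfile.txt"; done | cmp -s - "$src"
}
```

Usage would look like: check_pieces sourcefile.txt 30 && echo "pieces are intact".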