split file into N pieces with same name but different target directories

I want to split sourcefile.txt, which contains 10000 lines (and grows every day), into 30 equal files. I have directories named prog1 to prog30, and I would like to save each piece in one of these directories under the same filename, for example /prog1/myfile.txt, /prog2/myfile.txt, up to /prog30/myfile.txt.

Here is my bash script, divide.sh, which runs in the prog directory:

#!/bin/bash
programpath=/home/mywebsite/project/a1/
array=/prog1/
totalline=$(wc -l < ./sourcefile.txt) 
divide="$(( $totalline / 30 ))" 
split --lines=$divide $./prog1/myfile.txt 
exit 1
fi

edited Nov 16 '17 at 4:58

αғsнιη

asked Oct 30 '17 at 11:14

danone


6 Answers

accepted

Sed version for fun:

lines=$(wc -l <sourcefile.txt)
perfile=$(( (lines+29)/30 ))   # lines per piece: lines/30, rounded up
last=0
# Generate one "START,END w FILE" command per piece and
# feed the resulting script to sed on stdin (GNU sed).
sed -nf- sourcefile.txt <<EOD
$(while (( last < lines )); do
    mkdir -p "prog$((last/perfile+1))"
    echo "$((last+1)),$((last+perfile)) w prog$((last/perfile+1))/myfile.txt"
    : $((last+=perfile))
done)
EOD
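
To make the generated sed script visible, here is a minimal sketch that prints the `w` commands for a hypothetical 10-line input split into 3 pieces instead of 30. The function name `gen_sed_script` and the numbers are illustrative assumptions, not part of the answer.

```shell
# Print the sed "w" commands that would split a file of $1 lines into $2 pieces.
gen_sed_script() {
    lines=$1 pieces=$2
    perfile=$(( (lines + pieces - 1) / pieces ))   # ceiling of lines/pieces
    last=0
    while [ "$last" -lt "$lines" ]; do
        echo "$((last + 1)),$((last + perfile)) w prog$((last / perfile + 1))/myfile.txt"
        last=$((last + perfile))
    done
}

gen_sed_script 10 3
```

Each printed line is an address range followed by a `w` (write) command. The last range may end past the final line of the input, which sed accepts.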

answered Oct 30 '17 at 14:33

jthill

Surprisingly, this solution worked with a 1,500,000-line file. I tried two files, one containing 10,000 lines and the other 1,500,000 lines. split works well for fewer than 10,000 lines, but it got messed up after 35,000. Be careful: it cut the last line in the middle. I spent nearly a day trying to find out what was wrong with the files saved by split; after deleting the last line, which was cut mid-sentence, it worked. On the other hand, I used the sed answer for the 1.5M-line file and it cut perfectly without losing any lines. I highly recommend sed for large files. I didn't try the awk solution.
– danone
Oct 31 '17 at 13:30
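
One possible explanation for a piece that ends mid-line is GNU split's `-n N` mode, which divides the input by byte count, whereas `-n l/N` keeps lines whole. Which command produced the behaviour in the comment is not known. The following sketch compares the two modes on a small sample; the file names and the `count_cut` helper are illustrative assumptions, and GNU coreutils is assumed.

```shell
# Build a small sample in a scratch directory (all names here are illustrative).
work=$(mktemp -d)
cd "$work" || exit 1
printf 'line %s of the sample file\n' 1 2 3 4 5 6 7 > sample.txt

split -n 3   sample.txt bytes_   # equal byte counts: may cut a line in two
split -n l/3 sample.txt lines_   # roughly equal sizes, but only whole lines

# Count the pieces whose last byte is not a newline (i.e. that end mid-line).
count_cut() {
    n=0
    for f in "$1"*; do
        [ -n "$(tail -c 1 "$f")" ] && n=$((n + 1))
    done
    echo "$n"
}

echo "byte mode, pieces cut mid-line: $(count_cut bytes_)"
echo "line mode, pieces cut mid-line: $(count_cut lines_)"
```

Note that `-n l/N` balances the pieces by size rather than by line count, so the pieces can differ in their number of lines.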

#!/bin/bash

# Assumes the input file is in the same folder as the script.
INPUT=large_file.txt
# Assumes a folder called "output" is in the same folder as the script
# and contains folders matching the pattern prog01 prog02 ... prog30.
# Create them with: mkdir -p output/prog{01..30}
OUTPUT_FOLDER=output

OUTPUT_FILE_FORMAT=myfile

# split
# -n l/30 -> 30 files, without cutting any line in two
#            (plain "-n 30" splits by bytes and can break a line)
# --numeric-suffixes=1 -> numeric suffixes start from 01
# $OUTPUT_FILE_FORMAT -> prefix of the output file names
split -n l/30 --numeric-suffixes=1 "$INPUT" "$OUTPUT_FILE_FORMAT"

# move all files to their respective directories
for i in {01..30}
do
    mv "$OUTPUT_FILE_FORMAT$i" "$OUTPUT_FOLDER/prog$i/myfile.txt"
done

echo "done :)"

exit

The split command is more than enough for this task. However, this solution requires the folder names to start from prog01 rather than prog1.
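
If the directories are named prog1 to prog30 as in the question, one possible workaround is to strip the leading zero from each suffix before moving the piece. This is only a sketch under assumptions: it uses a hypothetical 90-line input in a scratch directory, creates the directories itself, and relies on GNU split.

```shell
work=$(mktemp -d)
cd "$work" || exit 1
seq 1 90 > large_file.txt    # hypothetical sample input

split -n l/30 --numeric-suffixes=1 large_file.txt myfile
for piece in myfile[0-9][0-9]; do
    suffix=${piece#myfile}   # e.g. "07"
    n=${suffix#0}            # drop one leading zero: "07" becomes "7", "30" stays "30"
    mkdir -p "prog$n"
    mv "$piece" "prog$n/myfile.txt"
done

ls -d prog* | wc -l
```

The `${suffix#0}` expansion removes at most one leading zero, which is enough for two-digit suffixes and avoids treating 08 or 09 as octal numbers.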

edited Oct 30 '17 at 13:32

αғsнιη

answered Oct 30 '17 at 11:53

ashishk

The awk-only solution (N here equals 30 files):

awk 'BEGIN { cmd = "wc -l <sourcefile.txt"; cmd | getline l; close(cmd); l = int((l + 29) / 30) }
     NR % l == 1 { trgt = sprintf("prog%d", ++c) }
     { print > (trgt "/myfile.txt") }' sourcefile.txt

Or let the shell count the lines in sourcefile.txt and pass the result to awk, as suggested by jthill:

awk 'NR % l == 1 { trgt = sprintf("prog%d", ++c) }
     { print > (trgt "/myfile.txt") }' \
    l=$(( ($(wc -l <sourcefile.txt)+29)/30 )) sourcefile.txt
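
As a small worked run of the same idea, the sketch below splits a hypothetical 10-line file into 3 pieces in a scratch directory. It is a variation rather than the answer's exact command: the variable names `n` and `out` are assumptions, and it closes each finished output file, which may help when the number of pieces is large.

```shell
work=$(mktemp -d)
cd "$work" || exit 1
seq 1 10 > sourcefile.txt
mkdir prog1 prog2 prog3      # awk does not create the directories

n=3                          # number of pieces (30 in the question)
awk -v l="$(( ($(wc -l < sourcefile.txt) + n - 1) / n ))" '
    NR % l == 1 { close(out); out = sprintf("prog%d/myfile.txt", ++c) }
    { print > out }
' sourcefile.txt

wc -l prog*/myfile.txt
```

The condition `NR % l == 1` assumes at least two lines per piece; with one line per piece it would never match.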

edited Nov 16 '17 at 4:59

answered Oct 30 '17 at 13:52

αғsнιη

I think I like this one best. I'd do the lines-per-file calc on the command line, though: awk 'NR%perfile==1...' perfile=$(( ($(wc -l <sourcefile.txt)+29)/30 )) sourcefile.txt since awk's piping is, erm, awkward ...
– jthill
Oct 30 '17 at 14:48

@jthill Thanks, very good. I have moved your suggested command into my answer, thank you ^_^
– αғsнιη
Nov 16 '17 at 4:53

split + bash solution:

# lines per output file: total / 30, rounded up
lines=$(echo "t=$(wc -l < ./sourcefile.txt); d=30; if(t%d) t/d+1 else t/d" | bc)
split -l "$lines" --numeric-suffixes=1 ./sourcefile.txt "myfile.txt"

for f in myfile.txt[0-9]*; do
    # constructing directory name; 10# forces base 10 so that
    # the suffixes 08 and 09 are not rejected as invalid octal numbers
    dir_n="prog$((10#${f#*txt}))"
    mv "$f" "$dir_n/myfile.txt"
done

This assumes that the folders prog1 to prog30 already exist (as you mentioned).

lines - the integer number of lines per output file
- t - the total number of lines in ./sourcefile.txt
- d=30 - the divisor

--numeric-suffixes=1 - split option that tells it to use numeric suffixes starting at 1
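
The rounding-up step can also be done with shell arithmetic alone. The sketch below is an illustration with a hypothetical helper name, `lines_per_piece`, and it computes the same quantity as the bc expression.

```shell
# Lines per piece for a file of $1 lines split into $2 pieces, rounded up.
lines_per_piece() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

lines_per_piece 10000 30
lines_per_piece 9000 30
lines_per_piece 100 30
```

For 10000 lines this gives 334 lines per piece, so split produces 29 pieces of 334 lines and a final piece of 314 lines. Rounding up does not always yield exactly 30 pieces: for 100 lines the result is 4 lines per piece, which gives only 25 pieces.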

edited Oct 30 '17 at 12:10

answered Oct 30 '17 at 11:53

RomanPerekhrest

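Steps

Count the lines in the file:
lines=$(wc -l < "$file")

Compute the number of lines per piece. Bash integer division rounds down, so add 29 before dividing by 30 to round up:
perfile=$(( (lines + 29) / 30 ))

Use split to divide the file:
split -l "$perfile" -d --additional-suffix=-filename.extension "$file"

Expected result

x00-filename.extension, x01-filename.extension ... x29-filename.extension (with -d the numeric suffixes start at 00)

Wrap it in a loop to process more than one file at a time

#!/bin/bash
# Split every matching file under $pathToWorkingDir into at most 30 pieces.
# The pieces are written next to each source file, with "x" as the prefix,
# so that pieces of different files do not overwrite each other.
while IFS= read -r -d '' file
do
    lines=$(wc -l < "$file")
    perfile=$(( (lines + 29) / 30 ))
    if split -l "$perfile" -d --additional-suffix=-filename.extension "$file" "${file%/*}/x"; then
        echo "$file was split correctly"
    else
        echo "there was a problem splitting $file" >&2
        exit 1 # exit with an error code
    fi
done < <(find "$pathToWorkingDir" -type f -name "filename.extension" -print0)
exit 0 # if everything was processed fine, exit with a success code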
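
The exit status of split only says that the command ran. To check that no lines were lost or cut, a possible follow-up is to concatenate the pieces in numeric order and compare them with the source. The sketch below assumes the prog1 ... progN/myfile.txt layout from the question; the function name `verify_split` and the two-piece demo are hypothetical.

```shell
# Succeed if prog1..prog$2/myfile.txt, concatenated in numeric order, equal the file $1.
verify_split() {
    src=$1 pieces=$2
    i=1
    while [ "$i" -le "$pieces" ]; do
        [ -f "prog$i/myfile.txt" ] && cat "prog$i/myfile.txt"
        i=$((i + 1))
    done | cmp -s - "$src"
}

# Small demo in a scratch directory.
work=$(mktemp -d)
cd "$work" || exit 1
seq 1 10 > sourcefile.txt
mkdir prog1 prog2
sed -n '1,5p'  sourcefile.txt > prog1/myfile.txt
sed -n '6,10p' sourcefile.txt > prog2/myfile.txt

verify_split sourcefile.txt 2 && echo "pieces match the source"
```

The explicit counter is used instead of a `prog*` glob because a glob would sort prog10 before prog2.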

edited Oct 30 '17 at 13:34

αғsнιη

answered Oct 30 '17 at 11:51

Kramer

Parallelized with GNU Parallel:

parallel -j30 -a sourcefile.txt --pipepart --block -1 cat '>'prog{#}/myfile.txt

This runs 30 jobs in parallel, splits sourcefile.txt into one part per job (i.e. 30), and passes each part to cat, which saves it to prog{#}/myfile.txt, where {#} is the job number.

answered Nov 17 '17 at 0:18

Ole Tange
