Extract middle section of lines of a text file?

I am writing a PHP script that parses a large text file and does database inserts from it. However, on my host the file is too large, and I hit the PHP memory limit.

The file has about 16,000 lines; I want to split it up into four separate files (at first) to see if I can load those.

The first part I can get with head -4000 file.txt. The middle sections are slightly trickier -- I was thinking about piping tail output into head ( tail -4001 file.txt | head -4000 > section2.txt ), but is there another/better way?

Actually my logic is messed up -- for section two, I would need to do something like tail -12001 file.txt | head -4000, and then lower the tail argument for the next sections. I'm getting mixed up already! :P

asked Oct 14 '11 at 16:56 by user394
edited Dec 30 '18 at 20:34 by Peter Mortensen

shell command-line text-processing

2 Answers

If you want to avoid getting mixed up but still use tail and head, you can tell tail to count lines from the beginning of the file instead of the end:

tail -n +4001 yourfile | head -4000
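The same pattern works for any range of lines. Here is a minimal sketch, assuming a hypothetical helper named lines_between (it is not a standard tool, just an illustration) that takes the first line, the last line, and the file:

```shell
# Hypothetical helper: print lines FIRST..LAST (inclusive) of FILE.
lines_between() {
    first=$1
    last=$2
    file=$3
    # tail -n +FIRST starts the output at line FIRST;
    # head then keeps LAST - FIRST + 1 lines from there.
    tail -n "+$first" "$file" | head -n "$((last - first + 1))"
}

# Section 2 of a 16,000-line file would be lines 4001-8000:
# lines_between 4001 8000 file.txt > section2.txt
```

For the next sections only the two numbers change (8001 12000, then 12001 16000), so there is no need to recompute offsets from the end of the file.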

... But a better, automatic tool made just for splitting files is called... split! It's also a part of GNU coreutils, so any normal Linux system should have it. Here's how you can use it:

split -l 4000 yourInputFile thePrefixForOutputFiles

(See man split if in doubt.)
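To see what split produces, here is a small sketch on a throwaway 10-line file. The directory, file name, and the part_ prefix are made up for the example:

```shell
# Make a 10-line sample file in a scratch directory.
workdir=$(mktemp -d)
seq 1 10 > "$workdir/sample.txt"

# Split into pieces of 4 lines each. The output files are named
# <prefix>aa, <prefix>ab, <prefix>ac, and so on.
split -l 4 "$workdir/sample.txt" "$workdir/part_"

# The last piece holds whatever is left over (2 lines here).
wc -l "$workdir"/part_*
```

With the 16,000-line file and -l 4000, the same command should leave four files ending in aa, ab, ac and ad.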

answered Oct 14 '11 at 17:13 by rozcietrzewiacz (edited Oct 14 '11 at 17:19)


Combining head and tail as you did will work, but for this I would use sed:

sed -n '1,4000p' input_file # print lines 1-4000 of input_file

This lets you solve your problem with a quick shell function:

chunk_it() {
    step=4000       # lines per chunk
    start=1
    end=$step
    for n in {1..4} ; do
        # print lines start..end of the input file into its own chunk file
        sed -n "${start},${end}p" "$1" > "$1.$start-$end"
        let start+=$step
        let end+=$step
    done
}


chunk_it your_file

Now you have your_file.1-4000 and your_file.4001-8000 and so on.

Note: requires bash (the {1..4} brace expansion and let are not available in plain POSIX sh).
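If bash is not available, the same idea can be written with only POSIX sh features. This is a sketch rather than a drop-in replacement: the name chunk_file and its optional second argument are assumptions made for the example, and it assumes the input file ends with a newline so that wc -l counts every line.

```shell
# Hypothetical POSIX-sh variant: split FILE into pieces of STEP lines
# each (default 4000), written to FILE.START-END.
chunk_file() {
    file=$1
    step=${2:-4000}
    # Arithmetic expansion strips any padding some wc versions add.
    total=$(($(wc -l < "$file")))
    start=1
    while [ "$start" -le "$total" ]; do
        end=$((start + step - 1))
        # Print lines start..end, then quit at line end so sed does
        # not keep reading the rest of the file.
        sed -n "${start},${end}p;${end}q" "$file" > "$file.$start-$end"
        start=$((end + 1))
    done
}

# chunk_file your_file        # 4000 lines per piece
# chunk_file your_file 1000   # 1000 lines per piece
```

Because the loop runs until the line count is exhausted, the number of pieces does not have to be known in advance. The last piece is still named after the full range (for example your_file.12001-16000) even when the file has fewer lines than that.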

answered Oct 14 '11 at 17:16 by Sorpigal

I like the sed way.

– fanchyna
Feb 20 '16 at 15:38

This doesn't work for me because sed doesn't exit. It prints out the lines I want to stdout, but I have to ctrl-c out, and as a result, I can't redirect it to a file. Any suggestion to make it usable?

– Brent212
Jun 30 '17 at 18:41

Figured it out! "sed -n '<start_line>,<end_line>w <output_file>' <input_file>" works for me.

– Brent212
Jun 30 '17 at 18:54

@Brent212 Another option to note is that you can also pipe it into less or redirect the output to a file.

– Kyle s
Dec 19 '18 at 19:54
