Merging in Unix

I have a CSV file that uses vertical bars (|) as the delimiter, like the one below, and I need to merge its broken lines back into complete records in Unix. The file contains hundreds of thousands of records (four fields each), but I show only five records here for ease of reading.

field1 |field2 | field3 |field4|
1|abc|def|ghi|
4|ijk|
|lmn|
5||opq|rst|
8|
uvw||xyz|
10|hjg|jsh|nbm|

And I want the output to be:

field1|field2|field3|field4|
1|abc|def|ghi|
4|ijk||lmn|
5||opq|rst|
8|uvw||xyz|
10|hjg|jsh|nbm|

Can someone help me do this?

edited Sep 25 at 20:54 by G-Man

asked Sep 25 at 17:59 by Sankar

So you want the leading and trailing spaces around the pipe symbols removed, as well as every newline except those after every 4th pipe symbol? Is that correct?
– Sam
Sep 25 at 18:08
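
Read literally, this restatement suggests a pipe-counting approach: strip the blanks around each |, keep appending lines, and emit a record once four pipes have been seen. A minimal sketch under two assumptions that the question does not confirm: a complete record has exactly four pipes, and no physical line carries pieces of two records. The function name join_by_pipe_count is made up for illustration.

```shell
# Hypothetical helper: rebuild records by counting pipes.
# Assumes a complete record has exactly 4 pipes and that a line never
# contains pieces of two different records.
join_by_pipe_count() {
    awk '
    {
        # gsub() strips blanks around each pipe and returns how many pipes it replaced
        n += gsub(/[ \t]*[|][ \t]*/, "|")
        buf = buf $0
    }
    # enough pipes collected: emit the record and start over
    n >= 4 { print buf; buf = ""; n = 0 }
    END {
        # leftover text means the last record never reached 4 pipes
        if (buf != "") { print "incomplete record: " buf > "/dev/stderr"; exit 1 }
    }' "$@"
}
```

It could be called as join_by_pipe_count file > merged.txt; a non-zero exit status would signal that the input ended in the middle of a record.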

I'm sorry if you're stuck with data that look like this. While the answers that have been presented will handle this mangled structure in the best case, it is very precarious (sensitive) to data corruption. For example, if you have a file where every record is split across two lines (every line has two fields), and one line gets deleted (or totally scrambled), the rebuilt (output) file will be wrong from there on. You might want to specify that the first field (and only the first field) of each line is a number, so error checking becomes possible. … (Cont'd)
– G-Man
Sep 25 at 21:09

(Cont'd) … P.S. Is it possible for parts of multiple records to be on the same line? For example, 1|abc|def| / ghi|4|ijk| / |lmn|? And is it possible for a field to be split across lines? For example, 10|hjg|j / sh|nbm|?
– G-Man
Sep 25 at 21:09
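
The error checking suggested above could look roughly like the following sketch, run on the rebuilt file. It assumes (the question does not say so) that line 1 is a header, that every other record must have exactly four pipe-terminated fields, and that the first field consists of digits only. The name check_records is a placeholder.

```shell
# Hypothetical checker for the rebuilt file.
# Assumes line 1 is the header and every other record must have
# 4 pipe-terminated fields with an all-digit first field.
check_records() {
    awk -F'|' '
    # with a trailing pipe, a 4-field record splits into 5 awk fields
    NR > 1 && (NF != 5 || $1 !~ /^[0-9]+$/) {
        printf "record %d looks wrong: %s\n", NR, $0
        bad = 1
    }
    # exit status 1 if any record was flagged, 0 otherwise
    END { exit bad + 0 }' "$@"
}
```

A pipeline could then stop as soon as check_records merged.txt returns a non-zero status, instead of silently passing shifted records downstream.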

text-processing awk sed merge


3 Answers

I'm assuming you don't want all those blank lines.

$ cat file
1|abc|def|ghi|
4|ijk|
|lmn|
5||opq|rst|
8|
uvw||xyz|
10|hjg|jsh|nbm|

$ awk -F'|' '{while (NF < 5) {getline nextline; $0 = $0 nextline}} 1' file
1|abc|def|ghi|
4|ijk||lmn|
5||opq|rst|
8|uvw||xyz|
10|hjg|jsh|nbm|

Update for the question edit: remove whitespace around the field separator

awk -F'[[:blank:]]*[|][[:blank:]]*' -v OFS='|' '{
    while (NF < 5) {getline nextline; $0 = $0 nextline}
    $1=$1; print
}' file
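
One caveat worth sketching: if the input ends in the middle of a record, getline stops returning data and the while loop has nothing new to append. A hedged variant that checks getline's return value and gives up at end of input is shown below. The wrapper name merge_split_records is a placeholder, and [ \t] is used in place of [[:blank:]] only as an assumption about what older awk implementations accept.

```shell
# Hypothetical wrapper around the same idea, with an end-of-input guard.
merge_split_records() {
    awk -F'[ \t]*[|][ \t]*' -v OFS='|' '
    {
        # keep appending lines until the record has 4 fields plus the
        # empty one after the trailing pipe; stop when getline hits EOF
        while (NF < 5 && (getline nextline) > 0)
            $0 = $0 nextline
        $1 = $1      # rebuild the record with OFS, dropping the blanks
        print
    }' "$@"
}
```

With this guard a truncated final record is printed as it stands rather than being padded with stale data.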

edited Sep 25 at 21:42

answered Sep 25 at 18:18 by glenn jackman

Genius solution! What do we call this process? May I kindly ask you to add some explanation for newbies like me? Thank you!
– Shervan
Sep 25 at 18:37

Is there any particular bit you're unclear about? I assume a while loop is clear. getline reads the next line into the given variable. Then I concatenate the current line with the next line, and we re-check the number of fields. Other awk help can be found on the awk tag info page.
– glenn jackman
Sep 25 at 18:40

Yes, $0 = $0 and the 1 at the end. Thank you for any clarification!
– Shervan
Sep 25 at 18:42

It's not $0=$0, it's "assign to $0 the concatenation of $0 and nextline". awk doesn't have an explicit concatenation operator: other languages might want $0 = $0 + nextline, but with awk you just put strings or variables side-by-side. For clarity we can write $0 = ($0 nextline)
– glenn jackman
Sep 25 at 19:55

The 1 is a common awk idiom that means "print the current record". Follow the link I gave and do some reading: it's well documented.
– glenn jackman
Sep 25 at 19:56
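
The two points from these comments, concatenation by juxtaposition and the bare 1, can be tried in isolation. A tiny sketch; the function name append_suffix and the variable suffix are invented for the demo.

```shell
# Hypothetical demo function: append a suffix to every input line.
append_suffix() {
    # -v passes the shell argument into awk as the variable "suffix".
    # ($0 suffix) concatenates by juxtaposition; the bare 1 is an
    # always-true pattern whose default action prints the record.
    awk -v suffix="$1" '{ $0 = ($0 suffix) } 1'
}

printf 'foo\n' | append_suffix bar    # prints: foobar
```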

With GNU sed:

sed ':loop /\(.*|\)\{4\}.*/ !{N; s/\n//; b loop}; s/ *| */|/g' file

The command dissected:

:loop

The : signals a label that we can use for branches. "loop" is just the name that I chose for the label.

/\(.*|\)\{4\}.*/

is a line selector regex that matches lines containing at least 4 pipe symbols, each allowed to be preceded by zero or more arbitrary characters \(.*|\), with zero or more arbitrary characters allowed to follow the last pipe.

!{...}

applies the commands in the braces to any line that did not match the previous regex.

N; s/\n//; b loop

N appends a newline symbol and the next line from the source file to the current line in pattern space, then s/\n// removes the newline symbol and b loop branches back to the label we defined at the start, so the concatenated line will be compared against the regex again.

Lastly

s/ *| */|/g

will be applied to any line in pattern space before it is output. This removes any spaces around pipe symbols.
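
If a particular sed parses labels and closing braces differently in a one-liner, the same script can be split into -e fragments so that every label and brace ends its own fragment. This is only a sketch of that layout, with the same logic as dissected above; join_with_sed is a placeholder name.

```shell
# Hypothetical wrapper: same logic as the one-liner, one command per -e.
join_with_sed() {
    # :loop      label to branch back to
    # /re/!{     for lines with fewer than 4 pipes ...
    # N          ... append the next input line,
    # s/\n//     remove the embedded newline,
    # b loop     and test the joined line again
    # }          end of the block
    # s/.../     finally strip spaces around the pipes
    sed -e ':loop' \
        -e '/\(.*|\)\{4\}/!{' \
        -e 'N' \
        -e 's/\n//' \
        -e 'b loop' \
        -e '}' \
        -e 's/ *| */|/g' "$@"
}
```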

edited Sep 26 at 7:26

answered Sep 25 at 18:25 by Sam

This code is not working!
– Shervan
Sep 25 at 18:35

It does for me, with GNU sed 4.4.
– Sam
Sep 25 at 18:37

My sed --version output: sed (GNU sed) 4.2.2 Copyright (C) 2012 Free Software Foundation, Inc.
– Shervan
Sep 25 at 18:38

Oh, man... the command is not at fault. You are definitely not typing it as displayed. You are using double quotes and your shell's history expansion feature is enabled.
– Sam
Sep 26 at 5:09

@Shervan is probably using csh or tcsh, where that ! needs to be escaped even inside single quotes.
– Stéphane Chazelas
Sep 26 at 7:37

If using Vim is an option:

vim -Nesc 'g!/\(.*|\)\{4}$/j!' -cwq input.txt

-Nes runs Vim in script mode (nocompatible, silent Ex mode), making it easier to automate

-c ... runs Vim commands after opening the file

g!/\(.*|\)\{4}$/j! - on every line (:g) that doesn't (!) match /\(.*|\)\{4}$/ (a regex matching a line that ends in a pipe and contains at least 4 of them), join the next line to it (:j!, which joins without inserting a space).

wq - save and quit.

answered Sep 26 at 7:43 by muru
