Fixing malformed CSV with incorrect new line chars using sed or perl only

I have a comma-delimited CSV file, but for some reason our system inserts newline characters at random locations in the file, which breaks the entire file. I can get the number of columns in the file.

How do I solve this with sed and/or Perl in a one-liner? I know it's solvable with awk, but this is for learning purposes. If using Perl, I don't want to use the built-in CSV functions. Is it solvable? I've been on this problem for several days and can't seem to find a solution :(

Sample malformed input (lots of randomly inserted \n):

policyID,statecode,county,Point longitude,Some Thing Here,point_granularity
119736,FL,CLAY COUNTY,-81.711777,“Residential Lot”,1
448094,FL,CLAY COUNTY,-81.707664,“Residen
tial Lot”,3
206893,FL,CLAY COUNTY,-81.7
00455,“Residen
tial Lot”,1
333743,FL,CLAY COUNTY,-81.707703,“Residential Lot”,
3
172534,FL,CLAY COUNTY,-81.702675,“Residential Lot”,1
785275,FL,CLAY COUNTY,-81.707703,“Residential Lot”,3
995932,FL,CLAY COUNTY,-81.713882,
“Residential Lot”,1
223488,FL,CLAY COUNTY,-81.707146,“Residential Lot”,1
4335
12,FL,CLAY COUNTY,-81.704613,
“Residential Lot”,1


Required output



policyID,statecode,county,Point longitude,Some Thing Here,point_granularity
119736,FL,CLAY COUNTY,-81.711777,“Residential Lot”,1
448094,FL,CLAY COUNTY,-81.707664,“Residential Lot”,3
206893,FL,CLAY COUNTY,-81.700455,“Residential Lot”,1
333743,FL,CLAY COUNTY,-81.707703,“Residential Lot”,3
172534,FL,CLAY COUNTY,-81.702675,“Residential Lot”,1
785275,FL,CLAY COUNTY,-81.707703,“Residential Lot”,3
995932,FL,CLAY COUNTY,-81.713882,“Residential Lot”,1
223488,FL,CLAY COUNTY,-81.707146,“Residential Lot”,1
433512,FL,CLAY COUNTY,-81.704613,“Residential Lot”,1






  • Find why the newlines are inserted and fix that. That would be the best resolution to the issue.
    – Kusalananda
    Apr 2 at 6:33

  • Guess: Newlines are inserted when some internal buffer is full and written out.
    – dirkt
    Apr 2 at 6:36

  • For the actual question: Your lines start and end with digits, so removing all newlines that are not between digits would be a start, but this won't fix the file completely. To detect the other newlines, I guess one would need to add knowledge about what the fields look like.
    – dirkt
    Apr 2 at 6:38

  • How would you fix it in awk, and why can't you do something similar in Perl or sed?
    – muru
    Apr 2 at 6:56

  • "on this problem for several days": please add at least one of those attempts to the question.
    – Sundeep
    Apr 2 at 7:50

asked Apr 2 at 6:28 by Harry McKenzie
edited Apr 2 at 20:31 by Jeff Schaller


2 Answers

up vote
1
down vote



accepted










$ awk -F, '{ while (NF < 6 || $NF == "") { brokenline = $0; getline; $0 = brokenline $0 }; print }' file.csv
policyID,statecode,county,Point longitude,Some Thing Here,point_granularity
119736,FL,CLAY COUNTY,-81.711777,“Residential Lot”,1
448094,FL,CLAY COUNTY,-81.707664,“Residential Lot”,3
206893,FL,CLAY COUNTY,-81.700455,“Residential Lot”,1
333743,FL,CLAY COUNTY,-81.707703,“Residential Lot”,3
172534,FL,CLAY COUNTY,-81.702675,“Residential Lot”,1
785275,FL,CLAY COUNTY,-81.707703,“Residential Lot”,3
995932,FL,CLAY COUNTY,-81.713882,“Residential Lot”,1
223488,FL,CLAY COUNTY,-81.707146,“Residential Lot”,1
433512,FL,CLAY COUNTY,-81.704613,“Residential Lot”,1


The awk code appends the next line of input to the current line for as long as there are fewer than six fields in the current line, or the last field is empty (there is one line that is broken just after the last field separator).




A Perl workalike:



perl -ne 'chomp;while (tr/,/,/ < 5 || /,$/) { $_ .= readline; chomp } print "$_\n"' file.csv





  • Thank you for your response! I was trying to make it work with any type of delimiter, like so: perl -ne 'chomp;while (tr/quotemeta('"$d2"')/quotemeta('"$d2"')/ < '"$columns"'-1 || /quotemeta('"$d2"')$/) { $_ .=readline; chomp } print "$_\n"' where $d2 is, for example, a bash variable with the value ||, but it doesn't work :(
    – Harry McKenzie
    Apr 2 at 12:44

  • @HarryMcKenzie I would use the awk solution for that, as it's easier to just change the option-argument of the -F flag.
    – Kusalananda
    Apr 2 at 12:46

  • Yeah, but I need to use Perl :(
    – Harry McKenzie
    Apr 2 at 12:47

  • @HarryMcKenzie I don't quite see why.
    – Kusalananda
    Apr 2 at 12:47

  • It's OK... it just sucks that our system only has a limited set of commands and awk is not part of it. They don't want to install it for some reason, so we have to live with sed and Perl only.
    – Harry McKenzie
    Apr 2 at 13:04

















up vote
0
down vote













As Kusalananda said, there are 6 fields on each line, so you can try this GNU sed command.



sed -E ':A;h;s/^/,/;s/((,[^,]+){6})(.*)/\3/;/./{g;N;s/\n//;bA};g' infile
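
The one-liner above reads more easily when spread over several lines. This is a sketch with comments, assuming a sed that accepts -E and # comments in scripts (GNU sed does), exactly six non-empty fields per record, and no commas inside fields. The wrapper name fix_with_sed is hypothetical.

```shell
# Hypothetical wrapper: fix_with_sed [FILE...]
fix_with_sed() {
  sed -E '
    :A
    # Save the untouched record in the hold space.
    h
    # Prepend a comma so that every field looks like ",value".
    s/^/,/
    # Strip six complete, non-empty fields; a whole record leaves nothing behind.
    s/((,[^,]+){6})(.*)/\3/
    # Anything left means the record is incomplete: restore it,
    # append the next input line, drop the newline, and test again.
    /./{
      g
      N
      s/\n//
      bA
    }
    # Complete record: restore the saved copy so that it gets printed.
    g
  ' "$@"
}
```

Writing the closing brace on its own line also sidesteps the portability issue with a label followed directly by } that comes up in the comments on this answer. A break inside the last field cannot be detected this way, because the record already has six fields before the break.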





  • But unfortunately it doesn't work. You didn't test it :p
    – Harry McKenzie
    Apr 2 at 13:10

  • Is your sed GNU sed? On OpenBSD, I must change the code like this: sed -Ee ':A;h;s/^/,/;s/((,[^,]+){6})(.*)/\3/;/./{g;N;s/\n//;bA' -e '};g' infile
    – ctac_
    Apr 2 at 13:41

  • I'm using GNU sed. But thank you very much for helping me, it works now! Thank you :)
    – Harry McKenzie
    Apr 4 at 11:17










Answer 1 (accepted): answered Apr 2 at 7:41 by Kusalananda

Answer 2: answered Apr 2 at 10:22 by ctac_