Linux - how do I ignore special characters between " "?
My file: (1 sample line)
MMP,"01_janitorial,02_cleaning_tools",1,,CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i
& MetroMax Q shelf, NSF",CLEANING
I need to read this into a PostgreSQL table with 7 columns.
Breakdown of columns:
MMP
"01_janitorial,02_cleaning_tools"
1
(empty)
CUBIC_INCH
"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF"
CLEANING
The file is basically comma-delimited, but I need to ignore the commas, carriage returns (if present), and double quotes when the text is inside a quoted field, as in columns 2 and 6.
I can load it with a PostgreSQL COPY command, or convert the file using awk, perl, sed, or whatever, and then load it.
shell-script postgresql
If possible, fix the process that generates that file. For the CSV format, a field that contains the quote character should double it:
123,"a string ""with double"" quotes, and commas",456
– glenn jackman
Jan 3 at 21:33
This is an export file from Akeneo (a COTS product), so I cannot change the process that generates the file.
– J.Turck
Jan 3 at 21:37
Definitely raise an issue with the software provider of the product that produced that output - there is no way to parse that file without hitting loads of edge cases that cause incorrect parsing. In the meantime I would write a parser in something like Python to handle the edge cases as you encounter them. If you can get them to fix it upstream then you can just drop it and use a real CSV parser.
– Michael Daffin
Jan 3 at 21:57
Kindly provide sample input and output
– Praveen Kumar BS
Jan 4 at 3:09
This quick-and-dirty sed script will fix up the quotes-inside-quotes so that the input file can be processed by a CSV parser:
sed -e 's/"/""/g; s/,""/,"/g; s/"",/",/g; s/^""|""$/"/g;'
It first converts all "s to "", then changes them back to just " if they are next to commas or the start or end of the line. It doesn't fix any other potential problems in the input file (and if they get something like this wrong, that's quite likely).
– cas
Jan 4 at 4:01
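One caveat on that last sed one-liner, noted here as an aside rather than something from the thread: in sed's default basic regular expressions a bare | is a literal character, so the ^""|""$ alternation will not touch a quote sitting at the very start or end of a line. Running the same script in extended-regex mode should behave as described (my adjustment, only tried against the sample line):
sed -E -e 's/"/""/g; s/,""/,"/g; s/"",/",/g; s/^""|""$/"/g;'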
4 Answers
Simply using -F, is often not enough to parse a CSV file, especially if, as described, the delimiter can be part of a quoted string. You can work around some of this by using FPAT to define a field with an expression rather than defining a character for your field delimiter, but awk will still go line by line, so you will have to preemptively consume the line breaks within your data.
Once done, you can do something such as:
awk 'BEGIN { FPAT = "([^,]+)|(\"[^\"]+\")" } /* normal processing here */' /path/to/file
That expression will define as a field either "anything that is not a comma" or "a double quote, one or more of anything that is not a double quote, followed by a double quote".
This will, however, explode if any of your quoted data themselves contain double quotes.
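For illustration only, a minimal sketch of that approach (mine, not part of the answer): it assumes GNU awk, since FPAT is a gawk extension, and assumes the stray quotes inside field 6 have already been doubled so that every quoted field is well formed:
gawk 'BEGIN { FPAT = "([^,]+)|(\"[^\"]+\")" }   # a field is a run of non-commas, or a quoted string
      { printf "f2=%s  f6=%s\n", $2, $6 }' /path/to/file
Note that with this pattern a completely empty field, such as the fourth column in the sample, is skipped rather than counted, so the field numbering shifts; handling empty fields needs a slightly different pattern.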
line 6 has double quotes and commas between the double quotes: "(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF"
– J.Turck
Jan 3 at 21:30
Is it possible to generate your data file using a delimiter which is not present in the data themselves (e.g. |)?
– DopeGhoti
Jan 3 at 21:35
Unfortunately no. It is an export from a COTS package - Akeneo.
– J.Turck
Jan 3 at 21:38
As was said, the file was generated incorrectly. Nevertheless, you may try to work around it by splitting not only on the , delimiter but also on ", and ,". Of course, a custom script will be needed, and there is no guarantee you won't meet something like that in your 6th field.
Alternatively, you can strip the first five fields, assuming the 6th field is the only one that is messed up, then from the result cut off the last field and its comma. What remains will be the content of the 6th field, as sketched below.
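A rough sketch of that second idea, added here by me rather than taken from the answer; it only holds under fairly strong assumptions: each record is already on a single line, field 2 is the only other quoted field and contains no embedded quotes, and fields 1, 3, 4, 5 and 7 contain no commas of their own:
# drop fields 1-5, then the trailing ",field7", then field 6's surrounding quotes
sed -E -e 's/^[^,]*,"[^"]*",[^,]*,[^,]*,[^,]*,//' \
       -e 's/,[^,]*$//' \
       -e 's/^"//' -e 's/"$//' /path/to/file
What is left on each output line is the raw content of the 6th field.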
The solution is going to be very specific to your data file since the quotes aren't properly escaped. Since there is only one trouble column, it's quite doable though. Here ya go:
#!/bin/bash
while IFS='' read -r line || [[ -n "$line" ]]; do
    echo "Line: $line"
    # grabbing the first field is easy ..
    f1=$(echo "$line" | cut -d, -f1 )
    # now remove the first field from the line
    line=$(echo "$line" | sed "s/$f1,//" )
    echo "Line is now: $line"
    # to grab the second field use the double quote as a delimiter
    f2=$(echo "$line" | cut -d'"' -f2 )
    # now remove the second field (and its quotes) from the line
    line=$(echo "$line" | sed "s/\"$f2\",//" )
    echo "Line is now: $line"
    # fields 3,4,5 are trivial .. just repeat the same pattern as 1 and then remove them
    f3=$(echo "$line" | cut -d, -f1 )
    line=$(echo "$line" | sed "s/$f3,//" )
    echo "Line is now: $line"
    f4=$(echo "$line" | cut -d, -f1 )
    line=$(echo "$line" | sed "s/$f4,//" )
    echo "Line is now: $line"
    f5=$(echo "$line" | cut -d, -f1 )
    line=$(echo "$line" | sed "s/$f5,//" )
    # here is the "trick" ... reverse the string, then you can cut field 7 first!
    line=$(echo "$line" | rev)
    echo "Line is now: $line"
    f7=$(echo "$line" | cut -d, -f1 )
    # now remove field 7 from the reversed string, then reverse field 7 back
    line=$(echo "$line" | sed "s/$f7,//" )
    f7=$(echo "$f7" | rev)
    # now we can reverse the remaining string, which is field 6, back to normal
    line=$(echo "$line" | rev)
    # and then remove the leading quote
    line=$(echo "$line" | cut --complement -c 1)
    # and then remove the trailing quote
    line=$(echo "$line" | sed 's/"$//' )
    echo "Line is now: $line"
    # and then double up all the remaining quotes so field 6 is valid CSV
    f6=$(echo "$line" | sed 's/"/""/g' )
    echo "f1 = $f1"
    echo "f2 = $f2"
    echo "f3 = $f3"
    echo "f4 = $f4"
    echo "f5 = $f5"
    echo "f6 = $f6"
    echo "f7 = $f7"
    echo "$f1,\"$f2\",$f3,$f4,$f5,\"$f6\",$f7" >> fixed.txt
done < "$1"
I made it echo lots of output to show you how it works; you can remove all the echo statements to make it fast once you understand it. It appends the fixed line to fixed.txt.
Here is an example run and output:
[root@alpha ~]# ./fixit.sh test.txt
Line: MMP,"01_janitorial,02_cleaning_tools",1,,CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF",CLEANING
Line is now: "01_janitorial,02_cleaning_tools",1,,CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF",CLEANING
Line is now: 1,,CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF",CLEANING
Line is now: ,CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF",CLEANING
Line is now: CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF",CLEANING
Line is now: GNINAELC,"FSN ,flehs Q xaMorteM & i xaMorteM stif ,yxope epuat ,D"42 x W"84 no stnuom ,gnicaps "3 htiw thgirpu "6 ,yticapac yart )41("
Line is now: (14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF
f1 = MMP
f2 = 01_janitorial,02_cleaning_tools
f3 = 1
f4 =
f5 = CUBIC_INCH
f6 = (14) tray capacity, 6"" upright with 3"" spacing, mounts on 48""W x 24""D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF
f7 = CLEANING
If you need to escape the quotes in some other manner, that should be pretty obvious given the above.
Wow, @alfreema, that's a lot of effort. I really appreciate it. The file I gave as my example is a shortened version of reality. My example showed 7 columns; I actually have 155. I believe I have most of it sorted, with the exception of what I call 'soft returns' needing to be removed while leaving the EOL character. The application allows formatting in certain fields, and the users add tabs, bullets, and returns to format the data. When I read the file, each 'soft return' is read as an EOL.
– J.Turck
Jan 4 at 18:30
• Sturdy construction: 300lb. (136kg), 400lb. (181kg), and$ 500lb. (227kg) capacity models available.",,,,,,,,,,,,0,DAY,24,INCH,,,Metro,"Fast drying of trays, pans, lids, pots and all pot sink items Promotes food safety by eliminating moisture. Interchangeable shelves adapt to all environments. Efficient organized drying area. Superior air circulation. Mobile unit allows for easy floor clean-up and$ transportability.",MTR2448XEA,,Cutting Board & Tray Drying Rack Group,570-825-2741,Customer,Service,PA,Wilkes-Barre,18705,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,48,INCH,,,^M$
– J.Turck
Jan 4 at 18:38
I want to remove the $ and leave the ^M.
– J.Turck
Jan 4 at 18:38
Are the soft returns literally $, or are you just using that as a placeholder for the example?
– alfreema
Jan 4 at 19:16
When I use :set list (inside the text file with vi) they show as $. My example is a cut and paste.
– J.Turck
Jan 5 at 2:08
I got the final product by removing the carriage returns within a quoted field with the following script:
$ cat remove_cr.awk
#!/usr/bin/awk -f
{
    record = record $0
    # If number of quotes is odd, continue reading record.
    if ( gsub( /"/, "&", record ) % 2 ) {
        record = record " "
        next
    }
    print record
    record = ""
}
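For completeness, a usage sketch of my own rather than part of the answer (the database, table and file names below are placeholders, and it assumes the remaining embedded quotes get doubled in a separate step so that COPY's CSV mode accepts the file):
# join records whose quote count is unbalanced, then load the result
awk -f remove_cr.awk export.csv > joined.csv
psql -d mydb -c "\copy products FROM 'joined.csv' WITH (FORMAT csv)"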