Linux - how do I ignore special characters between “ ”?

My file: (1 sample line)



MMP,"01_janitorial,02_cleaning_tools",1,,CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i
& MetroMax Q shelf, NSF",CLEANING


I need to load this into a PostgreSQL table with 7 columns.



Breakdown of columns:

  1. MMP

  2. "01_janitorial,02_cleaning_tools"

  3. 1

  4. (empty)

  5. CUBIC_INCH

  6. "(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF"

  7. CLEANING

The file is basically comma-delimited, but I need to ignore commas, carriage returns (if present), and double quotes when they appear inside a double-quoted field, as in columns 2 and 6.

I can load the file with a PostgreSQL COPY command, or first convert it with awk, perl, sed, or whatever and then load it.







  • If possible, fix the process that generates that file. For the CSV format, a field that contains the quote character should double it: 123,"a string ""with double"" quotes, and commas",456
    – glenn jackman
    Jan 3 at 21:33

  • This is an export file from Akeneo (a COTS product), so I cannot change the process that generates the file.
    – J.Turck
    Jan 3 at 21:37

  • Definitely raise an issue with the software provider of the product that produced that output. There is no way to parse that file without hitting loads of edge cases that cause incorrect parsing. In the meantime I would write a parser in something like Python to handle the edge cases as you encounter them. If you can get them to fix it upstream, then you can just drop it and use a real CSV parser.
    – Michael Daffin
    Jan 3 at 21:57

  • Kindly provide sample input and output.
    – Praveen Kumar BS
    Jan 4 at 3:09

  • This quick-and-dirty sed script will fix up the quotes-inside-quotes so that the input file can be processed by a CSV parser: sed -E -e 's/"/""/g; s/,""/,"/g; s/"",/",/g; s/^""|""$/"/g;'. It first converts all "s to "", then changes them back to just " if they are next to commas or the start or end of the line. It doesn't fix any other potential problems in the input file (and if they get something like this wrong, that's quite likely).
    – cas
    Jan 4 at 4:01
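
As a rough check of the sed approach from the last comment, the sketch below wraps the commands in a function and runs it on a shortened record. The function name fix_quotes is made up for illustration, and the sketch assumes a sed that accepts -E (needed for the | alternation). It shares the limitation the comment mentions: an embedded quote that happens to sit directly next to a comma would be mistaken for a field boundary and left undoubled.

```shell
# Hypothetical helper: double every quote, then undo the doubling
# wherever the quote touches a comma or the start/end of the line.
fix_quotes() {
    sed -E -e 's/"/""/g; s/,""/,"/g; s/"",/",/g; s/^""|""$/"/g'
}

# Shortened stand-in for the record in the question.
printf '%s\n' 'A,"x,y",1,,"6" tall, 3" wide",Z' | fix_quotes
# prints: A,"x,y",1,,"6"" tall, 3"" wide",Z
```

The result is in the doubled-quote form that standard CSV readers expect, so it could be passed on to a real CSV parser.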














edited Jan 3 at 21:44
Michael Daffin

asked Jan 3 at 21:16
J.Turck











4 Answers

















Simply using -F, is often not enough to parse a CSV file, especially if, as described, the delimiter can be part of a quoted string. You can work around some of this by using FPAT (a GNU awk feature), which defines a field by a regular expression describing its content rather than by a delimiter character. However, awk will still go line-by-line, so you will have to preemptively consume the line breaks within your data.

Once done, you can do something such as awk 'BEGIN { FPAT = "([^,]*)|(\"[^\"]+\")" } { /* normal processing here */ }' /path/to/file.

That expression will define as a field either "anything that is not a comma" or "a double-quote, one or more of anything that is not a double-quote, followed by a double-quote".

This will, however, explode if any of your quoted data themselves contain double-quotes.
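
A minimal sketch of the FPAT idea, assuming GNU awk is installed as gawk (FPAT is not available in mawk or BusyBox awk). The function name count_fields is hypothetical. It only demonstrates well-formed quoted fields; as noted above, it does not cope with the stray quotes in field 6 of the question's data.

```shell
# Hypothetical helper: print the field count, then each field in angle brackets.
count_fields() {
    gawk 'BEGIN { FPAT = "([^,]*)|(\"[^\"]+\")" }
          { print NF; for (i = 1; i <= NF; i++) printf "%d: <%s>\n", i, $i }'
}

if command -v gawk >/dev/null 2>&1; then
    # First five fields of the sample record; field 2 contains a comma.
    printf '%s\n' 'MMP,"01_janitorial,02_cleaning_tools",1,,CUBIC_INCH' | count_fields
else
    echo "gawk not found; skipping FPAT demo" >&2
fi
```

Using `[^,]*` rather than `[^,]+` in the first alternative lets empty fields, such as the fourth one here, count as fields instead of being skipped.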






  • Column 6 has double quotes and commas between the double quotes: "(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF"
    – J.Turck
    Jan 3 at 21:30

  • Is it possible to generate your data file using a delimiter which is not present in the data themselves (e.g. |)?
    – DopeGhoti
    Jan 3 at 21:35

  • Unfortunately, no. It is an export from a COTS package, Akeneo.
    – J.Turck
    Jan 3 at 21:38

















As was said, the file was generated incorrectly. Nevertheless, you may try to work around it by treating not only , as a delimiter but also ", and ,". Of course, a custom script will be needed, and there is no guarantee that you won't meet those same sequences inside your 6th field.

Alternatively, you can strip the first five fields, assuming the 6th field is the only messed-up one, then cut the last field and its comma off the result. What remains will be the 6th field's content.
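
The second idea could be sketched with plain shell parameter expansion, as below. This is only an illustration under the assumptions stated in the question: exactly 7 fields, field 2 quoted and free of inner quotes, and field 6 the only field with stray quotes. The function name fix_record and the shortened sample text are made up for the example.

```shell
# Hypothetical helper for a 7-field record in which only field 6 has stray quotes.
fix_record() {
    line=$1
    f7=${line##*,}                      # field 7: everything after the last comma
    rest=${line%,*}                     # drop ",field7" from the end
    f1=${rest%%,*};  rest=${rest#*,}    # peel off field 1
    rest=${rest#\"}                     # field 2 is quoted and contains commas
    f2=${rest%%\"*}; rest=${rest#*\",}  # take text up to its closing quote
    f3=${rest%%,*};  rest=${rest#*,}
    f4=${rest%%,*};  rest=${rest#*,}
    f5=${rest%%,*};  rest=${rest#*,}
    rest=${rest#\"}; rest=${rest%\"}    # the remainder is field 6; strip its outer quotes
    f6=$(printf '%s\n' "$rest" | sed 's/"/""/g')   # double the embedded quotes
    printf '%s,"%s",%s,%s,%s,"%s",%s\n' "$f1" "$f2" "$f3" "$f4" "$f5" "$f6" "$f7"
}

fix_record 'MMP,"01_janitorial,02_cleaning_tools",1,,CUBIC_INCH,"(14) tray, 6" upright, 48"W",CLEANING'
# prints: MMP,"01_janitorial,02_cleaning_tools",1,,CUBIC_INCH,"(14) tray, 6"" upright, 48""W",CLEANING
```

Working inward from both ends means the commas and quotes inside field 6 never have to be interpreted, which is why this only holds while field 6 is the sole problem column.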






    The solution is going to be very specific to your data file, since the quotes aren't properly escaped. Since there is only one trouble column, it's quite doable though. Here ya go:



    #!/bin/bash
    while IFS='' read -r line || [[ -n "$line" ]]; do
        echo "Line: $line"

        # grabbing the first field is easy ..
        f1=$(echo "$line" | cut -d, -f1)

        # now remove the first field from the line
        line=$(echo "$line" | sed "s/$f1,//")
        echo "Line is now: $line"

        # to grab the second field use quote as a delimiter
        f2=$(echo "$line" | cut -d\" -f2)

        # now remove the second field (and its surrounding quotes) from the line
        line=$(echo "$line" | sed "s/\"$f2\",//")
        echo "Line is now: $line"

        # fields 3,4,5 are trivial .. just repeat the same pattern as 1 and then remove them
        f3=$(echo "$line" | cut -d, -f1)
        line=$(echo "$line" | sed "s/$f3,//")
        echo "Line is now: $line"
        f4=$(echo "$line" | cut -d, -f1)
        line=$(echo "$line" | sed "s/$f4,//")
        echo "Line is now: $line"
        f5=$(echo "$line" | cut -d, -f1)
        line=$(echo "$line" | sed "s/$f5,//")

        # here is the "trick" ... reverse the string, then you can cut field 7 first!
        line=$(echo "$line" | rev)
        echo "Line is now: $line"
        f7=$(echo "$line" | cut -d, -f1)

        # now remove field 7 from the string, then reverse it back
        line=$(echo "$line" | sed "s/$f7,//")
        f7=$(echo "$f7" | rev)

        # now we can reverse the remaining string, which is field 6 back to normal
        line=$(echo "$line" | rev)
        # and then remove the leading quote
        line=$(echo "$line" | cut --complement -c 1)
        # and then remove the trailing quote
        line=$(echo "$line" | sed 's/"$//')
        echo "Line is now: $line"
        # and then double up all the remaining quotes
        f6=$(echo "$line" | sed 's/"/""/g')

        echo "f1 = $f1"
        echo "f2 = $f2"
        echo "f3 = $f3"
        echo "f4 = $f4"
        echo "f5 = $f5"
        echo "f6 = $f6"
        echo "f7 = $f7"
        # write the repaired record, re-quoting fields 2 and 6
        echo "$f1,\"$f2\",$f3,$f4,$f5,\"$f6\",$f7" >> fixed.txt
    done < "$1"


    I made it echo lots of output to show you how it works; you can remove all the echo statements to make it fast once you understand it. It appends the fixed line to fixed.txt.



    Here is an example run and output:



    [root@alpha ~]# ./fixit.sh test.txt
    Line: MMP,"01_janitorial,02_cleaning_tools",1,,CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF",CLEANING
    Line is now: "01_janitorial,02_cleaning_tools",1,,CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF",CLEANING
    Line is now: 1,,CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF",CLEANING
    Line is now: ,CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF",CLEANING
    Line is now: CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF",CLEANING
    Line is now: GNINAELC,"FSN ,flehs Q xaMorteM & i xaMorteM stif ,yxope epuat ,D"42 x W"84 no stnuom ,gnicaps "3 htiw thgirpu "6 ,yticapac yart )41("
    Line is now: (14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF
    f1 = MMP
    f2 = 01_janitorial,02_cleaning_tools
    f3 = 1
    f4 =
    f5 = CUBIC_INCH
    f6 = (14) tray capacity, 6"" upright with 3"" spacing, mounts on 48""W x 24""D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF
    f7 = CLEANING


    If you need to escape the quotes in some other manner that should be pretty obvious given the above.






    • Wow, @alfreema, that's a lot of effort. I really appreciate it. The file I gave as my example is a shortened version of reality. My example showed 7 columns; I actually have 155. I believe I have most of it sorted, with the exception of what I call 'soft returns', which need to be removed while leaving the EOL character. The application allows formatting in certain fields, and the users add tabs, bullets, and returns to format the data. When I read the file, each 'soft return' is read as an EOL.
      – J.Turck
      Jan 4 at 18:30

    • • Sturdy construction: 300lb. (136kg), 400lb. (181kg), and$ 500lb. (227kg) capacity models available.",,,,,,,,,,,,0,DAY,24,INCH,,,Metro,"Fast drying of trays, pans, lids, pots and all pot sink items Promotes food safety by eliminating moisture. Interchangeable shelves adapt to all environments. Efficient organized drying area. Superior air circulation. Mobile unit allows for easy floor clean-up and$ transportability.",MTR2448XEA,,Cutting Board & Tray Drying Rack Group,570-825-2741,Customer,Service,PA,Wilkes-Barre,18705,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,48,INCH,,,^M$
      – J.Turck
      Jan 4 at 18:38

    • I want to remove the $ and leave the ^M.
      – J.Turck
      Jan 4 at 18:38

    • Are the soft returns literally $, or are you just using that as a placeholder for the example?
      – alfreema
      Jan 4 at 19:16

    • When I use :set list (inside the text file with vi) they show as $. My example is a cut and paste.
      – J.Turck
      Jan 5 at 2:08
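
In vi's :set list display, $ marks the end of a line (a line feed) and ^M is a carriage return, so the pasted sample suggests that real records end in CR LF while the soft returns are bare line feeds. Under that assumption, one possible sketch is to keep appending physical lines until one ends in a carriage return. The function name join_soft_returns is hypothetical, and replacing each soft return with a single space is a choice made for this example, not something stated in the thread.

```shell
# Hypothetical filter: join physical lines until one ends in CR (the real record end).
join_soft_returns() {
    awk '{
        buf = buf $0
        if ($0 ~ /\r$/) { print buf; buf = "" }   # CR before LF: real end of record
        else            { buf = buf " " }          # bare LF: soft return, keep reading
    }
    END { if (buf != "") print buf }'
}

# Two records; the first has a soft return inside a quoted field.
printf 'a,"x\ny",b\r\nc,d\r\n' | join_soft_returns
```

Each output record keeps its trailing carriage return, which matches the stated wish to remove the $ and leave the ^M.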

















    I get the final product by removing carriage returns within a quoted field, using the following script:

    $ cat remove_cr.awk
    #!/usr/bin/awk -f
    {
        record = record $0
        # If number of quotes is odd, continue reading record.
        if ( gsub( /"/, "&", record ) % 2 )
        {
            record = record " "
            next
        }
    }
    {
        print record
        record = ""
    }
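
For a quick trial of the same quote-counting idea without a separate script file, the sketch below inlines the logic in a shell function (the name join_quoted_lines is made up) and feeds it a shortened record that is split across two physical lines. One assumption worth stating: the parity test only works while each complete record contains an even number of quotes, so a record with an odd number of stray inch marks would be joined with its neighbor.

```shell
# Hypothetical wrapper around the quote-parity logic.
join_quoted_lines() {
    awk '{
        record = record $0
        # gsub replaces each quote with itself and returns how many it found;
        # an odd count means a quoted field is still open.
        if (gsub(/"/, "&", record) % 2) { record = record " "; next }
        print record
        record = ""
    }'
}

printf '%s\n' 'MMP,"a,b",1,,CUBIC_INCH,"fits MetroMax i' \
              '& MetroMax Q shelf, NSF",CLEANING' | join_quoted_lines
# prints: MMP,"a,b",1,,CUBIC_INCH,"fits MetroMax i & MetroMax Q shelf, NSF",CLEANING
```

Once the records are joined and the embedded quotes are doubled, the result could be loaded with something like psql -c "\copy products FROM 'fixed.csv' WITH (FORMAT csv)", where the table name products and the file name are placeholders.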






      answered Jan 3 at 21:26
      DopeGhoti











          answered Jan 3 at 23:44
          Putnik




















              up vote
              0
              down vote













              The solution is going to be very specific to your data file since the quotes aren't properly escaped off. Since there is only one trouble column, it's quite doable though. Here ya go:



              #!/bin/bash
              # Usage: ./fixit.sh input.txt   (repaired lines are appended to fixed.txt)
              # Note: field values are pasted into sed patterns as-is, so this breaks
              # if a field contains "/" or regex metacharacters.
              while IFS='' read -r line || [[ -n "$line" ]]; do
                  echo "Line: $line"

                  # grabbing the first field is easy ..
                  f1=$(echo "$line" | cut -d, -f1)

                  # now remove the first field from the line
                  line=$(echo "$line" | sed "s/$f1,//")
                  echo "Line is now: $line"

                  # to grab the second field use quote as a delimiter
                  f2=$(echo "$line" | cut -d\" -f2)

                  # now remove the second field (and its quotes) from the line
                  line=$(echo "$line" | sed "s/\"$f2\",//")
                  echo "Line is now: $line"

                  # fields 3,4,5 are trivial .. just repeat the same pattern as 1 and then remove them
                  f3=$(echo "$line" | cut -d, -f1)
                  line=$(echo "$line" | sed "s/$f3,//")
                  echo "Line is now: $line"
                  f4=$(echo "$line" | cut -d, -f1)
                  line=$(echo "$line" | sed "s/$f4,//")
                  echo "Line is now: $line"
                  f5=$(echo "$line" | cut -d, -f1)
                  line=$(echo "$line" | sed "s/$f5,//")

                  # here is the "trick" ... reverse the string, then you can cut field 7 first!
                  line=$(echo "$line" | rev)
                  echo "Line is now: $line"
                  f7=$(echo "$line" | cut -d, -f1)

                  # now remove field 7 from the (still reversed) string, then reverse field 7 back
                  line=$(echo "$line" | sed "s/$f7,//")
                  f7=$(echo "$f7" | rev)

                  # now we can reverse the remaining string, which is field 6, back to normal
                  line=$(echo "$line" | rev)
                  # and then remove the leading quote (--complement is a GNU cut option)
                  line=$(echo "$line" | cut --complement -c 1)
                  # and then remove the trailing quote
                  line=$(echo "$line" | sed "s/\"$//")
                  echo "Line is now: $line"
                  # and then double up all the remaining quotes
                  f6=$(echo "$line" | sed "s/\"/\"\"/g")

                  echo "f1 = $f1"
                  echo "f2 = $f2"
                  echo "f3 = $f3"
                  echo "f4 = $f4"
                  echo "f5 = $f5"
                  echo "f6 = $f6"
                  echo "f7 = $f7"
                  # write the repaired line, putting the quotes back around fields 2 and 6
                  echo "$f1,\"$f2\",$f3,$f4,$f5,\"$f6\",$f7" >> fixed.txt
              done < "$1"


              I made it echo lots of output to show you how it works; once you understand it, you can remove all the echo statements except the last one to make it faster. That last echo appends the fixed line to fixed.txt.



              Here is an example run and output:



              [root@alpha ~]# ./fixit.sh test.txt
              Line: MMP,"01_janitorial,02_cleaning_tools",1,,CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF",CLEANING
              Line is now: "01_janitorial,02_cleaning_tools",1,,CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF",CLEANING
              Line is now: 1,,CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF",CLEANING
              Line is now: ,CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF",CLEANING
              Line is now: CUBIC_INCH,"(14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF",CLEANING
              Line is now: GNINAELC,"FSN ,flehs Q xaMorteM & i xaMorteM stif ,yxope epuat ,D"42 x W"84 no stnuom ,gnicaps "3 htiw thgirpu "6 ,yticapac yart )41("
              Line is now: (14) tray capacity, 6" upright with 3" spacing, mounts on 48"W x 24"D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF
              f1 = MMP
              f2 = 01_janitorial,02_cleaning_tools
              f3 = 1
              f4 =
              f5 = CUBIC_INCH
              f6 = (14) tray capacity, 6"" upright with 3"" spacing, mounts on 48""W x 24""D, taupe epoxy, fits MetroMax i & MetroMax Q shelf, NSF
              f7 = CLEANING


              If you need to escape the quotes in some other manner that should be pretty obvious given the above.
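As a quick check that the doubled-quote form is what a standard CSV reader expects, and as one possible "other manner" of escaping, here is a small Python sketch. The function names are invented for illustration, and the backslash style is just one alternative a downstream loader might be configured for, not something the original file or the script above uses.

```python
import csv
import io


def parse_doubled(text):
    """Parse one repaired line; inside a quoted field, "" is read as one literal quote."""
    return next(csv.reader([text]))


def write_backslash_escaped(fields):
    """Write the same fields with embedded quotes backslash-escaped instead of doubled."""
    buf = io.StringIO()
    writer = csv.writer(buf, doublequote=False, escapechar="\\", lineterminator="\n")
    writer.writerow(fields)
    return buf.getvalue()


def parse_backslash_escaped(text):
    """Read a line produced by write_backslash_escaped back into its fields."""
    return next(csv.reader([text], doublequote=False, escapechar="\\"))
```

Feeding a line from fixed.txt to `parse_doubled` should return seven fields, with the inch marks restored as single quotes in the sixth. PostgreSQL's COPY in CSV format also treats a doubled quote as the default escape, so the doubled form is usually the one to keep for loading.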






              • Wow, @alfreema, that's a lot of effort. I really appreciate it. The file I gave as my example is a shortened version of reality. My example showed 7 columns; I actually have 155. I believe I have most of it sorted, with the exception of what I call 'soft returns', which need to be removed while leaving the EOL character. The application allows formatting in certain fields, and the users add tabs, bullets, and returns to format the data. When I read the file, each 'soft return' is read as an EOL.
                – J.Turck
                Jan 4 at 18:30











              • • Sturdy construction: 300lb. (136kg), 400lb. (181kg), and$ 500lb. (227kg) capacity models available.",,,,,,,,,,,,0,DAY,24,INCH,,,Metro,"Fast drying of trays, pans, lids, pots and all pot sink items Promotes food safety by eliminating moisture. Interchangeable shelves adapt to all environments. Efficient organized drying area. Superior air circulation. Mobile unit allows for easy floor clean-up and$ transportability.",MTR2448XEA,,Cutting Board & Tray Drying Rack Group,570-825-2741,Customer,Service,PA,Wilkes-Barre,18705,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,48,INCH,,,^M$
                – J.Turck
                Jan 4 at 18:38










              • I want to remove the $ and leave the ^M.
                – J.Turck
                Jan 4 at 18:38










              • Are the soft returns literally $, or are you just using that as a placeholder for the example?
                – alfreema
                Jan 4 at 19:16










              • When I use :set list (inside the text file with vi), they show as $. My example is a cut and paste.
                – J.Turck
                Jan 5 at 2:08
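In vi's list mode, `$` marks the end of a line (a line feed) and `^M` is a carriage return, so the pasted sample suggests that real records end in CR LF while the 'soft returns' are bare LFs. Under that assumption, a minimal Python sketch for dropping only the bare LFs could look like this (the function name `remove_soft_returns` is made up for illustration):

```python
import re

# A line feed that is NOT preceded by a carriage return.
LONE_LF = re.compile(rb"(?<!\r)\n")


def remove_soft_returns(data: bytes, replacement: bytes = b"") -> bytes:
    """Remove bare LFs while leaving the CR LF record terminators untouched."""
    return LONE_LF.sub(replacement, data)
```

The data should be read and written in binary mode (`open(path, "rb")` and `open(path, "wb")`), since text mode may translate line endings before the pattern ever sees them. Pass `b" "` as the replacement if the soft return should become a space instead of disappearing.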














              answered Jan 4 at 4:50









              alfreema

              I got the final result by removing the line breaks inside quoted fields with the following script:



              $ cat remove_cr.awk
              #!/usr/bin/awk -f
              {
                  record = record $0
                  # If number of quotes is odd, continue reading record.
                  if ( gsub( /"/, "&", record ) % 2 )
                  {
                      record = record " "
                      next
                  }
              }
              {
                  print record
                  record = ""
              }
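For comparison, the same quote-counting idea can be sketched in Python. The function name `join_broken_records` is hypothetical, and like the awk script it assumes that a record is complete exactly when it contains an even number of double quotes, which an unescaped inch mark inside a field can violate.

```python
def join_broken_records(lines):
    """Yield logical records, merging physical lines until the quote count is even."""
    record = ""
    for raw in lines:
        record += raw.rstrip("\n")
        if record.count('"') % 2:
            # Odd number of quotes: we are inside a quoted field, so the line
            # break was embedded in the data. Replace it with a space and go on.
            record += " "
            continue
        yield record
        record = ""
    if record:
        # Unlike the awk script, emit an unbalanced leftover at end of input
        # rather than dropping it silently.
        yield record
```

It can be used as `for rec in join_broken_records(open("export.csv", newline="")): ...`, where the file name is only a placeholder.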






                  edited Jan 8 at 17:57









                  αғsнιη











                  answered Jan 8 at 17:50









                  J.Turck
