How to use Unix Shell to show only the first n columns and last n columns?

I have many CSV files. The original design was supposed to have five columns.

I just found out that the middle column of each file contains a string with an arbitrary number of commas in it, and it is not quoted properly. This leads to rows with an arbitrary number of columns.

How do I get just the first two and last two columns of these CSV files?

Since the number of commas can change from row to row, I need a way to specify the first two and last two columns.

















asked Jan 23 at 18:56 by PoorLifeChoicesMadeMeWhoIAm, edited Jan 25 at 1:23







  • Please tell me the middle column is enclosed in quote marks. If it is not, tell whoever created these files they did it wrong.
    – Monty Harder, Jan 23 at 22:06

  • If the columns are properly quoted, then you can use a language (like perl or python or even awk) that has a CSV parsing library. See also stackoverflow.com/questions/4205431/…
    – cas, Jan 24 at 2:50

  • Yeah it wasn't quoted. The worst part is that column was not needed at all.
    – PoorLifeChoicesMadeMeWhoIAm, Jan 25 at 1:11






















3 Answers
Accepted answer (13 votes)




















    awk -F, '{ print $1, $2, $(NF-1), $NF }' < input

More generally (per the question's title), to print the first and last n columns of the input, without checking whether that means printing some columns twice:

    awk -v n=2 '{
        # first n fields
        for (i = 1; i <= n && i <= NF; i++)
            printf "%s%s", $i, OFS
        # last n fields
        for (i = NF - n + 1; i <= NF && i >= 1; i++)
            printf "%s%s", $i, OFS
        printf "%s", ORS
    }' < input

(use -F as needed for the input delimiter, and -v OFS=, if the output should stay comma-separated, since OFS defaults to a space)















answered Jan 23 at 19:07 by Jeff Schaller, edited Jan 24 at 1:00
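
The general awk loop above can be adapted so that overlapping columns are printed only once and no trailing separator is left behind. The following is only a sketch under stated assumptions: `first_last_cols` is a hypothetical helper name, the input is read from standard input, and both input and output are assumed to be comma-delimited.

```shell
# first_last_cols N: read comma-separated lines on stdin and print the first N
# and last N fields of each line, every field at most once, joined by commas.
# (Hypothetical helper for illustration; the name is not from the answer.)
first_last_cols() {
    awk -F, -v OFS=, -v n="$1" '{
        line = ""
        sep = ""
        for (i = 1; i <= NF; i++) {
            # Keep field i only if it is among the first n or the last n.
            if (i <= n || i > NF - n) {
                line = line sep $i
                sep = OFS
            }
        }
        print line
    }'
}

# Example: the unquoted middle column contributes four extra fields.
printf 'a,b,X,X,X,X,c,d\n' | first_last_cols 2
```

For many files, the same function could be applied in a loop such as `for f in *.csv; do first_last_cols 2 < "$f" > "${f%.csv}.trimmed.csv"; done`, where the `.trimmed.csv` suffix is just an illustrative choice.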











  • You don't have to use a redirect.
    – Steven Penny, Jan 23 at 23:52

  • That's correct. It's a habit I'm trying to form after seeing advice in that direction from other U&L members here. It means that the shell reports errors/failures instead of the utility, which is more consistent. It also prevents the utility from running if the I/O redirection fails.
    – Jeff Schaller, Jan 24 at 0:23
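
The difference described in the last comment can be observed directly. This sketch assumes that the path `/nonexistent/input.csv` (a made-up name) does not exist; the variable names are illustrative as well.

```shell
# Made-up path that is assumed not to exist on this system.
missing=/nonexistent/input.csv

# Redirection: the shell opens the file before starting awk. The open fails,
# so the shell reports the error and awk never runs (BEGIN prints nothing).
redirect_status=0
redirect_out=$( { awk 'BEGIN { print "awk started" } { print }' < "$missing"; } 2>/dev/null ) || redirect_status=$?

# File operand: awk starts (BEGIN runs), then awk itself fails to open the file.
operand_status=0
operand_out=$(awk 'BEGIN { print "awk started" } { print }' "$missing" 2>/dev/null) || operand_status=$?

printf 'redirect: status=%s output=[%s]\n' "$redirect_status" "$redirect_out"
printf 'operand:  status=%s output=[%s]\n' "$operand_status" "$operand_out"
```

Both forms end with a non-zero status, but only the file-operand form gets as far as running any awk code, which is the behavior the comment refers to.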








































perl:

    echo a,b,X,X,X,X,c,d | perl -F, -slane 'print join ",", @F[0..$n-1, -$n..-1]' -- -n=2

which prints

    a,b,c,d

Here -s enables Perl's basic switch parsing, so the -n=2 after -- sets $n to 2.
















answered Jan 25 at 19:06 by glenn jackman
































You can use this sed too:

    sed -E 's/(([^,]*,){2}).*((,[^,]*){2})/\1\3/;s/,,/,/'
















answered Jan 27 at 12:22 by ctac_
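
One thing to watch in the sed command: the second substitution, `s/,,/,/`, runs on every line, including lines where the first substitution did not match, so a four-field line with an empty field such as `a,,c,d` would lose a comma. A possible single-substitution variant is sketched below; it assumes a sed that supports `-E`, and `keep_outer_two` is a hypothetical function name.

```shell
# keep_outer_two: read comma-separated lines on stdin and keep only the first
# two and last two fields of lines that have five or more fields.
# Lines with fewer fields pass through unchanged.
# (Hypothetical helper name; assumes sed supports -E for extended regexes.)
keep_outer_two() {
    # \1  = first two fields with their trailing commas
    # .*, = the unwanted middle, up to the comma before the last two fields
    # \3  = last two fields
    sed -E 's/^(([^,]*,){2}).*,([^,]*,[^,]*)$/\1\3/'
}

printf 'a,b,X,X,X,X,c,d\n' | keep_outer_two
printf 'a,,c,d\n' | keep_outer_two
```

Anchoring the pattern with `^` and `$` means the middle is removed in one step, so no cleanup substitution is needed afterwards.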






















                       
