SED delete csv column if exists

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP












2















I need to remove a column from my tabulated CSV file if this column exists.



My CSV file:



GENE REF ALT
AKT A G
AKT G G


Desired output:
if column REF exists delete this column



GENE ALT
AKT G
AKT G


I tried to do that:



sed 's/tREF.[^t]*//' filename.csv


but it doesn't work.










share|improve this question






















  • Use awk: awk 'NF == 3 print $1,$3 file`

    – Valentin Bajrami
    Jan 4 at 10:32











  • Hi Valentin, thanks for your reply. The thing is sometimes REF column can exists sometimes not. Is with your command line it will remove always column 2?

    – user979974
    Jan 4 at 10:40











  • No. The NF checks if there are at least 3 fields. You might also want awk -vOFS="t" 'NF.....} file.

    – Valentin Bajrami
    Jan 4 at 12:29











  • what about awk ' print $1, $NF' OFS='t' infile?

    – αғsнιη
    Jan 4 at 14:43







  • 2





    sed is not the right tool for this job. The word REF does not appear on every line.

    – glenn jackman
    Jan 4 at 16:03















2















I need to remove a column from my tabulated CSV file if this column exists.



My CSV file:



GENE REF ALT
AKT A G
AKT G G


Desired output:
if column REF exists delete this column



GENE ALT
AKT G
AKT G


I tried to do that:



sed 's/tREF.[^t]*//' filename.csv


but it doesn't work.










share|improve this question






















  • Use awk: awk 'NF == 3 print $1,$3 file`

    – Valentin Bajrami
    Jan 4 at 10:32











  • Hi Valentin, thanks for your reply. The thing is sometimes REF column can exists sometimes not. Is with your command line it will remove always column 2?

    – user979974
    Jan 4 at 10:40











  • No. The NF checks if there are at least 3 fields. You might also want awk -vOFS="t" 'NF.....} file.

    – Valentin Bajrami
    Jan 4 at 12:29











  • what about awk ' print $1, $NF' OFS='t' infile?

    – αғsнιη
    Jan 4 at 14:43







  • 2





    sed is not the right tool for this job. The word REF does not appear on every line.

    – glenn jackman
    Jan 4 at 16:03













2












2








2








I need to remove a column from my tabulated CSV file if this column exists.



My CSV file:



GENE REF ALT
AKT A G
AKT G G


Desired output:
if column REF exists delete this column



GENE ALT
AKT G
AKT G


I tried to do that:



sed 's/tREF.[^t]*//' filename.csv


but it doesn't work.










share|improve this question














I need to remove a column from my tabulated CSV file if this column exists.



My CSV file:



GENE REF ALT
AKT A G
AKT G G


Desired output:
if column REF exists delete this column



GENE ALT
AKT G
AKT G


I tried to do that:



sed 's/tREF.[^t]*//' filename.csv


but it doesn't work.







sed csv delete






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Jan 4 at 10:07









user979974user979974

1133




1133












  • Use awk: awk 'NF == 3 print $1,$3 file`

    – Valentin Bajrami
    Jan 4 at 10:32











  • Hi Valentin, thanks for your reply. The thing is sometimes REF column can exists sometimes not. Is with your command line it will remove always column 2?

    – user979974
    Jan 4 at 10:40











  • No. The NF checks if there are at least 3 fields. You might also want awk -vOFS="t" 'NF.....} file.

    – Valentin Bajrami
    Jan 4 at 12:29











  • what about awk ' print $1, $NF' OFS='t' infile?

    – αғsнιη
    Jan 4 at 14:43







  • 2





    sed is not the right tool for this job. The word REF does not appear on every line.

    – glenn jackman
    Jan 4 at 16:03

















  • Use awk: awk 'NF == 3 print $1,$3 file`

    – Valentin Bajrami
    Jan 4 at 10:32











  • Hi Valentin, thanks for your reply. The thing is sometimes REF column can exists sometimes not. Is with your command line it will remove always column 2?

    – user979974
    Jan 4 at 10:40











  • No. The NF checks if there are at least 3 fields. You might also want awk -vOFS="t" 'NF.....} file.

    – Valentin Bajrami
    Jan 4 at 12:29











  • what about awk ' print $1, $NF' OFS='t' infile?

    – αғsнιη
    Jan 4 at 14:43







  • 2





    sed is not the right tool for this job. The word REF does not appear on every line.

    – glenn jackman
    Jan 4 at 16:03
















Use awk: awk 'NF == 3 print $1,$3 file`

– Valentin Bajrami
Jan 4 at 10:32





Use awk: awk 'NF == 3 print $1,$3 file`

– Valentin Bajrami
Jan 4 at 10:32













Hi Valentin, thanks for your reply. The thing is sometimes REF column can exists sometimes not. Is with your command line it will remove always column 2?

– user979974
Jan 4 at 10:40





Hi Valentin, thanks for your reply. The thing is sometimes REF column can exists sometimes not. Is with your command line it will remove always column 2?

– user979974
Jan 4 at 10:40













No. The NF checks if there are at least 3 fields. You might also want awk -vOFS="t" 'NF.....} file.

– Valentin Bajrami
Jan 4 at 12:29





No. The NF checks if there are at least 3 fields. You might also want awk -vOFS="t" 'NF.....} file.

– Valentin Bajrami
Jan 4 at 12:29













what about awk ' print $1, $NF' OFS='t' infile?

– αғsнιη
Jan 4 at 14:43






what about awk ' print $1, $NF' OFS='t' infile?

– αғsнιη
Jan 4 at 14:43





2




2





sed is not the right tool for this job. The word REF does not appear on every line.

– glenn jackman
Jan 4 at 16:03





sed is not the right tool for this job. The word REF does not appear on every line.

– glenn jackman
Jan 4 at 16:03










4 Answers
4






active

oldest

votes


















1














Hi with miller (http://johnkerl.org/miller/doc) and this input.csv



GENE,REF,ALT
AKT,A,G
AKT,G,G


is very easy



mlr --csv cut -x -f REF input.csv


The output is



GENE,ALT
AKT,G
AKT,G





share|improve this answer




















  • 1





    This miller tool works good :-)

    – user979974
    Jan 7 at 11:05


















4














With perl



$ perl -F't' -lane '@non_ref_cols = grep $F[$_] ne "REF" 0..$#F if $. == 1;
print join "t", @F[@non_ref_cols]' ip.txt
GENE ALT
AKT G
AKT G



  • -F't' use tab as field separator


  • @non_ref_cols = grep $F[$_] ne "REF" 0..$#F if $. == 1 for the header line, get index of all fields that is NOT REF


  • print join "t", @F[@non_ref_cols] print all fields using the index array we got for the header line, using tab as field separator





share|improve this answer




















  • 1





    Nice solution. I improved your variable name for clarity.

    – glenn jackman
    Jan 4 at 16:01


















1














With awk:



awk -F't' -v OFS='t' '
NR == 1 for (i=1; i<=NF; i++) if ($i == "REF") refCol = i
refCol for (i = refCol + 1; i <=NF; i++) $(i-1) = $i; NF--
1
' file.csv


The first line finds which column is the REF column. The refCol variable will be uninitialized if there is no such column.

The second line removes the column if it has been found.

The third line prints the record.






share|improve this answer






























    1














    You could determine which column REF is at with grep and then decide to delete it, e.g. with GNU cut and assuming your headings are single words and delimiters are tabs:



    colnumber=$(head -n1 file.tsv | grep -o '[^t]+' | grep -nx 'REF' | cut -d: -f1)
    [[ -n $colnumber ]] && cut --complement -f$colnumber file.tsv > file.tsv.new


    Output:



    GENE ALT
    AKT G
    AKT G





    share|improve this answer

























    • Very nice. Great idea to use --complement. You might want to add -d $'t' to the cut command. I would use grep -nx REF, but given the previous grep, that's no different.

      – glenn jackman
      Jan 4 at 15:58












    • After a 2nd look, this would be safer given it's a tab-separated file and spaces would be allowed as valid field characters: grep -Eo [^t]+ | grep -nx REF

      – glenn jackman
      Jan 4 at 16:05












    • @glennjackman: I agree, updated.

      – Thor
      Jan 4 at 16:21










    Your Answer








    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "106"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader:
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    ,
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );













    draft saved

    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f492434%2fsed-delete-csv-column-if-exists%23new-answer', 'question_page');

    );

    Post as a guest















    Required, but never shown

























    4 Answers
    4






    active

    oldest

    votes








    4 Answers
    4






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    1














    Hi with miller (http://johnkerl.org/miller/doc) and this input.csv



    GENE,REF,ALT
    AKT,A,G
    AKT,G,G


    is very easy



    mlr --csv cut -x -f REF input.csv


    The output is



    GENE,ALT
    AKT,G
    AKT,G





    share|improve this answer




















    • 1





      This miller tool works good :-)

      – user979974
      Jan 7 at 11:05















    1














    Hi with miller (http://johnkerl.org/miller/doc) and this input.csv



    GENE,REF,ALT
    AKT,A,G
    AKT,G,G


    is very easy



    mlr --csv cut -x -f REF input.csv


    The output is



    GENE,ALT
    AKT,G
    AKT,G





    share|improve this answer




















    • 1





      This miller tool works good :-)

      – user979974
      Jan 7 at 11:05













    1












    1








    1







    Hi with miller (http://johnkerl.org/miller/doc) and this input.csv



    GENE,REF,ALT
    AKT,A,G
    AKT,G,G


    is very easy



    mlr --csv cut -x -f REF input.csv


    The output is



    GENE,ALT
    AKT,G
    AKT,G





    share|improve this answer















    Hi with miller (http://johnkerl.org/miller/doc) and this input.csv



    GENE,REF,ALT
    AKT,A,G
    AKT,G,G


    is very easy



    mlr --csv cut -x -f REF input.csv


    The output is



    GENE,ALT
    AKT,G
    AKT,G






    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited Jan 4 at 17:34

























    answered Jan 4 at 17:17









    aborrusoaborruso

    20619




    20619







    • 1





      This miller tool works good :-)

      – user979974
      Jan 7 at 11:05












    • 1





      This miller tool works good :-)

      – user979974
      Jan 7 at 11:05







    1




    1





    This miller tool works good :-)

    – user979974
    Jan 7 at 11:05





    This miller tool works good :-)

    – user979974
    Jan 7 at 11:05













    4














    With perl



    $ perl -F't' -lane '@non_ref_cols = grep $F[$_] ne "REF" 0..$#F if $. == 1;
    print join "t", @F[@non_ref_cols]' ip.txt
    GENE ALT
    AKT G
    AKT G



    • -F't' use tab as field separator


    • @non_ref_cols = grep $F[$_] ne "REF" 0..$#F if $. == 1 for the header line, get index of all fields that is NOT REF


    • print join "t", @F[@non_ref_cols] print all fields using the index array we got for the header line, using tab as field separator





    share|improve this answer




















    • 1





      Nice solution. I improved your variable name for clarity.

      – glenn jackman
      Jan 4 at 16:01















    4














    With perl



    $ perl -F't' -lane '@non_ref_cols = grep $F[$_] ne "REF" 0..$#F if $. == 1;
    print join "t", @F[@non_ref_cols]' ip.txt
    GENE ALT
    AKT G
    AKT G



    • -F't' use tab as field separator


    • @non_ref_cols = grep $F[$_] ne "REF" 0..$#F if $. == 1 for the header line, get index of all fields that is NOT REF


    • print join "t", @F[@non_ref_cols] print all fields using the index array we got for the header line, using tab as field separator





    share|improve this answer




















    • 1





      Nice solution. I improved your variable name for clarity.

      – glenn jackman
      Jan 4 at 16:01













    4












    4








    4







    With perl



    $ perl -F't' -lane '@non_ref_cols = grep $F[$_] ne "REF" 0..$#F if $. == 1;
    print join "t", @F[@non_ref_cols]' ip.txt
    GENE ALT
    AKT G
    AKT G



    • -F't' use tab as field separator


    • @non_ref_cols = grep $F[$_] ne "REF" 0..$#F if $. == 1 for the header line, get index of all fields that is NOT REF


    • print join "t", @F[@non_ref_cols] print all fields using the index array we got for the header line, using tab as field separator





    share|improve this answer















    With perl



    $ perl -F't' -lane '@non_ref_cols = grep $F[$_] ne "REF" 0..$#F if $. == 1;
    print join "t", @F[@non_ref_cols]' ip.txt
    GENE ALT
    AKT G
    AKT G



    • -F't' use tab as field separator


    • @non_ref_cols = grep $F[$_] ne "REF" 0..$#F if $. == 1 for the header line, get index of all fields that is NOT REF


    • print join "t", @F[@non_ref_cols] print all fields using the index array we got for the header line, using tab as field separator






    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited Jan 4 at 16:01









    glenn jackman

    51k571110




    51k571110










    answered Jan 4 at 15:30









    SundeepSundeep

    7,4011927




    7,4011927







    • 1





      Nice solution. I improved your variable name for clarity.

      – glenn jackman
      Jan 4 at 16:01












    • 1





      Nice solution. I improved your variable name for clarity.

      – glenn jackman
      Jan 4 at 16:01







    1




    1





    Nice solution. I improved your variable name for clarity.

    – glenn jackman
    Jan 4 at 16:01





    Nice solution. I improved your variable name for clarity.

    – glenn jackman
    Jan 4 at 16:01











    1














    With awk:



    awk -F't' -v OFS='t' '
    NR == 1 for (i=1; i<=NF; i++) if ($i == "REF") refCol = i
    refCol for (i = refCol + 1; i <=NF; i++) $(i-1) = $i; NF--
    1
    ' file.csv


    The first line finds which column is the REF column. The refCol variable will be uninitialized if there is no such column.

    The second line removes the column if it has been found.

    The third line prints the record.






    share|improve this answer



























      1














      With awk:



      awk -F't' -v OFS='t' '
      NR == 1 for (i=1; i<=NF; i++) if ($i == "REF") refCol = i
      refCol for (i = refCol + 1; i <=NF; i++) $(i-1) = $i; NF--
      1
      ' file.csv


      The first line finds which column is the REF column. The refCol variable will be uninitialized if there is no such column.

      The second line removes the column if it has been found.

      The third line prints the record.






      share|improve this answer

























        1












        1








        1







        With awk:



        awk -F't' -v OFS='t' '
        NR == 1 for (i=1; i<=NF; i++) if ($i == "REF") refCol = i
        refCol for (i = refCol + 1; i <=NF; i++) $(i-1) = $i; NF--
        1
        ' file.csv


        The first line finds which column is the REF column. The refCol variable will be uninitialized if there is no such column.

        The second line removes the column if it has been found.

        The third line prints the record.






        share|improve this answer













        With awk:



        awk -F't' -v OFS='t' '
        NR == 1 for (i=1; i<=NF; i++) if ($i == "REF") refCol = i
        refCol for (i = refCol + 1; i <=NF; i++) $(i-1) = $i; NF--
        1
        ' file.csv


        The first line finds which column is the REF column. The refCol variable will be uninitialized if there is no such column.

        The second line removes the column if it has been found.

        The third line prints the record.







        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Jan 4 at 15:54









        glenn jackmanglenn jackman

        51k571110




        51k571110





















            1














            You could determine which column REF is at with grep and then decide to delete it, e.g. with GNU cut and assuming your headings are single words and delimiters are tabs:



            colnumber=$(head -n1 file.tsv | grep -o '[^t]+' | grep -nx 'REF' | cut -d: -f1)
            [[ -n $colnumber ]] && cut --complement -f$colnumber file.tsv > file.tsv.new


            Output:



            GENE ALT
            AKT G
            AKT G





            share|improve this answer

























            • Very nice. Great idea to use --complement. You might want to add -d $'t' to the cut command. I would use grep -nx REF, but given the previous grep, that's no different.

              – glenn jackman
              Jan 4 at 15:58












            • After a 2nd look, this would be safer given it's a tab-separated file and spaces would be allowed as valid field characters: grep -Eo [^t]+ | grep -nx REF

              – glenn jackman
              Jan 4 at 16:05












            • @glennjackman: I agree, updated.

              – Thor
              Jan 4 at 16:21















            1














            You could determine which column REF is at with grep and then decide to delete it, e.g. with GNU cut and assuming your headings are single words and delimiters are tabs:



            colnumber=$(head -n1 file.tsv | grep -o '[^t]+' | grep -nx 'REF' | cut -d: -f1)
            [[ -n $colnumber ]] && cut --complement -f$colnumber file.tsv > file.tsv.new


            Output:



            GENE ALT
            AKT G
            AKT G





            share|improve this answer

























            • Very nice. Great idea to use --complement. You might want to add -d $'t' to the cut command. I would use grep -nx REF, but given the previous grep, that's no different.

              – glenn jackman
              Jan 4 at 15:58












            • After a 2nd look, this would be safer given it's a tab-separated file and spaces would be allowed as valid field characters: grep -Eo [^t]+ | grep -nx REF

              – glenn jackman
              Jan 4 at 16:05












            • @glennjackman: I agree, updated.

              – Thor
              Jan 4 at 16:21













            1












            1








            1







            You could determine which column REF is at with grep and then decide to delete it, e.g. with GNU cut and assuming your headings are single words and delimiters are tabs:



            colnumber=$(head -n1 file.tsv | grep -o '[^t]+' | grep -nx 'REF' | cut -d: -f1)
            [[ -n $colnumber ]] && cut --complement -f$colnumber file.tsv > file.tsv.new


            Output:



            GENE ALT
            AKT G
            AKT G





            share|improve this answer















            You could determine which column REF is at with grep and then decide to delete it, e.g. with GNU cut and assuming your headings are single words and delimiters are tabs:



            colnumber=$(head -n1 file.tsv | grep -o '[^t]+' | grep -nx 'REF' | cut -d: -f1)
            [[ -n $colnumber ]] && cut --complement -f$colnumber file.tsv > file.tsv.new


            Output:



            GENE ALT
            AKT G
            AKT G






            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Jan 4 at 16:20

























            answered Jan 4 at 11:08









            ThorThor

            11.7k13359




            11.7k13359












            • Very nice. Great idea to use --complement. You might want to add -d $'t' to the cut command. I would use grep -nx REF, but given the previous grep, that's no different.

              – glenn jackman
              Jan 4 at 15:58












            • After a 2nd look, this would be safer given it's a tab-separated file and spaces would be allowed as valid field characters: grep -Eo [^t]+ | grep -nx REF

              – glenn jackman
              Jan 4 at 16:05












            • @glennjackman: I agree, updated.

              – Thor
              Jan 4 at 16:21

















            • Very nice. Great idea to use --complement. You might want to add -d $'t' to the cut command. I would use grep -nx REF, but given the previous grep, that's no different.

              – glenn jackman
              Jan 4 at 15:58












            • After a 2nd look, this would be safer given it's a tab-separated file and spaces would be allowed as valid field characters: grep -Eo [^t]+ | grep -nx REF

              – glenn jackman
              Jan 4 at 16:05












            • @glennjackman: I agree, updated.

              – Thor
              Jan 4 at 16:21
















            Very nice. Great idea to use --complement. You might want to add -d $'t' to the cut command. I would use grep -nx REF, but given the previous grep, that's no different.

            – glenn jackman
            Jan 4 at 15:58






            Very nice. Great idea to use --complement. You might want to add -d $'t' to the cut command. I would use grep -nx REF, but given the previous grep, that's no different.

            – glenn jackman
            Jan 4 at 15:58














            After a 2nd look, this would be safer given it's a tab-separated file and spaces would be allowed as valid field characters: grep -Eo [^t]+ | grep -nx REF

            – glenn jackman
            Jan 4 at 16:05






            After a 2nd look, this would be safer given it's a tab-separated file and spaces would be allowed as valid field characters: grep -Eo [^t]+ | grep -nx REF

            – glenn jackman
            Jan 4 at 16:05














            @glennjackman: I agree, updated.

            – Thor
            Jan 4 at 16:21





            @glennjackman: I agree, updated.

            – Thor
            Jan 4 at 16:21

















            draft saved

            draft discarded
















































            Thanks for contributing an answer to Unix & Linux Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid


            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.

            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f492434%2fsed-delete-csv-column-if-exists%23new-answer', 'question_page');

            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown






            Popular posts from this blog

            How to check contact read email or not when send email to Individual?

            Displaying single band from multi-band raster using QGIS

            How many registers does an x86_64 CPU actually have?