How to extract lines when the strings in two columns are not equal

I have a CSV file in the following format:

text1,text2,string1,string2
text3,text3,string3,string2
text4,text5,string1,string2
text6,text6,string6,string7

I want to extract the rows where column1 and column2 are not equal. The expected result for the example above would be:

text1,text2,string1,string2
text4,text5,string1,string2

I am familiar with commands that extract a specific column, such as the following, which extracts the first column:

cat input.csv | cut -d ',' -f1 > output.csv

text-processing csv-simple

edited Nov 25 at 13:58 by Rui F Ribeiro
asked Nov 25 at 12:23 by user9371654

3 Answers

Accepted answer:

Assuming that this is a simple CSV file, with no commas embedded within the fields of the actual data, you may use awk to do this:

awk -F ',' '$1 != $2' <input.csv

This is a shorthand way of writing

awk 'BEGIN { FS = "," } $1 != $2 { print }' <input.csv

and it sets the input field separator to a comma and prints each line if the first and second fields ($1 and $2) are not identical.



An equivalent Perl variant:

perl -F',' -na -e 'print if $F[0] ne $F[1]' <input.csv
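One caveat may be worth checking before relying on the awk form, sketched here with made-up rows. awk compares two input fields numerically when both look like numbers, so a row such as 7,007 would be treated as equal and dropped, whereas Perl's ne always compares strings. Appending an empty string to each field should force a string comparison in awk. The file names sample.csv, default.out and string.out, and the sample rows, are assumptions for illustration only.

```shell
# Hypothetical sample rows: the second and third have numerically equal
# but textually different values in columns 1 and 2.
printf '%s\n' 'a,b,x' '7,007,x' '1,1.0,x' 'c,c,x' > sample.csv

# Default comparison: numeric-looking fields are compared as numbers,
# so 7 vs 007 and 1 vs 1.0 count as equal and are not printed.
awk -F ',' '$1 != $2' sample.csv > default.out

# Concatenating "" turns each field into a string, forcing a string comparison.
awk -F ',' '($1 "") != ($2 "")' sample.csv > string.out

echo 'default comparison:'
cat default.out
echo 'string comparison:'
cat string.out
```

If the first two columns never hold numeric-looking values, both commands should behave the same and the shorter form is enough.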





edited Nov 25 at 14:34
answered Nov 25 at 12:30 by Kusalananda

• Good description. It can be even shorter if you strip the unnecessary quotes, spaces, and redirection: awk -F, '$1!=$2' input.csv
  – steve
  Nov 25 at 13:47

GNU sed solution:

sed -E '/^([^,]+,)\1/d' input.csv

The output:

text1,text2,string1,string2
text4,text5,string1,string2
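For readers unfamiliar with backreferences, a small sketch may help. The group ([^,]+,) captures the first field together with its comma, and \1 demands that the same text appear again immediately afterwards; matching lines are deleted. As written, the pattern needs a non-empty first field and a comma after the second field, so two edge cases appear to slip through. The second command below is one possible tightening, offered as an assumption rather than as part of the answer, and it treats two empty fields as equal. The file edge.csv and its rows are made up for illustration.

```shell
# Hypothetical rows: equal columns, different columns, a two-column row
# with equal values, and a row whose first two fields are both empty.
printf '%s\n' 'x,x,1' 'x,y,1' 'z,z' ',,1' > edge.csv

# Pattern from the answer: deletes x,x,1 but keeps z,z (no comma after the
# second field) and ,,1 (the first field is empty, so [^,]+ cannot match).
sed -E '/^([^,]+,)\1/d' edge.csv > sed.out

# Assumed variant: capture the field without its comma, allow it to be empty,
# and require the second field to end at a comma or at the end of the line.
sed -E '/^([^,]*),\1(,|$)/d' edge.csv > sed_strict.out

echo 'pattern from the answer:'
cat sed.out
echo 'stricter variant:'
cat sed_strict.out
```

For the four-column sample in the question, both patterns should give the same output shown above.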





edited Nov 25 at 13:01 by Kusalananda
answered Nov 25 at 12:42 by RomanPerekhrest

• Can you please provide an explanation? Does this command compare the two columns and print the lines where column1 and column2 differ? I do not care about column3 or column4 at all; they may be the same or different.
  – user9371654
  Nov 25 at 14:17

• @user9371654 The command uses a regular expression that only matches lines with equal values in the first and second columns (literally: "any string not containing a comma, followed by a comma, then the same string and comma again"). The lines that match are removed from the input, and only the lines that you'd like to see are let through.
  – Kusalananda
  Nov 25 at 14:49

$ awk -F "," '{ if ($1 != $2) print $0 }' filename
text1,text2,string1,string2
text4,text5,string1,string2
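If the columns to compare vary, the same test can be parameterized. The sketch below assumes two hypothetical awk variables, a and b, holding the column numbers; it recreates the sample data from the question as input.csv and redirects the result to output.csv, in the same way as the cut command in the question.

```shell
# Recreate the sample data from the question.
printf '%s\n' \
  'text1,text2,string1,string2' \
  'text3,text3,string3,string2' \
  'text4,text5,string1,string2' \
  'text6,text6,string6,string7' > input.csv

# a and b are illustrative variable names; $a and $b select those fields.
# A pattern with no action prints every line for which the test is true.
awk -F ',' -v a=1 -v b=2 '$a != $b' input.csv > output.csv

cat output.csv
```

Changing the values passed with -v compares a different pair of columns without editing the awk program itself.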





edited Nov 25 at 14:50 by Jeff Schaller
answered Nov 25 at 13:40 by Praveen Kumar BS