awk search and replace string in a specific column of CSV file

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP











up vote
3
down vote

favorite












I have a csv file with 17 columns and million rows. I want to search for a specific string in the 16th column and replace all the instances of that string with another string. Since my rest of the program uses bash script, I thought using awk instead of Python search & replace. My current OS is Rhel6.



The following is the sample output of my data:



SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active


In this the 16th column is Market, wherein I want to change the Market1 to MarketPrime. The name of the file is marketinfo_2018-06-26.csv



I tried the following code:



awk -F '| +' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv


This runs without any output, but the string Market1 still remains.







share|improve this question

























    up vote
    3
    down vote

    favorite












    I have a csv file with 17 columns and million rows. I want to search for a specific string in the 16th column and replace all the instances of that string with another string. Since my rest of the program uses bash script, I thought using awk instead of Python search & replace. My current OS is Rhel6.



    The following is the sample output of my data:



    SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
    1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|Market1|Active
    1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|Market1|Active
    1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active


    In this the 16th column is Market, wherein I want to change the Market1 to MarketPrime. The name of the file is marketinfo_2018-06-26.csv



    I tried the following code:



    awk -F '| +' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv


    This runs without any output, but the string Market1 still remains.







    share|improve this question























      up vote
      3
      down vote

      favorite









      up vote
      3
      down vote

      favorite











      I have a csv file with 17 columns and million rows. I want to search for a specific string in the 16th column and replace all the instances of that string with another string. Since my rest of the program uses bash script, I thought using awk instead of Python search & replace. My current OS is Rhel6.



      The following is the sample output of my data:



      SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
      1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|Market1|Active
      1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|Market1|Active
      1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active


      In this the 16th column is Market, wherein I want to change the Market1 to MarketPrime. The name of the file is marketinfo_2018-06-26.csv



      I tried the following code:



      awk -F '| +' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv


      This runs without any output, but the string Market1 still remains.







      share|improve this question













      I have a csv file with 17 columns and million rows. I want to search for a specific string in the 16th column and replace all the instances of that string with another string. Since my rest of the program uses bash script, I thought using awk instead of Python search & replace. My current OS is Rhel6.



      The following is the sample output of my data:



      SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
      1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|Market1|Active
      1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|Market1|Active
      1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active


      In this the 16th column is Market, wherein I want to change the Market1 to MarketPrime. The name of the file is marketinfo_2018-06-26.csv



      I tried the following code:



      awk -F '| +' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv


      This runs without any output, but the string Market1 still remains.









      share|improve this question












      share|improve this question




      share|improve this question








      edited Jun 27 at 8:01









      Jeff Schaller

      30.8k846104




      30.8k846104









      asked Jun 27 at 6:21









      Apricot

      2287




      2287




















          3 Answers
          3






          active

          oldest

          votes

















          up vote
          8
          down vote



          accepted










          awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv


          The only real issue in your code is that you set the input file separator to not just | but to spaces as well. This will make the spaces count as field separators in the data and it will be incredibly hard to figure out what the correct field number is (since some fields contain a variable number of spaces).



          You also can not redirect into the same filename as you use to read from. Doing so would cause the shell to first truncate (empty) the output file, and your awk program would have no data to read.



          Your code does a regular expression replacement. This is ok, but you need to be aware that if the 16th field happens to be something like Market12 or TheMarket1, it would trigger the substitution due to the missing anchor points. It would be safer to use ^Market1$ as the expression to replace, or to use a string comparison.



          The awk command above uses only | as a field separator and then does a string comparison with the 16th field. If that field is Market1, it is set to MarketPrime.



          The trailing 1 at the end of the awk code causes every record (modified or not) to be printed.






          share|improve this answer























          • Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
            – Apricot
            Jun 27 at 6:34






          • 1




            @Apricot You did not run my command. My command has -v OFS='|' on the command line. (or should have had, oops, sorry! Fixed now)
            – Kusalananda
            Jun 27 at 6:36











          • Thank you for your inputs, when I changed the awk -F '| +' to awk -F '|' it worked perfectly....accepting your answer. Thank you again.
            – Apricot
            Jun 27 at 6:37

















          up vote
          2
          down vote













          The issue is with input field separator.



          Since you want to use multiple field separator(which is not required), the number of fields in each row is different as shown below.



          $ awk -F '[| +]' 'print NF' test.csv
          17
          26
          23
          21


          if you use only | as IFS, then your code will work. Since each row has 17 fields as shown below.



          awk -F "|" 'print NF' test.csv
          17
          17
          17
          17


          Solution 1: with multiple IFS.



          awk -F '[| +]' 'gsub("Market1","MarketPrime",$(NF-1)); print' OFS="|" test.csv

          SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
          1001098068|ResidentialRegular|01/20/2007|Annual|package|199|May17|pack|Basic|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|Package|199|pack|MarketPrime|Active
          1001098068|ResidentialRegular|01/20/2007|Annual|Pack|Premium|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|English|Movies|pack|MarketPrime|Active
          1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active


          Solution 2: with fixed field 16



          awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" test.csv

          SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
          1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|MarketPrime|Active
          1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|MarketPrime|Active
          1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active





          share|improve this answer






























            up vote
            0
            down vote













            For the sake of clarity for others who may face similar issues:



            Both these answers worked for this scenario:



            Kusalananda's answer:



            awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv


            My revised answer based on Kusalananda's answer:



            awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv





            share|improve this answer

















            • 1




              Take care that your input and output filenames are the same.
              – Rakesh Sharma
              Jun 27 at 7:21










            • As I pointed out in my answer, awk ... file >file will truncate file, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactly Market1.
              – Kusalananda
              Jun 27 at 8:39











            Your Answer







            StackExchange.ready(function()
            var channelOptions =
            tags: "".split(" "),
            id: "106"
            ;
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function()
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled)
            StackExchange.using("snippets", function()
            createEditor();
            );

            else
            createEditor();

            );

            function createEditor()
            StackExchange.prepareEditor(
            heartbeatType: 'answer',
            convertImagesToLinks: false,
            noModals: false,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: null,
            bindNavPrevention: true,
            postfix: "",
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            );



            );








             

            draft saved


            draft discarded


















            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f452148%2fawk-search-and-replace-string-in-a-specific-column-of-csv-file%23new-answer', 'question_page');

            );

            Post as a guest






























            3 Answers
            3






            active

            oldest

            votes








            3 Answers
            3






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes








            up vote
            8
            down vote



            accepted










            awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv


            The only real issue in your code is that you set the input file separator to not just | but to spaces as well. This will make the spaces count as field separators in the data and it will be incredibly hard to figure out what the correct field number is (since some fields contain a variable number of spaces).



            You also can not redirect into the same filename as you use to read from. Doing so would cause the shell to first truncate (empty) the output file, and your awk program would have no data to read.



            Your code does a regular expression replacement. This is ok, but you need to be aware that if the 16th field happens to be something like Market12 or TheMarket1, it would trigger the substitution due to the missing anchor points. It would be safer to use ^Market1$ as the expression to replace, or to use a string comparison.



            The awk command above uses only | as a field separator and then does a string comparison with the 16th field. If that field is Market1, it is set to MarketPrime.



            The trailing 1 at the end of the awk code causes every record (modified or not) to be printed.






            share|improve this answer























            • Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
              – Apricot
              Jun 27 at 6:34






            • 1




              @Apricot You did not run my command. My command has -v OFS='|' on the command line. (or should have had, oops, sorry! Fixed now)
              – Kusalananda
              Jun 27 at 6:36











            • Thank you for your inputs, when I changed the awk -F '| +' to awk -F '|' it worked perfectly....accepting your answer. Thank you again.
              – Apricot
              Jun 27 at 6:37














            up vote
            8
            down vote



            accepted










            awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv


            The only real issue in your code is that you set the input file separator to not just | but to spaces as well. This will make the spaces count as field separators in the data and it will be incredibly hard to figure out what the correct field number is (since some fields contain a variable number of spaces).



            You also can not redirect into the same filename as you use to read from. Doing so would cause the shell to first truncate (empty) the output file, and your awk program would have no data to read.



            Your code does a regular expression replacement. This is ok, but you need to be aware that if the 16th field happens to be something like Market12 or TheMarket1, it would trigger the substitution due to the missing anchor points. It would be safer to use ^Market1$ as the expression to replace, or to use a string comparison.



            The awk command above uses only | as a field separator and then does a string comparison with the 16th field. If that field is Market1, it is set to MarketPrime.



            The trailing 1 at the end of the awk code causes every record (modified or not) to be printed.






            share|improve this answer























            • Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
              – Apricot
              Jun 27 at 6:34






            • 1




              @Apricot You did not run my command. My command has -v OFS='|' on the command line. (or should have had, oops, sorry! Fixed now)
              – Kusalananda
              Jun 27 at 6:36











            • Thank you for your inputs, when I changed the awk -F '| +' to awk -F '|' it worked perfectly....accepting your answer. Thank you again.
              – Apricot
              Jun 27 at 6:37












            up vote
            8
            down vote



            accepted







            up vote
            8
            down vote



            accepted






            awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv


            The only real issue in your code is that you set the input file separator to not just | but to spaces as well. This will make the spaces count as field separators in the data and it will be incredibly hard to figure out what the correct field number is (since some fields contain a variable number of spaces).



            You also can not redirect into the same filename as you use to read from. Doing so would cause the shell to first truncate (empty) the output file, and your awk program would have no data to read.



            Your code does a regular expression replacement. This is ok, but you need to be aware that if the 16th field happens to be something like Market12 or TheMarket1, it would trigger the substitution due to the missing anchor points. It would be safer to use ^Market1$ as the expression to replace, or to use a string comparison.



            The awk command above uses only | as a field separator and then does a string comparison with the 16th field. If that field is Market1, it is set to MarketPrime.



            The trailing 1 at the end of the awk code causes every record (modified or not) to be printed.






            share|improve this answer















            awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv


            The only real issue in your code is that you set the input file separator to not just | but to spaces as well. This will make the spaces count as field separators in the data and it will be incredibly hard to figure out what the correct field number is (since some fields contain a variable number of spaces).



            You also can not redirect into the same filename as you use to read from. Doing so would cause the shell to first truncate (empty) the output file, and your awk program would have no data to read.



            Your code does a regular expression replacement. This is ok, but you need to be aware that if the 16th field happens to be something like Market12 or TheMarket1, it would trigger the substitution due to the missing anchor points. It would be safer to use ^Market1$ as the expression to replace, or to use a string comparison.



            The awk command above uses only | as a field separator and then does a string comparison with the 16th field. If that field is Market1, it is set to MarketPrime.



            The trailing 1 at the end of the awk code causes every record (modified or not) to be printed.







            share|improve this answer















            share|improve this answer



            share|improve this answer








            edited Jun 27 at 6:38


























            answered Jun 27 at 6:29









            Kusalananda

            101k13199312




            101k13199312











            • Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
              – Apricot
              Jun 27 at 6:34






            • 1




              @Apricot You did not run my command. My command has -v OFS='|' on the command line. (or should have had, oops, sorry! Fixed now)
              – Kusalananda
              Jun 27 at 6:36











            • Thank you for your inputs, when I changed the awk -F '| +' to awk -F '|' it worked perfectly....accepting your answer. Thank you again.
              – Apricot
              Jun 27 at 6:37
















            • Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
              – Apricot
              Jun 27 at 6:34






            • 1




              @Apricot You did not run my command. My command has -v OFS='|' on the command line. (or should have had, oops, sorry! Fixed now)
              – Kusalananda
              Jun 27 at 6:36











            • Thank you for your inputs, when I changed the awk -F '| +' to awk -F '|' it worked perfectly....accepting your answer. Thank you again.
              – Apricot
              Jun 27 at 6:37















            Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
            – Apricot
            Jun 27 at 6:34




            Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
            – Apricot
            Jun 27 at 6:34




            1




            1




            @Apricot You did not run my command. My command has -v OFS='|' on the command line. (or should have had, oops, sorry! Fixed now)
            – Kusalananda
            Jun 27 at 6:36





            @Apricot You did not run my command. My command has -v OFS='|' on the command line. (or should have had, oops, sorry! Fixed now)
            – Kusalananda
            Jun 27 at 6:36













            Thank you for your inputs, when I changed the awk -F '| +' to awk -F '|' it worked perfectly....accepting your answer. Thank you again.
            – Apricot
            Jun 27 at 6:37




            Thank you for your inputs, when I changed the awk -F '| +' to awk -F '|' it worked perfectly....accepting your answer. Thank you again.
            – Apricot
            Jun 27 at 6:37












            up vote
            2
            down vote













            The issue is with input field separator.



            Since you want to use multiple field separator(which is not required), the number of fields in each row is different as shown below.



            $ awk -F '[| +]' 'print NF' test.csv
            17
            26
            23
            21


            if you use only | as IFS, then your code will work. Since each row has 17 fields as shown below.



            awk -F "|" 'print NF' test.csv
            17
            17
            17
            17


            Solution 1: with multiple IFS.



            awk -F '[| +]' 'gsub("Market1","MarketPrime",$(NF-1)); print' OFS="|" test.csv

            SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
            1001098068|ResidentialRegular|01/20/2007|Annual|package|199|May17|pack|Basic|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|Package|199|pack|MarketPrime|Active
            1001098068|ResidentialRegular|01/20/2007|Annual|Pack|Premium|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|English|Movies|pack|MarketPrime|Active
            1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active


            Solution 2: with fixed field 16



            awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" test.csv

            SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
            1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|MarketPrime|Active
            1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|MarketPrime|Active
            1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active





            share|improve this answer



























              up vote
              2
              down vote













              The issue is with input field separator.



              Since you want to use multiple field separator(which is not required), the number of fields in each row is different as shown below.



              $ awk -F '[| +]' 'print NF' test.csv
              17
              26
              23
              21


              if you use only | as IFS, then your code will work. Since each row has 17 fields as shown below.



              awk -F "|" 'print NF' test.csv
              17
              17
              17
              17


              Solution 1: with multiple IFS.



              awk -F '[| +]' 'gsub("Market1","MarketPrime",$(NF-1)); print' OFS="|" test.csv

              SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
              1001098068|ResidentialRegular|01/20/2007|Annual|package|199|May17|pack|Basic|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|Package|199|pack|MarketPrime|Active
              1001098068|ResidentialRegular|01/20/2007|Annual|Pack|Premium|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|English|Movies|pack|MarketPrime|Active
              1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active


              Solution 2: with fixed field 16



              awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" test.csv

              SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
              1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|MarketPrime|Active
              1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|MarketPrime|Active
              1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active





              share|improve this answer

























                up vote
                2
                down vote










                up vote
                2
                down vote









                The issue is with input field separator.



                Since you want to use multiple field separator(which is not required), the number of fields in each row is different as shown below.



                $ awk -F '[| +]' 'print NF' test.csv
                17
                26
                23
                21


                if you use only | as IFS, then your code will work. Since each row has 17 fields as shown below.



                awk -F "|" 'print NF' test.csv
                17
                17
                17
                17


                Solution 1: with multiple IFS.



                awk -F '[| +]' 'gsub("Market1","MarketPrime",$(NF-1)); print' OFS="|" test.csv

                SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
                1001098068|ResidentialRegular|01/20/2007|Annual|package|199|May17|pack|Basic|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|Package|199|pack|MarketPrime|Active
                1001098068|ResidentialRegular|01/20/2007|Annual|Pack|Premium|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|English|Movies|pack|MarketPrime|Active
                1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active


                Solution 2: with fixed field 16



                awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" test.csv

                SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
                1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|MarketPrime|Active
                1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|MarketPrime|Active
                1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active





                share|improve this answer















                The issue is with input field separator.



                Since you want to use multiple field separator(which is not required), the number of fields in each row is different as shown below.



                $ awk -F '[| +]' 'print NF' test.csv
                17
                26
                23
                21


                if you use only | as IFS, then your code will work. Since each row has 17 fields as shown below.



                awk -F "|" 'print NF' test.csv
                17
                17
                17
                17


                Solution 1: with multiple IFS.



                awk -F '[| +]' 'gsub("Market1","MarketPrime",$(NF-1)); print' OFS="|" test.csv

                SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
                1001098068|ResidentialRegular|01/20/2007|Annual|package|199|May17|pack|Basic|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|Package|199|pack|MarketPrime|Active
                1001098068|ResidentialRegular|01/20/2007|Annual|Pack|Premium|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|English|Movies|pack|MarketPrime|Active
                1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active


                Solution 2: with fixed field 16



                awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" test.csv

                SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
                1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|MarketPrime|Active
                1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|MarketPrime|Active
                1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active






                share|improve this answer















                share|improve this answer



                share|improve this answer








                edited Jun 27 at 8:36


























                answered Jun 27 at 6:29









                SivaPrasath

                3,88611737




                3,88611737




















                    up vote
                    0
                    down vote













                    For the sake of clarity for others who may face similar issues:



                    Both these answers worked for this scenario:



                    Kusalananda's answer:



                    awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv


                    My revised answer based on Kusalananda's answer:



                    awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv





                    share|improve this answer

















                    • 1




                      Take care that your input and output filenames are the same.
                      – Rakesh Sharma
                      Jun 27 at 7:21










                    • As I pointed out in my answer, awk ... file >file will truncate file, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactly Market1.
                      – Kusalananda
                      Jun 27 at 8:39















                    up vote
                    0
                    down vote













                    For the sake of clarity for others who may face similar issues:



                    Both these answers worked for this scenario:



                    Kusalananda's answer:



                    awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv


                    My revised answer based on Kusalananda's answer:



                    awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv





                    share|improve this answer

















                    • 1




                      Take care that your input and output filenames are the same.
                      – Rakesh Sharma
                      Jun 27 at 7:21










                    • As I pointed out in my answer, awk ... file >file will truncate file, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactly Market1.
                      – Kusalananda
                      Jun 27 at 8:39













                    up vote
                    0
                    down vote










                    up vote
                    0
                    down vote









                    For the sake of clarity for others who may face similar issues:



                    Both these answers worked for this scenario:



                    Kusalananda's answer:



                    awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv


                    My revised answer based on Kusalananda's answer:



                    awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv





                    share|improve this answer













                    For the sake of clarity for others who may face similar issues:



                    Both these answers worked for this scenario:



                    Kusalananda's answer:



                    awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv


                    My revised answer based on Kusalananda's answer:



                    awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv






                    share|improve this answer













                    share|improve this answer



                    share|improve this answer











                    answered Jun 27 at 6:43









                    Apricot

                    2287




                    2287







                    • 1




                      Take care that your input and output filenames are the same.
                      – Rakesh Sharma
                      Jun 27 at 7:21










                    • As I pointed out in my answer, awk ... file >file will truncate file, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactly Market1.
                      – Kusalananda
                      Jun 27 at 8:39













                    • 1




                      Take care that your input and output filenames are the same.
                      – Rakesh Sharma
                      Jun 27 at 7:21










                    • As I pointed out in my answer, awk ... file >file will truncate file, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactly Market1.
                      – Kusalananda
                      Jun 27 at 8:39








                    1




                    1




                    Take care that your input and output filenames are the same.
                    – Rakesh Sharma
                    Jun 27 at 7:21




                    Take care that your input and output filenames are the same.
                    – Rakesh Sharma
                    Jun 27 at 7:21












                    As I pointed out in my answer, awk ... file >file will truncate file, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactly Market1.
                    – Kusalananda
                    Jun 27 at 8:39





                    As I pointed out in my answer, awk ... file >file will truncate file, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactly Market1.
                    – Kusalananda
                    Jun 27 at 8:39













                     

                    draft saved


                    draft discarded


























                     


                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function ()
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f452148%2fawk-search-and-replace-string-in-a-specific-column-of-csv-file%23new-answer', 'question_page');

                    );

                    Post as a guest













































































                    Popular posts from this blog

                    How to check contact read email or not when send email to Individual?

                    Displaying single band from multi-band raster using QGIS

                    How many registers does an x86_64 CPU actually have?