awk search and replace string in a specific column of CSV file

up vote
3
down vote

favorite

I have a csv file with 17 columns and million rows. I want to search for a specific string in the 16th column and replace all the instances of that string with another string. Since my rest of the program uses bash script, I thought using awk instead of Python search & replace. My current OS is Rhel6.

The following is the sample output of my data:

SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active

In this the 16th column is Market, wherein I want to change the Market1 to MarketPrime. The name of the file is marketinfo_2018-06-26.csv

I tried the following code:

awk -F '| +' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv

This runs without any output, but the string Market1 still remains.

edited Jun 27 at 8:01

Jeff Schaller

30.8k846104

asked Jun 27 at 6:21

Apricot

2287

add a commentÂ |Â

up vote
3
down vote

favorite

The following is the sample output of my data:

SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active

In this the 16th column is Market, wherein I want to change the Market1 to MarketPrime. The name of the file is marketinfo_2018-06-26.csv

I tried the following code:

awk -F '| +' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv

This runs without any output, but the string Market1 still remains.

edited Jun 27 at 8:01

Jeff Schaller

30.8k846104

asked Jun 27 at 6:21

Apricot

2287

add a commentÂ |Â

up vote
3
down vote

favorite

The following is the sample output of my data:

SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active

In this the 16th column is Market, wherein I want to change the Market1 to MarketPrime. The name of the file is marketinfo_2018-06-26.csv

I tried the following code:

awk -F '| +' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv

This runs without any output, but the string Market1 still remains.

edited Jun 27 at 8:01

Jeff Schaller

30.8k846104

asked Jun 27 at 6:21

Apricot

2287

The following is the sample output of my data:

SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active

In this the 16th column is Market, wherein I want to change the Market1 to MarketPrime. The name of the file is marketinfo_2018-06-26.csv

I tried the following code:

awk -F '| +' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv

This runs without any output, but the string Market1 still remains.

edited Jun 27 at 8:01

Jeff Schaller

30.8k846104

asked Jun 27 at 6:21

Apricot

2287

edited Jun 27 at 8:01

Jeff Schaller

30.8k846104

edited Jun 27 at 8:01

Jeff Schaller

30.8k846104

edited Jun 27 at 8:01

Jeff Schaller

30.8k846104

asked Jun 27 at 6:21

Apricot

2287

asked Jun 27 at 6:21

Apricot

2287

asked Jun 27 at 6:21

Apricot

2287

add a commentÂ |Â

3 Answers
3

active

oldest

votes

up vote
8
down vote

accepted

awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv

The only real issue in your code is that you set the input file separator to not just | but to spaces as well. This will make the spaces count as field separators in the data and it will be incredibly hard to figure out what the correct field number is (since some fields contain a variable number of spaces).

You also can not redirect into the same filename as you use to read from. Doing so would cause the shell to first truncate (empty) the output file, and your awk program would have no data to read.

Your code does a regular expression replacement. This is ok, but you need to be aware that if the 16th field happens to be something like Market12 or TheMarket1, it would trigger the substitution due to the missing anchor points. It would be safer to use ^Market1$ as the expression to replace, or to use a string comparison.

The awk command above uses only | as a field separator and then does a string comparison with the 16th field. If that field is Market1, it is set to MarketPrime.

The trailing 1 at the end of the awk code causes every record (modified or not) to be printed.

edited Jun 27 at 6:38

answered Jun 27 at 6:29

Kusalananda

101k13199312

Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
â€“Â Apricot
Jun 27 at 6:34

1

@Apricot You did not run my command. My command has -v OFS='|' on the command line. (or should have had, oops, sorry! Fixed now)
â€“Â Kusalananda
Jun 27 at 6:36

Thank you for your inputs, when I changed the awk -F '| +' to awk -F '|' it worked perfectly....accepting your answer. Thank you again.
â€“Â Apricot
Jun 27 at 6:37

add a commentÂ |Â

up vote
2
down vote

The issue is with input field separator.

Since you want to use multiple field separator(which is not required), the number of fields in each row is different as shown below.

$ awk -F '[| +]' 'print NF' test.csv
17
26
23
21

if you use only | as IFS, then your code will work. Since each row has 17 fields as shown below.

awk -F "|" 'print NF' test.csv
17
17
17
17

Solution 1: with multiple IFS.

awk -F '[| +]' 'gsub("Market1","MarketPrime",$(NF-1)); print' OFS="|" test.csv

SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual|package|199|May17|pack|Basic|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|Package|199|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual|Pack|Premium|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|English|Movies|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active

Solution 2: with fixed field 16

awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" test.csv

SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active

edited Jun 27 at 8:36

answered Jun 27 at 6:29

SivaPrasath

3,88611737

add a commentÂ |Â

up vote
0
down vote

For the sake of clarity for others who may face similar issues:

Both these answers worked for this scenario:

Kusalananda's answer:

awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv

My revised answer based on Kusalananda's answer:

awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv

answered Jun 27 at 6:43

Apricot

2287

1

Take care that your input and output filenames are the same.
â€“Â Rakesh Sharma
Jun 27 at 7:21

As I pointed out in my answer, awk ... file >file will truncate file, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactly Market1.
â€“Â Kusalananda
Jun 27 at 8:39

add a commentÂ |Â

Your Answer

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "106"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f452148%2fawk-search-and-replace-string-in-a-specific-column-of-csv-file%23new-answer', 'question_page');

);

Post as a guest

Name

3 Answers
3

active

oldest

votes

3 Answers
3

active

oldest

votes

up vote
8
down vote

accepted

awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv

You also can not redirect into the same filename as you use to read from. Doing so would cause the shell to first truncate (empty) the output file, and your awk program would have no data to read.

The awk command above uses only | as a field separator and then does a string comparison with the 16th field. If that field is Market1, it is set to MarketPrime.

The trailing 1 at the end of the awk code causes every record (modified or not) to be printed.

edited Jun 27 at 6:38

answered Jun 27 at 6:29

Kusalananda

101k13199312

Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
â€“Â Apricot
Jun 27 at 6:34

1

@Apricot You did not run my command. My command has -v OFS='|' on the command line. (or should have had, oops, sorry! Fixed now)
â€“Â Kusalananda
Jun 27 at 6:36

Thank you for your inputs, when I changed the awk -F '| +' to awk -F '|' it worked perfectly....accepting your answer. Thank you again.
â€“Â Apricot
Jun 27 at 6:37

add a commentÂ |Â

up vote
8
down vote

accepted

awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv

You also can not redirect into the same filename as you use to read from. Doing so would cause the shell to first truncate (empty) the output file, and your awk program would have no data to read.

The awk command above uses only | as a field separator and then does a string comparison with the 16th field. If that field is Market1, it is set to MarketPrime.

The trailing 1 at the end of the awk code causes every record (modified or not) to be printed.

edited Jun 27 at 6:38

answered Jun 27 at 6:29

Kusalananda

101k13199312

Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
â€“Â Apricot
Jun 27 at 6:34

1

@Apricot You did not run my command. My command has -v OFS='|' on the command line. (or should have had, oops, sorry! Fixed now)
â€“Â Kusalananda
Jun 27 at 6:36

Thank you for your inputs, when I changed the awk -F '| +' to awk -F '|' it worked perfectly....accepting your answer. Thank you again.
â€“Â Apricot
Jun 27 at 6:37

add a commentÂ |Â

up vote
8
down vote

accepted

awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv

You also can not redirect into the same filename as you use to read from. Doing so would cause the shell to first truncate (empty) the output file, and your awk program would have no data to read.

The awk command above uses only | as a field separator and then does a string comparison with the 16th field. If that field is Market1, it is set to MarketPrime.

The trailing 1 at the end of the awk code causes every record (modified or not) to be printed.

edited Jun 27 at 6:38

answered Jun 27 at 6:29

Kusalananda

101k13199312

awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv

You also can not redirect into the same filename as you use to read from. Doing so would cause the shell to first truncate (empty) the output file, and your awk program would have no data to read.

The awk command above uses only | as a field separator and then does a string comparison with the 16th field. If that field is Market1, it is set to MarketPrime.

The trailing 1 at the end of the awk code causes every record (modified or not) to be printed.

edited Jun 27 at 6:38

answered Jun 27 at 6:29

Kusalananda

101k13199312

edited Jun 27 at 6:38

answered Jun 27 at 6:29

Kusalananda

101k13199312

answered Jun 27 at 6:29

Kusalananda

101k13199312

answered Jun 27 at 6:29

Kusalananda

101k13199312

Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
â€“Â Apricot
Jun 27 at 6:34

1

@Apricot You did not run my command. My command has -v OFS='|' on the command line. (or should have had, oops, sorry! Fixed now)
â€“Â Kusalananda
Jun 27 at 6:36

Thank you for your inputs, when I changed the awk -F '| +' to awk -F '|' it worked perfectly....accepting your answer. Thank you again.
â€“Â Apricot
Jun 27 at 6:37

add a commentÂ |Â

Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
â€“Â Apricot
Jun 27 at 6:34

1

@Apricot You did not run my command. My command has -v OFS='|' on the command line. (or should have had, oops, sorry! Fixed now)
â€“Â Kusalananda
Jun 27 at 6:36

Thank you for your inputs, when I changed the awk -F '| +' to awk -F '|' it worked perfectly....accepting your answer. Thank you again.
â€“Â Apricot
Jun 27 at 6:37

Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
â€“Â Apricot
Jun 27 at 6:34

@Apricot You did not run my command. My command has -v OFS='|' on the command line. (or should have had, oops, sorry! Fixed now)
â€“Â Kusalananda
Jun 27 at 6:36

Thank you for your inputs, when I changed the awk -F '| +' to awk -F '|' it worked perfectly....accepting your answer. Thank you again.
â€“Â Apricot
Jun 27 at 6:37

add a commentÂ |Â

up vote
2
down vote

The issue is with input field separator.

Since you want to use multiple field separator(which is not required), the number of fields in each row is different as shown below.

$ awk -F '[| +]' 'print NF' test.csv
17
26
23
21

if you use only | as IFS, then your code will work. Since each row has 17 fields as shown below.

awk -F "|" 'print NF' test.csv
17
17
17
17

Solution 1: with multiple IFS.

awk -F '[| +]' 'gsub("Market1","MarketPrime",$(NF-1)); print' OFS="|" test.csv

SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual|package|199|May17|pack|Basic|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|Package|199|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual|Pack|Premium|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|English|Movies|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active

Solution 2: with fixed field 16

awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" test.csv

SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active

edited Jun 27 at 8:36

answered Jun 27 at 6:29

SivaPrasath

3,88611737

add a commentÂ |Â

up vote
2
down vote

The issue is with input field separator.

Since you want to use multiple field separator(which is not required), the number of fields in each row is different as shown below.

$ awk -F '[| +]' 'print NF' test.csv
17
26
23
21

if you use only | as IFS, then your code will work. Since each row has 17 fields as shown below.

awk -F "|" 'print NF' test.csv
17
17
17
17

Solution 1: with multiple IFS.

awk -F '[| +]' 'gsub("Market1","MarketPrime",$(NF-1)); print' OFS="|" test.csv

SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual|package|199|May17|pack|Basic|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|Package|199|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual|Pack|Premium|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|English|Movies|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active

Solution 2: with fixed field 16

awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" test.csv

SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active

edited Jun 27 at 8:36

answered Jun 27 at 6:29

SivaPrasath

3,88611737

add a commentÂ |Â

up vote
2
down vote

The issue is with input field separator.

Since you want to use multiple field separator(which is not required), the number of fields in each row is different as shown below.

$ awk -F '[| +]' 'print NF' test.csv
17
26
23
21

if you use only | as IFS, then your code will work. Since each row has 17 fields as shown below.

awk -F "|" 'print NF' test.csv
17
17
17
17

Solution 1: with multiple IFS.

awk -F '[| +]' 'gsub("Market1","MarketPrime",$(NF-1)); print' OFS="|" test.csv

SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual|package|199|May17|pack|Basic|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|Package|199|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual|Pack|Premium|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|English|Movies|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active

Solution 2: with fixed field 16

awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" test.csv

SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active

edited Jun 27 at 8:36

answered Jun 27 at 6:29

SivaPrasath

3,88611737

The issue is with input field separator.

Since you want to use multiple field separator(which is not required), the number of fields in each row is different as shown below.

$ awk -F '[| +]' 'print NF' test.csv
17
26
23
21

if you use only | as IFS, then your code will work. Since each row has 17 fields as shown below.

awk -F "|" 'print NF' test.csv
17
17
17
17

Solution 1: with multiple IFS.

awk -F '[| +]' 'gsub("Market1","MarketPrime",$(NF-1)); print' OFS="|" test.csv

SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual|package|199|May17|pack|Basic|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|Package|199|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual|Pack|Premium|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|English|Movies|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active

Solution 2: with fixed field 16

awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" test.csv

SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active

edited Jun 27 at 8:36

answered Jun 27 at 6:29

SivaPrasath

3,88611737

edited Jun 27 at 8:36

answered Jun 27 at 6:29

SivaPrasath

3,88611737

answered Jun 27 at 6:29

SivaPrasath

3,88611737

answered Jun 27 at 6:29

SivaPrasath

3,88611737

add a commentÂ |Â

up vote
0
down vote

For the sake of clarity for others who may face similar issues:

Both these answers worked for this scenario:

Kusalananda's answer:

awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv

My revised answer based on Kusalananda's answer:

awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv

answered Jun 27 at 6:43

Apricot

2287

1

Take care that your input and output filenames are the same.
â€“Â Rakesh Sharma
Jun 27 at 7:21

As I pointed out in my answer, awk ... file >file will truncate file, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactly Market1.
â€“Â Kusalananda
Jun 27 at 8:39

add a commentÂ |Â

up vote
0
down vote

For the sake of clarity for others who may face similar issues:

Both these answers worked for this scenario:

Kusalananda's answer:

awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv

My revised answer based on Kusalananda's answer:

awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv

answered Jun 27 at 6:43

Apricot

2287

1

Take care that your input and output filenames are the same.
â€“Â Rakesh Sharma
Jun 27 at 7:21

As I pointed out in my answer, awk ... file >file will truncate file, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactly Market1.
â€“Â Kusalananda
Jun 27 at 8:39

add a commentÂ |Â

up vote
0
down vote

For the sake of clarity for others who may face similar issues:

Both these answers worked for this scenario:

Kusalananda's answer:

awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv

My revised answer based on Kusalananda's answer:

awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv

answered Jun 27 at 6:43

Apricot

2287

For the sake of clarity for others who may face similar issues:

Both these answers worked for this scenario:

Kusalananda's answer:

awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv

My revised answer based on Kusalananda's answer:

awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv

answered Jun 27 at 6:43

Apricot

2287

answered Jun 27 at 6:43

Apricot

2287

answered Jun 27 at 6:43

Apricot

2287

answered Jun 27 at 6:43

Apricot

2287

1

Take care that your input and output filenames are the same.
â€“Â Rakesh Sharma
Jun 27 at 7:21

As I pointed out in my answer, awk ... file >file will truncate file, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactly Market1.
â€“Â Kusalananda
Jun 27 at 8:39

add a commentÂ |Â

1

Take care that your input and output filenames are the same.
â€“Â Rakesh Sharma
Jun 27 at 7:21

As I pointed out in my answer, awk ... file >file will truncate file, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactly Market1.
â€“Â Kusalananda
Jun 27 at 8:39

Take care that your input and output filenames are the same.
â€“Â Rakesh Sharma
Jun 27 at 7:21

As I pointed out in my answer, awk ... file >file will truncate file, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactly Market1.
â€“Â Kusalananda
Jun 27 at 8:39

add a commentÂ |Â

draft saved

draft discarded

draft saved

draft discarded

Post as a guest

Name

搜尋此網誌

mjhjmtu