awk search and replace string in a specific column of CSV file
Clash Royale CLAN TAG#URR8PPP
up vote
3
down vote
favorite
I have a csv file with 17 columns and million rows. I want to search for a specific string in the 16th column and replace all the instances of that string with another string. Since my rest of the program uses bash script, I thought using awk instead of Python search & replace. My current OS is Rhel6.
The following is the sample output of my data:
SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active
In this the 16th column is Market, wherein I want to change the Market1
to MarketPrime
. The name of the file is marketinfo_2018-06-26.csv
I tried the following code:
awk -F '| +' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv
This runs without any output, but the string Market1
still remains.
text-processing awk csv
add a comment |Â
up vote
3
down vote
favorite
I have a csv file with 17 columns and million rows. I want to search for a specific string in the 16th column and replace all the instances of that string with another string. Since my rest of the program uses bash script, I thought using awk instead of Python search & replace. My current OS is Rhel6.
The following is the sample output of my data:
SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active
In this the 16th column is Market, wherein I want to change the Market1
to MarketPrime
. The name of the file is marketinfo_2018-06-26.csv
I tried the following code:
awk -F '| +' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv
This runs without any output, but the string Market1
still remains.
text-processing awk csv
add a comment |Â
up vote
3
down vote
favorite
up vote
3
down vote
favorite
I have a csv file with 17 columns and million rows. I want to search for a specific string in the 16th column and replace all the instances of that string with another string. Since my rest of the program uses bash script, I thought using awk instead of Python search & replace. My current OS is Rhel6.
The following is the sample output of my data:
SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active
In this the 16th column is Market, wherein I want to change the Market1
to MarketPrime
. The name of the file is marketinfo_2018-06-26.csv
I tried the following code:
awk -F '| +' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv
This runs without any output, but the string Market1
still remains.
text-processing awk csv
I have a csv file with 17 columns and million rows. I want to search for a specific string in the 16th column and replace all the instances of that string with another string. Since my rest of the program uses bash script, I thought using awk instead of Python search & replace. My current OS is Rhel6.
The following is the sample output of my data:
SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|Market1|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active
In this the 16th column is Market, wherein I want to change the Market1
to MarketPrime
. The name of the file is marketinfo_2018-06-26.csv
I tried the following code:
awk -F '| +' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv
This runs without any output, but the string Market1
still remains.
text-processing awk csv
edited Jun 27 at 8:01
Jeff Schaller
30.8k846104
30.8k846104
asked Jun 27 at 6:21
Apricot
2287
2287
add a comment |Â
add a comment |Â
3 Answers
3
active
oldest
votes
up vote
8
down vote
accepted
awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv
The only real issue in your code is that you set the input file separator to not just |
but to spaces as well. This will make the spaces count as field separators in the data and it will be incredibly hard to figure out what the correct field number is (since some fields contain a variable number of spaces).
You also can not redirect into the same filename as you use to read from. Doing so would cause the shell to first truncate (empty) the output file, and your awk
program would have no data to read.
Your code does a regular expression replacement. This is ok, but you need to be aware that if the 16th field happens to be something like Market12
or TheMarket1
, it would trigger the substitution due to the missing anchor points. It would be safer to use ^Market1$
as the expression to replace, or to use a string comparison.
The awk
command above uses only |
as a field separator and then does a string comparison with the 16th field. If that field is Market1
, it is set to MarketPrime
.
The trailing 1
at the end of the awk
code causes every record (modified or not) to be printed.
Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
â Apricot
Jun 27 at 6:34
1
@Apricot You did not run my command. My command has-v OFS='|'
on the command line. (or should have had, oops, sorry! Fixed now)
â Kusalananda
Jun 27 at 6:36
Thank you for your inputs, when I changed theawk -F '| +'
toawk -F '|'
it worked perfectly....accepting your answer. Thank you again.
â Apricot
Jun 27 at 6:37
add a comment |Â
up vote
2
down vote
The issue is with input field separator.
Since you want to use multiple field separator(which is not required), the number of fields in each row is different as shown below.
$ awk -F '[| +]' 'print NF' test.csv
17
26
23
21
if you use only |
as IFS, then your code will work. Since each row has 17 fields as shown below.
awk -F "|" 'print NF' test.csv
17
17
17
17
Solution 1: with multiple IFS.
awk -F '[| +]' 'gsub("Market1","MarketPrime",$(NF-1)); print' OFS="|" test.csv
SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual|package|199|May17|pack|Basic|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|Package|199|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual|Pack|Premium|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|English|Movies|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active
Solution 2: with fixed field 16
awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" test.csv
SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active
add a comment |Â
up vote
0
down vote
For the sake of clarity for others who may face similar issues:
Both these answers worked for this scenario:
Kusalananda's answer:
awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv
My revised answer based on Kusalananda's answer:
awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv
1
Take care that your input and output filenames are the same.
â Rakesh Sharma
Jun 27 at 7:21
As I pointed out in my answer,awk ... file >file
will truncatefile
, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactlyMarket1
.
â Kusalananda
Jun 27 at 8:39
add a comment |Â
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
8
down vote
accepted
awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv
The only real issue in your code is that you set the input file separator to not just |
but to spaces as well. This will make the spaces count as field separators in the data and it will be incredibly hard to figure out what the correct field number is (since some fields contain a variable number of spaces).
You also can not redirect into the same filename as you use to read from. Doing so would cause the shell to first truncate (empty) the output file, and your awk
program would have no data to read.
Your code does a regular expression replacement. This is ok, but you need to be aware that if the 16th field happens to be something like Market12
or TheMarket1
, it would trigger the substitution due to the missing anchor points. It would be safer to use ^Market1$
as the expression to replace, or to use a string comparison.
The awk
command above uses only |
as a field separator and then does a string comparison with the 16th field. If that field is Market1
, it is set to MarketPrime
.
The trailing 1
at the end of the awk
code causes every record (modified or not) to be printed.
Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
â Apricot
Jun 27 at 6:34
1
@Apricot You did not run my command. My command has-v OFS='|'
on the command line. (or should have had, oops, sorry! Fixed now)
â Kusalananda
Jun 27 at 6:36
Thank you for your inputs, when I changed theawk -F '| +'
toawk -F '|'
it worked perfectly....accepting your answer. Thank you again.
â Apricot
Jun 27 at 6:37
add a comment |Â
up vote
8
down vote
accepted
awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv
The only real issue in your code is that you set the input file separator to not just |
but to spaces as well. This will make the spaces count as field separators in the data and it will be incredibly hard to figure out what the correct field number is (since some fields contain a variable number of spaces).
You also can not redirect into the same filename as you use to read from. Doing so would cause the shell to first truncate (empty) the output file, and your awk
program would have no data to read.
Your code does a regular expression replacement. This is ok, but you need to be aware that if the 16th field happens to be something like Market12
or TheMarket1
, it would trigger the substitution due to the missing anchor points. It would be safer to use ^Market1$
as the expression to replace, or to use a string comparison.
The awk
command above uses only |
as a field separator and then does a string comparison with the 16th field. If that field is Market1
, it is set to MarketPrime
.
The trailing 1
at the end of the awk
code causes every record (modified or not) to be printed.
Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
â Apricot
Jun 27 at 6:34
1
@Apricot You did not run my command. My command has-v OFS='|'
on the command line. (or should have had, oops, sorry! Fixed now)
â Kusalananda
Jun 27 at 6:36
Thank you for your inputs, when I changed theawk -F '| +'
toawk -F '|'
it worked perfectly....accepting your answer. Thank you again.
â Apricot
Jun 27 at 6:37
add a comment |Â
up vote
8
down vote
accepted
up vote
8
down vote
accepted
awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv
The only real issue in your code is that you set the input file separator to not just |
but to spaces as well. This will make the spaces count as field separators in the data and it will be incredibly hard to figure out what the correct field number is (since some fields contain a variable number of spaces).
You also can not redirect into the same filename as you use to read from. Doing so would cause the shell to first truncate (empty) the output file, and your awk
program would have no data to read.
Your code does a regular expression replacement. This is ok, but you need to be aware that if the 16th field happens to be something like Market12
or TheMarket1
, it would trigger the substitution due to the missing anchor points. It would be safer to use ^Market1$
as the expression to replace, or to use a string comparison.
The awk
command above uses only |
as a field separator and then does a string comparison with the 16th field. If that field is Market1
, it is set to MarketPrime
.
The trailing 1
at the end of the awk
code causes every record (modified or not) to be printed.
awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv
The only real issue in your code is that you set the input file separator to not just |
but to spaces as well. This will make the spaces count as field separators in the data and it will be incredibly hard to figure out what the correct field number is (since some fields contain a variable number of spaces).
You also can not redirect into the same filename as you use to read from. Doing so would cause the shell to first truncate (empty) the output file, and your awk
program would have no data to read.
Your code does a regular expression replacement. This is ok, but you need to be aware that if the 16th field happens to be something like Market12
or TheMarket1
, it would trigger the substitution due to the missing anchor points. It would be safer to use ^Market1$
as the expression to replace, or to use a string comparison.
The awk
command above uses only |
as a field separator and then does a string comparison with the 16th field. If that field is Market1
, it is set to MarketPrime
.
The trailing 1
at the end of the awk
code causes every record (modified or not) to be printed.
edited Jun 27 at 6:38
answered Jun 27 at 6:29
Kusalananda
101k13199312
101k13199312
Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
â Apricot
Jun 27 at 6:34
1
@Apricot You did not run my command. My command has-v OFS='|'
on the command line. (or should have had, oops, sorry! Fixed now)
â Kusalananda
Jun 27 at 6:36
Thank you for your inputs, when I changed theawk -F '| +'
toawk -F '|'
it worked perfectly....accepting your answer. Thank you again.
â Apricot
Jun 27 at 6:37
add a comment |Â
Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
â Apricot
Jun 27 at 6:34
1
@Apricot You did not run my command. My command has-v OFS='|'
on the command line. (or should have had, oops, sorry! Fixed now)
â Kusalananda
Jun 27 at 6:36
Thank you for your inputs, when I changed theawk -F '| +'
toawk -F '|'
it worked perfectly....accepting your answer. Thank you again.
â Apricot
Jun 27 at 6:37
Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
â Apricot
Jun 27 at 6:34
Thank you. This worked...but it removed the '|' pipe separator from the entire line wherever I had Market1, and hence those lines in the new file got jumbled up together.
â Apricot
Jun 27 at 6:34
1
1
@Apricot You did not run my command. My command has
-v OFS='|'
on the command line. (or should have had, oops, sorry! Fixed now)â Kusalananda
Jun 27 at 6:36
@Apricot You did not run my command. My command has
-v OFS='|'
on the command line. (or should have had, oops, sorry! Fixed now)â Kusalananda
Jun 27 at 6:36
Thank you for your inputs, when I changed the
awk -F '| +'
to awk -F '|'
it worked perfectly....accepting your answer. Thank you again.â Apricot
Jun 27 at 6:37
Thank you for your inputs, when I changed the
awk -F '| +'
to awk -F '|'
it worked perfectly....accepting your answer. Thank you again.â Apricot
Jun 27 at 6:37
add a comment |Â
up vote
2
down vote
The issue is with input field separator.
Since you want to use multiple field separator(which is not required), the number of fields in each row is different as shown below.
$ awk -F '[| +]' 'print NF' test.csv
17
26
23
21
if you use only |
as IFS, then your code will work. Since each row has 17 fields as shown below.
awk -F "|" 'print NF' test.csv
17
17
17
17
Solution 1: with multiple IFS.
awk -F '[| +]' 'gsub("Market1","MarketPrime",$(NF-1)); print' OFS="|" test.csv
SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual|package|199|May17|pack|Basic|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|Package|199|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual|Pack|Premium|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|English|Movies|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active
Solution 2: with fixed field 16
awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" test.csv
SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active
add a comment |Â
up vote
2
down vote
The issue is with input field separator.
Since you want to use multiple field separator(which is not required), the number of fields in each row is different as shown below.
$ awk -F '[| +]' 'print NF' test.csv
17
26
23
21
if you use only |
as IFS, then your code will work. Since each row has 17 fields as shown below.
awk -F "|" 'print NF' test.csv
17
17
17
17
Solution 1: with multiple IFS.
awk -F '[| +]' 'gsub("Market1","MarketPrime",$(NF-1)); print' OFS="|" test.csv
SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual|package|199|May17|pack|Basic|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|Package|199|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual|Pack|Premium|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|English|Movies|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active
Solution 2: with fixed field 16
awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" test.csv
SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active
add a comment |Â
up vote
2
down vote
up vote
2
down vote
The issue is with input field separator.
Since you want to use multiple field separator(which is not required), the number of fields in each row is different as shown below.
$ awk -F '[| +]' 'print NF' test.csv
17
26
23
21
if you use only |
as IFS, then your code will work. Since each row has 17 fields as shown below.
awk -F "|" 'print NF' test.csv
17
17
17
17
Solution 1: with multiple IFS.
awk -F '[| +]' 'gsub("Market1","MarketPrime",$(NF-1)); print' OFS="|" test.csv
SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual|package|199|May17|pack|Basic|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|Package|199|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual|Pack|Premium|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|English|Movies|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active
Solution 2: with fixed field 16
awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" test.csv
SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active
The issue is with input field separator.
Since you want to use multiple field separator(which is not required), the number of fields in each row is different as shown below.
$ awk -F '[| +]' 'print NF' test.csv
17
26
23
21
if you use only |
as IFS, then your code will work. Since each row has 17 fields as shown below.
awk -F "|" 'print NF' test.csv
17
17
17
17
Solution 1: with multiple IFS.
awk -F '[| +]' 'gsub("Market1","MarketPrime",$(NF-1)); print' OFS="|" test.csv
SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual|package|199|May17|pack|Basic|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|Package|199|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual|Pack|Premium|Package|Annual|08/28/2017||027445053518|Primary|Pace|-|31|000223871682|Yes|AMP|English|Movies|pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active
Solution 2: with fixed field 16
awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" test.csv
SUBSCRIBER_ID|ACCOUNT_CATEGORY|ACCOUNT_ACTIVATION_DATE|PACKAGE_NAME|PACKAGE_TYPE|DURATION|ACTIVE_DATE|INACTIVE_DATE|STB_NO|PRIMARY_SECONDARY|MODEL_TYPE|VC_NO|MULTIROOM|STB_TYPE|IPKG|SERVICE_STATE|CURRENT_STATUS
1001098068|ResidentialRegular|01/20/2007|Annual package 199 May17 pack|Basic Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|Package 199 pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual Pack|Premium Package|Annual|08/28/2017||027445053518|Primary|Pace - 31|000223871682|Yes|AMP|English Movies pack|MarketPrime|Active
1001098068|ResidentialRegular|01/20/2007|Annual SingleUnit Jun17 Pack|Secondary Pack|Annual|08/28/2017||032089364015|Secondary|Kaon|000017213968|Yes|AMP|SingleUnit|Market2|Active
edited Jun 27 at 8:36
answered Jun 27 at 6:29
SivaPrasath
3,88611737
3,88611737
add a comment |Â
add a comment |Â
up vote
0
down vote
For the sake of clarity for others who may face similar issues:
Both these answers worked for this scenario:
Kusalananda's answer:
awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv
My revised answer based on Kusalananda's answer:
awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv
1
Take care that your input and output filenames are the same.
â Rakesh Sharma
Jun 27 at 7:21
As I pointed out in my answer,awk ... file >file
will truncatefile
, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactlyMarket1
.
â Kusalananda
Jun 27 at 8:39
add a comment |Â
up vote
0
down vote
For the sake of clarity for others who may face similar issues:
Both these answers worked for this scenario:
Kusalananda's answer:
awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv
My revised answer based on Kusalananda's answer:
awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv
1
Take care that your input and output filenames are the same.
â Rakesh Sharma
Jun 27 at 7:21
As I pointed out in my answer,awk ... file >file
will truncatefile
, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactlyMarket1
.
â Kusalananda
Jun 27 at 8:39
add a comment |Â
up vote
0
down vote
up vote
0
down vote
For the sake of clarity for others who may face similar issues:
Both these answers worked for this scenario:
Kusalananda's answer:
awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv
My revised answer based on Kusalananda's answer:
awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv
For the sake of clarity for others who may face similar issues:
Both these answers worked for this scenario:
Kusalananda's answer:
awk -F '|' -v OFS='|' '$16 == "Market1" $16 = "MarketPrime" 1' file.csv >new-file.csv
My revised answer based on Kusalananda's answer:
awk -F '|' 'gsub("Market1","MarketPrime",$16); print' OFS="|" marketinfo_2018-06-26.csv > marketinfo_2018-06-26.csv
answered Jun 27 at 6:43
Apricot
2287
2287
1
Take care that your input and output filenames are the same.
â Rakesh Sharma
Jun 27 at 7:21
As I pointed out in my answer,awk ... file >file
will truncatefile
, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactlyMarket1
.
â Kusalananda
Jun 27 at 8:39
add a comment |Â
1
Take care that your input and output filenames are the same.
â Rakesh Sharma
Jun 27 at 7:21
As I pointed out in my answer,awk ... file >file
will truncatefile
, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactlyMarket1
.
â Kusalananda
Jun 27 at 8:39
1
1
Take care that your input and output filenames are the same.
â Rakesh Sharma
Jun 27 at 7:21
Take care that your input and output filenames are the same.
â Rakesh Sharma
Jun 27 at 7:21
As I pointed out in my answer,
awk ... file >file
will truncate file
, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactly Market1
.â Kusalananda
Jun 27 at 8:39
As I pointed out in my answer,
awk ... file >file
will truncate file
, so that second command is not correct in that sense. Also, the regular expression may potentially also replace strings other than exactly Market1
.â Kusalananda
Jun 27 at 8:39
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f452148%2fawk-search-and-replace-string-in-a-specific-column-of-csv-file%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password