How could I easily expand a list of numbers with hyphens replacing repeated parts?
ATTENTION! I have changed the RegEx and sample data so some answers could be wrong! I apologize if doing this is bad practice.
I used grep (an online tool) with the -o flag to extract a list of data in which repeated parts are sometimes substituted with hyphens. The numbers are always 8 digits. There may be more 8-digit numbers following these.
RegEx used was: [0-9]{8}(, -[0-9]*)*(, [0-9]{8})*
Sample data below:
33520470
33520850, -60, -70, -80, -90, 33630077
25453810
13815206, -07, -08, 60682651, 60709994
13340820
61040146, -55
60819060, -79
60819088
And my desired output would be:
33520470
33520850
33520860
33520870
33520880
33520890
33630077
25453810
13815206
13815207
13815208
60682651
60709994
13340820
61040146
61040155
60819060
60819079
60819088
Could this be done with grep? If not, could you suggest any unix or other tools to achieve this result? I was thinking sed or awk.
EDIT: This has been solved. I will include the correct command here just for convenience to skip having to dig through the comments:
awk -F ', ' '{ print $1; for (a = 2; a <= NF; a++) { if (length($a) <= 7) { printf("%s%s\n", substr($1, 1, length($1) - (length($a) - 1)), substr($a, 2)) } else { print $a } } }'
awk sed grep
You can't use grep for that. awk will work, but you will need a small program to do it.
– RalfFriedl
Feb 4 at 7:06
Are the hyphened parts always exactly two digits?
– Sparhawk
Feb 4 at 7:19
There may be up to 7 digits; the hyphen stands in for the repeating part of the number. It would be safer to cover all possible occurrences.
– Sigmund Freud
Feb 4 at 7:28
I am not sure about the version, I am using the website online-utility.org/text/grep.jsp
– Sigmund Freud
Feb 4 at 7:40
This can still be done with grep, but with the new conditions it is definitely better handled by awk and sed.
– WAF
Feb 4 at 9:50
Question: asked Feb 4 at 6:59 by Sigmund Freud, edited Feb 18 at 14:13
2 Answers
I tried it with awk:
cat file | awk -F ', ' '{ print $1; for (a = 2; a <= NF; a++) printf("%s%s\n", substr($1, 1, length($1) - (length($a) - 1)), substr($a, 2)) }'
Output:
33520470
33520850
33520860
33520870
33520880
33520890
25453810
13340820
61040146
61040155
60819060
60819079
60819088
Edit:
Code to get correct result:
cat file | awk -F ', ' '{ print $1; for (a = 2; a <= NF; a++) { if (length($a) <= 3) { printf("%s%s\n", substr($1, 1, length($1) - (length($a) - 1)), substr($a, 2)) } else { print $a } } }'
Result:
33520470
33520850
33520860
33520870
33520880
33520890
33630077
25453810
13815206
13815207
13815208
60682651
60709994
13340820
61040146
61040155
60819060
60819079
60819088
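A variant sketch, not taken from the answer: instead of comparing the field length against a fixed threshold, the script could test for the leading hyphen, which would cover suffixes of any length (the asker mentions up to 7 digits). The function name `expand_suffixes` and the inline sample input are made up for illustration; as in the command above, every suffix is applied to the first field of the line.

```shell
# Hypothetical helper: reads ", "-separated lines on stdin and expands
# hyphen-abbreviated fields against the first field of each line.
expand_suffixes() {
    awk -F ', ' '{
        print $1
        for (a = 2; a <= NF; a++) {
            if ($a ~ /^-/) {
                # "-60" means: replace the last 2 digits of $1 with "60"
                suffix = substr($a, 2)
                print substr($1, 1, length($1) - length(suffix)) suffix
            } else {
                # no leading hyphen: the field is already a full number
                print $a
            }
        }
    }'
}

# Usage with two lines taken from the sample data
printf '%s\n' '33520850, -60, -70, 33630077' '61040146, -55' | expand_suffixes
```

With this test, a 7-digit suffix and a full 8-digit number are told apart by the hyphen rather than by length.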
This works well. Does awk support RegEx? Could it be tweaked further to include what grep does in the first example? Currently I have to use grep and then your solution on grep's output.
– Sigmund Freud
Feb 4 at 7:46
@SigmundFreud Yes, awk supports regex: tutorialspoint.com/awk/awk_regular_expressions.htm but I think that using multiple commands to get the result is not a bad solution.
– Matej
Feb 4 at 7:49
I discovered that I had an error in my RegEx. The new one is [0-9]{8}(, -[0-9]*)*(, [0-9]{8})*
and returns rows with more 8-digit numbers such as 13815206, -07, -08, 60682651, 60709994
which does not work with that command. The solution worked perfectly with the previously provided data set, however!
– Sigmund Freud
Feb 4 at 8:58
@SigmundFreud The difference is in these numbers? -07 -08
– Matej
Feb 4 at 9:03
You did it! It works perfectly now.
– Sigmund Freud
Feb 4 at 11:42
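On the question of folding the grep step into awk: a rough sketch, assuming a POSIX awk, could use the built-in `match()` function (which sets `RSTART` and `RLENGTH`) to pull each "full number plus suffixes" group out of arbitrary text and expand it in the same pass. The function name `extract_and_expand` and the sample input are hypothetical. The eight-digit pattern is spelled out digit by digit because some older awk implementations do not support `{8}` interval expressions.

```shell
# Hypothetical helper: find groups like "13815206, -07, -08" anywhere in
# the input and print every number in expanded form.
extract_and_expand() {
    awk '
    BEGIN {
        d = "[0-9]"
        full = d d d d d d d d          # eight digits, written out for portability
        re = full "(, -[0-9]+)*"        # full number plus optional hyphen suffixes
    }
    {
        line = $0
        while (match(line, re)) {
            n = split(substr(line, RSTART, RLENGTH), part, /, /)
            print part[1]
            for (i = 2; i <= n; i++)
                print substr(part[1], 1, length(part[1]) - length(part[i]) + 1) substr(part[i], 2)
            # continue searching in the text after this match
            line = substr(line, RSTART + RLENGTH)
        }
    }'
}

# Usage: surrounding text is ignored, later full numbers match on their own
printf '%s\n' 'id 13815206, -07, -08, 60682651 end' | extract_and_expand
```

Like `grep -o`, this would also pick up the first eight digits of a longer digit run, so it is only a starting point if the input can contain such runs.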
Updated with a pre-processing step to handle the modified input.
The rest of this answer assumes that the data has been pre-processed with
grep -oE '[0-9]{8}(, -[0-9]+)*'
I.e., the full solution would require
grep -oE ... file | awk ...
BEGIN { FS = ", *" }

{
    print $1
    for (i = 2; i <= NF; ++i)
        print substr($1, 1, length($1) - length($i) + 1) substr($i, 2)
}
This awk script reads a line and then prints the first comma-delimited field. It then loops over the remaining fields and outputs the first field with enough characters cut off at the end to insert the characters after the - in the other fields.
The code allows for "suffixes" of variable length.
Testing:
$ awk -f script.awk file
33520470
33520850
33520860
33520870
33520880
33520890
25453810
13340820
61040146
61040155
60819060
60819079
60819088
Another example:
$ cat file
1111
2222,-3,-4, -33,-44, -333,-444
$ awk -f script.awk file
1111
2222
2223
2224
2233
2244
2333
2444
As a "one-liner":
awk -F ', *' '{ print $1; for (i=2; i<=NF; ++i) print substr($1,1,length($1)-length($i)+1) substr($i,2) }' file
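Putting the pre-processing step and the script together, the full pipeline might look like the sketch below. The wrapper name `expand_pipeline` and the sample input are made up for illustration; `-o` is an extension found in GNU and BSD grep rather than a POSIX option. Because `grep -o` prints each match on its own line, a full 8-digit number that follows a group of suffixes arrives at awk as a separate one-field line.

```shell
# Hypothetical wrapper: grep extracts the groups, awk expands them.
expand_pipeline() {
    grep -oE '[0-9]{8}(, -[0-9]+)*' |
        awk -F ', *' '{
            print $1
            for (i = 2; i <= NF; ++i)
                print substr($1, 1, length($1) - length($i) + 1) substr($i, 2)
        }'
}

# Usage on two lines with some surrounding text
printf '%s\n' 'x 33520850, -60, -70, 33630077 y' 'z 61040146, -55' | expand_pipeline
```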
First answer: answered Feb 4 at 7:39 by Matej, edited Feb 4 at 11:29
Second answer: answered Feb 4 at 8:01 by Kusalananda, edited Feb 4 at 9:24