Extract a number in each line and add another number to it
I have a file whose content is

    (bookmarks
    ("Cover"
    "#01.djvu" )
    ("Title page"
    "#all_24223_to_00243.cpc0002.djvu" )
    ("Preface"
    "#all_24223_to_00243.cpc0004.djvu" )
    ...

I want to change its content to

    (bookmarks
    ("Cover"
    "#2" )
    ("Title page"
    "#3" )
    ("Preface"
    "#5" )
    ...

by keeping the number just before .djvu, removing leading zeros, and adding one to it. I wonder how you would do that using awk.

Thanks.

Tagged: awk
asked Mar 16 at 1:58 by Tim
Comments:

Thanks. How does perl handle the testing for whether a line matches the pattern (if no, print the line, and if yes, replace, add and print)?
- Tim, Mar 16 at 2:15

Thanks. I still don't know how -p makes it print the lines without a match, and print the lines with a match after replacement. That is, why is there no explicit test of whether each line has a match? In awk, I remember such testing is needed.
- Tim, Mar 16 at 2:40

Additionally, if I store a number in a shell variable and want to add it to the extracted number of each line, how can I pass the shell variable into the perl command to be added to $1?
- Tim, Mar 16 at 3:05
3 Answers
That's more a job for perl:

    perl -pe 's/"#\K.*?(\d+)\.djvu(?=")/$1+1/ge' <file

With variable:

    INCR=1 perl -pe 's/"#\K.*?(\d+)\.djvu(?=")/$1+$ENV{INCR}/ge' <file

Or:

    perl -spe 's/"#\K.*?(\d+)\.djvu(?=")/$1+$incr/ge' -- -incr=1 <file
answered Mar 16 at 9:14 by Stéphane Chazelas (edited Mar 16 at 9:24)
Comments:

Could also use positive lookahead to avoid adding the quote back in the replacement section.
- Sundeep, Mar 16 at 9:17

@Sundeep, true, that makes it a bit shorter and more pleasant to the eye. I've added it in. Thanks.
- Stéphane Chazelas, Mar 16 at 9:20
Here's a GNU awk solution:

    awk '/^ *\(/{print}!/^ *\(/{split($1,aa,"[0-9]+",bb);printf "\"#%s\" )\n", bb[length(bb)]+1}'

or the identical, but spread over a few lines for readability:

    awk '/^ *\(/ { print }
    !/^ *\(/ { split( $1, aa, "[0-9]+", bb )
               printf "\"#%s\" )\n", bb[length(bb)]+1 }'

- /^ *\(/ and !/^ *\(/ are two patterns, covering lines beginning with optional spaces and an open parenthesis ... and lines that don't.
- split( $1, aa, "[0-9]+", bb ): for lines that don't, split the first field into two arrays. aa is the field content delimited by the regex "[0-9]+", and bb holds the delimiters that matched the regex. The final element of bb is what interests you.
- printf "\"#%s\" )\n" formats the output line, waiting for a single variable...
- bb[length(bb)]+1: one plus the value of the last element of bb.
answered Mar 16 at 8:33 by user1404316 (edited Mar 16 at 13:13 by Stéphane Chazelas)
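The four-argument split() used above is a gawk extension. For other awks, the same idea of picking the digits just before .djvu can probably be sketched with the POSIX match() and substr() functions instead. This is an illustrative variant, not the answer's code; it assumes the digits sit immediately before .djvu, and sample_input, prog and result are hypothetical names:

```shell
# Hypothetical stand-in for the bookmarks file from the question.
sample_input() {
  printf '%s\n' '(bookmarks' '("Cover"' '"#01.djvu" )' \
    '("Title page"' '"#all_24223_to_00243.cpc0002.djvu" )'
}

prog='
  # Lines with digits right before ".djvu": rebuild them as "#<n+1>" ).
  match($0, /[0-9]+\.djvu/) {
    n = substr($0, RSTART, RLENGTH) + 0   # "0002.djvu" converts to 2
    printf "\"#%d\" )\n", n + 1
    next
  }
  # Everything else passes through unchanged.
  { print }
'
result=$(sample_input | awk "$prog")
printf '%s\n' "$result"
```

Unlike the answer above, this version decides per line by whether the match succeeds rather than by the leading parenthesis, so lines without a .djvu name are printed as they are.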
    gawk '{
        sub(/#.*\.djvu/, "#" $1 + 1 ".djvu")
        print
    }' FPAT='[0-9]+[.]djvu' input.txt

The idea is the following:

- extract the .djvu extension together with the digits preceding it from the djvu filename, using the [0-9]+[.]djvu pattern (FPAT). Example: the original filename is #all_24223_to_00243.cpc0002.djvu, and the extracted part is 0002.djvu.
- substitute the previous djvu filename (matched by #.*\.djvu) with the extracted part, after increasing it by 1. Example: take the whole line ($0) and replace #all_24223_to_00243.cpc0002.djvu inside it with 0002.djvu + 1 (this expression evaluates to the plain number 3, because of how string-to-number conversion works in gawk). Add the # sign and the .djvu extension to it. Result: #3.djvu.

This solution works only for lines with one djvu filename, as in your sample input.

Input

    (bookmarks
    ("Cover"
    "#01.djvu" )
    ("Title page"
    "#all_24223_to_00243.cpc0002.djvu" )
    ("Preface"
    "#all_24223_to_00243.cpc0004.djvu" )

Output

    (bookmarks
    ("Cover"
    "#2.djvu" )
    ("Title page"
    "#3.djvu" )
    ("Preface"
    "#5.djvu" )
answered Mar 17 at 10:05 by MiniMax (edited Mar 17 at 10:34)
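The string-to-number conversion that the explanation above relies on can be checked in isolation. A small sketch, with result as a hypothetical variable name; it should behave the same in other awk implementations as well, since in a numeric context awk uses the leading numeric part of a string and ignores the rest:

```shell
# "0002.djvu" is used as the number 2 here, so the sum is 3.
result=$(awk 'BEGIN { print "0002.djvu" + 1 }')
printf '%s\n' "$result"
```

This is also why the leading zeros disappear without any extra step: the value is printed as a number, not as the original string.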