Extract a number in each line and add another number to it

I have a file whose content is

(bookmarks
 ("Cover"
 "#01.djvu" )
 ("Title page"
 "#all_24223_to_00243.cpc0002.djvu" )
 ("Preface"
 "#all_24223_to_00243.cpc0004.djvu" )
 ...

I want to change its content to be

(bookmarks
 ("Cover"
 "#2" )
 ("Title page"
 "#3" )
 ("Preface"
 "#5" )
...

by keeping the number just before .djvu, removing leading zeros, and adding one to it. I wonder how you would do that using awk?

Thanks.

asked Mar 16 at 1:58

Tim


Thanks. How does perl handle testing whether a line matches the pattern (if no, print the line; if yes, replace, add and print)?
– Tim
Mar 16 at 2:15

Thanks. I still don't understand how -p makes it print the lines without a match unchanged, and print the lines with a match after replacement. That is, why is there no explicit test of whether each line matches? In awk, I remember such a test is needed.
– Tim
Mar 16 at 2:40

Additionally, if I store a number in a shell variable and want to add it to the number extracted from each line, how can I pass the shell variable into the perl command so that it is added to $1?
– Tim
Mar 16 at 3:05


3 Answers
That's more a job for perl:

perl -pe 's/"#\K.*?(\d+)\.djvu(?=")/$1+1/ge' <file

With variable:

INCR=1 perl -pe 's/"#\K.*?(\d+)\.djvu(?=")/$1+$ENV{INCR}/ge' <file

Or:

perl -spe 's/"#\K.*?(\d+)\.djvu(?=")/$1+$incr/ge' -- -incr=1 <file

edited Mar 16 at 9:24

answered Mar 16 at 9:14

Stéphane Chazelas

Could also use positive lookahead to avoid adding the quote back in the replacement section.
– Sundeep
Mar 16 at 9:17

@Sundeep, true, that makes it a bit shorter and more pleasant to the eye. I've added it in. Thanks.
– Stéphane Chazelas
Mar 16 at 9:20

Here's a GNU awk solution:

awk '/^ *\(/{print}!/^ *\(/{split($1,aa,"[0-9]+",bb);printf "\"#%s\" )\n", bb[length(bb)]+1}'

or the identical, but spread over a few lines for readability:

awk '/^ *\(/ { print }
 !/^ *\(/ { split( $1, aa, "[0-9]+", bb )
 printf "\"#%s\" )\n", bb[length(bb)]+1 }'

/^ *\(/ and !/^ *\(/ are two pattern rules: the first covers lines beginning with optional spaces and an open parenthesis, the second covers lines that don't.

split( $1, aa, "[0-9]+", bb ) For lines that don't, split the first field into two arrays. aa holds the pieces of the field delimited by the regex "[0-9]+", and bb holds the delimiters that matched the regex (this fourth argument to split is a GNU awk extension). The final element of bb is what interests you.

printf "\"#%s\" )\n" formats the output line, expecting a single argument...

bb[length(bb)]+1 is one plus the value of the last element of bb.
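For awk implementations without the four-argument split, a similar result can probably be had with the POSIX match() function. The following is only a sketch under that assumption: the function name increment_bookmarks and the incr variable are made up for illustration. It keeps the leading whitespace and leaves lines without a .djvu name untouched.

```shell
# Sketch: replace everything between "#" and the closing quote with
# (digits just before .djvu) + incr, using only POSIX awk features.
increment_bookmarks() {
    # $1: amount to add (illustrative parameter, defaults to 1)
    awk -v incr="${1:-1}" '
        match($0, /[0-9]+\.djvu"/) {
            # RLENGTH covers the digits plus the 6 characters of .djvu"
            num  = substr($0, RSTART, RLENGTH - 6)
            hash = index($0, "#")
            # keep the text up to "#", then the new number, then the closing quote onward
            $0 = substr($0, 1, hash) (num + incr) substr($0, RSTART + RLENGTH - 1)
        }
        { print }
    '
}
```

It reads standard input, so increment_bookmarks <file adds one, and increment_bookmarks "$n" <file adds the value of a shell variable n.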

edited Mar 16 at 13:13

Stéphane Chazelas

answered Mar 16 at 8:33

user1404316

gawk '{
 sub(/#.*\.djvu/, "#" $1 + 1 ".djvu")
 print
}' FPAT='[0-9]+\.djvu' input.txt

The idea is as follows:

Extract the number together with the .djvu extension from the djvu filename, using the [0-9]+\.djvu pattern (FPAT). Example: if the original filename is #all_24223_to_00243.cpc0002.djvu, the extracted part is 0002.djvu.

Replace the old djvu filename, matched by #.*\.djvu, with the extracted part increased by 1. Example: take the whole line $0 and replace #all_24223_to_00243.cpc0002.djvu inside it with 0002.djvu + 1 (this expression evaluates to the plain number 3 because of how string-to-number conversion works in gawk), then add the # sign and the .djvu extension around it. Result: #3.djvu.

This solution works only for lines with a single djvu filename, as in your sample input.
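The string-to-number conversion mentioned above can be checked in isolation. In this small sketch (the function name leading_number_plus_one is just an illustrative helper), awk takes the leading digits of the string as its numeric value and ignores the trailing .djvu:

```shell
# Sketch: a string such as 0002.djvu used in arithmetic is converted
# from its leading numeric prefix, so adding 1 gives a plain number.
leading_number_plus_one() {
    # $1: a string that starts with digits, e.g. 0002.djvu
    awk -v s="$1" 'BEGIN { print s + 1 }'
}
```

Calling leading_number_plus_one 0002.djvu prints 3, which is why the sub() call above ends up with #3.djvu rather than #0003.djvu.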

Input

(bookmarks
 ("Cover"
 "#01.djvu" )
 ("Title page"
 "#all_24223_to_00243.cpc0002.djvu" )
 ("Preface"
 "#all_24223_to_00243.cpc0004.djvu" )

Output

(bookmarks
 ("Cover"
 "#2.djvu" )
 ("Title page"
 "#3.djvu" )
 ("Preface"
 "#5.djvu" )

edited Mar 17 at 10:34

answered Mar 17 at 10:05

MiniMax
