How does awk '!a[$0]++' work?

This one-liner removes duplicate lines from text input without pre-sorting.
For example:
$ cat >f
q
w
e
w
r
$ awk '!a[$0]++' <f
q
w
e
r
$
The original code I found on the internet read:
awk '!_[$0]++'
This was even more perplexing to me, as I took _ to have a special meaning in awk, like in Perl, but it turned out to be just the name of an array.
Now, I understand the logic behind the one-liner:
each input line is used as a key in a hash array; thus, upon completion, the hash contains unique lines in the order of arrival.
What I would like to learn is how exactly this notation is interpreted by awk, for example what the bang sign (!) and the other elements of this code snippet mean.
How does it work?
shell-script awk scripting sort uniq
asked Oct 6 '14 at 20:56
Alexander Shcheblikin
title is misleading, it should be $0 (Zero), not $o (o).
– Archemar
Oct 6 '14 at 21:06
1
As it's a hash, it's unordered, so "in the order of arrival" isn't actually correct.
– Kevin
Oct 7 '14 at 6:10
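A small sketch of the ordering point raised in the comment above, assuming a POSIX-style shell and awk. The output order of the one-liner comes from awk printing lines as it reads them, not from the array itself; walking the array afterwards with for (k in a) yields the keys in an unspecified order. The variable names dedup and keys are only illustrative.

```shell
# Sample input from the question, fed through a pipe instead of the file f.
dedup=$(printf '%s\n' q w e w r | awk '!a[$0]++')
printf '%s\n' "$dedup"

# The array holds the same keys, but for-in traversal order is
# implementation-defined, so the keys are sorted here before use.
keys=$(printf '%s\n' q w e w r | awk '{ a[$0] } END { for (k in a) print k }' | sort)
printf '%s\n' "$keys"
```

The first command keeps first occurrences in input order; the second only shows that the set of keys is the same once sorted.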
2 Answers
Let's see,
!a[$0]++
First,
a[$0]
we look at the value of a[$0] (array a, with the whole input line ($0) as the key).
If it does not exist (! is negation, so the test evaluates to true),
!a[$0]
we print the input line $0 (the default action).
We also add one (++) to a[$0], so the next time the same line appears, !a[$0] evaluates to false.
Nice find! You should have a look at code golf!
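The steps above can be written out long-hand. This is only a sketch of an equivalent program, not how awk itself implements the one-liner; the array name seen and the shell variable expanded are made up for illustration.

```shell
# Long-hand equivalent of awk '!a[$0]++' on the sample input from the question.
expanded=$(printf '%s\n' q w e w r | awk '
{
    if (seen[$0] == 0) {       # an unset element compares equal to 0
        print $0               # the default action, spelled out
    }
    seen[$0] = seen[$0] + 1    # the effect of ++ on the element
}')
printf '%s\n' "$expanded"
```

Here the test and the increment are separate statements, which makes it easier to see that the test uses the count from before the current line was recorded.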
edited Oct 8 '14 at 8:01
Gilles
answered Oct 6 '14 at 21:03
Archemar
1
So the essence is this: the expression in the single quotes is used by awk as a test for each input line; every time the test succeeds awk executes the action in curly braces, which, when omitted is print. Thanks!
– Alexander Shcheblikin
Oct 7 '14 at 1:21
3
@Archemar: This answer is wrong, see mine.
– cuonglm
Oct 7 '14 at 3:29
@AlexanderShcheblikin in awk, the default action is print $0. This means that anything evaluated as true will execute this as default. So for example awk '1' file prints all the lines, awk '$1' file prints all those lines whose first field is neither empty nor 0, etc.
– fedorqui
Oct 7 '14 at 10:31
6
@Gnouc I don't see any serious error in this answer. If that is what you're referring to, the incrementation is indeed applied after the value of the expression is calculated. It's true that the incrementation happens before the printing, but that's a minor imprecision which doesn't affect the basic explanation.
– Gilles
Oct 7 '14 at 17:23
@AlexanderShcheblikin: See my updated answer.
– cuonglm
Oct 8 '14 at 2:24
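The default-action behavior described in the comments above can be checked with a few lines of made-up input. This is a sketch assuming a POSIX-style awk; the input lines and the variable names all and truthy are illustrative only.

```shell
# Four input lines: a non-numeric first field, a first field of 0,
# an empty line, and a non-zero numeric first field.
all=$(printf '%s\n' 'a 1' '0 x' '' '5 y' | awk '1')
printf '%s\n' "$all"

# With $1 as the pattern, only lines whose first field is non-empty
# and not numerically zero are printed.
truthy=$(printf '%s\n' 'a 1' '0 x' '' '5 y' | awk '$1')
printf '%s\n' "$truthy"
```

The pattern 1 is always true, so every line is printed; the pattern $1 skips the line starting with 0 and the empty line.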
Here is the processing:
- a[$0]: look at the value of key $0 in associative array a. If it does not exist, create it.
- a[$0]++: increment the value of a[$0] and return the old value as the value of the expression. If a[$0] does not exist, return 0 and increment a[$0] to 1 (the ++ operator returns a numeric value).
- !a[$0]++: negate the value of the expression. If a[$0]++ returns 0, the whole expression evaluates to true, making awk perform the default action, print $0. Otherwise, the whole expression evaluates to false and awk does nothing.
References:
- Expression in awk
- gawk - Increment and Decrement Operators
With gawk, we can use dgawk (or gawk --debug in newer versions) to debug a gawk script. First, create a gawk script named test.awk:
BEGIN {
    a = 0;
    !a++;
}
Then run:
dgawk -f test.awk
or:
gawk --debug -f test.awk
In the debugger console:
$ dgawk -f test.awk
dgawk> trace on
dgawk> watch a
Watchpoint 1: a
dgawk> run
Starting program:
[ 1:0x7fe59154cfe0] Op_rule : [in_rule = BEGIN] [source_file = test.awk]
[ 2:0x7fe59154bf80] Op_push_i : 0 [PERM|NUMCUR|NUMBER]
[ 2:0x7fe59154bf20] Op_store_var : a [do_reference = FALSE]
[ 3:0x7fe59154bf60] Op_push_lhs : a [do_reference = TRUE]
Stopping in BEGIN ...
Watchpoint 1: a
Old value: untyped variable
New value: 0
main() at `test.awk':3
3 !a++;
dgawk> step
[ 3:0x7fe59154bfc0] Op_postincrement :
[ 3:0x7fe59154bf40] Op_not :
Watchpoint 1: a
Old value: 0
New value: 1
main() at `test.awk':3
3 !a++;
dgawk>
As you can see, Op_postincrement was executed before Op_not.
You can also use si or stepi instead of s or step to see this more clearly:
dgawk> si
[ 3:0x7ff061ac1fc0] Op_postincrement :
3 !a++;
dgawk> si
[ 3:0x7ff061ac1f40] Op_not :
Watchpoint 1: a
Old value: 0
New value: 1
main() at `test.awk':3
3 !a++;
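The same behavior can be observed without the debugger. A minimal sketch, assuming any POSIX-style awk: the result of !a++ is stored in a helper variable r (a name introduced here only for illustration) and printed next to the updated value of a.

```shell
# r receives the value of !a++; a is printed after the statement has run.
result=$(awk 'BEGIN { a = 0; r = !a++; print r, a; r = !a++; print r, a }')
printf '%s\n' "$result"
```

Run as written, this should print 1 1 and then 0 2: the negation is applied to the old value of a (0, then 1), while a itself ends up incremented each time.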
edited Jun 8 '15 at 7:47
answered Oct 7 '14 at 2:02
cuonglm
3
@Archemar: Your answer indicates that ! is applied before ++.
– cuonglm
Oct 7 '14 at 5:37
6
This answer is wrong. The incrementation happens after the result of the ! operator is calculated. You are confusing operator precedence (!a[$0]++ is parsed like !(a[$0]++)) with order of evaluation (the assignment of the new value of a[$0] happens after the value of the expression has been calculated).
– Gilles
Oct 7 '14 at 17:21
5
@Gnouc It says right in the passage you quoted, and if it worked the way you described, this code wouldn't have the desired effect. First the value !x is calculated, where x is the old value of a[$0]. Then a[$0] is set to 1+x.
– Gilles
Oct 7 '14 at 17:26
7
I believe that your analysis of what awk does is correct. Sorry if I implied otherwise yesterday. However, your critique of Archemar's answer is wrong. Archemar does not misunderstand precedence, you do, you're confusing precedence with order of evaluation (see my previous comment). If you remove any mention of Archemar's answer in yours, your answer should be correct. As it is, it is focused on proving Archemar wrong, and this is not the case.
– Gilles
Oct 8 '14 at 7:59
4
well, at least now I know about awk's debugger ...
– Archemar
Oct 8 '14 at 8:51
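The distinction drawn in the comments above between how the expression is grouped and which value ! receives can be sketched as follows, assuming a POSIX-style awk. The pre-increment variant is not from the original thread; it is included only as a contrast, and the variable names are illustrative.

```shell
input=$(printf '%s\n' q w e w r)

# Explicit parentheses: the same grouping awk already applies to !a[$0]++.
grouped=$(printf '%s\n' "$input" | awk '!(a[$0]++)')

# Pre-increment: ! sees the new value, which is never 0, so no line prints.
prefix=$(printf '%s\n' "$input" | awk '!++a[$0]')

printf 'grouped:\n%s\n' "$grouped"
printf 'prefix: [%s]\n' "$prefix"
```

With post-increment, ! is applied to the old count, so first occurrences pass the test; with pre-increment, it is applied to the new count, and the filter rejects every line.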