What are the differences between a+i and &a[i] for pointer arithmetic in C++?

Supposing we have:
char* a;
int i;
Many introductions to C++ (like this one) suggest that the rvalues a+i and &a[i] are interchangeable. I naively believed this for several decades, until I recently stumbled upon the following text (here) quoted from [dcl.ref]:
in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the "object" obtained by dereferencing a null pointer, which causes undefined behavior.
In other words, "binding" a reference object to a null-dereference causes undefined behavior. Based on the context of the above text, one infers that merely evaluating &a[i] (within the offsetof macro) is considered "binding" a reference. Furthermore, there seems to be a consensus that &a[i] causes undefined behavior in the case where a=null and i=0. This behavior is different from a+i (at least in C++, in the a=null, i=0 case).
This leads to at least 2 questions about the differences between a+i and &a[i]:
First, what is the underlying semantic difference between a+i and &a[i] that causes this difference in behavior? Can it be explained in terms of general principles, not just "binding a reference to a null dereference object causes undefined behavior just because this is a very specific case that everybody knows"? Is it that &a[i] might generate a memory access to a[i]? Or the spec author wasn't happy with null dereferences that day? Or something else?
Second, besides the case where a=null and i=0, are there any other cases where a+i and &a[i] behave differently? (This could be covered by the first question, depending on the answer to it.)
c++ language-lawyer pointer-arithmetic
edited Mar 1 at 7:32
asked Mar 1 at 5:45
personal_cloud
according to the answers here, a+i is undefined if a=null, though your 4th link says it is defined if i=0, hmmm
– kmdreko
Mar 1 at 6:05
@kmdreko. That's a good point. I've tweaked the difference description to focus on the a=null, i=0 case for establishing that there is a difference between a+i and &a[i]... Again, leading one to wonder if there are any other differences between them.
– personal_cloud
Mar 1 at 6:14
2
The intent of the standard never was to disallow &*a when a is a null pointer. This is a subject of issue 232.
– n.m.
Mar 1 at 6:16
@n.m. Very interesting that the proposed resolution only specifies what happens in the case where a is null or "one past the last element of an array". These are pretty much the two most useful cases of empty lvalue! But why did they stop there and not just make a+i and &a[i] completely equivalent...?
– personal_cloud
Mar 1 at 6:27
I'm wondering if perhaps there is no difference between a+i and &a[i], but rather just some disagreements and/or misinterpretations of the spec, applied variously to one syntax or the other, without specific intent to apply only to one syntax or the other. Thereby making it appear that there is a difference between a+i and &a[i] when there is not really a difference? If one believes that C++ has restrictions on pointer arithmetic, then one would apply the same restrictions to both a+i and &a[i]? (but the restrictions themselves are clouded in controversy).
– personal_cloud
Mar 1 at 7:13
2 Answers
TL;DR: a+i and &a[i] are both well-formed and produce a null pointer when a is a null pointer and i is 0, according to (the intent of) the standard, and all compilers agree.
a+i is obviously well-formed per [expr.add]/4 of the latest draft standard:
When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.
- If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
- [...]
&a[i] is tricky. Per [expr.sub]/1, a[i] is equivalent to *(a+i), thus &a[i] is equivalent to &*(a+i). Now the standard is not quite clear about whether &*(a+i) is well-formed when a+i is a null pointer. But as @n.m. points out in a comment, the intent as recorded in CWG issue 232 is to permit this case.
Since core language UB is required to be caught in a constant expression ([expr.const]/(4.6)), we can test whether compilers think these two expressions are UB.
Here's the demo. If the compilers think the constant expression in static_assert is UB, or if they think the result is not true, then they must produce a diagnostic (error or warning) per the standard:
(Note that this uses single-parameter static_assert and constexpr lambdas, which are C++17 features, and default lambda arguments, which are allowed since C++14.)
static_assert(nullptr == [](char* a=nullptr, int i=0) {
    return a+i;
}());
static_assert(nullptr == [](char* a=nullptr, int i=0) {
    return &a[i];
}());
From https://godbolt.org/z/hhsV4I, it seems all compilers behave uniformly in this case, producing no diagnostics at all (which surprises me a bit).
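The same probing technique can be tried on in-range arithmetic. The following is only a minimal sketch (the function name distance_to_end is made up for illustration), assuming a compiler with C++14 constexpr support; it forms the one-past-the-end pointer of a local array inside a constant expression:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical probe: pointer arithmetic that stays within [0, n] of a
// local array with n == 4 elements.
constexpr std::ptrdiff_t distance_to_end() {
    char x[4] = {};
    char* end = x + 4;   // one past the last element: allowed by [expr.add]
    return end - x;      // both pointers refer to the same array
}

// Accepted: nothing undefined happens during constant evaluation.
static_assert(distance_to_end() == 4,
              "one-past-the-end arithmetic is usable in a constant expression");

// Changing `x + 4` to `x + 5` above should make the static_assert fail to
// compile, because that addition is outside the range [expr.add] defines.
```

The point of routing the arithmetic through a constexpr function is that the compiler has to evaluate it at compile time, so core-language UB turns into a diagnostic instead of silently "working".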
However, this is different from the offsetof case. The implementation posted in that question explicitly creates a reference (which is necessary to sidestep a user-defined operator&), and thus is subject to the requirements on references.
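To make the reference point concrete, here is a small sketch. It is not the offsetof implementation from that question; the type Weird and the helper real_address are invented for illustration. It shows why code that must sidestep a user-defined operator& ends up binding a reference:

```cpp
#include <cassert>
#include <memory>

// Hypothetical type whose unary & is overloaded and lies about the address.
struct Weird {
    int value = 0;
    Weird* operator&() { return nullptr; }
};

// std::addressof takes its argument by reference (T&), so calling it binds
// a reference to the object. The rules for references therefore apply,
// which is not the case for the built-in & applied to *(a+i).
inline Weird* real_address(Weird& w) {
    return std::addressof(w);
}
```

With a real object w, &w goes through the overload while real_address(w) yields the actual address. The sketch only ever binds the reference to a live object; binding it to the result of dereferencing a null pointer is exactly what [dcl.ref] rules out.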
Regarding "Since core language UB is required to be caught in a constant expression": [expr.const] says "... would have undefined behavior as specified in ...", which I always understood to mean that the UB has to be explicitly specified. And there is no wording explicitly saying that *p is UB when p == nullptr.
– Language Lawyer
Mar 1 at 13:10
1
@LanguageLawyer Well if you mean it's well-formed, then I agree. If you mean there's something that's "not explicitly specified as UB, but is still UB", then I guess you need to prove the existence of such a thing.
– cpplearner
Mar 1 at 13:17
1
Prove the existence of UB which is not explicitly marked as UB by the standard? It follows from the definition of UB: behavior for which this International Standard imposes no requirements. If the standard imposes no requirement on something, this is UB, even if the standard does not say this explicitly. The note after the definition says this: Undefined behavior may be expected when this International Standard omits any explicit definition of behavior.
– Language Lawyer
Mar 1 at 13:24
@Language Lawyer But expr.unary.op/1 seems to define a basic requirement that if you have a pointer p, then *p designates its memory location. Doesn't say it has to be a valid location. So by your logic, *p is not UB... I think @cpplearner's logic is correct here; it is other sections of the spec, like [dcl.ref] that specifically create the possibility of UB here.
– personal_cloud
Mar 1 at 17:48
@personal_cloud I haven't found anything about memory location in the definition of the indirection operator. It says that the resulting lvalue refers to an object or function to which the pointer expression points. And since p == nullptr does not point to any object or function and this case is not explicitly handled by the standard, it is considered to be undefined behavior. AFAIU since this UB is implicit, the compilers are not required to diagnose it in constant expressions. But I'm not 100% sure here.
– Language Lawyer
Mar 1 at 18:14
Section [expr.sub]/1 of the C++ standard reads:
The expression E1[E2] is identical (by definition) to *((E1)+(E2)).
This means that &a[i] is exactly the same as &*(a+i). So you would first dereference the pointer with * and then take the address with &. If the pointer is invalid (e.g. nullptr, or out of range), this is UB.
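As a quick illustration of that identity, here is a minimal sketch (the helper name forms_agree is made up for this example) that compares the different spellings for indices inside a real array, where all of them are well-defined:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns true if a+i, &a[i], &*(a+i) and &i[a] all
// yield the same pointer for every index i in [0, n).
// Precondition (assumed): a points to the first of at least n chars.
bool forms_agree(char* a, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        if (a + i != &a[i]) return false;      // pointer arithmetic vs subscript
        if (&a[i] != &*(a + i)) return false;  // subscript vs its definition
        if (&a[i] != &i[a]) return false;      // E1[E2] is *((E1)+(E2)), so i[a] works too
    }
    return true;
}
```

Called on a char buf[8], forms_agree(buf, 8) is expected to return true. The sketch deliberately stays inside [0, n) and says nothing about the null or one-past-the-end cases discussed in the question.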
a+i is based on pointer arithmetic. At first it looks less dangerous, since there is no dereferencing that would be UB for sure. However, it may also be UB (see [expr.add]/4):
When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 ≤ i + j ≤ n; otherwise, the behavior is undefined. Likewise, the expression P - J points to the (possibly-hypothetical) element x[i − j] if 0 ≤ i − j ≤ n; otherwise, the behavior is undefined.
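The range condition in that paragraph can be written down as a small guard. This is only a sketch under stated assumptions: checked_advance is a hypothetical helper, the caller is assumed to know the array length n and the current index i, and i + j is assumed not to overflow:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: `base` points to element 0 of an array with n
// elements, and the current position is element i. Advance by j only when
// 0 <= i + j <= n, the range for which [expr.add]/4 defines the result.
// Returns nullptr instead of performing an addition that would be undefined.
char* checked_advance(char* base, std::ptrdiff_t n,
                      std::ptrdiff_t i, std::ptrdiff_t j) {
    const std::ptrdiff_t target = i + j;  // assumed not to overflow
    if (target < 0 || target > n) {
        return nullptr;                   // outside [0, n]: refuse
    }
    return base + target;                 // target == n is one past the end
}
```

For a char x[4], moving from index 1 by 3 gives the one-past-the-end pointer x + 4, while moving by 4 or by -2 is rejected. Note that the one-past-the-end result may be formed and compared, but not dereferenced.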
So, while the semantics behind these two expressions are slightly different, I would say that the result is the same in the end.
answered Mar 1 at 7:37
Christophe
1
But see [expr.add]/7 of C++17 DIS, or [expr.add]/(4.1) of the current draft standard.
– cpplearner
Mar 1 at 8:14
1
@cpplearner github.com/cplusplus/draft/issues/2299
– Language Lawyer
Mar 1 at 13:14
Thanks for breaking this down into &*(a+i); that is very helpful. Just to clarify: I think you're saying that, if we believe [expr.add]/4, then the &* doesn't introduce any UB cases that were not already created by the (a+i)? (and if we don't believe [expr.add]/4, then the &* might conceivably create UB cases that did not exist in (a+i))? I guess I can accept that as a complete answer. Thank you.
– personal_cloud
Mar 1 at 17:24
@personal_cloud Yes! The pointer arithmetic and indexing are indeed defined in a very consistent manner, so that they lead to the same result (apart from the exception that you've already mentioned in your question).
– Christophe
Mar 1 at 19:57
add a comment |
1
But see [expr.add]/7 of C++17 DIS, or [expr.add]/(4.1) of the current draft standard.
– cpplearner
Mar 1 at 8:14
1
@cpplearner github.com/cplusplus/draft/issues/2299
– Language Lawyer
Mar 1 at 13:14
Thanks for breaking this down into&*(a+i); that is very helpful. Just to clarify: I think you're saying that, if we believe [expr.add]/4, then the&*doesn't introduce any UB cases that were not already created by the(a+i)? (and if we don't believe [expr.add]/4, then the&*might conceivably create UB cases that did not exist in(a+i))? I guess I can accept that as a complete answer. Thank you.
– personal_cloud
Mar 1 at 17:24
@personal_cloud Yes! The pointer arithmetic and indexing are indeed defined in a very consistent manner, so that they lead to the same result (a part from the exception that you've already mentioned in your question).
– Christophe
Mar 1 at 19:57
1
1
But see [expr.add]/7 of C++17 DIS, or [expr.add]/(4.1) of the current draft standard.
– cpplearner
Mar 1 at 8:14
But see [expr.add]/7 of C++17 DIS, or [expr.add]/(4.1) of the current draft standard.
– cpplearner
Mar 1 at 8:14
1
1
@cpplearner github.com/cplusplus/draft/issues/2299
– Language Lawyer
Mar 1 at 13:14
TL;DR: a+i and &a[i] are both well-formed and produce a null pointer when a is a null pointer and i is 0, according to (the intent of) the standard, and all compilers agree.
a+i is obviously well-formed per [expr.add]/4 of the latest draft standard:
When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.
- If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
- [...]
&a[i] is tricky. Per [expr.sub]/1, a[i] is equivalent to *(a+i), thus &a[i] is equivalent to &*(a+i). Now the standard is not quite clear about whether &*(a+i) is well-formed when a+i is a null pointer. But as @n.m. points out in a comment, the intent as recorded in CWG issue 232 is to permit this case.
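For the uncontroversial in-bounds case, the [expr.sub]/1 equivalence can be checked at compile time. The following is only a minimal sketch: the function name same_address_in_bounds and the four-element buffer are made up for illustration, and it says nothing about the null-pointer case discussed here.

```cpp
// Hypothetical helper (not part of the original answer): checks that a+i and
// &a[i] agree for every in-bounds index of a small local array.
constexpr bool same_address_in_bounds() {
    char buf[4] = {'a', 'b', 'c', 'd'};
    char* a = buf;
    for (int i = 0; i < 4; ++i) {
        if (a + i != &a[i]) return false;    // both forms yield the same pointer
        if (*(a + i) != a[i]) return false;  // a[i] is defined as *(a+i)
    }
    return true;
}

// Evaluated as a constant expression, so core language UB in the loop
// would have to be diagnosed by the compiler.
static_assert(same_address_in_bounds(), "a+i and &a[i] differ in bounds");
```

Because the check runs inside a constant expression, it relies on the same compiler diagnostics that the demo for the null case depends on.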
Since core language UB is required to be caught in a constant expression ([expr.const]/(4.6)), we can test whether compilers think these two expressions are UB.
Here's the demo. If a compiler thinks the constant expression in a static_assert is UB, or that its result is not true, then it must produce a diagnostic (error or warning) per the standard:
(Note that this uses single-parameter static_assert and constexpr lambdas, which are C++17 features, and default arguments on lambda parameters, which have been allowed since C++14.)
    static_assert(nullptr == [](char* a=nullptr, int i=0) {
        return a+i;
    }());
    static_assert(nullptr == [](char* a=nullptr, int i=0) {
        return &a[i];
    }());
From https://godbolt.org/z/hhsV4I, it seems all compilers behave uniformly in this case, producing no diagnostics at all (which surprises me a bit).
However, this is different from the offsetof case. The implementation posted in that question explicitly creates a reference (which is necessary to sidestep a user-defined operator&), and thus is subject to the requirements on references.
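To illustrate why such an implementation goes through a reference at all, here is a rough sketch using std::addressof, which takes its argument by reference and so needs a real object to bind to. The type Odd and the function overloaded_amp_lies are hypothetical names introduced only for this example; this is not the offsetof implementation from the linked question.

```cpp
#include <memory>

// Hypothetical type whose unary operator& is overloaded and returns nullptr
// instead of the object's address.
struct Odd {
    int value = 0;
    Odd* operator&() { return nullptr; }
};

// Takes a reference, so the caller must pass an actual object.
// Returns true when &o (the overload) and std::addressof(o) disagree.
inline bool overloaded_amp_lies(Odd& o) {
    return &o == nullptr && std::addressof(o) != nullptr;
}
```

Calling overloaded_amp_lies on any live Odd object returns true: the built-in address is only reachable by first binding a reference, which is exactly the step that the [dcl.ref] wording constrains.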
Since core language UB is required to be caught in a constant expression [expr.const] says "... would have undefined behavior as specified in ..." which I always understood as that the UB has to be explicitly specified. And there is no wording explicitly saying that *p is UB when p == nullptr.
– Language Lawyer
Mar 1 at 13:10
@LanguageLawyer Well if you mean it's well-formed, then I agree. If you mean there's something that's "not explicitly specified as UB, but is still UB", then I guess you need to prove the existence of such a thing.
– cpplearner
Mar 1 at 13:17
Prove the existence of UB which is not explicitly marked as UB by the standard? It follows from the definition of UB: behavior for which this International Standard imposes no requirements. If the standard imposes no requirement on something, this is UB, even if the standard does not say this explicitly. The note after the definition says this: Undefined behavior may be expected when this International Standard omits any explicit definition of behavior.
– Language Lawyer
Mar 1 at 13:24
@Language Lawyer But [expr.unary.op]/1 seems to define a basic requirement that if you have a pointer p, then *p designates its memory location. Doesn't say it has to be a valid location. So by your logic, *p is not UB... I think @cpplearner's logic is correct here; it is other sections of the spec, like [dcl.ref], that specifically create the possibility of UB here.
– personal_cloud
Mar 1 at 17:48
@personal_cloud I haven't found anything about memory location in the definition of the indirection operator. It says that the resulting lvalue refers to an object or function to which the pointer expression points. And since p == nullptr does not point to any object or function and this case is not explicitly handled by the standard, it is considered to be undefined behavior. AFAIU since this UB is implicit, the compilers are not required to diagnose it in constant expressions. But I'm not 100% sure here.
– Language Lawyer
Mar 1 at 18:14
edited Mar 1 at 9:22
answered Mar 1 at 8:27
cpplearner
According to the answers here, a+i is undefined if a=null, though your 4th link says it is defined if i=0, hmmm
– kmdreko
Mar 1 at 6:05
@kmdreko. That's a good point. I've tweaked the difference description to focus on the a=null, i=0 case for establishing that there is a difference between a+i and &a[i]... Again, leading one to wonder if there are any other differences between them.
– personal_cloud
Mar 1 at 6:14
The intent of the standard never was to disallow &*a when a is a null pointer. This is a subject of issue 232.
– n.m.
Mar 1 at 6:16
@n.m. Very interesting that the proposed resolution only specifies what happens in the case where a is null or "one past the last element of an array". These are pretty much the two most useful cases of empty lvalue! But why did they stop there and not just make a+i and &a[i] completely equivalent...?
– personal_cloud
Mar 1 at 6:27
I'm wondering if perhaps there is no difference between a+i and &a[i], but rather just some disagreements and/or misinterpretations of the spec, applied variously to one syntax or the other, without specific intent to apply only to one syntax or the other. Thereby making it appear that there is a difference between a+i and &a[i] when there is not really a difference? If one believes that C++ has restrictions on pointer arithmetic, then one would apply the same restrictions to both a+i and &a[i]? (But the restrictions themselves are clouded in controversy.)
– personal_cloud
Mar 1 at 7:13