reinterpret_cast vs strict aliasing
I was reading about strict aliasing, but it's still kind of foggy and I am never sure where the line between defined and undefined behaviour is. The most detailed post I found concentrates on C. So it would be nice if you could tell me whether this is allowed and what has changed since C++98/11/...
#include <iostream>
#include <cstring>

template <typename T> T transform(T t);

struct my_buffer {
    char data[128];
    unsigned pos;
    my_buffer() : pos(0) {}
    void rewind() { pos = 0; }
    template <typename T> void push_via_pointer_cast(const T& t) {
        *reinterpret_cast<T*>(&data[pos]) = transform(t);
        pos += sizeof(T);
    }
    template <typename T> void pop_via_pointer_cast(T& t) {
        t = transform( *reinterpret_cast<T*>(&data[pos]) );
        pos += sizeof(T);
    }
};
// actually do some real transformation here (and actually also needs an inverse)
// ie this restricts allowed types for T
template<> int transform<int>(int x) { return x; }
template<> double transform<double>(double x) { return x; }

int main() {
    my_buffer b;
    b.push_via_pointer_cast(1);
    b.push_via_pointer_cast(2.0);
    b.rewind();
    int x;
    double y;
    b.pop_via_pointer_cast(x);
    b.pop_via_pointer_cast(y);
    std::cout << x << " " << y << '\n';
}
Please don't pay too much attention to a possible out-of-bounds access and the fact that maybe there is no need to write something like that. I know that char* is allowed to point to anything, but I also have a T* that points to a char*. And maybe there is something else I am missing.
Here is a complete example also including push/pop via memcpy, which afaik isn't affected by strict aliasing.
TL;DR: Does the above code exhibit undefined behaviour (neglecting an out-of-bounds access for the moment)? If yes, why? Did anything change with C++11 or one of the newer standards?
c++ language-lawyer strict-aliasing reinterpret-cast
My answer to "What is the strict aliasing rule?" covers C++ extensively and I believe covers your question as well. It is not uncommon for old questions to get new and much better answers over time, so it is important to look at all of the answers and not just the top ones. My most upvoted answer came 4 years after the original question was asked.
- Shafik Yaghmour, Aug 23 at 16:12
@ShafikYaghmour thanks, I will take a look. The problem with the other question is that it is tagged as C and C++faq but it does not have the C++ tag, so when I saw the top answers concentrating on C, I didn't consider it a dupe. If you think it is one, feel free to flag.
- user463035818, Aug 23 at 18:51
asked Aug 23 at 9:38 by user463035818, edited Aug 23 at 9:56
3 Answers
6 votes, accepted
"I know that char* is allowed to point to anything, but I also have a T* that points to a char*."
Right, and that is a problem. While the pointer cast itself has defined behaviour, using it to access a non-existent object of type T is not.
Unlike C, C++ does not allow impromptu creation of objects*. You cannot simply assign to some memory location as type T and have an object of that type be created; you need an object of that type to be there already. This requires placement new. Previous standards were ambiguous on it, but currently, per [intro.object]:
    1 [...] An object is created by a definition (6.1), by a new-expression (8.3.4), when implicitly changing the active member of a union (12.3), or when a temporary object is created (7.4, 15.2). [...]
Since you are not doing any of these things, no object is created.
Furthermore, C++ does not implicitly consider pointers to different objects at the same address as equivalent. Your &data[pos] computes a pointer to a char object. Casting it to T* does not make it point to any T object residing at that address, and dereferencing that pointer has undefined behaviour. C++17 adds std::launder, which is a way to let the compiler know that you want to access a different object at that address than the one you have a pointer to.
When you modify your code to use placement new and std::launder, and ensure you have no misaligned accesses (I presume you left that out for brevity), your code will have defined behaviour.
* There is discussion on allowing this in a future version of C++.
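A minimal sketch of what the placement new plus std::launder variant could look like, assuming a C++17 compiler. The names aligned_buffer, next_offset, push and pop are hypothetical and not from the question, the transform step is left out, and switching the storage to a max-aligned unsigned char array is my own assumption to keep the alignment handling simple:

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>

// Hypothetical rewrite of my_buffer (illustrative names, not the asker's code).
struct aligned_buffer {
    // Max-aligned raw storage, so rounding offsets up is enough to align any T.
    alignas(std::max_align_t) unsigned char data[128];
    std::size_t pos = 0;

    void rewind() { pos = 0; }

    // Round the current position up to the next multiple of alignof(T).
    template <typename T>
    std::size_t next_offset() const {
        return (pos + alignof(T) - 1) / alignof(T) * alignof(T);
    }

    template <typename T>
    void push(const T& t) {
        static_assert(std::is_trivially_destructible<T>::value,
                      "this sketch never runs destructors");
        const std::size_t at = next_offset<T>();
        assert(at + sizeof(T) <= sizeof(data));
        // Placement new actually creates a T object in the storage.
        ::new (static_cast<void*>(data + at)) T(t);
        pos = at + sizeof(T);
    }

    template <typename T>
    void pop(T& t) {
        const std::size_t at = next_offset<T>();
        assert(at + sizeof(T) <= sizeof(data));
        // std::launder (C++17) yields a pointer to the T living at that address.
        t = *std::launder(reinterpret_cast<T*>(data + at));
        pos = at + sizeof(T);
    }
};
```

Used like the original: push(1), push(2.0), rewind(), then pop into an int and a double. As before, popping a different type than was pushed is still not allowed.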
I left out the memcpy solution for brevity. std::launder is completely new to me ;)
- user463035818, Aug 23 at 14:39
After thinking about it, I am not so sure anymore whether your argument convinces me. What about using memcpy instead of the casts? In that case I am also not really creating an instance of T in place of data, but afaik there is no problem when using memcpy.
- user463035818, Aug 24 at 8:55
@user463035818 Right, but in that case, by not accessing it as if it were a T, you avoid the need for having an object of type T there.
- hvd, Aug 24 at 12:01
That makes sense. So in the char array there are just some bits that happen to be memcpy-able to a T (because I copied the bits of a T to that place), but not a T that I could retrieve via a cast, because the language requires that if I treat some memory as if there were an object of type T, I first have to create an object of type T at that place.
- user463035818, Aug 24 at 12:05
@user463035818 Exactly.
- hvd, Aug 24 at 12:53
2 votes
Short answer:
1. You may not do this: *reinterpret_cast<T*>(&data[pos]) = until there has been an object of type T constructed at the pointed-to address, which you can accomplish by placement new.
2. Even then, you might need to use std::launder in C++17 and later, since you access the created object (of type T) through a pointer &data[pos] of type char*.
"Direct" reinterpret_cast is allowed only in some special cases, e.g., when T is std::byte, char, or unsigned char.
Before C++17 I would use the memcpy-based solution. The compiler will likely optimize away any unnecessary copies.
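For reference, a sketch of how the memcpy-based push/pop might look. The question's own memcpy example is not reproduced on this page, so this is my reconstruction with hypothetical names (memcpy_buffer, push, pop), and it omits the transform step:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical memcpy-based variant (illustrative names, not the asker's code).
struct memcpy_buffer {
    char data[128];
    std::size_t pos = 0;

    void rewind() { pos = 0; }

    template <typename T>
    void push(const T& t) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "memcpy is only valid for trivially copyable types");
        assert(pos + sizeof(T) <= sizeof(data));
        // Copies the object representation; no T is ever accessed inside data.
        std::memcpy(data + pos, &t, sizeof(T));
        pos += sizeof(T);
    }

    template <typename T>
    void pop(T& t) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "memcpy is only valid for trivially copyable types");
        assert(pos + sizeof(T) <= sizeof(data));
        // The destination is a real T object, so the bytes land in a live T.
        std::memcpy(&t, data + pos, sizeof(T));
        pos += sizeof(T);
    }
};
```

Because only bytes are copied, there is no alignment requirement on the offset inside data, which is one reason this version is simpler than the placement new route.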
Whoa, 1. you can. What's the difference between: int *a = reinterpret_cast<int*>(malloc(sizeof(int))); *a = 5; and char buf[sizeof(int)]; int *a = reinterpret_cast<int*>(buf); *a = 5; ?
- Kamil Cuk, Aug 23 at 9:59
@KamilCuk No, C++ is very explicit about when objects are created. Assignment does not implicitly create an object.
- hvd, Aug 23 at 10:01
Ok, thank you! Need to delve into that topic a bit more then.
- Kamil Cuk, Aug 23 at 10:04
@KamilCuk It's not a simple topic :). See, for instance, the bottom part of this answer: stackoverflow.com/a/39382728/580083. Or, even better, this one: stackoverflow.com/a/27049038/580083.
- Daniel Langr, Aug 23 at 10:06
@KamilCuk NB: std::launder has its own limitations if you try to access aggregate types, e.g. arrays of arrays or structs.
- Swift - Friday Pie, Aug 23 at 10:42
2 votes
Aliasing is the situation where two names refer to the same object. Those names may be references or pointers.
int x;
int* p = &x;
int& r = x;
// aliases: x, r and *p refer to the same object.
It's important for the compiler to expect that a value written through one name is accessible through another.
int foo(int* a, int* b)
{
    *a = 0;
    *b = 1;
    return *a;
    // *a might be 0, might be 1, if b points at the same object.
    // The compiler can't short-circuit this to "return 0;"
}
Now if the pointers are of unrelated types, there is no reason for the compiler to expect that they point at the same address. This is the simplest UB:
#include <iostream>
int foo( float *f, int *i )
{
    *i = 1;
    *f = 0.f;
    return *i;
}
int main()
{
    int a = 0;
    std::cout << a << std::endl;
    int x = foo(reinterpret_cast<float*>(&a), &a);
    std::cout << x << " " << a << "\n"; // Surprise?
}
Simply put, strict aliasing means that the compiler expects names of unrelated types to refer to objects of different types, and thus to be located in separate storage. Because the addresses used to access that storage are in fact the same, the result of accessing the stored value is undefined and usually depends on optimization flags.
memcpy() circumvents that by taking the addresses as pointers to char and copying the stored data inside the library function's code.
Strict aliasing also applies to union members, which are described separately, but the reason is the same: writing to one member of a union doesn't guarantee that the values of the other members change. That doesn't apply to shared fields at the beginning of structs stored within a union (the common initial sequence). Thus, type punning through a union is prohibited. (Most compilers do not enforce this, for historical reasons and for the convenience of maintaining legacy code.)
From the 2017 Standard, 6.10 Lvalues and rvalues:
    8 If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
    (8.1) - the dynamic type of the object,
    (8.2) - a cv-qualified version of the dynamic type of the object,
    (8.3) - a type similar (as defined in 7.5) to the dynamic type of the object,
    (8.4) - a type that is the signed or unsigned type corresponding to the dynamic type of the object,
    (8.5) - a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
    (8.6) - an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
    (8.7) - a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
    (8.8) - a char, unsigned char, or std::byte type.
In 7.5:
    1 A cv-decomposition of a type T is a sequence of cv_i and P_i such that T is "cv_0 P_0 cv_1 P_1 ... cv_{n-1} P_{n-1} cv_n U" for n > 0, where each cv_i is a set of cv-qualifiers (6.9.3), and each P_i is "pointer to" (11.3.1), "pointer to member of class C_i of type" (11.3.3), "array of N_i", or "array of unknown bound of" (11.3.4). If P_i designates an array, the cv-qualifiers cv_{i+1} on the element type are also taken as the cv-qualifiers cv_i of the array. [ Example: The type denoted by the type-id const int ** has two cv-decompositions, taking U as "int" and as "pointer to const int". end example ] The n-tuple of cv-qualifiers after the first one in the longest cv-decomposition of T, that is, cv_1, cv_2, ..., cv_n, is called the cv-qualification signature of T.
    2 Two types T1 and T2 are similar if they have cv-decompositions with the same n such that corresponding P_i components are the same and the types denoted by U are the same.
The outcome is: while you can reinterpret_cast a pointer to a different, unrelated and non-similar type, you can't use that pointer to access the stored value:
char* pc = new char[100]{1,2,3,4,5,6,7,8,9,10}; // Note, initialized.
int* pi = reinterpret_cast<int*>(pc); // no problem.
int i = *pi; // UB
char* pc2 = reinterpret_cast<char*>(pi + 2);
char c = *pc2; // no problem, as long as the increment did not put us beyond the array bound.
A reinterpret_cast also doesn't create the object it points to, and assigning a value to a non-existent object is UB, so you can't use the dereferenced result of the cast to store data either if the class it points to isn't trivial.
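To make the asymmetry concrete, here is a small sketch of the two directions. The helper names same_bytes and load_int are hypothetical, introduced only for illustration. Inspecting an existing int through unsigned char* falls under item (8.8) quoted above, while going from raw char storage to an int is done with memcpy instead of a pointer cast:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Allowed direction: read the bytes of existing int objects via unsigned char*.
inline bool same_bytes(int a, int b) {
    const unsigned char* pa = reinterpret_cast<const unsigned char*>(&a);
    const unsigned char* pb = reinterpret_cast<const unsigned char*>(&b);
    for (std::size_t i = 0; i < sizeof(int); ++i) {
        if (pa[i] != pb[i]) return false;
    }
    return true;
}

// Other direction: do not dereference a cast int*; copy the bytes into a real int.
inline int load_int(const char* bytes) {
    int value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```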
Appreciate your exhaustive answer, but the case of an int foo( float *f, int *i ) is rather obvious, while I don't see (maybe I will after studying the answers more carefully) what the benefit of strict aliasing would be for the compiler in my code.
- user463035818, Aug 23 at 14:43
By the way, std::memcpy takes void* as its parameters, but iirc the same exceptions apply to void* as to char* regarding strict aliasing.
- user463035818, Aug 23 at 14:48
I think the term "aliasing" normally refers to the situation where one pointer or reference is used to reference the same storage as another pointer or reference that will also be used to reference that storage in conflicting fashion, and the former isn't known to be derived from the latter. Given void foo(char *p) { *p ^= 1; } long long bar(long long *x) { if (*x) foo((char*)x); return *x; } I don't think the pointer p within foo would normally be regarded as "aliasing" pointer x from bar, because anything that knows that both exist would know that p is derived from x.
- supercat, Aug 23 at 16:31
@user463035818 True, it is declared with void* for the sake of implicit pointer conversion, but per the documentation it reinterprets the pointers as unsigned char*. If you cast one way and then back, it doesn't break anything. Your code, as is, has issues with its inability to construct objects (were they all constructed before?) and its lack of type safety. It looks like a miniature memory pool with a stack interface and probably should be built as such; technically it doesn't breach aliasing as long as you do not push one type and pop another.
- Swift - Friday Pie, Aug 23 at 16:32
Accepted answer: answered Aug 23 at 10:00 by hvd
i left out the memcpy solution for brevity.std::launder
is completely new to me ;)
â user463035818
Aug 23 at 14:39
after thinking about it, I am not so sure anymore if your argumentation convinces me. What about usingmemcpy
instead of the casts? In that case I am also not really creating an instance ofT
in place ofdata
, but afaik there is no problem when usingmemcpy
â user463035818
Aug 24 at 8:55
@user463035818 Right, but in that case, by not accessing it as if it were aT
, you avoid the need for having an object of typeT
there.
â hvd
Aug 24 at 12:01
that makes sense. So in thechar
array there is just some bits that happen to be memcopyable to aT
(because I copied the bits of aT
in that place) but not aT
that I could retrive via a cast, because the language requires that if I treat some memory as if there was some object of typeT
I have to first create an object of typeT
at that place.
â user463035818
Aug 24 at 12:05
@user463035818 Exactly.
â hvd
Aug 24 at 12:53
 |Â
show 1 more comment
i left out the memcpy solution for brevity.std::launder
is completely new to me ;)
â user463035818
Aug 23 at 14:39
after thinking about it, I am not so sure anymore if your argumentation convinces me. What about usingmemcpy
instead of the casts? In that case I am also not really creating an instance ofT
in place ofdata
, but afaik there is no problem when usingmemcpy
â user463035818
Aug 24 at 8:55
@user463035818 Right, but in that case, by not accessing it as if it were aT
, you avoid the need for having an object of typeT
there.
â hvd
Aug 24 at 12:01
that makes sense. So in thechar
array there is just some bits that happen to be memcopyable to aT
(because I copied the bits of aT
in that place) but not aT
that I could retrive via a cast, because the language requires that if I treat some memory as if there was some object of typeT
I have to first create an object of typeT
at that place.
â user463035818
Aug 24 at 12:05
@user463035818 Exactly.
â hvd
Aug 24 at 12:53
up vote
2
down vote
Short answer:
You may not do this:
*reinterpret_cast<T*>(&data[pos]) =
until an object of type T has been constructed at the pointed-to address, which you can accomplish with placement new.
Even then, in C++17 and later you might need to use std::launder, since you access the created object (of type T) through the pointer &data[pos], which has type char*.
A "direct" reinterpret_cast is allowed only in some special cases, e.g., when T is std::byte, char, or unsigned char.
Before C++17 I would use the memcpy-based solution. The compiler will likely optimize away any unnecessary copies.
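A minimal sketch of the placement-new plus std::launder approach described above, assuming C++17 and trivially copyable, trivially destructible element types. The names typed_buffer, push, pop and align_up are made up for illustration and are not from the question; bounds checking is left out, as in the question. The storage is an unsigned char array because, as far as I know, C++17 only lets arrays of unsigned char or std::byte "provide storage" for other objects.

```cpp
#include <cassert>
#include <cstddef>
#include <new>          // placement new, std::launder (C++17)
#include <type_traits>

// Hypothetical buffer, loosely modelled on my_buffer from the question.
struct typed_buffer {
    alignas(std::max_align_t) unsigned char data[128];
    std::size_t pos = 0;

    void rewind() { pos = 0; }

    // Round p up to the next multiple of a (a must be non-zero).
    static std::size_t align_up(std::size_t p, std::size_t a) {
        return (p + a - 1) / a * a;
    }

    template <typename T>
    void push(const T& t) {
        static_assert(std::is_trivially_copyable<T>::value &&
                      std::is_trivially_destructible<T>::value,
                      "sketch only handles trivial types");
        pos = align_up(pos, alignof(T));
        // Placement new actually creates a T object inside the byte array.
        ::new (static_cast<void*>(&data[pos])) T(t);
        pos += sizeof(T);
    }

    template <typename T>
    void pop(T& t) {
        pos = align_up(pos, alignof(T));
        // The cast alone still points at an unsigned char element;
        // std::launder yields a pointer to the T created by push.
        t = *std::launder(reinterpret_cast<T*>(&data[pos]));
        pos += sizeof(T);
    }
};
```

Usage mirrors the question: push an int and a double, call rewind(), then pop them in the same order and with the same types. Popping a different type than was pushed would again access an object that was never created.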
edited Aug 23 at 10:18
answered Aug 23 at 9:58
Daniel Langr
Whoa, 1. you can. What's the difference between: int *a = reinterpret_cast<int*>(malloc(sizeof(int))); *a = 5;
and char buf[sizeof(int)]; int *a = reinterpret_cast<int*>(buf); *a = 5;
?
– Kamil Cuk
Aug 23 at 9:59
@KamilCuk No, C++ is very explicit about when objects are created. Assignment does not implicitly create an object.
– hvd
Aug 23 at 10:01
Ok, thank you! Need to delve into that topic a bit more then.
– Kamil Cuk
Aug 23 at 10:04
@KamilCuk It's not a simple topic :). See, for instance, the bottom part of this answer: stackoverflow.com/a/39382728/580083. Or, even better, this one: stackoverflow.com/a/27049038/580083.
– Daniel Langr
Aug 23 at 10:06
@KamilCuk NB: std::launder has its own limitations if you try to access aggregate types, e.g. arrays of arrays or structs.
– Swift - Friday Pie
Aug 23 at 10:42
up vote
2
down vote
Aliasing is the situation in which two names refer to the same object. Those names may be references or pointers.
int x;
int* p = &x;
int& r = x;
// aliases: x, r and *p refer to the same object.
It's important for the compiler to expect that a value written through one name is accessible through another.
int foo(int* a, int* b)
{
    *a = 0;
    *b = 1;
    return *a;
    // *a might be 0, might be 1, if b points at the same object.
    // The compiler can't short-circuit this to "return 0;"
}
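As a small illustration of why the compiler has to reload the value when both pointers have the same type, here is a sketch with fully defined behaviour. The name store_both is made up for this example; it has the same body as the function above.

```cpp
#include <cassert>

// Same shape as the function above: two int* parameters that may alias.
inline int store_both(int* a, int* b) {
    *a = 0;
    *b = 1;
    return *a;  // must be re-read: b may point at the same int as a
}
```

Called as store_both(&x, &x) the function returns 1, while store_both(&x, &y) with two distinct ints returns 0, so folding the return value to a constant 0 would be wrong.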
Now, if the pointers are of unrelated types, the compiler has no reason to expect that they point at the same address. This is the simplest UB:
#include <iostream>

int foo( float *f, int *i )
{
    *i = 1;
    *f = 0.f;
    return *i;
}

int main()
{
    int a = 0;
    std::cout << a << std::endl;
    int x = foo(reinterpret_cast<float*>(&a), &a); // UB: a is an int, written through a float*
    std::cout << a << "\n"; // Surprise?
}
Simply put, strict aliasing means that the compiler expects names of unrelated types to refer to objects of different types, and thus to be located in separate storage. Because the addresses used to access that storage are de facto the same, the result of accessing the stored value is undefined and usually depends on optimization flags.
memcpy() circumvents that: it accesses the storage through pointers to a character type and copies the stored bytes inside the library function.
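A minimal sketch of memcpy-based reinterpretation, assuming float and std::uint32_t have the same size on the target platform (checked with static_assert). The helper names float_bits and bits_to_float are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copy the object representation of a float into a 32-bit unsigned integer.
inline std::uint32_t float_bits(float f) {
    static_assert(sizeof(std::uint32_t) == sizeof(float), "size mismatch");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);  // byte-wise copy; f is never read through a uint32_t lvalue
    return u;
}

// The inverse direction: rebuild a float from previously copied bytes.
inline float bits_to_float(std::uint32_t u) {
    static_assert(sizeof(std::uint32_t) == sizeof(float), "size mismatch");
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}
```

Because each value is only ever read through an lvalue of its own type, no aliasing rule is involved; the bytes are just copied between two properly typed objects.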
Strict aliasing also applies to union members, which are described separately, but the reason is the same: writing to one member of a union does not guarantee that the values of the other members change. That does not apply to shared fields at the beginning of structs stored within a union (the common initial sequence). Thus, type punning through a union is prohibited. (Most compilers do not enforce this, for historical reasons and for the convenience of maintaining legacy code.)
From the 2017 Standard, 6.10 Lvalues and rvalues:
8 If a program attempts to access the stored value of an object
through a glvalue of other than one of the following types the
behavior is undefined:
(8.1) the dynamic type of the object,
(8.2) a cv-qualified version of the dynamic type of the object,
(8.3) a type similar (as defined in 7.5) to the dynamic type of the
object,
(8.4) a type that is the signed or unsigned type corresponding to
the dynamic type of the object,
(8.5) a type that is the signed or unsigned type corresponding to a
cv-qualified version of the dynamic type of the object,
(8.6) an aggregate or union type that includes one of the
aforementioned types among its elements or nonstatic data members
(including, recursively, an element or non-static data member of a
subaggregate or contained union),
(8.7) a type that is a (possibly cv-qualified) base class type of
the dynamic type of the object,
(8.8) a char, unsigned char, or std::byte type.
In 7.5:
1 A cv-decomposition of a type T is a sequence of cv_i and P_i such that T is "cv_0 P_0 cv_1 P_1 ... cv_(n-1) P_(n-1) cv_n U" for n > 0, where each
cv_i is a set of cv-qualifiers (6.9.3), and each P_i is "pointer to"
(11.3.1), "pointer to member of class C_i of type" (11.3.3), "array of
N_i", or "array of unknown bound of" (11.3.4). If P_i designates an
array, the cv-qualifiers cv_(i+1) on the element type are also taken as
the cv-qualifiers cv_i of the array. [ Example: The type denoted by the
type-id const int ** has two cv-decompositions, taking U as "int" and
as "pointer to const int". - end example ] The n-tuple of cv-qualifiers
after the first one in the longest cv-decomposition of T, that is,
cv_1, cv_2, ..., cv_n, is called the cv-qualification signature of T.
2 Two types T1 and T2 are similar if they have cv-decompositions with
the same n such that corresponding Pi components are the same and the
types denoted by U are the same.
The outcome is: while you can reinterpret_cast a pointer to a different, unrelated and non-similar type, you can't use the resulting pointer to access the stored value:
char* pc = new char[100]{1,2,3,4,5,6,7,8,9,10}; // Note, initialized.
int* pi = reinterpret_cast<int*>(pc);  // no problem.
int i = *pi; // UB
char* pc2 = reinterpret_cast<char*>(pi+2);
char c = *pc2; // no problem, as long as the increment didn't put us beyond the array bound.
reinterpret_cast also doesn't create the object it points to, and assigning a value to a non-existent object is UB, so you can't use the dereferenced result of the cast to store data either if the class it points to isn't trivial.
edited Aug 23 at 10:55
cHao
answered Aug 23 at 10:05
Swift - Friday Pie
I appreciate your exhaustive answer, but the case of int foo( float *f, int *i )
is rather obvious, while I don't see (maybe I will after studying the answers more carefully) what the benefit of strict aliasing would be for the compiler in my code.
– user463035818
Aug 23 at 14:43
btw std::memcpy takes void* as parameter, but iirc there are the same exceptions for void* as for char* regarding strict aliasing
– user463035818
Aug 23 at 14:48
I think the term "aliasing" normally refers to the situation where one pointer or reference is used to reference the same storage as another pointer or reference that will also be used to reference that storage in a conflicting fashion, and the former isn't known to be derived from the latter. Given
void foo(char *p) { *p^=1; } long long bar(long long *x) { if (*x) foo((char*)x); return *x; }
I don't think the pointer p within foo would normally be regarded as "aliasing" pointer x from bar, because anything that knows that both exist would know that p is derived from x.
– supercat
Aug 23 at 16:31
@user463035818 True, it is declared with 'void*' for the sake of implicit pointer conversion, but per the documentation it reinterprets the pointers as 'unsigned char*'. If you cast one way, then back, it doesn't break anything. Your code, as is, has an issue with its inability to construct objects (were they all constructed before?) and with its absence of type safety. It looks like a miniature memory pool with a stack interface and should probably be built as such; technically it doesn't involve a breach of aliasing as long as you do not push one type and pop another.
– Swift - Friday Pie
Aug 23 at 16:32
add a comment |Â
appreciate your exhaustive answer, but the case of aint foo( float *f, int *i )
is rather obvious while I dont see (maybe I will after studying the answers more carefully) what would be the benefit of strict aliasing for the compiler in my code
â user463035818
Aug 23 at 14:43
1
btwstd::memcpy
takesvoid*
as parameter, but iirc there are the same excpetions forvoid*
as forchar*
regarding strict aliasing
â user463035818
Aug 23 at 14:48
I think the term "aliasing" normally refers to the situation where one pointers or reference are used to reference the same storage as another pointer or reference that will also be used to reference that storage in conflicting fashion, and the former isn't known to be derived from the latter. Givenvoid foo(char *p) *p^=1; long long bar(long long *x) if (*x) foo((char*)x); return *x;
I don't think the pointerp
withinfoo
would normally be regarded as "aliasing" pointerx
frombar
, because anything that knows that both exist would know thatp
is derived fromx
.
â supercat
Aug 23 at 16:31
@user463035818 True, it is declared as taking 'void*' for the sake of implicit pointer conversion, but per its documentation it reinterprets the pointers as 'unsigned char*'. If you cast one way and then back, it doesn't break anything. Your code, as is, has the issue that it cannot construct objects (were all of them constructed before?) and lacks type safety. It looks like a miniature memory pool with a stack interface and should probably be built as such; technically it doesn't involve a breach of aliasing as long as you do not push one type and pop another.
– Swift - Friday Pie
Aug 23 at 16:32
My answer to "What is the strict aliasing rule?" covers C++ extensively, and I believe it covers your question as well. It is not uncommon for old questions to get new and much better answers over time, so it is important to look at all of the answers and not just the top ones. My most upvoted answer came 4 years after the original question was asked.
– Shafik Yaghmour
Aug 23 at 16:12
@ShafikYaghmour Thanks, I will take a look. The problem with the other question is that it is tagged as C and C++faq but does not have the C++ tag, so when I saw the top answers concentrating on C, I didn't consider it a dupe. If you think it is one, feel free to flag.
– user463035818
Aug 23 at 18:51