A classic C/C++ problem every developer has come across at some point is the deserialisation of raw byte buffers, such as network packets or binary files, into Plain Old Data (POD) structures. The most obvious (and conventional) solution to this problem is to simply interpret the bytes as the type of the object you are parsing. Consider the following code:
struct __attribute__((packed)) MessageHeader {
    uint32_t seq_no;
    uint16_t msg_type;
    uint16_t msg_len;
};

void onPacket(uint8_t *buffer, size_t size) {
    if (size < sizeof(MessageHeader)) return;
    const auto *hdr = reinterpret_cast<const MessageHeader *>(buffer);
    if (hdr->msg_type == 0)
        foo();
    else
        bar();
}
Assuming correct alignment and field offsets, the code above looks correct, and in practice it works on most modern compilers. However, it has undefined behaviour: reading hdr->msg_type violates the strict aliasing rules defined by the language standard. Those rules say that an object may only be accessed through a pointer (or reference) to its own type, or to one of a few permitted types such as char, unsigned char or std::byte. The buffer holds only bytes, not a MessageHeader object, yet we are instructing the compiler to read it as one.
UB-free solutions
Using memcpy
The safe way to solve the problem without violating strict aliasing is to use memcpy:
MessageHeader hdr;
std::memcpy(&hdr, buffer, sizeof(MessageHeader));
Modern compilers usually recognise small, fixed-size memcpy calls and replace them with plain loads and stores, so for a struct this size the copy is often free. The standard does not guarantee this optimisation, however.
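To put the two memcpy lines in context, here is a minimal sketch of a complete parsing helper. The name parseHeader and the use of std::optional are assumptions made for illustration, and the struct is declared without the packed attribute because its natural layout already has no padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>

// Same fields as the article's MessageHeader. The natural layout has no
// padding, so the packed attribute is left out here for portability.
struct MessageHeader {
    std::uint32_t seq_no;
    std::uint16_t msg_type;
    std::uint16_t msg_len;
};
static_assert(sizeof(MessageHeader) == 8, "unexpected padding");

// Hypothetical helper: copies the leading bytes of `buffer` into a real
// MessageHeader object, or returns std::nullopt if the buffer is too short.
std::optional<MessageHeader> parseHeader(const std::uint8_t* buffer,
                                         std::size_t size) {
    assert(buffer != nullptr);
    if (size < sizeof(MessageHeader)) return std::nullopt;

    MessageHeader hdr;
    std::memcpy(&hdr, buffer, sizeof hdr);  // no aliasing or alignment issue
    return hdr;
}
```

A side benefit of this approach is that the source buffer does not need to be aligned for MessageHeader, since memcpy works byte by byte.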
Placement new
Another alternative is to use placement new:
MessageHeader *hdr = new (buffer) MessageHeader;
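While this method avoids aliasing issues, it unnecessarily runs the initialisation of MessageHeader even though the buffer already contains a valid object representation. Worse, as far as the standard is concerned, the newly created object does not inherit the bytes that were in the buffer: default-initialisation leaves its members with indeterminate values, and value-initialisation (new (buffer) MessageHeader{}) zeroes them.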
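The cost of initialising in place can be made visible with value-initialisation, which is one of the ways placement new may be written. In the sketch below, constructZeroed is a hypothetical helper and the struct is the unpacked equivalent of MessageHeader; the caller is assumed to pass storage that is large enough and suitably aligned.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <new>

// Unpacked equivalent of the article's MessageHeader (no padding either way).
struct MessageHeader {
    std::uint32_t seq_no;
    std::uint16_t msg_type;
    std::uint16_t msg_len;
};

// Hypothetical helper: creates a MessageHeader in `storage` using
// value-initialisation (`{}`), which zeroes every field and therefore
// discards whatever bytes were stored there before.
MessageHeader* constructZeroed(void* storage) {
    assert(storage != nullptr);
    return new (storage) MessageHeader{};
}
```

If the buffer held a received packet, its header bytes would be gone after this call, which is the opposite of what a deserialiser wants.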
Neither of the two solutions above is ideal: we simply want to move pointers around and treat the memory as if it already were an object of type MessageHeader. To understand how to solve this problem without invoking undefined behaviour, we first need to understand object lifetimes in C++.
Back to basics: object creation
When we learn C++, we are taught that there are two steps involved when creating an object:
- Memory allocation, to reserve storage for the object
- Object construction, to initialise a valid representation of the object
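The two steps are normally fused into a single new expression, but they can be performed separately, which makes the distinction easier to see. The following is a minimal sketch; allocateThenConstruct is a hypothetical function name, and std::string is used only because it has a non-trivial constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>

// Hypothetical demo: performs allocation and construction by hand and
// returns the string length observed while the object is alive.
std::size_t allocateThenConstruct() {
    // Step 1: memory allocation. This is raw storage; no std::string exists yet.
    void* storage = ::operator new(sizeof(std::string));

    // Step 2: object construction, using placement new into that storage.
    std::string* s = new (storage) std::string("hello");
    assert(*s == "hello");
    const std::size_t length = s->size();

    // Teardown mirrors creation: destroy the object, then release the storage.
    std::destroy_at(s);
    ::operator delete(storage);
    return length;
}
```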
Object lifetimes
In reality, there is a third step which is often overlooked (and understandably so because it is transparent to us): lifetime initialisation. Whenever we call the constructor, the compiler will also associate the object type with the underlying allocated memory, so that it can keep track of aliasing information. The following table summarises the three steps:
Method | Memory Allocated | Constructor Called | Lifetime Started |
---|---|---|---|
Stack | ✅ | ✅ | ✅ |
Heap | ✅ | ✅ | ✅ |
Placement New | ❌ | ✅ | ✅ |
Now you might be asking yourself: is there any way to perform only the third step, i.e. to start the lifetime of an object without calling the constructor? If there were, it would solve the strict aliasing violation in the deserialisation program presented at the start of this article.
Method | Memory Allocated | Constructor Called | Lifetime Started |
---|---|---|---|
Stack | ✅ | ✅ | ✅ |
Heap | ✅ | ✅ | ✅ |
Placement New | ❌ | ✅ | ✅ |
std::start_lifetime_as | ❌ | ❌ | ✅ |
This is exactly what std::start_lifetime_as (introduced in C++23) aims to solve: it explicitly starts the lifetime of an object at a given memory location without invoking the constructor. We can update our original program:
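void onPacket(uint8_t *buffer, size_t size) {
    if (size < sizeof(MessageHeader)) return;
    // The template argument is the object type, not a pointer type;
    // the function returns a pointer to the object whose lifetime it started.
    const auto *hdr = std::start_lifetime_as<MessageHeader>(buffer);
    if (hdr->msg_type == 0)
        foo();
    else
        bar();
}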
The function above does not break strict aliasing rules (as it ensures the object has a valid lifetime before use) and does not perform any copies. In fact, the compiler should produce the exact same code as the original program, except that now the compiler is happy! 😄
Compiler support
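As of the time of writing (March 2025), no major compiler has implemented std::start_lifetime_as. A workaround is an approximate implementation built on std::memmove, which since C++20 implicitly creates objects in the destination buffer (source):
#include <cstring>      // std::memmove
#include <new>          // std::launder
#include <type_traits>  // std::is_trivially_copyable_v, std::is_implicit_lifetime_v

// Note: std::is_implicit_lifetime_v is itself a C++23 trait, so this
// constraint also depends on standard library support.
template<class T>
    requires (std::is_trivially_copyable_v<T> && std::is_implicit_lifetime_v<T>)
T* start_lifetime_as(void* p) noexcept {
    return std::launder(static_cast<T*>(std::memmove(p, p, sizeof(T))));
}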
However, the std::start_lifetime_as construct is guaranteed to not access the memory, whereas the code above does not provide the same guarantees (although in practice, the access is likely to be optimised out).
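Until library support arrives, one option is to select the implementation at compile time with the feature-test macro __cpp_lib_start_lifetime_as, which the standard library defines in <memory> once the function is available. The sketch below is an illustration only: readMsgType is a hypothetical helper, the struct is the unpacked equivalent of MessageHeader, and the fallback branch uses a plain memcpy copy rather than the memmove approximation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <memory>  // std::start_lifetime_as and its feature-test macro, if provided

// Unpacked equivalent of the article's MessageHeader (no padding either way).
struct MessageHeader {
    std::uint32_t seq_no;
    std::uint16_t msg_type;
    std::uint16_t msg_len;
};

// Hypothetical helper: returns msg_type from a buffer that is assumed to be
// at least sizeof(MessageHeader) bytes long and aligned for MessageHeader.
std::uint16_t readMsgType(std::uint8_t* buffer) {
    assert(buffer != nullptr);
#if defined(__cpp_lib_start_lifetime_as)
    // Zero-copy path: start the object's lifetime in place.
    const MessageHeader* hdr = std::start_lifetime_as<MessageHeader>(buffer);
    return hdr->msg_type;
#else
    // Portable fallback: copy the bytes into a real object.
    MessageHeader hdr;
    std::memcpy(&hdr, buffer, sizeof hdr);
    return hdr.msg_type;
#endif
}
```

Both branches give the same result for the caller, so the code can switch to the zero-copy path without changes once a toolchain defines the macro.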
C++23’s std::start_lifetime_as provides a robust way to treat raw byte buffers as structured objects without invoking undefined behaviour. Although it is not yet widely implemented, its introduction marks a significant advancement in memory management and type safety in modern C++ development.