Boxed types
That's the trait Sized, but what does ?Sized mean? The ? flags the relevant trait as optional. So, a box is a type in Rust parameterized over some other type T, which may or may not have a size known at compile time. A box is a kind of pointer to heap-allocated storage. Let's look into its implementation further. What of Unique<T>? This type is a signal to the Rust compiler that some *mut T is non-null and that the Unique is the sole owner of T, even though T was allocated outside the Unique. Unique is defined like so:
pub struct Unique<T: ?Sized> {
    pointer: NonZero<*const T>,
    // NOTE: this marker has no consequences for variance, but is
    // necessary for dropck to understand that we logically
    // own a `T`.
    _marker: PhantomData<T>,
}
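As a brief aside on the ?Sized bound that appears on both Box and Unique, the following sketch shows boxes holding sized and unsized payloads. The describe helper is a hypothetical name introduced only for illustration, and the size checks reflect how current compilers lay out thin and fat pointers rather than a documented guarantee.

```rust
use std::fmt::Debug;
use std::mem::size_of;

// Hypothetical helper: `T: ?Sized` lets it accept boxes of unsized
// types such as `str` and `[u8]` as well as ordinary sized types.
fn describe<T: ?Sized + Debug>(b: &Box<T>) -> String {
    format!("{:?}", b)
}

fn main() {
    // A sized payload: the box is a single thin pointer.
    let sized: Box<u64> = Box::new(7);
    // Unsized payloads: the box carries a pointer plus a length.
    let text: Box<str> = "hello".into();
    let slice: Box<[u8]> = vec![1, 2, 3].into_boxed_slice();

    // Observed layout on current compilers (an assumption, not a promise).
    assert_eq!(size_of::<Box<u64>>(), size_of::<usize>());
    assert_eq!(size_of::<Box<str>>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<Box<[u8]>>(), 2 * size_of::<usize>());

    println!("{} {} {}", describe(&sized), describe(&text), describe(&slice));
}
```

Without the ?Sized relaxation, describe would reject Box<str> and Box<[u8]> at compile time, since type parameters are implicitly bound by Sized.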
NonZero<T> is a struct that the rustc source describes as a wrapper type for raw pointers and integers that will never be NULL or 0, which might allow certain optimizations. It's annotated in a special way to admit those null pointer optimizations discussed elsewhere in this chapter. Unique is also of interest for its use of PhantomData<T>. PhantomData is, in fact, a zero-sized type, defined as pub struct PhantomData<T:?Sized>;. This type instructs the Rust compiler to consider PhantomData<T> as owning T even though, ultimately, there's nowhere for PhantomData to store its newfound T. This works well for Unique<T>, which must take ownership of T by maintaining a non-zero constant pointer to T but does not, itself, have T stored anywhere other than in the heap. A box is, then, a unique, non-null pointer to a thing allocated somewhere in memory but not inside the storage space of the box.
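To make the roles of the non-null pointer and the PhantomData marker concrete, here is a minimal sketch of a Unique-like type built from the public std::ptr::NonNull rather than the internal NonZero. MyUnique and its methods are hypothetical names for illustration and are not the rustc definitions. The size assertions describe what current compilers do; only the Option<Box<T>> case is a documented guarantee.

```rust
use std::marker::PhantomData;
use std::mem::size_of;
use std::ptr::NonNull;

// Hypothetical stand-in for the Unique described above; not the rustc type.
struct MyUnique<T: ?Sized> {
    pointer: NonNull<T>,
    // Zero-sized marker telling the compiler we logically own a `T`.
    _marker: PhantomData<T>,
}

impl<T> MyUnique<T> {
    // Move `value` to the heap and take sole ownership of the allocation.
    fn new(value: T) -> Self {
        let raw = Box::into_raw(Box::new(value));
        let pointer = NonNull::new(raw).expect("Box::into_raw never returns null");
        MyUnique { pointer, _marker: PhantomData }
    }

    fn get(&self) -> &T {
        // Sound because we own the allocation for as long as `self` lives.
        unsafe { self.pointer.as_ref() }
    }
}

impl<T: ?Sized> Drop for MyUnique<T> {
    fn drop(&mut self) {
        // Rebuild the Box so the allocation is freed exactly once.
        unsafe { drop(Box::from_raw(self.pointer.as_ptr())) }
    }
}

fn main() {
    // PhantomData occupies no storage, so the wrapper is pointer-sized.
    assert_eq!(size_of::<PhantomData<String>>(), 0);
    assert_eq!(size_of::<MyUnique<u32>>(), size_of::<*const u32>());
    // Null pointer optimization: None reuses the forbidden null value.
    assert_eq!(size_of::<Option<Box<u32>>>(), size_of::<Box<u32>>());
    assert_eq!(size_of::<Option<MyUnique<u32>>>(), size_of::<MyUnique<u32>>());

    let u = MyUnique::new(5u32);
    assert_eq!(*u.get(), 5);
}
```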
The internals of Box are compiler intrinsics: they interact closely with the allocator and receive special consideration in Rust's borrow checker. With that in mind, we will avoid chasing down the internal details of Box, as they will change from compiler version to compiler version and this book is explicitly not a rustc internals book. For our purposes, however, it is worth considering the API exposed by Box. The key functions are:
- fn from_raw(raw: *mut T) -> Box<T>
- fn from_unique(u: Unique<T>) -> Box<T>
- fn into_raw(b: Box<T>) -> *mut T
- fn into_unique(b: Box<T>) -> Unique<T>
- fn leak<'a>(b: Box<T>) -> &'a mut T
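The first and third functions in this list pair up naturally. The following is a minimal sketch of a round trip through a raw pointer, assuming the caller reboxes the pointer exactly once; round_trip is a hypothetical helper name introduced for illustration.

```rust
// Hypothetical helper: give up a box as a raw pointer, mutate through the
// pointer, then hand the allocation back to a Box so Rust frees it.
fn round_trip(value: u32) -> u32 {
    let boxed = Box::new(value);
    // Safe call: the box is consumed and we now manage the allocation.
    let raw: *mut u32 = Box::into_raw(boxed);
    unsafe {
        *raw += 1;
        // Unsafe call: we promise `raw` came from a Box and is reboxed once.
        let reboxed = Box::from_raw(raw);
        // Copy the value out; `reboxed` is dropped at the end of this
        // block, which releases the heap allocation.
        *reboxed
    }
}

fn main() {
    assert_eq!(round_trip(41), 42);
}
```

Calling Box::from_raw a second time on the same pointer would free the allocation twice, which is one of the hazards that makes the function unsafe.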
Both from_raw and from_unique are unsafe. Conversion from a raw pointer is unsafe if, for example, a raw pointer is boxed more than once or a box is made from a pointer that overlaps with another; there are other possibilities. Conversion from a Unique<T> is unsafe because the T may or may not be owned by the Unique, leaving open the possibility that the box is not the sole owner of its memory. The into_* functions, however, are safe in the sense that the resulting pointers will be valid, but the caller now takes on full responsibility for managing the lifetime of the memory. The Box documentation notes that the caller can release the memory themselves or convert the pointer back into the type it came from and allow Rust to do it for them. The latter is the approach this book will take.
Finally, there's leak. Leak is a fun one and, at the time of writing, is not available on the stable channel, but it is worth discussing for applications that will ship to embedded targets. A common memory management strategy for embedded systems is to pre-allocate all necessary memory and only operate on that memory for the lifetime of the program. In Rust, this is trivially accomplished if you desire uninitialized memory of a constant size: arrays and other primitive types. In the event you desire heap allocations at the start of your program, the situation is more complicated. That's where leak comes in: it causes memory to leak from a box (a heap allocation) to wherever you please. When the leaked memory is intended to live for the lifetime of the program, that is, for the static lifetime, there's no issue. An example is as follows, straight from the docs for leak:
#![feature(box_leak)]

fn main() {
    let x = Box::new(41);
    let static_ref: &'static mut usize = Box::leak(x);
    *static_ref += 1;
    assert_eq!(*static_ref, 42);
}
Here, we see a new usize allocated on the heap, leaked into static_ref (a mutable reference of static lifetime), and then fiddled with through the remaining lifetime of the program.
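The pre-allocation strategy described earlier can be sketched along the same lines. This example assumes a toolchain on which Box::leak needs no feature gate (it was stabilized after the example above was written), and preallocate is a hypothetical helper name, not part of any library.

```rust
// Hypothetical startup helper: allocate a zeroed buffer once and leak it,
// yielding a slice that stays valid for the rest of the program.
fn preallocate(len: usize) -> &'static mut [u8] {
    let buffer: Box<[u8]> = vec![0u8; len].into_boxed_slice();
    // The allocation is never freed; that is intentional here.
    Box::leak(buffer)
}

fn main() {
    let scratch = preallocate(1024);
    scratch[0] = 0xFF;
    assert_eq!(scratch.len(), 1024);
    assert_eq!(scratch[0], 0xFF);
    assert_eq!(scratch[1], 0);
}
```

Because the returned reference has the static lifetime, it can be handed to long-lived components without threading a lifetime parameter through them, at the cost of never reclaiming the memory.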