Pointers to memory
Rust defines several kinds of pointers, each with a specific purpose. &T is a shared reference, and there may be many copies of that shared reference. The holder of a &T does not necessarily own T and may not modify it: shared references are immutable. Mutable references, written &mut T, likewise do not imply that the holder of the &mut T owns T, but the reference may be used to mutate T. While a &mut T to a given T exists, there may be no other reference to that T. This complicates some code but means that Rust is able to prove that two variables do not overlap in memory, unlocking a variety of optimization opportunities absent from C/C++.
Rust references are designed so that the compiler is able to prove the liveness of the referred-to value: references cannot dangle. This is not true of Rust's raw pointer types, *const T and *mut T, which work analogously to C pointers: they are, strictly, an address in memory, and no guarantees are made about the data at that address. As such, many operations on raw pointers, dereferencing among them, require the unsafe keyword, and raw pointers are almost always seen solely in the context of performance-sensitive code or FFI. Put another way, a raw pointer may be null; a reference may never be null.
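The three kinds of pointer can be seen side by side in a short sketch. The function names read_three_ways and increment are hypothetical, chosen only for illustration; the point is that shared references can be duplicated freely, mutation goes through a single &mut, and reading through a raw pointer needs an unsafe block.

```rust
// Reads the same value through two shared references and one raw pointer.
// `read_three_ways` is a hypothetical helper for illustration only.
fn read_three_ways(value: &u8) -> (u8, u8, u8) {
    let a: &u8 = value;
    let b: &u8 = value; // shared references may be duplicated at will
    let raw: *const u8 = value as *const u8; // creating a raw pointer is safe
    // Dereferencing the raw pointer is not: the compiler no longer vouches
    // for the address. Here it is valid because `value` is still live.
    let c = unsafe { *raw };
    (*a, *b, c)
}

// Mutation requires the exclusive `&mut u8`.
fn increment(target: &mut u8) {
    *target += 1;
}

fn main() {
    let mut x: u8 = 5;
    increment(&mut x);
    let (a, b, c) = read_three_ways(&x);
    println!("{} {} {}", a, b, c);

    // A raw pointer may be null; there is no way to write a null reference.
    let nothing: *const u8 = std::ptr::null();
    println!("null raw pointer: {}", nothing.is_null());
}
```

Note that only the dereference sits inside unsafe: building and passing around the raw pointer is ordinary safe code.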
The rules around references often cause difficulty in situations where an immutable borrow is accidentally taken of a value that is already mutably borrowed. The Rust documentation uses the following small program to illustrate the difficulty:
fn main() {
    let mut x: u8 = 5;
    let y: &mut u8 = &mut x;

    *y += 1;

    println!("{}", x);
}
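The println! macro takes its arguments by reference, implicitly creating a &x here. Under lexically scoped borrows the compiler rejects this program: y: &mut u8 holds exclusive access to x until the end of main, so the shared &x is invalid. Compilers with non-lexical lifetimes (the 2018 edition onward) end the borrow at the last use of y and accept the program as written; the rejection returns if y is used again after the println!. Were both references allowed to be live at once, we would be subject to a race between the update through y and the read of x, depending on the CPU and memory ordering.
The exclusive nature of mutable references could be limiting when working with structures. Rust allows programs to split borrows of a structure, provided that the disjoint fields cannot be aliased.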
We demonstrate this in the following brief program:
use std::fmt;

enum Project {
    Apollo,
    Gemini,
    Mercury,
}

impl fmt::Display for Project {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match *self {
            Project::Apollo => write!(f, "Apollo"),
            Project::Mercury => write!(f, "Mercury"),
            Project::Gemini => write!(f, "Gemini"),
        }
    }
}

struct Mission {
    project: Project,
    number: u8,
    duration_days: u8,
}

fn main() {
    let mut mission = Mission {
        project: Project::Gemini,
        number: 2,
        duration_days: 0,
    };
    // One shared and two mutable borrows, each of a different field.
    let proj: &Project = &mission.project;
    let num: &mut u8 = &mut mission.number;
    let dur: &mut u8 = &mut mission.duration_days;

    *num = 12;
    *dur = 3;

    println!("{} {} flew for {} days", proj, num, dur);
}
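The same field-level split can also be written by destructuring a mutable reference to the structure. The sketch below uses a trimmed-down Mission with only the two numeric fields, and swap_fields is a hypothetical helper that repairs a record whose number and duration were entered in the wrong order. std::mem::swap needs both mutable references alive at the same moment, which is only possible because the borrows are of disjoint fields.

```rust
// A trimmed-down stand-in for the Mission structure, for illustration only.
struct Mission {
    number: u8,
    duration_days: u8,
}

// Hypothetical helper: exchanges the two fields in place.
fn swap_fields(mission: &mut Mission) {
    // Destructuring through `&mut Mission` binds each field as `&mut u8`.
    let Mission { number, duration_days } = mission;
    // Both mutable references are live here; the compiler accepts this
    // because they point at different fields.
    std::mem::swap(number, duration_days);
}

fn main() {
    // Gemini 12 flew for 3 days, but the fields were entered swapped.
    let mut mission = Mission { number: 3, duration_days: 12 };
    swap_fields(&mut mission);
    println!(
        "Gemini {} flew for {} days",
        mission.number, mission.duration_days
    );
}
```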
This same trick is difficult to impossible for general container types. Consider a map where two keys map to the same referenced T. Or, for now, a slice:
use std::fmt;

enum Project {
    Apollo,
    Gemini,
    Mercury,
}

impl fmt::Display for Project {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match *self {
            Project::Apollo => write!(f, "Apollo"),
            Project::Mercury => write!(f, "Mercury"),
            Project::Gemini => write!(f, "Gemini"),
        }
    }
}

struct Mission {
    project: Project,
    number: u8,
    duration_days: u8,
}

fn main() {
    let mut missions: [Mission; 2] = [
        Mission {
            project: Project::Gemini,
            number: 2,
            duration_days: 0,
        },
        Mission {
            project: Project::Gemini,
            number: 12,
            duration_days: 2,
        },
    ];

    let gemini_2 = &mut missions[0];
    let _gemini_12 = &mut missions[1];

    println!(
        "{} {} flew for {} days",
        gemini_2.project, gemini_2.number, gemini_2.duration_days
    );
}
This program fails to compile with the following error:
> rustc borrow_split_array.rs
error[E0499]: cannot borrow `missions[..]` as mutable more than once at a time
  --> borrow_split_array.rs:40:27
   |
39 |     let gemini_2 = &mut missions[0];
   |                         ----------- first mutable borrow occurs here
40 |     let _gemini_12 = &mut missions[1];
   |                           ^^^^^^^^^^^ second mutable borrow occurs here
...
46 | }
   | - first borrow ends here

error: aborting due to previous error
We, the programmers, know that this was safe—gemini_2 and gemini_12 don't overlap in memory—but it's not possible for the compiler to prove this. What if we had done the following:
use std::fmt;

enum Project {
    Apollo,
    Gemini,
    Mercury,
}

impl fmt::Display for Project {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match *self {
            Project::Apollo => write!(f, "Apollo"),
            Project::Mercury => write!(f, "Mercury"),
            Project::Gemini => write!(f, "Gemini"),
        }
    }
}

struct Mission {
    project: Project,
    number: u8,
    duration_days: u8,
}

fn main() {
    let gemini_2 = Mission {
        project: Project::Gemini,
        number: 2,
        duration_days: 0,
    };

    // Both slots hold a reference to the same Mission.
    let mut missions: [&Mission; 2] = [&gemini_2, &gemini_2];

    let m0 = &mut missions[0];
    let _m1 = &mut missions[1];

    println!(
        "{} {} flew for {} days",
        m0.project, m0.number, m0.duration_days
    );
}
By construction, missions[0] and missions[1] refer to the same Mission and so overlap in memory. We, the programmers, know that we're breaking the aliasing rules here; the compiler, being conservative, assumes that the rules are being broken in both cases.
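When the elements really are distinct, the standard library offers a way to say so. The following is a minimal sketch using the slice method split_at_mut, which hands back two non-overlapping mutable sub-slices; it is one possible approach, not the only one. The Mission type is trimmed down for brevity and first_two is a hypothetical helper introduced here for illustration.

```rust
// A trimmed-down stand-in for the Mission structure, for illustration only.
struct Mission {
    number: u8,
    duration_days: u8,
}

// Hypothetical helper: returns mutable references to the first two
// elements of the slice, or None if the slice is too short.
fn first_two(missions: &mut [Mission]) -> Option<(&mut Mission, &mut Mission)> {
    if missions.len() < 2 {
        return None;
    }
    // `head` covers index 0 and `tail` covers index 1 onward. The two
    // sub-slices cannot overlap, so both may be borrowed mutably at once.
    let (head, tail) = missions.split_at_mut(1);
    Some((&mut head[0], &mut tail[0]))
}

fn main() {
    let mut missions = [
        Mission { number: 2, duration_days: 0 },
        Mission { number: 12, duration_days: 2 },
    ];

    if let Some((gemini_2, gemini_12)) = first_two(&mut missions) {
        // Two live mutable references into the same array.
        gemini_2.duration_days += 1;
        gemini_12.duration_days += 1;
        println!(
            "{} flew {} days, {} flew {} days",
            gemini_2.number,
            gemini_2.duration_days,
            gemini_12.number,
            gemini_12.duration_days
        );
    }
}
```

The disjointness that the borrow checker cannot see in missions[0] and missions[1] is established once, inside split_at_mut, and every caller then works in safe code.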