Hands-On Concurrency with Rust

Memory layout

Rust has a handful of mechanisms to lay out compound types in memory. They are as follows:

  • Arrays
  • Enums
  • Structs
  • Tuples

Exactly how these are laid out in memory depends on the representation chosen. By default, everything in Rust is repr(Rust). Every repr(Rust) type has an alignment that is a power of two: one byte, then two, then four, and so forth, and its size is a multiple of that alignment. Primitives such as u8, usize, bool, and &T are aligned to their size. A repr(Rust) struct takes the alignment of its most strictly aligned field. Consider the following struct:

struct AGC {
  elapsed_time2: u16,
  elapsed_time1: u16,
  wait_list_upper: u32,
  wait_list_lower: u16,
  digital_autopilot: u16,
  fine_scale: u16
}
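
One way to inspect a struct like this is to ask the compiler for its size and alignment with std::mem::size_of and std::mem::align_of. The following is a minimal sketch, not part of the original listing: the agc_layout helper is a hypothetical name introduced for illustration, and the assertions only check what repr(Rust) promises (alignment at least that of the widest field, size a multiple of the alignment) rather than one exact layout.

```rust
use std::mem::{align_of, size_of};

// Same fields as the AGC struct in the text. The struct is never
// constructed in this sketch, so dead-code warnings are silenced.
#[allow(dead_code)]
struct AGC {
    elapsed_time2: u16,
    elapsed_time1: u16,
    wait_list_upper: u32,
    wait_list_lower: u16,
    digital_autopilot: u16,
    fine_scale: u16,
}

// Hypothetical helper: returns (size, alignment) of AGC in bytes.
fn agc_layout() -> (usize, usize) {
    (size_of::<AGC>(), align_of::<AGC>())
}

fn main() {
    let (size, align) = agc_layout();
    // The struct is at least as strictly aligned as its u32 field.
    assert!(align >= align_of::<u32>());
    // Five u16 fields plus one u32 occupy 14 bytes before any padding.
    assert!(size >= 14);
    // The size is always a multiple of the alignment.
    assert_eq!(size % align, 0);
    println!("AGC: size = {} bytes, alignment = {} bytes", size, align);
}
```

On mainstream 64-bit targets this typically reports a size of 16 and an alignment of 4: the 14 bytes of fields are rounded up to the next multiple of four.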

AGC is aligned to u32, with padding inserted as needed to satisfy that 32-bit alignment. Under repr(Rust) the compiler is free to reorder fields to reduce padding, so the declaration order is not necessarily the order in memory. Enums are different, being subject to a host of optimizations, most notably the null pointer optimization. See the following enumeration:

enum AGCInstruction {
  TC,
  TCF,
  CCS(u8),
  B(u16),
  BZF,
}

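Conceptually, this will be laid out along the following lines, although the exact layout of a repr(Rust) enum is left to the compiler: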

struct AGCInstructionRepr {
  data: u16,
  tag: u8,
}

The data field is wide enough to accommodate the largest inner value, and the tag allows discrimination between variants. Where this gets complicated is an enum in which one variant holds a non-nullable pointer and the other variants carry no data. Option<&T> is the standard example: a reference can never be null, so Rust can use the null bit pattern to represent None. Rust therefore optimizes away the tag, and Option<&T> is the same size as &T.
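
The effect of the tag and of the null pointer optimization can be observed with std::mem::size_of. The sketch below is an illustration rather than part of the original text; it reuses the AGCInstruction enum and compares a few Option types against the pointers they wrap. The comparison with a raw pointer reflects how current compilers behave, since a raw pointer may legitimately be null and leaves no spare bit pattern for None.

```rust
use std::mem::size_of;

// Same enum as in the text; the variants are never constructed here.
#[allow(dead_code)]
enum AGCInstruction {
    TC,
    TCF,
    CCS(u8),
    B(u16),
    BZF,
}

fn main() {
    // The enum needs room for its widest payload (u16) plus a tag.
    assert!(size_of::<AGCInstruction>() > size_of::<u16>());

    // References and Box are never null, so None is stored as the
    // null bit pattern and no separate tag is needed.
    assert_eq!(size_of::<Option<&u64>>(), size_of::<&u64>());
    assert_eq!(size_of::<Option<Box<u64>>>(), size_of::<Box<u64>>());

    // A raw pointer may be null, so Option has to store a tag next to it.
    assert!(size_of::<Option<*const u64>>() > size_of::<*const u64>());

    println!("AGCInstruction: {} bytes", size_of::<AGCInstruction>());
    println!("Option<&u64>:   {} bytes", size_of::<Option<&u64>>());
}
```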

Rust supports other representations. repr(C) lays out types in memory the way C would and is often used in FFI projects, as we'll see later in this book. repr(packed) lays types out in memory like repr(Rust), except that no padding is added and alignment is only to the byte. This representation is likely to cause unaligned loads, which can severely hurt performance on common CPUs, certainly on the two CPU architectures we concern ourselves with in this book. The remaining representations force the size of fieldless enumerations (that is, enumerations that have no data in their variants) and are useful with an eye towards ABI compatibility.
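
To make the difference between these representations concrete, here is a small sketch. The type names PaddedPair, PackedPair, and Phase are hypothetical and exist only for this illustration. The packed struct uses repr(C, packed) rather than plain repr(packed) so that the field order is pinned down, and the numbers in the assertions assume a mainstream target where u32 is aligned to four bytes.

```rust
use std::mem::{align_of, size_of};

// C layout: fields stay in declaration order and padding is inserted
// so that `value` sits on a 4-byte boundary.
#[allow(dead_code)]
#[repr(C)]
struct PaddedPair {
    flag: u8,
    value: u32,
}

// Packed layout: no padding at all, alignment drops to one byte.
#[allow(dead_code)]
#[repr(C, packed)]
struct PackedPair {
    flag: u8,
    value: u32,
}

// A fieldless enum forced to occupy exactly one byte.
#[allow(dead_code)]
#[repr(u8)]
enum Phase {
    Boost,
    Coast,
    Entry,
}

fn main() {
    // 1 byte of flag + 3 bytes of padding + 4 bytes of value.
    assert_eq!(size_of::<PaddedPair>(), 8);
    assert_eq!(align_of::<PaddedPair>(), 4);

    // 1 byte of flag + 4 bytes of value, no padding.
    assert_eq!(size_of::<PackedPair>(), 5);
    assert_eq!(align_of::<PackedPair>(), 1);

    assert_eq!(size_of::<Phase>(), 1);

    let pair = PackedPair { flag: 1, value: 7 };
    // Copy the field out by value. Taking a reference to a field of a
    // packed struct is rejected because the field may be unaligned.
    let value = pair.value;
    assert_eq!(value, 7);
}
```

The three bytes saved by PackedPair come at the cost of `value` no longer sitting on a 4-byte boundary, which is the source of the unaligned loads mentioned above.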

Rust allocations happen by default on the hardware stack, as is common for other low-level languages. Heap allocations must be performed explicitly by the programmer or be done implicitly when creating a new type that holds some sort of internal storage. There are complications here. By default, Rust types obey move semantics: the bits of the type are moved as appropriate between contexts. For example:

fn project_flights() -> Vec<(u16, u8)> {
    let mut t = Vec::new();
    t.push((1968, 2));
    t.push((1969, 4));
    t.push((1970, 1));
    t.push((1971, 2));
    t.push((1972, 2));
    t
}

fn main() {
    let mut total: u8 = 0;
    let flights = project_flights();
    // Bind each per-year count to its own name instead of shadowing `flights`.
    for &(_, count) in flights.iter() {
        total += count;
    }
    println!("{}", total);
}

The project_flights function allocates a new Vec<(u16, u8)> on the heap, populates it, and then returns ownership of the heap-allocated vector to the caller. This does not mean that the elements of t are copied out of the stack frame of project_flights; instead, the small handle t (a pointer to the heap buffer along with its length and capacity) is handed from project_flights to main, and the buffer itself stays where it is. It is possible to achieve copy semantics in Rust through the use of the Copy trait. Copy types will have their bits copied in memory from one place to the other. Rust primitive types are Copy: copying them is as fast as moving them, especially when the type is smaller than the native pointer. It's possible to implement Copy for your own type, provided every field is itself Copy and the type does not implement Drop, the trait that defines how a type deallocates itself. This restriction eliminates, in Rust code not using unsafe, the possibility of double frees. The following code block derives Copy for two user-defined types:

#[derive(Clone, Copy, PartialEq, Eq)]
enum Project {
    Apollo,
    Gemini,
    Mercury,
}

#[derive(Clone, Copy)]
struct Mission {
    project: Project,
    number: u8,
    duration_days: u8,
}

fn flight() -> Mission {
    Mission {
        project: Project::Apollo,
        number: 8,
        duration_days: 6,
    }
}

fn main() {
    assert_eq!(::std::mem::size_of::<Mission>(), 3);
    let mission = flight();
    if mission.project == Project::Apollo && mission.number == 8 {
        assert_eq!(mission.duration_days, 6);
    }
}
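
The contrast between move and copy semantics described earlier can be checked directly. In the sketch below, which is an illustration and not part of the original listings, the Flight type and the pass_through function are hypothetical names. It records the address of a vector's heap buffer, moves the vector through a function, and confirms the buffer did not move; it then shows that a Copy value remains usable and unchanged after being copied.

```rust
// A small Copy type: every field is itself Copy and there is no Drop impl.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Flight {
    year: u16,
    count: u8,
}

// Takes ownership of the vector and hands it straight back.
// Only the (pointer, length, capacity) handle is moved.
fn pass_through(v: Vec<Flight>) -> Vec<Flight> {
    v
}

fn main() {
    let flights = vec![
        Flight { year: 1968, count: 2 },
        Flight { year: 1969, count: 4 },
    ];
    let buffer_before = flights.as_ptr();

    // `flights` is moved here and can no longer be used by name.
    let moved = pass_through(flights);

    // The heap buffer is the same one: a move does not copy the elements.
    assert_eq!(buffer_before, moved.as_ptr());

    // Indexing a Vec of Copy values yields an independent copy.
    let first = moved[0];
    let mut edited = first;
    edited.count = 3;

    // The original and the first copy are unaffected by the edit.
    assert_eq!(moved[0], Flight { year: 1968, count: 2 });
    assert_eq!(first.count, 2);
    assert_eq!(edited.count, 3);
    assert_eq!(edited.year, 1968);
}
```

Had Flight not derived Copy, the line `let first = moved[0];` would be rejected by the compiler, because moving an element out of a vector by indexing is not allowed.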