Hands-On Data Structures and Algorithms with Rust
上QQ阅读APP看书,第一时间看更新

A transaction log

First, a list has to be defined—in Rust, lacking a null type, each item is chained to the next by an Option property. The Option instances are enumerations that wrap either the value, in this case a heap reference (such as a Box, Rc, and so on), or none—Rust's typed null equivalent. Why? Let's find out!

Creating a prototypical implementation to explore a certain aspect is always a good idea, especially since the compiler often provides excellent feedback. Accordingly, an implementation of an integer list is the first step. How about this struct for each list element?

Have a look at the following code snippet:

struct Node {
value: i32,
next: Option<Node>
}

For practical considerations, it needs a way to know where to start and the length of the list. Considering the planned append operation, a reference to the end (tail) would be useful too:

struct TransactionLog {
head: Option<Node>,
tail: Option<Node>,
pub length: u64
}

That looks great! Does it work though?

error[E0072]: recursive type `Node` has infinite size
--> ch4/src/lib.rs:5:1
|
5 | struct Node {
| ^^^^^^^^^^^^^ recursive type has infinite size
6 | value: i32,
7 | next: Option<Node>
| ------------------ recursive without indirection
|
= help: insert indirection (e.g., a `Box`, `Rc`, or `&`) at some point to make `Node` representable

Unfortunately, it doesn't work—and, thinking back to the previous chapters, it becomes clear why: the compiler cannot be certain of the data structure's size, since the entire list would have to be nested into the first element. However, as we know, the compiler cannot compute and therefore allocate the required amount of memory this way—which is why reference types are required.

Reference types (such as Box, Rc, and so on) are a good fit, since they allocate space on the heap and therefore allow for larger lists. Here's an updated version:

use std::cell::RefCell;
use std::rc::Rc;

struct
Node {
value: i32,
next: Option<Rc<RefCell<Node>>>
}

struct TransactionLog {
head: Option<Rc<RefCell<Node>>>,
tail:
Option<Rc<RefCell<Node>>>,
pub length: u64
}

Storing each node item in a Rc<RefCell<T>> provides the ability to retrieve and replace data as needed (the internal mutability pattern)—crucial when executing operations on the list. Another good practice is to alias types, especially if there are a lot of generics in play. This makes it easy to replace type implementations and provides a more readable definition:

type SingleLink = Option<Rc<RefCell<Node>>>;

#[derive(Clone)]
struct Node {
value: i32,
next: SingleLink,
}

Perfect! This is the base definition of the transaction log, but to use it there are many things missing. First of all, the value type has to be String:

#[derive(Clone)]
struct Node {
value: String,
next: SingleLink,
}

impl Node {
// A nice and short way of creating a new node
fn new(value: String) -> Rc<RefCell<Node>> {
Rc::new(RefCell::new(Node {
value: value,
next: None,
}))
}
}

In addition to that, it is going to be useful to create an empty list, so the impl block of the list has a single function for now—new_empty():

impl TransactionLog {
pub fn new_empty() -> TransactionLog {
TransactionLog { head: None, tail: None, length: 0 }
    }
}

Still, there is a lot missing. To recap, the transaction log has two requirements:

  • Append entries at the end
  • Remove entries from the front

Let's start with the first requirement: appending items to the back of the list!