FP Complete


I recently joined Matt Moore on LambdaShow. We spent some time discussing Rust, and one point I made was that, in my experience with Rust, ergonomics go something like this:

That may seem a bit abstract. Fortunately for me, an example of that popped up almost immediately after the post went live. This is my sheepish blog post explaining how I fairly solidly misunderstood something about the borrow checker. Hopefully it will help others.

Two weeks back, I wrote an offhand tweet with a bit of a code puzzle:

This program _looks_ like it could segfault by using a pointer to a dropped String. Who wants to guess what it actually does? pic.twitter.com/gurHjdh2A7

— Michael Snoyman (@snoyberg) September 3, 2020

I thought this was a slightly tricky case of ownership, and hoped it would help push people to a more solid understanding of the topic. Soon after, I got a reply that gave the solution I had expected:

Without the RefCell it would fail to compile, because you have a live & with hello, then you try to borrow all_tags as &mut, despite an immutable reference. Since we use RefCell, this becomes a runtime panic instead.

— Lúcás Meier (@cronokirby) September 3, 2020

But then the twist: a question that made me doubt my own sanity.

But I don’t quite understand how it does that at runtime. All we keep is a plain reference, not a `Ref` (which keeps count). The `Ref` should actually get destroyed, but somehow it won’t because of the &str we keep. I studied the code quite a bit, but it still looks like magic.

— Eskimo Coder (@tuxkimo) September 11, 2020

This led me to filing a bogus bug report with the Rust team. Fortunately for me, Jonas Schievink had mercy and quickly pointed me to the documentation on temporary lifetime extension, which explains the whole situation.

If you’ve read this much, and everything made perfect sense, congratulations! You probably don’t need to bother reading the rest of the post. But if anything is unclear, keep reading. I’ll try to make this as clear as possible.

And if the explanation below still doesn’t make sense, may I recommend FP Complete’s Rust Crash Course eBook to brush up on ownership?

Borrow rules

Arguably the key feature of Rust is its borrow checker. One of the core rules of the borrow checker is that you cannot access data that is mutably referenced elsewhere. Or said more directly: you can either immutably borrow data multiple times, or mutably borrow it once, but not both at the same time. Usually, we let the borrow checker enforce this rule. And it enforces that rule at compile time.
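To make that rule concrete, here is a minimal sketch of code that stays on the right side of it. The birthday and celebrate functions are names I made up for illustration. The point is simply that any number of immutable borrows can coexist, and a mutable borrow is fine once the immutable ones are no longer used.

```rust
// Hypothetical helper for illustration: mutates through a mutable borrow.
fn birthday(age: &mut u32) {
    *age += 1;
}

// Hypothetical function for illustration: borrows immutably, then mutably.
fn celebrate() -> u32 {
    let mut age: u32 = 30;

    // Any number of immutable borrows may coexist.
    let first: &u32 = &age;
    let second: &u32 = &age;
    println!("Both references see {} and {}", first, second);
    // `first` and `second` are never used again, so their borrows end here.

    // Now a single mutable borrow is allowed.
    birthday(&mut age);
    age
}

fn main() {
    println!("Happy birthday, you're {} years old!", celebrate());
}
```

The compiler accepts this because the borrows never overlap in time: the immutable references are finished before the mutable one begins.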

However, there are some situations where a statically checked rule like that is too restrictive. In such cases, the Rust standard library provides cells, which let you move this borrow checking from compile time (via static analysis) to runtime (via dynamic counters). This is known as interior mutability. And a common type for this is a RefCell.

With a RefCell, the checking occurs at runtime. Let’s demonstrate how that works. First, consider this program that fails to compile:

fn main() {
    let mut age: u32 = 30;

    let age_ref: &u32 = &age;

    let age_mut_ref: &mut u32 = &mut age;
    *age_mut_ref += 1;

    println!("Happy birthday, you're {} years old!", age_ref);
}

We try to take both an immutable reference and a mutable reference to the value age simultaneously. This doesn’t work out too well:

error[E0502]: cannot borrow `age` as mutable because it is also borrowed as immutable
 --> src\main.rs:6:33
  |
4 |     let age_ref: &u32 = &age;
  |                         ---- immutable borrow occurs here
5 |
6 |     let age_mut_ref: &mut u32 = &mut age;
  |                                 ^^^^^^^^ mutable borrow occurs here
...
9 |     println!("Happy birthday, you're {} years old!", age_ref);
  |                                                      ------- immutable borrow later used here

The right thing to do is to fix this code. But let’s do the wrong thing! Instead of trying to fix it correctly, we’re going to use RefCell to replace our compile time checks (which prevent the code from building) with runtime checks (which allow the code to build, and then fail at runtime). Let’s check that out:

use std::cell::{Ref, RefMut, RefCell};
fn main() {
    let age: RefCell<u32> = RefCell::new(30);

    let age_ref: Ref<u32> = age.borrow();

    let mut age_mut_ref: RefMut<u32> = age.borrow_mut();
    *age_mut_ref += 1;

    println!("Happy birthday, you're {} years old!", age_ref);
}

It’s instructive to compare this code with the previous code. It looks remarkably similar! We’ve replaced &u32 with Ref<u32>, &mut u32 with RefMut<u32>, and &age and &mut age with age.borrow() and age.borrow_mut(), respectively. You may be wondering: what are those Ref and RefMut things? Hold that thought.

This code surprisingly compiles. And here’s the runtime output (using Rust Nightly, which gives a slightly nicer error message):

thread 'main' panicked at 'already borrowed: BorrowMutError', src\main.rs:7:44

That looks a lot like the error message we saw above from the compiler. That’s no accident: these are the same error showing up in two different ways.
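If you'd rather see the runtime check as a value instead of a panic, RefCell also provides try_borrow_mut, which returns a Result. Here is a small sketch. The try_birthday function is a hypothetical helper of mine, and I'm deliberately not promising the exact wording of the error text, since that has changed between Rust versions.

```rust
use std::cell::RefCell;

// Hypothetical helper for illustration: attempts the mutation and reports
// failure as an Err instead of panicking.
fn try_birthday(age: &RefCell<u32>) -> Result<u32, String> {
    match age.try_borrow_mut() {
        Ok(mut age_mut_ref) => {
            *age_mut_ref += 1;
            Ok(*age_mut_ref)
        }
        // BorrowMutError implements Display, so it can be formatted directly.
        Err(err) => Err(format!("cannot mutate right now: {}", err)),
    }
}

fn main() {
    let age = RefCell::new(30);

    let age_ref = age.borrow();
    // An immutable borrow is still alive, so this prints an Err.
    println!("{:?}", try_birthday(&age));

    drop(age_ref);
    // No borrows are alive any more, so this prints Ok(31).
    println!("{:?}", try_birthday(&age));
}
```

This is the same rule as before, just surfaced as data: the mutable borrow is refused for as long as an immutable one is alive.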

Ref and RefMut

Our code panics when it calls age.borrow_mut(). Something seems to know that the age_ref variable exists. And in fact, that’s basically true. When we called age.borrow(), a counter on the RefCell was incremented. As long as age_ref stays alive, that counter stays incremented. When age_ref goes out of scope, the Ref<u32> will be dropped, and the drop will cause the counter to be decremented. The same logic applies to age_mut_ref.

Let’s make two modifications to our code. First, there’s no need to call age.borrow() before age.borrow_mut(). Let’s slightly rearrange the code:

use std::cell::{Ref, RefMut, RefCell};
fn main() {
    let age: RefCell<u32> = RefCell::new(30);

    let mut age_mut_ref: RefMut<u32> = age.borrow_mut();
    *age_mut_ref += 1;

    let age_ref: Ref<u32> = age.borrow();
    println!("Happy birthday, you're {} years old!", age_ref);
}

This compiles, but still gives a runtime error. However, it’s a slightly different one:

thread 'main' panicked at 'already mutably borrowed: BorrowError', src\main.rs:8:33

Now the problem is that, when we try to call age.borrow(), the age_mut_ref is still active. Fortunately, we can fix that by manually dropping it before the age.borrow() call:

use std::cell::{Ref, RefMut, RefCell};
fn main() {
    let age: RefCell<u32> = RefCell::new(30);

    let mut age_mut_ref: RefMut<u32> = age.borrow_mut();
    *age_mut_ref += 1;
    std::mem::drop(age_mut_ref);

    let age_ref: Ref<u32> = age.borrow();
    println!("Happy birthday, you're {} years old!", age_ref);
}

And finally, our program not only compiles, but runs successfully! Now I know that I’m 31 years old! (Or at least I wish I still was.)

We have another mechanism for forcing the value to drop: an inner block. If we create a block within the main function, it will have its own scope, and the age_mut_ref will automatically be dropped, no need for std::mem::drop. That looks like this:

use std::cell::{Ref, RefMut, RefCell};
fn main() {
    let age: RefCell<u32> = RefCell::new(30);

    {
        let mut age_mut_ref: RefMut<u32> = age.borrow_mut();
        *age_mut_ref += 1;
    }

    let age_ref: Ref<u32> = age.borrow();
    println!("Happy birthday, you're {} years old!", age_ref);
}

Once again, this compiles and runs. Looking back, we can hopefully now understand why Ref and RefMut are necessary. If .borrow() and .borrow_mut() simply returned actual references (immutable or mutable), there would be no struct with a Drop impl to ensure that the internal counters in RefCell are decremented when those borrows go out of scope. So the world now makes sense.
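To see why a guard struct with a Drop impl is the key ingredient, here is a toy sketch of the idea. This is not how the standard library implements RefCell. ReadCounter and ReadGuard are hypothetical names, and the sketch only counts immutable borrows without guarding any data.

```rust
use std::cell::Cell;

// Hypothetical stand-in for RefCell's bookkeeping: it only counts readers.
struct ReadCounter {
    readers: Cell<usize>,
}

// Hypothetical stand-in for Ref: exists only so that dropping it can
// decrement the counter.
struct ReadGuard<'a> {
    owner: &'a ReadCounter,
}

impl ReadCounter {
    fn new() -> Self {
        ReadCounter { readers: Cell::new(0) }
    }

    // Increment the counter and hand back a guard tied to this counter.
    fn read(&self) -> ReadGuard<'_> {
        self.readers.set(self.readers.get() + 1);
        ReadGuard { owner: self }
    }

    fn active_readers(&self) -> usize {
        self.readers.get()
    }
}

impl Drop for ReadGuard<'_> {
    fn drop(&mut self) {
        // Runs when the guard goes out of scope or is dropped explicitly.
        self.owner.readers.set(self.owner.readers.get() - 1);
    }
}

fn main() {
    let counter = ReadCounter::new();
    let first = counter.read();
    {
        let _second = counter.read();
        println!("inside block: {}", counter.active_readers()); // prints 2
    }
    println!("after block: {}", counter.active_readers()); // prints 1
    drop(first);
    println!("after drop: {}", counter.active_readers()); // prints 0
}
```

A plain reference has no Drop impl to hook into, which is why the guard has to be its own type.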

No reference without a Ref

Here’s something cool: you can borrow a normal reference (e.g. &u32) from a Ref (e.g. Ref<u32>). Check this out:

use std::cell::{Ref, RefMut, RefCell};
fn main() {
    let age: RefCell<u32> = RefCell::new(30);

    {
        let mut age_mut_ref: RefMut<u32> = age.borrow_mut();
        *age_mut_ref += 1;
    }

    let age_ref: Ref<u32> = age.borrow();
    let age_reference: &u32 = &age_ref;
    println!("Happy birthday, you're {} years old!", age_reference);
}

age_ref is a Ref<u32>, but age_reference is a &u32. This is a compile-time-checked reference, and the compiler now requires that age_reference not outlive age_ref. As it stands, that’s true, and everything compiles and runs correctly. But we can break that really easily, either with std::mem::drop:

use std::cell::{Ref, RefMut, RefCell};
fn main() {
    let age: RefCell<u32> = RefCell::new(30);

    {
        let mut age_mut_ref: RefMut<u32> = age.borrow_mut();
        *age_mut_ref += 1;
    }

    let age_ref: Ref<u32> = age.borrow();
    let age_reference: &u32 = &age_ref;
    std::mem::drop(age_ref);
    println!("Happy birthday, you're {} years old!", age_reference);
}

Or by using inner blocks:

use std::cell::{Ref, RefMut, RefCell};
fn main() {
    let age: RefCell<u32> = RefCell::new(30);

    {
        let mut age_mut_ref: RefMut<u32> = age.borrow_mut();
        *age_mut_ref += 1;
    }

    let age_reference: &u32 = {
        let age_ref: Ref<u32> = age.borrow();
        &age_ref
    };
    println!("Happy birthday, you're {} years old!", age_reference);
}

The latter results in the error message:

error[E0597]: `age_ref` does not live long enough
  --> src\main.rs:12:9
   |
10 |     let age_reference: &u32 = {
   |         ------------- borrow later stored here
11 |         let age_ref: Ref<u32> = age.borrow();
12 |         &age_ref
   |         ^^^^^^^^ borrowed value does not live long enough
13 |     };
   |     - `age_ref` dropped here while still borrowed

Hopefully this makes sense: age_reference is borrowing from age_ref, and therefore cannot outlive it.
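As an aside, when the value is cheap to copy (like our u32), one way to sidestep the question entirely is to copy the value out and let the Ref go away immediately. A minimal sketch, with current_age being a hypothetical helper name of mine:

```rust
use std::cell::RefCell;

// Hypothetical helper for illustration: copies the u32 out of the RefCell.
// The temporary Ref only needs to live until the copy has been made.
fn current_age(age: &RefCell<u32>) -> u32 {
    *age.borrow()
}

fn main() {
    let age = RefCell::new(30);
    *age.borrow_mut() += 1;

    let age_copy: u32 = current_age(&age);

    // No Ref is alive here, so a mutable borrow succeeds.
    *age.borrow_mut() += 1;

    // prints "copied 31, now 32"
    println!("copied {}, now {}", age_copy, current_age(&age));
}
```

The copy is an owned u32 with no lifetime tied to any Ref, so there is nothing left for it to outlive.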

The false fail

Alright, our inner block currently looks like this, and refuses to compile:

let age_reference: &u32 = {
    let age_ref: Ref<u32> = age.borrow();
    &age_ref
};

age_ref is really a useless temporary variable inside that block. I assign a value to it, and then immediately borrow from that variable and never use it again. It should have no impact on our program to combine that into a single line within a block, right? Wrong. Check out this program:

use std::cell::{RefMut, RefCell};
fn main() {
    let age: RefCell<u32> = RefCell::new(30);

    {
        let mut age_mut_ref: RefMut<u32> = age.borrow_mut();
        *age_mut_ref += 1;
    }

    let age_reference: &u32 = {
        &age.borrow()
    };
    println!("Happy birthday, you're {} years old!", age_reference);
}

This looks almost identical to the code above. But this code compiles and runs successfully. What gives?!? It turns out, creating our temporary variable wasn’t quite as meaningless as we thought. That’s thanks to something called temporary lifetime extension. Let me start with a caveat from the docs themselves:

Note: The exact rules for temporary lifetime extension are subject to change. This is describing the current behavior only.

With that out of the way, let’s quote once more from the docs:

The temporary scopes for expressions in let statements are sometimes extended to the scope of the block containing the let statement. This is done when the usual temporary scope would be too small, based on certain syntactic rules.
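To make that rule observable, here is a small sketch that records when temporaries get dropped. Noisy and drop_order are hypothetical names I invented for this illustration. Per the caveat above, this reflects the behavior as I understand it today, not a guarantee.

```rust
use std::cell::RefCell;

// Hypothetical type for illustration: records its name in a shared log
// when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Noisy<'_> {
    fn name_len(&self) -> usize {
        self.name.len()
    }
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        // Borrowed directly in a let initializer: the temporary's scope is
        // extended to the end of the enclosing block.
        let _extended = &Noisy { name: "extended", log: &log };

        // Only used for a method call: the temporary is dropped at the end
        // of this let statement.
        let _len = Noisy { name: "statement", log: &log }.name_len();

        log.borrow_mut().push("end of block");
    }
    log.into_inner()
}

fn main() {
    // prints ["statement", "end of block", "extended"]
    println!("{:?}", drop_order());
}
```

The "statement" temporary is gone before the next line runs, while the "extended" one survives until the block closes.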

OK, I’m all done quoting. The documentation there is pretty good at explaining things. For our case above, let’s look at the code in question:

let age_reference: &u32 = {
    &age.borrow()
};

age.borrow() creates a value of type Ref<u32>. What variable holds that value? Trick question: there isn’t one. This value is temporary. We use temporary values in programming all the time. In (1 + 2) + 5, the expression 1 + 2 generates a temporary 3, which is then added to 5 and thrown away. Normally these temporaries aren’t terribly interesting.

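But in the context of lifetimes and borrow checkers, they are. Taken at the most literal, { &age.borrow() } should create a temporary Ref<u32> inside the block, take a reference to it, and drop the temporary when the block ends. That would leave age_reference pointing at a dropped value, so the program should be rejected, just like the version with the named age_ref variable.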

But this kind of thing would pop up all the time! Consider the incredibly simple examples from the docs that I promised not to quote from anymore (borrowing code snippets is different, OK?):

let x = &mut 0;
// Usually a temporary would be dropped by now, but the temporary for `0` lives
// to the end of the block.
println!("{}", x);

It turns out that strictly following lexical scoping rules for lifetimes wouldn’t be ergonomic. So there’s a special case to make it feel right.
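Coming back to the { &age.borrow() } block from earlier, we can check that the temporary Ref<u32> really is kept alive by asking the RefCell whether it would allow a mutable borrow. This is a sketch under the current extension rules. extended_ref_blocks_mutation is a hypothetical function name, and try_borrow_mut is the non-panicking sibling of borrow_mut.

```rust
use std::cell::RefCell;

// Hypothetical function for illustration: returns true if the RefCell
// refuses a mutable borrow while `age_reference` is in use.
fn extended_ref_blocks_mutation() -> bool {
    let age: RefCell<u32> = RefCell::new(31);

    // No variable holds the Ref<u32>, but its scope is extended to the end
    // of the enclosing block (here, the function body).
    let age_reference: &u32 = { &age.borrow() };

    // The hidden Ref still counts as an active immutable borrow.
    let blocked = age.try_borrow_mut().is_err();

    println!(
        "age_reference = {}, mutable borrow refused = {}",
        age_reference, blocked
    );
    blocked
}

fn main() {
    extended_ref_blocks_mutation();
}
```

So the runtime counter is still incremented even though all we appear to hold is a plain &u32. The Ref hasn't been destroyed, it just doesn't have a name.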

Conclusion

Firstly, I hope this was a good example of my comment about ergonomics. I never would have thought about let x = &mut 0 as a beginner: yeah, sure, I can borrow a reference to a number. Cool. Then, with a bit more experience, it suddenly seems shocking: what’s the lifetime of 0? And finally, with just a bit more experience (and the kind help of Rust issue tracker maintainers), it makes sense again.

Secondly, I hope this semi-deep dive into how RefCell moves borrow rule checking to runtime helps elucidate some things. In my opinion, this was one of the harder concepts to grok in my Rust learning journey.

Thirdly, I hope seeing the temporary lifetime extension rules helps clarify why some things work that you thought wouldn’t. I know I’ve been in the middle of writing something before, been surprised the borrow checker didn’t punch me in the face, and then happily went on my way instead of questioning why everything went better than expected.

The tweets I started this off with discuss a more advanced version than I covered in the rest of the post. I’d recommend going back to the top and making sure the code and explanations all make sense.

Want to learn more about Rust? Check out FP Complete’s Rust Crash Course, or read about our training courses.
