FP Complete


Most of the web services I’ve written in Rust have used actix-web. Recently, I needed to write something that will provide some reverse proxy functionality. I’m more familiar with the hyper-powered HTTP client libraries (reqwest in particular). I decided this would be a good time to experiment again with hyper on the server side as well. The theory was that having matching Request and Response types between the client and server would work nicely. And it certainly did.

In the process, I ended up with an interesting example of battling ownership through closures and async blocks. This is a topic I typically mention in my Rust training sessions as the hardest thing I had to learn when learning Rust. So I figure a blog post demonstrating one of these crazy cases would be worthwhile.

Side note: If you’re interested in learning more about Rust, we’ll be offering a free Rust training course in December. Sign up for more information.

Cargo.toml

If you want to play along, you should start off with a cargo new. I’m using the following [dependencies] in my Cargo.toml

[dependencies]
hyper = "0.13"
tokio = { version = "0.2", features = ["full"] }
log = "0.4.11"
env_logger = "0.8.1"
hyper-tls = "0.4.3"

I’m also compiling with Rust version 1.47.0. If you’d like, you can add 1.47.0 to your rust-toolchain. And finally, my full Cargo.lock is available as a Gist.

Basic web service

To get started with a hyper-powered web service, we can use the example straight from the hyper homepage:

use std::{convert::Infallible, net::SocketAddr};
use hyper::{Body, Request, Response, Server};
use hyper::service::{make_service_fn, service_fn};

async fn handle(_: Request<Body>) -> Result<Response<Body>, Infallible> {
    Ok(Response::new("Hello, World!".into()))
}

#[tokio::main]
async fn main() {
    let addr = SocketAddr::from(([127, 0, 0, 1], 3000));

    let make_svc = make_service_fn(|_conn| async {
        Ok::<_, Infallible>(service_fn(handle))
    });

    let server = Server::bind(&addr).serve(make_svc);

    if let Err(e) = server.await {
        eprintln!("server error: {}", e);
    }
}

It’s worth explaining this a little bit, since at least in my opinion the distinction between make_service_fn and service_fn wasn’t clear. There are two different things we’re trying to create here:

This glosses over a number of details, such as:

To help us with that “glossing”, hyper provides two convenience functions for creating MakeService and Service values, make_service_fn and service_fn. Each of these will convert a closure into their respective types. Then the MakeService closure can return a Service value, and the MakeService value can be provided to hyper::server::Builder::serve. Let’s get even more concrete from the code above:

async fn handle(_: Request<Body>) -> Result<Response<Body>, Infallible> {...}
let make_svc = make_service_fn(|_conn| async {
    Ok::<_, Infallible>(service_fn(handle))
});

The handle function takes a Request<Body> and returns a Future<Output=Result<Response<Body, Infallible>>>. The Infallible is a nice way of saying “no errors can possibly occur here.” The type signatures at play require that we use a Result, but morally Result<T, Infallible> is equivalent to T.

service_fn converts this handle value into a Service value. This new value implements all of the appropriate traits to satisfy the requirements of make_service_fn and serve. We wrap up that new Service in its own Result<_, Infallible>, ignore the input &AddrStream value, and pass all of this to make_service_fn. make_svc is now a value that can be passed to serve, and we have “Hello, world!”

And if all of this seems a bit complicated for a “Hello world,” you may understand why there are lots of frameworks built on top of hyper to make it easier to work with. Anyway, onwards!

Initial reverse proxy

Next up, we want to modify our handle function to perform a reverse proxy instead of returning the “Hello, World!” text. For this example, we’re going to hard-code https://www.fpcomplete.com as the destination site for this reverse proxy. To make this happen, we’ll need to:

I’m also going to move over to the env-logger and log crates for producing output. I did this when working on the code myself, and switching to RUST_LOG=debug was a great way to debug things. (When I was working on this, I forgot I needed to create a special Client with TLS support.)

So from the top! We now have the following use statements:

use hyper::service::{make_service_fn, service_fn};
use hyper::{Body, Client, Request, Response, Server};
use hyper_tls::HttpsConnector;
use std::net::SocketAddr;

We next have three constants. The SCHEME and HOST are pretty self-explanatory: the hardcoded destination.

const SCHEME: &str = "https";
const HOST: &str = "www.fpcomplete.com";

Next we have some HTTP request headers that should not be forwarded onto the destination server. This blacklist approach to HTTP headers in reverse proxies works well enough. It’s probably a better idea in general to follow a whitelist approach. In any event, these six headers have the potential to change behavior at the transport layer, and therefore cannot be passed on from the client:

/// HTTP headers to strip, a whitelist is probably a better idea
const STRIPPED: [&str; 6] = [
    "content-length",
    "transfer-encoding",
    "accept-encoding",
    "content-encoding",
    "host",
    "connection",
];

And next we have a fairly boilerplate error type definition. We can generate a hyper::Error when performing the HTTP request to the destination server, and a hyper::http::Error when constructing the new Request. Arguably we should simply panic if the latter error occurs, since it indicates programmer error. But I’ve decided to treat it as its own error variant. So here’s some boilerplate!

#[derive(Debug)]
enum ReverseProxyError {
    Hyper(hyper::Error),
    HyperHttp(hyper::http::Error),
}

impl From<hyper::Error> for ReverseProxyError {
    fn from(e: hyper::Error) -> Self {
        ReverseProxyError::Hyper(e)
    }
}

impl From<hyper::http::Error> for ReverseProxyError {
    fn from(e: hyper::http::Error) -> Self {
        ReverseProxyError::HyperHttp(e)
    }
}

impl std::fmt::Display for ReverseProxyError {
    fn fmt(&self, fmt: &mut std::fmt::Formatter) -> std::fmt::Result {
        write!(fmt, "{:?}", self)
    }
}

impl std::error::Error for ReverseProxyError {}

With all of this in place, we can finally start writing our handle function:

async fn handle(mut req: Request<Body>) -> Result<Response<Body>, ReverseProxyError> {
}

We’re going to mutate the incoming Request to have our new destination, and then pass it along to the destination server. This is where the beauty of using hyper for client and server comes into play: no need to futz around with changing body or header representations. The first thing we do is strip out any of the STRIPPED request headers:

let h = req.headers_mut();
for key in &STRIPPED {
    h.remove(*key);
}

Next, we’re going to construct the new request URI by combining:

let mut builder = hyper::Uri::builder()
    .scheme(SCHEME)
    .authority(HOST);
if let Some(pq) = req.uri().path_and_query() {
    builder = builder.path_and_query(pq.clone());
}
*req.uri_mut() = builder.build()?;

Panicking if req.uri().path_and_query() is None would be appropriate here, but as is my wont, I’m avoiding panics if possible. Next, for good measure, let’s add in a little bit of debug output:

log::debug!("request == {:?}", req);

Now we can construct our Client value to perform the HTTPS request:

let https = HttpsConnector::new();
let client = Client::builder().build(https);

And finally, let’s perform the request, log the response, and return the response:

let response = client.request(req).await?;
log::debug!("response == {:?}", response);
Ok(response)

Our main function looks pretty similar to what we had before. I’ve added in initialization of env-logger with a default to info level output, and modified the program to abort if the server produces any errors:

#[tokio::main]
async fn main() {
    env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info")).init();
    let addr = SocketAddr::from(([0, 0, 0, 0], 3000));

    let make_svc = make_service_fn(|_conn| async {
        Ok::<_, ReverseProxyError>(service_fn(handle))
    });

    let server = Server::bind(&addr).serve(make_svc);
    log::info!("Server started, bound on {}", addr);

    if let Err(e) = server.await {
        log::error!("server error: {}", e);
        std::process::abort();
    }
}

The full code is available as a Gist. This program works as expected, and if I cargo run it and connect to http://localhost:3000, I see the FP Complete homepage. Yay!

Wasteful Client

The problem with this program is that it constructs a brand new Client value on every incoming request. That’s expensive. Instead, we would like to produce the Client once, in main, and reuse it for each request. And herein lies the ownership puzzle. While we’re at this, let’s move away from using consts for the scheme and host, and instead bundle together the client, scheme, and host into a new struct:

struct ReverseProxy {
    scheme: String,
    host: String,
    client: Client<HttpsConnector<hyper::client::HttpConnector>>,
}

Next, we’ll want to change handle from a standalone function to a method on ReverseProxy. (We could equivalently pass in a reference to ReverseProxy for handle, but this feels more idiomatic):

impl ReverseProxy {
    async fn handle(&self, mut req: Request<Body>) -> Result<Response<Body>, ReverseProxyError> {
        ...
    }
}

Then, within handle, we can replace SCHEME and HOST with &*self.scheme and &*self.host. You may be wondering “why &* and not &.” Without &*, you’ll get an error message:

error[E0277]: the trait bound `hyper::http::uri::Scheme: std::convert::From<&std::string::String>` is not satisfied
  --> srcmain.rs:59:14
   |
59 |             .scheme(&self.scheme)
   |              ^^^^^^ the trait `std::convert::From<&std::string::String>` is not implemented for `hyper::http::uri::Scheme`

This is one of those examples where the magic of deref coercion seems to fall apart. Personally, I prefer using self.scheme.as_str() instead of &*self.scheme to be more explicit, but &*self.scheme is likely more idiomatic.

Anyway, the final change within handle is to remove the let https = ...; and let client = ...; statements, and instead construct our response with:

let response = self.client.request(req).await?;

With that, our handle method is done, and we can focus our efforts on the true puzzle: the main function itself.

The easy part

The easy part of this is great: construct a ReverseProxy value, and provide the make_svc to serve:

#[tokio::main]
async fn main() {
    env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info")).init();
    let addr = SocketAddr::from(([0, 0, 0, 0], 3000));

    let https = HttpsConnector::new();
    let client = Client::builder().build(https);

    let rp = ReverseProxy {
        client,
        scheme: "https".to_owned(),
        host: "www.fpcomplete.com".to_owned(),
    };

    // here be dragons

    let server = Server::bind(&addr).serve(make_svc);
    log::info!("Server started, bound on {}", addr);

    if let Err(e) = server.await {
        log::error!("server error: {}", e);
        std::process::abort();
    }
}

That middle part is where the difficulty lies. Previously, this code looked like:

let make_svc = make_service_fn(|_conn| async {
    Ok::<_, ReverseProxyError>(service_fn(handle))
});

We no longer have a handle function. Working around that little enigma doesn’t seem so bad initially. We’ll create a closure as the argument to service_fn:

let make_svc = make_service_fn(|_conn| async {
    Ok::<_, ReverseProxyError>(service_fn(|req| {
        rp.handle(req)
    }))
});

While that looks appealing, it fails lifetimes completely:

error[E0597]: `rp` does not live long enough
   --> srcmain.rs:90:13
    |
88  |       let make_svc = make_service_fn(|_conn| async {
    |  ____________________________________-------_-
    | |                                    |
    | |                                    value captured here
89  | |         Ok::<_, ReverseProxyError>(service_fn(|req| {
90  | |             rp.handle(req)
    | |             ^^ borrowed value does not live long enough
91  | |         }))
92  | |     });
    | |_____- returning this value requires that `rp` is borrowed for `'static`
...
101 |   }
    |   - `rp` dropped here while still borrowed

Nothing in the lifetimes of these values tells us that the ReverseProxy value will outlive the service. We cannot simply borrow a reference to ReverseProxy inside our closure. Instead, we’re going to need to move ownership of the ReverseProxy to the closure.

let make_svc = make_service_fn(|_conn| async {
    Ok::<_, ReverseProxyError>(service_fn(move |req| {
        rp.handle(req)
    }))
});

Note the addition of move in front of the closure. Unfortunately, this doesn’t work, and gives us a confusing error message:

error[E0495]: cannot infer an appropriate lifetime for autoref due to conflicting requirements
  --> srcmain.rs:90:16
   |
90 |             rp.handle(req)
   |                ^^^^^^
   |
note: first, the lifetime cannot outlive the lifetime `'_` as defined on the body at 89:47...
  --> srcmain.rs:89:47
   |
89 |         Ok::<_, ReverseProxyError>(service_fn(move |req| {
   |                                               ^^^^^^^^^^
note: ...so that closure can access `rp`
  --> srcmain.rs:90:13
   |
90 |             rp.handle(req)
   |             ^^
   = note: but, the lifetime must be valid for the static lifetime...
note: ...so that the type `hyper::proto::h2::server::H2Stream<impl std::future::Future, hyper::Body>` will meet its required lifetime bounds
  --> srcmain.rs:94:38
   |
94 |     let server = Server::bind(&addr).serve(make_svc);
   |                                      ^^^^^

error: aborting due to previous error

Instead of trying to parse that, let’s take a step back, reassess, and then try again.

So many layers!

Remember way back to the beginning of this post. I went into some details around the process of having a MakeService, which would be run for each new incoming connection, and a Service, which would be run for each new request on an existing connection. The way we’ve written things so far, the first time we handle a request, that request handler will consume the ReverseProxy. That means that we would have a use-after-move for each subsequent request on that connection. We’d also have a use-after-move for each subsequent connection we receive.

We want to share our ReverseProxy across multiple different MakeService and Service instantiations. Since this will occur across multiple system threads, the most straightforward way to handle this is to wrap our ReverseProxy in an Arc:

let rp = std::sync::Arc::new(ReverseProxy {
    client,
    scheme: "https".to_owned(),
    host: "www.fpcomplete.com".to_owned(),
});

Now we’re going to need to play around with cloneing this Arc at appropriate times. In particular, we’ll need to clone twice: once inside the make_service_fn closure, and once inside the service_fn closure. This will ensure that we never move the ReverseProxy value out of the closure’s environment, and that our closure can remain a FnMut instead of an FnOnce.

And, in order to make that happen, we’ll need to convince the compiler through appropriate usages of move to move ownership of the ReverseProxy, instead of borrowing a reference to a value with a different lifetime. This is where the fun begins! Let’s go through a series of modifications until we get to our final mind-bender.

Adding move

To recap, we’ll start with this code:

let rp = std::sync::Arc::new(ReverseProxy {
    client,
    scheme: "https".to_owned(),
    host: "www.fpcomplete.com".to_owned(),
});

let make_svc = make_service_fn(|_conn| async {
    Ok::<_, ReverseProxyError>(service_fn(|req| {
        rp.handle(req)
    }))
});

The first thing I tried was adding an rp.clone() inside the first async block:

let make_svc = make_service_fn(|_conn| async {
    let rp = rp.clone();
    Ok::<_, ReverseProxyError>(service_fn(|req| {
        rp.handle(req)
    }))
});

This doesn’t work, presumably because I need to stick some moves on the initial closure and async block like so:

let make_svc = make_service_fn(move |_conn| async move {
    let rp = rp.clone();
    Ok::<_, ReverseProxyError>(service_fn(|req| {
        rp.handle(req)
    }))
});

This unfortunately still doesn’t work, and gives me the error message:

error[E0507]: cannot move out of `rp`, a captured variable in an `FnMut` closure
  --> srcmain.rs:88:60
   |
82 |       let rp = std::sync::Arc::new(ReverseProxy {
   |           -- captured outer variable
...
88 |       let make_svc = make_service_fn(move |_conn| async move {
   |  ____________________________________________________________^
89 | |         let rp = rp.clone();
   | |                  --
   | |                  |
   | |                  move occurs because `rp` has type `std::sync::Arc<ReverseProxy>`, which does not implement the `Copy` trait
   | |                  move occurs due to use in generator
90 | |         Ok::<_, ReverseProxyError>(service_fn(|req| {
91 | |             rp.handle(req)
92 | |         }))
93 | |     });
   | |_____^ move out of `rp` occurs here

It took me a while to grok what was happening. And in fact, I’m not 100% certain I grok it yet. But I believe what is happening is:

That’s no good! It turns out the trick to fixing this isn’t so difficult. Don’t grab ownership in the async block. Instead, clone the rp in the closure, before the async block:

let make_svc = make_service_fn(move |_conn| {
    let rp = rp.clone();
    async move {
        Ok::<_, ReverseProxyError>(service_fn(|req| {
            rp.handle(req)
        }))
    }
});

Woohoo! One clone down. This code still doesn’t compile, but we’re closer. The next change to make is simple: stick a move on the inner closure:

let make_svc = make_service_fn(move |_conn| {
    let rp = rp.clone();
    async move {
        Ok::<_, ReverseProxyError>(service_fn(move |req| {
            rp.handle(req)
        }))
    }
});

This also fails, but going back to our description before, it’s easy to see why. We still need a second clone, to make sure we aren’t moving the ReverseProxy value out of the closure. Making that change is easy, but unfortunately doesn’t fully solve our problem. This code:

let make_svc = make_service_fn(move |_conn| {
    let rp = rp.clone();
    async move {
        Ok::<_, ReverseProxyError>(service_fn(move |req| {
            let rp = rp.clone();
            rp.handle(req)
        }))
    }
});

Still gives us the error message:

error[E0515]: cannot return value referencing local variable `rp`
  --> srcmain.rs:93:17
   |
93 |                 rp.handle(req)
   |                 --^^^^^^^^^^^^
   |                 |
   |                 returns a value referencing data owned by the current function
   |                 `rp` is borrowed here

What’s going on here?

Did your Future borrow my reference?

Again, referring to the introduction, I mentioned that the service_fn parameter had to return a Future<Output...>. This is an example of the impl Trait approach. I’ve previously blogged about ownership and impl trait. There are some pain points around this combination. And we’ve hit one of them.

The return type of our handle method doesn’t indicate what underlying type is implementing Future. That underlying implementation may choose to hold onto references passed into the handle method. That would include references to &self. And that means if we return that Future outside of our closure, a reference may outlive the value.

I can think of two ways to solve this problem, though there are probably more. The first one I’ll show you isn’t the one I prefer, but is the one that likely gets the idea across more clearly. Our handle method is taking a reference to ReverseProxy. But if it didn’t take a reference, and instead received the ReverseProxy by move, there would be no references to accidentally end up in the Future.

Cloning the ReverseProxy itself is expensive. Fortunately, we have another option: pass in the Arc<ReverseProxy>!

impl ReverseProxy {
    async fn handle(self: std::sync::Arc<Self>, mut req: Request<Body>) -> Result<Response<Body>, ReverseProxyError> {
        ...
    }
}

Without changing any code inside the handle method or the main function, this compiles and behaves correctly. But like I said: I don’t like it very much. This is limiting the generality of our handle method. It feels like putting the complexity in the wrong place. (Maybe you’ll disagree and say that this is the better solution. That’s fine, I’d be really interested to hear people’s thoughts.)

Instead, another possibility is to introduce an async move inside main. This will take ownership of the Arc<ReverseProxy>, and ensure that it lives as long as the Future generated by that async move block itself. This solution looks like this:

let make_svc = make_service_fn(move |_conn| {
    let rp = rp.clone();
    async move {
        Ok::<_, ReverseProxyError>(service_fn(move |req| {
            let rp = rp.clone();
            async move { rp.handle(req).await }
        }))
    }
});

We need to call .await inside the async block to ensure we don’t return a future-of-a-future. But with that change, everything works. I’m not terribly thrilled with this. It feels like an ugly hack. I don’t have any recommendations, but I hope there are improvements to the impl Trait ownership story in the future.

One final improvement

One final tweak. We put async move after the first rp.clone() originally. This helped make the error messages more tractable. But it turns out that that move isn’t doing anything useful. The move on the inner closure already forces a move of the cloned rp. So we can simplify our code by removing just one move:

let make_svc = make_service_fn(move |_conn| {
    let rp = rp.clone();
    async {
        Ok::<_, ReverseProxyError>(service_fn(move |req| {
            let rp = rp.clone();
            async move { rp.handle(req).await }
        }))
    }
});

This final version of the code is available as a Gist too.

Conclusion

I hope this was a fun trip down ownership lane. If this seemed overly complicated, keep in mind a few things:

If you enjoyed this, you may want to check out our Rust Crash Course, or sign up for our free December Rust training course.

Subscribe to our blog via email

Email subscriptions come from our Atom feed and are handled by Blogtrottr. You will only receive notifications of blog posts, and can unsubscribe any time.

Tagged