This is a debugging story told completely out of order. In order to understand the ultimate bug, why it seemed to occur arbitrarily, and the ultimate resolution, there's lots of backstory to cover. If you're already deeply familiar with the inner workings of the monad-control package, you can probably look at a demonstration of the bad instance and move on. Otherwise, prepare for a fun ride!

As usual, if you want to play along, we're going to be using Stack's script interpreter feature. Just save the snippet's contents to a file and run with `stack filename.hs`. (It works with any snippet that begins with `#!/usr/bin/env stack`.)

Oh, and also: the confusion that this blog post demonstrates is one of the reasons why I strongly recommend sticking to a `ReaderT env IO` monad transformer stack.

## Trying in StateT

Let's start with some broken code (my favorite kind). It uses the `StateT` transformer and a function which may throw a runtime exception.

```haskell
#!/usr/bin/env stack
-- stack --resolver lts-8.12 script
import Control.Monad.State.Strict
import Control.Exception
import Data.Typeable

data OddException = OddException !Int -- great name :)
  deriving (Show, Typeable)
instance Exception OddException

mayThrow :: StateT Int IO Int
mayThrow = do
  x <- get
  if odd x
    then lift $ throwIO $ OddException x
    else do
      put $! x + 1
      return $ x `div` 2

main :: IO ()
main = runStateT (replicateM 2 mayThrow) 0 >>= print
```

Our problem is that we'd like to be able to recover from a thrown exception. Easy enough we think, we'll just use `Control.Exception.try` to attempt to run the `mayThrow` action. Unfortunately, if I wrap up `mayThrow` with a `try`, I get this highly informative error message:

```
Main.hs:21:19: error:
• Couldn't match type ‘IO’ with ‘StateT Integer IO’
Expected type: StateT Integer IO ()
Actual type: IO ()
• In the first argument of ‘runStateT’, namely
‘(replicateM 2 (try mayThrow))’
In the first argument of ‘(>>=)’, namely
‘runStateT (replicateM 2 (try mayThrow)) 0’
In the expression:
runStateT (replicateM 2 (try mayThrow)) 0 >>= print
```

Oh, that makes sense: `try` is specialized to `IO`, and our function is `StateT Int IO`. Our first instinct is probably to keep throwing `lift` calls into our program until it compiles, since `lift` seems to always fix monad transformer compilation errors. However, try as you might, you'll never succeed. To understand why, let's look at the (slightly specialized) type signature for `try`:

```haskell
try :: IO a -> IO (Either OddException a)
```

If I apply `lift` to this, I could end up with:

```haskell
try :: IO a -> StateT Int IO (Either OddException a)
```

But there's no way to use `lift` to modify the type of the `IO a` input. This is generally the case with the `lift` and `liftIO` functions: they can deal with monad values that are the *output* of a function, but not the *input* to the function. (More precisely: the functions are covariant and work on values in positive positions. We'd need something contravariant to work on values in negative positions. You can read more on this nomenclature in another blog post.)
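To make the output-versus-input point concrete, here's a minimal sketch. The name `liftedTry` is made up for this illustration, and I'm importing from `transformers` (`Control.Monad.Trans.State.Strict`) rather than mtl just to keep the snippet's dependencies small; that's an assumption, not something the rest of the post relies on.

```haskell
import Control.Exception
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict

data OddException = OddException !Int
  deriving Show
instance Exception OddException

-- lift only changes where the *result* lives: IO becomes StateT Int IO.
-- (liftedTry is a hypothetical helper, not a library function.)
liftedTry :: IO a -> StateT Int IO (Either OddException a)
liftedTry = lift . try

-- The argument still has to be a plain IO action, so this type checks:
okExample :: StateT Int IO (Either OddException Int)
okExample = liftedTry (throwIO (OddException 1))

-- Passing a StateT action as the argument is rejected by the type checker:
-- notOk = liftedTry (get :: StateT Int IO Int)
```

Running `runStateT okExample 5` catches the exception and leaves the state alone, but there's no amount of `lift` that will let `mayThrow` itself be the argument.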

Huh, I guess we're stuck. But then I remember that `StateT` is just defined as `newtype StateT s m a = StateT { runStateT :: s -> m (a,s)}`. So maybe I can write a version of `try` that works for a `StateT` using the internals of the type.

```haskell
tryStateT :: StateT Int IO a -> StateT Int IO (Either OddException a)
tryStateT (StateT f) = StateT $ \s0 -> do
  eres <- try (f s0)
  return $ case eres of
    Left e -> (Left e, s0)
    Right (a, s1) -> (Right a, s1)
```

Go ahead and plug that into our previous example, and you should get the desired output:

`([Right 0,Left (OddException 1)],1)`

Let's break down in nauseating detail what that `tryStateT` function did:

- Unwrap the `StateT` data constructor from the provided action to get a function `f :: Int -> IO (a, Int)`.
- Construct a new `StateT` value on the right hand side by using the `StateT` data constructor, and capturing the initial state in the value `s0 :: Int`.
- Pass `s0` to `f` to get an action of type `IO (a, Int)`, which will give the result and the new, updated state.
- Wrap `f s0` with `try` to allow us to detect and recover from a runtime exception. `eres` has type `Either OddException (a, Int)`, and we pattern match on it.
  - If we receive a `Right`/success value, we simply wrap up the `a` value in a `Right` constructor together with the updated state.
  - If we receive a `Left`/exception value, we wrap up the exception with a `Left`. However, we need to return *some* new state. Since we have no such state available to us from the action, we return the only thing we can: the initial `s0` state value.

**Lesson learned** We can use `try` in a `StateT` with some difficulty, but we need to be aware of what happens to our monadic state.
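If you want to see the state getting rolled back, here's a small sketch. `bumpThenThrow` and `rollbackDemo` are names I've made up for the demonstration, and the imports come from `transformers` instead of mtl (an assumption made only to keep the snippet light).

```haskell
import Control.Exception
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict

data OddException = OddException !Int
  deriving Show
instance Exception OddException

-- Same definition as in the text above.
tryStateT :: StateT Int IO a -> StateT Int IO (Either OddException a)
tryStateT (StateT f) = StateT $ \s0 -> do
  eres <- try (f s0)
  return $ case eres of
    Left e -> (Left e, s0)
    Right (a, s1) -> (Right a, s1)

-- Hypothetical action: writes 10 into the state, then throws.
bumpThenThrow :: StateT Int IO ()
bumpThenThrow = do
  put 10
  lift $ throwIO $ OddException 10

-- Starting from state 0, the final state is 0 again, not 10:
-- the write happened before the exception, but only s0 survives it.
rollbackDemo :: IO (Either OddException (), Int)
rollbackDemo = runStateT (tryStateT bumpThenThrow) 0
```

So a `put` that happened before the exception quietly vanishes. That's not a bug in `tryStateT`; there simply isn't any other state value to hand back.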

## Catching in StateT

It turns out that it's trivial to implement the `try` function in terms of `catch`, and the `catch` function in terms of `try`, at least when sticking to the `IO`-specialized versions:

```haskell
try' :: Exception e => IO a -> IO (Either e a)
try' action = (Right <$> action) `catch` (return . Left)

catch' :: Exception e => IO a -> (e -> IO a) -> IO a
catch' action onExc = do
  eres <- try action
  case eres of
    Left e -> onExc e
    Right a -> return a
```

It turns out that by just changing the type signatures and replacing `try` with `tryStateT`, we can do the same thing for `StateT`:

```haskell
catchStateT :: Exception e
            => StateT Int IO a
            -> (e -> StateT Int IO a)
            -> StateT Int IO a
catchStateT action onExc = do
  eres <- tryStateT action
  case eres of
    Left e -> onExc e
    Right a -> return a
```

**NOTE** Pay close attention to that type signature, and think about how monadic state is being shuttled through this function.

Well, if we can implement `catchStateT` in terms of `tryStateT`, surely we can implement it directly as well. Let's do the most straightforward thing I can think of (or at least the thing that continues my narrative here):

```haskell
catchStateT :: Exception e
            => StateT Int IO a
            -> (e -> IO a)
            -> StateT Int IO a
catchStateT (StateT action) onExc = StateT $ \s0 ->
  action s0 `catch` \e -> do
    a <- onExc e
    return (a, s0)
```

Here, we're basing our implementation on top of the `catch` function instead of the `try` function. We do the same unwrap-the-StateT, capture-the-s0 trick we did before. Now, in the lambda we've created for the `catch` call, we pass the `e` exception value to the user-supplied `onExc` function, and then like `tryStateT` wrap up the result in a tuple with the initial `s0`.

Who noticed the difference in type signature? Instead of `e -> StateT Int IO a`, our `onExc` handler has type `e -> IO a`. I told you to pay attention to how the monadic states were being shuttled around; let's analyze it:

- In the first function, we use `tryStateT`, which as we mentioned will reconstitute the original `s0` state when it returns. If the action succeeded, nothing else happens. But in the exception case, that original `s0` is now passed into the `onExc` function, and the final monadic state returned will be the result of the `onExc` function.
- In the second function, we never give the `onExc` function a chance to play with monadic state, since it just lives in `IO`. So we always return the original state at the end if an exception occurred.

Which behavior is best? I think most people would argue that the first function is better: it's more general in allowing `onExc` to access and modify the monadic state, and there's not really any chance for confusion. Fair enough, I'll buy that argument (that I just made on behalf of all of my readers).

**Bonus exercise** Modify this implementation of `catchStateT` to have the same type signature as the original one.
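Spoiler warning: here's one possible answer to the exercise, as a sketch rather than the canonical solution. The idea is to run the handler as a state function too, seeding it with the captured `s0`. The `catchDemo` helper and its local names are mine, and the imports come from `transformers` rather than mtl (an assumption for the sake of a small snippet).

```haskell
import Control.Exception
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict

-- Direct implementation on top of catch, with the handler living in StateT.
-- The handler starts from s0, and whatever state it produces is kept.
catchStateT :: Exception e
            => StateT Int IO a
            -> (e -> StateT Int IO a)
            -> StateT Int IO a
catchStateT (StateT action) onExc = StateT $ \s0 ->
  action s0 `catch` \e -> runStateT (onExc e) s0

-- Hypothetical demo: the action's write of 10 is lost, while the
-- handler's (+ 1) is applied to the initial state 0.
catchDemo :: IO ((), Int)
catchDemo = runStateT (catchStateT failing handler) 0
  where
    failing :: StateT Int IO ()
    failing = put 10 >> lift (throwIO DivideByZero)

    handler :: ArithException -> StateT Int IO ()
    handler _ = modify (+ 1)
```

This behaves like the `tryStateT`-based version: the handler sees the original state, and its updates win.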

## Finally

This is fun, let's keep reimplementing functions from `Control.Exception`! This time, let's do `finally`, which will ensure that some action (usually a cleanup action) is run after an initial action, regardless of whether an exception was thrown.

```haskell
finallyStateT :: StateT Int IO a -> IO b -> StateT Int IO a
finallyStateT (StateT action) cleanup = StateT $ \s0 ->
  action s0 `finally` cleanup
```

That was really easy. Ehh, but one problem: look at that type signature! We just agreed (or I agreed for you) that in the case of `catch`, it was better to have the second argument *also* live in `StateT Int IO`. Here, our argument lives in `IO`. Let's fix that:

```haskell
finallyStateT :: StateT Int IO a -> StateT Int IO b -> StateT Int IO a
finallyStateT (StateT action) (StateT cleanup) = StateT $ \s0 ->
  action s0 `finally` cleanup s0
```

Huh, also pretty simple. Let's analyze the monadic state behavior here: our cleanup action is given the initial state, regardless of the result of `action s0`. That means that, even if the action succeeded, we'll ignore the updated state. Furthermore, because `finally` ignores the result of the second argument, we will ignore any updated monadic state. Want to see what I mean? Try this out:

```haskell
#!/usr/bin/env stack
-- stack --resolver lts-8.12 script
import Control.Exception
import Control.Monad.State.Strict

finallyStateT :: StateT Int IO a -> StateT Int IO b -> StateT Int IO a
finallyStateT (StateT action) (StateT cleanup) = StateT $ \s0 ->
  action s0 `finally` cleanup s0

action :: StateT Int IO ()
action = modify (+ 1)

cleanup :: StateT Int IO ()
cleanup = do
  get >>= lift . print
  modify (+ 2)

main :: IO ()
main = execStateT (action `finallyStateT` cleanup) 0 >>= print
```

You may expect the output of this to be the numbers 1 and 3, but in fact the output is 0 and 1: `cleanup` looks at the initial state value of 0, and its `+ 2` modification is thrown away. So can we implement a version of our function that keeps the state? Sure (slightly simplified to avoid async exception/mask noise):

```haskell
finallyStateT :: StateT Int IO a -> StateT Int IO b -> StateT Int IO a
finallyStateT (StateT action) (StateT cleanup) = StateT $ \s0 -> do
  (a, s1) <- action s0 `onException` cleanup s0
  (_b, s2) <- cleanup s1
  return (a, s2)
```

This has the expected output of 1 and 3. Looking at how it works: we follow our same tricks, and pass in `s0` to `action`. If an exception is thrown there, we once again pass in `s0` to `cleanup` and ignore its updated state (since we have no choice). However, in the success case, we now pass in the *updated* state (`s1`) to `cleanup`. And finally, our resulting state is the result of `cleanup` (`s2`) instead of the `s1` produced by `action`.
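For the curious, here's roughly what the state-keeping version might look like with the mask noise put back in. Treat it as a sketch: `finallyStateTMasked` is a name I've invented, the imports come from `transformers` rather than mtl, and the structure follows the usual `mask`/`restore`/`onException` shape that, as far as I know, `Control.Exception.finally` itself uses.

```haskell
import Control.Exception
import Control.Monad.Trans.State.Strict

-- State-keeping finally, with async exceptions masked around the bookkeeping.
-- Only the main action runs with exceptions restored; cleanup runs masked.
finallyStateTMasked :: StateT Int IO a -> StateT Int IO b -> StateT Int IO a
finallyStateTMasked (StateT action) (StateT cleanup) = StateT $ \s0 ->
  mask $ \restore -> do
    -- On an exception we only have s0 to give to cleanup.
    (a, s1) <- restore (action s0) `onException` cleanup s0
    -- On success, cleanup sees the updated state and its result state is kept.
    (_b, s2) <- cleanup s1
    return (a, s2)
```

With this definition, `execStateT (modify (+ 1) `finallyStateTMasked` modify (+ 2)) 0` ends in state 3, the same as the simplified version.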

We have three different implementations of `finallyStateT` and two different type signatures. Let's compare them:

- The first one (the `IO` version) has the advantage that its type tells us *exactly* what's happening: the cleanup has no access to the state at all. However, you can argue like we did with `catchStateT` that this is limiting and not what people would expect the type signature to be.
- The second one (use the initial state for `cleanup` and then throw away its modified state) has the advantage that it's logically consistent: whether `cleanup` is called from a success or exception code path, it does the exact same thing. On the other hand, you can argue that it is surprising behavior that state updates that *can* be preserved are being thrown away.
- The third one (keep the state) has the reversed arguments of the second one.

So unlike `catchStateT`, I would argue that there's not nearly as clear a winner with `finallyStateT`. Each approach has its relative merits.

One final point that seems almost not worth mentioning (hint: epic foreshadowment incoming). The first version (`IO` specialized) has an additional benefit of being ever-so-slightly more efficient than the other two, since it doesn't need to deal with the additional monadic state in `cleanup`. With a simple monad transformer like `StateT` this performance difference is hardly even worth thinking about. However, if we were in a tight inner loop, and our monad stack was significantly more complicated, you could imagine a case where the performance difference was significant.

## Implementing for other transformers

It's great that we understand `StateT` so well, but can we do anything for other transformers? It turns out that, yes, we can for many transformers. (An exception is continuation-based transformers, which you can read a bit about in passing in my ResourceT blog post from last week.) Let's look at a few other examples of `finally`:

```haskell
import Control.Exception
import Control.Monad.Writer
import Control.Monad.Reader
import Control.Monad.Except
import Data.Monoid

finallyWriterT :: Monoid w
               => WriterT w IO a
               -> WriterT w IO b
               -> WriterT w IO a
finallyWriterT (WriterT action) (WriterT cleanup) = WriterT $ do
  (a, w1) <- action `onException` cleanup
  (_b, w2) <- cleanup
  return (a, w1 <> w2)

finallyReaderT :: ReaderT r IO a
               -> ReaderT r IO b
               -> ReaderT r IO a
finallyReaderT (ReaderT action) (ReaderT cleanup) = ReaderT $ \r -> do
  a <- action r `onException` cleanup r
  _b <- cleanup r
  return a

finallyExceptT :: ExceptT e IO a
               -> ExceptT e IO b
               -> ExceptT e IO a
finallyExceptT (ExceptT action) (ExceptT cleanup) = ExceptT $ do
  ea <- action `onException` cleanup
  eb <- cleanup
  return $ case (ea, eb) of
    (Left e, _) -> Left e
    (Right _a, Left e) -> Left e
    (Right a, Right _b) -> Right a
```

The `WriterT` case is very similar to the `StateT` case, except (1) there's no initial state `s0` to contend with, and (2) instead of receiving an updated `s2` state from `cleanup`, we need to monoidally combine the `w1` and `w2` values. The `ReaderT` case is *also* very similar to `StateT`, but in the opposite way: we receive an immutable environment `r` which is passed into all functions, but there is no updated state. To put this in other words: `WriterT` has no *context* but has *mutable monadic state*, whereas `ReaderT` has a context but no mutable monadic state. `StateT`, by contrast, has both. (This is important to understand, so reread it a few times to get comfortable with the concept.)

The `ExceptT` case is interesting: it has no context (like `WriterT`), but it *does* have mutable monadic state, just not like `StateT` and `WriterT`. Instead of returning an extra value with each result (as a product), `ExceptT` returns either a result value or an `e` value (as a sum). The `case` expression at the end of `finallyExceptT` is very informative: we need to figure out how to combine the various monadic states together. Our implementation here says that if `action` returns `e`, we take that result. Otherwise, if `cleanup` fails, we take *that* value. And if they both return `Right` values, then we use `action`'s result. But there are at least two other valid choices:

- Prefer `cleanup`'s `e` value to `action`'s `e` value, if both are available.
- Completely ignore the `e` value returned by `cleanup`, and just use `action`'s result.
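Those two alternatives could be sketched like this. The function names are hypothetical, and I'm importing `ExceptT` from `transformers` (`Control.Monad.Trans.Except`) instead of mtl purely to keep the snippet small.

```haskell
import Control.Exception (onException)
import Control.Monad.Trans.Except

-- Alternative 1: when cleanup fails, its e value wins,
-- even if action also produced an e value.
finallyPreferCleanup :: ExceptT e IO a -> ExceptT e IO b -> ExceptT e IO a
finallyPreferCleanup (ExceptT action) (ExceptT cleanup) = ExceptT $ do
  ea <- action `onException` cleanup
  eb <- cleanup
  return $ case (ea, eb) of
    (_, Left e) -> Left e
    (Left e, Right _b) -> Left e
    (Right a, Right _b) -> Right a

-- Alternative 2: cleanup always runs, but whatever it returns is ignored.
finallyIgnoreCleanup :: ExceptT e IO a -> ExceptT e IO b -> ExceptT e IO a
finallyIgnoreCleanup (ExceptT action) (ExceptT cleanup) = ExceptT $ do
  ea <- action `onException` cleanup
  _eb <- cleanup
  return ea
```

In both sketches `cleanup` runs on every code path; the only thing that varies is whose `e` value the caller gets to see.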

There's also a fourth, invalid option: if `action` returns a `Left`, return that immediately and don't call `cleanup`. This has been a perennial source of bugs in many libraries dealing with exceptions in monad transformers like `ErrorT`, `ExceptT`, and `EitherT`. This invalidates the contract of `finally`, namely that `cleanup` will always be run. I've seen some arguments for why this can make sense, but I consider it nothing more than a buggy implementation.

And finally, like with `StateT`, we could avoid all of these questions for `ExceptT` if we just modify our type signature to use `IO b` for `cleanup`:

```haskell
finallyExceptT :: ExceptT e IO a -> IO b -> ExceptT e IO a
finallyExceptT (ExceptT action) cleanup = ExceptT $ do
  ea <- action `onException` cleanup
  _b <- cleanup
  return ea
```

So our takeaway: we can implement `finally` for various monad transformers. In some cases this leads to questions of semantics, just like with `StateT`. And all of these transformers fall into a pattern of optionally capturing some initial context, and optionally shuttling around some monadic state.

(And no, I haven't forgotten that the title of this blog post talks about `bracket`. We're getting there, ever so slowly. I hope I've piqued your curiosity.)

## Generalizing the pattern

It's wonderful that we can implement all of these functions that take monad transformers as arguments. But do any of us actually want to go off and implement `catch`, `try`, `finally`, `forkIO`, `timeout`, and a dozen other functions for every possible monad transformer stack imaginable? I doubt it. So just as we have `MonadTrans` and `MonadIO` for dealing with transformers in output/positive position, we can construct some kind of typeclass that handles the two concepts we mentioned above: capture the context, and deal with the monadic state.

Let's start by playing with this for just `StateT`.

```haskell
#!/usr/bin/env stack
-- stack --resolver lts-8.12 script
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception
import Control.Monad.State.Strict

type Run s = forall b. StateT s IO b -> IO (b, s)

capture :: forall s a.
           (Run s -> IO a)
        -> StateT s IO a
capture withRun = StateT $ \s0 -> do
  let run :: Run s
      run (StateT f) = f s0
  a <- withRun run
  return (a, s0)

restoreState :: (a, s) -> StateT s IO a
restoreState stateAndResult = StateT $ \_s0 -> return stateAndResult

finally1 :: StateT s IO a -> IO b -> StateT s IO a
finally1 action cleanup = do
  x <- capture $ \run -> run action `finally` cleanup
  restoreState x

finally2 :: StateT s IO a -> StateT s IO b -> StateT s IO a
finally2 action cleanup = do
  x <- capture $ \run -> run action `finally` run cleanup
  restoreState x

-- Not async exception safe!
finally3 :: StateT s IO a -> StateT s IO b -> StateT s IO a
finally3 action cleanup = do
  x <- capture $ \run -> run action `onException` run cleanup
  a <- restoreState x
  _b <- cleanup
  return a

main :: IO ()
main = do
  flip evalStateT () $ lift (putStrLn "here1") `finally1` putStrLn "here2"
  flip evalStateT () $ lift (putStrLn "here3") `finally2` lift (putStrLn "here4")
  flip evalStateT () $ lift (putStrLn "here5") `finally3` lift (putStrLn "here6")
```

That's a lot, let's step through it slowly:

```haskell
type Run s = forall b. StateT s IO b -> IO (b, s)
```

This is a helper type to make the following bit simpler. It represents the concept of capturing the initial state in a general manner. Given an action living in our transformer, it returns an action in our base monad, returning the entire monadic state with the return value (i.e., `(b, s)` instead of just `b`). This allows us to define our `capture` function:

```haskell
capture :: forall s a.
           (Run s -> IO a)
        -> StateT s IO a
capture withRun = StateT $ \s0 -> do
  let run :: Run s
      run (StateT f) = f s0
  a <- withRun run
  return (a, s0)
```

This function says "you give me some function that needs to be able to run monadic actions with the initial context, and I'll give it that initial context running function (`Run s`)." The implementation isn't too bad: we just capture the `s0`, create a `run` function out of it, pass that into the user-provided argument, and then return the result with the original state.

Now we need some way to update the monadic state based on a result value. We call it `restoreState`:

```haskell
restoreState :: (a, s) -> StateT s IO a
restoreState stateAndResult = StateT $ \_s0 -> return stateAndResult
```

Pretty simple too: we ignore our original monadic state and replace it with the state contained in the argument. Next we use these two functions to implement three versions of `finally`. The first two are able to reuse the `finally` from `Control.Exception`. However, both of them suffer from the inability to retain monadic state. Our third implementation fixes that, at the cost of having to reimplement the logic of `finally`. And as my comment there mentions, our implementation is not in fact async exception safe.

So all of our original trade-offs apply from our initial `StateT` discussion, but now there's an additional downside to option 3: it's significantly more complicated to implement correctly.

## The MonadIOControl type class

Alright, we've established that it's possible to capture this idea for `StateT`. Let's generalize to a typeclass. We'll need three components:

- A capture function. We'll call it `liftIOWith`, to match nomenclature in monad-control.
- A restore function, which we'll call `restoreM`.
- An *associated type* (type family) to represent what the monadic state for the given monad stack is.

We end up with:

```haskell
type RunInIO m = forall b. m b -> IO (StM m b)

class MonadIO m => MonadIOControl m where
  type StM m a

  liftIOWith :: (RunInIO m -> IO a) -> m a
  restoreM :: StM m a -> m a
```

Let's write an instance for `IO`:

```haskell
instance MonadIOControl IO where
  type StM IO a = a

  liftIOWith withRun = withRun id
  restoreM = return
```

The `type StM IO a = a` says that, for an `IO` action returning `a`, the full monadic state is just `a`. In other words, there is no additional monadic state hanging around. That's good, as we know that there isn't. `liftIOWith` is able to just use `id` as the `RunInIO` function, since you can run an `IO` action in `IO` directly. And finally, since there is no monadic state to update, `restoreM` just wraps up the result value in `IO` via `return`. (More foreshadowment: what this instance is supposed to look like is actually at the core of the bug this blog post will eventually talk about.)

Alright, let's implement this instance for `StateT s IO`:

```haskell
instance MonadIOControl (StateT s IO) where
  type StM (StateT s IO) a = (a, s)

  liftIOWith withRun = StateT $ \s0 -> do
    a <- withRun $ \(StateT f) -> f s0
    return (a, s0)
  restoreM stateAndResult = StateT $ \_s0 -> return stateAndResult
```

This is basically identical to the functions we defined above, so I won't dwell on it here. But here's an interesting observation: the same way we define the `MonadIO` instance as `instance MonadIO m => MonadIO (StateT s m)`, it would be great to do the same thing for `MonadIOControl`. And, in fact, we can do just that!

```haskell
instance MonadIOControl m => MonadIOControl (StateT s m) where
  type StM (StateT s m) a = StM m (a, s)

  liftIOWith withRun = StateT $ \s0 -> do
    a <- liftIOWith $ \run -> withRun $ \(StateT f) -> run $ f s0
    return (a, s0)
  restoreM x = StateT $ \_s0 -> restoreM x
```

We use the underlying monad's `liftIOWith` and `restoreM` functions within our own definitions, and thereby get context and state passed up and down the stack as needed. Alright, let's go ahead and do this for all of the transformers we've been discussing:

```haskell
#!/usr/bin/env stack
-- stack --resolver lts-8.12 script
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE UndecidableInstances #-}
import Control.Exception
import Control.Monad.State.Strict
import Control.Monad.Writer
import Control.Monad.Reader
import Control.Monad.Except
import Data.Monoid
import Data.IORef

type RunInIO m = forall b. m b -> IO (StM m b)

class MonadIO m => MonadIOControl m where
  type StM m a

  liftIOWith :: (RunInIO m -> IO a) -> m a
  restoreM :: StM m a -> m a

instance MonadIOControl IO where
  type StM IO a = a

  liftIOWith withRun = withRun id
  restoreM = return

instance MonadIOControl m => MonadIOControl (StateT s m) where
  type StM (StateT s m) a = StM m (a, s)

  liftIOWith withRun = StateT $ \s0 -> do
    a <- liftIOWith $ \run -> withRun $ \(StateT f) -> run $ f s0
    return (a, s0)
  restoreM x = StateT $ \_s0 -> restoreM x

instance (MonadIOControl m, Monoid w) => MonadIOControl (WriterT w m) where
  type StM (WriterT w m) a = StM m (a, w)

  liftIOWith withRun = WriterT $ do
    a <- liftIOWith $ \run -> withRun $ \(WriterT f) -> run f
    return (a, mempty)
  restoreM x = WriterT $ restoreM x

instance MonadIOControl m => MonadIOControl (ReaderT r m) where
  type StM (ReaderT r m) a = StM m a

  liftIOWith withRun = ReaderT $ \r ->
    liftIOWith $ \run -> withRun $ \(ReaderT f) -> run $ f r
  restoreM x = ReaderT $ \r -> restoreM x

instance MonadIOControl m => MonadIOControl (ExceptT e m) where
  type StM (ExceptT e m) a = StM m (Either e a)

  liftIOWith withRun = ExceptT $ do
    a <- liftIOWith $ \run -> withRun $ \(ExceptT f) -> run f
    return $ Right a
  restoreM x = ExceptT $ restoreM x

control :: MonadIOControl m => (RunInIO m -> IO (StM m a)) -> m a
control f = do
  x <- liftIOWith f
  restoreM x

checkControl :: MonadIOControl m => m ()
checkControl = control $ \run -> do
  ref <- newIORef (0 :: Int)
  let ensureIs :: MonadIO m => Int -> m ()
      ensureIs expected = liftIO $ do
        putStrLn $ "ensureIs " ++ show expected
        curr <- atomicModifyIORef ref $ \curr -> (curr + 1, curr)
        unless (curr == expected) $
          error $ show ("curr /= expected", curr, expected)
  ensureIs 0
  Control.Exception.mask $ \restore -> do
    ensureIs 1
    res <- restore (ensureIs 2 >> run (ensureIs 3) `finally` ensureIs 4)
    ensureIs 5
    return res

main :: IO ()
main = do
  checkControl
  runStateT checkControl () >>= print
  runWriterT checkControl >>= (print :: ((), ()) -> IO ())
  runReaderT checkControl ()
  runExceptT checkControl >>= (print :: Either () () -> IO ())
```

I encourage you to inspect each of the instances above and make sure you're comfortable with their implementation. I've added a function here, `checkControl`, as a basic sanity check of our implementation. We start with the `control` helper function, which runs some action with a `RunInIO` argument, and then restores the monadic state. Then we use this function in `checkControl` to ensure that a series of actions are all run in the correct order. As you can see, all of our test monads pass (again, *foreshadowment*).
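To see what `control` buys us, here's a sketch of the `IO`-cleanup flavor of `finally` written once for every `MonadIOControl` instance. The class and instances are copied from above (trimmed to `IO` and `StateT` to keep the snippet short); `finallyIO` and `finallyDemo` are hypothetical names of mine, and I'm importing `StateT` from `transformers` rather than mtl as a simplifying assumption.

```haskell
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE UndecidableInstances #-}
import Control.Exception (finally)
import Control.Monad.IO.Class (MonadIO)
import Control.Monad.Trans.State.Strict
import Data.IORef

type RunInIO m = forall b. m b -> IO (StM m b)

class MonadIO m => MonadIOControl m where
  type StM m a
  liftIOWith :: (RunInIO m -> IO a) -> m a
  restoreM :: StM m a -> m a

instance MonadIOControl IO where
  type StM IO a = a
  liftIOWith withRun = withRun id
  restoreM = return

instance MonadIOControl m => MonadIOControl (StateT s m) where
  type StM (StateT s m) a = StM m (a, s)
  liftIOWith withRun = StateT $ \s0 -> do
    a <- liftIOWith $ \run -> withRun $ \(StateT f) -> run $ f s0
    return (a, s0)
  restoreM x = StateT $ \_s0 -> restoreM x

control :: MonadIOControl m => (RunInIO m -> IO (StM m a)) -> m a
control f = do
  x <- liftIOWith f
  restoreM x

-- finally with an IO-only cleanup: the action's monadic state is captured
-- by run and restored by control, and cleanup never touches it.
finallyIO :: MonadIOControl m => m a -> IO b -> m a
finallyIO action cleanup = control $ \run -> run action `finally` cleanup

-- Hypothetical demo: the action's state update survives, and cleanup ran.
finallyDemo :: IO (((), Int), Bool)
finallyDemo = do
  ref <- newIORef False
  res <- runStateT (modify (+ 1) `finallyIO` writeIORef ref True) 0
  cleaned <- readIORef ref
  return (res, cleaned)
```

Because the cleanup lives in plain `IO`, there's no question about which state it sees or whether its updates are kept: it has none.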

The real monad-control package looks pretty similar to this, except:

- Instead of `MonadIOControl`, which is hard-coded to using `IO` as a base monad, it provides a `MonadBaseControl` typeclass, which allows arbitrary base monads (like `ST` or `STM`).
- Just as `MonadBaseControl` is an analogue of `MonadIO`, the package provides `MonadTransControl` as an analogue of `MonadTrans`, allowing you to unwrap one layer in a monad stack.

With all of this exposition out of the way—likely the longest exposition I've ever written in any blog post—we can start dealing with the actual bug. I'll show you the full context eventually, but I was asked to help debug a function that looked something like this:

```haskell
fileLen1 :: (MonadThrow m, MonadBaseControl IO m, MonadIO m)
         => FilePath
         -> m Int
fileLen1 fp = runResourceT $ runConduit $ sourceFile fp .| lengthCE
```

This is fairly common in Conduit code. We're going to use `sourceFile`, which needs to allocate some resources. Since we can't safely allocate resources from within a Conduit pipeline, we start off with `runResourceT` to allow Conduit to register cleanup actions. (This combination is so common that we have a helper function `runConduitRes = runResourceT . runConduit`.)

Unfortunately, this innocuous-looking line of code was generating an error message:

`Control.Monad.Trans.Resource.register': The mutable state is being accessed after cleanup. Please contact the maintainers.`

The "Please contact the maintainers." line should probably be removed from the resourcet package; it was from back in a time when we thought this bug was most likely to indicate an implementation bug within resourcet. That's no longer the case... which hopefully this debugging adventure will help demonstrate.

Anyway, as last week's blog post on ResourceT explained, `runResourceT` creates a mutable variable to hold a list of cleanup actions, allows the inner action to register cleanup values into that mutable variable, and then when `runResourceT` is exiting, it calls all those cleanup actions. And as a last sanity check, it replaces the value inside that mutable variable with a special value indicating that the state has already been closed, and it is therefore invalid to register further cleanup actions.

In well-behaved code, the structure of our `runResourceT` function should prevent the mutable state from being accessible after it's closed, though I mention some cases last week that could cause that to happen (specifically, misuse of concurrency and the `transPipe` function). However, after thoroughly exploring the codebase, I could find no indication that either of these common bugs had occurred.

Internally, `runResourceT` is essentially a `bracket` call, using the `createInternalState` function to allocate the mutable variable, and `closeInternalState` to clean it up. So I figured I could get a bit more information about this bug by using the `bracket` function from `Control.Exception.Lifted` and implementing:

```haskell
fileLen2 :: (MonadThrow m, MonadBaseControl IO m, MonadIO m)
         => FilePath
         -> m Int
fileLen2 fp = Lifted.bracket
  createInternalState
  closeInternalState
  $ runInternalState
  $ runConduit
  $ sourceFile fp .| lengthCE
```

Much to my chagrin, the bug disappeared! Suddenly the code worked perfectly. Beginning to question my sanity, I decided to look at the implementation of `runResourceT`, and found this:

```haskell
runResourceT :: MonadBaseControl IO m => ResourceT m a -> m a
runResourceT (ResourceT r) = control $ \run -> do
  istate <- createInternalState
  E.mask $ \restore -> do
    res <- restore (run (r istate)) `E.onException`
      stateCleanup ReleaseException istate
    stateCleanup ReleaseNormal istate
    return res
```

Ignoring the fact that we differentiate between exception and normal cleanup in the `stateCleanup` function, I was struck by one question: why did I decide to implement this with `control` in a manual, error-prone way instead of using the `bracket` function directly? I began to worry that there was a bug in this implementation leading to all of the problems.

However, after reading through this implementation many times, I convinced myself that it was, in fact, correct. And then I realized why I had done it this way. Both `createInternalState` and `stateCleanup` are functions that can live in `IO` directly, without any need of a monad transformer state. The only function that needed the monad transformer logic was that contained in the `ResourceT` itself.

If you remember our discussion above, there were two major advantages of the implementation of `finally` which relied upon `IO` for the cleanup function instead of using the monad transformer state:

- It was much more explicit about how monadic state was going to be handled.
- It gave a slight performance advantage.

The downside was that the type signature wasn't quite what people normally expected. Well, that downside didn't apply in my case: I was working on an internal function in a library, so I was free to ignore what a user-friendly API would look like. The advantage of explicitness around monadic state certainly appealed in a library that was so sensitive to getting things right. And given how widely used this function is, and the deep monadic stacks it was sometimes used in, any performance advantage was worth pursuing.

Alright, I felt good about the fact that `runResourceT` was implemented correctly. Just to make sure I wasn't crazy, I reimplemented `fileLen` to use an explicit `control` instead of `Lifted.bracket`, and the bug reappeared:

```haskell
-- I'm ignoring async exception safety. This needs mask.
fileLen3 :: forall m.
            (MonadThrow m, MonadBaseControl IO m, MonadIO m)
         => FilePath
         -> m Int
fileLen3 fp = control $ \run -> do
  istate <- createInternalState
  res <- run (runInternalState inner istate)
    `onException` closeInternalState istate
  closeInternalState istate
  return res
  where
    inner :: ResourceT m Int
    inner = runConduit $ sourceFile fp .| lengthCE
```

And as one final sanity check, I implemented `fileLen4` to use the generalized style of `bracket`, where the allocation and cleanup functions live in the monad stack instead of just `IO`, and as expected the bug disappeared again. (Actually, I didn't really do this. I'm doing it now for the purpose of this blog post.)

```haskell
fileLen4 :: forall m.
            (MonadThrow m, MonadBaseControl IO m, MonadIO m)
         => FilePath
         -> m Int
fileLen4 fp = control $ \run -> bracket
  (run createInternalState)
  (\st -> run $ restoreM st >>= closeInternalState)
  (\st -> run $ restoreM st >>= runInternalState inner)
  where
    inner :: ResourceT m Int
    inner = runConduit $ sourceFile fp .| lengthCE
```

Whew, OK! So it turns out that my blog post title was correct: this *is* a tale of two brackets. And somehow, one of them triggers a bug, and one of them doesn't. But I still didn't know quite how that happened.

## The culprit

Another member of the team tracked down the ultimate problem to a datatype that looked like this (though not actually named `Bad`, that would have been too obvious):

```haskell
newtype Bad a = Bad { runBad :: IO a }
  deriving (Functor, Applicative, Monad, MonadIO, MonadThrow, MonadBase IO)

instance MonadBaseControl IO Bad where
  type StM Bad a = IO a

  liftBaseWith withRun = Bad $ withRun $ return . runBad
  restoreM = Bad
```

That's the kind of code that can easily pass a code review without anyone noticing a thing. With all of the context from this blog post, you may be able to understand why I've called this type `Bad`. Go ahead and give it a few moments to try and figure it out.

OK, ready to see how this plays out? The `StM Bad a` associated type is supposed to contain the result value of the underlying monad, together with any state introduced by this monad. Since we just have a newtype around `IO`, there should be no monadic state, and we should just have `a`. However, we've *actually* defined it as `IO a`, which means "my monadic state for a value `a` is an `IO` action which will return an `a`." The implementations of `liftBaseWith` and `restoreM` are simply in line with making the types work out.
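To make the difference between "a value" and "an action that will return a value" concrete, here is a minimal sketch using only `base`. The names `badRun`, `goodRun`, and `events` are hypothetical stand-ins invented for this illustration; they only mimic the shape of the `run` function each kind of instance would hand out, and are not part of monad-control.

```haskell
import Data.IORef

-- Hypothetical stand-in for `return . runBad`: hands the action back unexecuted.
badRun :: IO a -> IO (IO a)
badRun = return

-- Hypothetical stand-in for a state-free run function: executes the action now.
goodRun :: IO a -> IO a
goodRun = id

-- Record the order in which things actually happen.
events :: IO [String]
events = do
  ref <- newIORef []
  let say msg = modifyIORef ref (msg :)
  deferred <- badRun (say "bad action")   -- nothing is logged yet
  say "after badRun"
  deferred                                -- only now does the action execute
  goodRun (say "good action")             -- logged immediately
  say "after goodRun"
  reverse <$> readIORef ref

main :: IO ()
main = events >>= mapM_ putStrLn
```

With `badRun`, the "bad action" entry is only logged once the returned action is explicitly executed, which is after "after badRun". With `goodRun`, the action runs at the point where it is called.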

Let's look at `fileLen3` understanding that this is the instance in question. I'm also going to expand the `control` function to make it easier to see what's happening.

```haskell
res <- liftBaseWith $ \run -> do
  istate <- createInternalState
  res <- run (runInternalState inner istate)
    `onException` closeInternalState istate
  closeInternalState istate
  return res
restoreM res
```

If we play it a little loose with newtype wrappers, we can substitute in the implementations of `liftBaseWith` and `restoreM` to get:

```haskell
res <- Bad $ do
  let run = return . runBad
  istate <- createInternalState
  res <- run (runInternalState inner istate)
    `onException` closeInternalState istate
  closeInternalState istate
  return res
Bad res
```

Let's go ahead and substitute in our `run` function in the one place it's used:

```haskell
res <- Bad $ do
  istate <- createInternalState
  res <- return (runBad (runInternalState inner istate))
    `onException` closeInternalState istate
  closeInternalState istate
  return res
Bad res
```

If you look at the code ``return x `onException` foo``, it's pretty easy to establish that `return` itself will never throw an exception in `IO`, and therefore the `onException` is useless. In other words, the code is equivalent to just `return x`. So again substituting:

```haskell
res <- Bad $ do
  istate <- createInternalState
  res <- return (runBad (runInternalState inner istate))
  closeInternalState istate
  return res
Bad res
```
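As a quick sanity check of the claim that ``return x `onException` foo`` never runs its handler, here is a small sketch using only `base`. The names `returnNeverThrows`, `boom`, and `handlerRan` are made up for this illustration. It wraps an action that would throw in `return`, attaches a handler, and then checks whether the handler fired.

```haskell
import Control.Exception (ErrorCall (..), onException, throwIO, try)
import Data.IORef

-- Returns (did the handler run?, did the deferred action throw later?).
returnNeverThrows :: IO (Bool, Bool)
returnNeverThrows = do
  handlerRan <- newIORef False
  let boom = throwIO (ErrorCall "boom") :: IO Int
  -- `return boom` only wraps the action as a value; nothing is executed
  -- here, so there is nothing for `onException` to catch.
  deferred <- return boom `onException` writeIORef handlerRan True
  -- The exception only appears once the wrapped action is finally run,
  -- by which point the handler is no longer in scope.
  result <- try deferred :: IO (Either ErrorCall Int)
  ran <- readIORef handlerRan
  return (ran, either (const True) (const False) result)

main :: IO ()
main = returnNeverThrows >>= print
```

The handler flag stays `False` even though the wrapped action does throw when it is eventually executed, so this prints `(False,True)`.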

And since `foo <- return x` is just `let foo = x`, we can turn this into:

```haskell
res <- Bad $ do
  istate <- createInternalState
  closeInternalState istate
  return (runBad (runInternalState inner istate))
Bad res
```

And then:

```haskell
Bad $ do
  istate <- createInternalState
  closeInternalState istate
Bad (runBad (runInternalState inner istate))
```

And finally, just to drive the point home:

```haskell
istate <- Bad createInternalState
Bad $ closeInternalState istate
runInternalState inner istate
```

So who wants to take a guess why the mutable variable was closed before we ever tried to register? Because *that's exactly what our MonadBaseControl instance said!* The problem is that instead of our monadic state just being some value, it was the *entire action* we needed to run, which was now being deferred until after we called `closeInternalState`. Oops.

## What about the other bracket?

Now let's try to understand why `fileLen4` worked, despite the broken `MonadBaseControl` instance. Again, starting with the original code after replacing `control` with `liftBaseWith` and `restoreM`:

```haskell
res <- liftBaseWith $ \run -> bracket
  (run createInternalState)
  (\st -> run $ restoreM st >>= closeInternalState)
  (\st -> run $ restoreM st >>= runInternalState inner)
restoreM res
```

This turns into:

```haskell
res <- Bad $ bracket
  (return $ runBad createInternalState)
  (\st -> return $ runBad $ Bad st >>= closeInternalState)
  (\st -> return $ runBad $ Bad st >>= runInternalState inner)
Bad res
```

Since this case is a bit more involved than the previous one, let's strip off the noise of `Bad` and `runBad` calls, since they're just wrapping/unwrapping a newtype:

```haskell
res <- bracket
  (return createInternalState)
  (\st -> return $ st >>= closeInternalState)
  (\st -> return $ st >>= runInternalState inner)
res
```

To decompose this mess, let's look at the actual implementation of `bracket` from `base`:

```haskell
bracket before after thing =
  mask $ \restore -> do
    a <- before
    r <- restore (thing a) `onException` after a
    _ <- after a
    return r
```

We're going to ignore async exceptions for now, and therefore just mentally delete the `mask $ \restore` bit. We end up with:

```haskell
res <- do
  a <- return createInternalState
  r <- return (a >>= runInternalState inner)
    `onException` return (a >>= closeInternalState)
  _ <- return (a >>= closeInternalState)
  return r
res
```

As above, we know that our ``return x `onException` foo`` will never actually trigger the exception case. Also, `a <- return x` is the same as `let a = x`. So we can simplify to:

```haskell
res <- do
  let a = createInternalState
  let r = a >>= runInternalState inner
  _ <- return (a >>= closeInternalState)
  return r
res
```

Also, `_ <- return x` has absolutely no impact at all, so we can delete that line (and any mention of `closeInternalState`):

```haskell
res <- do
  let a = createInternalState
  let r = a >>= runInternalState inner
  return r
res
```

And then with a few more simple conversions, we end up with:

```haskell
createInternalState >>= runInternalState inner
```

No wonder this code "worked": it never bothered trying to clean up! This could have easily led to complete leaking of resources in the application. We only discovered the problem because our `runResourceT` function thankfully stressed the code in a different way.
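The "no cleanup" conclusion can be reproduced without any of the libraries involved. The following is a minimal sketch using only `base`; `leakyBracket`, `acquire`, `release`, and `use` are hypothetical names for this illustration, with log messages standing in for real resources. Each argument to `bracket` is wrapped in `return`, mirroring the simplified `Bad` version above, and `join` plays the role of the final `restoreM`.

```haskell
import Control.Exception (bracket)
import Control.Monad (join)
import Data.IORef

-- Returns the computed result together with the log of what actually ran.
leakyBracket :: IO (Int, [String])
leakyBracket = do
  ref <- newIORef []
  let say msg = modifyIORef ref (msg :)
      acquire = say "acquire" >> return (42 :: Int)
      release _ = say "release"
      use n = say "use" >> return (n + 1)
  -- bracket only ever sees `return`-wrapped actions, so it "acquires",
  -- "uses", and "releases" values that are themselves unexecuted actions.
  res <- join $ bracket
    (return acquire)
    (\st -> return (st >>= release))
    (\st -> return (st >>= use))
  logged <- reverse <$> readIORef ref
  return (res, logged)

main :: IO ()
main = leakyBracket >>= print
```

The action built by the cleanup callback is returned to `bracket` and thrown away, so "release" never appears in the log: this prints `(43,["acquire","use"])`.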

## What's the right instance?

It's certainly possible to define a correct newtype wrapper around `IO`:

```haskell
newtype Good a = Good { runGood :: IO a }
  deriving (Functor, Applicative, Monad, MonadIO, MonadThrow, MonadBase IO)

instance MonadBaseControl IO Good where
  type StM Good a = a

  liftBaseWith withRun = Good $ withRun runGood
  restoreM = Good . return
```

Unfortunately we can't simply use `GeneralizedNewtypeDeriving` to make this instance due to the associated type family. But the explicitness here helps us understand what we did wrong before. Note that our `type StM Good a` is just `a`, not `IO a`. We then implement the helper functions in terms of that. If you go through the same substitution exercise I did above, you'll see that, instead of passing around values which contain the actions to actually perform, our `fileLen3` and `fileLen4` functions will be performing the actions at the appropriate time.
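The substitution exercise can also be checked mechanically with a toy model. This is only a sketch under simplifying assumptions: `controlWith`, `scenario`, and `eventsFor` are hypothetical helpers written for this illustration (not monad-control functions), the newtype wrappers are dropped entirely, and log messages stand in for `createInternalState`, the conduit, and `closeInternalState`. The `run` argument models what `liftBaseWith` hands out and `restore` models `restoreM`.

```haskell
import Data.IORef

-- Hypothetical model of `control` for a newtype over IO.
controlWith :: (IO a -> IO st) -> (st -> IO a) -> ((IO a -> IO st) -> IO st) -> IO a
controlWith run restore f = f run >>= restore

-- Same shape as fileLen3, with logging in place of real resources.
scenario :: (String -> IO ()) -> (IO Int -> IO st) -> (st -> IO Int) -> IO Int
scenario say run restore = controlWith run restore $ \r -> do
  say "create"
  res <- r (say "use" >> return 7)
  say "close"
  return res

-- Collect the order of events for a given run/restore pair.
eventsFor :: (IO Int -> IO st) -> (st -> IO Int) -> IO [String]
eventsFor run restore = do
  ref <- newIORef []
  _ <- scenario (\m -> modifyIORef ref (m :)) run restore
  reverse <$> readIORef ref

main :: IO ()
main = do
  eventsFor return id >>= print  -- Bad-style:  state is `IO a`
  eventsFor id return >>= print  -- Good-style: state is `a`
```

In this model the `Bad`-style pair logs `["create","close","use"]`, while the `Good`-style pair logs `["create","use","close"]`, matching the conclusion of the hand substitution.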

I'm including the full test program at the end of this post for you to play with.

## Takeaways

So that blog post was certainly all over the place. I hope the primary thing you take away from it is a deeper understanding of how monad transformer stacks interact with operations in the base monad, and how monad-control works in general. In particular, next time you call `finally` on some five-layer-deep stack, maybe you'll think twice about the implication of calling `modify` or `tell` in your cleanup function.
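To see why a `modify` in a cleanup function deserves a second thought, here is a hand-rolled sketch using only `base`. It assumes a simplified model in which a stateful action is just a function `s -> IO (a, s)`; `StateIO` and `finallyState` are hypothetical names for this illustration. The model follows the usual `control`-based lifting of `finally`, where both the main action and the cleanup are run from the same starting state and the cleanup's result, including its state, is discarded.

```haskell
import Control.Exception (finally)

-- Hypothetical model of a StateT-over-IO action.
type StateIO s a = s -> IO (a, s)

-- Both actions start from the same initial state; `finally` throws away
-- whatever the cleanup returns, including its updated state.
finallyState :: StateIO s a -> StateIO s b -> StateIO s a
finallyState action cleanup s0 = action s0 `finally` cleanup s0

main :: IO ()
main = do
  let action s = return ((), s + 1)      -- like `modify (+ 1)`
      cleanup s = return ((), s + 100)   -- like `modify (+ 100)` in cleanup
  (_, final) <- finallyState action cleanup (0 :: Int)
  print final
```

In this model the cleanup's `+ 100` is silently lost and the final state is `1`.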

Another possible takeaway you may have is "Haskell's crazy complicated, this bug could happen to anyone, and it's almost undetectable." It turns out that there's a really simple workaround for that: stick to standard monad transformers whenever possible. monad-control is a phenomenal library, but I don't think most people should ever have to interact with it directly. Like async exceptions and `unsafePerformIO`, there are parts of our library ecosystem that require them, but you should stick to higher-level libraries that hide that insanity from you, the same way we use higher-level languages to avoid having to write assembly.

Finally, having to think about all of the monadic state stuff in my code gives me a headache. It's possible for us to have a library like `lifted-base`, but which constrains functions to only taking one argument in the `m` monad and the rest in `IO` to avoid the multiple-state stuff. However, my preferred solution is to avoid wherever possible monad transformers that introduce monadic state, and stick to `ReaderT`-like things for the majority of my application. (Yes, this is another pitch for my ReaderT design pattern.)

## Full final source code

```haskell
#!/usr/bin/env stack
-- stack --resolver lts-8.12 script
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Monad.Trans.Control
import Control.Monad.Trans.Resource
import Control.Exception.Safe
import qualified Control.Exception.Lifted as Lifted
import Conduit

newtype Bad a = Bad { runBad :: IO a }
  deriving (Functor, Applicative, Monad, MonadIO, MonadThrow, MonadBase IO)

instance MonadBaseControl IO Bad where
  type StM Bad a = IO a

  liftBaseWith withRun = Bad $ withRun $ return . runBad
  restoreM = Bad

newtype Good a = Good { runGood :: IO a }
  deriving (Functor, Applicative, Monad, MonadIO, MonadThrow, MonadBase IO)

instance MonadBaseControl IO Good where
  type StM Good a = a

  liftBaseWith withRun = Good $ withRun runGood
  restoreM = Good . return

fileLen1 :: (MonadThrow m, MonadBaseControl IO m, MonadIO m)
         => FilePath
         -> m Int
fileLen1 fp = runResourceT $ runConduit $ sourceFile fp .| lengthCE

fileLen2 :: (MonadThrow m, MonadBaseControl IO m, MonadIO m)
         => FilePath
         -> m Int
fileLen2 fp = Lifted.bracket
  createInternalState
  closeInternalState
  $ runInternalState
  $ runConduit
  $ sourceFile fp .| lengthCE

-- I'm ignoring async exception safety. This needs mask.
fileLen3 :: forall m. (MonadThrow m, MonadBaseControl IO m, MonadIO m)
         => FilePath
         -> m Int
fileLen3 fp = control $ \run -> do
  istate <- createInternalState
  res <- run (runInternalState inner istate)
    `onException` closeInternalState istate
  closeInternalState istate
  return res
  where
    inner :: ResourceT m Int
    inner = runConduit $ sourceFile fp .| lengthCE

fileLen4 :: forall m. (MonadThrow m, MonadBaseControl IO m, MonadIO m)
         => FilePath
         -> m Int
fileLen4 fp = control $ \run -> bracket
  (run createInternalState)
  (\st -> run $ restoreM st >>= closeInternalState)
  (\st -> run $ restoreM st >>= runInternalState inner)
  where
    inner :: ResourceT m Int
    inner = runConduit $ sourceFile fp .| lengthCE

main :: IO ()
main = do
  putStrLn "fileLen1"
  tryAny (fileLen1 "/usr/share/dict/words") >>= print
  tryAny (runBad (fileLen1 "/usr/share/dict/words")) >>= print
  tryAny (runGood (fileLen1 "/usr/share/dict/words")) >>= print

  putStrLn "fileLen2"
  tryAny (fileLen2 "/usr/share/dict/words") >>= print
  tryAny (runBad (fileLen2 "/usr/share/dict/words")) >>= print
  tryAny (runGood (fileLen2 "/usr/share/dict/words")) >>= print

  putStrLn "fileLen3"
  tryAny (fileLen3 "/usr/share/dict/words") >>= print
  tryAny (runBad (fileLen3 "/usr/share/dict/words")) >>= print
  tryAny (runGood (fileLen3 "/usr/share/dict/words")) >>= print

  putStrLn "fileLen4"
  tryAny (fileLen4 "/usr/share/dict/words") >>= print
  tryAny (runBad (fileLen4 "/usr/share/dict/words")) >>= print
  tryAny (runGood (fileLen4 "/usr/share/dict/words")) >>= print
```

**Bonus exercise** Take the `checkControl` function I provided above, and use it in the `Good` and `Bad` monads. See what the result is, and if you can understand why that's the case.