26 Jun 2017

This is a debugging story told completely out of order. In order to understand the ultimate bug, why it seemed to occur arbitrarily, and the ultimate resolution, there's lots of backstory to cover. If you're already deeply familiar with the inner workings of the monad-control package, you can probably look at a demonstration of the bad instance and move on. Otherwise, prepare for a fun ride!

As usual, if you want to play along, we're going to be using Stack's
script interpreter feature. Just
save the snippets contents to a file and run with ```
stack
filename.hs
```

. (It works with any snippet that begins with
`#!/usr/bin/env stack`

.)

Oh, and also: the confusion that this blog post demonstrates is one of the reasons why I strongly recommend sticking to a `ReaderT env IO`

monad transformer stack.

## Trying in StateT

Let's start with some broken code (my favorite kind). It uses the
`StateT`

transformer and a function which may throw a runtime
exception.

#!/usr/bin/env stack -- stack --resolver lts-8.12 script import Control.Monad.State.Strict import Control.Exception import Data.Typeable data OddException = OddException !Int -- great name :) deriving (Show, Typeable) instance Exception OddException mayThrow :: StateT Int IO Int mayThrow = do x <- get if odd x then lift $ throwIO $ OddException x else do put $! x + 1 return $ x `div` 2 main :: IO () main = runStateT (replicateM 2 mayThrow) 0 >>= print

Our problem is that we'd like to be able to recover from a thrown
exception. Easy enough we think, we'll just use
`Control.Exception.try`

to attempt to run the `mayThrow`

action. Unfortunately, if I wrap up `mayThrow`

with a `try`

, I get
this highly informative error message:

```
Main.hs:21:19: error:
• Couldn't match type ‘IO’ with ‘StateT Integer IO’
Expected type: StateT Integer IO ()
Actual type: IO ()
• In the first argument of ‘runStateT’, namely
‘(replicateM 2 (try mayThrow))’
In the first argument of ‘(>>=)’, namely
‘runStateT (replicateM 2 (try mayThrow)) 0’
In the expression:
runStateT (replicateM 2 (try mayThrow)) 0 >>= print
```

Oh, that makes sense: `try`

is specialized to `IO`

, and our function
is `StateT Int IO`

. Our first instinct is probably to keep throwing
`lift`

calls into our program until it compiles, since `lift`

seems to
always fix monad transformer compilation errors. However, try as you
might, you'll never succeed. To understand why, let's look at the
(slightly specialized) type signature for `try`

:

try :: IO a -> IO (Either OddException a)

If I apply `lift`

to this, I could end up with:

try :: IO a -> StateT Int IO (Either OddException a)

But there's no way to use `lift`

to modify the type of the `IO a`

input. This is generally the case with the `lift`

and `liftIO`

functions: they can deal with monad values that are the *output* of a
function, but not the *input* to the function. (More precisely: the
functions are covariant and work on values in positive positions. We'd
need something contravariant to work on vlaues in negative
positions. You can
read more on this nomenclature
in another blog post.)

Huh, I guess we're stuck. But then I remember that `StateT`

is just
defined as ```
newtype StateT s m a = StateT { runStateT :: s -> m
(a,s)}
```

. So maybe I can write a version of `try`

that works for a
`StateT`

using the internals of the type.

tryStateT :: StateT Int IO a -> StateT Int IO (Either OddException a) tryStateT (StateT f) = StateT $ \s0 -> do eres <- try (f s0) return $ case eres of Left e -> (Left e, s0) Right (a, s1) -> (Right a, s1)

Go ahead and plug that into our previous example, and you should get the desired output:

`([Right 0,Left (OddException 1)],1)`

Let's break down in nauseating detail what that `tryStateT`

function
did:

- Unwrap the
`StateT`

data constructor from the provided action to get a function`f :: Int -> IO (a, Int)`

- Construct a new
`StateT`

value on the right hand side by using the`StateT`

data constructor, and capturing the initial state in the value`s0 :: Int`

. - Pass
`s0`

to`f`

to get an action`IO :: (a, Int)`

, which will give the result and the new, updated state. - Wrap
`f s0`

with`try`

to allow us to detect and recover from a runtime exception. `eres`

has type`Either OddException (a, Int)`

, and we pattern match on it.- If we receive a
`Right`

/success value, we simply wrap up the`a`

value in a`Right`

constructor together with the updated state. - If we receive a
`Left`

/exception value, we wrap it up the exception with a`Left`

. However, we need to return*some*new state. Since we have no such state available to us from the action, we return the only thing we can: the initial`s0`

state value.

**Lesson learned** We can use `try`

in a `StateT`

with some
difficulty, but we need to be aware of what happens to our monadic
state.

## Catching in StateT

It turns out that it's trivial to implement the `try`

function in
terms of `catch`

, and the `catch`

function in terms of `try`

, at least
when sticking to the `IO`

-specialized versions:

try' :: Exception e => IO a -> IO (Either e a) try' action = (Right <$> action) `catch` (return . Left) catch' :: Exception e => IO a -> (e -> IO a) -> IO a catch' action onExc = do eres <- try action case eres of Left e -> onExc e Right a -> return a

It turns out that by just changing the type signatures and replacing
`try`

with `tryStateT`

, we can do the same thing for `StateT`

:

catchStateT :: Exception e => StateT Int IO a -> (e -> StateT Int IO a) -> StateT Int IO a catchStateT action onExc = do eres <- tryStateT action case eres of Left e -> onExc e Right a -> return a

**NOTE** Pay close attention to that type signature, and think about
how monadic state is being shuttled through this function.

Well, if we can implement `catchStateT`

in terms of `tryStateT`

,
surely we can implement it directly as well. Let's do the most
straightforward thing I can think of (or at least the thing that
continues my narrative here):

catchStateT :: Exception e => StateT Int IO a -> (e -> IO a) -> StateT Int IO a catchStateT (StateT action) onExc = StateT $ \s0 -> action s0 `catch` \e -> do a <- onExc e return (a, s0)

Here, we're basing our implementation on top of the `catch`

function
instead of the `try`

function. We do the same unwrap-the-StateT,
capture-the-s0 trick we did before. Now, in the lambda we've created
for the `catch`

call, we pass the `e`

exception value to the
user-supplied `onExc`

function, and then like `tryStateT`

wrap up the
result in a tuple with the initial `s0`

.

Who noticed the difference in type signature? Instead of ```
e -> StateT
Int IO a
```

, our `onExc`

handler has type `e -> IO a`

. I told you to pay
attention to how the monadic states were being shuttled around; let's
analyze it:

- In the first function, we use
`tryStateT`

, which as we mentioned will reconstitute the original`s0`

state when it returns. If the action succeeded, nothing else happens. But in the exception case, that original`s0`

is now passed into the`onExc`

function, and the final monadic state returned will be the result of the`onExc`

function. - In the second function, we never give the
`onExc`

function a chance to play with monadic state, since it just lives in`IO`

. So we always return the original state at the end if an exception occurred.

Which behavior is best? I think most people would argue that the first
function is better: it's more general in allowing `onExc`

to access
and modify the monadic state, and there's not really any chance for
confusion. Fair enough, I'll buy that argument (that I just made on
behalf of all of my readers).

**Bonus exercise** Modify this implementation of `catchStateT`

to have
the same type signature as the original one.

## Finally

This is fun, let's keep reimplementing functions from
`Control.Exception`

! This time, let's do `finally`

, which will ensure
that some action (usually a cleanup action) is run after an initial
action, regardless of whether an exception was thrown.

finallyStateT :: StateT Int IO a -> IO b -> StateT Int IO a finallyStateT (StateT action) cleanup = StateT $ \s0 -> action s0 `finally` cleanup

That was really easy. Ehh, but one problem: look at that type
signature! We just agreed (or I agreed for you) that in the case of
`catch`

, it was better to have the second argument *also* live in
`StateT Int IO`

. Here, our argument lives in `IO`

. Let's fix that:

finallyStateT :: StateT Int IO a -> StateT Int IO b -> StateT Int IO a finallyStateT (StateT action) (StateT cleanup) = StateT $ \s0 -> action s0 `finally` cleanup s0

Huh, also pretty simple. Let's analyze the monadic state behavior
here: our cleanup action is given the initial state, regardless of the
result of `action s0`

. That means that, even if the action succeeded,
we'll ignore the updated state. Furthermore, because `finally`

ignores
the result of the second argument, we will ignore any updated monadic
state. Want to see what I mean? Try this out:

#!/usr/bin/env stack -- stack --resolver lts-8.12 script import Control.Exception import Control.Monad.State.Strict finallyStateT :: StateT Int IO a -> StateT Int IO b -> StateT Int IO a finallyStateT (StateT action) (StateT cleanup) = StateT $ \s0 -> action s0 `finally` cleanup s0 action :: StateT Int IO () action = modify (+ 1) cleanup :: StateT Int IO () cleanup = do get >>= lift . print modify (+ 2) main :: IO () main = execStateT (action `finallyStateT` cleanup) 0 >>= print

You may expect the output of this to be the numbers 1 and 3, but in
fact the output is 0 and 1: `cleanup`

looks at the initial state value
of 0, and its `+ 2`

modification is thrown away. So can we implement a
version of our function that keeps the state? Sure (slightly
simplified to avoid async exception/mask noise):

finallyStateT :: StateT Int IO a -> StateT Int IO b -> StateT Int IO a finallyStateT (StateT action) (StateT cleanup) = StateT $ \s0 -> do (a, s1) <- action s0 `onException` cleanup s0 (_b, s2) <- cleanup s1 return (a, s2)

This has the expected output of 1 and 3. Looking at how it works: we
follow our same tricks, and pass in `s0`

to `action`

. If an exception
is thrown there, we once again pass in `s0`

to `cleanup`

and ignore
its updated state (since we have no choice). However, in the success
case, we now pass in the *updated* state (`s1`

) to `cleanup`

. And
finally, our resulting state is the result of `cleanup`

(`s2`

) instead
of the `s1`

produced by `action`

.

We have three different implementations of `finallyStateT`

and two
different type signatures. Let's compare them:

- The first one (the
`IO`

version) has the advantage that its type tells us*exactly*what's happening: the cleanup has no access to the state at all. However, you can argue like we did with`catchStateT`

that this is limiting and not what people would expect the type signature to be. - The second one (use the initial state for
`cleanup`

and then throw away its modified state) has the advantage that it's logically consistent: whether`cleanup`

is called from a success or exception code path, it does the exact same thing. On the other hand, you can argue that it is surprising behavior that state updates that*can*be preserved are being thrown away. - The third one (keep the state) has the reversed arguments of the second one.

So unlike `catchStateT`

, I would argue that there's not nearly as
clear a winner with `finallyStateT`

. Each approach has its relative
merits.

One final point that seems almost not worth mentioning (hint: epic
foreshadowment incoming). The first version (`IO`

specialized) has an
additional benefit of being ever-so-slightly more efficient than the
other two, since it doesn't need to deal with the additional monadic
state in `cleanup`

. With a simple monad transformer like `StateT`

this
performance difference is hardly even worth thinking about. However,
if we were in a tight inner loop, and our monad stack was
significantly more complicated, you could imagine a case where the
performance difference was significant.

## Implementing for other transformers

It's great that we understand `StateT`

so well, but can we do anything
for other transformers? It turns out that, yes, we can for many
transformers. (An exception is continuation-based transformers, which
you can read a bit about in passing in
my ResourceT blog post from last week.)
Let's look at a few other examples of `finally`

:

import Control.Exception import Control.Monad.Writer import Control.Monad.Reader import Control.Monad.Except import Data.Monoid finallyWriterT :: Monoid w => WriterT w IO a -> WriterT w IO b -> WriterT w IO a finallyWriterT (WriterT action) (WriterT cleanup) = WriterT $ do (a, w1) <- action `onException` cleanup (_b, w2) <- cleanup return (a, w1 <> w2) finallyReaderT :: ReaderT r IO a -> ReaderT r IO b -> ReaderT r IO a finallyReaderT (ReaderT action) (ReaderT cleanup) = ReaderT $ \r -> do a <- action r `onException` cleanup r _b <- cleanup r return a finallyExceptT :: ExceptT e IO a -> ExceptT e IO b -> ExceptT e IO a finallyExceptT (ExceptT action) (ExceptT cleanup) = ExceptT $ do ea <- action `onException` cleanup eb <- cleanup return $ case (ea, eb) of (Left e, _) -> Left e (Right _a, Left e) -> Left e (Right a, Right _b) -> Right a

The `WriterT`

case is very similar to the `StateT`

case, except (1)
there's no initial state `s0`

to contend with, and (2) instead of
receiving an updated `s2`

state from `cleanup`

, we need to monoidally
combine the `w1`

and `w2`

values. The `ReaderT`

case is *also* very
similar to `StateT`

, but in the opposite way: we receive an immutable
environment `r`

which is passed into all functions, but there is no
updated state. To put this in other words: `WriterT`

has no *context*
but has *mutable monadic state*, whereas `ReaderT`

has a context but
no mutable monadic state. `StateT`

, by contrast, has both. (This is
important to understand, so reread it a few times to get comfortable
with the concept.)

The `ExceptT`

case is interesting: it has no context (like `WriterT`

),
but it *does* have mutable monadic state, just not like `StateT`

and
`WriterT`

. Instead of returning an extra value with each result (as a
product), `ExceptT`

returns either a result value or an `e`

value (as
a sum). The `case`

expression at the end of `finallyExceptT`

is very
informative: we need to figure out how to combine the various monadic
states together. Our implementation here says that if `action`

returns
`e`

, we take that result. Otherwise, if `cleanup`

fails, we take
*that* value. And if they both return `Right`

values, then we use
`action`

's result. But there are at least two other valid choices:

- Prefer
`cleanup`

's`e`

value to`action`

's`e`

value, if both are available. - Completely ignore the
`e`

value returned by`cleanup`

, and just use`action`

's result.

There's also a fourth, invalid option: if `action`

returns a `Left`

,
return that immediately and don't call `cleanup`

. This has been a
perenniel source of bugs in many libraries dealing with exceptions in
monad transformers like `ErrorT`

, `ExceptT`

, and `EitherT`

. This
invalidates the contract of `finally`

, namely that `cleanup`

will
always be run. I've seen some arguments for why this can make sense,
but I consider it nothing more than a buggy implementation.

And finally, like with `StateT`

, we could avoid all of these questions
for `ExceptT`

if we just modify our type signature to use `IO b`

for
`cleanup`

:

finallyExceptT :: ExceptT e IO a -> IO b -> ExceptT e IO a finallyExceptT (ExceptT action) cleanup = ExceptT $ do ea <- action `onException` cleanup _b <- cleanup return ea

So our takeaway: we can implement `finally`

for various monad
transformers. In some cases this leads to questions of semantics, just
like with `StateT`

. And all of these transformers fall into a pattern
of optionally capturing some initial context, and optionally shuttling
around some monadic state.

(And no, I haven't forgotten that the title of this blog post talks
about `bracket`

. We're getting there, ever so slowly. I hope I've
piqued your curiosity.)

## Generalizing the pattern

It's wonderful that we can implement all of these functions that take
monad transformers as arguments. But do any of us actually want to go
off and implement `catch`

, `try`

, `finally`

, `forkIO`

, `timeout`

, and
a dozen other functions for every possible monad transformer stack
imagineable? I doubt it. So just as we have `MonadTrans`

and `MonadIO`

for dealing with transformers in output/positive position, we can
construct some kind of typeclass that handles the two concepts we
mentioned above: capture the context, and deal with the monadic state.

Let's start by playing with this for just `StateT`

.

#!/usr/bin/env stack -- stack --resolver lts-8.12 script {-# LANGUAGE RankNTypes #-} {-# LANGUAGE ScopedTypeVariables #-} import Control.Exception import Control.Monad.State.Strict type Run s = forall b. StateT s IO b -> IO (b, s) capture :: forall s a. (Run s -> IO a) -> StateT s IO a capture withRun = StateT $ \s0 -> do let run :: Run s run (StateT f) = f s0 a <- withRun run return (a, s0) restoreState :: (a, s) -> StateT s IO a restoreState stateAndResult = StateT $ \_s0 -> return stateAndResult finally1 :: StateT s IO a -> IO b -> StateT s IO a finally1 action cleanup = do x <- capture $ \run -> run action `finally` cleanup restoreState x finally2 :: StateT s IO a -> StateT s IO b -> StateT s IO a finally2 action cleanup = do x <- capture $ \run -> run action `finally` run cleanup restoreState x -- Not async exception safe! finally3 :: StateT s IO a -> StateT s IO b -> StateT s IO a finally3 action cleanup = do x <- capture $ \run -> run action `onException` run cleanup a <- restoreState x _b <- cleanup return a main :: IO () main = do flip evalStateT () $ lift (putStrLn "here1") `finally1` putStrLn "here2" flip evalStateT () $ lift (putStrLn "here3") `finally2` lift (putStrLn "here4") flip evalStateT () $ lift (putStrLn "here5") `finally2` lift (putStrLn "here6")

That's a lot, let's step through it slowly:

type Run s = forall b. StateT s IO b -> IO (b, s)

This is a helper type to make the following bit simpler. It represents
the concept of capturing the initial state in a general manner. Given
an action living in our transformer, it turns an action in our base
monad, returning the entire monadic state with the return value (i.e.,
`(b, s)`

instead of just `b`

). This allows use to define our `capture`

function:

capture :: forall s a. (Run s -> IO a) -> StateT s IO a capture withRun = StateT $ \s0 -> do let run :: Run s run (StateT f) = f s0 a <- withRun run return (a, s0)

This function says "you give me some function that needs to be able to
run monadic actions with the initial context, and I'll give it that
initial context running function (`Run s`

)." The implementation isn't
too bad: we just capture the `s0`

, create a `run`

function out of it,
pass that into the user-provided argument, and then return the result
with the original state.

Now we need some way to update the monadic state based on a result
value. We call it `restoreState`

:

restoreState :: (a, s) -> StateT s IO a restoreState stateAndResult = StateT $ \_s0 -> return stateAndResult

Pretty simple too: we ignore our original monadic state and replace it
with the state contained in the argument. Next we use these two
functions to implement three versions of `finally`

. The first two are
able to reuse the `finally`

from `Control.Exception`

. However, both of
them suffer from the inability to retain monadic state. Our third
implementation fixes that, at the cost of having to reimplement the
logic of `finally`

. And as my comment there mentions, our
implementation is not in fact async exception safe.

So all of our original trade-offs apply from our initial `StateT`

discussion, but now there's an additional downside to option 3: it's
significantly more complicated to implement correctly.

## The MonadIOControl type class

Alright, we've established that it's possible to capture this idea for
`StateT`

. Let's generalize to a typeclass. We'll need three
components:

- A capture function. We'll call it
`liftIOWith`

, to match nomenclature in monad-control. - A restore function, which we'll call
`restoreM`

. - An
*associated type*(type family) to represent what the monadic state for the given monad stack is.

We end up with:

type RunInIO m = forall b. m b -> IO (StM m b) class MonadIO m => MonadIOControl m where type StM m a liftIOWith :: (RunInIO m -> IO a) -> m a restoreM :: StM m a -> m a

Let's write an instance for `IO`

:

instance MonadIOControl IO where type StM IO a = a liftIOWith withRun = withRun id restoreM = return

The `type StM IO a = a`

says that, for an `IO`

action returning `a`

,
the full monadic state is just `a`

. In other words, there is no
additional monadic state hanging around. That's good, as we know that
there isn't. `liftIOWith`

is able to just use `id`

as the `RunInIO`

function, since you can run an `IO`

action in `IO`

directly. And
finally, since there is no monadic state to update, `restoreM`

just
wraps up the result value in `IO`

via `return`

. (More foreshadowment:
what this instance is supposed to look like is actually at the core of
the bug this blog post will eventually talk about.)

Alright, let's implement this instance for `StateT s IO`

:

instance MonadIOControl (StateT s IO) where type StM (StateT s IO) a = (a, s) liftIOWith withRun = StateT $ \s0 -> do a <- withRun $ \(StateT f) -> f s0 return (a, s0) restoreM stateAndResult = StateT $ \_s0 -> return stateAndResult

This is basically identical to the functions we defined above, so I
won't dwell on it here. But here's an interesting observation: the
same way we define `MonadIO`

instance as ```
instance MonadIO m =>
MonadIO (StateT s m)
```

, it would be great to do the same thing for
`MonadIOControl`

. And, in fact, we can do just that!

instance MonadIOControl m => MonadIOControl (StateT s m) where type StM (StateT s m) a = StM m (a, s) liftIOWith withRun = StateT $ \s0 -> do a <- liftIOWith $ \run -> withRun $ \(StateT f) -> run $ f s0 return (a, s0) restoreM x = StateT $ \_s0 -> restoreM x

We use the underlying monad's `liftIOWith`

and `restoreM`

functions
within our own definitions, and thereby get context and state passed
up and down the stack as needed. Alright, let's go ahead and do this
for all of the transformers we've been discussing:

#!/usr/bin/env stack -- stack --resolver lts-8.12 script {-# LANGUAGE RankNTypes #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE TypeFamilies #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE UndecidableInstances #-} import Control.Exception import Control.Monad.State.Strict import Control.Monad.Writer import Control.Monad.Reader import Control.Monad.Except import Data.Monoid import Data.IORef type RunInIO m = forall b. m b -> IO (StM m b) class MonadIO m => MonadIOControl m where type StM m a liftIOWith :: (RunInIO m -> IO a) -> m a restoreM :: StM m a -> m a instance MonadIOControl IO where type StM IO a = a liftIOWith withRun = withRun id restoreM = return instance MonadIOControl m => MonadIOControl (StateT s m) where type StM (StateT s m) a = StM m (a, s) liftIOWith withRun = StateT $ \s0 -> do a <- liftIOWith $ \run -> withRun $ \(StateT f) -> run $ f s0 return (a, s0) restoreM x = StateT $ \_s0 -> restoreM x instance (MonadIOControl m, Monoid w) => MonadIOControl (WriterT w m) where type StM (WriterT w m) a = StM m (a, w) liftIOWith withRun = WriterT $ do a <- liftIOWith $ \run -> withRun $ \(WriterT f) -> run f return (a, mempty) restoreM x = WriterT $ restoreM x instance MonadIOControl m => MonadIOControl (ReaderT r m) where type StM (ReaderT r m) a = StM m a liftIOWith withRun = ReaderT $ \r -> liftIOWith $ \run -> withRun $ \(ReaderT f) -> run $ f r restoreM x = ReaderT $ \r -> restoreM x instance MonadIOControl m => MonadIOControl (ExceptT e m) where type StM (ExceptT e m) a = StM m (Either e a) liftIOWith withRun = ExceptT $ do a <- liftIOWith $ \run -> withRun $ \(ExceptT f) -> run f return $ Right a restoreM x = ExceptT $ restoreM x control :: MonadIOControl m => (RunInIO m -> IO (StM m a)) -> m a control f = do x <- liftIOWith f restoreM x checkControl :: MonadIOControl m => m () checkControl = control $ \run -> do ref <- newIORef (0 :: Int) let ensureIs :: MonadIO m => Int -> m () ensureIs expected = liftIO $ do putStrLn $ "ensureIs " ++ show expected curr <- atomicModifyIORef ref $ \curr -> (curr + 1, curr) unless (curr == expected) $ error $ show ("curr /= expected", curr, expected) ensureIs 0 Control.Exception.mask $ \restore -> do ensureIs 1 res <- restore (ensureIs 2 >> run (ensureIs 3) `finally` ensureIs 4) ensureIs 5 return res main :: IO () main = do checkControl runStateT checkControl () >>= print runWriterT checkControl >>= (print :: ((), ()) -> IO ()) runReaderT checkControl () runExceptT checkControl >>= (print :: Either () () -> IO ())

I encourage you to inspect each of the instances above and make sure
you're comfortable with their implementation. I've added a function
here, `checkControl`

, as a basic sanity check of our
implementation. We start with the `control`

helper function, which
runs some action with a `RunInIO`

argument, and then restores the
monadic state. Then we use this function in `checkControl`

to ensure
that a series of actions are all run in the correct order. As you can
see, all of our test monads pass (again, *foreshadowment*).

The real monad-control package looks pretty similar to this, except:

- Instead of
`MonadIOControl`

, which is hard-coded to using`IO`

as a base monad, it provides a`MonadBaseControl`

typeclass, which allows arbitrary base monads (like`ST`

or`STM`

). - Just as
`MonadBaseControl`

is an analogue of`MonadIO`

, the package provides`MonadTransControl`

as an analogue of`MonadTrans`

, allowing you to unwrap one layer in a monad stack.

With all of this exposition out of the way—likely the longest exposition I've ever written in any blog post—we can start dealing with the actual bug. I'll show you the full context eventually, but I was asked to help debug a function that looked something like this:

fileLen1 :: (MonadThrow m, MonadBaseControl IO m, MonadIO m) => FilePath -> m Int fileLen1 fp = runResourceT $ runConduit $ sourceFile fp .| lengthCE

This is fairly common in Conduit code. We're going to use
`sourceFile`

, which needs to allocate some resources. Since we can't
safely allocate resources from within a Conduit pipeline, we start off
with `runResourceT`

to allow Conduit to register cleanup
actions. (This combination is so common that we have a helper function
`runConduitRes = runResourceT . runConduit`

.)

Unfortunately, this innocuous-looking like of code was generating an error message:

`Control.Monad.Trans.Resource.register': The mutable state is being accessed after cleanup. Please contact the maintainers.`

The "Please contact the maintainers." line should probably be removed from the resourcet package; it was from back in a time when we thought this bug was most likely to indicate an implementation bug within resourcet. That's no longer the case... which hopefully this debugging adventure will help demonstrate.

Anyway, as last week's blog post on ResourceT explained,
`runResourceT`

creates a mutable variable to hold a list of cleanup
actions, allows the inner action to register cleanup values into that
mutable variable, and then when `runResourceT`

is exiting, it calls
all those cleanup actions. And as a last sanity check, it replaces the
value inside that mutable variable with a special value indicating
that the state has already been closed, and it is therefore invalid to
register further cleanup actions.

In well-behaved code, the structure of our `runResourceT`

function
should prevent the mutable state from being accessible after it's
closed, though I mention some cases last week that could cause that to
happen (specifically, misuse of concurrency and the `transPipe`

function). However, after thoroughly exploring the codebase, I could
find no indication that either of these common bugs had occurred.

Internally, `runResourceT`

is essentially a `bracket`

call, using the
`createInternalState`

function to allocate the mutable variable, and
`closeInternalState`

to clean it up. So I figured I could get a bit
more information about this bug by using the `bracket`

function from
`Control.Exception.Lifted`

and implementing:

fileLen2 :: (MonadThrow m, MonadBaseControl IO m, MonadIO m) => FilePath -> m Int fileLen2 fp = Lifted.bracket createInternalState closeInternalState $ runInternalState $ runConduit $ sourceFile fp .| lengthCE

Much to my chagrin, the bug disappeared! Suddenly the code worked
perfectly. Beginning to question my sanity, I decided to look at the
implementation of `runResourceT`

, and found this:

runResourceT :: MonadBaseControl IO m => ResourceT m a -> m a runResourceT (ResourceT r) = control $ \run -> do istate <- createInternalState E.mask $ \restore -> do res <- restore (run (r istate)) `E.onException` stateCleanup ReleaseException istate stateCleanup ReleaseNormal istate return res

Ignoring the fact that we differentiate between exception and normal
cleanup in the `stateCleanup`

function, I was struck by one question:
why did I decide to implement this with `control`

in a manual,
error-prone way instead of using the `bracket`

function directly? I
began to worry that there was a bug in this implementation leading to
all of the problems.

However, after reading through this implementation many times, I
convinced myself that it was, in fact, correct. And then I realized
why I had done it this way. Both `createInternalState`

and
`stateCleanup`

are functions that can live in `IO`

directly, without
any need of a monad transformer state. The only function that needed
the monad transformer logic was that contained in the `ResourceT`

itself.

If you remember our discussion above, there were two major advantages
of the implementation of `finally`

which relied upon `IO`

for the
cleanup function instead of using the monad transformer state:

- It was much more explicit about how monadic state was going to be handled.
- It gave a slight performance advantage.

With the downside being that the type signature wasn't quite what people normally expected. Well, that downside didn't apply in my case: I was working on an internal function in a library, so I was free to ignore what a user-friendly API would look like. The advantage of explicitness around monadic state certainly appealed in a library that was so sensitive to getting things right. And given how widely used this function is, and the deep monadic stacks it was sometimes used it, any performance advantage was worth pursuing.

Alright, I felt good about the fact that `runResourceT`

was
implemented correctly. Just to make sure I wasn't crazy, I
reimplemented `fileLen`

to use an explicit `control`

instead of
`Lifted.bracket`

, and the bug reappeared:

-- I'm ignoring async exception safety. This needs mask. fileLen3 :: forall m. (MonadThrow m, MonadBaseControl IO m, MonadIO m) => FilePath -> m Int fileLen3 fp = control $ \run -> do istate <- createInternalState res <- run (runInternalState inner istate) `onException` closeInternalState istate closeInternalState istate return res where inner :: ResourceT m Int inner = runConduit $ sourceFile fp .| lengthCE

And as one final sanity check, I implemented `fileLen4`

to use the
generalized style of `bracket`

, where the allocation and cleanup
functions live in the monad stack instead of just `IO`

, and as
expected the bug disappeared again. (Actually, I didn't really do
this. I'm doing it now for the purpose of this blog post.)

fileLen4 :: forall m. (MonadThrow m, MonadBaseControl IO m, MonadIO m) => FilePath -> m Int fileLen4 fp = control $ \run -> bracket (run createInternalState) (\st -> run $ restoreM st >>= closeInternalState) (\st -> run $ restoreM st >>= runInternalState inner) where inner :: ResourceT m Int inner = runConduit $ sourceFile fp .| lengthCE

Whew, OK! So it turns out that my blog post title was correct: this
*is* a tale of two brackets. And somehow, one of them triggers a bug,
and one of them doesn't. But I still didn't know quite how that
happened.

## The culprit

Another member of the team tracked down the ultimate problem to a
datatype that looked like this (though not actually named `Bad`

, that
would have been too obvious):

newtype Bad a = Bad { runBad :: IO a } deriving (Functor, Applicative, Monad, MonadIO, MonadThrow, MonadBase IO) instance MonadBaseControl IO Bad where type StM Bad a = IO a liftBaseWith withRun = Bad $ withRun $ return . runBad restoreM = Bad

That's the kind of code that can easily pass a code review without
anyone noticing a thing. With all of the context from this blog post,
you may be able to understand why I've called this type `Bad`

. Go
ahead and give it a few moments to try and figure it out.

OK, ready to see how this plays out? The `StM Bad a`

associated type
is supposed to contain the result value of the underlying monad,
together with any state introduced by this monad. Since we just have a
newtype around `IO`

, there should be no monadic state, and we should
just have `a`

. However, we've *actually* defined it as `IO a`

, which
means "my monadic state for a value `a`

is an `IO`

action which will
return an `a`

." The implementation of `liftBaseWith`

and `restoreM`

are simply in line with making the types work out.

Let's look at `fileLen3`

understanding that this is the instance in
question. I'm also going to expand the `control`

function to make it
easier to see what's happening.

res <- liftBaseWith $ \run -> do istate <- createInternalState res <- run (runInternalState inner istate) `onException` closeInternalState istate closeInternalState istate return res restoreM res

If we play it a little loose with newtype wrappers, we can substitute
in the implementations of `liftBaseWith`

and `restoreM`

to get:

res <- Bad $ do let run = return . runBad istate <- createInternalState res <- run (runInternalState inner istate) `onException` closeInternalState istate closeInternalState istate return res Bad res

Let's go ahead and substitute in our `run`

function in the one place
it's used:

res <- Bad $ do istate <- createInternalState res <- return (runBad (runInternalState inner istate)) `onException` closeInternalState istate closeInternalState istate return res Bad res

If you look at the code `return x `onException` foo`

, it's pretty
easy to establish that `return`

itself will never throw an exception
in `IO`

, and therefore the `onException`

it useless. In other words,
the code is equivalent to just `return x`

. So again substituting:

res <- Bad $ do istate <- createInternalState res <- return (runBad (runInternalState inner istate)) closeInternalState istate return res Bad res

And since `foo <- return x`

is just `let foo = x`

, we can turn this
into:

res <- Bad $ do istate <- createInternalState closeInternalState istate return (runBad (runInternalState inner istate)) Bad res

And then:

Bad $ do istate <- createInternalState closeInternalState istate Bad (runBad (runInternalState inner istate))

And finally, just to drive the point home:

istate <- Bad createInternalState Bad $ closeInternalState istate runInternalState inner istate

So who wants to take a guess why the mutable variable was closed
before we ever tried to register? Because *that's exactly what our
MonadBaseControl instance said!* The problem is that instead of our
monadic state just being some value, it was the

*entire action*we needed to run, which was now being deferred until after we called

`closeInternalState`

. Oops.## What about the other bracket?

Now let's try to understand why `fileLen4`

worked, despite the broken
`MonadBaseControl`

instance. Again, starting with the original code
after replacing `control`

with `liftBaseWith`

and `restoreM`

:

res <- liftBaseWith $ \run -> bracket (run createInternalState) (\st -> run $ restoreM st >>= closeInternalState) (\st -> run $ restoreM st >>= runInternalState inner) restoreM res

This turns into:

res <- Bad $ bracket (return $ runBad createInternalState) (\st -> return $ runBad $ Bad st >>= closeInternalState) (\st -> return $ runBad $ Bad st >>= runInternalState inner) Bad res

Since this case is a bit more involved than the previous one, let's
strip off the noise of `Bad`

and `runBad`

calls, since they're just
wrapping/unwrapping a newtype:

res <- bracket (return createInternalState) (\st -> return $ st >>= closeInternalState) (\st -> return $ st >>= runInternalState inner) res

To decompose this mess, let's look at the actual implementation of
`bracket`

from `base`

:

bracket before after thing = mask $ \restore -> do a <- before r <- restore (thing a) `onException` after a _ <- after a return r

We're going to ignore async exceptions for now, and therefore just
mentally delete the `mask $ \restore`

bit. We end up with:

res <- do a <- return createInternalState r <- return (a >>= runInternalState inner) `onException` return (a >>= closeInternalState) _ <- return (a >>= closeInternalState) return r res

As above, we know that our `return x `onException` foo`

will never
actually trigger the exception case. Also, `a <- return x`

is the same
as `let a = x`

. So we can simplify to:

res <- do let a = createInternalState let r = a >>= runInternalState inner _ <- return (a >>= closeInternalState) return r res

Also, `_ <- return x`

has absolutely no impact at all, so we can
delete that line (and any mention of `closeInternalState`

):

res <- do let a = createInternalState let r = a >>= runInternalState inner return r res

And then with a few more simply conversions, we end up with:

createInternalState >>= runInternalState inner

No wonder this code "worked": it never bothered trying to clean up!
This could have easily led to complete leaking of resources in the
application. Only the fact that our `runResourceT`

function thankfully
stressed the code in a different way did we reveal the problem.

## What's the right instance?

It's certainly possible to define a correct newtype wrapper around
`IO`

:

newtype Good a = Good { runGood :: IO a } deriving (Functor, Applicative, Monad, MonadIO, MonadThrow, MonadBase IO) instance MonadBaseControl IO Good where type StM Good a = a liftBaseWith withRun = Good $ withRun runGood restoreM = Good . return

Unfortunately we can't simply use `GeneralizedNewtypeDeriving`

to make
this instance due to the associated type family. But the explicitness
here helps us understand what we did wrong before. Note that our ```
type
StM Good a
```

is just `a`

, not `IO a`

. We then implement the helper
functions in terms of that. If you go through the same substitution
exercise I did above, you'll see that—instead of passing around
values which contain the actions to actually perform—our
`fileLen3`

and `fileLen4`

functions will be performing the actions at
the appropriate time.

I'm including the full test program at the end of this post for you to play with.

## Takeaways

So that blog post was certainly all over the place. I hope the primary
thing you take away from it is a deeper understanding of how monad
transformer stacks interact with operations in the base monad, and how
monad-control works in general. In particular, next time you call
`finally`

on some five-layer-deep stack, maybe you'll think twice
about the implication of calling `modify`

or `tell`

in your cleanup
function.

Another possible takeaway you may have is "Haskell's crazy
complicated, this bug could happen to anyone, and it's almost
undetectable." It turns out that there's a really simple workaround
for that: stick to standard monad transformers whenever
possible. monad-control is a phenomonal library, but I don't think
most people should ever have to interact with it directly. Like async
exceptions and `unsafePerformIO`

, there are parts of our library
ecosystem that require them, but you should stick to higher-level
libraries that hide that insanity from you, the same way we use
higher-level languages to avoid having to write assembly.

Finally, having to think about all of the monadic state stuff in my
code gives me a headache. It's possible for us to have a library like
`lifted-base`

, but which constrains functions to only taking one
argument in the `m`

monad and the rest in `IO`

to avoid the
multiple-state stuff. However, my preferred solution is to avoid
wherever possible monad transformers that introduce monadic state, and
stick to `ReaderT`

like things for the majority of my
application. (Yes, this is another pitch for my
ReaderT design pattern.)

## Full final source code

#!/usr/bin/env stack -- stack --resolver lts-8.12 script {-# LANGUAGE GeneralizedNewtypeDeriving #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE TypeFamilies #-} {-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE ScopedTypeVariables #-} import Control.Monad.Trans.Control import Control.Monad.Trans.Resource import Control.Exception.Safe import qualified Control.Exception.Lifted as Lifted import Conduit newtype Bad a = Bad { runBad :: IO a } deriving (Functor, Applicative, Monad, MonadIO, MonadThrow, MonadBase IO) instance MonadBaseControl IO Bad where type StM Bad a = IO a liftBaseWith withRun = Bad $ withRun $ return . runBad restoreM = Bad newtype Good a = Good { runGood :: IO a } deriving (Functor, Applicative, Monad, MonadIO, MonadThrow, MonadBase IO) instance MonadBaseControl IO Good where type StM Good a = a liftBaseWith withRun = Good $ withRun runGood restoreM = Good . return fileLen1 :: (MonadThrow m, MonadBaseControl IO m, MonadIO m) => FilePath -> m Int fileLen1 fp = runResourceT $ runConduit $ sourceFile fp .| lengthCE fileLen2 :: (MonadThrow m, MonadBaseControl IO m, MonadIO m) => FilePath -> m Int fileLen2 fp = Lifted.bracket createInternalState closeInternalState $ runInternalState $ runConduit $ sourceFile fp .| lengthCE -- I'm ignoring async exception safety. This needs mask. fileLen3 :: forall m. (MonadThrow m, MonadBaseControl IO m, MonadIO m) => FilePath -> m Int fileLen3 fp = control $ \run -> do istate <- createInternalState res <- run (runInternalState inner istate) `onException` closeInternalState istate closeInternalState istate return res where inner :: ResourceT m Int inner = runConduit $ sourceFile fp .| lengthCE fileLen4 :: forall m. (MonadThrow m, MonadBaseControl IO m, MonadIO m) => FilePath -> m Int fileLen4 fp = control $ \run -> bracket (run createInternalState) (\st -> run $ restoreM st >>= closeInternalState) (\st -> run $ restoreM st >>= runInternalState inner) where inner :: ResourceT m Int inner = runConduit $ sourceFile fp .| lengthCE main :: IO () main = do putStrLn "fileLen1" tryAny (fileLen1 "/usr/share/dict/words") >>= print tryAny (runBad (fileLen1 "/usr/share/dict/words")) >>= print tryAny (runGood (fileLen1 "/usr/share/dict/words")) >>= print putStrLn "fileLen2" tryAny (fileLen2 "/usr/share/dict/words") >>= print tryAny (runBad (fileLen2 "/usr/share/dict/words")) >>= print tryAny (runGood (fileLen2 "/usr/share/dict/words")) >>= print putStrLn "fileLen3" tryAny (fileLen3 "/usr/share/dict/words") >>= print tryAny (runBad (fileLen3 "/usr/share/dict/words")) >>= print tryAny (runGood (fileLen3 "/usr/share/dict/words")) >>= print putStrLn "fileLen4" tryAny (fileLen4 "/usr/share/dict/words") >>= print tryAny (runBad (fileLen4 "/usr/share/dict/words")) >>= print tryAny (runGood (fileLen4 "/usr/share/dict/words")) >>= print

**Bonus exercise** Take the `checkControl`

function I provided above,
and use it in the `Good`

and `Bad`

monads. See what the result is, and
if you can understand why that's the case.