This is a debugging story told completely out of order. In order to understand the ultimate bug, why it seemed to occur arbitrarily, and the ultimate resolution, there's lots of backstory to cover. If you're already deeply familiar with the inner workings of the monad-control package, you can probably look at a demonstration of the bad instance and move on. Otherwise, prepare for a fun ride!

As usual, if you want to play along, we're going to be using
Stack's script interpreter
feature. Just save the snippets contents to a file and run with
`stack filename.hs`

. (It works with any snippet that
begins with `#!/usr/bin/env stack`

.)

Oh, and also: the confusion that this blog post demonstrates is
one of the reasons why I strongly recommend sticking to a ```
ReaderT
env IO
```

monad transformer stack.

Let's start with some broken code (my favorite kind). It uses
the `StateT`

transformer and a function which may throw
a runtime exception.

#!/usr/bin/env stack -- stack --resolver lts-8.12 script import Control.Monad.State.Strict import Control.Exception import Data.Typeable data OddException = OddException !Int -- great name :) deriving (Show, Typeable) instance Exception OddException mayThrow :: StateT Int IO Int mayThrow = do x <- get if odd x then lift $ throwIO $ OddException x else do put $! x + 1 return $ x `div` 2 main :: IO () main = runStateT (replicateM 2 mayThrow) 0 >>= print

Our problem is that we'd like to be able to recover from a
thrown exception. Easy enough we think, we'll just use
`Control.Exception.try`

to attempt to run the
`mayThrow`

action. Unfortunately, if I wrap up
`mayThrow`

with a `try`

, I get this highly
informative error message:

```
Main.hs:21:19: error:
• Couldn't match type ‘IO’ with ‘StateT Integer IO’
Expected type: StateT Integer IO ()
Actual type: IO ()
• In the first argument of ‘runStateT’, namely
‘(replicateM 2 (try mayThrow))’
In the first argument of ‘(>>=)’, namely
‘runStateT (replicateM 2 (try mayThrow)) 0’
In the expression:
runStateT (replicateM 2 (try mayThrow)) 0 >>= print
```

Oh, that makes sense: `try`

is specialized to
`IO`

, and our function is `StateT Int IO`

.
Our first instinct is probably to keep throwing `lift`

calls into our program until it compiles, since `lift`

seems to always fix monad transformer compilation errors. However,
try as you might, you'll never succeed. To understand why, let's
look at the (slightly specialized) type signature for
`try`

:

try :: IO a -> IO (Either OddException a)

If I apply `lift`

to this, I could end up with:

try :: IO a -> StateT Int IO (Either OddException a)

But there's no way to use `lift`

to modify the type
of the `IO a`

input. This is generally the case with the
`lift`

and `liftIO`

functions: they can deal
with monad values that are the *output* of a function, but not
the *input* to the function. (More precisely: the functions
are covariant and work on values in positive positions. We'd need
something contravariant to work on vlaues in negative positions.
You can read more
on this nomenclature in another blog post.)

Huh, I guess we're stuck. But then I remember that
`StateT`

is just defined as ```
newtype StateT s m a =
StateT { runStateT :: s -> m (a,s)}
```

. So maybe I can write
a version of `try`

that works for a `StateT`

using the internals of the type.

tryStateT :: StateT Int IO a -> StateT Int IO (Either OddException a) tryStateT (StateT f) = StateT $ \s0 -> do eres <- try (f s0) return $ case eres of Left e -> (Left e, s0) Right (a, s1) -> (Right a, s1)

Go ahead and plug that into our previous example, and you should get the desired output:

`([Right 0,Left (OddException 1)],1)`

Let's break down in nauseating detail what that
`tryStateT`

function did:

- Unwrap the
`StateT`

data constructor from the provided action to get a function`f :: Int -> IO (a, Int)`

- Construct a new
`StateT`

value on the right hand side by using the`StateT`

data constructor, and capturing the initial state in the value`s0 :: Int`

. - Pass
`s0`

to`f`

to get an action`IO :: (a, Int)`

, which will give the result and the new, updated state. - Wrap
`f s0`

with`try`

to allow us to detect and recover from a runtime exception. `eres`

has type`Either OddException (a, Int)`

, and we pattern match on it.- If we receive a
`Right`

/success value, we simply wrap up the`a`

value in a`Right`

constructor together with the updated state. - If we receive a
`Left`

/exception value, we wrap it up the exception with a`Left`

. However, we need to return*some*new state. Since we have no such state available to us from the action, we return the only thing we can: the initial`s0`

state value.

**Lesson learned** We can use `try`

in a
`StateT`

with some difficulty, but we need to be aware
of what happens to our monadic state.

It turns out that it's trivial to implement the `try`

function in terms of `catch`

, and the `catch`

function in terms of `try`

, at least when sticking to
the `IO`

-specialized versions:

try' :: Exception e => IO a -> IO (Either e a) try' action = (Right <$> action) `catch` (return . Left) catch' :: Exception e => IO a -> (e -> IO a) -> IO a catch' action onExc = do eres <- try action case eres of Left e -> onExc e Right a -> return a

It turns out that by just changing the type signatures and
replacing `try`

with `tryStateT`

, we can do
the same thing for `StateT`

:

catchStateT :: Exception e => StateT Int IO a -> (e -> StateT Int IO a) -> StateT Int IO a catchStateT action onExc = do eres <- tryStateT action case eres of Left e -> onExc e Right a -> return a

**NOTE** Pay close attention to that type signature, and
think about how monadic state is being shuttled through this
function.

Well, if we can implement `catchStateT`

in terms of
`tryStateT`

, surely we can implement it directly as
well. Let's do the most straightforward thing I can think of (or at
least the thing that continues my narrative here):

catchStateT :: Exception e => StateT Int IO a -> (e -> IO a) -> StateT Int IO a catchStateT (StateT action) onExc = StateT $ \s0 -> action s0 `catch` \e -> do a <- onExc e return (a, s0)

Here, we're basing our implementation on top of the
`catch`

function instead of the `try`

function. We do the same unwrap-the-StateT, capture-the-s0 trick we
did before. Now, in the lambda we've created for the
`catch`

call, we pass the `e`

exception value
to the user-supplied `onExc`

function, and then like
`tryStateT`

wrap up the result in a tuple with the
initial `s0`

.

Who noticed the difference in type signature? Instead of ```
e
-> StateT Int IO a
```

, our `onExc`

handler has
type `e -> IO a`

. I told you to pay attention to how
the monadic states were being shuttled around; let's analyze
it:

- In the first function, we use
`tryStateT`

, which as we mentioned will reconstitute the original`s0`

state when it returns. If the action succeeded, nothing else happens. But in the exception case, that original`s0`

is now passed into the`onExc`

function, and the final monadic state returned will be the result of the`onExc`

function. - In the second function, we never give the
`onExc`

function a chance to play with monadic state, since it just lives in`IO`

. So we always return the original state at the end if an exception occurred.

Which behavior is best? I think most people would argue that the
first function is better: it's more general in allowing
`onExc`

to access and modify the monadic state, and
there's not really any chance for confusion. Fair enough, I'll buy
that argument (that I just made on behalf of all of my
readers).

**Bonus exercise** Modify this implementation of
`catchStateT`

to have the same type signature as the
original one.

This is fun, let's keep reimplementing functions from
`Control.Exception`

! This time, let's do
`finally`

, which will ensure that some action (usually a
cleanup action) is run after an initial action, regardless of
whether an exception was thrown.

finallyStateT :: StateT Int IO a -> IO b -> StateT Int IO a finallyStateT (StateT action) cleanup = StateT $ \s0 -> action s0 `finally` cleanup

That was really easy. Ehh, but one problem: look at that type
signature! We just agreed (or I agreed for you) that in the case of
`catch`

, it was better to have the second argument
*also* live in `StateT Int IO`

. Here, our argument
lives in `IO`

. Let's fix that:

finallyStateT :: StateT Int IO a -> StateT Int IO b -> StateT Int IO a finallyStateT (StateT action) (StateT cleanup) = StateT $ \s0 -> action s0 `finally` cleanup s0

Huh, also pretty simple. Let's analyze the monadic state
behavior here: our cleanup action is given the initial state,
regardless of the result of `action s0`

. That means
that, even if the action succeeded, we'll ignore the updated state.
Furthermore, because `finally`

ignores the result of the
second argument, we will ignore any updated monadic state. Want to
see what I mean? Try this out:

#!/usr/bin/env stack -- stack --resolver lts-8.12 script import Control.Exception import Control.Monad.State.Strict finallyStateT :: StateT Int IO a -> StateT Int IO b -> StateT Int IO a finallyStateT (StateT action) (StateT cleanup) = StateT $ \s0 -> action s0 `finally` cleanup s0 action :: StateT Int IO () action = modify (+ 1) cleanup :: StateT Int IO () cleanup = do get >>= lift . print modify (+ 2) main :: IO () main = execStateT (action `finallyStateT` cleanup) 0 >>= print

You may expect the output of this to be the numbers 1 and 3, but
in fact the output is 0 and 1: `cleanup`

looks at the
initial state value of 0, and its `+ 2`

modification is
thrown away. So can we implement a version of our function that
keeps the state? Sure (slightly simplified to avoid async
exception/mask noise):

finallyStateT :: StateT Int IO a -> StateT Int IO b -> StateT Int IO a finallyStateT (StateT action) (StateT cleanup) = StateT $ \s0 -> do (a, s1) <- action s0 `onException` cleanup s0 (_b, s2) <- cleanup s1 return (a, s2)

This has the expected output of 1 and 3. Looking at how it
works: we follow our same tricks, and pass in `s0`

to
`action`

. If an exception is thrown there, we once again
pass in `s0`

to `cleanup`

and ignore its
updated state (since we have no choice). However, in the success
case, we now pass in the *updated* state (`s1`

) to
`cleanup`

. And finally, our resulting state is the
result of `cleanup`

(`s2`

) instead of the
`s1`

produced by `action`

.

We have three different implementations of
`finallyStateT`

and two different type signatures. Let's
compare them:

- The first one (the
`IO`

version) has the advantage that its type tells us*exactly*what's happening: the cleanup has no access to the state at all. However, you can argue like we did with`catchStateT`

that this is limiting and not what people would expect the type signature to be. - The second one (use the initial state for
`cleanup`

and then throw away its modified state) has the advantage that it's logically consistent: whether`cleanup`

is called from a success or exception code path, it does the exact same thing. On the other hand, you can argue that it is surprising behavior that state updates that*can*be preserved are being thrown away. - The third one (keep the state) has the reversed arguments of the second one.

So unlike `catchStateT`

, I would argue that there's
not nearly as clear a winner with `finallyStateT`

. Each
approach has its relative merits.

One final point that seems almost not worth mentioning (hint:
epic foreshadowment incoming). The first version (`IO`

specialized) has an additional benefit of being ever-so-slightly
more efficient than the other two, since it doesn't need to deal
with the additional monadic state in `cleanup`

. With a
simple monad transformer like `StateT`

this performance
difference is hardly even worth thinking about. However, if we were
in a tight inner loop, and our monad stack was significantly more
complicated, you could imagine a case where the performance
difference was significant.

It's great that we understand `StateT`

so well, but
can we do anything for other transformers? It turns out that, yes,
we can for many transformers. (An exception is continuation-based
transformers, which you can read a bit about in passing in my ResourceT blog post from
last week.) Let's look at a few other examples of
`finally`

:

import Control.Exception import Control.Monad.Writer import Control.Monad.Reader import Control.Monad.Except import Data.Monoid finallyWriterT :: Monoid w => WriterT w IO a -> WriterT w IO b -> WriterT w IO a finallyWriterT (WriterT action) (WriterT cleanup) = WriterT $ do (a, w1) <- action `onException` cleanup (_b, w2) <- cleanup return (a, w1 <> w2) finallyReaderT :: ReaderT r IO a -> ReaderT r IO b -> ReaderT r IO a finallyReaderT (ReaderT action) (ReaderT cleanup) = ReaderT $ \r -> do a <- action r `onException` cleanup r _b <- cleanup r return a finallyExceptT :: ExceptT e IO a -> ExceptT e IO b -> ExceptT e IO a finallyExceptT (ExceptT action) (ExceptT cleanup) = ExceptT $ do ea <- action `onException` cleanup eb <- cleanup return $ case (ea, eb) of (Left e, _) -> Left e (Right _a, Left e) -> Left e (Right a, Right _b) -> Right a

The `WriterT`

case is very similar to the
`StateT`

case, except (1) there's no initial state
`s0`

to contend with, and (2) instead of receiving an
updated `s2`

state from `cleanup`

, we need to
monoidally combine the `w1`

and `w2`

values.
The `ReaderT`

case is *also* very similar to
`StateT`

, but in the opposite way: we receive an
immutable environment `r`

which is passed into all
functions, but there is no updated state. To put this in other
words: `WriterT`

has no *context* but has
*mutable monadic state*, whereas `ReaderT`

has a
context but no mutable monadic state. `StateT`

, by
contrast, has both. (This is important to understand, so reread it
a few times to get comfortable with the concept.)

The `ExceptT`

case is interesting: it has no context
(like `WriterT`

), but it *does* have mutable
monadic state, just not like `StateT`

and
`WriterT`

. Instead of returning an extra value with each
result (as a product), `ExceptT`

returns either a result
value or an `e`

value (as a sum). The `case`

expression at the end of `finallyExceptT`

is very
informative: we need to figure out how to combine the various
monadic states together. Our implementation here says that if
`action`

returns `e`

, we take that result.
Otherwise, if `cleanup`

fails, we take *that*
value. And if they both return `Right`

values, then we
use `action`

's result. But there are at least two other
valid choices:

- Prefer
`cleanup`

's`e`

value to`action`

's`e`

value, if both are available. - Completely ignore the
`e`

value returned by`cleanup`

, and just use`action`

's result.

There's also a fourth, invalid option: if `action`

returns a `Left`

, return that immediately and don't call
`cleanup`

. This has been a perenniel source of bugs in
many libraries dealing with exceptions in monad transformers like
`ErrorT`

, `ExceptT`

, and
`EitherT`

. This invalidates the contract of
`finally`

, namely that `cleanup`

will always
be run. I've seen some arguments for why this can make sense, but I
consider it nothing more than a buggy implementation.

And finally, like with `StateT`

, we could avoid all
of these questions for `ExceptT`

if we just modify our
type signature to use `IO b`

for
`cleanup`

:

finallyExceptT :: ExceptT e IO a -> IO b -> ExceptT e IO a finallyExceptT (ExceptT action) cleanup = ExceptT $ do ea <- action `onException` cleanup _b <- cleanup return ea

So our takeaway: we can implement `finally`

for
various monad transformers. In some cases this leads to questions
of semantics, just like with `StateT`

. And all of these
transformers fall into a pattern of optionally capturing some
initial context, and optionally shuttling around some monadic
state.

(And no, I haven't forgotten that the title of this blog post
talks about `bracket`

. We're getting there, ever so
slowly. I hope I've piqued your curiosity.)

It's wonderful that we can implement all of these functions that
take monad transformers as arguments. But do any of us actually
want to go off and implement `catch`

, `try`

,
`finally`

, `forkIO`

, `timeout`

,
and a dozen other functions for every possible monad transformer
stack imagineable? I doubt it. So just as we have
`MonadTrans`

and `MonadIO`

for dealing with
transformers in output/positive position, we can construct some
kind of typeclass that handles the two concepts we mentioned above:
capture the context, and deal with the monadic state.

Let's start by playing with this for just
`StateT`

.

#!/usr/bin/env stack -- stack --resolver lts-8.12 script {-# LANGUAGE RankNTypes #-} {-# LANGUAGE ScopedTypeVariables #-} import Control.Exception import Control.Monad.State.Strict type Run s = forall b. StateT s IO b -> IO (b, s) capture :: forall s a. (Run s -> IO a) -> StateT s IO a capture withRun = StateT $ \s0 -> do let run :: Run s run (StateT f) = f s0 a <- withRun run return (a, s0) restoreState :: (a, s) -> StateT s IO a restoreState stateAndResult = StateT $ \_s0 -> return stateAndResult finally1 :: StateT s IO a -> IO b -> StateT s IO a finally1 action cleanup = do x <- capture $ \run -> run action `finally` cleanup restoreState x finally2 :: StateT s IO a -> StateT s IO b -> StateT s IO a finally2 action cleanup = do x <- capture $ \run -> run action `finally` run cleanup restoreState x -- Not async exception safe! finally3 :: StateT s IO a -> StateT s IO b -> StateT s IO a finally3 action cleanup = do x <- capture $ \run -> run action `onException` run cleanup a <- restoreState x _b <- cleanup return a main :: IO () main = do flip evalStateT () $ lift (putStrLn "here1") `finally1` putStrLn "here2" flip evalStateT () $ lift (putStrLn "here3") `finally2` lift (putStrLn "here4") flip evalStateT () $ lift (putStrLn "here5") `finally2` lift (putStrLn "here6")

That's a lot, let's step through it slowly:

type Run s = forall b. StateT s IO b -> IO (b, s)

This is a helper type to make the following bit simpler. It
represents the concept of capturing the initial state in a general
manner. Given an action living in our transformer, it turns an
action in our base monad, returning the entire monadic state with
the return value (i.e., `(b, s)`

instead of just
`b`

). This allows use to define our `capture`

function:

capture :: forall s a. (Run s -> IO a) -> StateT s IO a capture withRun = StateT $ \s0 -> do let run :: Run s run (StateT f) = f s0 a <- withRun run return (a, s0)

This function says "you give me some function that needs to be
able to run monadic actions with the initial context, and I'll give
it that initial context running function (`Run s`

)." The
implementation isn't too bad: we just capture the `s0`

,
create a `run`

function out of it, pass that into the
user-provided argument, and then return the result with the
original state.

Now we need some way to update the monadic state based on a
result value. We call it `restoreState`

:

restoreState :: (a, s) -> StateT s IO a restoreState stateAndResult = StateT $ \_s0 -> return stateAndResult

Pretty simple too: we ignore our original monadic state and
replace it with the state contained in the argument. Next we use
these two functions to implement three versions of
`finally`

. The first two are able to reuse the
`finally`

from `Control.Exception`

. However,
both of them suffer from the inability to retain monadic state. Our
third implementation fixes that, at the cost of having to
reimplement the logic of `finally`

. And as my comment
there mentions, our implementation is not in fact async exception
safe.

So all of our original trade-offs apply from our initial
`StateT`

discussion, but now there's an additional
downside to option 3: it's significantly more complicated to
implement correctly.

Alright, we've established that it's possible to capture this
idea for `StateT`

. Let's generalize to a typeclass.
We'll need three components:

- A capture function. We'll call it
`liftIOWith`

, to match nomenclature in monad-control. - A restore function, which we'll call
`restoreM`

. - An
*associated type*(type family) to represent what the monadic state for the given monad stack is.

We end up with:

type RunInIO m = forall b. m b -> IO (StM m b) class MonadIO m => MonadIOControl m where type StM m a liftIOWith :: (RunInIO m -> IO a) -> m a restoreM :: StM m a -> m a

Let's write an instance for `IO`

:

instance MonadIOControl IO where type StM IO a = a liftIOWith withRun = withRun id restoreM = return

The `type StM IO a = a`

says that, for an
`IO`

action returning `a`

, the full monadic
state is just `a`

. In other words, there is no
additional monadic state hanging around. That's good, as we know
that there isn't. `liftIOWith`

is able to just use
`id`

as the `RunInIO`

function, since you can
run an `IO`

action in `IO`

directly. And
finally, since there is no monadic state to update,
`restoreM`

just wraps up the result value in
`IO`

via `return`

. (More foreshadowment: what
this instance is supposed to look like is actually at the core of
the bug this blog post will eventually talk about.)

Alright, let's implement this instance for ```
StateT s
IO
```

:

instance MonadIOControl (StateT s IO) where type StM (StateT s IO) a = (a, s) liftIOWith withRun = StateT $ \s0 -> do a <- withRun $ \(StateT f) -> f s0 return (a, s0) restoreM stateAndResult = StateT $ \_s0 -> return stateAndResult

This is basically identical to the functions we defined above,
so I won't dwell on it here. But here's an interesting observation:
the same way we define `MonadIO`

instance as
`instance MonadIO m => MonadIO (StateT s m)`

, it
would be great to do the same thing for
`MonadIOControl`

. And, in fact, we can do just that!

instance MonadIOControl m => MonadIOControl (StateT s m) where type StM (StateT s m) a = StM m (a, s) liftIOWith withRun = StateT $ \s0 -> do a <- liftIOWith $ \run -> withRun $ \(StateT f) -> run $ f s0 return (a, s0) restoreM x = StateT $ \_s0 -> restoreM x

We use the underlying monad's `liftIOWith`

and
`restoreM`

functions within our own definitions, and
thereby get context and state passed up and down the stack as
needed. Alright, let's go ahead and do this for all of the
transformers we've been discussing:

#!/usr/bin/env stack -- stack --resolver lts-8.12 script {-# LANGUAGE RankNTypes #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE TypeFamilies #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE UndecidableInstances #-} import Control.Exception import Control.Monad.State.Strict import Control.Monad.Writer import Control.Monad.Reader import Control.Monad.Except import Data.Monoid import Data.IORef type RunInIO m = forall b. m b -> IO (StM m b) class MonadIO m => MonadIOControl m where type StM m a liftIOWith :: (RunInIO m -> IO a) -> m a restoreM :: StM m a -> m a instance MonadIOControl IO where type StM IO a = a liftIOWith withRun = withRun id restoreM = return instance MonadIOControl m => MonadIOControl (StateT s m) where type StM (StateT s m) a = StM m (a, s) liftIOWith withRun = StateT $ \s0 -> do a <- liftIOWith $ \run -> withRun $ \(StateT f) -> run $ f s0 return (a, s0) restoreM x = StateT $ \_s0 -> restoreM x instance (MonadIOControl m, Monoid w) => MonadIOControl (WriterT w m) where type StM (WriterT w m) a = StM m (a, w) liftIOWith withRun = WriterT $ do a <- liftIOWith $ \run -> withRun $ \(WriterT f) -> run f return (a, mempty) restoreM x = WriterT $ restoreM x instance MonadIOControl m => MonadIOControl (ReaderT r m) where type StM (ReaderT r m) a = StM m a liftIOWith withRun = ReaderT $ \r -> liftIOWith $ \run -> withRun $ \(ReaderT f) -> run $ f r restoreM x = ReaderT $ \r -> restoreM x instance MonadIOControl m => MonadIOControl (ExceptT e m) where type StM (ExceptT e m) a = StM m (Either e a) liftIOWith withRun = ExceptT $ do a <- liftIOWith $ \run -> withRun $ \(ExceptT f) -> run f return $ Right a restoreM x = ExceptT $ restoreM x control :: MonadIOControl m => (RunInIO m -> IO (StM m a)) -> m a control f = do x <- liftIOWith f restoreM x checkControl :: MonadIOControl m => m () checkControl = control $ \run -> do ref <- newIORef (0 :: Int) let ensureIs :: MonadIO m => Int -> m () ensureIs expected = liftIO $ do putStrLn $ "ensureIs " ++ show expected curr <- atomicModifyIORef ref $ \curr -> (curr + 1, curr) unless (curr == expected) $ error $ show ("curr /= expected", curr, expected) ensureIs 0 Control.Exception.mask $ \restore -> do ensureIs 1 res <- restore (ensureIs 2 >> run (ensureIs 3) `finally` ensureIs 4) ensureIs 5 return res main :: IO () main = do checkControl runStateT checkControl () >>= print runWriterT checkControl >>= (print :: ((), ()) -> IO ()) runReaderT checkControl () runExceptT checkControl >>= (print :: Either () () -> IO ())

I encourage you to inspect each of the instances above and make
sure you're comfortable with their implementation. I've added a
function here, `checkControl`

, as a basic sanity check
of our implementation. We start with the `control`

helper function, which runs some action with a `RunInIO`

argument, and then restores the monadic state. Then we use this
function in `checkControl`

to ensure that a series of
actions are all run in the correct order. As you can see, all of
our test monads pass (again, *foreshadowment*).

The real monad-control package looks pretty similar to this, except:

- Instead of
`MonadIOControl`

, which is hard-coded to using`IO`

as a base monad, it provides a`MonadBaseControl`

typeclass, which allows arbitrary base monads (like`ST`

or`STM`

). - Just as
`MonadBaseControl`

is an analogue of`MonadIO`

, the package provides`MonadTransControl`

as an analogue of`MonadTrans`

, allowing you to unwrap one layer in a monad stack.

With all of this exposition out of the way—likely the longest exposition I've ever written in any blog post—we can start dealing with the actual bug. I'll show you the full context eventually, but I was asked to help debug a function that looked something like this:

fileLen1 :: (MonadThrow m, MonadBaseControl IO m, MonadIO m) => FilePath -> m Int fileLen1 fp = runResourceT $ runConduit $ sourceFile fp .| lengthCE

This is fairly common in Conduit code. We're going to use
`sourceFile`

, which needs to allocate some resources.
Since we can't safely allocate resources from within a Conduit
pipeline, we start off with `runResourceT`

to allow
Conduit to register cleanup actions. (This combination is so common
that we have a helper function ```
runConduitRes = runResourceT .
runConduit
```

.)

Unfortunately, this innocuous-looking like of code was generating an error message:

`Control.Monad.Trans.Resource.register': The mutable state is being accessed after cleanup. Please contact the maintainers.`

The "Please contact the maintainers." line should probably be removed from the resourcet package; it was from back in a time when we thought this bug was most likely to indicate an implementation bug within resourcet. That's no longer the case... which hopefully this debugging adventure will help demonstrate.

Anyway, as last week's blog post on ResourceT explained,
`runResourceT`

creates a mutable variable to hold a list
of cleanup actions, allows the inner action to register cleanup
values into that mutable variable, and then when
`runResourceT`

is exiting, it calls all those cleanup
actions. And as a last sanity check, it replaces the value inside
that mutable variable with a special value indicating that the
state has already been closed, and it is therefore invalid to
register further cleanup actions.

In well-behaved code, the structure of our
`runResourceT`

function should prevent the mutable state
from being accessible after it's closed, though I mention some
cases last week that could cause that to happen (specifically,
misuse of concurrency and the `transPipe`

function).
However, after thoroughly exploring the codebase, I could find no
indication that either of these common bugs had occurred.

Internally, `runResourceT`

is essentially a
`bracket`

call, using the
`createInternalState`

function to allocate the mutable
variable, and `closeInternalState`

to clean it up. So I
figured I could get a bit more information about this bug by using
the `bracket`

function from
`Control.Exception.Lifted`

and implementing:

fileLen2 :: (MonadThrow m, MonadBaseControl IO m, MonadIO m) => FilePath -> m Int fileLen2 fp = Lifted.bracket createInternalState closeInternalState $ runInternalState $ runConduit $ sourceFile fp .| lengthCE

Much to my chagrin, the bug disappeared! Suddenly the code
worked perfectly. Beginning to question my sanity, I decided to
look at the implementation of `runResourceT`

, and found
this:

runResourceT :: MonadBaseControl IO m => ResourceT m a -> m a runResourceT (ResourceT r) = control $ \run -> do istate <- createInternalState E.mask $ \restore -> do res <- restore (run (r istate)) `E.onException` stateCleanup ReleaseException istate stateCleanup ReleaseNormal istate return res

Ignoring the fact that we differentiate between exception and
normal cleanup in the `stateCleanup`

function, I was
struck by one question: why did I decide to implement this with
`control`

in a manual, error-prone way instead of using
the `bracket`

function directly? I began to worry that
there was a bug in this implementation leading to all of the
problems.

However, after reading through this implementation many times, I
convinced myself that it was, in fact, correct. And then I realized
why I had done it this way. Both `createInternalState`

and `stateCleanup`

are functions that can live in
`IO`

directly, without any need of a monad transformer
state. The only function that needed the monad transformer logic
was that contained in the `ResourceT`

itself.

If you remember our discussion above, there were two major
advantages of the implementation of `finally`

which
relied upon `IO`

for the cleanup function instead of
using the monad transformer state:

- It was much more explicit about how monadic state was going to be handled.
- It gave a slight performance advantage.

With the downside being that the type signature wasn't quite what people normally expected. Well, that downside didn't apply in my case: I was working on an internal function in a library, so I was free to ignore what a user-friendly API would look like. The advantage of explicitness around monadic state certainly appealed in a library that was so sensitive to getting things right. And given how widely used this function is, and the deep monadic stacks it was sometimes used it, any performance advantage was worth pursuing.

Alright, I felt good about the fact that
`runResourceT`

was implemented correctly. Just to make
sure I wasn't crazy, I reimplemented `fileLen`

to use an
explicit `control`

instead of
`Lifted.bracket`

, and the bug reappeared:

-- I'm ignoring async exception safety. This needs mask. fileLen3 :: forall m. (MonadThrow m, MonadBaseControl IO m, MonadIO m) => FilePath -> m Int fileLen3 fp = control $ \run -> do istate <- createInternalState res <- run (runInternalState inner istate) `onException` closeInternalState istate closeInternalState istate return res where inner :: ResourceT m Int inner = runConduit $ sourceFile fp .| lengthCE

And as one final sanity check, I implemented
`fileLen4`

to use the generalized style of
`bracket`

, where the allocation and cleanup functions
live in the monad stack instead of just `IO`

, and as
expected the bug disappeared again. (Actually, I didn't really do
this. I'm doing it now for the purpose of this blog post.)

fileLen4 :: forall m. (MonadThrow m, MonadBaseControl IO m, MonadIO m) => FilePath -> m Int fileLen4 fp = control $ \run -> bracket (run createInternalState) (\st -> run $ restoreM st >>= closeInternalState) (\st -> run $ restoreM st >>= runInternalState inner) where inner :: ResourceT m Int inner = runConduit $ sourceFile fp .| lengthCE

Whew, OK! So it turns out that my blog post title was correct:
this *is* a tale of two brackets. And somehow, one of them
triggers a bug, and one of them doesn't. But I still didn't know
quite how that happened.

Another member of the team tracked down the ultimate problem to
a datatype that looked like this (though not actually named
`Bad`

, that would have been too obvious):

newtype Bad a = Bad { runBad :: IO a } deriving (Functor, Applicative, Monad, MonadIO, MonadThrow, MonadBase IO) instance MonadBaseControl IO Bad where type StM Bad a = IO a liftBaseWith withRun = Bad $ withRun $ return . runBad restoreM = Bad

That's the kind of code that can easily pass a code review
without anyone noticing a thing. With all of the context from this
blog post, you may be able to understand why I've called this type
`Bad`

. Go ahead and give it a few moments to try and
figure it out.

OK, ready to see how this plays out? The `StM Bad a`

associated type is supposed to contain the result value of the
underlying monad, together with any state introduced by this monad.
Since we just have a newtype around `IO`

, there should
be no monadic state, and we should just have `a`

.
However, we've *actually* defined it as `IO a`

,
which means "my monadic state for a value `a`

is an
`IO`

action which will return an `a`

." The
implementation of `liftBaseWith`

and
`restoreM`

are simply in line with making the types work
out.

Let's look at `fileLen3`

understanding that this is
the instance in question. I'm also going to expand the
`control`

function to make it easier to see what's
happening.

res <- liftBaseWith $ \run -> do istate <- createInternalState res <- run (runInternalState inner istate) `onException` closeInternalState istate closeInternalState istate return res restoreM res

If we play it a little loose with newtype wrappers, we can
substitute in the implementations of `liftBaseWith`

and
`restoreM`

to get:

res <- Bad $ do let run = return . runBad istate <- createInternalState res <- run (runInternalState inner istate) `onException` closeInternalState istate closeInternalState istate return res Bad res

Let's go ahead and substitute in our `run`

function
in the one place it's used:

res <- Bad $ do istate <- createInternalState res <- return (runBad (runInternalState inner istate)) `onException` closeInternalState istate closeInternalState istate return res Bad res

If you look at the code `return x `onException` foo`

,
it's pretty easy to establish that `return`

itself will
never throw an exception in `IO`

, and therefore the
`onException`

it useless. In other words, the code is
equivalent to just `return x`

. So again
substituting:

res <- Bad $ do istate <- createInternalState res <- return (runBad (runInternalState inner istate)) closeInternalState istate return res Bad res

And since `foo <- return x`

is just ```
let foo
= x
```

, we can turn this into:

res <- Bad $ do istate <- createInternalState closeInternalState istate return (runBad (runInternalState inner istate)) Bad res

And then:

Bad $ do istate <- createInternalState closeInternalState istate Bad (runBad (runInternalState inner istate))

And finally, just to drive the point home:

istate <- Bad createInternalState Bad $ closeInternalState istate runInternalState inner istate

So who wants to take a guess why the mutable variable was closed
before we ever tried to register? Because *that's exactly what
our MonadBaseControl instance said!* The problem is
that instead of our monadic state just being some value, it was the

`closeInternalState`

. Oops.Now let's try to understand why `fileLen4`

worked,
despite the broken `MonadBaseControl`

instance. Again,
starting with the original code after replacing
`control`

with `liftBaseWith`

and
`restoreM`

:

res <- liftBaseWith $ \run -> bracket (run createInternalState) (\st -> run $ restoreM st >>= closeInternalState) (\st -> run $ restoreM st >>= runInternalState inner) restoreM res

This turns into:

res <- Bad $ bracket (return $ runBad createInternalState) (\st -> return $ runBad $ Bad st >>= closeInternalState) (\st -> return $ runBad $ Bad st >>= runInternalState inner) Bad res

Since this case is a bit more involved than the previous one,
let's strip off the noise of `Bad`

and
`runBad`

calls, since they're just wrapping/unwrapping a
newtype:

res <- bracket (return createInternalState) (\st -> return $ st >>= closeInternalState) (\st -> return $ st >>= runInternalState inner) res

To decompose this mess, let's look at the actual implementation
of `bracket`

from `base`

:

bracket before after thing = mask $ \restore -> do a <- before r <- restore (thing a) `onException` after a _ <- after a return r

We're going to ignore async exceptions for now, and therefore
just mentally delete the `mask $ \restore`

bit. We end
up with:

res <- do a <- return createInternalState r <- return (a >>= runInternalState inner) `onException` return (a >>= closeInternalState) _ <- return (a >>= closeInternalState) return r res

As above, we know that our ```
return x `onException`
foo
```

will never actually trigger the exception case. Also,
`a <- return x`

is the same as ```
let a =
x
```

. So we can simplify to:

res <- do let a = createInternalState let r = a >>= runInternalState inner _ <- return (a >>= closeInternalState) return r res

Also, `_ <- return x`

has absolutely no impact at
all, so we can delete that line (and any mention of
`closeInternalState`

):

res <- do let a = createInternalState let r = a >>= runInternalState inner return r res

And then with a few more simply conversions, we end up with:

createInternalState >>= runInternalState inner

No wonder this code "worked": it never bothered trying to clean
up! This could have easily led to complete leaking of resources in
the application. Only the fact that our `runResourceT`

function thankfully stressed the code in a different way did we
reveal the problem.

It's certainly possible to define a correct newtype wrapper
around `IO`

:

newtype Good a = Good { runGood :: IO a } deriving (Functor, Applicative, Monad, MonadIO, MonadThrow, MonadBase IO) instance MonadBaseControl IO Good where type StM Good a = a liftBaseWith withRun = Good $ withRun runGood restoreM = Good . return

Unfortunately we can't simply use
`GeneralizedNewtypeDeriving`

to make this instance due
to the associated type family. But the explicitness here helps us
understand what we did wrong before. Note that our ```
type StM
Good a
```

is just `a`

, not `IO a`

. We
then implement the helper functions in terms of that. If you go
through the same substitution exercise I did above, you'll see
that—instead of passing around values which contain the actions to
actually perform—our `fileLen3`

and
`fileLen4`

functions will be performing the actions at
the appropriate time.

I'm including the full test program at the end of this post for you to play with.

So that blog post was certainly all over the place. I hope the
primary thing you take away from it is a deeper understanding of
how monad transformer stacks interact with operations in the base
monad, and how monad-control works in general. In particular, next
time you call `finally`

on some five-layer-deep stack,
maybe you'll think twice about the implication of calling
`modify`

or `tell`

in your cleanup
function.

Another possible takeaway you may have is "Haskell's crazy
complicated, this bug could happen to anyone, and it's almost
undetectable." It turns out that there's a really simple workaround
for that: stick to standard monad transformers whenever possible.
monad-control is a phenomonal library, but I don't think most
people should ever have to interact with it directly. Like async
exceptions and `unsafePerformIO`

, there are parts of our
library ecosystem that require them, but you should stick to
higher-level libraries that hide that insanity from you, the same
way we use higher-level languages to avoid having to write
assembly.

Finally, having to think about all of the monadic state stuff in
my code gives me a headache. It's possible for us to have a library
like `lifted-base`

, but which constrains functions to
only taking one argument in the `m`

monad and the rest
in `IO`

to avoid the multiple-state stuff. However, my
preferred solution is to avoid wherever possible monad transformers
that introduce monadic state, and stick to `ReaderT`

like things for the majority of my application. (Yes, this is
another pitch for my ReaderT design
pattern.)

#!/usr/bin/env stack -- stack --resolver lts-8.12 script {-# LANGUAGE GeneralizedNewtypeDeriving #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE TypeFamilies #-} {-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE ScopedTypeVariables #-} import Control.Monad.Trans.Control import Control.Monad.Trans.Resource import Control.Exception.Safe import qualified Control.Exception.Lifted as Lifted import Conduit newtype Bad a = Bad { runBad :: IO a } deriving (Functor, Applicative, Monad, MonadIO, MonadThrow, MonadBase IO) instance MonadBaseControl IO Bad where type StM Bad a = IO a liftBaseWith withRun = Bad $ withRun $ return . runBad restoreM = Bad newtype Good a = Good { runGood :: IO a } deriving (Functor, Applicative, Monad, MonadIO, MonadThrow, MonadBase IO) instance MonadBaseControl IO Good where type StM Good a = a liftBaseWith withRun = Good $ withRun runGood restoreM = Good . return fileLen1 :: (MonadThrow m, MonadBaseControl IO m, MonadIO m) => FilePath -> m Int fileLen1 fp = runResourceT $ runConduit $ sourceFile fp .| lengthCE fileLen2 :: (MonadThrow m, MonadBaseControl IO m, MonadIO m) => FilePath -> m Int fileLen2 fp = Lifted.bracket createInternalState closeInternalState $ runInternalState $ runConduit $ sourceFile fp .| lengthCE -- I'm ignoring async exception safety. This needs mask. fileLen3 :: forall m. (MonadThrow m, MonadBaseControl IO m, MonadIO m) => FilePath -> m Int fileLen3 fp = control $ \run -> do istate <- createInternalState res <- run (runInternalState inner istate) `onException` closeInternalState istate closeInternalState istate return res where inner :: ResourceT m Int inner = runConduit $ sourceFile fp .| lengthCE fileLen4 :: forall m. (MonadThrow m, MonadBaseControl IO m, MonadIO m) => FilePath -> m Int fileLen4 fp = control $ \run -> bracket (run createInternalState) (\st -> run $ restoreM st >>= closeInternalState) (\st -> run $ restoreM st >>= runInternalState inner) where inner :: ResourceT m Int inner = runConduit $ sourceFile fp .| lengthCE main :: IO () main = do putStrLn "fileLen1" tryAny (fileLen1 "/usr/share/dict/words") >>= print tryAny (runBad (fileLen1 "/usr/share/dict/words")) >>= print tryAny (runGood (fileLen1 "/usr/share/dict/words")) >>= print putStrLn "fileLen2" tryAny (fileLen2 "/usr/share/dict/words") >>= print tryAny (runBad (fileLen2 "/usr/share/dict/words")) >>= print tryAny (runGood (fileLen2 "/usr/share/dict/words")) >>= print putStrLn "fileLen3" tryAny (fileLen3 "/usr/share/dict/words") >>= print tryAny (runBad (fileLen3 "/usr/share/dict/words")) >>= print tryAny (runGood (fileLen3 "/usr/share/dict/words")) >>= print putStrLn "fileLen4" tryAny (fileLen4 "/usr/share/dict/words") >>= print tryAny (runBad (fileLen4 "/usr/share/dict/words")) >>= print tryAny (runGood (fileLen4 "/usr/share/dict/words")) >>= print

**Bonus exercise** Take the `checkControl`

function I provided above, and use it in the `Good`

and
`Bad`

monads. See what the result is, and if you can
understand why that's the case.

Subscribe to our blog via email

Email subscriptions come from our Atom feed and are handled by Blogtrottr. You will only receive notifications of blog posts, and can unsubscribe any time.

Do you like this blog post and need help with Next Generation Software Engineering, Platform Engineering or Blockchain & Smart Contracts? Contact us.