Threading and Gtk2Hs

So you're writing a Gtk2Hs application and you need to do some threading. Perhaps your program needs to access the network, call out to a database, deal with the file system, interface with an asynchronous library, or run a long background computation of any kind, and you want your interface to stay snappy while the magic is happening. Forking a few threads to do the hard stuff is a natural way to handle the problem, and you want to know how to do it right.

You're in the right place.

Best Practice: Just the Punchline

If you want to get up and running with a threaded application as quickly as possible, read this section and start coding. There are only two rules:

Make all Gtk calls from the main thread. (Don't worry: callbacks registered on the main thread will be run on the main thread.) If other threads need to affect the interface, use postGUIAsync or postGUISync to send your code to the main thread.
Link your program with the threaded runtime system by passing GHC the -threaded option at link time.
If you plan on using the Win32 API, OpenGL, or any other library that uses thread-local state, read the rest of this page, too. (Okay, there's three. But this one doesn't count.)

That's all you need to do to have a correct, concurrent GUI. Let's build a complete, runnable example (simple-example.hs) using the best practices outlined above. It will be a very simple GUI: there will be one button that forks a thread to think deep thoughts, and one checkbox that we can toggle to double-check that the GUI is still responding.

~% cat >simple-example.hs
import Control.Concurrent
import Graphics.UI.Gtk
import System.Random

verifyCommand  = "Click me to verify that the GUI is still responding."
thinkCommand   = "Think really hard."
thinkObedience = "Thinking really hard..."

main = do
    -- Now for some setup. We're about to make some Gtk calls, so we should
    -- ask ourselves: are we either on the main thread or calling postGUI?
    -- Yes, we're on the main thread. Good to go.
    initGUI
    window <- windowNew
    vbox   <- vBoxNew True 2
    check  <- checkButtonNewWithLabel verifyCommand
    button <- buttonNewWithLabel thinkCommand

    -- Next up: we'll make the button actually do something.
    button `on` buttonActivated $ do
        -- Gtk calls are still OK inside this callback! It will be called by
        -- mainGUI, which we're calling below on the main thread...
        forkIO $ do
            -- ...but Gtk calls in this forked thread are NOT OK. So we'll
            -- use postGUIAsync.
            n <- randomRIO (1e8, 2e8) :: IO Double
            last [0..n] `seq` postGUIAsync (buttonSetLabel button thinkCommand)
        buttonSetLabel button thinkObedience
    onDestroy window mainQuit

    -- Now we just need to plumb everything together and start the GUI.
    containerAdd vbox check
    containerAdd vbox button
    containerAdd window vbox
    widgetShowAll window
    mainGUI
~% ghc -threaded simple-example # link with the threaded runtime
[1 of 1] Compiling Main             ( simple-example.hs, simple-example.o )
Linking simple-example ...
~% ./simple-example
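
The example above gets by with postGUIAsync because the worker never needs to read anything back from the interface. When a worker does need an answer from a widget, postGUISync runs the action on the main thread and hands the result back. Here is a minimal sketch of that, not part of the original example: the widget layout and the helper name shout are made up for illustration, and I'm assuming a gtk version where entryGetText can be used at type String.

```haskell
import Control.Concurrent (forkIO)
import Data.Char (toUpper)
import Graphics.UI.Gtk

-- Hypothetical stand-in for "real work" done off the main thread.
shout :: String -> String
shout = map toUpper

main :: IO ()
main = do
    _ <- initGUI
    window <- windowNew
    vbox   <- vBoxNew True 2
    entry  <- entryNew
    button <- buttonNewWithLabel "Shout"

    _ <- button `on` buttonActivated $ do
        _ <- forkIO $ do
            -- We are on a worker thread here, so no direct Gtk calls.
            -- postGUISync blocks this thread until the main thread has
            -- run the action, then returns the action's result.
            txt <- postGUISync (entryGetText entry) :: IO String
            let result = shout txt
            -- Force the result on the worker, then hand the update to the
            -- main thread. Nothing to wait for, so the async variant will do.
            length result `seq` postGUIAsync (entrySetText entry result)
        return ()
    _ <- onDestroy window mainQuit

    containerAdd vbox entry
    containerAdd vbox button
    containerAdd window vbox
    widgetShowAll window
    mainGUI
```

As with simple-example.hs, this wants to be linked with -threaded.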

The remainder of this article is for people who want to know more about the interactions between the FFI, OS threads, and Haskell threads; want to know why other guides make other recommendations; want to do something unorthodox with Gtk2Hs (like make Gtk calls from other threads, run the GUI on a different thread than the main thread, make calls after mainGUI has returned, write C code that calls Haskell code, etc.); or are writing an FFI binding similar to Gtk2Hs and want to know how to do it correctly.

Why Threading is Hard

It is well-known that writing multi-threaded programs of any kind is already quite hard. There are dozens of pitfalls to avoid when ensuring that the program you write is safe, that is, that no two threads attempt to make changes to the same data structure at the same time. One common solution is to use locking. Doing so introduces more pitfalls when trying to ensure that the program is live, that is, that threads aren't waiting on each other and dead- or live-locking. In addition to all the traditional problems, multithreaded Gtk2Hs programming has some of its own.

One complication is that Gtk2Hs does its best to abstract away from Gtk's locking model to minimize the amount of Gtk locking that Gtk2Hs code has to explicitly do. The abstraction is pretty good... but abstractions leak.

Another complication arises from the awkward interaction between operating system threads, Haskell threads, and the FFI. Operating system threads have a pre-emptive model: the operating system can (and does) interrupt a thread at any time. GHC tries to present the facade that Haskell threads are pre-emptive, but in fact they are cooperative: the compiler inserts synchronization points at which the runtime runs, possibly interrupting the thread. The facade is very good, but breaks at FFI calls, because the compiler cannot insert synchronization points into foreign code.

Finally, a handful of libraries use (operating-system-)thread-local state; in particular, many Gtk calls bottom out at a call to the Win32 API. These calls all need to happen on the same operating system thread to have access to the thread-local state, so GHC's habit of migrating Haskell threads across operating system threads is dangerous.

Conventions

A runtime system is a collection of code that a compiler inserts into every program it generates. (The system typically implements things like memory allocation and garbage collection, scheduling and locking of threads, and so on. The scheduler will be our main interest below.) GHC offers two runtime systems, which I will call the non-threaded and threaded runtime systems. If I don't specify, I'm talking about the threaded runtime system; my personal opinion is that the non-threaded runtime should be deprecated and eventually removed.

There are four kinds of threads: operating system threads (OS threads), Haskell threads as implemented by the non-threaded runtime (non-threaded threads), Haskell threads as implemented by the threaded runtime that are bound to a particular OS thread (bound threads), and Haskell threads as implemented by the threaded runtime that are not bound to a particular OS thread (unbound threads). Additionally, bound threads may either be currently executing Haskell code or foreign code. A Haskell thread is any one of a non-threaded thread, an unbound thread, or a bound thread that's currently executing Haskell code. If I don't specify what kind of thread I'm talking about, I want to get an email that complains loudly.
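
If you'd like to see which kind of thread you're standing on, base exposes the distinction directly. The sketch below uses only Control.Concurrent; the helper inThread is a hypothetical name introduced just for this illustration. Linked with -threaded, I would expect it to report the main thread and the forkOS thread as bound and the forkIO thread as unbound; linked without -threaded, nothing is bound, and the forkOS part is skipped because forkOS fails on that runtime.

```haskell
import Control.Concurrent
import Control.Monad (when)

-- Run an action in a thread made by the given fork function and wait for
-- its result. (Hypothetical helper, for illustration only.)
inThread :: (IO () -> IO ThreadId) -> IO a -> IO a
inThread fork act = do
    box <- newEmptyMVar
    _ <- fork (act >>= putMVar box)
    takeMVar box

main :: IO ()
main = do
    putStrLn ("threaded runtime:    " ++ show rtsSupportsBoundThreads)
    mainBound <- isCurrentThreadBound
    putStrLn ("main thread bound:   " ++ show mainBound)
    ioBound <- inThread forkIO isCurrentThreadBound
    putStrLn ("forkIO thread bound: " ++ show ioBound)
    -- forkOS fails at run time without the threaded runtime, so guard it.
    when rtsSupportsBoundThreads $ do
        osBound <- inThread forkOS isCurrentThreadBound
        putStrLn ("forkOS thread bound: " ++ show osBound)
```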

Foreign Imports

On a first reading, you may wish to skip this section and ignore any sentences below that refer to unsafe or safe imports.

The foreign function interface offers two kinds of imports: unsafe and safe imports. Elsewhere in Haskell, there is a convention that functions whose names start with unsafe place a proof burden on the programmer calling them; in contrast, unsafe imports place a burden on the programmer writing the code that's being imported. Clients of the import have no burden! Imports marked "safe" are actually a request to the compiler to wrap the foreign code with some runtime code that makes the call safe; imports marked "unsafe" are a claim to the compiler that the imported code satisfies some conditions that make this unnecessary. As a result, unsafe imports can be slightly faster, especially in single-threaded code. For safety (as opposed to liveness), the proof burden is that the code being imported never calls back into Haskell.

Liveness is a different story. Calls to unsafe imports prevent all other Haskell threads on the same OS thread from running. The proof burden is therefore that the function does not block. Even functions which execute a long-running computation, but don't block, are probably best marked safe rather than unsafe. The code inserted by a safe import will find or create a free OS thread to execute the call on, so that the operating system's truly pre-emptive scheduling can give it a hand.
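
To make the two kinds of import concrete, here is a small sketch that has nothing to do with Gtk. It assumes a POSIX system (for sleep from unistd.h); the names c_sin and c_sleep are just the ones I picked. The quick, non-blocking sin is a reasonable candidate for an unsafe import, while the blocking sleep is imported safe so that a second Haskell thread can keep ticking during the call.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent (forkIO, threadDelay)
import Control.Monad (forever)
import Data.IORef
import Foreign.C.Types (CDouble (..), CUInt (..))

-- Quick, never blocks, never calls back into Haskell: a reasonable
-- candidate for an unsafe import.
foreign import ccall unsafe "math.h sin" c_sin :: CDouble -> CDouble

-- Can block for a long time, so import it safe and let the runtime keep
-- other Haskell threads running during the call.
foreign import ccall safe "unistd.h sleep" c_sleep :: CUInt -> IO CUInt

main :: IO ()
main = do
    print (c_sin 0)
    ticks <- newIORef (0 :: Int)
    _ <- forkIO $ forever $ do
        threadDelay 100000               -- 0.1 seconds
        modifyIORef' ticks (+ 1)
    -- Blocks this thread for up to a second; sleep may return early (with
    -- the unslept seconds as its result) if a signal interrupts it.
    _ <- c_sleep 1
    n <- readIORef ticks
    putStrLn ("ticks counted during the foreign call: " ++ show n)
```

Linked with -threaded, the ticker should make progress while sleep is running. Under the non-threaded runtime, or with sleep imported unsafe and a single capability, I would expect the count to stay at or near zero, which is exactly the liveness problem described above.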

Only a small number of Haskell threads are actually executing at any given time; the OS threads that are executing Haskell threads have a data structure called a "capability" which includes a pool for OS threads and a run queue for Haskell threads. If there are multiple capabilities, they periodically sync up to do some load balancing. When a Haskell thread makes a safe FFI call, the runtime does the following:

Ask the capability's OS thread pool for another OS thread. The pool will create a new OS thread if none are available.
Hand off the capability to the newly allocated OS thread.
Become a bound thread. (Technically, the thread does not become bound until the FFI call calls back into the Haskell world. But I find it easier to think about this way.)
Make the foreign call.
When the call returns (as distinguished from calling back into Haskell via a foreign export), become unbound again (if it was unbound to begin with), then retrieve the capability from the other OS thread.
Return the other OS thread to the capability's pool.

An invariant of GHC's runtime is that bound threads always have an entire OS thread to themselves; however, this doesn't mean bound threads are always executing. Rather, they are executing just when they're in a foreign call or have a capability. When executing an unsafe call, the runtime skips all of the hullabaloo and just makes the call; this prevents any of the other Haskell threads being tracked by that capability from running.

What does this mean for you, the multithreaded Gtk2Hs programmer? Basically, it means be very wary of unsafe foreign calls, either in Gtk2Hs or in other libraries you use. They'll turn your beautiful multithreaded code distressingly single-threaded in a heartbeat.

The Guts

The Gtk2Hs library actually includes bindings to many C libraries; of particular interest are glib and Gtk. The glib library does all its locking internally; your only obligation is to call g_thread_init() exactly once before using any of its functions. This function is not exposed in Gtk2Hs; however, initGUI calls it for you. Many glib functions are imported unsafe; be aware of this if you plan to write a multithreaded C program that calls into Haskell code that uses Gtk2Hs or if you plan on using glib with more than one MainContext.

Use of the Gtk library is a bit more nuanced. There is a global Gtk lock, which, like the glib lock, is initialized in initGUI. Haskell threads that want to make Gtk calls must protect those calls by calling threadsEnter to grab the lock before the call and calling threadsLeave to release the lock after the call. The lock is mostly unused by Gtk, with the following exceptions:

The initGUI initializer calls threadsEnter.
The main Gtk loop mainGUI calls threadsLeave while waiting for events and threadsEnter again before invoking any callbacks when it emits signals.
The postGUIAsync and postGUISync functions wrap the code you hand them with calls to threadsEnter and threadsLeave before handing the code off to the main loop.
One other place (see below).

This explains why calls to these functions do not appear in the "Best Practice" section: they are inserted for you either by Gtk or by Gtk2Hs in all the places they are normally needed.
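
If you do decide to make Gtk calls from another thread, the locking described above looks roughly like the sketch below. The helper withGtkLock is a hypothetical name of mine, not something Gtk2Hs provides; it simply brackets a block with threadsEnter and threadsLeave so the lock is released even if the block throws. Keep in mind that this is the unorthodox route: it assumes the threaded runtime, and it sidesteps the main thread, so it is not something I'd expect to be portable to Windows.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Exception (bracket_)
import Graphics.UI.Gtk

-- Hypothetical helper: run a block of Gtk calls while holding the global
-- Gtk lock, releasing the lock even if the block throws.
withGtkLock :: IO a -> IO a
withGtkLock = bracket_ threadsEnter threadsLeave

main :: IO ()
main = do
    _ <- initGUI
    window <- windowNew
    label  <- labelNew (Just "waiting...")
    containerAdd window label
    _ <- onDestroy window mainQuit
    widgetShowAll window

    _ <- forkIO $ do
        threadDelay 2000000          -- pretend to work for two seconds
        -- Not on the main thread and not inside postGUI*, so we have to
        -- take the Gtk lock ourselves.
        withGtkLock (labelSetText label "done")

    mainGUI
```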

There is an additional twist, alluded to above: on Windows, Gtk uses the Win32 API, which uses OS-thread-local state. As a result, when writing multithreaded Gtk2Hs code that needs to be portable to Windows, all Gtk calls that use the API (which is most of them) must be made from a single OS thread. Luckily, the main thread is always a bound thread–meaning the runtime dedicates an entire OS thread to executing main. So the simplest thing is to make all these calls from the main thread. Otherwise, you must create your own dedicated OS thread using forkOS and make all sensitive calls from there, including the call to mainGUI (or the other main loop drivers).
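
For the "otherwise" case, here is a minimal sketch of running the whole GUI on a dedicated bound thread made with forkOS. The function name guiThread is hypothetical. The point is only that every Gtk call, from initGUI through mainGUI, happens inside the one action handed to forkOS. It needs the threaded runtime.

```haskell
import Control.Concurrent (forkOS, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (finally)
import Graphics.UI.Gtk

-- Everything that touches Gtk lives in this one action, so it all runs on
-- whichever OS thread the action is started on.
guiThread :: IO ()
guiThread = do
    _ <- initGUI
    window <- windowNew
    label  <- labelNew (Just "GUI running on its own OS thread")
    containerAdd window label
    _ <- onDestroy window mainQuit
    widgetShowAll window
    mainGUI

main :: IO ()
main = do
    done <- newEmptyMVar
    -- forkOS creates a bound thread, so guiThread keeps a single OS thread
    -- to itself for its whole lifetime.
    _ <- forkOS (guiThread `finally` putMVar done ())
    -- The main thread is now free for other work. Here it just waits,
    -- because the program exits as soon as main returns.
    takeMVar done
```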

Finally, there is one subtlety about the interaction between glib and Gtk that is worth mentioning here. The glib library is intended to stand on its own, so it knows nothing about the internals of Gtk. Indeed, the dependency runs the other way: glib is the library that actually handles the main loop, and Gtk just registers callbacks that grab and release the Gtk lock as appropriate. The upshot of this is that both the glib and Gtk C libraries offer hooks to add things to the main loop, and both the glib and gtk Haskell packages offer timeoutAdd and idleAdd. The ones offered by glib hand your code block as-is to the main loop, which will execute your code blocks without holding the Gtk lock. If you want to execute code that makes Gtk calls periodically or when the system is idle, you should use the ones offered by the gtk package instead (or add calls to threadsEnter and threadsLeave manually).
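
As a small sketch of the periodic case, the program below updates a label once a second using the timeoutAdd that comes with the gtk package (the one you get from importing Graphics.UI.Gtk), so the callback can make Gtk calls directly. The helper caption and the widget layout are made up for illustration.

```haskell
import Data.IORef
import Graphics.UI.Gtk   -- timeoutAdd here is the gtk package's version

-- Hypothetical helper: the text shown after n seconds.
caption :: Int -> String
caption n = show n ++ " seconds"

main :: IO ()
main = do
    _ <- initGUI
    window  <- windowNew
    label   <- labelNew (Just (caption 0))
    counter <- newIORef (0 :: Int)
    containerAdd window label
    _ <- onDestroy window mainQuit

    -- Called by the main loop roughly once a second. Returning True keeps
    -- the timeout installed; returning False would remove it.
    let tick = do
            modifyIORef counter (+ 1)
            n <- readIORef counter
            labelSetText label (caption n)
            return True
    _ <- timeoutAdd tick 1000        -- interval in milliseconds

    widgetShowAll window
    mainGUI
```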

The Non-Threaded Runtime

In the non-threaded runtime, all Haskell threads—even those executing foreign code—are multiplexed on a single operating system thread. Calls to foreign code, whether imported safe or unsafe, block all other threads from executing any code until they return or, in the case of a safe import, call back into the Haskell world via a foreign export. Because threadsEnter blocks and does not call back into Haskell code, it is not possible to make Gtk calls from multiple threads when using the non-threaded runtime (though using postGUIAsync and postGUISync to ask the main thread to make Gtk calls is okay). Although the mainGUI loop will sometimes release Gtk's lock, that doesn't really help, because no other Haskell threads can run while foreign code is executing. In principle, this problem could actually be fixed: Gtk offers a way to dynamically replace its locking code, which we could use to replace Gtk's lock with an MVar. This should give the non-threaded runtime enough of a chance to run at the critical moments that all these problems would go away. If for some reason you need to make Gtk calls from multiple non-threaded threads, contact me, and we'll work on implementing this fix together.

Common Myths

Actually, I have no idea how common these are. But these are some of the wrong things I believed while researching this topic, and what I now believe to be the truth.

"All Gtk2Hs calls should happen from one thread." The real rules are to make calls only when you hold the Gtk lock and to ensure that only one OS thread makes Win32 calls. It is true, however, that the easiest way to ensure these two things is to make all calls from the main thread.
"Having all Gtk2Hs calls happen from a single thread is sufficient to make my code safe." That's right, but only if the thread making the calls is a bound thread. The main thread is always bound, as are any threads made with forkOS. While unbound threads become bound during a call to foreign code, they return to being unbound after the call ends, and may migrate OS threads then.
"When I call mainGUI, the FFI call blocks. I should use forkOS to run my code on another OS thread to work around this." Actually, forkOS doesn't affect scheduling at all! It is true that mainGUI blocks the operating system thread it is on from executing other Haskell threads. However, because the import is marked safe, the runtime will ensure that no other unbound threads are executing (or trying to) on the same OS thread, and hence no Haskell threads will be blocked by the call.
"I should add a timeout to Gtk's main loop that yields control to Haskell, or else mainGUI will starve my other Haskell threads." This is important when using the non-threaded runtime; however, the threaded runtime needs no cooperation from FFI calls to behave correctly.
"Since the non-threaded runtime only has one OS thread, I could make Gtk calls from any Haskell thread without repercussions." Actually, callbacks are eligible for context switches. In fact, once you enter mainGUI, callbacks are the only time the Haskell runtime gets a chance to run. So that other Haskell thread will be executing while the Gtk main loop is in the middle of emitting a signal, which is not good!
"If I wrap mainGUI with threadsEnter and threadsLeave, none of my other threads will ever be able to threadsEnter." Don't worry: the main Gtk thread releases the Gtk lock while waiting for events.
"The +RTS -N runtime option controls the number of CPUs I use." or "The +RTS -N runtime option controls the number of OS threads spawned." Actually, this controls the number of capabilities. Each capability starts off with an OS thread in its pool, but those pools will grow as necessary to accommodate safe foreign calls and invocations of forkOS (which actually works by just making a safe foreign call that immediately calls back into Haskell).
"Now that I know all Gtk calls must be wrapped with threadsEnter and threadsLeave, I know to call threadsEnter before postGUIAsync or postGUISync." Actually, postGUIAsync is a glib call, not a Gtk call (it's perfectly thread safe as is), and it calls threadsEnter and threadsLeave for you in the callback that it actually passes on to the main Gtk loop.
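
On the -N point, you can ask the runtime how many capabilities it actually has. The sketch below uses only base and is not Gtk-specific; it prints the capability count next to the processor count the runtime sees, so you can watch the two differ.

```haskell
import Control.Concurrent (getNumCapabilities, rtsSupportsBoundThreads)
import GHC.Conc (getNumProcessors)

main :: IO ()
main = do
    caps  <- getNumCapabilities
    procs <- getNumProcessors
    putStrLn ("threaded runtime: " ++ show rtsSupportsBoundThreads)
    putStrLn ("capabilities:     " ++ show caps)
    putStrLn ("processors seen:  " ++ show procs)
```

Built with -threaded -rtsopts and run with, say, +RTS -N4, the capability count should follow the flag rather than the processor count. Neither number tells you how many OS threads exist, since the pools grow on demand.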

Bibliography, Acknowledgments, and Further Reading

Thanks to Chris Casinghino for helpful suggestions on a draft and to John Lato, Duncan Coutts, and Simon Marlow for correcting various parts of the explanations of the runtime.
the thread on the Haskell subreddit
I learned everything above while trying to understand the recommendations in Writing multi-threaded GUIs by Duncan Coutts. While I ultimately disagree with some of the recommendations there, I believe the article was written before the threaded runtime got as awesome as it is today.
Multi-threaded GTK applications, Part 1 and Part 2, by Andrew Cowie, discusses how the Java bindings achieved thread safety.
the official Gtk FAQ on threaded programming
the Haddocks on bound threads and OS threads
Extending the Haskell Foreign Function Interface with Concurrency, by Simon Marlow, Simon Peyton Jones, and Wolfgang Thaller