Friday, September 2, 2011

HOWTO: deploy multithreading on Matlab with the Theron C++ Actors library

Preamble: What follows is a HOWTO guide for getting Theron-2.09 multithreaded C++ mex files built on Matlab. For my general musings on Matlab concurrency and Theron, see the previous post.

Building multithreaded Matlab mex files with Theron

This was pretty straightforward on the Mac, and somewhat tricky but not terrible on Linux. I don't have any working C++ compiler with Matlab on Windows, thanks to absolutely abysmal support from Microsoft and Mathworks (essentially Mathworks only supports Microsoft, and Microsoft's compilers are impossible to get installed correctly). The notes below may still help you get things working on Windows, and I added some Windows-specific notes at the very bottom. Before trying to get Theron working, I encourage you to write a simple C++ mex program (maybe a single-threaded version of your planned algorithm) and get it built on Matlab first, to confirm that your system is correctly set up for mex.

I'm fairly confident most Mac users can get this working. On Linux I have only tested on 2010 and 2011 64-bit versions of Matlab, and there are reasons to fear that earlier versions may have more trouble working. Fair warning.

Building Theron on Mac or other static mex environments

For this discussion, the fundamental difference between Mac and Linux versions of Matlab is how they handle mex compilation. When you mex a C++ source file on the Mac (confirmed up to Matlab 7.12), the system generates statically linked output (a statically linked executable I think, but it could be a static library; it doesn't really matter). On Linux, mex generates a shared library instead, which complicates things significantly when your build depends on the boost_thread library (see below).

For now, let's assume you are on a Mac or some other Matlab system that generates static mex files. One way to confirm this is to run mex on something simple with the -v option; this will show the gcc or g++ commands that mex is using. As long as the linking step doesn't include the -shared flag, you're okay.
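The mex -v output can be long, so if you'd rather not eyeball it, the check is easy to script. Here is a minimal sketch, assuming you have copied the link line out of the mex -v output into a string; the helper name hasSharedFlag is my own invention, not part of mex or Theron.

```cpp
// check_shared.cpp: does a link command line contain the -shared flag?
#include <sstream>
#include <string>

// Returns true if any whitespace-separated token of `cmd` is exactly "-shared".
// `cmd` is assumed to be a link line copied from `mex -v` output.
bool hasSharedFlag(const std::string& cmd) {
  std::istringstream tokens(cmd);
  std::string tok;
  while (tokens >> tok) {
    if (tok == "-shared") return true;  // exact match only, not a substring
  }
  return false;
}
```

Matching whole tokens rather than substrings means an output file that happens to be called something like not-shared-lib won't trigger a false positive.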

In this case, your first step is to follow the getting started instructions on the Theron site: download Theron and build it, linking against your choice of Boost threads library. If you're not sure whether you already have Boost threads installed, try locate libboost_thread, possibly after running sudo updatedb if your locate database is out of date. If you don't have Boost threads already, the simplest way to get it is MacPorts. Once you have MacPorts installed, it's as simple as: sudo port install boost. Currently this gives you a very up-to-date Boost version, 1.47.

Theron is currently a little Windows-centric, so you'll need to tweak the makefile a little to get it to build. Theron author Ashton Mason tells me the next release should make things smoother for GNU (non-Windows) people. First, as detailed in the Theron guide, you need to edit the makefile to point to your Boost installation. If you got Boost from MacPorts, it will probably look like this:

BOOST_INCLUDE_PATH = /opt/local/include/
BOOST_LIB_PATH = /opt/local/lib/
BOOST_RELEASE_LIB = boost_thread-mt
BOOST_DEBUG_LIB = boost_thread-mt

If you want to be able to run make clean to start over with the build process, I recommend also changing the following lines:

RM = rm -f

...

clean:
  ${RM} ${BUILD}$/*.o
  ${RM} ${BUILD}$/*.a
  ${RM} ${BUILD}$/*.ilk
  ${RM} ${BIN}$/*.pdb
  ${RM} ${BIN}$/*.ilk
  ${RM} ${BIN}$/*.exe
  ${RM} ${LIB}$/*.a
  ${RM} ${LIB}$/*.lib

I think that should be it, so go ahead and build. If you are only planning to use Theron from Matlab you only need the library, so I would use make library mode=release.

If all goes well, your build should finish pretty fast and you should now have the libtheron.a library in your THERONPATH/Lib folder. Now boot up Matlab and try to build a simple mex file. A Theron test program might look something like:

// theron_test.cpp
#include "mex.h"
#define THERON_USE_BOOST_THREADS 1
#include "Theron/Framework.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
  Theron::Framework theron;
}

This looks simple, but the call to initialize the Theron Framework actually involves a fair amount of background machinery being set up, so it should give you an indication of whether things are working. To compile this from Matlab, you'll need a little more than a simple mex theron_test.cpp.

Do not put spaces inside any of these mex options: no space after the "I" in -I (or after the "D" if you pass a -D define). First you need to tell the compiler what Theron function calls look like. You do this by pointing it to the Theron header files with the -I/THERONPATH/Include/ directive.

Next you need to tell mex to pull in the theron and boost_thread libraries. The standard way to do this is with the -L and -l options to the linker, but I find it cleaner to just include the full paths to the library files after your C++ file name; this tells the linker to look through the object files in the libraries when resolving the calls to Theron in your code. I also like using the -v option so I can see exactly what mex is up to.

So your full build command should look something like:

mex -v -I/THERONPATH/Include/ theron_test.cpp /THERONPATH/Lib/libtheron.a /LIBPATH/libboost_thread.so

If you got Boost from MacPorts, then the libboost part above should be /opt/local/lib/libboost_thread-mt.dylib (the same directory as BOOST_LIB_PATH in the makefile). You could also use libboost_thread-mt.a if you want; I don't think it matters much.

Did it build? Hope so... Now try running theron_test from the Matlab command line. You should see nothing; nothing is good. Now go ahead and code up your algorithm using the Theron tutorial as a guide and enjoy simple Matlab concurrency!

Building Theron on Linux or other shared library mex environments

The process is a bit more complicated for Linux users. The upside is, you probably don't have to worry about getting your own Boost implementation (although chances are you already have one...). The issue is that in Linux, Matlab mex files are built as shared libraries instead of statically linked. This causes two problems. The immediate problem is that Matlab needs you to build Theron with the -fPIC option turned on so that it can use the Theron library in a shared build. This is not a big problem; just add -fPIC to the CFLAGS in the Theron makefile.

The second problem is the deal breaker: Matlab uses libboost_thread internally and has its own copy hidden away in its internal distribution folders. Since your mex file got built as a shared library, it will try to pull the boost_thread library in at runtime, and it will find Matlab's version (1_40_0) instead of your external version (1_42_0 currently in apt-get). This causes a segfault.

So there are two approaches to solving this. The better one would be to find some way to tell your mex file not to use Matlab's boost_thread. I found a discussion of this on MatlabCentral, but did not get this working; it seems like it might require rewriting parts of Theron to use dlopen; not very nice. I still feel like there should be some other way to get Matlab to behave, but I didn't find it. (For completeness: it does not work to point Matlab's internal boost_thread to an updated one; duh why did I even try that?)

So there is option 2: throw in the towel and build Theron against Matlab's internal boost_thread. This works at least for 64-bit Matlab 2010 or 2011, which has boost_thread 1.40.0 internally. I don't know that it will work for earlier or 32-bit versions of Matlab, which I believe use earlier versions. (A very old version of Matlab might not have this issue at all, as I think it was around 2007 that MathWorks started multithreading Matlab internal code in the first place. If you're using a very old Matlab, check the internal distribution folders for libboost_thread to see if it's there. locate should be your friend.)

Building Theron against Matlab's internal boost_thread is not too hard. The only trick is that Matlab doesn't have the header files in the distribution (at least I didn't find them). So you'll have to download boost_1_40_0 (or whatever version Matlab has internally on your system) and point Theron to those headers. If you use the wrong headers, you'll get segfaults. Theron's makefile should look something like:

BOOST_INCLUDE_PATH = /Downloads/boost_1_40_0/
BOOST_LIB_PATH = /usr/local/MATLAB/R2011a/bin/glnxa64/
BOOST_RELEASE_LIB = boost_thread
BOOST_DEBUG_LIB = boost_thread

...

CFLAGS += -c -Wall -fPIC

(The libboost_thread in the Matlab folder is only the versioned filename, not the soname, so I would have thought you might need to symlink a soname but that didn't seem to be necessary.)
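Since mismatched headers end in segfaults, it's worth double-checking that the Boost headers you downloaded match the version suffix on Matlab's library file. The following is a rough sketch of that check; soVersion and headersMatch are hypothetical helper names of mine, and the file names are just the ones from this post, so adjust them for your system.

```cpp
// boost_version_check.cpp: compare a library's version suffix to a header release.
#include <string>

// Returns the text after the last ".so." in a shared library file name,
// e.g. "libboost_thread.so.1.40" -> "1.40"; empty string if there is none.
std::string soVersion(const std::string& fileName) {
  const std::string marker = ".so.";
  const std::string::size_type pos = fileName.rfind(marker);
  if (pos == std::string::npos) return "";
  return fileName.substr(pos + marker.size());
}

// True if the header release (e.g. "1.40.0") starts with the library's
// version (e.g. "1.40") on a whole-component boundary, so "1.4" does not
// match "1.40.0".
bool headersMatch(const std::string& libVersion, const std::string& headerVersion) {
  if (libVersion.empty()) return false;
  if (headerVersion.compare(0, libVersion.size(), libVersion) != 0) return false;
  return headerVersion.size() == libVersion.size() ||
         headerVersion[libVersion.size()] == '.';
}
```

For example, headersMatch(soVersion("libboost_thread.so.1.40"), "1.40.0") is true, while the same call with "1.42.0" is false. This only compares names; it can't tell you whether the binary itself is compatible.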

If this builds correctly, you should be good to build your mex file from Matlab; the command is the same as above for Mac, except you point it to the internal Matlab boost_thread:

mex -v -I/THERONPATH/Include/ theron_test.cpp /THERONPATH/Lib/libtheron.a /MATLABLIBPATH/libboost_thread.so.1.40

Conclusions

This was a little bit of a pain, mostly because of Matlab's questionable decision to use shared libraries internally without (apparently) providing for situations where users build their mex files against the same libraries. The process is especially annoying if you have Mac and Linux machines that both need to be able to build your mex. Some steps would be a little smoother if Theron were less Windows-centric; Theron has a new release coming that should improve on this.

I think the current solution is stable, although it's tricky to be locked into the Matlab internal boost_thread version. The benefit is clear. Using Theron to multithread a simple algorithm, I got about 4.5x speedup using 6 cores, and 6x speedup using 12 threads on 6 cores with hyperthreading. This was a simple algorithm that didn't really require an Actors model, but it was great that I could get this done quickly (once I figured out the Matlab setup tricks) and without having to deal with low-level threads and mutexes at all.
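To put those speedup numbers in perspective, you can divide the speedup by the number of workers to get a parallel efficiency. This is just the standard definition written out as a tiny sketch; the function name efficiency is mine.

```cpp
// efficiency.cpp: parallel efficiency = speedup / number of workers.

// `speedup` is the runtime ratio (serial time / parallel time);
// `workers` is the number of cores or threads used, assumed > 0.
double efficiency(double speedup, int workers) {
  return speedup / workers;
}
```

With the figures above, efficiency(4.5, 6) is 0.75 for the 6-core run, and efficiency(6.0, 12) is 0.5 per thread for the hyperthreaded run (or 1.0 if you count only the 6 physical cores).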

Building Theron on Windows

As noted, I have not gotten any Windows compiler to work with mex, so these are purely hypothetical guidelines. If you are building on Windows you will probably find it simpler to use the Windows thread library instead of Boost. Just leave out the THERON_USE_BOOST_THREADS define and it should revert to using Windows threads. This means you should be safe from the internal Matlab shared library conflict that Linux has. But you may still need the -fPIC option added to your CFLAGS, depending on whether Windows mex is set up to create static or shared libraries (I don't know).
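If the same source file has to build on Mac, Linux and Windows, one way to handle the define is a preprocessor guard. This is an untested sketch on my part: _WIN32 is the usual compiler-defined macro on Windows, THERON_USE_BOOST_THREADS is the define from the test program earlier, and threadBackend is a hypothetical helper that only reports which branch was taken.

```cpp
// backend_select.cpp: pick the thread backend at compile time.
#include <string>

#if !defined(_WIN32)
// Not Windows: ask Theron for Boost threads, as in theron_test.cpp.
#define THERON_USE_BOOST_THREADS 1
#endif
// In a real mex file, #include "Theron/Framework.h" would follow here.

// Reports which backend the guard above selected (for illustration only).
std::string threadBackend() {
#if defined(THERON_USE_BOOST_THREADS)
  return "boost";
#else
  return "windows";
#endif
}
```

The guard has to come before the Theron include, since the define only matters if the Theron headers see it.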
