lxml OSX compilation madness
September 29th, 2009
I don’t think that there has ever been a point in time where installing lxml on OSX was not a horrible pain in the ass. I think that at one point, it was enough just to install updated versions of libxml2 and libxslt, and lxml would compile nicely. But I must have done something bad, cause at some point recently, lxml just stopped compiling for me.
This would not do at all. I use virtualenv (and virtualenvwrapper) pretty religiously, and am loath to install much of anything in the system-wide site-packages. I needed to get things to a point where a straight-up easy_install lxml would work. I didn’t wanna be messing around with custom install flags or whatever every time I cut a new env.
This is not a neat step-by-step guide for getting lxml working in OSX. This is just a record of some stuff that I saw, and some things that I did. Some combination thereof was sufficient to get things working.
For the record, I’m using lxml 2.2.2, libxml2 2.7.5, and libxslt 1.1.26.
One day, I fired up python, typed from lxml import etree and got an error much like the following:
    >>> from lxml import etree
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: dlopen(/Users/nwilliams/.virtualenvs/lxml1/lib/python2.6/site-packages/lxml-2.2.2-py2.6-macosx-10.5-universal.egg/lxml/etree.so, 2): Symbol not found: _xmlFree
      Referenced from: /Users/nwilliams/.virtualenvs/lxml1/lib/python2.6/site-packages/lxml-2.2.2-py2.6-macosx-10.5-universal.egg/lxml/etree.so
      Expected in: dynamic lookup
    >>>
Needless to say, I was less than pleased.
Looking through the source, I gathered that _xmlFree was a symbol exported by libxml2. My first thought was that lxml was somehow compiling against the old system version of libxml2, or for some other reason couldn’t find the new version.
I noticed a few funny things about the output when installing lxml. First, there were two lines right before all the heavy compilation work:
    Using build configuration of libxslt 1.1.26
    Building against libxml2/libxslt in the following directory: /usr/local/lib
The second line suggested that lxml was, in fact, seeing the version of libxml2 that I wanted it to. The first was a problem though, because normally, it looks like this:
Using build configuration of libxml2 2.7.5 and libxslt 1.1.26
For some reason, lxml wasn’t finding xml2-config. Doing export XML2_CONFIG=/usr/local/bin/xml2-config seemed to fix things.
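A quick way to see which xml2-config a build is likely to pick up is to check XML2_CONFIG first and then fall back to searching PATH. The sketch below is only a sanity check, written for Python 3. The name find_config_tool is made up for illustration, and the lookup order is an assumption about how the build behaves, not a copy of lxml's setup code.

```python
import os
import shutil


def find_config_tool(name, env_var, environ=None):
    """Return the path of a *-config script, or None if it cannot be found.

    Hypothetical helper: an explicit environment variable (for example
    XML2_CONFIG) wins; otherwise PATH is searched for the tool name.
    """
    environ = os.environ if environ is None else environ
    override = environ.get(env_var)
    if override:
        return override
    return shutil.which(name, path=environ.get("PATH"))


if __name__ == "__main__":
    found = find_config_tool("xml2-config", "XML2_CONFIG")
    print("xml2-config ->", found or "not found")
```

If this prints "not found" in the shell where easy_install runs, that would be consistent with the libxml2 version going missing from the build output.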
The other issue was probably the important one. While compiling lxml, I would see a bunch of warnings that looked like this:
    ld: warning in /usr/local/lib/libxml2.dylib, file is not of required architecture
    ld: warning in /usr/local/lib/libxslt.dylib, file is not of required architecture
The same warning showed up for another file or two, and the whole set was repeated a few times.
Running file /usr/local/lib/libxml2.dylib told me that the libraries in question had only been compiled as i386 binaries. I eventually found this page, which showed me how to compile libxml2 and libxslt as universal binaries. I did modify things somewhat, though, most notably adding 64-bit architectures.
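The same architecture check can be done without the file command by reading the Mach-O header directly. This is a rough Python 3 sketch, not a full parser: macho_archs is a hypothetical name, only the four CPU types mentioned in this post are mapped to names, and it only looks at the first few header fields.

```python
import struct

# CPU type constants as defined in <mach/machine.h>.
CPU_NAMES = {
    7: "i386",
    0x01000007: "x86_64",
    18: "ppc",
    0x01000012: "ppc64",
}

FAT_MAGIC = 0xCAFEBABE  # universal ("fat") binary; its header is big-endian

# Thin Mach-O magic numbers as seen when read big-endian, mapped to the
# byte order of the fields that follow.
THIN_MAGICS = {
    0xFEEDFACE: ">",  # 32-bit, big-endian file
    0xFEEDFACF: ">",  # 64-bit, big-endian file
    0xCEFAEDFE: "<",  # 32-bit, little-endian file
    0xCFFAEDFE: "<",  # 64-bit, little-endian file
}


def macho_archs(data):
    """Return the architecture names found in the leading bytes of a Mach-O file."""
    if len(data) < 8:
        raise ValueError("too short to be a Mach-O file")
    (magic,) = struct.unpack(">I", data[:4])
    if magic == FAT_MAGIC:
        (count,) = struct.unpack(">I", data[4:8])
        archs = []
        for i in range(count):
            # Each fat_arch entry holds five 32-bit big-endian fields:
            # cputype, cpusubtype, offset, size, align.
            (cputype,) = struct.unpack_from(">I", data, 8 + 20 * i)
            archs.append(CPU_NAMES.get(cputype, hex(cputype)))
        return archs
    if magic in THIN_MAGICS:
        (cputype,) = struct.unpack_from(THIN_MAGICS[magic] + "I", data, 4)
        return [CPU_NAMES.get(cputype, hex(cputype))]
    raise ValueError("not a Mach-O file")


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            print(path, macho_archs(fh.read(4096)))
```

Pointed at /usr/local/lib/libxml2.dylib, a library that lists only i386 is the situation that produces the "not of required architecture" warnings when the Python build wants other architectures too.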
The configure command I used for libxml2 was:
env CFLAGS="-arch i386 -arch ppc -arch x86_64 -arch ppc64" ./configure --enable-static=no --without-python --disable-dependency-tracking
And for libxslt:
env CFLAGS="-arch i386 -arch ppc -arch x86_64 -arch ppc64" ./configure --disable-dependency-tracking
The CFLAGS bit is the necessary part; that’s what’s making things work. I don’t actually know exactly what the point of --enable-static=no is. I’m sure it’ll come back to bite me in the ass at some point. --without-python sounds scary, but since lxml is actually an alternative to the gross “real” bindings, we don’t care about those. I also have no idea what --disable-dependency-tracking does, but make will fail without it.
Once I had these universal libraries built and installed, easy_install lxml worked like a charm.
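To confirm that a fresh env really picked up the new libraries, one option is to ask lxml which libxml2 and libxslt versions it was compiled against and which ones it loaded at runtime. This is a small Python 3 sketch; lxml_versions is a made-up helper name, while the version attributes are the ones lxml exposes on the etree module.

```python
def lxml_versions():
    """Return lxml's version tuples, or None if lxml cannot be imported."""
    try:
        from lxml import etree
    except ImportError as exc:
        # This is where an error like "Symbol not found: _xmlFree" would surface.
        print("import failed:", exc)
        return None
    return {
        "lxml": etree.LXML_VERSION,
        "libxml2 (loaded)": etree.LIBXML_VERSION,
        "libxml2 (compiled against)": etree.LIBXML_COMPILED_VERSION,
        "libxslt (loaded)": etree.LIBXSLT_VERSION,
        "libxslt (compiled against)": etree.LIBXSLT_COMPILED_VERSION,
    }


if __name__ == "__main__":
    versions = lxml_versions()
    if versions:
        for label, version in versions.items():
            print(label, ".".join(str(part) for part in version))
```

If the loaded and compiled-against versions differ, that is a hint that the old system libraries are still being found at runtime.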