Interfacing Perl with C++, using XS, with external files, and using the STL as parameters and return values
by dextius (Monk) on Aug 05, 2010 at 17:37 UTC
I know enough C and C++ to get things done, but I'm definitely not a core Perl hacker or any form of low-level programmer. With that in mind, there is a good chance I am incorrectly interpreting some of the terminology or the general approach. One area that I am very leery of is memory management. Just getting something working doesn't mean that you have all the leaks nailed down. That's my next task on this long, painful road. Oh, and before anyone decides to dump on my post, just keep in mind I've been trying to do this for a long time, and have bought books, scoured the internet, and asked people on forums. "This" was the BEST I could do.
So, I have always considered XS the "final frontier" for me with Perl. I've been coding in Perl religiously for over a decade, and have done a good bit of C and C++ as well. But marrying the two together has always been a formidable task that I have feared. I've avoided it through databases, sockets, and flat files. At my old job, there was an old crufty SWIG file in a project we used that miraculously was able to pull things together and provide an incredibly complex interface to a shared memory data store. I tried to make sense of it a few times, but in the end (due to a general lack of time and documentation) I ended up just trusting in the magic, and hoped that the SWIG code never broke.
I have built a few components in C++ that I am really proud of, and decided it was time to figure out how to link my beloved Perl to the world of C++. I like to think I am good at researching programming problems. I should really restate this as: I am really good at using Google to solve complex programming problems. Sadly it's more like "I've never done anything unique, and anything I have problems with someone else has already had, and posted about it, years ago, and garnered responses from the luminaries that lead our community". I am just good enough with Google to find those threads and get pointed in the right direction to solve whatever problem I'm having, when my army of O'Reilly books isn't there to teach me what I need to know.
Here is my task. I have a C++ library that I would like to be able to access from Perl. The library is object oriented, takes strings as arguments, and returns strings, or vectors of strings, or hashes of strings/strings. I'll be honest, I'm not a C++ guru. I have used Boost and ACE to write safe multi-threaded code, and I have a decent understanding of memory management in C/C++, but that is pretty much the end for me.
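To make the shape of that task concrete, here is a minimal sketch of the kind of interface I mean. The class name "SimpleLib" and the methods "greet", "split" and "info" are made up for illustration (they are not my real library), and the code assumes a C++11 compiler.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the kind of library described above:
// object oriented, strings in, strings / vectors / maps of strings out.
class SimpleLib {
public:
    explicit SimpleLib(const std::string& name) : name_(name) {}

    // String in, string out.
    std::string greet(const std::string& greeting) const {
        return greeting + ", " + name_;
    }

    // Returns a vector of strings: the input split on one delimiter character.
    std::vector<std::string> split(const std::string& text, char delim) const {
        std::vector<std::string> parts;
        std::string current;
        for (char c : text) {
            if (c == delim) {
                parts.push_back(current);
                current.clear();
            } else {
                current += c;
            }
        }
        parts.push_back(current);  // last field (may be empty)
        return parts;
    }

    // Returns a "hash" of string => string.
    std::map<std::string, std::string> info() const {
        return { {"name", name_}, {"length", std::to_string(name_.size())} };
    }

private:
    std::string name_;
};
```

From the Perl side, the goal is for these three return types to show up as a plain scalar, an array (or array reference), and a hash (or hash reference).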
BEGIN BOOK REVIEW
Disclaimer: This isn't a book review ;-), but I figured I'd make a few comments on this book I paid $44.99 for, since it's directly applicable to my problem.
To start off, I am staring at a copy of "Extending and Embedding Perl", by Tim Jenness and Simon Cozens. It is an odd book. Two of the eleven chapters are spent teaching you C, and two more chapters are spent discussing how to hack Perl itself. The rest of the book is a bit more applicable to the title, with two chapters devoted to embedding Perl and five chapters to extending (though "Alternatives to XS" could have been a much larger topic, but I'll get to that later).
The reason I point out the distribution of material in the book is that, in those 4 chapters about using XS, it misses the most basic use case in all of the examples: linking Perl to an existing C or C++ library. Every single one of their examples shows the C code being embedded in the ".xs" file directly. There is no mention of how you would even accomplish this, anywhere in the book! Just to prove the point, go download the example code yourself and do a "find . | egrep -i '\.[ch]$'". You won't find anything outside of chapters one and three, the ones that deal with teaching you C, and a short blurb on h2xs in chapter 7, but even that shows you the actual C code in the ".xs" file.
There are 7 pages devoted to extending to C++. I typed in the examples and, with a little work, got them working. The example didn't cover external files, nor did it cover how to pass complex variables (the examples dealt with ints for input and output). So, to summarize (finally), this is a good reference book, and when I actually need to jerk around with SVs, HVs, and IVs, I can learn what they are and how they work.
If I could change it, I'd just get rid of all the extra stuff, and make more room for XS and Perl's internals. If there was room to spare, maybe full chapters on SWIG and Inline, as they are both solid solutions that deserve more than short mentions.
END BOOK REVIEW
Thus began my internet search.
I started at Perlmonks. Queries for "XS C++" and "XS external library" were fruitless for the most part. http://perlmonks.org/?node_id=517931 .. tye had the most impressive response (yes, I upvoted him), but it only dealt with C++ loaded directly in the XS file, and with chars.
Then I hit the chatterbox. tye, of course, was online and, being the helpful guy he is, showed an amazing amount of patience with me as I slowly worked through my questions. He did his absolute best to steer me away from XS entirely, and to use SWIG or Inline::C++ instead. But my position was simple: I didn't see any of the "awesome" libraries on CPAN resorting to this, so why should I? Why is this so damn hard? Why is it so poorly documented? Why do I have a book on the subject that I paid $50 for and I still have no clue how to accomplish such simple tasks? tye sensed my frustration, and kept pressing for alternatives. It was at this point I decided that I'd play the "Picard" card with XS.
<bold>"They invade our space, and we fall back. They assimilate countless worlds, and we fall back. Not again. Not this time. The line must be drawn here! This far, no farther! And I will make them pay for what they have done!" - Jean Luc Picard, Star Trek First Contact</bold>
Come hell or high water (?) I am going to figure this out.
Inline::C++ : This module, when it works, works really well. It provides a seamless integration between Perl and C++, and you barely have to lift a finger. The people who wrote this are beyond smart. I applaud their work. Here is where it breaks down for me. You need to have a compiler and your environment set up the same way on every machine that you want to use your library on (unless you do even more smart env logic in your BEGIN blocks). If you don't have a compiler, and/or don't want to rely on re-compiling when you run the program, I'm told you can distribute the compiled bits, but packaging this became almost as daunting a task as the original problem itself.
SWIG : I used to work with a guy who worked on the SWIG team. His initials are "M.M" (for friends of mine that read this). Nobody can understand a word he says (in English), as he's from Chile, but what is funny is nobody can understand what he says in Spanish either (and that comes from people who are native Spanish speakers). Heh, anyway, that's sort of how SWIG is. You tell it to do something, and either it works, or it spits back an unintelligible mess that is totally undecipherable. Just like at my old job, you have to trust the magic, and if it breaks, you have no recourse. Their website has decent documentation, but not enough detail on how to resolve complex issues, leaving you to scour the internet like I did on XS. Do you use an abstraction that will have problems, but a smaller pool of people that can help? Or do you tough it out and use the harder solution, but one that has a larger pool of support? Guess which one I'm going with.
I tried to just look at existing perl modules that I knew linked to C/C++ on CPAN. I knew libxml, Wx, and the various DBD::* libraries had to make calls to the underlying layers, but the typemaps were far too complex for my puny mind to comprehend. I hoped the Wx library would have some insight, but it too blew my mind. I needed a concrete example or documentation on what steps needed to be taken.
John Keiser seems almost as frustrated as I am. This link is referenced in MANY other articles on the internet concerning XS and C++. Of course, this doesn't have an example of using external files, and ::sigh:: it only deals with ints for inputs and outputs. Sadly, since this was written in 2001, I ran into some issues just getting the examples working. In the end, I just gave up on it and looked for other documentation.
Stack Overflow is a pretty good place to get answers to questions, but most of the answers pointed to SWIG, Inline::C, and the perl-xs-tut (we'll get to the xs-tut here in a second).
Perl XS Mailing list 1 and 2 : I found these gems, while digging, and it gave me hope that building a typemap for std::string and std::vector might be possible. Unfortunately, the code was meant more as a scratchpad (and he states this explicitly). So close!
The Perl XS mailing list is a really great site. This place has a great signal to noise ratio (similar to the mod_perl mailing list). Yet again, things that I have questioned, more or less, have already been asked, answered, or told they are thinking about the problem wrong (the latter being my problem).
Boom! Here was some documentation of an external file with XS. Lots of concepts and few examples, though. I really wanted everything explained, no more magic.
WrappingPerl : Uses external files! Uses the STL!! ... Uses SWIG. Damn it. Ok, so I typed all this out, got it working, and tried to figure out what SWIG was generating based on the resulting ".cxx" file. It ended up being another brick wall. I was impressed that SWIG was capable of doing almost everything I needed it to do, but it was the same magic that I had to "trust" like I did back in my previous job. I decided that I was NOT going to use SWIG.
Kaye SWIG : An even better SWIG example. It was recently posted to Alberto Simões's perl blog by Kaye (courtesy of Shlomo Yona). This is the best SWIG documentation that deals with complex data structures and external files available. If anyone else writes a book about XS, they should just dump this entire distribution to paper, as it is incredibly well documented. Too bad it's C, otherwise I probably would have given up XS and just used this.
Perl XS Tutorial : Yes! Why the hell didn't I look here first?! Example 4 is perfect! External files! MYEXTLIB! (wtf is an MYEXTLIB?!)
As an aside, look at the perldoc for ExtUtils::MakeMaker, under MYEXTLIB, and see if you make the connection that THIS is the attribute that allows you to reference external libraries (yes, I know EXTLIB should have clued me in, but it didn't register as the solution to my problem).
Holy crap, ExtUtils::MakeMaker is complicated. Where the heck did this "MY" namespace come from? What are all these weird "make" snippets littered all around? Ok, just get it working, and then you can reverse engineer it so you can teach yourself what all these keywords mean and actually learn something, rather than trusting the "magic". So that's what I did. (I'll cover that when I finally get to the example.)
I used Module::Install for my last CPAN project, "Ravenel". I wondered if it could build XS modules as well. But it's funny to me: everyone seems to hate ExtUtils::MakeMaker, yet all the modules that use XS use it. It looks like this is the only system worth using if you're dealing with this stuff.
Ok, let's not get too excited, I still need to know how to build a typemap for STL based objects, and if it's possible, complex ones (Vectors and Maps).
Typically, in any Google search of a technical nature, you will inevitably start seeing posts in languages other than your own. I don't start hitting these until I exhaust all of the ones that are in English (obviously). (What do you call someone who is in Europe and can speak 5 languages? A waiter. What do you call someone who speaks one language and who is in Europe? An American.) Google Translate does a decent job of getting the point across, but the concepts backing the idea always seem to be lost in translation (love that movie).
Japanese blog : I ran into this first. And I thought I finally found it. The typemaps that wrap the STL for Perl. I have a "simple" class that you instantiate with a string as an argument, and some methods that take strings and spit them back out. This is formatted for SWIG, and combined with the earlier lesson I found on SWIG, I probably could have just got by with this. But I've come too far now. XS or bust!
Holly's Blog : But this is when I hit the proverbial jackpot. "Holly's page". I don't know who Holly is, but he/she is amazing and I wish I could sing his or her praises from the tallest buildings / trees / mountains. Holly has a 9 part XS C++ programming course on his or her site, "The introduction perlxs 1-9". It covers damn near everything I would ever want to know about XS, with concrete examples.
Dear Holly, you rock, you have my undying devotion until the end of time, because you have documented your brilliance with so much clarity that I was able to translate it from Japanese and still get what I needed out of it. I owe you a beer, sake, fruit juice, or whatever it is you drink. Thank you from the bottom of my beaten down soul. --dextius
I feel better now that I've gotten that diatribe out of the way. Let's show you an example of using an external library, passing strings back and forth between Perl and C++. Then, let's explain what every line does so we can demystify some of the magic and fear that comes when dealing with XS. Before I dump out all my code, there is one point I need to make. I found it's easier to abandon your build script that you normally use to build your C++ project, and just use what Makefile.PL generates. I accomplished this by simply copying my code into this new directory tree.
Here is the code
Change directories in your shell to somewhere where you can create new stuff and then run this.
This will create a skeleton for us to work with. Now create the SimpleLib directory.
Here is the code for our "simple" library.
/SimpleTest/SimpleLib/unitTests.c (A quick test "main")
At this point, you can compile "SimpleLib". You'll run these commands.
To verify, run the unitTests binary.
It should return:
Before I go any further, let's look at the Makefile.PL, as it is key to all of our future plans.
1. We need to build a static library; "make" will generate the "libSimpleLib.a" file.
2. "make bin" will generate the unitTests binary, running a "file unitTests" will tell you what architecture you're building for.
3. It is important your version of Perl was compiled with the same architecture (perl -v will tell you this).
4. A great resource on make is available here: Makefile Tutorial
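As a rough cross-check on the architecture point above, a tiny function like the one below reports the pointer size the C++ side was compiled with. The name "pointer_bytes" is just something I made up for this sketch; the idea is to compare its value against what "perl -V:ptrsize" prints for your Perl.

```cpp
#include <cstddef>

// Size of a pointer, in bytes, for the architecture this code was built for.
// Hypothetical helper: 4 means a 32-bit build, 8 means a 64-bit build.
constexpr std::size_t pointer_bytes() {
    return sizeof(void*);
}
```

If the two numbers differ, the library and the interpreter were built for different architectures and the link step is not going to work.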
Ok. Great, now we have a library built, and can trust that it is actually capable of doing something when told to (via the unitTests binary). Now let's hook this up to Perl.
What do I do when I get: "error: macro "do_open" requires 7 arguments, but only 2 given"
While I'm here, I ran into a bunch of other fun issues. Linking to Boost or ACE causes all kinds of wacky symbol collisions to fire off, but Google came to the rescue, again. Simply add these few lines to your ".xs" file (immediately after #include "ppport.h").
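For anyone wondering why this kind of collision happens at all, here is a stand-alone sketch of the workaround that is commonly suggested for the "do_open" error: undefine the offending macros before any C++ standard header gets pulled in. The two "#define" lines only simulate what Perl's headers do so the sketch compiles without perl.h, and "classic_locale_name" is a made-up function that exists only to prove the headers still work afterwards.

```cpp
// Simulated stand-ins for the function-like macros Perl's headers define.
// (Assumption for this sketch: in a real ".xs" file these come from perl.h.)
#define do_open(a, b, c, d, e, f, g) simulated_perl_do_open(a, b, c, d, e, f, g)
#define do_close(a, b) simulated_perl_do_close(a, b)

// The workaround: drop the macros before C++ headers see them.
#ifdef do_open
#undef do_open
#endif
#ifdef do_close
#undef do_close
#endif

#include <locale>   // declares std::messages<>::do_open and do_close
#include <string>

// Uses <locale> to show that the header is usable after the #undef lines.
std::string classic_locale_name() {
    return std::locale::classic().name();
}
```

The standard library's own "do_open" takes two arguments, which is where the "requires 7 arguments, but only 2 given" wording comes from: the preprocessor tries to expand the Perl macro inside the C++ header.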
Exceptions? So, my code throws exceptions. I cannot catch them, they cause Perl to blow up. I looked around, and eventually found the link above. It, like most of what I have learned about XS frightened me. I can see this is another "spend a week" researching on how other people have approached the problem, and then hack something of my own together, but hopefully someone will read this and tell me what the best approach is.
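Until someone points me at the best approach, the direction I'd try first is to catch everything at the boundary so that no exception ever escapes into Perl's C code. This is only a sketch of that idea in plain C++; "risky_lookup", "CallResult" and "guarded_lookup" are hypothetical names, not part of my library or of XS.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical library call that may throw.
std::string risky_lookup(const std::string& key) {
    if (key.empty()) {
        throw std::invalid_argument("key must not be empty");
    }
    return "value-for-" + key;
}

// Outcome of a guarded call: either a value or an error message.
struct CallResult {
    bool ok;
    std::string value;
    std::string error;
};

// Catch everything here so nothing propagates past this function.
CallResult guarded_lookup(const std::string& key) {
    try {
        return {true, risky_lookup(key), ""};
    } catch (const std::exception& e) {
        return {false, "", e.what()};
    } catch (...) {
        return {false, "", "unknown C++ exception"};
    }
}
```

In a real XS wrapper I would presumably check "ok" and hand "error" to Perl's croak only after the try/catch has finished since, as I understand it, croak unwinds with a longjmp and would skip C++ destructors if called from inside the block.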
Multiple C++ objects exposed by XS (in a single project)?
Wx does this, but it uses its own custom MakeMaker to pull it off. The one thing I noticed is that it created separate .xs files for each class. I assume this is what I need to do, but I don't know how to make Makefile.PL pick up the other .xs files I create describing the other objects defined in my header file. I also learned (the hard way) that you can't define more than one object in a ".xs" file, or maybe it's just the syntax I was using. I really don't know. I wish I had a few weeks to reverse engineer how Wx is doing this, but there is so *MUCH* code, I'm not even sure where to start.
Wow, this was really hard. I still don't know everything. I feel like, after poring over Extending and Embedding Perl for the last month or so, I can tackle a C-based library with some level of competency. But beyond these simple STL calls and an object wrapper, I still feel totally lost with C++ and XS. So many burning questions, and so few outlets for more information. Again, if I didn't state it before, getting any of this working at all wouldn't have been possible without "Holly's Blog". It's amazing how much information is on that page. It's ironic that I had to go through Japanese to get to XS (both of which are undecipherable to me).
What have I learned? That a C++ object is just another opaque object to Perl, and that the typemaps are what allow you to call the internal perl5 APIs to construct Perl data structures, using C++ code right alongside them. Once you learn it, it makes sense, but holy moly, what a hill you have to climb to get there. I also learned that I have a lot more to learn.
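The "opaque object" idea is easier to see without any Perl in the way. The sketch below is plain C++ and only mimics the pattern: the object's address is handed out as an integer and cast back on every call, which is roughly what a pointer typemap does when it stores the address inside a scalar. "Counter" and the three "counter_*" functions are hypothetical names for illustration.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical C++ class we want to expose.
class Counter {
public:
    explicit Counter(std::string label) : label_(std::move(label)), count_(0) {}
    int increment() { return ++count_; }
    const std::string& label() const { return label_; }
private:
    std::string label_;
    int count_;
};

// "new" side: allocate the object and hand back only an integer handle.
std::intptr_t counter_new(const std::string& label) {
    return reinterpret_cast<std::intptr_t>(new Counter(label));
}

// Method side: turn the handle back into a pointer before each call.
int counter_increment(std::intptr_t handle) {
    return reinterpret_cast<Counter*>(handle)->increment();
}

// "DESTROY" side: must run exactly once per handle, or the object leaks.
void counter_destroy(std::intptr_t handle) {
    delete reinterpret_cast<Counter*>(handle);
}
```

This is also where my memory management worries come from: the caller sees nothing but a number, so if the destroy step never runs (or runs twice), nothing on the caller's side will notice.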
Hopefully this mess I have written will inspire much more talented people to write even better documentation so I can update this entire page with a simple link to their page.
Thanks to all the people who wrote these posts on their blogs, and to Perlmonks, and to tye for "inspiring" this :-)