PerlMonks  

Dynaloader/XS: sharing C symbols across shared objects

by creamygoodness (Curate)
on Jun 10, 2008 at 00:41 UTC (#691130=perlquestion)
creamygoodness has asked for the wisdom of the Perl Monks concerning the following question:

Greets,

I have two XS modules, and I would like C symbols loaded by one to be accessible from the other. The ultimate goal is to avoid including the Snowball stemming library in the KinoSearch distribution, with all of its compilation overhead, but instead load Lingua::Stem::Snowball and its associated shared object and have the Snowball C API be accessible from within the KinoSearch shared object.

This works fine on Mac OS X, but not on FreeBSD or Linux. The easiest way to demonstrate the problem is to use Inline C.

Here's the Hello.pm perl module:

package Hello;
use Inline C => <<'END_C';
void say_hello() {
    printf("Greetings, earthlings!\n");
}
END_C
1;

Here's the hello_goodbye.pl perl script:

use strict;
use warnings;
use Hello;
use Inline C => <<'END_C';
extern void say_hello();
void say_goodbye() {
    say_hello();
    printf("Prepare to die!\n");
}
END_C
say_goodbye();

The script works fine on OS X...

/Users/marvin/perltest/ $ perl hello_goodbye.pl
Greetings, earthlings!
Prepare to die!
/Users/marvin/perltest/ $

... but fails on Linux...

marvin@linlin:~/perltest$ /usr/local/debugperl/bin/perl5.10.0 hello_goodbye.pl
/usr/local/debugperl/bin/perl5.10.0: symbol lookup error: /home/marvin/perltest/_Inline/lib/auto/hello_goodbye_pl_f0b6/hello_goodbye_pl_f0b6.so: undefined symbol: say_hello
marvin@linlin:~/perltest$

... and FreeBSD:

$ perl hello_goodbye.pl
/libexec/ld-elf.so.1: /usr/home/creamyg/perltest/_Inline/lib/auto/hello_goodbye_pl_f0b6/hello_goodbye_pl_f0b6.so: Undefined symbol "say_hello"
$

Can anyone explain this behavior, and perhaps suggest a solution?

Update: RESOLVED

On Linux and FreeBSD, a shared object loaded dynamically via the C function dlopen() does not, by default, export its symbols to shared objects loaded afterwards. To override this when the loading happens via Perl, it's necessary to swap out XSLoader for the more complex but more powerful DynaLoader and set some flags. Here's the patch to Lingua::Stem::Snowball:
Index: lib/Lingua/Stem/Snowball.pm
===================================================================
--- lib/Lingua/Stem/Snowball.pm (revision 97)
+++ lib/Lingua/Stem/Snowball.pm (working copy)
@@ -15,13 +15,17 @@
 $VERSION = '0.95';
-@ISA = qw( Exporter );
+@ISA = qw( Exporter DynaLoader );
 %EXPORT_TAGS = ( 'all' => [qw( stemmers stem )] );
 @EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } );
-require XSLoader;
-XSLoader::load( 'Lingua::Stem::Snowball', $VERSION );
+require DynaLoader;
+__PACKAGE__->bootstrap($VERSION);
+# Ensure that C symbols are exported so that other shared libraries (e.g.
+# KinoSearch) can use them. See DynaLoader docs.
+sub dl_load_flags { 0x01 }
+
 # a shared home for the actual struct sb_stemmer C modules.
 $stemmifier = Lingua::Stem::Snowball::Stemmifier->new;
Thanks to Rob for pointing me in the right direction.
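
For readers wondering what the 0x01 flag amounts to at the C level: the DynaLoader documentation describes 0x01 as making symbols available for linking by objects loaded later, which on dlopen()-based platforms corresponds to RTLD_GLOBAL. Here is a minimal sketch of that mapping. The names dlopen_mode_for() and DL_LOAD_GLOBAL are hypothetical, chosen for illustration; this is not DynaLoader's actual source.

```c
#include <dlfcn.h>

/* Assumed name for the bit that dl_load_flags { 0x01 } sets:
   "make symbols available to objects loaded later". */
#define DL_LOAD_GLOBAL 0x01

/* Hypothetical helper: translate a dl_load_flags value into the
   mode argument that would be handed to dlopen(). */
int dlopen_mode_for(int dl_load_flags)
{
    int mode = RTLD_LAZY;            /* resolve functions on first use */

    if (dl_load_flags & DL_LOAD_GLOBAL)
        mode |= RTLD_GLOBAL;         /* export symbols to later loads */

    return mode;
}
```

The result would be passed as the second argument to dlopen(). Without RTLD_GLOBAL, the library's symbols stay private to it on Linux and FreeBSD, which matches the "undefined symbol" errors shown above.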
--
Marvin Humphrey
Rectangular Research ― http://www.rectangular.com

Re: Dynaloader/XS: sharing C symbols across shared objects
by syphilis (Canon) on Jun 10, 2008 at 01:18 UTC
    Basically, if the Hello.pm shared object exports the say_hello() function, then all works fine. Otherwise you get the failure you reported.

    There's a (documented) EU::MM parameter called FUNCLIST that's helpful here, but it hasn't been made available to Inline::C - so I think that, wrt Inline::C, you're snookered. (Though, having said that, someone may well come up with a different solution.)

    However, if you convert Hello.pm to a normal XS module, and provide WriteMakefile() in its Makefile.PL with:
    FUNCLIST => ['boot_Hello','say_hello'],
    then all should work fine. At least it does for me, on Win32.

    However, as I understand it, you don't control the real world equivalent of "Hello.pm". Does that, in itself, pose a problem ?

    Cheers,
    Rob
    Update: Inserted boot_Hello into the FUNCLIST arg. (It's also needed, I believe.)
    Update 2: The Inline::C script then also needs to link to the Hello shared object or its import lib. (Things aren't quite as simple as I recalled ... it's looking more and more kludgy by the minute :-)
      Basically, if the Hello.pm shared object exports the say_hello() function, then all works fine. Otherwise you get the failure you reported.

      That makes sense to me. If I understand correctly, then the failure mode is very similar to what you get when trying to load a function that a Perl module doesn't export.

      The solution would seem to be to recompile Lingua::Stem::Snowball with additional options that specify that its shared object should export the necessary symbols. Fortunately, I maintain Lingua::Stem::Snowball, so I can release a new version if need be.

      I've been trying gcc's -export-dynamic flag, which I specify using Module::Build's extra_linker_flags option (and I've verified that it appears on the command line). No luck yet though. Hmm.

      --
      Marvin Humphrey
      Rectangular Research ― http://www.rectangular.com
Re: Dynaloader/XS: sharing C symbols across shared objects
by syphilis (Canon) on Jun 10, 2008 at 07:06 UTC
    +sub dl_load_flags { 0x01 }

    Wow !! I didn't realize that things were *that* simple on *nix-type operating systems. (I've just checked on my old mandrake-9.1 box, and things really *are* that simple.) There's certainly more than that required on Windows - though I realise that's probably not an issue for the OP.

    For a start, on Win32 we need to have the dll export the symbols - hence the usefulness of 'FUNCLIST'. (I was forgetting that probably wouldn't be an issue on most other operating systems.)

    Secondly, on windows, there's a need to be able to resolve all symbols at compile-time. But, on my linux box, I've just realised there's no such condition to be met. As an example, the following Inline::C script runs fine on linux, but fails to build on windows:
    use warnings;
    use Inline C => <<'EOC';
    void extern crap();
    void foo() {
        crap();
    }
    EOC
    print "Compiled fine";
    On linux (with gcc) that outputs simply Compiled fine, but on windows (using the MinGW port of gcc) the build phase fails with:
    try_pl_d349.o:try_pl_d349.c:(.text+0x5): undefined reference to `crap'
    collect2: ld returned 1 exit status
    dmake: Error code 129, while making 'blib\arch\auto\try_pl_d349\try_pl_d349.dll'
    This is, no doubt, old news to many here ... but the starkness of the differences took me somewhat by surprise.

    Cheers,
    Rob

      Windows is a target for both KinoSearch and Lingua::Stem::Snowball, but it's such a pain to deal with that compatibility lags in the dev branch of KS. For today, I just needed a band aid to get a user who's working with the KS svn trunk on Linux up and running; the DynaLoader patch does that. For a more robust solution, I'll need to take salva's suggestion.

      But here's a question for you: is it really "dynamic linking" if you need to resolve every symbol at compile time? I mean, the whole point of dynamic loading is to put off that resolution: "I'll tell you where the actual compiled code to run crap() is the first time you need it at runtime."

      --
      Marvin Humphrey
      Rectangular Research ― http://www.rectangular.com
        For a more robust solution, I'll need to take salva's suggestion

        Yes, salva's solution (a salvation ?) seems like a good one - though I haven't yet been able to work out exactly how to implement it. Pointers to functions in C ? ... then wrapped in an SV ? ... that's more than enough to frighten me.
        If someone feels inclined to present a simple demo of the procedure, I, for one, will certainly be taking a good look at it.

        At RFC: Setting up a minGW compiling envronment for Perl 5.10 there's a long and drawn out discussion that unravels this very same issue wrt Glib and Cairo on Win32. Seems that Glib and Cairo might be much more Windows-friendly if the approach presented by salva were adopted by their developers.

        is it really "dynamic linking" if you need to resolve every symbol at compile time?

        The same question is asked at http://sig9.com/node/35 - and a simple demo solution that involves the LoadLibrary() and GetProcAddress() functions (from the Windows API) is provided. I imagine it would be very tiresome to attempt to incorporate that approach into portable XS code. (I note that the "solution" presented there also involves "pointers to functions".)

        Cheers,
        Rob
Re: Dynaloader/XS: sharing C symbols across shared objects
by salva (Monsignor) on Jun 10, 2008 at 07:53 UTC
    There is another way to do it:

    On the "server" module expose a perl function (or a global variable) that returns a pointer to a table containing pointers to all the functions that you want to share. Then on the "client" module, at boot, call the function returning the table pointer and save it into a static variable for later usage. For instance, Time::HiRes uses that approach.

    It is more laborious, but it is also guaranteed to work on any operating system, whereas the dl_load_flags feature seems to be available only on some OSes.
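
A rough sketch of the table-of-function-pointers approach described above, in plain C with the Perl glue left out. Everything named here (hello_api, hello_api_handle, client_boot, client_greeting) is hypothetical and made up for illustration; it is not the Time::HiRes code. The integer handle stands in for the value a Perl function or global variable would carry between the two modules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical API table published by the "server" module. */
typedef struct {
    int          version;            /* lets clients detect layout changes */
    const char *(*greeting)(void);   /* one slot per shared function */
} hello_api;

/* --- "server" side: would live in the first shared object --- */
static const char *hello_greeting(void)
{
    return "Greetings, earthlings!";
}

static const hello_api hello_api_table = { 1, hello_greeting };

/* Returns the table address as an integer, the way an XS function
   could hand it to Perl as a plain number. */
intptr_t hello_api_handle(void)
{
    return (intptr_t)&hello_api_table;
}

/* --- "client" side: would live in the second shared object --- */
static const hello_api *hello;       /* filled in once, at boot */

void client_boot(intptr_t handle)
{
    hello = (const hello_api *)handle;
    assert(hello != NULL && hello->version == 1);
}

/* Calls through the table; the dynamic linker never has to resolve
   hello_greeting by name. */
const char *client_greeting(void)
{
    return hello->greeting();
}
```

The client would call client_boot(hello_api_handle()) once at load time, with the handle fetched through Perl in the real setup, and afterwards reach every shared function through the saved pointer. Since no symbol lookup crosses the two shared objects, the loader's export rules stop mattering.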

      Thanks for this very useful tip, salva. It's a brute force technique, and I can see why it's reliable -- all you're doing is passing around function pointers wrapped in Perl SVs.

      There are only four or five symbols in the Snowball C API, so I can add a full public C API to Lingua::Stem::Snowball without too much effort, directly emulating the C API for Time::HiRes.

      --
      Marvin Humphrey
      Rectangular Research ― http://www.rectangular.com
