PerlMonks  

Inline CPP undefined subroutine

by Alessandro (Acolyte)
on Nov 05, 2018 at 11:39 UTC (#1225257=perlquestion)
Alessandro has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks, I wrote a subroutine with Inline CPP to count the occurrences of some characters in a string, but I can't get it to work: Perl complains that the subroutine is undefined. Here is my code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    my $file = $ARGV[0];
    open (my $fh, "<:encoding(UTF-8)", "$file") or die "Could not open < $file";
    use Inline 'CPP' => << 'END';
    using namespace std;
    int countGC(string gcString) {
        int res(0);
        for (int i = 0; i < gcString.length(); i++) {
            if (gcString[i] == 'C' || gcString[i] == 'G') {
                res++;
            }
        }
        return res;
    }
    END

    while (my $line = <$fh>){
        my @string = split /\t/, $line;
        my $seq = $string[2];
        printf ("%d\n", countGC($seq));
    }
    close $fh;
And here is the error message:
    Undefined subroutine &main::countGC called at GC_gfa_filter.pl line 22, <$fh> line 1.
Any help is welcome. Thank you.

Re: Inline CPP undefined subroutine
by bliako (Curate) on Nov 05, 2018 at 12:54 UTC

    It's because int countGC(char *) is really undefined!

    I have not used Inline::CPP before but it seems to me that a Perl string (at least in your case) will be passed as a char array rather than as a std::string.

    Although I have no idea how to tell it to pass a perl string as a std::string rather than a char *, the fix is obvious:

        int countGC(char * _gcString) {
            string gcString = string(_gcString);
            int res(0);
            for (int i = 0; i < gcString.length(); i++) {
                if (gcString[i] == 'C' || gcString[i] == 'G') {
                    res++;
                }
            }
            return res;
        }

    bw, bliako

      If you can do without using std::string at all then the following is simpler and will be faster. And you can use Inline C (though it works as is with Inline CPP):

          int countGC(char * gcString) {
              int res = 0;
              for (int i = strlen(gcString); i-->0; gcString++) {
                  if (*gcString == 'C' || *gcString == 'G') {
                      res++;
                  }
              }
              return res;
          }

      bw, bliako

        Now that the question is considered solved, this piece of code might serve as an interesting side note about (premature) optimization.

        I was going to observe that ord('C') = 0x43 and ord('G') = 0x47, so you could do the comparison in one step (if ((*gcString | 4) == 'G') or if ((*gcString & 0xFB) == 'C'); the inner parentheses are needed because == binds tighter than | and &), and perhaps compare 8 bytes in one go by casting the char * pointer to uint64_t * and doing the necessary accounting.

        Then it occurred to me to check the code the compiler actually generates from the simple and readable function above. There is a nice online service at godbolt.org that lets you do exactly that. Paste the function text into the source window (and add the necessary #include <cstring> header to make it compile), enter -O3 for compiler options, and behold. GCC 8.2 not only notices the similar ASCII codes and uses a trick similar to mine, but it generates an efficient but nearly unreadable main loop using SIMD instructions that compares 16 bytes in one go (which is better than what you can do with simple, standard C).

        I also had the idea of replacing the loop in the function with while (*gcString++), thinking that strlen needlessly scans through the string once to find the terminator, but guess what - this kills the optimization. It needs to know the length in advance to be able to do the advanced SIMD loop.
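        The three loop shapes mentioned in this reply can be put side by side for pasting into a compiler explorer. This is only a sketch: the function names are made up for illustration, and nothing here claims which variant a given compiler vectorizes. The thread's observation was made with GCC 8.2 at -O3, and other compilers or versions may behave differently.

```cpp
#include <cstddef>
#include <cstring>

// Counted loop: the length is known before the loop starts
// (the shape the thread reports GCC 8.2 -O3 turning into SIMD code).
int count_gc_counted(const char *s) {
    int res = 0;
    const std::size_t n = std::strlen(s);
    for (std::size_t i = 0; i < n; ++i) {
        if (s[i] == 'C' || s[i] == 'G') {
            ++res;
        }
    }
    return res;
}

// Masked comparison: 'C' is 0x43 and 'G' is 0x47, differing only in bit 2,
// so clearing that bit maps both onto 'C'. The parentheses around the
// mask are required because == binds tighter than &.
int count_gc_masked(const char *s) {
    int res = 0;
    const std::size_t n = std::strlen(s);
    for (std::size_t i = 0; i < n; ++i) {
        if ((s[i] & 0xFB) == 'C') {
            ++res;
        }
    }
    return res;
}

// Sentinel loop: no strlen call, stops at the terminating NUL.
// The trip count is not known in advance here.
int count_gc_sentinel(const char *s) {
    int res = 0;
    for (; *s != '\0'; ++s) {
        if (*s == 'C' || *s == 'G') {
            ++res;
        }
    }
    return res;
}
```

        All three return the same count for the same input; they differ only in what the optimizer has to work with.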

      ...I have no idea how to tell it to pass a perl string as a std::string rather than a char *

      It requires a typemap entry, telling Inline::CPP how to deal with the "string" argument.
      This can (untested) be achieved most simply by inserting the following line into perl's ExtUtils/typemap (somewhere before the beginning of that file's "INPUT" section) :
      string T_PV
      Or, for portability, you can accompany the Inline::CPP script with a separate typemap (named, eg my.typemap) that contains that line - in which case you need to tell Inline::CPP the name of that typemap. (See rewritten script below.)
      If you want to map "string" to some type that is unknown to ExtUtils/typemap then you'd need your typemap to additionally specify how to handle that INPUT:
          string    MYPV

          INPUT
          MYPV
              $var = ($type)SvPV_nolen($arg)
      Here's the script I ran - modified to print, line by line, the number of Cs and Gs in any plain text input file:
          use strict;
          use warnings;

          my $file = $ARGV[0];
          open (my $fh, "<", "$file") or die "Could not open < $file";

          use Inline 'CPP' => Config =>
              BUILD_NOISY => 1,
              TYPEMAPS => './my.typemap';

          use Inline 'CPP' => << 'END';
          using namespace std;
          int countGC(string gcString) {
              int res(0);
              for (int i = 0; i < gcString.length(); i++) {
                  if (gcString[i] == 'C' || gcString[i] == 'G') {
                      res++;
                  }
              }
              return res;
          }
          END

          while (my $line = <$fh>){
              printf ("%d\n", countGC($line));
          }
          close $fh;
      Cheers,
      Rob
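      The INPUT rule quoted in this reply, $var = ($type)SvPV_nolen($arg), amounts to a C-style cast from a char pointer to the declared parameter type. The sketch below shows, in plain C++ without any Perl headers, why that cast is acceptable when the type is std::string. Both pv_of (a stand-in for SvPV_nolen) and call_like_xs are hypothetical names invented for this illustration; they are not part of Inline::CPP or the Perl API.

```cpp
#include <string>

// Hypothetical stand-in for SvPV_nolen(): returns a NUL-terminated buffer.
static const char *pv_of(const char *scalar_bytes) {
    return scalar_bytes;
}

// Same logic as the countGC in the thread, taking std::string by value.
int countGC(std::string gcString) {
    int res = 0;
    for (std::string::size_type i = 0; i < gcString.length(); i++) {
        if (gcString[i] == 'C' || gcString[i] == 'G') {
            res++;
        }
    }
    return res;
}

// Imitates the typemap expansion with $type = std::string: the cast
// invokes the std::string constructor that copies a C string.
int call_like_xs(const char *arg) {
    std::string var = (std::string)pv_of(arg);
    return countGC(var);
}
```

      Because the conversion goes through a NUL-terminated pointer, a string containing an embedded NUL byte would be cut short at that point in this sketch.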
      Thanks, that puts me on the way to solving it! There was a C++ mistake left; I ended up with this:
          use Inline 'CPP' => << 'END';
          int countGC(char * _gcString) {
              std::string gcString(_gcString);
              int res(0);
              for (int i = 0; i < gcString.length(); i++) {
                  if (gcString[i] == 'C' || gcString[i] == 'G') {
                      res++;
                  }
              }
              return res;
          }
          END
      I would rather use:
          int countGC(char * _gcString) {
              string gcString(_gcString);
              int res(0);
              for (int i = 0; i < gcString.length(); i++) {
                  if (gcString[i] == 'C' || gcString[i] == 'G') {
                      res++;
                  }
              }
              return res;
          }
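      As a side note to the two variants above, the copy into a std::string can be avoided while still using the C++ standard library. The following is a minimal sketch using std::count_if directly on the character buffer. It assumes a C++11 compiler (for the lambda), and whether a particular Inline::CPP setup binds a const char * parameter and compiles with C++11 enabled is an assumption, not something tested in this thread.

```cpp
#include <algorithm>
#include <cstring>

// Counts 'C' and 'G' in a NUL-terminated string without copying it.
int countGC(const char *gcString) {
    const char *end = gcString + std::strlen(gcString);
    return static_cast<int>(std::count_if(gcString, end, [](char c) {
        return c == 'C' || c == 'G';
    }));
}
```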
      Cheers :)
Re: Inline CPP undefined subroutine
by BillKSmith (Parson) on Nov 06, 2018 at 04:45 UTC
    There is a simple all-perl solution:
        use strict;
        use warnings;
        my $seq = 'AAGCTTGACCGGG';
        my $count = $seq =~ tr/CG/CG/;
        print $count, "\n";

        RESULT: 8
    Bill
Re: Inline CPP undefined subroutine
by harangzsolt33 (Monk) on Nov 05, 2018 at 16:22 UTC
    Why don't you write the entire program in C++ instead? That would be an idea. ;)

      My guess would be that this is a prototype for something larger/different.
