Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?

Perl and C++/STL

by rg0now (Chaplain)
on Mar 10, 2005 at 23:24 UTC ( #438468=perlmeditation: print w/replies, xml ) Need Help??

Dear Monks!

After many years of happily using Perl, lately I have been exposed to hack on a largish C++ project. So, I had to recall everything I have ever known about C++ and STL and to even learn much more new stuff (boy, g++ is really slow these days!)

I decided to read some nice tutorials on STL data structures, but soon I was shocked to realize that, during the many years of using Perl almost exclusively to everything (and only doing really low level stuff with XS and C), I have lost my ability to think in concepts other than Perl's good old arrays and hashes. So I began to search for STL counterparts.

Take for example arrays. The most straightforward STL choice would be vectors. With vectors, the familiar syntax $array[$index] = $element will work in C++ too, search and insert is fast. With C++ automatic variables, I also have the counterpart of Perl lexicals. This is really nice. However, based on my (granted, limited) understanding of STL, with vectors, you must know the size of the array well before starting to use it. Trying to write to the $indexth element of the array before actually initializing it is a pretty straight call for a segfault, as I had the chance to learn through live experience.

After falling into this trap a number of times, I realized that I more and more grow to use STL maps, which do not show this notorious problem. However, the more I used maps the more my programs began to crawl, up to the point that today, my Perl scripts tend to execute faster than the C++ counterparts.

Here is a little snippet (which is a shameless plug of merlyn's implementation to search the first 1 million primes), which remarkably demonstrates this phenomenon:

#include <map> #include <stdio.h> #define our(x) std::map<int,int> x #define my int #define eval int main() #define next continue eval{ my $UPPER = 1000000; our($sieve); for(my $i = 0; $i < $UPPER; $i++){ $sieve[$i] = 0; } for (my $guess = 2; $guess <= $UPPER; $guess++) { if($sieve[$guess]){ next; } for (my $mults = $guess * 2; $mults <= $UPPER; $mults += $guess) { $sieve[$mults] = 1; } } my $count = 0; for(my $i = 2; $i < $UPPER; $i++){ if($sieve[$i] == 0){ $count++; } } printf("The number of primes below %i is %i\n", $UPPER, $count); }
Provided that you name the file as sieve.cpp, you can compile it with g++ -Wall -fdollars-in-identifiers sieve.cpp -o sieve or you can directly feed it into Perl, too (perl sieve.cpp). For clarity, it is currently not strict-safe. If you are interested in multilingual obfu, look at here: Perl is C.

What is pretty interesting here is that the C++ code runs for 14 seconds, while 7 seconds is enough for Perl to give the very same answer. For this specifically CPU intensive problem, Perl's compiled byte-code runs twice as fast as the native machine code generated by C++! Now, this is what I call surprising. My C++-inclined colleagues simply couldn't believe it until not running the code by themselves. Moreover, the higher $UPPER is set, the larger the difference between the Perl code and the C++ code will become. This is why most of my C++ code written so far sucks so very badly...

I was quickly pointed out that this behavior is obviously a consequence of my stupid choice of STL container classes. STL maps are not on par with Perl arrays, even if the syntactical similarity made me blatantly think so. Moreover, in this very special case, the size of the array is known in advance, so I could have very much like used STL vectors as well, instead of maps. Change the

#define our(x) std::map<int,int> x
#include <vector> #define our(x) std::vector<int> x($UPPER)
and now the code is transformed to vectors. Consequently, the execution time of the C++ code drops below 1 second. The consequence seems to be: Choose your STL containers carefully!

Consequently, my question is: are there any other monks here, who caught themselves on trying to force Perlish way of thinking to other programming languages? In what way did you abuse those other programming languages?


Replies are listed 'Best First'.
Re: Perl and C++/STL
by kvale (Monsignor) on Mar 10, 2005 at 23:51 UTC
    I, too, went straight for the vectors and maps when I started to program in C++ and agree that they are not as convenient as arrays and hashes on Perl. But it is nice that they are there. It makes C++ almost bearable.

    I really miss simple, yet powerful syntax for regular expressions as well. Boost regex is a step in the right direction, but these have the impression of being bolted on rather than a part of the core language, and so are less elegant.

    Other than powerful data structures, the things I miss most when programming C/C++ are the control statments. After years in Perl it is natural to say 'do this if condition is true", but one has to rewrtie the sentence in C/C++ 'if (condition is true) do this. ". I miss 'unless' as well. When I have to stop to rearrange my logic to suite limited C/C++ control contstructs, it distracts me and the flow of ideas into code is not as smooth.

    On the other side of the fence, there is probably some Haskell or Lisp coder cursing out perl's inadequate tail recursion support :)


Re: Perl and C++/STL
by athomason (Curate) on Mar 11, 2005 at 06:19 UTC
    std::map is required by the ANSI C++ standard to be implemented as a red-black tree, which has logarithmic lookup time. Hash tables have roughly constant lookup time, so it's not surprising that for largish n the hash will outperform the tree. Unfortunately, the STL in the current standard doesn't provide for a hash-based associative container type, so you can't really compare apples to apples here. The SGI STL implementation does contain a non-standard hash_map template; that would be a more appropriate comparison.

    The moral of the story is that once your problem becomes large enough, asymptotic algorithm performance dominates the constant factors separating a "slow" language from a "fast" one.


    As for your pre-allocation concerns with std::vector, there's good news. You will indeed get segfaults using the square-bracket notation for entries that don't exist (vector<int> a(1); cout << a[3];), but the .at() accessor performs the equivalent function after doing a bounds check (it throws an exception if you violate the bounds instead of segfaulting). And if you find yourself needing to increase the size of the vector, the .resize() method does just that for whatever size you need.

      The SGI STL implementation does contain a non-standard hash_map template; that would be a more appropriate comparison.

      This made me think a bit further. Although I myself have never felt that the array implementation of Perl stands in my way (and I have never heard of anyone, who has ever done so, so it seems a one-fits-all solution), it would be pretty nice to be able to choose the type of your arrays in runtime. This way, you could choose the most appropriate data structure for your specific needs.

      But, with the advent of Perl 6, you can do it. I imagine something along the lines of:

      role List_a_la_STL_vector { ... multi method Num postcircumfix:<<[]>>( Num @self is constant : Num $index){ # return the $index element } ... } role List_a_la_STL_deque { ... multi method Num postcircumfix:<<[]>>( Num @self is constant : Num $index){ # return the $index element } ... } ... # use vector just like an STL vector my @vector does List_a_la_STL_vector; ... # use vector just like an STL deque my @deque does List_a_la_STL_deque;
      This is a really perlish solution. You are not forced to step into the realms of computer science and be intimate with storage requirements and complexity of the underlying data stuctures, but if you do, you can be much more efficient.


Re: Perl and C++/STL
by Whitehawke (Pilgrim) on Mar 11, 2005 at 08:21 UTC

    I actually learned C++ first, then Perl, so what you describe has been less of an issue for me. I avoid abusing other languages by avoiding other languages.

    Just a suggestion: if you intend to continue with C++, pick up a copy of The C++ Standard Library: A Tutorial And Reference by Nicolai M. Josuttis. Definitely a don't-miss book if you have to work with the STL.

Re: Perl and C++/STL
by chardan (Initiate) on Dec 17, 2011 at 11:28 UTC
    2011 (almost 2012) Hi, I've done some advanced work on this and would like to hear from Perl monks who are interested in hearing from interested folks. I wrote a library that didn't work like STL, but allowed Perl to gain some of the benefits-- in a way, betraying Stepanov by retreating to the reason he rejected Ada, but gaining some benefits. I'm interested in Perl 5 or 6 coders who are interested in the subject!

      You've replied to a six year old node. You are much more likely to gain useful feedback if you post a new top level node describing what you have done and including a reference to the 2005 node you just replied to.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlmeditation [id://438468]
Approved by kvale
Front-paged by grinder
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others cooling their heels in the Monastery: (2)
As of 2017-08-20 10:02 GMT
Find Nodes?
    Voting Booth?
    Who is your favorite scientist and why?

    Results (313 votes). Check out past polls.