Re: Favourite modules April 2003

by mirod (Canon)
on Apr 15, 2003 at 08:28 UTC

in reply to Favourite modules April 2003

Here they are (there are more than 10, but that's why Perl is great: tons of super-useful modules):

Modules that make Perl what it is:

  • CGI (+ CGI::Carp), I am an old-school guy (though I use it with mod_perl when it makes sense),
  • DBI (+ DBD::mysql, DBD::Pg, DBD::SQLite), SQLite for quick hacks and when there is only one user, Pg most of the time,

Very useful:

Convenient modules I use a lot these days:

  • Memoize, especially to memoize the output of the following one,
  • Digest::MD5, to sort through heaps of archived data and figure out what was stored twice under different names

XML modules:

  • XML::Twig, surprise surprise ;--)
  • XML::Simple, battles with YAML for my configuration storage needs,
  • XML::PYX, for one-liners,
  • XML::LibXML, when I manage to get it installed, which is not that often

Re: Re: Favourite modules April 2003
by Anonymous Monk on Apr 15, 2003 at 16:13 UTC
    YAML, successor to Data::Denter, nicer output IMHO than Data::Dumper, safer to use too, modified to allow for variables (a-la-XML::Simple in its latest version)

    YAML is not particularly accurate, and the author has no intention of improving matters. Personally I wouldn't use it at all, and I certainly wouldn't recommend it to others without serious caveats.

      I have heard those claims, but the data I deal with is usually quite simple, just hashes/arrays/scalars/simple objects, no code refs or the like. So it looks accurate enough for me.

      Could you be more precise and give us examples of the kind of data that causes problems with the module?

      As a side note, I exchanged a couple of emails with ingy when I wanted to patch the module, and he was most helpful.

Re: Re: Favourite modules April 2003
by cees (Curate) on Apr 15, 2003 at 17:54 UTC

    Just a comment/question on what you mentioned here:

    • Memoize, especially to memoize the output of the following one,
    • Digest::MD5, to sort through heaps of archived data and figure out what was stored twice under different names

    I'm just wondering what benefit Memoize provides in this context. It seems from your brief description that you are doing MD5s of the archived data so that you don't have to keep all that data in memory (just an MD5 hash of the data). This will make it easy to find duplicates and won't take up much memory. But by memoizing it, you are still keeping all the archived data in memory, and you are keeping the MD5 hash in memory as well. You might as well just store the data itself in a hash and do a straight comparison on it, saving the time required to do an MD5 hash on it.

    I'm curious to know if I am blatantly missing something here, or misunderstanding the usefulness of Memoize in this context.

    By the way, I think Memoize is a great module, but I don't think there are many situations where it is actually beneficial.

      The function I memoize gets passed a file name, slurps the file, normalizes spaces and computes its MD5. So I don't think the content of the file is cached, as it is internal to the function.

      I agree that using Memoize only saves me the cost of a hash (filename => MD5). I just like how easy it is to use, and how it removes some extra code. As programmers we are used to adding extra data structures and code to cache that kind of result, but really, using Memoize gets us closer to the initial algorithm for solving the problem. At least that's how I justify using it here ;--)
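
A minimal sketch of the kind of function described above. The original code was not posted, so this is only a guess at its shape: the names file_digest and find_duplicates are hypothetical, and the exact whitespace normalization is an assumption. Memoize caches on the file name, so only the name and the digest are retained between calls, not the file contents.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Memoize;

# Hypothetical helper: slurp a file, normalize whitespace, return its MD5.
sub file_digest {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $content = do { local $/; <$fh> };    # slurp the whole file
    close $fh;
    $content =~ s/\s+/ /g;                   # collapse runs of whitespace (assumed rule)
    $content =~ s/^\s+|\s+$//g;              # trim leading/trailing whitespace
    return md5_hex($content);                # $content goes out of scope here
}

# Cache is keyed on the argument (the file name), value is the digest.
memoize('file_digest');

# Hypothetical helper: group file names by digest, return groups of 2 or more.
sub find_duplicates {
    my @paths = @_;
    my %by_digest;
    push @{ $by_digest{ file_digest($_) } }, $_ for @paths;
    return grep { @$_ > 1 } values %by_digest;
}
```

Any later call to file_digest with a file name already seen returns the cached digest without re-reading the file, which is the hand-written (filename => MD5) hash that Memoize replaces.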

        That makes more sense. Memoize will only cache the arguments to the function and the return value, so you are fine with your implementation. Sorry for jumping on this, but I thought that you might be doing something like the following:

        use Digest::MD5 qw(md5);
        use Memoize;
        memoize('md5');

        This would use gobs of memory (depending on the input) and wouldn't really accomplish anything useful.

        Do I get an award for coming up with the most unproductive use of Memoize???
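
To make the memory concern concrete, here is a small sketch, assuming a hypothetical wrapper named digest_of_data and using Memoize's documented SCALAR_CACHE => [HASH => ...] option to expose the cache. With the default normalizer the cache key is built from the arguments themselves, so memoizing a function that takes the data as its argument keeps a full copy of the data as the key.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Memoize;

my %cache;    # supplied to Memoize so the cache can be inspected

# Hypothetical wrapper: takes the data itself, not a file name.
sub digest_of_data { return md5_hex($_[0]) }

memoize('digest_of_data', SCALAR_CACHE => [ HASH => \%cache ]);

my $data   = 'x' x 100_000;
my $digest = digest_of_data($data);    # scalar context, so SCALAR_CACHE is used

# The single cache key is the whole input string; the value is the digest.
my ($key) = keys %cache;
printf "key length: %d, digest length: %d\n", length($key), length($digest);
```

Compare this with a function keyed on a file name, where the key is a short path and the data never enters the cache.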
