PerlMonks  

making perl more forgetting

by ddzeko (Acolyte)
on May 16, 2004 at 18:09 UTC ( [id://353785]=perlquestion )

ddzeko has asked for the wisdom of the Perl Monks concerning the following question:

Cheerio,

I'm using Perl for financial data manipulations and am working with data which I do not want to keep longer than necessary in the program's memory.

To check how Perl behaves when I want it to forget the value of some variable I made a simple program that looked like this:

$a = 'charangaBoom';
$b = 'BANG!';
$c = $a . $b;
$d = "D-$c";
substr($d, 2, 8) = 'x' x 8;
sleep(600);

Then, I used

strings /dev/mem | egrep 'Boom[B][A][N][G]'
to check whether "D-charangaBoomBANG" could be found anywhere. Unfortunately for me, it could.

Why is that? Is there any workaround to this except me writing another XS module providing me TIEd interface to forgettable variables?

Replies are listed 'Best First'.
Re: making perl more forgetting
by Corion (Patriarch) on May 16, 2004 at 18:15 UTC

    Of course, in your case, $c is still holding charangaBoomBANG!, so your grep will still find it.

    There are also many other ways in which a value can be copied around. For example, when the memory for the string part of a scalar has to be reallocated in a different region, the old, now unused memory is not cleared in any way.

    In short, I don't see much hope for you to erase "sensitive" data from memory within Perl unless you can confine the handling to very short places and control every step of handling.

    Personally, I would rely on the security features of the OS, especially that only the superuser (if at all) has read rights to /dev/mem, and make sure that no other script runs as the current user which could peek at the sensitive data in memory.

      Contents of $c were found as expected. But, take a look at this (grep output):

      charangaBoomBANG!
      D-charangaBoomBANG!
      D-xxxxxxxxBoomBANG!

      It looks like the substr() operation duplicated the $d string and served me another instance with a copy of my old data.

      That was my second try, after assigning an empty string '' gave a similar result (the secret "D-...BANG!" was preserved).

      All my hopes now rest on the vec() function, and I'll give it a try right away!

      Cheerio!
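
For reference, a minimal sketch of the vec() attempt mentioned above, overwriting a scalar's string value byte by byte. `wipe_bytes` is a hypothetical helper name, not something from the thread. As the replies point out, this can only affect the scalar's current value: earlier copies, operator targets and reallocated buffers are out of its reach, and even the string literal in this demo is kept elsewhere by the compiled program.

```perl
use strict;
use warnings;

# Hypothetical helper: set every byte of a scalar's string value to zero.
# Best effort only; it cannot reach copies perl made earlier.
sub wipe_bytes {
    my ($ref) = @_;                      # reference to the scalar to wipe
    vec($$ref, $_, 8) = 0 for 0 .. length($$ref) - 1;
    return;
}

my $secret = 'D-charangaBoomBANG!';      # demo value; the literal itself survives in the op tree
my $before = length $secret;
wipe_bytes(\$secret);

# The length is unchanged and every byte is now NUL.
printf "length kept: %s, all zero: %s\n",
    (length($secret) == $before ? 'yes' : 'no'),
    ($secret =~ /\A\0*\z/       ? 'yes' : 'no');
```

Checking /dev/mem afterwards, as in the original test, is still the only way to see what actually remains.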

Re: making perl more forgetting
by nothingmuch (Priest) on May 16, 2004 at 19:19 UTC
    I think this is a leftover from the temporary value, whose data was freed, and is now either in perl's memory pool, or Linux's (?) one. See later on for why tying won't work.

    You can probably wrap around perl's own malloc, so that it cleans up, and then have perl use it instead of the system one to get the desired effect. But the data may be paged, and only OpenBSD (afaik) knows how to encrypt its swap. Locking all of perl's data into real memory is not my idea of a fun time. Either way, the GnuPG project has secure memory management if you ask for it at configure time. Perhaps you should take a look at what they've done in their project, and see if you can port it to perl. It'd probably be very slow.

    As for a tied interface - the memory pools perl keeps around are (probably) used for stuff like temporary assignments in concatenations, or coercions from string to number, and vice versa. The possibilities are countless. If you don't wipe everything, you're bound to leak some data.

    Perhaps you should look into a black box solution instead, that is, write an XS module that stores a sensitive value till a point you define, and provides functionality (like comparison) on that value. Then pass it other values. The XS module will then be responsible for making sure the value is properly destroyed, and due to interface constraints the perl side won't see it.
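
A rough sketch of what the interface of such a black box might look like. This is pure Perl and therefore gives none of the guarantees discussed here; it only shows the shape (store, compare, wipe) that an XS implementation could sit behind. `SecretBox`, `equals` and `wipe` are hypothetical names, not an existing module.

```perl
use strict;
use warnings;

package SecretBox;    # hypothetical name, not an existing CPAN module

sub new {
    my ($class, $secret) = @_;
    my $held = $secret;    # private copy, reachable only through the closures below
    my %ops = (
        equals => sub { defined($held) && $held eq $_[0] },
        wipe   => sub {
            return unless defined $held;
            # Best-effort overwrite; a real version would do this in C.
            substr($held, 0, length($held), "\0" x length($held));
            undef $held;
        },
    );
    return bless \%ops, $class;
}

# Callers can compare against the secret but never read it back.
sub equals { my ($self, $candidate) = @_; return $self->{equals}->($candidate) ? 1 : 0 }
sub wipe   { my ($self) = @_; $self->{wipe}->(); return }

package main;

my $box = SecretBox->new('charangaBoomBANG!');
print $box->equals('charangaBoomBANG!') ? "match\n" : "no match\n";   # match
$box->wipe;
print $box->equals('charangaBoomBANG!') ? "match\n" : "no match\n";   # no match
```

The point of the design is that the calling code only ever passes candidate values in and gets a yes/no answer out, so the sensitive value never has to be copied into ordinary Perl variables once it is inside the box.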

    -nuffin
    zz zZ Z Z #!perl
Re: making perl more forgetting
by ysth (Canon) on May 16, 2004 at 19:08 UTC
    Many operators in perl have a private variable called a "target" that is used to return the result. This is basically a hidden lexical that you can't directly act on. You can try keeping all your sensitive code in coderefs:
    $sensitive_code = eval 'sub {
        my ($foo, $bar, $baz) = @_;
        my $sensitive = "D-$foo-$bar-$baz";
        # do stuff with $sensitive
        return;
    }';
    $sensitive_code->("charanga", "Boom", "BANG");
    and then undef $sensitive_code; when you want to clean up (followed by a lot of miscellaneous code to allocate different sizes of blocks of memory and wipe them). But it's going to be pretty difficult to guarantee success. I'd resort to XS or Inline::C for this.

    Update: actually, not sure you need the eval to get/clear a fresh pad.

      Interesting...

      Perhaps some of diotalevi's work with **cking up closures can help in cleaning up these $sensitive_code references.

      -nuffin
      zz zZ Z Z #!perl
        I don't believe in any solution to this problem and won't attempt it. Don't bother trying. In fact, see Abigail-II's comments. In theory this is something you can do in C, but some discussions on sec-prog lead me to think that it may be difficult or even impossible to reliably remove something from memory, because the compiler is handy at optimizing away these "useless" writes. I recall both MSVC and gcc were under discussion.
Re: making perl more forgetting
by Abigail-II (Bishop) on May 17, 2004 at 11:20 UTC
    Perl is optimized for doing things fast, even at the expense of using more memory. Under the hood, lots and lots of things happen, and Perl isn't going to waste time "erasing" memory it's no longer using.
    Is there any workaround to this except me writing another XS module providing
    I don't think so, and even if you're going to write C, it may be harder than you think. The compiler might eliminate code whose effect isn't going to be seen. A year or two ago I read an article about someone who had a problem similar to yours, but in C. He had sensitive information in a string, and after using it, he "cleared" the content by assigning to it another string of appropriate length. However, the compiler had noticed that after the assignment, the memory wouldn't be accessed anymore, so it just optimized the assignment away.

    C is probably your best option though, just make sure the compiler doesn't outsmart you.

    Abigail

      It sounds like the only solution that is guaranteed is to use ASM and/or a C compiler with optimizing turned off. (Is there an Inline::ASM?)

      ------
      We are the carpenters and bricklayers of the Information Age.

      Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

      I shouldn't have to say this, but any code, unless otherwise stated, is untested

        Well, it might be easier to write your code in such a way that the optimizer can't optimize away the overwriting of the memory, for instance by reading it back and doing something with those results. (Of course, you still might be outsmarted by your OS: if you're unlucky, your rewrites won't go further than the cache before the program is terminated and the memory pages invalidated.)

        Abigail

Re: making perl more forgetting
by gmpassos (Priest) on May 16, 2004 at 20:51 UTC
    Note that no OS overwrites memory when it is freed! When a process frees a chunk of memory, it only marks the area as free; it does not overwrite and clean the bytes previously written. The same holds when the OS frees memory allocated to a process.

    Note that if memory were overwritten to clean out the old data, things would be much slower, since always cleaning the old data would take more work than creating new data.

    Graciliano M. P.
    "Creativity is the expression of the liberty".

      Thanks everybody for answers.

      In my case, I don't need each and every scalar wiped out. In fact there is just this one type of data (credit card details) that I wish to handle securely.

      Unfortunately, I don't see an option here. There are just too many places the raw data passes through before it finally reaches my variables. (I'm using POE, with Wheels and Filters for I/O, and credit card details are present in input and output messages in cleartext form: in one case over the IPsec VPN, and in the other over a UNIX socket used to communicate with a local process.)

      As I see it now, there's not much hope in this case. All that remains is really good kernel-level security (possibly with secured swap space using loop-aes on Linux or something similar on *BSD) to reduce the risks and to try to make sure that in case of a break-in the damage will be minimal.

        Hi,

        What you could try, to minimize the risk of sensitive stuff staying in memory, is forking a separate process that handles the sensitive information, and keeping that process running for as short a time as possible. The memory freed after that process has finished may still contain the sensitive information, as pointed out by gmpassos, but if you keep the process that handles your sensitive information running for a long time (as a daemon, for instance), it will certainly contain the sensitive information, and this will be in memory.
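
The forking idea could be sketched roughly like this, assuming a Unix-like system. `run_isolated` is a hypothetical helper name and the card number is a dummy value. In this demo the sensitive value is a literal, so it still exists in the parent's compiled code; in a real program the child would read it from the socket itself, so that the long-running parent never sees it.

```perl
use strict;
use warnings;
use POSIX qw(_exit);

# Hypothetical helper: run $work in a short-lived child process and
# return only the (non-sensitive) string it produces.
sub run_isolated {
    my ($work) = @_;
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {                 # child: handles the secret, then exits
        close $reader;
        print {$writer} $work->();
        close $writer;               # flushes the result to the parent
        _exit(0);                    # skip END blocks and destructors
    }

    close $writer;                   # parent: only reads the result
    my $result = do { local $/; <$reader> };
    close $reader;
    waitpid($pid, 0);
    return $result;
}

my $masked = run_isolated(sub {
    my $card = '4111111111111111';   # dummy value for the demo
    return 'ending in ' . substr($card, -4);
});
print "$masked\n";                   # prints "ending in 1111"
```

Only the masked result crosses the pipe, so anything the child copied around internally goes away with the child process (subject to the caveats about freed memory raised elsewhere in this thread).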

        What you could do is have another 'wiper'-process that uses a lot of memory, so the chance of your sensitive information being overwritten becomes very high, something simple like:

        #!/usr/bin/perl
        while (1) {
            my $aap = "a" x 8192;
            sleep 1;
        }
        will allocate at least 8192 bytes filled with 'a' every second (and probably a whole lot more because we run perl), at the expense of some CPU and memory (duh). This could be tuned to take into account the current state of total memory usage (make the wiper process use more if there's a lot of free memory left). I'm not very experienced in the details of memory management, but having a high turnover in used memory seems to me a good way to decrease the chance of sensitive information still being in memory.

        Beware of using too much memory, because that will result in swap usage, and in that case you also have to deal with getting rid of your information if it's in swap. Maybe it's advisable not to use swap at all (lots of 'Live-CD' OSes don't use swap), and just add some extra memory to your system.

Re: making perl more forgetting
by ambrus (Abbot) on May 16, 2004 at 18:58 UTC

    Is the rumour true that under BSD you can use encrypted swap partitions?

Re: making perl more forgetting
by flyingmoose (Priest) on May 17, 2004 at 13:12 UTC
    IIRC, /dev/mem is only readable by root. Are you saying you can't trust your root users? If so, you've got problems! I'd like to understand more about what you think the "risk" actually is.

      Except for the possibility of somebody breaking in and gaining root privileges, I am not afraid of (legitimate) root users stealing confidential data. I'm just concerned that there is no way to overwrite a string of characters to remove its contents from memory and make my program more secure.

      This program is running as a daemon, and since it's pretty stable (POE rocks) and does not leak memory, it does not get restarted too often. It's the fact that it leaves a trail of its confidential parameters in memory, and that I cannot do anything about it, that worries me.

      Can I file it as a bug report? :)

      Having wipe() in Perl would fix my problem and greatly improve Perl's usability in security-related applications. So would some keyword that marks a variable as wipeable by the GC when its data is released.

      I'm looking forward to write userspace filesystem drivers in Linux using Perl, but I would not approach such a thing without having the means to destroy unwanted confidential data.

        Fair enough, but if you've been rooted, they can do whatever they want to your system, including installing keyloggers and shells around certain programs. It's especially dangerous when the hit is to take *future* data, so at that point, it's too late.

        I'm looking forward to write userspace filesystem drivers in Linux using Perl

        Very interesting in a sick-and-twisted sort of way... do you have any reference material on the subject? I wouldn't mind reading up on this. Naturally there is no way in heck I'd do this for anything other than something that wanted to "act" like a virtual file system, and I wouldn't trust it with important data, etc., but it does sound cool.

Re: making perl more forgetting
by andyf (Pilgrim) on May 17, 2004 at 19:26 UTC
    Only ambrus has come close to what I see as the obvious solution to your problem. With an encrypted filesystem, even if another process can get disk access, nothing will make the slightest sense. You can page-encrypt RAM so that even if they break into your page they see gobbledegook. It slows the system down a bit. Maybe the best thing to do is to encrypt the CC numbers at source with a public key (so it doesn't matter that the key is in plain view). The only code that can ever see the real data is the terminal process in the chain (this assumes you do no intermediate processing on the data). BSD and the little-known Tinfoil Hat Linux both sport examples of encrypted filesystems.

    As other posters have said, the issue of security is kind of subsumed into whether your server is secure. It's encouraging that you don't consider even local processes friendly; this is healthy paranoia. At the end of the day you have to store your private key somewhere, and if you can't extend your trust to that machine it's no game.
    Andy

      With an encrypted file system, why is one process able to see it but another cannot? Did you mean to imply that this other process only ran at times that the other program wasn't in memory and the file system wasn't mounted?

      The question seemed to assume that the rogue process would run concurrently with the process with the secrets and with the ability to peek at the memory of the process with the secrets. Wouldn't a process with that sort of rights also have access to any file system the other process did?

      What of getting the secret from the program that has the unencrypted data before it goes encrypted?

Re: making perl more forgetting
by saintbrie (Scribe) on May 17, 2004 at 10:49 UTC

    Any chance you could use the Cache modules (maybe Cache::Memory)?

    "The Cache modules are designed to assist a developer in persisting data for a specified period of time."

    Sounds like the idea you had for tying variables...

      The issue isn't keeping the data - it's the exact opposite. The problem is that data you have used, then released, isn't wiped. Here's an example:

      You have a file on your hard-drive. You "delete" the file. All that has happened is that the operating system has marked that area of the disk as writeable. The actual 1's and 0's are still in the same order they were before. (This is how disk recovery tools work.)

      The same principle works with memory. Just because you have released the memory back to the OS doesn't mean that the OS has changed what was in that area. The 1's and 0's are still in the same order. Now, let's say I have a program that runs after yours. If I'm careful (and lucky), I can grab the same memory locations that you had. If I don't overwrite them, I can read the data you had tried to keep secret.


Node Type: perlquestion [id://353785]
Approved by Enlil
Front-paged by broquaint