Customer data encryption

by 0xbeef (Hermit)
on Feb 25, 2007 at 11:35 UTC (#601980=perlquestion)
0xbeef has asked for the wisdom of the Perl Monks concerning the following question:

Dear encryption monks

A colleague and I have written a collector tool (in Perl of course!) that collects O/S data on customer systems. Another tool performs an analysis of this data, potentially directly by the customer. The output could however be sent to us for in-depth analysis and audit purposes.

I have been researching a method to enable the customer to encrypt the output files before they are sent to us, but I am not experienced in encryption and scared I might miss something important.

The output files are 15-30MB uncompressed, or 1-5MB compressed. Initially I thought it would make sense to provide my public RSA key to the code (via config file), and use that to encrypt the data. From what I read only using an asymmetric cipher would be very slow, but I then started reading about hybrid encryption techniques - which for my purposes would work something like this:

1. Collector tool generates a random new encryption key.
2. Output is encrypted using this key with a fast symmetric method i.e. Crypt::CBC(?).
3. The symmetric key from step 1 is encrypted with Crypt::RSA, using my public RSA key, which is at least 2048 bits.
4. The encrypted data and now secure symmetric key is transmitted by the customer, preferably by SFTP.
5. The symmetric key is decrypted by me, the holder of the private RSA key.
6. Customer data can now be decrypted using the symmetric key, which is now known.
7. Customer data is analysed.
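
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a vetted design: the sub names (random_session_key, hybrid_encrypt, hybrid_decrypt) are made up for this example, the session key is read from /dev/urandom (so a Unix-like system is assumed), and the key is handed to Crypt::CBC as a hex passphrase from which it derives the actual cipher key and IV. The RSA key arguments are assumed to be Crypt::RSA key objects such as those returned by its keygen method.

```perl
use strict;
use warnings;
use Crypt::CBC;    # CPAN; needs Crypt::Rijndael installed for 'Rijndael'
use Crypt::RSA;    # CPAN

# Step 1: generate a fresh random session key.
# Assumption: /dev/urandom exists. Returned as 64 hex characters.
sub random_session_key {
    open my $fh, '<:raw', '/dev/urandom'
        or die "cannot open /dev/urandom: $!";
    my $got = read($fh, my $bytes, 32);
    close $fh;
    die "short read from /dev/urandom" unless defined $got && $got == 32;
    return unpack 'H*', $bytes;
}

# Steps 2 and 3: encrypt the data with the session key, then encrypt
# ("wrap") the session key with the recipient's public RSA key.
sub hybrid_encrypt {
    my ($plaintext, $public_key) = @_;

    my $session_key = random_session_key();
    my $cbc = Crypt::CBC->new(-key => $session_key, -cipher => 'Rijndael');
    my $ciphertext = $cbc->encrypt($plaintext);

    my $rsa = Crypt::RSA->new;
    my $wrapped_key = $rsa->encrypt(
        Message => $session_key,
        Key     => $public_key,
        Armour  => 1,              # ASCII-armoured output
    ) or die $rsa->errstr;

    return ($wrapped_key, $ciphertext);    # step 4 transmits both
}

# Steps 5 and 6: recover the session key with the private RSA key,
# then decrypt the data with it.
sub hybrid_decrypt {
    my ($wrapped_key, $ciphertext, $private_key) = @_;

    my $rsa = Crypt::RSA->new;
    my $session_key = $rsa->decrypt(
        Cyphertext => $wrapped_key,    # spelling as used by Crypt::RSA
        Key        => $private_key,
        Armour     => 1,
    ) or die $rsa->errstr;

    my $cbc = Crypt::CBC->new(-key => $session_key, -cipher => 'Rijndael');
    return $cbc->decrypt($ciphertext);
}
```

For 15-30MB of output it would probably be better to feed the file through Crypt::CBC in chunks (its start, crypt and finish methods) than to hold everything in memory as this sketch does.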

Obviously, only the private RSA keyholder can decrypt the output file, so by implication the automated analysis tool we provide cannot be used at the customer site once the output is encrypted. I now intend to provide a configurable option to allow a customer RSA key as well, so that the encryption can be switched between local and remote analysis.

Also, do I need to encrypt and then compress, or compress and then encrypt?

Any monastery scrutiny would be appreciated!


Re: Customer data encryption
by derby (Abbot) on Feb 25, 2007 at 11:55 UTC

    I think you need to Benchmark your approaches. If your encryption tool follows RFC 2440 (OpenPGP), then the steps you outlined are exactly what happens.

Re: Customer data encryption (Crypt::CBC)
by ikegami (Pope) on Feb 25, 2007 at 11:58 UTC

    I just have time for a quick note/correction.

    Output is encrypted using this key with a fast symmetric method i.e. Crypt::CBC(?)

    Most ciphers encrypt small, fixed-size blocks. (Usually blocks of exactly 8 or 16 bytes.) Crypt::CBC provides padding (for when the length isn't a multiple of the required block size) and chaining (for when there is more than one block of data).

    Crypt::CBC adds two measures of security. Each plaintext block is combined (XORed) with the previous ciphertext block before it is encrypted, so identical plaintext blocks yield different ciphertext blocks, and a random salt is added (causing a plaintext to result in a different ciphertext every time it's encrypted).

    In short, Crypt::CBC lets a block cipher (AES, Blowfish, etc.) safely encrypt data of arbitrary length, as a stream cipher would. Crypt::CBC is *not* a cipher itself. It only provides the previously mentioned features. Modules such as Crypt::Rijndael (aka AES), Crypt::Blowfish and Crypt::Twofish provide the actual encryption.

    Block ciphers (such as Crypt::Rijndael, Crypt::Blowfish, Crypt::Twofish) should not be used directly. They should be used through Crypt::CBC.
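
To make the padding part concrete, here is a small pure-Perl sketch of PKCS#5-style padding, which to my knowledge is what Crypt::CBC's default "standard" padding does. The helper names pad_block and unpad_block are invented for this illustration, and nothing here encrypts anything; it only shows how a message is brought up to a multiple of the block size and restored afterwards.

```perl
use strict;
use warnings;

# Append N bytes, each with value N, so the length becomes a multiple
# of $block_size. A message that already fits gets one full extra
# block, so the padding can always be removed unambiguously.
sub pad_block {
    my ($data, $block_size) = @_;
    my $n = $block_size - length($data) % $block_size;    # 1 .. $block_size
    return $data . chr($n) x $n;
}

# Read the value of the last byte and strip that many bytes.
sub unpad_block {
    my ($data) = @_;
    my $n = ord substr($data, -1);
    return substr($data, 0, length($data) - $n);
}
```

With an 8-byte block size, pad_block('abc', 8) returns 'abc' followed by five bytes of value 5, and unpad_block gives back 'abc'.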

      Thanks, I understand this better now. I'd probably want to use AES-256 in this case. I should have said that I'm planning to use Crypt::CBC with a Rijndael cipher.


Re: Customer data encryption (asymmetric vs symmetric)
by ikegami (Pope) on Feb 25, 2007 at 12:18 UTC
    ok, I have time for another comment.

    From what I read only using an asymmetric cipher would be very slow.

    The question is whether you need the speed enough to warrant that extra code. The extra code increases development time, the probability of a bug, the probability of a security bug, debugging time and maintenance time.

    From the customer's perspective: No. The tool will only be used on rare occasion, and the few seconds lost on those occasions won't matter.

    From your perspective: Maybe. Will you be receiving data from many customers at the same time? Often? From what you said, it doesn't sound like it.

    Since you have to do it anyway, start with trying just the asymmetric portion. If your needs aren't satisfied, then add the symmetric bit.

      From what I read only using an asymmetric cipher would be very slow.

      Most public key implementations do symmetric encryption of the payload with a session key. Only the encryption of the session key is asymmetric. I'd like to see some attribution (and benchmarks) for the "slowness." From what I've seen (which of course is limited), it's the generation of the session key which can be slow, but that's normally a problem with ill-configured systems. So whether you go with an existing public key tool to begin with or the home-grown re-implementation, you're going to have the same slowness issue (generating a session key), unless you always use the same session key, and in that case, why bother at all.

      The problem really is my lack of benchmark data, so I included the size of the output in my post in case someone knows. If the slowdown is unacceptable, e.g. >10mins at the customer side, I'd then rather opt for the symmetric solution.


Re: Customer data encryption
by shmem (Canon) on Feb 25, 2007 at 12:31 UTC
    I would use GnuPG to let the customer encrypt the compressed data file with my public key, have it then mailed to me, and done.


    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      I unfortunately have no guarantees that a customer has GPG. Of course, I am making the assumption that I will have a customer base of >100, and their operating system releases may or may not include GPG.

      It would not take much extra effort to include the encryption on my side, since I am already using PAR to bundle all the required modules.


        I unfortunately have no guarantees that a customer has GPG.

        I would list that as a requirement for reliable encrypted transmission of data, rather than roll my own. I mean, if they don't have mail encryption, they need it anyways. At least for any account which sends sensitive information to the extranet.


Re: Customer data encryption
by traveler (Parson) on Feb 25, 2007 at 16:22 UTC
    Also, do I need to encrypt and then compress, or compress and then encrypt?

    If you want the compression to actually compress the data, you must compress first. Encryption should make the data look random, with no patterns (for any reasonable algorithm), and compression algorithms rely on patterns to reduce the size of the data. So if you compress first, you get the benefit of the compression; if you encrypt first, compressing may actually increase the size of the final output.
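
This is easy to check for yourself. A minimal sketch, assuming that random bytes are a fair stand-in for the output of a good cipher (no actual encryption is performed here) and using the core Compress::Zlib module. The helper name compressed_size and the sample log line are made up for this illustration.

```perl
use strict;
use warnings;
use Compress::Zlib;    # core module; exports compress()

# Hypothetical helper: size in bytes of the zlib-compressed form of $data.
sub compressed_size {
    my ($data) = @_;
    return length compress($data);
}

# Highly patterned text, like most log and config output.
my $text = "Feb 25 11:35:01 host sshd[1234]: session opened for user root\n" x 2000;

# Random bytes of the same length, standing in for ciphertext.
# Assumption: good ciphertext looks like random data to a compressor.
my $noise = join '', map { chr int rand 256 } 1 .. length($text);

printf "text:  %d -> %d bytes\n", length($text),  compressed_size($text);
printf "noise: %d -> %d bytes\n", length($noise), compressed_size($noise);
```

The patterned text should shrink dramatically, while the random data should come out no smaller than it went in.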
      Thanks, this makes sense and I will likely end up using it like this. I was wondering if there are any drawbacks to compressing first - e.g. if an attacker could perhaps exploit any patterns that gzip/bzip2 may produce?


        Very unlikely. Compression increases the entropy of the plaintext, making the result somewhat more difficult to crack than encrypted uncompressed text.
        That's what salting is for.
Re: Customer data encryption
by Anonymous Monk on Feb 25, 2007 at 20:04 UTC
    4. The encrypted data and now secure symmetric key is transmitted by the customer, preferably by SFTP.
    SFTP connections are encrypted. No need to mess with trying to encrypt the data if you're transmitting it encrypted.
Re: Customer data encryption
by hangon (Deacon) on Feb 25, 2007 at 23:09 UTC

    I'm not an encryption expert, but thought I'd throw this out there. If the customer does not need to have the data in encrypted form on their end, why not use a secure web server and have them upload it to you via SSL? You can set up a secure upload page with a form for uploading the file and their comments. A simple CGI script could notify you via e-mail when a new file was uploaded.

      Having the file encrypted on the hard drive, not just while in transit, is not a bad idea. It provides defense in depth, in case the web server (or some other service) fails to restrict access to the uploaded files.
Re: Customer data encryption
by chrism01 (Friar) on Feb 26, 2007 at 01:53 UTC
    As per your item 4 (and as mentioned already), if you only need encrypted transmission and are not worried about file storage at each end, just use scp or sftp, which most people should have anyway.
    Also, I probably wouldn't worry about asymmetric performance, going by your description.
    As mentioned above, the "slow" bit can be the key creation, but that's a one-off cost anyway, unless you intend to use a new one for each file (unlikely).
      I feel that the customer is responsible for the collected data at his end, and should protect the output file with appropriate permissions. The original O/S config+log files are not encrypted (unless the administrator uses an encrypted filesystem scheme), but are instead protected by default O/S permissions.

      But it is my responsibility to (at all cost) protect the customer's system info on my side, so ikegami's comment on securing the storage is important in my view. Being overly cautious is fine... and the suggestion about an alternative like a SSL webserver for uploads sounds good too.

      When I get a bit of extra time, I'll do some tests to compare the straightforward RSA file encryption against the hybrid method, and post the results. I have a hunch that the time differences will be significant if the keys are generated upfront.


        SFTP is actually a full-featured remote file system protocol. Using it, it would be possible to process the data without downloading it first to the local hard disk. You could even write the output file directly on the remote host so sensitive data never gets stored on your system.

        Net::SFTP or Net::SFTP::Foreign would allow you to do so.

        For maximum security, ssh keys should be protected by a passphrase, but that would require launching the process manually.

Re: Customer data encryption
by Anonymous Monk on Feb 26, 2007 at 22:14 UTC
    Compress, pad, whiten, timestamp. Then generate a symmetric session key and encrypt with it, then RSA that key and attach it. Alternately, let GnuPG do essentially that very same thing for you. It's very well tested and therefore less likely to contain a major flaw than any system you come up with. No offense intended, that's just the facts of rolling your own.
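
For the GnuPG route, the call from Perl could look something like the sketch below. It assumes a gpg binary on the PATH and that the recipient's public key has already been imported and marked as trusted in the local keyring. The sub names and the ".gpg" output naming are made up for this illustration; the gpg options themselves (--batch, --yes, --recipient, --output, --encrypt) are standard ones.

```perl
use strict;
use warnings;

# Hypothetical helper: build the gpg argument list for encrypting $file
# to $recipient (a key ID or e-mail address known to the local keyring).
sub gpg_encrypt_command {
    my ($file, $recipient) = @_;
    return ('gpg', '--batch', '--yes',
            '--recipient', $recipient,
            '--output',    "$file.gpg",
            '--encrypt',   $file);
}

# Run gpg and return the name of the encrypted file.
sub gpg_encrypt {
    my ($file, $recipient) = @_;
    # The list form of system() bypasses the shell, so unusual file
    # names cannot be misinterpreted as shell syntax.
    system(gpg_encrypt_command($file, $recipient)) == 0
        or die "gpg failed with status $?";
    return "$file.gpg";
}
```

By default gpg also compresses the data before encrypting it, so the separate compression step is probably unnecessary on this route.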
      Above post is mine, I failed to notice that I wasn't logged in. You said above that you can't guarantee that the end users have GnuPG, but you might want to see about Crypt::OpenPGP. I haven't used it myself, I don't know if it's reliable, and it hasn't been updated in years... but it claims to be a pure-perl implementation of OpenPGP. Might make a worthy alternative.

Node Type: perlquestion [id://601980]
Approved by Sidhekin
Front-paged by Old_Gray_Bear