Holding site variables

by Bod (Parson)
on Mar 21, 2024 at 10:39 UTC

Bod has asked for the wisdom of the Perl Monks concerning the following question:

We operate a number of websites, all of which run on the same server.

Currently, I am the only developer, but that is likely to change over the next 18 months or so. I'm making some changes that present the opportunity to improve the internal design and security of the sites, and I'm looking for some input on the "best" way to do this. Any input is welcome, but especially around global site variables.

Currently we have this directory structure (plus a few others omitted for simplicity):

site/prod/bin/
site/prod/lib/
site/prod/template/
site/prod/www/
site/test/bin/
site/test/lib/
site/test/template/
site/test/www/

Every site has identical code in prod and test (except during development, of course) except for one file, site/lib/vars.pm, which declares the site variables needed for that site and environment: things like the DB credentials, the DB instance to connect to, Stripe keys, API keys, etc.

use strict;
use warnings;

our $env_db_user = 'dbusername';
our $env_db_pass = 'dbpassword';
our $env_paypal  = 'PP username';
# etc, etc, etc
There is no logic code in this module; it just defines variables with our. This module is used by a utility module that is in turn used by every script on the website.

When we bring another developer onboard, I want to split the site variables into two sets - those they have access to (test database schema name, test Stripe keys, etc.) and those they don't (live Stripe keys, database credentials, etc.). I could relocate this file further up the directory structure where they don't have access, but I feel sure there is a better way to handle this as it must be a common problem in multi-developer environments.
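
Something like this sketch is roughly what I have in mind (module names, paths and keys here are made up for illustration, not what we actually use):

# Hypothetical layout, just to illustrate the split:
#   site/test/lib/SiteVars/Public.pm    - settings every developer can see
#   /secure/site/SiteVars/Secret.pm     - live credentials, outside the dev-readable tree
package SiteVars::Public;
use strict;
use warnings;

our $env_db_schema  = 'test_schema';
our $env_stripe_key = 'sk_test_xxxx';    # test Stripe key

# Load the sensitive values only when the restricted file is readable,
# so the same code runs unchanged for developers who cannot see it.
my $secret_file = '/secure/site/SiteVars/Secret.pm';
require $secret_file if -r $secret_file;

1;

That way the code path is identical everywhere; only the readability of the secret file differs.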

What I have works well and is not in imminent need of change. But I have the opportunity to make it more robust as I am making other changes.

What advice can you give on this matter, kind and wise Monks?

Replies are listed 'Best First'.
Re: Holding site variables
by hippo (Archbishop) on Mar 21, 2024 at 12:06 UTC

    The best advice is the kind you don't want to hear. :-)

    In this case it boils down to: "Don't host test and prod on the same server". Then you give the less-trusted devs access to the test server only. No live credentials will then be available to them.

    This strategy has many other benefits. You can upgrade components (DB, webserver, O/S, perl, whatever) on the test platform in isolation, test to destruction and only then do the same on prod. Your test server does not need to have the same spec because it isn't handling the traffic, and it doesn't need to run 24x7, so you aren't looking at double the cost. Bite the bullet now - it will pay off in the long run.


    🦛

      The best advice is the kind you don't want to hear. :-)

      I wanted honest advice, hippo, and I am grateful that you provided it 😊

      Having a separate test server is not something I had even considered; now, it is firmly on the radar. But a server change isn't happening just yet, so in the short term, I am looking at a better way of dealing with global variables...unless, of course, the way I'm doing it already is as good as the alternatives.

        Well, nothing beats separate servers IMHO but that's not always an option. For example, we provide shared hosting for some customers and obviously they need to have their data isolated from each other while being on the same server. We achieve this with strictly-enforced permissions on the users' files. Each user only has read access to their own files, not those of any other user. You could set up the same, at least for the credential-filled file. Just ensure that the untrusted dev user has no permissions to access the live file (ideally the whole live tree) and that's all you need.

        For configuration variables in our own sites (not customers) we tend to use environment variables declared within the webserver conf (which is itself unreadable by the normal users). This keeps the per-site filesystem clean and means that we don't need to take care with that when deploying in-site code between dev and prod. It's a bit more of a faff to do this for the customers and they have less need - most don't bother with a dev environment hosted with us.
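
        A rough illustration of that second approach (the variable names and vhost layout are invented for this example, not our actual config): the credentials live only in the webserver configuration and the Perl code just reads its environment.

        # In the prod vhost (a file normal users cannot read) something like:
        #   SetEnv APP_DB_USER produser
        #   SetEnv APP_DB_PASS prodpass
        # and the corresponding test values in the test vhost.
        # The application code is then identical in both environments:
        use strict;
        use warnings;

        my $db_user = $ENV{APP_DB_USER} // die "APP_DB_USER not set in the webserver config\n";
        my $db_pass = $ENV{APP_DB_PASS} // die "APP_DB_PASS not set in the webserver config\n";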

        HTH.


        🦛

      The best advice is the kind you don't want to hear. :-)

      Well...you might be surprised...

      Don't host test and prod on the same server

      We've been revising our infrastructure plan, and we've reached agreement on the next stage of growth. This will be a server for production and test environments plus a smaller server for dev. Test will be for making relatively small changes and releasing new features limited to our codebase whereas dev will be for testing system-wide changes such as database version, OS upgrades and Apache configuration.

      It's not happening immediately as there is not the revenue or traffic to warrant it right now, but it is agreed as a forward plan.

      Thanks hippo. Before this thread, the idea of a separate server hadn't entered our thought process.

        We've been revising our infrastructure plan, and we've reached agreement on the next stage of growth. This will be a server for production and test environments plus a smaller server for dev.

        Consider using virtual machines for the development server, so you can easily create several test servers (one test server, one staging server, one victim server for each developer) in the future. There are several good solutions for running VMs that I know of:

        VMware
        I haven't used it in a long time. Several variants, some running on bare metal, some on top of Windows or Linux. The current owner is trying to squeeze every cent out of VMware users, no matter how much the brand is damaged by that behaviour.
        VirtualBox
        Runs on top of Windows, Linux, MacOS X, Solaris. Mostly GPL software; some nice features (IIRC, USB 3.0 and Remote Desktop) are free as in beer, but not GPL. Very nice on a desktop. Management is by a native application.
        Proxmox
        Runs on top of Debian Linux, or comes with Debian Linux if you want; provides not only VMs but also containers. Open source, based on many existing open source packages (LXC, qemu, novnc, Perl and tons of others). Can be clustered. Management via web browser. Support costs money; if you can live with just the wiki and the forums, it's free as in beer. Highly recommended for servers.

        Real-world Proxmox:

        Home Setup
        2x HP N54L (dual-core AMD, 2.2 GHz), each with 8 GByte RAM, software RAIDs on SATA hard disks for root and data filesystems, running seven and three LXC containers respectively.
        Old Office Server
        Core i5-6400 (4x 2.7 GHz) on a Gigabyte desktop board, 32 GByte RAM, root and some data on a RAID-5 of 3x 2 TB HDD, other data on a second RAID-5 of 3x 2 TB HDD, currently running seven of the 18 configured Linux VMs.
        New Office Server
        Ryzen 7 2700X (8x 3.7 GHz) on a Gigabyte desktop board, 64 GByte RAM, root and some data on a RAID-5 of 3x 2 TB SSD, other data on a second RAID-5 of 3x 2 TB SSD, currently running 10 of the 15 configured VMs, most of them running Windows (XP, 7 or 10), the others Linux.

        Neither the home setup nor the office servers run in a cluster, as you need at least three servers for a cluster, and you should have a dedicated storage system. The home servers really need more RAM, but work well enough for two users. The two office machines serve about 15 users. Both setups run file and mail servers, SVN, and databases. At home, Urbackup also runs in an LXC. At work, Urbackup runs on a separate, non-virtual server. At work, there are also several Jira instances, a Samba domain controller, and some test servers running in VMs.

        Some lessons learned:

        • Running one Software RAID-5/6 with a lot of disks (six) really sucks, as each write access is amplified by a factor of six, so that the machine is severely limited by disk I/O. I've changed that to two Software RAID-5 with three disks each. That significantly reduces the I/O load. The obvious disadvantage is that only one disk per RAID may fail and needs to be replaced ASAP. RAID-6 would degrade to a fully working RAID-5 if one disk fails.
        • SMR hard disks are a real pain in the back. They are just garbage. Don't buy them. Even when they happen to work, their performance sucks. The new server has seen five of them fail and be replaced by new ones over the last five years.
        • SATA SSDs (Samsung 870 EVO) replacing SMR disks are a huge performance gain. Highly recommended when used with a working backup.
        • (The CMR harddisks in the old office server just work. Fast enough and with no fails.)
        • Nothing beats RAM and CPU cores except for more RAM and more CPU cores. Buy as many cores and as much RAM as possible.
        • Proxmox recommends a hardware RAID. Software RAID-1 with two disks and RAID-5 with three disks are just fine. There will be some I/O load peaks, but they rarely matter in our setup.
        • You don't need server grade hardware to run servers 24/7. Desktop hardware in a 19 inch case with hot swap disk cages works just fine.
        • Proper server hardware has remote management and redundant power supplies, both can be handy at times, but you can do without.
        • RAID recovery after a power outage takes a day at full I/O load and makes the server unusable. If your machines serve more than just two home users, you want one dedicated UPS per server and one dedicated line breaker per UPS.

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: Holding site variables
by cavac (Parson) on Mar 21, 2024 at 19:13 UTC

    I had the same problem at $work when the team grew from "just me" to "we now have multiple people and start selling the stuff to customers". Our solution is, in simple terms, to set some environment variables that tell the software which configuration files to load.

    It's easy enough to externalize some configuration files to a source code repo to which only a few people have read access and even fewer have write access. If the ENV is not set, load the default config (using the development environment); otherwise load the config from the path given in the ENV variable.

    This not only makes it possible to distinguish between development and production, it also makes it easy for developers to switch between multiple test setups. This especially comes into play when your software can run in multiple ways.
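
    A minimal sketch of that pattern (the MYAPP_CONFIG name, paths and key=value format are placeholders, not our real setup):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use feature 'say';

    # Fall back to the development config when no override is given.
    my $default_config = '/opt/myapp/conf/development.conf';
    my $config_file    = $ENV{MYAPP_CONFIG} // $default_config;

    die "Config '$config_file' is not readable\n" unless -r $config_file;

    # Tiny "key = value" parser, purely for illustration.
    my %config;
    open my $fh, '<', $config_file or die "open $config_file: $!";
    while (my $line = <$fh>) {
        next if $line =~ /^\s*(#|$)/;
        my ($key, $value) = $line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/ or next;
        $config{$key} = $value;
    }
    close $fh;

    say "Loaded ", scalar keys %config, " settings from $config_file";

    The same script then runs unchanged whether a developer exports MYAPP_CONFIG or not.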

    In my case, it's a POS system(*), which can run on servers, workstations and embedded systems (so long as they run Linux, naturally) and in Docker (oh god, the pain!). And I alone have something like a dozen different databases over multiple systems, just so I can test the various possible customer setups.($)


    (*) No, not "piece-of-sh**"(**) but "point of sale", e.g. a cash register.

    (**) That being said, the cheapest variant of the hardware this runs on has, let's just say, "the cheapest thermal printer ever made, talking the weirdest dialect of ESC/POS an unpaid Chinese intern could come up with". This goes so far that the only software that can print a receipt on this contraption is my own. Neither the OEM Windows driver nor their Linux driver actually works correctly. I made it work by reverse engineering both drivers, second-guessing the most likely screw-ups and misunderstandings resulting from the original Epson specifications, and learning a lot of new curses(***)

    (***) Development happened over a number of weeks on Fridays. On Fridays, I have the office mostly to myself. This tactical decision hugely reduced the number of complaints about my political correctness regarding those unpaid Chinese interns.

    ($) You know, those types of customers who want to have the latest and greatest software (for next to nothing, of course, money doesn't grow on trees...), and then want to run the sales process the same way their great-great-great-grandfather used to run it before Napoleon invaded Austria...

    PerlMonks XP is useless? Not anymore: XPD - Do more with your PerlMonks XP
Re: Holding site variables
by talexb (Chancellor) on Mar 21, 2024 at 14:23 UTC

    I wonder if you could use an environment variable to select which credentials (Test or Prod) you get, with the default being Test. That variable could be set based on the port number configured in Apache (or whatever you're using).
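
    A hedged sketch of the idea, keying directly off the standard CGI SERVER_PORT variable rather than a custom one (the port numbers and names below are invented):

    # Pick a credential set from the port the request came in on,
    # defaulting to Test when in doubt.
    my %creds_for_port = (
        443  => { label => 'Prod', db_name => 'prod_db' },
        8443 => { label => 'Test', db_name => 'test_db' },
    );
    my $port  = $ENV{SERVER_PORT} // 8443;
    my $creds = $creds_for_port{$port} // $creds_for_port{8443};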

    Alex / talexb / Toronto

    Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

Re: Holding site variables
by bliako (Abbot) on Mar 22, 2024 at 08:28 UTC

    If you turn your "configuration variables" into subs, you can log whenever someone tries to read them. They also act like constants. E.g.:

    # there are many other ways to print a stack trace
    # and some core (see caller)
    use Devel::StackTrace;

    sub env_db_user {
        $log->info("accessing config: " . Devel::StackTrace->new->as_string);
        return 'username';
    }
    ...

    Also note that even if a user cannot directly access vars.pm because of file permissions, your (production) code can and will read it and hold it in memory. One can tell production to dump the vars by adding some code to the test codebase which may eventually find its way to production. With the above it is a little bit more difficult to do that. I know how to print all subs from the symbol table but I don't know how to print their contents (see Re^2: Holding site variables):

    say $_ for keys %main::

    The other issue is the integrity of your configuration. It can be verified with a SHA signature. But it feels more natural to do that by keeping your configuration in JSON or any other config format, and having your code read the configuration and verify its hash signature (stored in the code! oh well!) (edit: and create subs like env_db_user dynamically). That works well with encrypted configuration too. But then your code contains the SHA signature and the password for that encrypted config ...
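
    A rough sketch of that idea (the file name, digest and key names are invented; Digest::SHA and JSON::PP are core modules):

    use strict;
    use warnings;
    use JSON::PP    qw(decode_json);
    use Digest::SHA qw(sha256_hex);

    my $config_file     = '/etc/myapp/site.json';
    my $expected_digest = 'REPLACE-WITH-THE-REAL-SHA256-HEX';   # stored in the code, oh well

    open my $fh, '<', $config_file or die "open $config_file: $!";
    my $raw = do { local $/; <$fh> };
    close $fh;

    die "Configuration failed its integrity check!\n"
        unless sha256_hex($raw) eq $expected_digest;

    my $config = decode_json($raw);

    # Create subs like env_db_user() dynamically so every read can be logged or controlled.
    for my $key (keys %$config) {
        no strict 'refs';
        *{"main::$key"} = sub { return $config->{$key} };
    }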

    A side issue to what you asked, is integrity of your code and reviewing what goes from test to production ...

    bw, bliako

      > I know how to print all subs from the symbol table but I don't know how to print their contents.

      It might not work every time, but it works most of the time:

      #!/usr/bin/perl
      use warnings;
      use strict;
      use feature qw{ say };
      use Data::Dumper;

      local $Data::Dumper::Deparse = 1;

      sub secret { 'Just Another Perl Hacker,' }

      for my $symbol (keys %main::) {
          my $def = do { no strict; Dumper(${'main::'}{$symbol}) };
          say "$symbol => $def" if $def =~ /sub/;
      }

      Update: Or, instead of using a regex to identify subs:

      #!/usr/bin/perl
      use warnings;
      use strict;
      use feature qw{ say };
      use Data::Dumper;

      local $Data::Dumper::Deparse = 1;

      sub secret { 'Just Another Perl Hacker,' }

      for my $symbol (keys %main::) {
          my $def = do { no strict; *{"main::$symbol"}{CODE} };
          say "$symbol => ", Dumper $def if $def;
      }

      map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

        thanks, noted

Re: Holding site variables
by kcott (Archbishop) on Mar 21, 2024 at 15:28 UTC

    G'day Bod,

    "... one file site/lib/vars.pm ... is used by a utility module ..."

    There's already a core pragma vars. I see a conflict here. How are you getting around that?

    — Ken

      There's already a core pragma vars. I see a conflict here. How are you getting around that?

      Names have been changed to protect the innocent security!

      Well spotted kcott! But the actual module has a different name and it has a namespace because it is declared with package.

Re: Holding site variables
by FreeBeerReekingMonk (Deacon) on Mar 22, 2024 at 20:37 UTC
    Assuming your server is Linux, and you are splitting site/lib/myvars.pm into several files, e.g. site/lib/test_vars.pm and site/lib/prod_vars.pm (it's not site/prod/lib/vars.pm?), these are off limits to everybody by default (chmod 600, owned by root). Now, adding users to one or more groups and setting ACLs on the files using those groups keeps you safe (plus rx on the directory, to get in). In similar fashion, the user your application runs under is not in the groups that get access to the dev credentials.

    https://www.geeksforgeeks.org/access-control-listsacl-linux/

    And should a developer wear two hats, how about creating two user IDs for that person, so that "oops" moments are minimized?

    Now that's all nice, but you need your Perl myvars.pm to load what it can and skip what it can't, and to do some sanity checks when a variable it expects is missing - but you can program that in.

    ./readsomevars.pl

    #!/usr/bin/perl
    use strict;
    use warnings;
    use feature 'say';
    use File::Basename qw(dirname);
    use Cwd qw(abs_path);
    use lib dirname( abs_path $0 ) . '/MYLIB';
    use myvars qw(get_pass %VARSETS);

    say "PASS=" . &get_pass();
    say "DEBUG: Got variables set '$_' from file: $VARSETS{$_}" for (sort keys %VARSETS);

    ./MYLIB/myvars.pm

    package myvars;    # still not a cromulent package name
    use strict;
    use warnings;
    use Exporter qw(import);
    our @EXPORT_OK = qw(get_pass %VARSETS);
    use feature 'say';
    use Cwd qw(abs_path);
    use File::Basename qw(dirname);

    our %VARSETS;
    my $me    = 'myvars.pm';
    my $medir = dirname( abs_path $0 ) . '/MYLIB';

    opendir( my $dh, $medir ) || die "Can't opendir $medir: $!";
    my @list = grep { !/^\./ && /\.pm$/ && !/^$me$/ && -f "$medir/$_" } sort readdir($dh);
    closedir $dh;

    our $env_db_pass = "";

    for my $file (@list) {
        my $fqfile = $medir . '/' . $file;
        my ($varset) = $file =~ /^([^-]+)/;
        if ( -r $fqfile ) {
            say "DEBUG: reading $fqfile";
            if ( $VARSETS{$varset} ) {
                warn "ERROR($varset): Already have $VARSETS{$varset}, discarding $file\n";
                # die? croak? use Carp; ?
            }
            else {
                require "$fqfile";
                $VARSETS{$varset} = $file;
            }
        }
        else {
            say "DEBUG: skip $fqfile";
        }
    }

    say "myvars.pm: CREDENTIALS($env_db_pass)";

    sub get_pass {
        return $env_db_pass;
    }

    1;

    ./MYLIB/database-et.pm

    our $env_db_pass = 'devpass';

    ./MYLIB/database-pr.pm

    our $env_db_pass = 'prodpass';

    Which would output:

    DEBUG: reading /home/fbrm/CODE/PERL/monks/11158449/MYLIB/database-et.pm
    DEBUG: skip /home/fbrm/CODE/PERL/monks/11158449/MYLIB/database-pr.pm
    myvars.pm: CREDENTIALS(devpass)
    PASS=devpass
    DEBUG: Got variables set 'database' from file: database-et.pm

    or:

    reading /home/fbrm/CODE/PERL/monks/11158449/MYLIB/database-et.pm
    reading /home/fbrm/CODE/PERL/monks/11158449/MYLIB/database-pr.pm
    ERROR(database): Already have database-et.pm, discarding database-pr.pm at /home/fbrm/CODE/PERL/monks/11158449/MYLIB/myvars.pm
    myvars.pm: CREDENTIALS(devpass)
    PASS=devpass
Re: Holding site variables
by nikosv (Deacon) on Mar 21, 2024 at 15:37 UTC
    Check Envio out:

    Envio is a command-line tool that simplifies the management of environment variables across multiple profiles. It allows users to easily switch between different configurations and apply them to their current environment

    https://github.com/envio-cli/envio
