User since: Nov 15, 2020 at 00:48 UTC (4 years ago)
Last here: Jul 11, 2024 at 14:42 UTC (6 days ago)
Experience: 8342
Level: Parson (16)
Writeups: 1270
Location: Coventry, UK
User's localtime: Jul 17, 2024 at 23:02 UTC
Member of: pmdev, SiteDocClan

Long time amateur coder since growing up with a ZX Spectrum and BBC Micro...

Introduced to Perl in the early 1990s, which quickly became the language of choice. Built many websites and backend applications using Perl, including the sites for my property business:
Lets Delight - company site
Lets Stay - booking site
Also a few simple TK based desktop apps to speed things up.

Guilty of only learning what I need to get the job done - a recipe for propagating bad practice and difficult to maintain code...difficult for me so good luck to anyone else!

Now (Nov 2020) decided to improve my coding skills, although I'm not really sure what "improve" means in this context. It seems Perl and best practice have come a long way since I last checked in and my programming approach is stuck in the last decade.

Onwards and upwards...

20th October 2021 - added to Saints in our Book 😀
2nd October 2022 - promoted to Priest
7th July 2023 - promoted to Vicar
15th December 2023 - promoted to Parson

Find me on LinkedIn, or on Twitter

Nodes I find helpful


Re: What do I use to release a module to CPAN for the first time?
Basic Testing Tutorial

Posts by Bod
Regexp match start or end in Seekers of Perl Wisdom
5 direct replies
by Bod
on Jun 02, 2024 at 09:06

    As I mentioned in Yet another Encoding issue..., I am writing an AI chatbot based around AI::Chat that holds a conversation in Turkish and corrects any mistakes in the Turkish supplied by the user. Of course, there are not always mistakes so a correction is not always needed.

    I've prompted the AI that
    "if there are no mistakes that need correcting reply with the single word "Perfect" and do not add any other words to your reply."

    But being AI, it can be unpredictable! Sometimes, it will quote the Turkish and then write "Perfect" on a separate line.

    Currently I check for whether there is a correction like this:

    if ($reply !~ /^perfect/i) { $chatReply->{'correction'} = $reply; }

    I don't want to check for "Perfect" anywhere in the reply as it might form part of a valid correction. So I am thinking of checking that "Perfect" appears either at the start of the reply or at the end like this:

    if ($reply !~ /^perfect/i and $reply !~ /perfect$/i) { $chatReply->{'correction'} = $reply; }

    But is there a way of combining those two regexps into just one? It seems there should be...
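
    One way the two tests might be combined is with alternation inside a single pattern. The following is a minimal sketch, not a definitive answer: the helper name needs_correction is hypothetical, and tolerating whitespace or punctuation around "Perfect" is an assumption that goes slightly beyond the two patterns shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the reply should be stored as a correction.
# The alternation matches "perfect" as a whole word at the very start of the
# reply, or at the very end (trailing whitespace/punctuation is tolerated,
# which is an assumption rather than part of the original checks).
sub needs_correction {
    my ($reply) = @_;
    return $reply !~ /\A\s*perfect\b|\bperfect\W*\z/i;
}

for my $reply ("Perfect", "\"Merhaba\"\nPerfect.", "A perfect example would be: Merhaba") {
    printf "%-40s => %s\n",
        $reply =~ s/\n/\\n/gr,
        needs_correction($reply) ? 'correction' : 'no correction';
}
```

    A literal merge of the two original patterns would be /^perfect|perfect$/i, since alternation binds more loosely than the anchors. The sketch uses \A and \z with word boundaries so that a word such as "Perfectly" at the start of a genuine correction is not mistaken for "Perfect".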

Yet another Encoding issue... in Seekers of Perl Wisdom
3 direct replies
by Bod
on Jun 01, 2024 at 15:34

    I'm using AI::Chat to create an AI-powered chat for practising Turkish. The first part is for the AI to analyse the Turkish supplied by the user (me) and check it for errors. Because Turkish uses some non-Latin characters in its alphabet, this has created another character encoding issue for me. To eliminate the OpenAI API and AI::Chat, I have created this test script that demonstrates the issue...(no apologies for inline CSS marto - this is a quick and dirty test script!)

    #!/usr/bin/perl
    use CGI::Carp qw(fatalsToBrowser);
    use lib "$ENV{'DOCUMENT_ROOT'}/cgi-bin";
    use JSON;
    use utf8;
    use incl::HTMLtest;
    use AI::Chat;

    use strict;
    use warnings;

    if ($data{'userChat'}) {
        my $reply = {};
        $reply->{'response'} = $data{'userChat'};
        print "Content-type: application/json\n\n";
        print encode_json $reply;
        exit;
    }

    print<<"END_HTML";
    Content-type: text/html; charset=UTF-8

    <html>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <head>
    <script>
    function sendChat() {
        if (document.getElementById('userChat').innerText.length > 2) {
            fetch('?userChat=' + encodeURIComponent(document.getElementById('userChat').innerText))
            .then((resp) => resp.json())
            .then((json) => {
                document.getElementById('chatBox').innerHTML += '<div class="textResponse">' + json.response + '</div>';
                document.getElementById('userChat').innerText = '';
            });
        }
    }
    </script>
    </head>
    <body>
    <div id="chatBox" style="border:solid thin blue;min-height:100px"></div>
    <div id="userChat" contenteditable="true" style="border:solid thin grey"></div>
    <input type="button" value="send" onClick="sendChat();">
    </body>
    </html>
    END_HTML

    The incl::HTML module (here renamed to incl::HTMLtest) takes the URL query string and splits it up into key value pairs that it puts into %data

    In this minimalistic script, text is entered into <div id="userChat"> and sent back to the Perl script when the button is clicked. This uses the fetch API. The content is in $data{'userChat'} which is just sent back as a very simple JSON object to be written into <div id="chatBox">.

    This works as expected until we introduce non-Latin characters - for example "café" which gets displayed as "cafĂ©"

    I've captured the query string before decoding and it is "userChat=caf%C3%A9"

    It seems very strange to me that we start off with four characters in "café" and seem to get to five with "caf%C3%A9" which gets decoded as five characters...

    The code that does the decoding in incl::HTML looks like this. I cannot recall where it came from but it has been working for many, many years and has definitely handled Turkish characters in the past under Perl v5.16.3. I wonder if it is failing after the change to Perl v5.36.0

    my @pairs = split /&/, $query_string;
    foreach my $p(@pairs) {
        $p =~ tr/+/ /;
        $p =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
        my ($key, $val) = split /=/, $p, 2;
        $data{$key} = $val;
    }

    I am beginning to think that I will never understand this mysterious world of character encodings...then I remember that for many, many years references, especially hashrefs were a total mystery to me and now I use them without having to think too hard about it. This is in no small part thanks to the Monastery and I'm hoping a similar magical revelation might be bestowed on me for character encoding! Everything was so much easier when all we had was ASCII!
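
    The jump from four to five is the difference between characters and bytes: "é" is one character but two bytes (C3 A9) in UTF-8, and the percent-decoding loop above produces bytes. A minimal sketch of one possible fix follows, assuming the query string is always UTF-8. The helper name parse_query is hypothetical, it splits each pair before unescaping (unlike the loop above), and the core module JSON::PP stands in for JSON so the example is self-contained.

```perl
use strict;
use warnings;
use Encode qw(decode);
use JSON::PP qw(encode_json);

# Hypothetical helper based on the loop above, with one extra decode step.
sub parse_query {
    my ($query_string) = @_;
    my %data;
    for my $pair (split /&/, $query_string) {
        $pair =~ tr/+/ /;
        # Split first so an escaped "=" inside a value cannot confuse the split.
        my ($key, $val) = split /=/, $pair, 2;
        for ($key, $val) {
            next unless defined;
            s/%([a-fA-F0-9]{2})/chr(hex($1))/eg;   # percent-escapes -> raw bytes
            $_ = decode('UTF-8', $_);              # raw UTF-8 bytes -> characters
        }
        $data{$key} = $val;
    }
    return \%data;
}

my $data = parse_query('userChat=caf%C3%A9');
print length($data->{userChat}), "\n";                       # length in characters
print encode_json({ response => $data->{userChat} }), "\n";  # JSON as UTF-8 bytes
```

    encode_json expects character strings and emits UTF-8 bytes. If it is handed the undecoded bytes instead, each byte is treated as a separate character and encoded again, which would be consistent with the garbled output described above, though other causes are possible.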

CGI::Carp sometimes fails in Seekers of Perl Wisdom
4 direct replies
by Bod
on May 18, 2024 at 12:04

    I'm using the following code...

    #!/usr/bin/perl
    use CGI::Carp qw(fatalsToBrowser);

    use strict;
    use warnings;

    use lib ( "$ENV{'DOCUMENT_ROOT'}/../lib", "$ENV{'DOCUMENT_ROOT'}/../../prod/lib" );

    use Bod::CRM;
    use Site::Utils;

    die "X - " . $file{'receipt', 'file'};

    The variable $file{'receipt', 'file'} is defined using our and exported with Exporter in the private module Site::Utils.

    Different die statements give very different results
    die "here!"; - CGI::Carp works as expected
    die "X - " . $xxx; - CGI::Carp works as expected - global symbol $xxx requires...
    die "X - " . $file{'receipt', 'file'}; - CGI::Carp doesn't write anything to the browser - I just get a 500 error from the browser.

    How can this be?

JSON encoding error in Seekers of Perl Wisdom
2 direct replies
by Bod
on May 09, 2024 at 09:31

    Having changed server from CentOS to Debian 12 and Perl version from 5.16.3 to 5.36.0, I am having lots of difficulties with character encoding. A topic I don't properly understand. This problem is currently manifesting itself as a failed Stripe webhook. The webhook was working fine before and now it isn't!

    Stripe gives me this error:
    Invalid encoding: ISO-8859-1

    I have checked the Debian encoding (I think) and it says:

    ~# echo $LANG
    C.UTF-8

    Here is a very cut-down version of my code to demonstrate the problem:

    #!/usr/bin/perl
    use CGI::Carp qw(fatalsToBrowser);

    use strict;
    use warnings;
    use utf8;
    use JSON;

    print "Content-type: application/json; charset=UTF-8\n\n";
    print encode_json {
        'testing' => 'some test',
    };

    If I call the webhook endpoint in browser, I get the expected JSON output but Stripe gives me the encoding error.

    Given that only two things have changed, server OS and Perl version, it must be one of these. Is there anything I need to do on the server to ensure that it is serving UTF-8 output correctly?

    There are other problems that I think are unconnected, but wise monks may find helpful as they could have the same cause...

    • .htaccess files don't work - they cause a 403 error as soon as RewriteEngine is enabled
    • When I edit some (but not all) Perl scripts, the permissions change from the Plesk psacln group and the correct owner to root
    • When duplicating or copying a Perl script the permissions always change to root and from 755 to 644
    • If I set up FTP accounts, they only have read permission regardless of what group(s) the account gets put in

Install on demand ? in Seekers of Perl Wisdom
5 direct replies
by Bod
on May 07, 2024 at 08:36

    I am using Spreadsheet::Read to provide a universal way to read spreadsheets that users upload. I'm using this module because I have no idea what flavour of spreadsheet users might want to upload.

    This module relies on others to do the work of reading the data. Is there a way to install the modules it uses "on demand"? So only when we first see a Lotus 1-2-3 spreadsheet, for example, do we install the module to read it.

    The purpose of this is to convert the spreadsheet to CSV, so I have it in a standard format for the next part of the processing, which is mapping the data fields ready to import into a CRM - perhaps I am overthinking the universal spreadsheet part...

Encoding issue after upgrade in Seekers of Perl Wisdom
1 direct reply
by Bod
on May 06, 2024 at 17:08

    After a server change, we are getting lots of strange characters from an encoding issue.
    Double spaces and emojis are displayed as Â

    I think this issue is related to the change of Perl version from 5.16.3 to 5.36.0. From the Perl Delta, I note there have been some changes to the way Perl handles UTF encoding, but I don't understand the implications of this.

    We've also upgraded MariaDB from 10.5 to 10.11 but both the character set and the collation are the same. utf8mb4 and utf8mb4_general_ci respectively.

    This issue is not just about data that was created prior to the change. Although emojis created after the change are not mutilated, double spaces are.

    All web output is UTF8 encoded using:
    Content-Type: text/html; charset=UTF-8

    Any suggestions as to where I should look to solve this issue?
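
    A stray "Â" is the classic symptom of UTF-8 bytes being read as ISO-8859-1. The sketch below illustrates this for a no-break space, assuming (this is not confirmed above) that the "double spaces" contain U+00A0 characters, as is common when a space is followed by a non-breaking space in HTML content.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $nbsp  = "\x{A0}";                       # NO-BREAK SPACE as a single character
my $bytes = encode('UTF-8', $nbsp);         # its UTF-8 form is two bytes
my $wrong = decode('ISO-8859-1', $bytes);   # misreading those bytes gives two characters

printf "UTF-8 bytes: %s\n", join ' ', map { sprintf '%02X',    ord } split //, $bytes;
printf "misread as:  %s\n", join ' ', map { sprintf 'U+%04X', ord } split //, $wrong;
```

    The first misread character is U+00C2, which is "Â". If this matches the symptom, the places to look would be wherever bytes become characters or the reverse: the database connection, any file reads, and the output layer.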

UK tax system uses Perl in Meditations
1 direct reply
by Bod
on Apr 08, 2024 at 10:47

    I've just found out the HMRC (the UK's taxation department) uses Perl for at least some of its operations...

    The system is troubled today and I received this error revealing the language

    Ref: /home/ewf/MODULES/Common/ Error Code 401 at line 30

    Update: corrected typo in title

Server Time in Seekers of Perl Wisdom
2 direct replies
by Bod
on Apr 07, 2024 at 06:56

    We've started the process of moving hosting server to a cheap VPS - just non-essential (read "hobby") sites for now whilst we get to grips with it and I learn how to manage a server...

    All is going relatively well so far, but I've found an anomaly with time settings. Can anyone explain what is going on?

    The 'old' server observes DST whereas the 'new' server doesn't. If I run timedatectl on the 'new' server it tells me it is set to UTC. I don't have access to run the same command on the 'old' server.

    Here in the UK we are now on BST (GMT+1) so things that are time dependent (like Google Calendar feeds) that have been moved over are all 1 hour out. If I get the time from the MariaDB database SELECT NOW() I get GMT, as expected.

    However, I have a bit of test code left in a page that only I use. It's a bit of JavaScript document.write(document.lastModified); which shows the time in BST. Doesn't that time get passed in the HTTP headers from the Perl generated web page? Perl is, of course, also reporting GMT.

    The obvious solution seems to be to change the server time from UTC to GMT.
    Will that then observe UK DST?
    Are there any reasons not to change the server to GMT bearing in mind that the entire codebase was written on a server that observes DST?

    Update: - the "obvious" option to set would be 'Europe/London' but that is not included in timedatectl list-timezones

PERL5LIB not in @INC in Seekers of Perl Wisdom
3 direct replies
by Bod
on Mar 23, 2024 at 10:56

    I have some common modules that are used in several places so I have created a directory for them at /usr/lib/perl_modules. I want to include this location in @INC for every user, including CRON.

    I've added export PERL5LIB=/usr/lib/perl_modules in both /etc/environment and /etc/profile.

    When I list the environment variables, PERL5LIB is there as expected. But when I try to use one of the modules I get an error:

    Can't locate in @INC (you may need to install the my_module module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.36.0 /usr/local/share/perl/5.36.0)

    There are other locations in @INC but /usr/lib/perl_modules is not one of them...

    I suspect the environment variable is set for root that I'm using to list the variables, but not for whatever process is running the script within Apache.

    How can I properly set PERL5LIB for all users and processes or is there a better way to get an extra entry in @INC for every script without having to use lib in every script?
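
    To test the suspicion that the web server process does not see the variable, a small diagnostic CGI script could report what Perl actually sees when run under Apache. This is only a sketch: the helper name inc_report is hypothetical, and it diagnoses the problem rather than fixing it.

```perl
use strict;
use warnings;

# Hypothetical helper: reports PERL5LIB as seen by this process and whether
# the wanted directory made it into @INC.
sub inc_report {
    my ($wanted) = @_;
    my $env   = defined $ENV{PERL5LIB} ? $ENV{PERL5LIB} : '(not set)';
    my $found = grep { $_ eq $wanted } @INC;
    return "PERL5LIB=$env\n"
         . "$wanted in \@INC: " . ($found ? 'yes' : 'no') . "\n"
         . join('', map { "  $_\n" } @INC);
}

print "Content-type: text/plain\n\n", inc_report('/usr/lib/perl_modules');
```

    Running the same script from a root shell, from cron, and through the web server should show which environments actually receive the variable.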

Holding site variables in Seekers of Perl Wisdom
7 direct replies
by Bod
on Mar 21, 2024 at 06:39

    We operate a number of websites, all of which operate on the same server.

    Currently, I am the only developer. But that is likely to change over the next 18 months or so. I'm making some changes that present the opportunity to make some improvements to the internal design and security of the sites. I'm looking for some input on the "best" way to do this. Any input welcome but especially around global site variables.

    Currently we have this directory structure (plus a few others omitted for simplicity):

    site/prod/bin/
    site/prod/lib/
    site/prod/template/
    site/prod/www/
    site/test/bin/
    site/test/lib/
    site/test/template/
    site/test/www/

    Every site has identical code in prod and test (except for during development of course) except for one file site/lib/ which declares the site variables needed for that site and environment. Things like the DB credentials, the DB instance to connect to, Stripe keys, API keys, etc.

    use strict;
    use warnings;

    our $env_db_user = 'dbusername';
    our $env_db_pass = 'dbpassword';
    our $env_paypal = 'PP username';

    # etc, etc, etc

    There is no logic code in this module, it just defines variables with our. This module is used by a utility module that is used by every script on the website.

    When we bring another developer onboard, I want to split the site variables into two - those they have access to (test database schema name, test Stripe keys, etc) and those they don't (live Stripe keys, database credentials, etc). I could relocate this file to further up the directory structure where they don't have access, but I feel sure there is a better way to handle this as it must be a common problem in multi-developer environments.

    What I have works well and is not in need of imminent change. But I have the opportunity to make it more robust as I am making other changes.

    What advice can you give on this matter, kind and wise Monks?
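
    One possible shape for the split is two configuration files merged at load time, with the secrets file readable only by the accounts that need it. The sketch below is an illustration under stated assumptions, not a recommendation of a specific layout: the function name load_site_config, the file names and the keys are all hypothetical, and temporary files stand in for the real locations.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical loader: each config file is a Perl file that returns a hash ref.
# Files that are missing or unreadable (for example secrets a developer has no
# access to) are skipped; values from later files override earlier ones.
sub load_site_config {
    my (@files) = @_;
    my %config;
    for my $file (@files) {
        next unless -r $file;
        my $part = do $file;
        die "Cannot load $file: " . ($@ || $!) unless ref($part) eq 'HASH';
        %config = (%config, %$part);
    }
    return \%config;
}

# Demonstration with throwaway files; real paths and keys would differ.
my $dir    = tempdir(CLEANUP => 1);
my $shared = File::Spec->catfile($dir, 'shared.pl');
my $secret = File::Spec->catfile($dir, 'secret.pl');

my %content = (
    $shared => q(+{ db_schema => 'site_test', stripe_key => 'test key' }),
    $secret => q(+{ db_pass => 'dbpassword', stripe_key => 'live key' }),
);
for my $file (keys %content) {
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} $content{$file};
    close $fh or die "Cannot close $file: $!";
}

my $developer = load_site_config($shared);            # secrets file not supplied
my $live      = load_site_config($shared, $secret);   # secrets override shared values

print "developer sees: ", join(', ', sort keys %$developer), "\n";
print "live sees:      ", join(', ', sort keys %$live), "\n";
```

    Because do executes the file as Perl, this only suits files that are as trusted as the code itself. A plain data format read by a parser would avoid that, at the cost of an extra dependency or some parsing code.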