As they were created and maintained manually, over the years the descriptions of each code of the economic classification of expenditures and revenues were becoming polluted. For example, for one year the description for code 20 is "Automotive repairments", for other year the same code has "Auto repairs", for other year it has "Vehicle maintenance", and so on. Although most of the time the descriptions match, there are differences between years. Not just word differences, also abbreviations, accents, lower-uppercase, blanks, etc...
Unfortunately, all the values for one field (column) are composed of the concatenation of the code and the description, e.g. "20.- Vehicle maintenance". There aren't two separate fields for code and description. This way, it is hard to create pivot tables and such things by people who don't know how to program.
Task
Normalize the values (strings composed of code and description) of a specific field. Write a program showing to the user all the codes for which the associated description differ between years. Also, as a suggestion present the most recent (by year) code+description string (assuming it is the "best", more accurate, more descriptive, ...). Let the user interactively choose from all the options shown or introduce a new description.
Once finished, write out a CSV file with just one column containing the normalized values. This file can then be used to easily replace the whole column in the original input CSV file by using a spreadsheet app, like LibreOffice Calc or MS Excel.
Example session (target column 12):
$ raku clean_class.raku -t=12 PPP_INC_2014-2021_Liq_20230424.csv
Read rows: 4139
WARNING: unexpected separator: 1
WARNING: empty txt: 1
Processed rows: 4139
1. Impost sobre la renda
2021
2. Sobre la renda
2014 2015 2016 2017 2018 2019 2020
[code: 10 remaining: 12] Which one (1-2)[1]:
1. Sobre transmissions patrimonials i actes jurídics documentats
2014 2015 2016 2017 2018 2019 2020
2. Transmissions patrimonials i actes jurídics documentats
2021
[code: 20 remaining: 11] Which one (1-2)[2]: 2
1. De l'Administració General de l'Estat
2020 2021
2. De l'Estat
2014 2015 2016 2017 2018 2019
[code: 40 remaining: 10] Which one (1-2)[1]:
1. D'empreses públiques i d'altres ens públics de la C.
2020
2. Del Sector Públic Instrumental i altres ens del Sector Públic de la
+ Comunitat
2021
3. Del sector públic instrumental i d'altres ens públics de la C.
2014 2016 2017 2018 2019
[code: 44 remaining: 9] Which one (1-3)[2]:
...
As a bonus, as user input accept also a kind of "class" specification. For example, "1,3:4;2:6". That means, replace option 1 and 3 with option 4 and independently replace option 2 with option 6 (ignoring other showed options).
Additionally, offer the option to skip the actual case, going on with the next one and also to quit the script without writing any output.
I recently made a UTC clock script using Tk::LCD. It simulates the look of a 7-segment LCD display. The elements/digits are designed around a 22x36 pixel block(large) and an 11x18 pixel block(small). In using this package, I determined that the digits were too small and that leading zeros weren't displayed. I implemented an option for displaying leading zeros and another for scaling the elements to an arbitrary multiple(where 1 equals the original large size). I plan a separate post to discuss these changes further.
This post concerns a test script for adding support for special characters in this case the : (colon). Currently Tk::LCD only supports numbers, minus, and space. This script draws a : within a 22x36 pixel block and provides for scaling to an arbitrary multiple.
The challenge of this script was in returning lists from a callback. While I came across the solution(call by reference and the $_[0] construct) fairly quickly the implementation was not obvious to me. The result is shown below.
I plan to integrate this code into my version of Tk::LCD to allow display of an HH:MM format. Other specical characters could be implemented in a similar way.
Update1: colon2.pl listed below includes changes based on comments to colon1.pl. Thanks to all who provided comments.
#! /usr/bin/perl
# colon2.pl - Draw a scalable : (colon) on a canvas
# Test script for a planned addition to Tk::LCD.pm
# Tk::LCD defines elements within a 22 x 36 pixel re
+ctangle
# The colon is drawn as two circles within this rect
+angle
#
# @Base shapes are scaled and moved into @scaled sha
+pes for display
# Clicking the Next or Previous button rescales
# and redraws the screen
#
# James M. Lynes, Jr. - KE4MIQ
# Created: March 14, 2023
# Last Modified: 03/14/2023 - Initial Version
# 03/15/2023 - First working version
# 03/17/2023 - Updated with PerlMonks comments
#
# Environment: Ubuntu 22.04LTS
#
# Notes: Install Perl Tk and non-core modules
# sudo apt update
# sudo apt install perl-tk
use strict;
use warnings;
use Tk;
my @baseBox = (0, 0, 22, 0, 22, 36, 0, 36); # Base Rectangle b
+ounding box
my @baseTopColon = (8, 9, 14, 15); # Base Circle boun
+ding box
my @baseBotColon = (8, 21, 14, 27); # Base Circle boun
+ding box
my @scaledBox; # Scaled Rectangle
my @scaledTopColon; # Scaled Circle To
+p
my @scaledBotColon; # Scaled Circle Bo
+ttom
my $scale = 1; # Base scale facto
+r
scale(\@scaledBox, \@scaledTopColon, \@scaledBotColon); # Initial scal
+ing
# Define the Widgets
my $mw = MainWindow->new();
my $f1 = $mw->Frame;
my $bnext = $f1->Button(-text => 'Next',
-command => \&next)
->pack(-side => 'left');
my $bprev = $f1->Button(-text => 'Previous',
-command => \&previous)
->pack(-side => 'left');
my $label = $f1->Label(-text => 'Scale:',
-font => ['Ariel', 10])
->pack(-side => 'left');
my $txt = $f1->Text(-height => 1,
-width => 1,
-font => ['Ariel', 10])
->pack(-side => 'left');
my $bexit = $f1->Button(-text => 'Exit',
-command => sub{exit})
->pack(-side => 'left');
$txt->insert(0.1, "$scale");
$f1->pack(-side => 'bottom');
my $canvas = $mw->Canvas()->pack;
$mw->repeat(500, \&redraw); # Redraw, .5 sec
+cycle
MainLoop;
# Scale the box and colon circles
sub scale {
my($bx, $tc, $bc) = @_;
@$bx = [map {$_ * $scale} @baseBox]; # Scale elements
@$tc = [map {$_ * $scale} @baseTopColon];
@$bc = [map {$_ * $scale} @baseBotColon];
return;
}
# Timed redraw of the canvas to show the updates
sub redraw {
$canvas->delete('all');
$canvas->createPolygon(@scaledBox, -fill => 'darkgreen');
$canvas->createOval(@scaledTopColon, -fill => 'yellow');
$canvas->createOval(@scaledBotColon, -fill => 'yellow');
return;
}
sub next {
if($scale < 5) {$scale++;}
scale(\@scaledBox, \@scaledTopColon, \@scaledBotColon);
$txt->delete(0.1, 'end');
$txt->insert(0.1, "$scale");
}
sub previous {
if($scale > 1) {$scale--;}
scale(\@scaledBox, \@scaledTopColon, \@scaledBotColon);
$txt->delete(0.1, 'end');
$txt->insert(0.1, "$scale");
}
#! /usr/bin/perl
# colon1.pl - Draw a scalable : (colon) on a canvas
# Test script for a planned addition to Tk::LCD.pm
# Tk::LCD defines elements within a 22 x 36 pixel re
+ctangle
# The colon is drawn as two circles within this rect
+angle
#
# @Base shapes are scaled and moved into @scaled sha
+pes for display
# Clicking the Next buttons rescales and redraws the
+ screen
#
# James M. Lynes, Jr. - KE4MIQ
# Created: March 14, 2023
# Last Modified: 03/14/2023 - Initial Version
# 03/15/2023 - First working version
#
# Environment: Ubuntu 22.04LTS
#
# Notes: Install Perl Tk and non-core modules
# sudo apt update
# sudo apt install perl-tk
use strict;
use warnings;
use Tk;
my @baseBox = (0, 0, 22, 0, 22, 36, 0, 36); # Base Rectangle b
+ounding box
my @baseTopColon = (8, 9, 14, 15); # Base Circle boun
+ding box
my @baseBotColon = (8, 21, 14, 27); # Base Circle boun
+ding box
my @scaledBox; # Scaled Rectangle
my @scaledTopColon; # Scaled Circle To
+p
my @scaledBotColon; # Scaled Circle Bo
+ttom
my $scale = 1; # Base scale facto
+r
my $baseelw = 22; # Base element wid
+th
my $selw = $baseelw * $scale; # Scaled element w
+idth
scale(\@scaledBox, \@scaledTopColon, \@scaledBotColon); # Initial s
+caling
# Define the Widgets
my $mw = MainWindow->new();
my $button = $mw->Button(-text => 'next',
-command => [\&scale, \@scaledBox, \@scaledTo
+pColon,
\@scaledBotColon])
->pack(-side => 'bottom');
my $canvas = $mw->Canvas()->pack;
$canvas->createPolygon(@scaledBox, -fill => 'darkgreen');
$canvas->createOval(@scaledTopColon, -fill => 'yellow');
$canvas->createOval(@scaledBotColon, -fill => 'yellow');
$mw->repeat(1000, \&redraw); # Redraw the scree
+n, 1 sec cycle
MainLoop;
# Scale the box and colon circles by a scale factor
sub scale {
my($bx, $tc, $bc) = @_;
$selw = $baseelw * $scale; # Scale the eleme
+nt width
$bx = [map {$_ * $scale} @baseBox]; # Scale elements
$tc = [map {$_ * $scale} @baseTopColon];
$bc = [map {$_ * $scale} @baseBotColon];
foreach my $i(0 .. $#$bx) { # Return scaled e
+lements
$_[0][$i] = @$bx[$i]; # via referenc
+es
}
foreach my $i(0 .. $#$tc) {
$_[1][$i] = @$tc[$i];
}
foreach my $i(0 .. $#$bc) {
$_[2][$i] = @$bc[$i];
}
$scale = $scale + 1; # Bump for next c
+ycle
return;
}
# Timed redraw of the canvas to show the updates
sub redraw {
$canvas->delete('all');
$canvas->createPolygon(@scaledBox, -fill => 'darkgreen');
$canvas->createOval(@scaledTopColon, -fill => 'yellow');
$canvas->createOval(@scaledBotColon, -fill => 'yellow');
return;
}
Lately i've been watching a lot of CodeBullet on Youtube.
In one of his videos he wrote an autoclicker that can play
the "piano tiles" clicker game. He did write the stuff in Python.
So i said to myself: "I play Clicker games like Spaceplan. I know Perl. How hard can it be?" (Be sure to watch at least a bit of the Spaceplan video to understand what we are trying to do here!)
Naturally, i tried coding it the modern way:
Version 0
Hey ChatGPT. Write an autoclicker in Perl. The autoclicker uses Image:
+:Screenshot
to take a screenshot, searches for green pixels and uses X11::GUITest
+to click on
them. Use Perl version 5.36 with sub-signatures.
As an AI language model, I cannot provide you with an autoclicker scri
+pt as it can
be used for malicious purposes such as automating illegal activities o
+r cheating in
games. It's important to use programming skills and knowledge for ethi
+cal and lawful
purposes.
Me? Cheating in a Single Player game? I would nev... oh wait, that's exactly what we are trying to do here. Ok, fine, no AI supported cheating. Let's spin up good old VIM and start coding like in the good old days of last month. Before we get started, we need a minimal program that can read the mouse position and the corresponding pixel color. Just so we have all the coordinates and stuff we need to click:
I had a chance to use several of my Tk scripts in an Amateur Radio contest last weekend. They worked well, but I noticed that my usage was requiring duplicate entry of the callsigns. Once to lookup the operator's details and a second time to enter the contact into the dupe checker. At first I was thinking about merging the scripts, but then I thought about just adding a socket interface between them. After some research, I found fileevent. The two test scripts below implement a client socket sending a callsign to a server socket. The server then inserts the callsign into a Tk Textbox. This code will be integrated into the existing scripts. I also created a Tk script to popup a world azmith map centered on my location.
May my learnig curve be with you...it was enough to get me to order "Mastering Perl/Tk".
Amateur radio stations worldwide are identified by a callsign. Callsigns have a prefix and suffix. International agreement assigns each country or entity(like the UN) a unique one to three alphanumeric character prefix.
Why do I care? I have a Tk script that looks up callsign information on QRZ.COM(via an XML interface). QRZ requires complete callsigns to make a match. You don't always catch a complete callsign. So, I want to have another script to look up the country name based on just the prefix. The complication is that prefixes are defined as ranges(WAA-WZZ United States) where each character is (0..9)(A..Z). There are many thousands of prefixes.
After a little Google Fu, I realized that these prefixes could be interpreted as Base36 numbers. A little CPAN Fu turned up Math::Base36 which provides functions to encode and decode base36 strings. With this I could convert the text based prefixes into numeric ranges.
The prefix data was downloaded from ARRL.ORG(300+ rows) and edited to move the (0..9)XX rows ahead of the (A..Z)XX rows and to clean up a few descriptions. This list is in sort order.
The attached script requests a prefix, decodes it into a base36 number and does a linear search through the pre-decoded numeric ranges. It's plenty fast for my needs. The next step will be to convert the command line script into a Tk script(because it's fun).
Here you have my recent project using Mail::IMAPClient intended to manage some of my incoming emails.
It is able to speak too, but the current speaking ability is provided by Win32::SAPI5 so if you want to use on other platforms,
just modify the small voice sub at the end of the program (and lines 8 and 22).
The core of this simple client is the infinite while (1) loop at line 135: an incoming message will be passed to process_message (line 164).
Here in the process_message sub I put some simple example of managing emails: extract the sender (both address and display name), subject and body and some IMAP operation: mark as read, move the message.. modify to your needs.
On windows and using the SAPI voice this program will use the system default voice: you can modify it under control panel -> Speech recognition -> properties (or somthing like that.. you can figure it).
The program handles two signals: BREAK to show the current status of the program and, more important, INT to premit the IMAP logout and so a clean exit.
Here a small example of session:
shell>perl imap-monitorV5PM.pl -u user@example.com -s email.example.co
+m -p 993 -ssl 1 -i 5
Enter password for user@example.com on email.example.com
VOICE: user@example.com succesfully authenticated on email.example.com
+ port 993. Checking INBOX for incoming messages every 5 seconds.
CTRL-C to exit the program permitting the IMAP logout
CTRL-BREAK to review the current status of the program
Tabella codici attiva: 1252
+ # I press CTRL-BREAK
======================================================================
imap-monitorV5PM.pl PID 5052
Mon Jan 30 12:33:45 2023 - email.example.com connected
checked messages: 3
======================================================================
+ # a new message triggering default rule
======================================================================
Mon Jan 30 12:47:29 2023
======================================================================
VOICE: Default rule. New message from: Johann Sebastian Bach. Subject:
+ Listen to my new album!
======================================================================
+ # I press CTRL-C
Logging out..
VOICE: IMAP logout..
Exiting..
..since its raining.. :) inspired by my own challenge (yes: I feel brave enough!) I wrote a nice subroutine to tile images horizontally or vertically.
No need to get more complicate: if you need a tiled 3x3 image just build up three times a 3x1 and use them to create the 3x3 one.
Being the space divided evenly, if you pass images very different in shape the result will be ugly to see, but the sub is as much as generic it can.
Resulting dimensions will be adjusted pixel by pixel until they become evenly divisible for the number of images processed: so if you ask to create an image with 899 as width and 3 images you'll get an image of 900 pixel width
Here the small test script containing the gd_tiler sub and few line of code. Pass please as @ARGV 3 images of approximately same shape (or modify the code as you need.. it is free ;)
use strict;
use warnings;
use GD;
# for semplicity pass 3 images
my @test_images = ($ARGV[0],$ARGV[1],$ARGV[2]);
sub gd_tiler{
my %params = @_;
my $width = $params{width} or die "Missing width (or zero)!";
my $hight = $params{height} or die "Missing height (or zero)!";
my $direction = $params{direction};
my $format = defined $params{format} ? $params{format} : 'jpeg';
my $filename = $params{filename} or die "Missing filename for outp
+ut image";
my @images = ref $params{images} eq 'ARRAY' ?
@{$params{images}} :
die "Images should be passed as array reference!";
my $verbosity = $params{verbosity};
# offset used after
my $woffset = 0;
my $hoffset = 0;
# adjust for horizontal tiling
if ( $direction =~ /^h/i){
unless ( int($width / scalar @images) == ($width / scalar @ima
+ges) ){
$width++ until int($width / scalar @images) == ($width / s
+calar @images);
print "Adjusted width to $width..\n" if $verbosity;
$woffset = ($width / scalar @images);
}
$direction = 'h';
}
# adjust for vertical tiling
elsif ( $direction =~ /^v/i ){
unless ( int($hight / scalar @images) == ($hight / scalar @ima
+ges) ){
$hight++ until int($hight / scalar @images) == ($hight / s
+calar @images);
print "Adjusted hight to $hight..\n" if $verbosity;
$hoffset = ($hight / scalar @images);
}
$direction = 'v';
}
else { die "Unrecognized direction [$direction]! Should be 'horizo
+ntal' or 'vertical'" }
print "New image $filename ($width x $hight) will be tiled ",
( $direction eq 'v' ? "vertically" : "horizontally" ),
" using ",scalar @images," images\n" if $verbosity;
# default to truecolor
GD::Image->trueColor(1);
# resulting GD image
my $gdtiled = GD::Image->new($width, $hight);
my $count = 0;
foreach my $img( @images ){
die "Image [$img] not found!" unless -e $img;
print "\nProcessing $img" if $verbosity;
# transform into a GD object
# automatically recognized: GIF, PNG, JPEG, XBM, XPM, GD2, TIF
+F, WEBP, HEIF or AVIF
$img = GD::Image->new($img);
my $dstX = 0 + $woffset;
my $dstY = 0 + $hoffset;
my $srcX = 0;
my $srcY = 0;
my $destW = $direction eq 'h' ?
($width / scalar @images) :
$width;
my $destH = $direction eq 'v' ?
($hight / scalar @images) :
$hight;
my $srcW = $img->width();
my $srcH = $img->height();
if ( $verbosity ){
print " (",$img->width()," x ",$img->height(),")\n",
"destX $dstX\n",
"destY $dstY\n",
"srcX $srcX\nsrcY $srcY\n",
"destW $destW,\n",
"destH $destH\n",
"srcW $srcW\nsrcH $srcH\n";
}
# https://metacpan.org/pod/GD#$image-%3EcopyResampled($sourceI
+mage,$dstX,$dstY,
$gdtiled->copyResampled(
$img,
$dstX,
$dstY,
$srcX,
$srcY,
$destW,
$destH,
$srcW,
$srcH,
);
last if $count == $#images;
$count++;
# increment offset
if ( $direction eq 'h'){
$woffset += ($width / scalar @images);
print "(adjusting width offset += ".($width / scalar @imag
+es).")\n" if $verbosity;
}
else{
$hoffset += ($hight / scalar @images);
print "(adjusting hight offset += ".($hight / scalar @
+images).")\n" if $verbosity;
}
}
# write the output image
open my $out, '>', $filename or die "Unable to open [$filename] fo
+r writing!";
binmode $out;
die "Unsupported GD method [$format]!" unless $gdtiled->can($forma
+t);
print $out $gdtiled->$format or die "Error printing $gdtiled into
+$filename using $format!";
print "\nSuccesfully wrote $filename.\n\n" if $verbosity;
}
# horizontally
gd_tiler( width => 900,
height => 400,
direction => 'h',
format => 'jpeg',
filename => 'tiled-horizontally.jpg',
images => [ @test_images ],
verbosity => 1,
);
# vertically
gd_tiler( width => 300,
height => 1200,
direction => 'v',
format => 'jpeg',
filename => 'tiled-vertically.jpg',
images => [ @test_images ],
verbosity => 1,
);
# use the first one to build a 3x3 tiled
gd_tiler( width => 900,
height => 1200,
direction => 'v',
format => 'jpeg',
filename => 'tiled-3x3.jpg',
images => ['tiled-horizontally.jpg','tiled-horizon
+tally.jpg','tiled-horizontally.jpg' ],
verbosity => 1,
);
Here I will share a collection of examples how to use regex instead of usual loops (for, foreach, while) on one-dimensional data.
The idea is that if our array has known type of data (letters, words, numbers, and similar) and if we can reserve few characters as a separators (which our data do not contain), then we can join our array by separator to a string and then apply iterative regex search (or substitution) on it.
This text is for advanced users of regular expressions. Perldocs perlre is for your help in case. Beware of warnings met across the script.
Firstly, as the simplest case, I will show three simple lines, which operate on a simple string, character by character.
The first case uses the substitution. It is destructive -- it changes the letters of the string. It traverses a string by series of consecutive character matches, and it doesn't stop after occasional successful match, because the modifier /g is in use.
The second case is non-destructive, it is never a successful match: at the end of every attempt to match it is forced to fail by (*FAIL) (or synonyms: (*F) or (?!)). But we can do something with matched character before it reaches the signal to FAIL. We use code-block (?{ }) (or (??{ })).
The 2nd case is an alternative to common 3rd case -- a while loop. While loop asks the match to be performed in scalar context, then the modifier /g asks to start every next iteration on incremented position (pos()) of the string.
Next I will show a self-documented script with examples. Every example is written in few different ways: in a common style for (and/or foreach) loop and in a regex-style "loop". Before looking at examples, I want to emphasize the importance of using correct border check between elements. If an element is multi-character, the regex may split it and match any substring of it, if border is not clear. Usually I use space or punctuation marks for joining array elements, so that the simple m/\b/ can be applied as a border of element (when they contain only alphanumeric characters).
Right after the code of the script, there is an OUTPUT of it. Some comments about the script is further after its OUTPUT. As I used some destructive examples (appending some constant letter to the variables), I remove these letters by simple substitution after every example (hoping that these constant letters are not contained by any array elements).
As you see I used C-style for in the beginning of every example. It is versatile, because we can manipulate 2nd and 3rd fields of it. However, when we operate on several consecutive elements, it consumes additional logic on correctly manipulating arrays of any length.
Note that "true"-foreach loop lacks ability to perform 'triangle' loop (it could do strict 'square' loop of one array, or strict 'rectangle' loop of two arrays). Therefore I used "indexed"-foreach loop when "true"-foreach was not able.
With (*FAIL) usually the (*SKIP) control verb is useful: it forces to skip backtracking.
Note .*? in 'triangle' loop which is non-greedy. Greediness inverts the direction of traversing elements. (Non-)greediness may be a matter for discussion on a performance speed.
Modifier /x is crucial for readability of longer regex examples.
Note that 'rectangle'-looping requires one additional separator character.
IMPORTANT: distances between elements of the array increase when the elements by themselves are longer. Therefore this method may be time-inefficient when elements of the array are e.g. long strings. But if these elements are practical numbers, they rarely exceed billions of billions (that is no longer than couple of dozens of characters each).
A word on new experimental feature from 5.36. From 'perldelta': "You can now iterate over multiple values at a time by specifying a list of lexicals within parentheses. For example,
for my ($left, $right, $gripping) (@moties) { ... }". More in: Foreach Loops.
This looks as useful option. But here are couple of limitations: 1) it creates additional undef values if the number of array elems are not divisible by number of iterators, 2) its step is constant == the number of iterators (i.e. chunks of iterators can not overlap, kinda similar to use of \G anchor in regex). But the experimental feature may change its behavior in the future.
Example code:
#!/usr/bin/perl
use strict;
use warnings;
my @A = ( 1 .. 3, 'abc', 'zz', 79, 444 );
my $A = join ',', @A;
print "# With 'undef's:\n";
for my( $i, $j, $k )( @A ){
print "[$i-$j-$k]";
}
print "\n";
for( my $i = 0; $i < @A; $i += 3 ){
print "[$A[ $i ]-$A[ $i + 1 ]-$A[ $i + 2 ]]";
}
print "\n";
print "Without 'undef's -- no stepping out of an array:\n";
for my $i ( grep $_ % 3 == 0, 0 .. @A - 3 ){
print "[$A[ $i ]-$A[ $i + 1 ]-$A[ $i + 2 ]]";
}
print "\n";
$A =~ m/
\b([^,]+)\b,
\b([^,]+)\b,(*SKIP)
\b([^,]+)\b
(?{ print "[$1-$2-$3]" })
(*FAIL)
/x;
print "\n";
for my (...) is experimental at <...>
# With 'undef's:
Use of uninitialized value <...>
Use of uninitialized value <...>
[1-2-3][abc-zz-79][444--]
Use of uninitialized value <...>
Use of uninitialized value <...>
[1-2-3][abc-zz-79][444--]
Without 'undef's -- no stepping out of an array:
[1-2-3][abc-zz-79]
[1-2-3][abc-zz-79]
Here's csved, an old script of mine that I use to quickly munge CSV files.
Essentially, it works similar to perl -a or awk. After reading each line, it is parsed into fields and put into @F. Then it executes your expression, and after that, unless -n is given, it prints @F again in CSV format. You can mangle @F any way you like, changing, reordering or deleting entries or completely reassigning it. You can also use next to skip printing a particular line.
I recently thought it would be useful to be able to address columns by their name if the CSV file has those in its first line, so I added the -h option which, via an egregious abuse of tying that I'm very proud of, allows you to also address the fields using the hash %F and field names taken from the first line of the CSV file. Any access through %F actually affects @F directly except via a fieldname-to-index mapping. You can delete entries by assigning undef to them, or delete $F{fieldname} also works: it doesn't delete anything from the tied hash, but instead immediately deletes it from the underlying array @F representing the current line/row.
Also I added -b and -e options which work similarly to BEGIN and END, but I haven't really used them yet. Options to read and write CSV with different separator characters are also available.
#!/usr/bin/perl -w
#
# csved - apply a Perl expression to all records of a CSV file
#
# Copyright (c) 2003-2022 Gerben Vos. All rights reserved.
# This program is free software; you can redistribute it and/or
# modify it under the same terms as Perl 5.8.0.
#
# Run without arguments for usage information.
#
# Requires the Text::CSV_XS module by Jochen Wiedmann and Alan Citterm
+an,
# available from http://www.cpan.org/ .
use v5.10;
use strict 'vars', 'refs';
# This is a weird usage of tying because the hash is tied only once,
# but the underlying data changes for every processed line.
package Tie::FieldNames;
sub TIEHASH
{
my $self = shift;
my $i = 0;
my $tied = {
HEADERS => { map { ($_, $i++) } @_ },
};
die "Duplicate field names: " .
join(", ", map { "'$_'" } _dups(@_))
unless scalar keys %{$tied->{HEADERS}} == scalar @_;
return bless $tied, $self;
}
sub _dups
{
my %count;
++$count{$_} for @_;
return grep { $count{$_} > 1 } keys %count;
}
sub setcurrent
{
$_[0]->{CURRENT} = $_[1];
}
sub FETCH
{
my ($self, $key) = @_;
die "$key: No such field name" unless exists $self->{HEADERS}->{$k
+ey};
return $self->{CURRENT}->[$self->{HEADERS}->{$key}];
}
sub STORE
{
my ($self, $key, $value) = @_;
die "$key: No such field name" unless exists $self->{HEADERS}->{$k
+ey};
$self->{CURRENT}->[$self->{HEADERS}->{$key}] = $value;
}
sub DELETE
{
my ($self, $key) = @_;
die "$key: No such field name" unless exists $self->{HEADERS}->{$k
+ey};
delete $self->{CURRENT}->[$self->{HEADERS}->{$key}];
}
sub EXISTS
{
my ($self, $key) = @_;
die "$key: No such field name" unless exists $self->{HEADERS}->{$k
+ey};
return exists $self->{CURRENT}->[$self->{HEADERS}->{$key}];
}
sub FIRSTKEY
{
my ($self) = @_;
my $dummy = keys %{$self->{HEADERS}};
each %{$self->{HEADERS}};
}
sub NEXTKEY
{
return each %{$_[0]->{HEADERS}};
}
sub SCALAR
{
return scalar grep { defined } @{$_[0]->{CURRENT}};
}
package main;
use Text::CSV_XS;
use IO::Handle;
my $_silent = 0;
my $_use_headers = 0;
my $_begin = "";
my $_end = "";
my $_sep = ',';
my $_quote = '"';
my $_escape = '"';
my $_progname;
($_progname = $0) =~ s#.*/##;
my $_usage = <<USAGE;
Usage: $_progname [option...] expr [file...]
Apply a Perl expression to all records of a CSV file.
-n suppress printing
-h treat first line as field names and allow access through \%F
-b expr execute expr before processing any input
-e expr execute expr after processing all input
-F char set the character that separates fields (default: comma)
-Q char set the character for quoting fields that contain a separat
+or character
or end-of-line (default: double quote)
-E char set the character for escaping a quote character (default:
+double quote)
The input is processed line by line.
expr should be a Perl expression modifying the \@F array; indexes are
+0-based.
With -h, you can also modify the \%F hash with the same effect. Deleti
+ng an
array or hash entry, or setting it to undef, will cause it not to be o
+utput.
The value of \@F after executing expr is sent to the standard output.
With -h, field names are read from the first line just before executin
+g expr
on it, so if e.g. field 8 is named "comment", you can use "delete \$F{
+comment}"
and that would be equivalent to "delete \$F[8]". Duplicate field names
+ in the
input or a field name in expr that does not exist in the input will ca
+use a
fatal error. A field name for \%F refers to a fixed index in \@F, so i
+f you
rearrange the data in \@F, accesses through \%F may refer to incorrect
+ fields.
Works with binary files (for example, with embedded newlines).
You can use next (to suppress printing), last and redo in expr.
\$. works in expr, so you can e.g. use "if (\$. > 1) { ... }" to not a
+pply some
code to the first line.
Variable names starting with _ are reserved, don't use them in expr.
Uses the Text::CSV_XS module by Jochen Wiedmann and Alan Citterman, wi
+th thanks.
USAGE
while (@ARGV > 0 and $ARGV[0] =~ m{^-}) {
my $opt = shift;
$_silent = 1, next if $opt eq '-n';
$_use_headers = 1, next if $opt eq '-h';
$_begin = shift, next if $opt eq '-b';
$_end = shift, next if $opt eq '-e';
$_sep = shift, next if $opt eq '-F';
$_quote = shift, next if $opt eq '-Q';
$_escape = shift, next if $opt eq '-E';
die $_usage;
}
die $_usage unless @ARGV > 0;
my $_expr = shift;
my $_csv = Text::CSV_XS->new({ sep_char => $_sep, quote_char => $_quot
+e,
escape_char => $_escape, binary => 1,
eol => "\n" });
unshift(@ARGV, '-') unless @ARGV;
{
no strict;
eval $_begin;
die $@ if $@;
}
my %F;
my $_tied;
while ($ARGV = shift) {
open(_IN, $ARGV);
my $_fields;
while ($_fields = $_csv->getline(\*main::_IN) and @$_fields) {
my(@F) = @$_fields;
if ($_use_headers) {
if ($. == 1) {
$_tied = tie %F, 'Tie::FieldNames', @F;
}
$_tied->setcurrent(\@F);
}
# This means you can use next, last, and redo in expr
# without excessive noise. Also turn off strictness.
no warnings "exiting";
no strict;
eval $_expr;
die $@ if $@;
$_csv->print(STDOUT, [ grep { defined } @F ]) unless $_silent;
}
}
if ($_use_headers) {
undef $_tied;
untie %F;
}
{
no strict;
eval $_end;
die $@ if $@;
}
#!/usr/bin/perl
use warnings;
use strict;
# Version 1.0
my $me = $0 =~ s|.*/||r;
my $usage = <<USAGE;
usage: $me -h -g N -w N -f WORD_FILE
-h This help message.
-g Number of guesses
Defaults to 6
-w Word length
Defaults to 5
-f Word file to use
Defaults to /usr/share/dict/words
USAGE
use Getopt::Std;
getopts( 'hg:w:f:', \my %opts ) or die $usage;
die $usage if exists $opts{ h };
use Term::ReadKey;
ReadMode 4;
END { ReadMode 0; }
use Term::ANSIColor ':constants';
my $clear = `clear`;
my $reset = RESET;
my $white_on_red = BRIGHT_WHITE . ON_RED;
my $white_on_green = BRIGHT_WHITE . ON_GREEN;
my $white_on_yellow = BRIGHT_WHITE . ON_YELLOW;
my $white_on_gray = BRIGHT_WHITE . ON_BRIGHT_BLACK;
my $pre = qr/\e\[\d+m\e\[\d+m/; # colour
my $post = qr/\e\[0m/; # reset
# guesses allowed = number of lines displayed
my $guesses = $opts{ g } || 6;
# length of words to use
my $word_size = $opts{ w } || 5;
# file name to use
my $file_name = $opts{ f } || '/usr/share/dict/words';
my $divider = ' ---' x $word_size . "\n";
my $kb = <<KB;
Q W E R T Y U I O P
A S D F G H J K L
Z X C V B N M
BS RET ESC to exit
KB
my @lines = map [ ( ' ' ) x $word_size ], 1 .. $guesses;
my $curr_line = 0;
my %dict;
{
open my $FH, '<', $file_name or die "Cannot open '$file_name' beca
+use: $!";
# exclude proper nouns and punctuation
# words must have at least one vowel
@dict{ map uc, grep /[aeiouy]/, map /^([a-z]{$word_size})$/, <$FH>
+ } = ();
}
my $curr_word = ( keys %dict )[ rand keys %dict ];
my @letters;
{ local $| = 1;
print
$clear,
"\n\n",
map( { my $line = $_; $divider, ' ', map( " |$_|", @{ $lines[
+$line ] } ), "\n", $divider } 0 .. $#lines ),
"\n\n",
$kb,
"\n";
if ( $curr_line == @lines ) {
print "\L$curr_word\n";
last;
}
# Only accept keys we want to use
# alphabetic, back space, return or escape
my $key;
1 until defined( $key = ReadLine( -1 ) ) && $key =~ /\A[a-zA-Z\177
+\n\e]\z/;
last if $key eq "\e";
if ( $key eq "\177" ) {
pop @letters if @letters;
for my $index ( 0 .. $#{ $lines[ $curr_line ] } ) {
$lines[ $curr_line ][ $index ] = defined $letters[ $index
+] ? " $letters[$index] " : ' ';
}
redo;
}
if ( $key =~ /\A[a-zA-Z]\z/ ) {
push @letters, uc $key if @letters < $word_size;
for my $index ( 0 .. $#{ $lines[ $curr_line ] } ) {
$lines[ $curr_line ][ $index ] = defined $letters[ $index
+] ? " $letters[$index] " : ' ';
}
redo;
}
if ( $key eq "\n" && @letters == $word_size ) {
my $word = join '', @letters;
# Not a valid five letter word
unless ( exists $dict{ $word } ) {
$lines[ $curr_line ] = [ map "$white_on_red $_ $reset", @l
+etters ];
redo;
}
# The correct answer
if ( $word eq $curr_word ) {
$lines[ $curr_line ] = [ map "$white_on_green $_ $reset",
+@letters ];
for my $letter ( @letters ) {
$kb =~ s/$pre? $letter $post?/$white_on_green $letter
+$reset/;
}
$curr_line = @lines;
redo;
}
# Default; all letters to white on gray
$lines[ $curr_line ] = [ map "$white_on_gray $_ $reset", @lett
+ers ];
for my $letter ( @letters ) {
$kb =~ s/$pre? $letter $post?/$white_on_gray $letter $rese
+t/;
}
# Find exact matches
my @found = ( 0 ) x $word_size;
my $xor_word = $word ^ $curr_word;
while ( $xor_word =~ /\0/g ) {
$found[ $-[ 0 ] ] = 1;
my $letter = $letters[ $-[ 0 ] ];
$lines[ $curr_line ][ $-[ 0 ] ] = "$white_on_green $letter
+ $reset";
$kb =~ s/$pre? $letter $post?/$white_on_green $letter $res
+et/;
}
my $curr_remains = join '', ( split //, $curr_word )[ grep !$f
+ound[ $_ ], 0 .. $#found ];
# Find other correct letters
while ( my ( $index, $letter ) = each @letters ) {
next if $found[ $index ];
if ( $curr_remains =~ s/$letter/ / ) {
$lines[ $curr_line ][ $index ] = "$white_on_yellow $le
+tter $reset";
$kb =~ s/$pre? $letter $post?/$white_on_yellow $letter
+ $reset/;
}
}
++$curr_line;
@letters = ();
}
redo;
}
I wrote a Perl script that reads a BMP file and converts it to pure HTML and then sends it to a file. Then I take the file and paste it here. Please don't punish me.
1.880 Mon Oct 10 04:00:00 EST 2022
* Improved reliability on the Windows platform.
* Improved MCE::Mutex::Channel::timedwait on the Windows platform.
* Improved MCE::Mutex::Channel performance on UNIX platforms.
* Resolved edge case in MCE::Child reaching deadlock.
1.007 Mon Oct 10 04:00:00 EST 2022
* Improved reliability on the Windows platform.
* Improved Mutex::Channel::timedwait on the Windows platform.
* Improved Mutex::Channel performance on UNIX platforms.
Eagle-eyed watchers of CPAN may have noticed that I've recently been releasing Type::Tiny development releases with version numbers 1.999_XYZ.
Type::Tiny v2 is intended to be compatible with Type::Tiny v1.
If you've used Type::Tiny v1,
you shouldn't need to change any code,
but Type::Tiny v2 has a few new features which may make your code simpler,
more maintainable,
and more readable if you adopt them.
Type::Params v2 API
Type::Params can be used to provide typed subroutine signatures:
use feature qw( state );
use Type::Params qw( compile );
use Types::Standard qw( Num );
sub add_numbers {
state $signature = compile( Num, Num );
my ( $x, $y ) = $signature->( @_ );
return $x + $y;
}
However, things like named paramaters, catering for $self in methods, etc felt like afterthoughts. Here is how you'd write the same signature in version 1 as a method call using named parameters:
use feature qw( state );
use Type::Params qw( compile_named_oo );
use Types::Standard qw( Num );
sub add_numbers {
state $signature = compile_named_oo(
{ head => [ Any ] },
'x' => Num,
'y' => Num,
);
my ( $self, $arg ) = $signature->( @_ );
return $arg->x + $arg->y;
}
While the old API is still supported, Type::Params v2 has two new functions, signature and signature_for, which I feel provide a more powerful and more consistent interface.
signature works much the same as compile, but takes a top-level hash of options, allowing it to cater for both positional and named parameters.
Here is an example for positional parameters:
use feature qw( state );
use Type::Params qw( signature );
use Types::Standard qw( Num );
sub add_numbers {
state $signature = signature(
method => 0,
positional => [ Num, Num ],
);
my ( $x, $y ) = $signature->( @_ );
return $x + $y;
}
use feature qw( state );
use Type::Params qw( signature );
use Types::Standard qw( Num );
sub add_numbers {
state $signature = signature(
method => 1,
named => [ 'x' => Num, 'y' => Num ],
);
my ( $self, $arg ) = $signature->( @_ );
return $arg->x + $arg->y;
}
Though like always, you can list imports explicitly:
use Types::Common qw( signature_for Num NonEmptyStr );
</pre>
<p>If you have a bleeding-edge Perl installed, you can import function
+s lexically:</p>
<c>
use Types::Common -lexical, -all;
Under normal circumstances, Num/Any evaluates to just Any. Num is basically just documentation, so you're documenting that lucky_numbers is intended to be an arrayref of numbers, but as a speed boost, the attribute will just check that it's an arrayref of anything.
When the EXTENDED_TESTING environment variable is switched on though, Num/Any will evaluate to Num, so stricter type checks will kick in.
Type defaults
Instead of this:
has output_list => (
is => 'ro',
isa => ArrayRef,
default => sub { [] },
);
This allows DelimitedStr[ "|", Int ] to accept strings like "12|34|-99|0|1".
Internals
There have been numerous internal refactorings in Type::Tiny v2, so if you're using Type::Tiny and its related modules in more unorthodox ways, it may be worth explicitly testing your code still runs on the new version.
However, I have taken care to avoid breaking any documented APIs. The vast majority of the Type:Tiny v1 test suite still passes with Type::Tiny v2, with test cases that inspect the exact text of error messages being the only real change.
I've been playing around with Term::ANSIColor recently.
I found the named colours to be very limited.
The rgbRGB format provides additional colours but the codes are not particularly intuitive.
Then I found rNNNgNNNbNNN;
at first, I thought I'd need a different terminal but it turns out that it works just fine on my xterm.
I'm quite familiar with the hex notation #rrggbb, but less so with the decimal equivalents;
so I wrote myself a helper program: rgb_palette.
I thought I'd share; but there are a few things you'd probably want to know up-front.
Obviously, you'll need a true color (aka direct-color) terminal.
Change the shebang line if it doesn't fit your setup.
The code, as is, has "use v5.36;".
You can downgrade this but, if you do, deal with the subroutine signatures
(either turn off experimental warnings or rewrite the two short subroutines,
e.g. "sub fg ($r, $g, $b) { ..." --> "sub fg { my ($r, $g, $b) = @_; ...").
Also, add in whatever pragmata you're no longer getting for free.
I use a black background. You may need to fiddle with some of the text colours if you use something else.
I initially had the hex values on each of the coloured swatches in either black or white.
I found this distracting; change the commented code in fg() if you want to put it back that way.
As it stands, the foreground and background colours are the same making the text invisible but the swatch colour more prominent.
I just double-click on a swatch; middle-click to paste; then "Enter" to get the rNNNgNNNbNNN conversion.
I've aimed to get a lot of colours without needing a giant screen.
You'll need 100 columns and scrolling will almost certainly be necessary.
You can also type in your own hex codes if you want:
the output shows a swatch of the input value as well as the rNNNgNNNbNNN code.
Alright, that's enough blathering, here's the code:
#!/usr/bin/env perl
use v5.36;
use IO::Prompter [
-style => 'bold blue',
-echostyle => 'bold magenta',
];
use Term::ANSIColor 5.00;
my @nums_under_255 = qw{0 26 51 77 102 127 153 179 204 230 243};
say '';
for my $r (@nums_under_255, 255) {
for my $g (@nums_under_255, 255) {
print ' ';
for my $b (@nums_under_255) {
print colored(text("r${r}g${g}b${b}"), join(' on_', fg($r,
+ $g, $b), "r${r}g${g}b${b}"));
}
say colored(text("r${r}g${g}b255"), join(' on_', fg($r, $g, 25
+5), "r${r}g${g}b255"));
}
}
say '';
my $rgb;
while (1) {
$rgb = prompt 'Convert hex to decimal rgb (or just hit "Enter" to
+quit): ', -return => '';
# Fix for MSWin -- see https://rt.cpan.org/Public/Bug/Display.html
+?id=118255
$rgb =~ s/\R\z//;
unless (length $rgb) {
say '';
last;
}
if ($rgb =~ /^([0-9a-fA-F]{2})([0-9a-fA-F]{2})([0-9a-fA-F]{2})$/)
+{
my $ansi_rgb = sprintf 'r%dg%db%d', map hex, $1, $2, $3;
print colored(' --> [', 'bold white');
print colored(' ' x 8, "$ansi_rgb on_$ansi_rgb");
print colored('] --> ', 'bold white');
say colored($ansi_rgb, 'bold cyan');
}
else {
say colored("\nERROR: '", 'r255g0b0 on_r51g51b51'),
colored($rgb, 'r255g255b0 on_r51g51b51'),
colored(
"' is invalid. Six hexadecimal characters are expected
+; such as in the table above.",
'r255g0b0 on_r51g51b51'
);
}
}
sub fg ($r, $g, $b) {
#return $r + 2 * $g + $b > 204 ? 'black' : 'white';
return "r${r}g${g}b${b}";
}
sub text ($str) {
return sprintf ' %02x%02x%02x ', $str =~ /^r(\d+)g(\d+)b(\d+)$/;
}
Updates: Some people encountered problems, so I've made changes.
Term::ANSIColor introduced support for 24-bit colour in v5.00
(see Changes). I wasn't aware of this.
I've changed "use Term::ANSIColor;" to "use Term::ANSIColor 5.00;".
Thanks ++pryrt for reporting this.
For MSWin users, there's a bug in IO::Prompter.
Again, thanks ++pryrt for reporting this.
As a workaround, I've added:
# Fix for MSWin -- see https://rt.cpan.org/Public/Bug/Display.html
+?id=118255
$rgb =~ s/\R\z//;
Also for MSWin users, there's been some discussion, in a number of responses,
about whether the module Win32::Console::ANSI, or the registry setting VirtualTerminalLevel,
is required for this code to work.
Not being in a position to test this, I can't comment further.
Snippets of code should be wrapped in
<code> tags not<pre> tags. In fact, <pre>
tags should generally be avoided. If they must
be used, extreme care should be
taken to ensure that their contents do not
have long lines (<70 chars), in order to prevent
horizontal scrolling (and possible janitor
intervention).