This is a quick hack on PPI::Cache and PPI itself. The program looks for a file named last_hex_id_file.sto, which contains the last hex_id generated for the Perl document. If the file is not found (as is normal the first time you run the program), it asks for the path of a Perl document to parse, then generates the cache and stores the hex_id in that file.
When a new file or some cached content is loaded, it asks for a PPI class to iterate over: each element is printed out and you are asked whether you want to modify it.
At the end of the cycle the new document is put into the cache and you are asked for an optional output file. The next time you run the program the newer version is loaded automatically from the cache; because of this, run it in a fresh folder for each Perl document you want to modify.
use strict;
use warnings;
use PPI;
use PPI::Cache;
use Term::ReadLine;
use Storable qw(nstore retrieve);
my $term = Term::ReadLine->new('PPI cache hack');
my $last_hex_id_file = 'last_hex_id_file.sto';
my $perl_doc;
my $cache;
# last_hex_id_file.sto not found: ask for a new perl document to parse
unless (-e $last_hex_id_file){
print "cache file $last_hex_id_file not found.\n".
"Insert the full path of a new file to edit and press ente
+r (or CTRL-C to terminate)\n";
my $path = $term->readline('FILE PATH:');
die "Some problem with [$path]! " unless -e -r -f -s $path;
my $doc = PPI::Document->new($path) or die "Unable to load $path via PPI! ";
$cache = PPI::Cache->new( path => './',readonly => 0);
# store the original in the cache
$cache->store_document($doc) or die "Unable to store into the cache!";
# get a copy to work with from the cache
$perl_doc = $cache->get_document($doc->hex_id);
print "loading from cache ok\n" if ref $perl_doc eq 'PPI::Document
+';
#store_hex($doc->hex_id);
nstore (\$doc->hex_id, $last_hex_id_file);
}
# last_hex_id_file.sto is here: load from it
my $last_hex = retrieve($last_hex_id_file);
print "'last_hex_id_file.sto' succesfully read: using $$last_hex\n";
$cache = PPI::Cache->new( path => './', readonly => 0 ) unless ref $cache eq 'PPI::Cache';
$perl_doc = $cache->get_document( $$last_hex );
print "Which PPI class do you want to edit?\n";
my $class = $term->readline('PPI CLASS:');
print "\n\nEach element of the type $class will be proposed for edit (
+the content).\n".
"insert your new input terminating it with CTRL-Z on a empty l
+ine.\n".
"use a bare ENTER to skip the current element\n\n";
foreach my $it ( @{$perl_doc->find($class)} ) {
print "STATEMENT: ",$it->statement,"\n",
"CONTENT: ",$it->content,"\n\n";
my @in;
while ( my $line = $term->readline('EDIT:') ){
push @in,$line;
}
if (@in){
$it->set_content(join "\n",@in);
}
}
# store the edited document back in the cache
$cache->store_document($perl_doc);
print "storing cache hex_id: ",$perl_doc->hex_id," in $last_hex_id_fil
+e\n";
nstore (\$perl_doc->hex_id, $last_hex_id_file);
# ask for an eventual output file
print "Enter a filename if you want to save the current version (or EN
+TER to skip)\n";
my $out = $term->readline('OUTPUT FILE:');
$perl_doc->save( $out ) if $out;
use v6;
# create 3 hashes, showing mixed inits that can be used in perl6 :
my %hsh0 = ( "Name" => "George H. W. Bush", "Function" => <president of USA> , 'Time' => <2001-2009> );
my %hsh1 = ( "Name" => "Bill Clinton", "Function" => <president of USA> , 'Time' => <1993-2001> );
my %hsh2 = ( "Name" , "Barack Obama", "Function" , <President of USA> , 'Time' , <2009-2017> );
# Subscript Adverbs : :exists :k :v :p :delete
#if ( %hsh1<Name>:exists) { # Use this when keys have No Spaces !!
my $key = 'Name' ;
if ( %hsh1{$key}:k) { # Can I use %hsh1{Name}:exists : Yes works!
say "Found Name element in %hsh1";
}
# there is another use for the <> construct, but {} also works!!
say %hsh1{}:v.perl; #Show all values
say %hsh1.pairs.perl;
# push the several hashes onto the array : @arr
my @arr ;
# Please notice the : after the push
@arr.push: { %hsh0 } ;
@arr.push: { %hsh1 } ;
@arr.push: { %hsh2 } ;
# show some entries, using several formats that perl6 has:
say '@arr.[0].{"Name"} = ' ~ @arr.[0].{'Name'};
say '@arr.[1].<Name> = ' ~ @arr.[1].<Name> ;
say @arr.end.fmt('%+4d') ; # Index of the last hash in the array (one less than the count).
# dump @arr
dd @arr;
# try looping over @arr and detect the keys stored in each hash. It's simpler than I thought (after some experimenting).
for 0 .. @arr.end -> $idx {
say "@arr idx : $idx";
my %x = @arr[$idx];
for %x.kv -> $key, $value {
printf "%10.10s : %-20.20s\n" , $key, $value; # please notice
+printf NOT using () ;
}
}
D:\perl6.scripts\myown>perl6 array_hash_3.p6
Found Name element in %hsh1
($("president", "of", "USA"), "1993-2001", "Bill Clinton")
(:Function($("president", "of", "USA")), :Time("1993-2001"), :Name("Bill Clinton")).Seq
@arr.[0].{"Name"} = George H. W. Bush
@arr.[1].<Name> = Bill Clinton
+2
Array @arr = [{:Function($("president", "of", "USA")), :Name("George H. W. Bush"), :Time("2001-2009")}, {:Function($("president", "of", "USA")), :Name("Bill Clinton"), :Time("1993-2001")}, {:Function($("President", "of", "USA")), :Name("Barack Obama"), :Time("2009-2017")}]
@arr idx : 0
Function : president of USA
Time : 2001-2009
Name : George H. W. Bush
@arr idx : 1
Function : president of USA
Time : 1993-2001
Name : Bill Clinton
@arr idx : 2
Function : President of USA
Time : 2009-2017
Name : Barack Obama
It heavily uses AI::Prolog. An example of such a puzzle is below; a small sketch of driving AI::Prolog from Perl follows it.
John, Patrick and James live here.
Each has a different favourite fruit (pineapples, apples, pears).
Each has a different profession (fisherman, blacksmith, witcher).
- My name is John. The one who likes apples is not a blacksmith. Patrick is not a witcher. James does not like pineapples. James is not a fisherman.
- My name is James. John does not like pears. Patrick does not like apples. I don't like apples. The one who likes apples is not a fisherman.
John likes apples. John is a witcher.
Patrick likes pineapples. Patrick is a fisherman.
James likes pears. James is a blacksmith.
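For a flavour of how such a puzzle is driven from Perl, here is a minimal sketch using AI::Prolog's synopsis-style API. The clauses are illustrative only (they simply state the solution as facts); the actual knowledge base encodes the clues above as constraints and lets Prolog deduce the answer.
use strict;
use warnings;
use AI::Prolog;
# Illustrative facts only; a real solver would encode the negative clues
# ("the one who likes apples is not a blacksmith", etc.) instead.
my $prolog = AI::Prolog->new(<<'END_PROLOG');
likes(john, apples).
likes(patrick, pineapples).
likes(james, pears).
profession(john, witcher).
profession(patrick, fisherman).
profession(james, blacksmith).
person(Name, Fruit, Job) :- likes(Name, Fruit), profession(Name, Job).
END_PROLOG
$prolog->query('person(Name, Fruit, Job).');
while ( my $result = $prolog->results ) {
    print "@$result\n";    # one solution per iteration
}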
UPDATE: There are several things that were in an earlier version of this script that didn't make the second cut, but only because I got lazy. My original got deleted somehow, and I had foolishly not kept a copy, so I wrote the above quickly.
The directory structure is documents/4digityear/abbreviatedcardinalmonth/2digitday/hour/minute. At first I restricted deletions to directories older than some number of days, but rmdir updates the directory's time information, so a directory that was now empty only because all of its empty constituent directories had been removed looked brand new. That made it useless to run consecutively. I came up with a calculation that derived the age from the directory path itself (roughly as sketched below), and that worked; I just didn't bother with it when I rewrote the script this time. Some of the alternate solutions don't have that limitation.
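For what it's worth, here is a rough sketch of that path-based age calculation, not the lost original; it assumes the month component is an abbreviated month name (e.g. 'Dec') and that the path ends in year/month/day/hour/minute.
use strict;
use warnings;
use Time::Local qw(timelocal);
# Map abbreviated month names to 0-based month numbers for timelocal.
my %mon;
@mon{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = 0 .. 11;
# Derive a directory's age (in days) from its path components rather than
# its mtime, so rmdir touching the parent does not matter.
sub age_in_days {
    my ($dir) = @_;
    my ( $year, $mon_abbr, $day, $hour, $min ) = ( split m{/}, $dir )[ -5 .. -1 ];
    my $epoch = timelocal( 0, $min, $hour, $day, $mon{$mon_abbr}, $year );
    return ( time - $epoch ) / 86400;
}
printf "%.1f days old\n", age_in_days('documents/2017/Dec/05/13/45');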
But God demonstrates His own love toward us, in that while we were yet sinners, Christ died for us. Romans 5:8 (NASB)
I received a request from John Martel to process a large flat file and expand each record to many records based on splitting out items in field 4 delimited by semicolons. Each row in the output is given a unique ID starting with one while preserving output order.
Thank you, John. This is a great use-case for MCE::Relay (2nd example).
Input File -- Possibly larger than 500 GiB in size
This example configures a custom gather function for preserving output order. Unfortunately, calling sprintf in the manager process adds enough extra CPU time that the manager can fall behind; the workers may then idle while waiting for the manager process to respond to the gather request.
use strict;
use warnings;
use MCE::Loop;
my $infile = shift or die "Usage: $0 infile\n";
my $newfile = 'output.dat';
open my $fh_out, '>', $newfile or die "open error $newfile: $!\n";
sub preserve_order {
my ($fh) = @_;
my ($order_id, $start_idx, $idx, %tmp) = (1, 1);
return sub {
my ($chunk_id, $aref) = @_;
$tmp{ $chunk_id } = $aref;
while ( my $aref = delete $tmp{ $order_id } ) {
foreach my $line ( @{ $aref } ) {
$idx = sprintf "%015d", $start_idx++;
print $fh $idx, $line;
}
$order_id++;
}
}
}
MCE::Loop::init {
chunk_size => 'auto', max_workers => 3,
gather => preserve_order($fh_out)
};
mce_loop_f {
my ($mce, $chunk_ref, $chunk_id) = @_;
my @buf;
foreach my $line (@{ $chunk_ref }) {
$line =~ s/\r//g; chomp $line;
my ($f1,$f2,$f3,$items,$f5,$f6,$f7) = split /\|/, $line;
my @items_array = split /;/, $items;
foreach my $item (@items_array) {
push @buf, "|$item|$f1|$f2|$f3|$f5|$f6|$f7\n";
}
}
MCE->gather($chunk_id, \@buf);
} $infile;
MCE::Loop::finish();
close $fh_out;
Another way to factor sprintf out of the manager process is to use MCE::Relay to increment the ID value. Workers obtain the current ID value and increment/relay it for the next worker, ordered by chunk ID behind the scenes. The workers then call sprintf in parallel. This allows the manager process (out_iter_fh) to accommodate up to 32 workers without falling behind. It also depends on IO performance, of course.
The MCE::Relay module is loaded automatically whenever the MCE init_relay option is specified.
use strict;
use warnings;
use MCE::Loop;
use MCE::Candy;
my $infile = shift or die "Usage: $0 infile\n";
my $newfile = 'output.dat';
open my $fh_out, '>', $newfile or die "open error $newfile: $!\n";
MCE::Loop::init {
chunk_size => 'auto', max_workers => 8,
gather => MCE::Candy::out_iter_fh($fh_out),
init_relay => 1
};
mce_loop_f {
my ($mce, $chunk_ref, $chunk_id) = @_;
my @lines;
foreach my $line (@{ $chunk_ref }) {
$line =~ s/\r//g; chomp $line;
my ($f1,$f2,$f3,$items,$f5,$f6,$f7) = split /\|/, $line;
my @items_array = split /;/, $items;
foreach my $item (@items_array) {
push @lines, "$item|$f1|$f2|$f3|$f5|$f6|$f7\n";
}
}
my $idx = MCE::relay { $_ += scalar @lines };
my $buf = '';
foreach my $line ( @lines ) {
$buf .= sprintf "%015d|%s", $idx++, $line
}
MCE->gather($chunk_id, $buf);
} $infile;
MCE::Loop::finish();
close $fh_out;
Relay accounts for the worker handling the next chunk_id value. Therefore, do not call relay more than once inside the block. Doing so will cause IPC to stall.
It's been in the works at the lower end of my priority list, but after a bit of a bug-closing weekend, I thought I'd tackle getting out an initial release of Github::Backup.
The cloud is a great thing, until the sun evaporates it one way or another. Github, although fantastically reliable, is prone to issues just like any other site on the Internet. I'd go as far as to say that even they could be prone to data loss in very rare circumstances.
This distribution, which provides a command-line binary, allows you to quickly and easily back up your repositories and issues to your local machine. The repositories are cloned so all data is retrieved as-is as legitimate Git repos, and the issues are fetched and stored as JSON data. Useful if there was ever a catastrophic issue at Github, or simply for offline perusal of your information.
At a basic level, you need to send in your Github username, API token (see this), a directory to stash the data retrieved, and a flag to signify you want to back up either your repos, issues or both.
Now, most don't like supplying keys/tokens/passwords on the command-line or within a script, so you can stash your Github API token into the GITHUB_TOKEN environment variable, and we'll fetch it from there instead:
Usage: github_backup -u username -t github_api_token -d /backup/directory -r -i
Options:
-u | --user Your Github username
-t | --token Your Github API token
-d | --dir The backup directory
-p | --proxy Optional proxy (https://proxy.example.com:PORT)
-r | --repos Back up all of your repositories
-i | --issues Back up all of your issues
-h | --help Display this help page
use warnings;
use strict;
use Github::Backup;
# token stashed in GITHUB_TOKEN env var
my $gh = Github::Backup->new(
api_user => 'stevieb9',
dir => '/home/steve/github_backup'
);
# back up all repos
$gh->repos;
# back up all issues
$gh->issues;
This is one distribution that I've released prior to being happy with my unit test regimen, so that's on the imminent to-do list. There are tests, but as always, there can never be enough. In this case I myself am not yet happy with them, so if you run into any issues, please open a ticket, or reply back here.
Going forward, I plan on adding functionality to independently back up *all* Github data for a user, not just repos and issues. I also plan to test restore operations, but that's not anything I'm considering short-term.
Hi,
I created shell completion scripts for dzil. The completion that ships with Dist::Zilla only completes subcommands, and is only for bash, as far as I can see.
My scripts also complete options, and show the description of subcommands and options.
See https://github.com/perlpunk/shell-completions.
(If dzil commands change, I have to update this, too, of course.)
I created this with https://metacpan.org/pod/App::AppSpec
Usage:
# bash
$ git clone https://github.com/perlpunk/shell-completions.git
$ cd shell-completions
$ source bash/dzil.bash
$ dzil <TAB>
add -- add modules to an existing dist
authordeps -- list your distributions author dependencies
build -- build your dist
clean -- clean up after build, test, or install
commands -- list the applications commands
help -- Show command help
install -- install your dist
listdeps -- print your distributions prerequisites
new -- mint a new dist
nop -- do nothing: initialize dzil, then exit
release -- release your dist
run -- run stuff in a dir where your dist is built
setup -- set up a basic global config file
smoke -- smoke your dist
test -- test your dist
$ dzil test --<TAB>
--all -- enables the RELEASE_TESTING, AUTOMATED_TESTING, EXTENDED_TESTING and AUTHOR_TESTING env variables
--author -- enables the AUTHOR_TESTING env variable
--automated -- enables the AUTOMATED_TESTING env variable (default behavior)
--extended -- enables the EXTENDED_TESTING env variable
--help -- Show command help
--jobs -- number of parallel test jobs to run
--keep-build-dir -- keep the build directory even after a success
--keep -- keep the build directory even after a success
--lib-inc -- additional @INC dirs
--release -- enables the RELEASE_TESTING env variable
--test-verbose -- enables verbose testing (TEST_VERBOSE env variable on Makefile.PL, --verbose on Build.PL)
--verbose -- log additional output
--verbose-plugin -- log additional output from some plugins only
# zsh
# put zsh/_dzil into a directory which is read by
# zsh completions, or add this directory to your .zshrc:
# fpath=("$HOME/path/to/shell-completions/zsh" $fpath)
# log in again
Now, if you're like most people, most of your day is spent wandering around aimlessly and asking yourself, "Man, I wish I had a Perl script that would take like a huuuuge X12-formatted file, and split it up into input files, each one no greater than 1500KB, or 2500 claims, whichever comes first. Wow, if I had that... Man, I'd even be willing to edit the hardcoded output path in that script to suit where I wanted those chunks to go!"
Well, look no further:
#!/usr/bin/perl
##
## X12Splitter written 043013 by Bowie J. Poag
##
## X12Splitter takes an X12-formatted .dat file, and splits it
## up into inputFiles no greater than 1500KB or 2500 claims,
## whichever comes first.
##
## Usage:
##
## x12splitter <filename>
##
## Example:
##
## x12splitter foo.dat
##
$|=1;
$numRecords=0;
$numBytes=0;
$fileName=$ARGV[0];
errorCheckAndPrep();
dumpChunks();
sub errorCheckAndPrep
{
print "\n\nX12Splitter: Checking $fileName for any structural probl
+ems..";
@inputFile=`cat $fileName`;
@temp=`ls -l $fileName`;
@fileDetails=split(" ",$temp[0]);
$fileSize=$fileDetails[4]+0;
$numElements=scalar(@inputFile);
$numTotalBytes=length($inputFile[0]);
if ($numElements > 1)
{
print "X12Splitter: Input file is malformed. Exiting..\n";
exit();
}
else
{
print "..";
}
if ($fileSize!=$numTotalBytes)
{
print "X12Splitter: Payload size and stated file size mismatch.
+Exiting.\n";
exit();
}
else
{
print "..";
}
if ($inputFile[0]=~/^ISA/)
{
print "Done.\n";
}
print "X12Splitter: Check complete. Parsing file..\n";
@payload=split("~ST",$inputFile[0]);
$envelopeOpen=$payload[0];
$envelopeClose=$payload[-1];
$envelopeClose=~/~GE/;
$envelopeClose="~GE$'";
$payload[-1]=$`;
if ($envelopeOpen=~/^ISA/ && $envelopeClose=~/~GE/)
{
print "X12Splitter: Evenvelope open and close chunks found succe
+ssfully.\n";
}
else
{
print "X12Splitter: Unexpected problem with envelope open. Openi
+ng ISA header or ~GE close not found.\n";
exit();
}
shift (@payload); ## Don't bother processing the envelope..
foreach $item (@payload)
{
$recordCount++;
$openRecordText=substr($item,0,15);
$closeRecordText=substr($item,length($item)-40,40);
printf ("\rX12Splitter: Record %6d: [%15s.....%-40s] \r", $recor
+dCount, $openRecordText, $closeRecordText);
}
print "\nX12Splitter: $recordCount total records found. Splitting..
+\n";
}
sub dumpChunks
{
$chunkPayload="";
$chunkNum=0;
$numBytesInThisChunk=0;
$numRecordsInThisChunk=0;
foreach $item (@payload)
{
$numBytesInThisChunk=length($chunkPayload);
$numRecordsInThisChunk++;
$chunkPayload.="~ST$item";
if ($numRecordsInThisChunk>2000 || $numBytesInThisChunk>1000000)
{
$chunkPayload="$envelopeOpen"."$chunkPayload"."$envelopeClose
+";
open ($fh,'>',"/demo/fin/healthport/$fileName.part.$chunkNum"
+);
print $fh "$chunkPayload";
close ($fh);
print "X12Splitter: $numRecordsInThisChunk records saved to /
+demo/fin/healthport/$fileName.part.$chunkNum\n";
$numBytesInThisChunk=0;
$numRecordsInThisChunk=0;
$chunkNum++;
$chunkPayload="";
}
}
## Clean up the last of it..
$chunkPayload="$envelopeOpen"."$chunkPayload"."$envelopeClose";
open ($fh,'>',"/demo/fin/healthport/$fileName.part.$chunkNum");
print $fh "$chunkPayload";
close ($fh);
print "X12Splitter: $numRecordsInThisChunk records saved to /demo/fin/healthport/$fileName.part.$chunkNum\n";
}
print "\n\n\n";
Perl SDBM databases of key/value pairs (tied to program hash tables) can hold multiple-format KEYS, which is convenient for persistent random-access indexing of flat-file database records. (Note: the VALUE in the KEY/VALUE pairs is used to store the byte offset of the flat-file record indexed.)
For example, if you have a flat-file database of millions of fixed-length records, random-access indexed by Social Security Number (a UNIQUE PRIMARY KEY), you may also wish to have an ALTERNATE KEY WITH DUPLICATES, in case the Social Security Number is not known for lookup. The code snippet below (incomplete, used just to illustrate a methodology) shows how this indexing may be set up. Once the indexing is set up, you can use a for loop iterated from 1 to $NUM_RECS to random-access retrieve all the flat-file database records matching any arbitrary compound KEY composed of information contained in the fields of the records (a sketch of that retrieval loop follows the snippet). Note: the file pointer is set to a record's byte offset before performing READ/WRITE operations.
This is ISAM (Indexed Sequential Access Method), NoSQL, embedded database technology. The indexing stays persistent, so lookup is immediately available every time you launch your database user-interface (or batch) application program. SDBM is in the public domain, so you can distribute your flat-file/SDBM database files and Perl application code free of charge to as many companies and end users as you like. For more discussion and Perl code examples at PerlMonks, see the "JOINT DATABASE TECHNOLOGY" thread.
TO BE CLEAR... You can use ANY programming language you like (e.g. Perl, Python, C++, etc.) with SDBM support, ANY batch or GUI DB user-interface you like (e.g. portable Perl Tk, Perl Win32-GUI by Aldo Calpini), and ANY file-handling syntax you like (Win32-API, portable sysopen/sysseek/sysread/syswrite/close, etc.).
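Before the indexing below can run, the SDBM index has to be tied and the flat file opened. Here is a minimal, hedged sketch of that setup; the file names, record length and the read_record_at helper are illustrative assumptions, not part of the original snippet.
use strict;
use warnings;
use Fcntl;        # O_RDWR, O_CREAT for sysopen and the SDBM tie
use SDBM_File;
my $REC_LEN = 256;   # assumed fixed-length record size, for illustration
# Tie the hash to an SDBM database on disk; the index persists across runs.
tie my %Hash, 'SDBM_File', 'alt_key_index', O_RDWR | O_CREAT, 0666
    or die "Cannot tie SDBM index: $!";
# Open the flat-file database for random access.
sysopen my $FF, 'flatfile.dat', O_RDWR | O_CREAT
    or die "Cannot open flat file: $!";
# Read the fixed-length record stored at a given byte offset.
sub read_record_at {
    my ($offset) = @_;
    sysseek $FF, $offset, 0 or die "seek: $!";
    sysread $FF, my $rec, $REC_LEN;
    return $rec;
}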
#-- YYYYMMDD
#-- Key example: BirthDate|LastNameFirst4Chars|FirstNameInitial|StateCode
#-- "19591219|Will|K|TX"
#-- $KEY without a Seq Nbr is used to track the number of records saved to the database
#-- having a particular ALT KEY w/DUPS - in this example: "19591219|Will|K|TX"
$KEY=$BirthDate . "|" . $LastNameFirst4Chars . "|" . $FirstNameInitial . "|" . $StateCode;
$Hash{$KEY}=0;
#-- Now index the first record encountered in the Flat File database with this particular ALT KEY w/DUPS
$num_recs = $Hash{$KEY};
$num_recs++;                  #-- i.e. one(1)
$Hash{$KEY}=$num_recs;
$newKEY=$KEY . "|" . $num_recs;
#-- produces: "19591219|Will|K|TX|1"
$Hash{$newKEY}=$byte_offset;  #-- The VALUE is set to the byte offset of the Flat File record just indexed
#-- Now index the second record encountered in the Flat File database with this particular ALT KEY w/DUPS
$num_recs = $Hash{$KEY};
$num_recs++;                  #-- i.e. two(2)
$Hash{$KEY}=$num_recs;
$newKEY=$KEY . "|" . $num_recs;
#-- produces: "19591219|Will|K|TX|2"
$Hash{$newKEY}=$byte_offset;  #-- The VALUE is set to the byte offset of the Flat File record just indexed
#-- and so on...
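And a small sketch of the retrieval loop described above, reusing the tied %Hash, the $FF handle and $REC_LEN from the setup sketch; the sample key is of course made up.
#-- Retrieve every record indexed under one compound ALT KEY w/DUPS.
my $KEY      = "19591219|Will|K|TX";
my $NUM_RECS = $Hash{$KEY} || 0;      #-- count stored under the bare key
for my $seq ( 1 .. $NUM_RECS ) {
    my $offset = $Hash{"$KEY|$seq"};  #-- e.g. "19591219|Will|K|TX|1"
    sysseek $FF, $offset, 0 or die "seek: $!";
    sysread $FF, my $record, $REC_LEN;
    #-- process $record here ...
}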
I've recently been working with large sets of timestamped measurement data from different devices, often recorded at different times on different days and spread across multiple files. Since I'm not always involved in the recording of the data, I need to look at when the devices were turned on and off, any gaps in the data, etc., in particular for which spans of time all devices were measuring at the same time, since that's the data that then needs to be analyzed. The timestamps are jittery, and data doesn't always come in order (or, equivalently, I'd like to not have to sort everything by timestamp). Set::IntSpan's union and intersect operations make this pretty easy!
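Here is a minimal sketch of the idea rather than the actual analysis code; the device names and timestamp ranges are invented for illustration. Each device's files contribute integer epoch seconds that are merged with union, and the per-device sets are then intersected to find the spans where every device was measuring.
use strict;
use warnings;
use Set::IntSpan;
# Invented sample data: epoch seconds during which each device's files
# show it was measuring (possibly out of order, possibly overlapping).
my %device_files = (
    dev_a => [ [ 100 .. 220 ], [ 300 .. 400 ] ],
    dev_b => [ [ 150 .. 360 ] ],
    dev_c => [ [ 180 .. 330 ], [ 90 .. 200 ] ],
);
my @per_device;
for my $dev ( sort keys %device_files ) {
    my $set = Set::IntSpan->new;    # start with the empty set
    $set = $set->union( Set::IntSpan->new($_) ) for @{ $device_files{$dev} };
    print "$dev measured: ", $set->run_list, "\n";
    push @per_device, $set;
}
# Intersect across devices: the spans during which all devices were recording.
my $common = shift @per_device;
$common = $common->intersect($_) for @per_device;
print "all devices overlap: ", $common->run_list, "\n";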
Earlier today a new monk (mz2255) attempted to post a question on SoPW about recursively searching for files in a directory tree. He was having issues with excluding . and .. and also with rel2abs and nested readdir calls and what have you. He was unable to get the SoPW to post and ended up posting on his scratch pad, so here is a reply for mz2255, and a demonstration of what I would call the modern way to do the job, using Path::Tiny.
Note that the regexp is minimally modified from the OP and likely needs improvement before it can be used reliably for the OP's desired outcome. Left here for demo purposes.
use strict; use warnings; use feature qw/ say /;
use Data::Dumper; $Data::Dumper::Sortkeys = 1;
use Path::Tiny;
my $root_dir = Path::Tiny->tempdir;
_populate_for_demo( $root_dir );
my $re = qr/ (?:\w|\d)+ _ \w+ _ .+ _R(1|2)_ .+ /x;
my %results;
$root_dir->visit(
sub { $_->is_file and push @{ $results{$1} }, "$_" if /$re/ },
{ recurse => 1 },
);
say Dumper \%results;
exit;
sub _populate_for_demo {
my $temp_dir = shift;
path("$temp_dir/$_/aa_bb_cc_R1_dd.tmp")->touchpath for 'foo','bar'
+;
path("$temp_dir/$_/aa_bb_cc_R2_dd.tmp")->touchpath for 'baz','qux'
+;
return $temp_dir;
}
__END__
This is the first implementation of HollyGame. It is a framework that sits underneath e.g. SDL 1.2 in my code, and is buildable with SDL 1.2 or Cairo 1.2 or 2.x. Once I have debugged it, I will try to host it on CPAN.