Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

Re: How can I use printf FORMAT strings for (Win32) shell globs?

by haukex (Archbishop)
on Jul 17, 2017 at 10:16 UTC ( [id://1195241]=note: print w/replies, xml ) Need Help??


in reply to How can I use printf FORMAT strings for (Win32) shell globs?

That's an interesting question. But I am wondering about a few things:

  • Perl's sprintf format strings are pretty complex - are you aiming to support all of it (I assume not), or just a subset, and if so, what subset?
  • Why are you getting the patterns as printf format strings in the first place, instead of globs or regular expressions? Maybe you could approach the problem from a different angle and have the user input one of those?
  • A printf format string of "Img%04d.png" can also produce an output of "Img123456.png", since %04d is just a minimum width specifier, but in your code you take it to mean exactly four digits. Plus, you exclude negative numbers, which are also possible with %04d. Why the difference?
  • Why do you convert to glob patterns first, why not just stick to regular expressions all the way through? They're more powerful and can also be used for listing files in combination with readdir, Path::Class, Path::Tiny, or File::Find::Rule. (Update: And they could even be used like so: my @files = grep {/$regex/} glob(".* *");, although I'd strongly recommend one of the aforementioned modules.)

The following is just something I played with, note it is very minimal and unfinished, e.g. it currently supports only a few specifiers and doesn't do anything with the flags or width fields. But basically, it'll turn a format string like "Img%04d.png" into a regex roughly like /^Img([-+]?[0123456789]+)\.png$/ (I used Regexp::Common::number to implement the number matching).

#!/usr/bin/env perl use warnings; use strict; use Regexp::Common qw/number/; sub fmtstr2regex { my ($fmtstr) = @_; my $parse_fmtstr = qr{ (?<esc>%%) | (?<plain>[^%]+) | % (?<flags>[-+ 0])? (?<width>\d+)? (?:\.(?<prec>\d+))? (?<type>[dfs]) }msx; my $outregex="\n"; pos($fmtstr)=undef; while ($fmtstr=~/\G$parse_fmtstr/gc) { $outregex .= "\t"; if (exists $+{esc}) { $outregex .= "\\%" } elsif (exists $+{plain}) { $outregex .= quotemeta $+{plain} } else { if ($+{type} eq 'd') { $outregex .= '('.$RE{num}{int}.')' } elsif ($+{type} eq 'f') { if (defined $+{prec}) { $outregex .= '(' .$RE{num}{real}{-places=>$+{prec}}.')' } else { $outregex .= '('.$RE{num}{real}.')' } } elsif ($+{type} eq 's') { $outregex .= '((?s).*?)' } } $outregex .= "\n"; } die "failed to parse '$fmtstr'" unless pos($fmtstr)==length($fmtstr); return qr/\A$outregex\z/x; } use Test::More; { my $fmtstr = 'Img%04d.png'; my $regex = fmtstr2regex($fmtstr); note '$regex = ', explain $regex; my @grepped = grep {/$regex/} qw/ Img0001.png Hello.txt Img6789.png ImgABCD.png Img2000.png Img1234.jpg Img0123.png /; is_deeply \@grepped, [qw/ Img0001.png Img6789.png Img2000.png Img0123.png /], 'grepped list of files'; } { my $fmtstr = 'fo.o%sbar%03d%%% 5squz *%4.2f9baz'; my $regex = fmtstr2regex($fmtstr); note '$regex = ', explain $regex; { my $string = sprintf($fmtstr,"TEST",45,"Foo",1.234); is $string, 'fo.oTESTbar045% Fooquz *1.239baz', 'sprintf'; ok my @m = $string=~$regex, 'regex matches'; is_deeply \@m, [ "TEST", "045", " Foo", "1.23" ], 'capture groups'; } { my $string = sprintf($fmtstr," 34bar",1234,"**",567.890); is $string, 'fo.o 34barbar1234% **quz *567.899baz', 'sprint +f'; ok my @m = $string=~$regex, 'regex matches'; is_deeply \@m, [ " 34bar", "1234", " **", "567.89" ], 'capture groups'; } } done_testing;

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1195241]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others meditating upon the Monastery: (4)
As of 2024-04-19 21:05 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found