when $$s =~ m/\G.../gc is too verbose

Re: when $$s =~ m/\G.../gc is too verbose by chromatic (Archbishop) on Feb 02, 2006 at 22:36 UTC
Clever. I usually put a local in there though, just to avoid trouble.
Re^2: when $$s =~ m/\G.../gc is too verbose by stefp (Vicar) on Feb 03, 2006 at 00:09 UTC
Oops, I forgot it. I always localize or lexicalize the variables that belong to a subroutine. That's why I never noticed that `$_` is not implicitly localized on entry to a subroutine, contrary to what I thought. Sadly, localizing `*_` or `$_` doesn't play well with shuffling references to strings that carry match positions.

    sub lexer {
        (*_) = @_;
        print $1 if m/\G(A)/gc || m/\G(B)/gc;
    }
    my $a = "AB";
    lexer \$a;
    lexer \$a;

This prints "A" then "B". If I add a `local *_` or a `local $_` at the entry of the lexer routine, that does not work anymore. So much for a cool trick. -- stefp
Re: when $$s =~ m/\G.../gc is too verbose (for) by tye (Sage) on Feb 03, 2006 at 05:00 UTC
`for( $$s ) { .... }` - tye
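For readers following along, here is a minimal sketch of that `for` idiom applied to a lexer. The sub name `next_token`, the token names and the patterns are made up for illustration and do not come from the original post. The only point is that `for ($$s)` aliases `$_` to the input string, so `pos()` stays on the original string between calls.

```perl
use strict;
use warnings;

# Hypothetical lexer: returns (TYPE, TEXT) for the next token of $$s.
sub next_token {
    my ($s) = @_;                    # $s is a reference to the input string
    for ($$s) {                      # alias $_ to the input; body runs once
        m/\G\s+/gc;                  # skip whitespace; /c keeps pos() on failure
        return ('INT',  $1) if m/\G(\d+)/gc;
        return ('ID',   $1) if m/\G([A-Za-z_]\w*)/gc;
        return ('EOF',  '') if m/\G\z/gc;
        return ('CHAR', $1) if m/\G(.)/gcs;
    }
    return;
}

my $input = "foo 42 bar";
while (1) {
    my ($type, $text) = next_token(\$input);
    last if $type eq 'EOF';
    print "$type: $text\n";          # ID: foo, then INT: 42, then ID: bar
}
```

Each call picks up where the previous one stopped because the position is stored on `$input` itself, not on a copy.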
Re^2: when $$s =~ m/\G.../gc is too verbose (for) by stefp (Vicar) on Feb 03, 2006 at 09:49 UTC
This is the trick used by `Calc.yp` in the `Parse::Yapp` distribution. It does create an alias, but it conveys the wrong message because the block is not really used as a loop. -- stefp
Re^3: when $$s =~ m/\G.../gc is too verbose (for) by bart (Canon) on Feb 03, 2006 at 11:15 UTC
That's why I wish Perl allowed another keyword as yet another synonym for `for`/`foreach`. I'd propose "`with`", for example: `with($$s) { ... }` But in the meantime, I've trained myself to actually read/see `for(SCALAR) { ... }` as `with(SCALAR) { ... }`. Chalk it up as another Perl idiom.
Re^4: when $$s =~ m/\G.../gc is too verbose (for) by TimToady (Parson) on Feb 03, 2006 at 19:10 UTC
Re^3: when $$s =~ m/\G.../gc is too verbose (for) by tye (Sage) on Feb 03, 2006 at 19:45 UTC
Much like in English, you can use Perl's `for()` for iterating over a list, iterating via initialization + check + step, or associating a single topic with a block of syntax. So I, without apology, use `for()` for topicalizing. For you, I won't stop doing this. (: Excuse me for not demonstrating the use of English "for" analogous to init + check + step. - tye
Re: when $$s =~ m/\G.../gc is too verbose by Anonymous Monk on Feb 02, 2006 at 23:28 UTC
Maybe I don't understand the problem, but if you really hate typing so much, why not generalize the solution instead? Not that I like to generalize things, because I usually end up un-generalizing them a few months later (stupid shifting requirements!), but to me it seems easier than playing with symbol table manipulations just to save a few keystrokes... am I missing something? I'm thinking of something roughly along these lines... completely untested and possibly wrong code is below. ;-)

    # make a table of regular expression patterns
    my %table = (
        qr/(\d+)/      => 'INT',
        qr/([A-Z]\w*)/ => 'ID',
        # .... more tokens here
    );

    my ($parser) = shift;
    my $s = $parser->YYData->{INPUT};
    my @matches;  # any matches found by our re go in here
    foreach my $re ( keys %table ) {
        # for each regexp, check to see if it matches, and
        # put all the captured values in @matches if it does
        @matches = ( $$s =~ m/\G$re/gc );
        # return the appropriate token, and captures...
        return ( $table{$re}, @matches ) if (@matches);
    }  # end search for a token match
    # token not found... put error handling here
    ...

-- Ytrew
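A possible variation on the table idea above, offered only as a sketch. Hash keys come back in no particular order, so this version keeps the patterns in an array and tries them in a fixed order. The names `@TABLE` and `table_token` are hypothetical, and the input is passed as a plain string reference instead of going through `$parser->YYData`.

```perl
use strict;
use warnings;

# Ordered list of [token name, pattern]; the names are illustrative only.
my @TABLE = (
    [ INT => qr/(\d+)/      ],
    [ ID  => qr/([A-Z]\w*)/ ],
);

sub table_token {
    my ($s) = @_;                    # reference to the input string
    $$s =~ m/\G\s+/gc;               # skip whitespace
    for my $entry (@TABLE) {
        my ($name, $re) = @$entry;
        # scalar context: one match at pos(); /c keeps pos() on failure
        return ($name, $1) if $$s =~ m/\G$re/gc;
    }
    return;                          # no token matched: error handling goes here
}

my $in = "42 Foo";
while (my ($type, $text) = table_token(\$in)) {
    print "$type: $text\n";
}
```

The match is done in scalar context on purpose, so each call consumes exactly one token.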
Re^2: when $$s =~ m/\G.../gc is too verbose by stefp (Vicar) on Feb 03, 2006 at 00:00 UTC
Without going to the extremities of toke.c (the Perl tokenizer), things are usually more complicated than mere pattern matching. One may have to test various flags. Otherwise, indeed, one could factor things out one way or another. -- stefp
Re: when $$s =~ m/\G.../gc is too verbose by ambrus (Abbot) on Feb 03, 2006 at 15:48 UTC
You can also use a single regexp with all alternatives and \G and the g flag but without the c flag. Then you can decide which alternative matched by checking the definedness of `$1` and other match variables. I sometimes use that idiom instead of many regexps with a gc flag. A nice example is the glob_to_re function in cgrep (snapshot) (which is btw an improved version of my cgrep: Egrep clone with function name display). A simpler example is in Re: Logic trouble parsing a formatted text file into hashes of hashes (of hashes, etc.).
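A small sketch of the single-regexp idiom described above. This is not the `glob_to_re` code. The sub name `tokenize`, the token names and the three alternatives are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical tokenizer: one regexp, one capture group per alternative.
sub tokenize {
    my ($text) = @_;
    my @tokens;
    # \G plus /g (no /c): the loop ends at the first position where
    # none of the alternatives matches.
    while ($text =~ m/\G\s*(?:(\d+)|([A-Za-z_]\w*)|(\S))/g) {
        if    (defined $1) { push @tokens, [ INT  => $1 ] }
        elsif (defined $2) { push @tokens, [ ID   => $2 ] }
        else               { push @tokens, [ CHAR => $3 ] }
    }
    return @tokens;
}

print join(' ', map { "$_->[0]($_->[1])" } tokenize("foo 42 + bar")), "\n";
```

Only the group belonging to the alternative that matched is defined, which is what the `defined` tests rely on.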
Re: when $$s =~ m/\G.../gc is too verbose by radiantmatrix (Parson) on Feb 07, 2006 at 17:53 UTC
Why not just store `$$s` in a local copy of `$_`?

    # Either
    local $_ = $$s;
    # Or
    s//$$s/;  # tricky.. ;-)

Actually, in this case, I'd be tempted to alter your approach altogether and use a regex table.

    sub lexer {
        my ($parser) = shift;
        my $s = $parser->YYData->{INPUT};
        # I don't get your line: 'm/\G\s+/gc; skip any spaces'
        # qr// does not accept /g or /c, so those flags go on the match itself
        my %dispatch = (
            INT => qr/\G(\d+)/,
            ID  => qr/\G([A-Z]\w*)/,
            #.. and so on ..
        );
        while (my ($key, $regex) = each %dispatch) {
            return ($key, $1) if $$s =~ /$regex/gc;
        }
    }

<-radiant.matrix->
A collection of thoughts and links from the minds of geeks
The Code that can be seen is not the true Code*
I haven't found a problem yet that can't be solved by a well-placed trebuchet
Re^2: when $$s =~ m/\G.../gc is too verbose by Corion (Patriarch) on Feb 07, 2006 at 17:58 UTC
Your second solution is no solution: `$_ = '!'; $s = \'No'; s//$$s/; print;` You'd need to empty out `$_` first, so the `local $_` is the way.
Re^3: when $$s =~ m/\G.../gc is too verbose by radiantmatrix (Parson) on Feb 07, 2006 at 19:32 UTC
Absolutely. That's the tricky part. ;-) It will work when `$_` is undefined, but not otherwise. Of course, you could always change it to `s/./$$s/`, but still not advisable. More of an obfu trick... <-radiant.matrix->
Re^2: when $$s =~ m/\G.../gc is too verbose by stefp (Vicar) on Feb 07, 2006 at 22:31 UTC
About the copy: before even thinking about positions in strings, using a string copy is a no-no. Copying the string to be parsed for each token is madness. About a table for lexing: this is irrelevant to the discussion. Also, lexing can be more complex than matching. Yes, one can insert regular code in a regex, but that is the sign that table-based lexing is not appropriate. As I told tye, using `for` is the right way to alias to `$_`. I don't like it because in the programming space, `for` is a loop... for me. :) In the natural language space, well, English is not my first language. So to paraphrase Churchill, `for` is the worst solution, but it is the only one. Hopefully, as TimToady said, Perl 6 will be cleaner. -- stefp
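To illustrate the point about copies and positions, a minimal sketch comparing a copy with an alias. The sub names `via_copy` and `via_alias` are hypothetical. The match position is stored on the scalar being matched, so a copy starts over on every call while an alias keeps advancing.

```perl
use strict;
use warnings;

# Matches one character at the current position of a *copy* of the input.
sub via_copy {
    my ($s) = @_;
    local $_ = $$s;                  # copies the string; pos() is not carried over
    return m/\G(.)/gc ? $1 : undef;
}

# Same match, but $_ is aliased to the original string.
sub via_alias {
    my ($s) = @_;
    for ($$s) {                      # no copy; pos() lives on $$s itself
        return m/\G(.)/gc ? $1 : undef;
    }
    return;
}

my $x = "AB";
print via_copy(\$x), via_copy(\$x), "\n";     # prints "AA"
my $y = "AB";
print via_alias(\$y), via_alias(\$y), "\n";   # prints "AB"
```

Apart from the cost of copying the whole input for every token, the copy version can never get past the first token.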

