PerlMonks  

Reading into a hash from regex

by songahji (Friar)
on May 16, 2005 at 20:29 UTC ( [id://457551]=perlquestion )

songahji has asked for the wisdom of the Perl Monks concerning the following question:

Let's say I have a file containing:
--------------------
ASCII=i
Character=i
Decimal=970
Hexidecimal=3CA
Entity name=
Decription=i with an umlaut
--------------------
ASCII='
Character='
Decimal=39
Hexidecimal=27
Entity name=
Decription=quote
---------------------
ASCII=<
Character=<
Decimal=60
Hexidecimal=3C
Entity name=lt
Decription=less than
--------------------
.... truncated
Next, a hash needs to be populated with the values of 'ASCII' and 'Character'. Since the values of 'Character' are unique, they will be the hash keys. So far this is what I have come up with:
open IN, "$myfile";
{
    local $/;
    $str = <IN>;
}
close IN;
%hash = reverse ($str =~ m/ASCII\=(.*?)\s+Character\=(.*?)\s+/g);
Are there other ways to do this better?

Greets,
Hanny J
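
To illustrate why the `reverse` in the snippet above puts 'Character' in the key position, here is a minimal sketch. The two-record sample string is hypothetical (it is not the poster's file), and it uses `.+?` rather than `.*?` in the captures.

```perl
use strict;
use warnings;

# Hypothetical two-record sample in the same layout as the file above.
my $str = "ASCII=a\nCharacter=b\nDecimal=1\nASCII=c\nCharacter=d\nDecimal=2\n";

# In list context with /g, the match returns every capture in order:
# ('a', 'b', 'c', 'd'), i.e. ASCII, Character, ASCII, Character.
my @pairs = $str =~ m/ASCII=(.+?)\s+Character=(.+?)\s+/g;

# reverse turns that into ('d', 'c', 'b', 'a'), so each Character value
# now comes before its ASCII value and becomes the hash key.
my %hash = reverse @pairs;

print "$_ => $hash{$_}\n" for sort keys %hash;
# prints:
# b => a
# d => c
```

Reversing the whole list also reverses the record order, which does not matter for a hash as long as the keys are unique.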

Replies are listed 'Best First'.
Re: Reading into a hash from regex
by holli (Abbot) on May 16, 2005 at 20:38 UTC
    maybe,
    use strict;
    use warnings;

    $/="--------------------";
    my %h = ();
    open IN, "<", "c:/pm217.txt" or die $!;
    while (<IN>) {
        if ( /ASCII=(.+?)\nCharacter=(.+?)\n/ms ) {
            $h{$2} = $1;
        }
    }
    close IN;


    holli, /regexed monk/
Re: Reading into a hash from regex
by jdporter (Paladin) on May 16, 2005 at 20:42 UTC
    Or:
    my %h = do {
        local $/ = "--------------------";
        local @ARGV = "c:/pm217.txt";
        map { ( /Character=(.*)/, /ASCII=(.*)/ ) } <>
    };
    Or:
    my %h;
    {
        local $/; # sluurp
        local @ARGV = "c:/pm217.txt";
        @h{ /Character=(.*)/g } = /ASCII=(.*)/g for <>;
    }
Re: Reading into a hash from regex
by TedPride (Priest) on May 16, 2005 at 21:05 UTC
    use strict;
    use warnings;

    my ($handle, %hash, $a, $c);
    my $myfile = 'whatever.dat';
    open $handle, $myfile;
    <$handle>;
    while ($_ = <$handle>) {
        $a = substr($_, 6, 1);
        $c = substr(<$handle>, 10, 1);
        <$handle> for (0..4);
        $hash{$c} = $a;
    }
Re: Reading into a hash from regex
by tlm (Prior) on May 17, 2005 at 00:04 UTC

    You already got some good alternative snippets that fix these errors, but I thought it would be helpful to point them out explicitly. Namely, your regexp will give the wrong results when what immediately follows the = matches \s. Therefore you should replace the .*? in the captures with .+?. Also, you should add the /s modifier to the regexp if you want the . in the captures to match \n.
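
    The first point can be seen with a small sketch. The record below is a hypothetical entry for the space character (it does not appear in the original post), chosen because the value after each = is itself whitespace:

```perl
use strict;
use warnings;

# Hypothetical record: the value after each "=" is a single space.
my $str = "ASCII= \nCharacter= \nDecimal=32\n";

# Original pattern: .*? may match nothing, so the greedy \s+ that follows
# consumes the space and the newline, and both captures come back empty.
my @lazy_star = $str =~ m/ASCII=(.*?)\s+Character=(.*?)\s+/;

# With .+? each capture must take at least one character, so the space
# is captured and \s+ is left with only the newline.
my @lazy_plus = $str =~ m/ASCII=(.+?)\s+Character=(.+?)\s+/;

printf "with .*?: [%s] [%s]\n", @lazy_star;   # with .*?: [] []
printf "with .+?: [%s] [%s]\n", @lazy_plus;   # with .+?: [ ] [ ]
```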

    the lowliest monk

      Well spotted.

      Thx,
      Hanny J

Re: Reading into a hash from regex
by strictvars (Sexton) on May 17, 2005 at 19:48 UTC
    Maybe with a LISP style
    BEGIN{
        open IN, $myfile or die "can't open $!";
        local $/;
        $char{ ( /Character=(.*)/ )[0] } = $hex{ ( /Hexidecimal=(.*)/ )[0] } = { map{ /(.*?)=(.*)/ } split /\n/ }
            for split /\n-+\n/, <IN>;
    }
Re: Reading into a hash from regex
by Roger (Parson) on May 17, 2005 at 13:23 UTC
    Here's my try... I'm assuming that either ASCII or Character could be the first line of a section, so some rearrangement is needed:

    use Data::Dumper;

    my %hash;
    my @n = map { /(Character|ASCII)\s*?=\s*?(\S+)/; $1 ? [$1,$2] : () } <DATA>;
    while (my ($a, $b) = splice @n, 0, 2) {
        if ($a->[0] eq 'ASCII') {
            $hash{$b->[1]} = $a->[1];
        } else {
            $hash{$a->[1]} = $b->[1];
        }
    }
    print Dumper(\%hash);

    __DATA__
    --------------------
    ASCII=1
    Character=2
    Decimal=970
    Hexidecimal=3CA
    Entity name=
    Decription=i with an umlaut
    --------------------
    Character=4
    ASCII=3
    Decimal=39
    Hexidecimal=27
    Entity name=
    Decription=quote
    ---------------------
    ASCII=<
    Character=<
    Decimal=60
    Hexidecimal=3C
    Entity name=lt
    Decription=less than
