Regular expression

by Anonymous Monk
on Oct 30, 2012 at 05:25 UTC ( #1001453=perlquestion )
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi. Can anyone tell me the meaning of the expression below? I understand that we are mapping from a file to a hash table, but I don't fully understand the regular expression. Please help.

chop (%key_hash = map { split /\s*[|]\s*/,$_,2 } grep (!/^$/,<KEY_FH>));

Replies are listed 'Best First'.
Re: Regular expression
by Athanasius (Chancellor) on Oct 30, 2012 at 06:13 UTC

    Parsing from right to left:

    • Assuming KEY_FH is a filehandle which has been opened for reading, grep places <KEY_FH> into list context, so it returns a list of the lines in the input file.
    • grep tests each element against the negated regex !/^$/, filtering out empty lines (a line that holds only spaces or tabs still passes). So, the output of the call to grep is a list of the non-empty lines in the input file.
    • map takes these lines as input, and applies the split function to each.
    • This splits on the pipe character (“|”), optionally surrounded by whitespace.
    • The LIMIT of 2 ensures that split will return at most 2 fields. So if the line contains more than one |, only the first will be used to split the line into fields. Assuming the line contains at least one |, split will return a list of two fields.
    • These two fields are assigned to the hash %key_hash, where they form a key-value pair.
    • Finally, chop is applied to the whole hash, which removes the final character from the value half of each key-value pair. This will presumably be the newline character read in at the end of each line. (But chomp would be a better choice here.)
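
The steps above can be sketched as separate statements. This is only an illustration, not the original poster's code: the in-memory input, the lexical filehandle $key_fh, and the sample lines are assumptions made so the example runs on its own, and chomp is used in place of chop as suggested.

```perl
use strict;
use warnings;

# Hypothetical input standing in for the file behind KEY_FH.
my $input = "name | Alice\n\ncolor|red|dark\n";
open my $key_fh, '<', \$input or die "Cannot open in-memory file: $!";

my @lines     = <$key_fh>;                # list context: every line, newline included
my @non_blank = grep { !/^$/ } @lines;    # drop empty lines
close $key_fh;

my %key_hash;
for my $line (@non_blank) {
    chomp $line;                                       # remove the trailing newline, if any
    my ($key, $value) = split /\s*[|]\s*/, $line, 2;   # split on the first | only
    $key_hash{$key} = $value;
}

print "$_ => $key_hash{$_}\n" for sort keys %key_hash;
```

With this sample input the hash ends up with two entries: name maps to Alice, and color maps to red|dark, because the LIMIT of 2 leaves the second | inside the value.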

    Hope that helps,

    Athanasius <°(((><contra mundum

Re: Regular expression
by Kenosis (Priest) on Oct 30, 2012 at 06:26 UTC

    This is more than what you asked about, but there are two regexes in the following:

    chop (%key_hash = map { split /\s*[|]\s*/,$_,2 } grep (!/^$/,<KEY_FH>));

    chop      - Should be 'chomp' to remove the trailing newlines
    \s*       - Zero or more whitespaces
    [|]       - Vertical bar as only member in character set
    \s*       - Zero or more whitespaces
    $_        - Default scalar containing a file line
    2         - Split string into two parts
    !/^$/     - Let only non-empty file lines pass
    <KEY_FH>  - File handle

    It appears that this code splits a "|" delimited file line into two parts: the first field is the key and the remaining fields are the value. We can give it a try like this:

    use strict;
    use warnings;
    use Data::Dumper;

    chomp (my %key_hash = map { split /\s*[|]\s*/,$_,2 } grep (!/^$/,<DATA>));

    print Dumper \%key_hash;

    __DATA__
    AAA|B|C|D|E
    BBB|G|H|I|J
    CCC|1|2|3|4
    DDD|6|7|8|9

    The output shows the hash's key/value pairs:

    $VAR1 = {
              'CCC' => '1|2|3|4',
              'BBB' => 'G|H|I|J',
              'DDD' => '6|7|8|9',
              'AAA' => 'B|C|D|E'
            };
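
To see why chomp is the safer choice, here is a small sketch comparing the two on a hash. The hash contents are made up for illustration; the second value deliberately has no trailing newline, as can happen when the last line of a file is not newline-terminated.

```perl
use strict;
use warnings;

# Hypothetical data: the DDD value has no trailing newline.
my %with_chop  = (AAA => "B|C\n", DDD => "6|7");
my %with_chomp = %with_chop;

chop(%with_chop);      # removes the last character of every value, whatever it is
chomp(%with_chomp);    # removes only a trailing newline, if there is one

print "chop:  $with_chop{AAA} and $with_chop{DDD}\n";      # DDD loses its final 7
print "chomp: $with_chomp{AAA} and $with_chomp{DDD}\n";    # DDD is left intact
```

Both functions touch only the values of a hash, never its keys.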

    Hope this helps!

Re: Regular expression
by Anonymous Monk on Oct 30, 2012 at 05:28 UTC

    but I don't fully understand the regular expression. Please help.

    What did you understand?

      For split, I understood that the data is split on "|". For example, from "abc |efg", abc and efg are obtained. But I didn't understand why $_,2 and grep are used along with it. Please explain.
        Um, see split: the first argument is the regular expression, and the other arguments are the string to split and the limit.

        The grep is used to skip empty lines.
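
A short sketch of what the extra arguments do, using a made-up sample string. In the original one-liner the string to split is $_, which inside map holds the current line; here a named variable $line is used instead for clarity.

```perl
use strict;
use warnings;

my $line = "abc |efg|hij";    # hypothetical sample line

my @all = split /\s*[|]\s*/, $line;       # no limit: splits at every |
my @two = split /\s*[|]\s*/, $line, 2;    # limit 2: splits at the first | only

# grep keeps only the elements for which the condition is true,
# so the line holding nothing but a newline is dropped.
my @kept = grep { !/^$/ } ("abc |efg\n", "\n", "x|y\n");

print scalar(@all),  " fields without a limit: @all\n";
print scalar(@two),  " fields with a limit of 2: @two\n";
print scalar(@kept), " lines left after skipping empty ones\n";
```

Without the limit the line yields three fields (abc, efg, hij); with a limit of 2 it yields abc and efg|hij, which is what lets the rest of the line become the hash value.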

Node Type: perlquestion [id://1001453]
Approved by Athanasius