PerlMonks  

Re: More Regular Expressions (text data handling)

by frankus (Priest)
on Dec 04, 2001 at 16:55 UTC (#129317)


in reply to More Regular Expressions (text data handling)

For brevity and clarity in the question: putting the sample data under a __DATA__ label lets folks run the code easily.
Update: Agh! The gotcha here is that some lines have 2 key/value pairs on 'em :(

#!/usr/bin/perl -w
use strict;
use Data::Dumper;

for (grep {/\w/} <DATA>) {
    # g repeats the regex, e executes the Perl in the substitution,
    # returns the number of matches into the condition.
    # $1 and $2 are the bracketed matches in the regex in order.
    unless ( s/^(^[a-z ]+) *: *([\w\d]+)/$_{$1}=$2/ige ) {
        chomp;
        if (/:/) {                      # key part
            $_{$_} = join(',', @_);     # make new key item
            @_ = ()
        }
        else {                          # list part
            push @_, $_
        }
    }
}
print Dumper(\%_);
__DATA__
Graq:
Agnostic:
Number: 634321
age: 27
hair colour: black
height: 73
weight: 123
legs: 2
arms: 2
jameson
bells
guinness
favourite:
detests:
likes:
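One possible way round the two-pairs-on-a-line gotcha is to drop the ^ anchor and run the match with /g in list context, so every key/value pair on a line comes back in one go. This is only a sketch: the helper name pairs_on_line and the %record hash are made up for illustration, and it assumes values are a single \w+ word (a value like "dark brown" would confuse it).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns a flat list
# (key, value, key, value, ...) for every "key: value" pair on one line.
# Assumption: keys are letters and spaces, values are a single \w+ word.
sub pairs_on_line {
    my ($line) = @_;
    return $line =~ /([a-z][a-z ]*?)\s*:\s*(\w+)/gi;
}

my %record;
for my $line ('Number: 634321 age: 27', 'hair colour: black height: 73') {
    # Merge the pairs found on this line into the record.
    %record = (%record, pairs_on_line($line));
}

printf "%s => %s\n", $_, $record{$_} for sort keys %record;
```

The lazy [a-z ]*? keeps the key from swallowing trailing spaces before the colon, and because /g resumes after the previous value, a multi-word key such as "hair colour" is still picked up whole.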

--

Brother Frankus.

¤


Re(2): More Regular Expressions (text data handling)
by graq (Curate) on Dec 04, 2001 at 18:12 UTC
    Agh! the gotcha here is there are some lines with 2 key value pairs on em :(

    Yes, sorry, I should have expanded on the data, especially the noise on either side of it; hence the indexing step. Below is a sample closer to the real data.

    __DATA__ NOISE noise Graq 121212 rubbish: values Graq Agnostic Number: 634321 age: 27 hair colour: black height: 73 weight: 123 legs: 2 arms: 2 balls: 1 aminals: 3 leftandright rugby cute "more noise here - and don't forget the blank lines..." __END_DATA__
    The first guaranteed unique identifier is the line beginning with Number (i.e. /^Number:/).

    Lines before the colon (:) separated set of values are key-less (hard-coded keys must be used); values after it are sub-values of the preceding colon-separated values.

    So, if you like, {arms}->{2}->{leftandright}, {balls}->{1}->{rugby} ..

    Is it becoming any clearer?? :-(
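For what it's worth, the nesting described above ({arms}->{2}->{leftandright} and so on) could be built along these lines. This is a rough sketch under a loudly stated assumption: that a line of "key: value" pairs is followed by a line of sub-values, one per pair, in the same order. The thread does not confirm that layout, and the names %tree and @last_pairs are invented for the example. The key-less lines before Number, which need hard-coded keys, are simply skipped here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ASSUMED layout (not confirmed by the post): a line of "key: value" pairs
# is followed by a line of sub-values, one per pair, in the same order.
my @lines = (
    'NOISE noise',
    'rubbish: values',
    'Number: 634321',
    'arms: 2 balls: 1 aminals: 3',
    'leftandright rugby cute',
);

my (%tree, @last_pairs, $seen_number);
for my $line (@lines) {
    next unless $line =~ /\S/;                 # ignore blank lines
    $seen_number ||= $line =~ /^Number:/;      # first reliable identifier
    next unless $seen_number;                  # skip noise before it

    if (my @kv = $line =~ /([a-z][a-z ]*?)\s*:\s*(\w+)/gi) {
        # A line of pairs: remember them so sub-values can attach later.
        @last_pairs = ();
        while (my ($k, $v) = splice @kv, 0, 2) {
            $tree{$k}{$v} = {};
            push @last_pairs, [$k, $v];
        }
    }
    else {
        # A line of sub-values: attach each one to the pair in the same position.
        my @subs = split ' ', $line;
        for my $i (0 .. $#subs) {
            my $pair = $last_pairs[$i] or last;
            $tree{ $pair->[0] }{ $pair->[1] }{ $subs[$i] } = 1;
        }
        @last_pairs = ();    # later non-pair lines are treated as noise
    }
}

print join(', ', sort keys %{ $tree{arms}{2} }), "\n";
```

If the real data pairs sub-values with keys some other way, only the else branch should need changing.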

    Graq (http://www.graq.co.uk)
