PerlMonks  

How do I extract tags and retrieve values

(#184611: categorized question)
Contributed by Tita on Jul 23, 2002 at 21:36 UTC


Description:

Hi, I have this in a file:
<s>/SYM Who/WP is/VBZ the/DT author/NN of/IN the/DT book/NN... ?/. </s>/SYM.

How do I extract just the tags to make a formula (e.g. WP+VBZ+DT...), and how do I then retrieve the word that goes with each tag (WP -> Who, ...) once the formula matches one of the formulas I have in another file? Thanks. Tita

Answer: How do I extract tags and retrieve values
contributed by Tita

If the matching tags aren't nested, the text between them can be captured with a non-greedy match, like so:

my $string = "<S>yada yada yada</S>";
print "Yes! $1\n" if $string =~ m{<S>(.*?)</S>};

### Here is an explanation of the regular expression
use YAPE::Regex::Explain;
print YAPE::Regex::Explain->new('<S>(.*?)</S>')->explain;
__END__
The regular expression:

(?-imsx:<S>(.*?)</S>)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  <S>                      '<S>'
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  </S>                     '</S>'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
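The regex above captures everything between the sentence markers, but the question also asks for the tags themselves and for the word behind each tag. Here is one possible sketch of that part. It assumes tokens are separated by whitespace and that the tag is whatever follows the last slash in a token. The sub names parse_tagged and formula, the shortened sample line, and the %known hash (standing in for the second file of formulas) are all made up for illustration.

```perl
use strict;
use warnings;

# Split a line of word/TAG tokens into [word, tag] pairs, in order.
# Assumption: tokens are whitespace-separated, tag follows the LAST slash.
sub parse_tagged {
    my ($line) = @_;
    my @pairs;
    for my $token (split ' ', $line) {
        # The greedy (.+) pushes the split to the last slash,
        # so "</s>/SYM" yields the word "</s>" and the tag "SYM".
        next unless $token =~ m{^(.+)/([^/\s]+)$};
        push @pairs, [ $1, $2 ];
    }
    return @pairs;
}

# Join the tags of the pairs into a formula such as "WP+VBZ+DT".
sub formula {
    my (@pairs) = @_;
    return join '+', map { $_->[1] } @pairs;
}

# Shortened sample line, modelled on the one in the question.
my $line  = '<s>/SYM Who/WP is/VBZ the/DT author/NN of/IN the/DT book/NN ?/. </s>/SYM';
my @pairs = parse_tagged($line);

# Hypothetical set of known formulas; in practice, read these from the other file.
my %known = map { $_ => 1 } ('SYM+WP+VBZ+DT+NN+IN+DT+NN+.+SYM');

if ( $known{ formula(@pairs) } ) {
    # The formula matched, so report the word behind each tag.
    print "$_->[1] -> $_->[0]\n" for @pairs;
}
```

The pairs are kept in an ordered list rather than a hash keyed on the tag because a tag can occur more than once in a sentence (DT and NN both appear twice in the sample), and a hash would keep only the last word for each.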
