Thanks for a great snippet, tye!!
I must admit that it took me a while to get my brain around merlyn's regex, but after breaking it down it made perfect sense. unixwizard pointed out that the code wouldn't work on the 'other' US states, but I think that can be fixed by changing the (?:\s\w+)? bit to (?:\s\w+)*
Here is merlyn's regex broken down, with comments:
\G                # when progressively matching a string
                  # with the 'g' flag
                  # you can use the \G anchor to 'hold' the position
                  # just after the previous match
                  # helps regex remember where it left off
                  # allows you to go through a list efficiently
                  # without using split or looping
                  # Mastering Regular Expressions (p 236 - 240)
\s*               # matches zero or more whitespace characters that
                  # may come before a name-value pair
(\w\w)            # match two word characters (alphanumeric plus '_')
                  # parentheses assign matched letters to $1
                  # this is the state abbreviation
\s+               # match one or more whitespace characters
                  # between name and value
(\w+              # match one or more word characters
  (?:             # ?: allows for cluster-only parentheses,
                  # no capturing and doesn't assign to $3
    \s\w+         # match one whitespace character,
                  # then one or more word characters
  )?              # match zero or one of these clusters
                  # allows match of state names with two words
                  # e.g. New York, West Virginia
                  # does not match names with three words,
                  # like 'Northern Mariana Islands'
                  # change trailing ? to * to match those '(?:\s\w+)*'
)                 # assigns state name to $2
/gx;              # end of regex
                  # g flag for global search
                  # x flag to allow whitespace in regex
                  # might also want to use c flag
                  # c flag causes the match position to be retained
                  # following an unsuccessful match
                  # see: Effective Perl Programming (p.63)
                  #
                  # the complete regex looks like this:
                  #
                  # /\G\s*(\w\w)\s+(\w+(?:\s\w+)?)/g;
                  #
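To see the difference between the ? and * versions in practice, here is a minimal sketch. The parse_states helper and the sample lines are my own illustration, not code from the thread. It assumes one abbreviation/name pair per input string: \s also matches a newline, so on a longer whitespace-separated list the greedy trailing group would absorb the next abbreviation.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): parse "XX Name" pairs,
# one pair per input string, into a hash keyed by abbreviation.
# A true $multi selects the (?:\s\w+)* variant instead of (?:\s\w+)?.
sub parse_states {
    my ($multi, @lines) = @_;
    my $re = $multi
        ? qr/\G\s*(\w\w)\s+(\w+(?:\s\w+)*)/
        : qr/\G\s*(\w\w)\s+(\w+(?:\s\w+)?)/;
    my %name_for;
    for my $line (@lines) {
        # //g in scalar context resumes at pos($line); \G anchors there
        while ($line =~ /$re/g) {
            $name_for{$1} = $2;
        }
    }
    return %name_for;
}

my @lines = ('NY New York', 'WV West Virginia', 'MP Northern Mariana Islands');
my %two  = parse_states(0, @lines);
my %many = parse_states(1, @lines);
print "$two{MP}\n";     # Northern Mariana
print "$many{MP}\n";    # Northern Mariana Islands
```

With the ? version the third word is simply left unmatched, so the name is silently truncated rather than rejected.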
Here is the Benchmark of the three routines:
Benchmark: timing 1000 iterations of merlyn, tye_1, tye_2...
    merlyn:  0 secs ( 0.54 usr +  0.00 sys =  0.54 CPU) @ 1851.85/s (n=1000)
     tye_1:  1 secs ( 0.65 usr +  0.00 sys =  0.65 CPU) @ 1538.46/s (n=1000)
     tye_2:  2 secs ( 1.58 usr +  0.00 sys =  1.58 CPU) @ 632.91/s (n=1000)
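For anyone who wants to run a comparison like this themselves, output in that format comes from timethese in the standard Benchmark module. The sketch below only shows the shape of such a harness. regex_parse and split_parse are hypothetical stand-ins of mine, not the merlyn, tye_1 and tye_2 routines from the thread, and the sample data is made up.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Made-up sample data, one "XX Name" pair per string.
my @lines = ('NY New York', 'WV West Virginia', 'TX Texas');

# Hypothetical stand-in: regex-based parse of each line.
sub regex_parse {
    my %name_for;
    for my $line (@lines) {
        $name_for{$1} = $2 if $line =~ /^\s*(\w\w)\s+(\w+(?:\s\w+)*)/;
    }
    return \%name_for;
}

# Hypothetical stand-in: split each line into abbreviation and the rest.
sub split_parse {
    my %name_for;
    for my $line (@lines) {
        my ($abbr, $name) = split ' ', $line, 2;
        $name_for{$abbr} = $name;
    }
    return \%name_for;
}

# Run each routine 1000 times and print one timing line per routine.
timethese(1000, {
    regex_parse => \&regex_parse,
    split_parse => \&split_parse,
});
```

Routines this small finish 1000 iterations almost instantly, so Benchmark may warn about too few iterations; raising the count gives steadier numbers.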
Also, here is a link to FIPS and ISO 3166 country codes in case anybody wants to apply this snippet to countries.

Get Strong Together!!

In reply to Re: U.S. State Names by aardvark
in thread U.S. State Names by tye
