Words are usually delimited by punctuation, not only by whitespace. The following script therefore counts words as runs of letters delimited by non-letters, and counts sentences by their full stops.
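#!/usr/bin/perl
use warnings;
use strict;
use open IO => ':utf8', ':std';    # decode input and encode output as UTF-8

# Start at zero so that empty input prints "0 0" without warnings.
my ($words, $sentences) = (0, 0);

while (<>) {
    $words++     for m/\p{L}+/g;   # each run of Unicode letters is one word
    $sentences++ for m/\./g;       # each full stop ends one sentence
}

print "$words $sentences\n";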

Tested on the following text:

Огонь XXII Зимних олимпийских игр в Сочи во второй раз погас в понедельник в Москве, во время этапа эстафеты олимпийского огня. После нескольких безуспешных попыток снова его зажечь, факел был заменен, передает портал
Казус произошел на Раушской набережной, недалеко от Кремля. Видно, как зрители приветствуют факелоносца, он машет в ответ, и через какое-то время факел гаснет.
Output:
59 5
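To see why letter runs differ from whitespace-delimited tokens, here is a minimal sketch. The sample string is made up for illustration and is not part of the test text above; the variable names are likewise my own.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use utf8;    # the sample string below is a UTF-8 literal in the source

# Hypothetical sample: punctuation separates words without any whitespace.
my $text = 'Stop,look;listen. Стоп!';

# Splitting on whitespace leaves punctuation glued to the words.
my @by_space = split ' ', $text;

# Matching runs of Unicode letters, the same pattern the script uses.
my @by_letters = $text =~ /\p{L}+/g;

print scalar(@by_space),   " whitespace-delimited tokens\n";
print scalar(@by_letters), " letter runs\n";
```

Splitting on whitespace should find 2 tokens here, while \p{L}+ should find 4 words. Note that \p{L} matches letters only, so digits, apostrophes and hyphens also split words; whether that is wanted depends on the text.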
لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

In reply to Re^3: Perl & Unicode: state of the art? by choroba
in thread Perl & Unicode: state of the art? by BrowserUk
