Regex matching end of sentence

by Dr Manhattan (Beadle)
on Jan 31, 2013 at 09:12 UTC ( #1016262=perlquestion )
Dr Manhattan has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

I need a regex to match sentences ending with a period, but it has to skip abbreviations that might occur in the middle of the sentence.

For instance, if I have the sentence 'I like Mr. Smith's dog.', the regex should match the whole sentence, not just the 'I like Mr.' part.

if ($in =~ /(\w+)(\!|\?|\.)(\s)((([A-Z])(\w|\s|\d|\(|\)|\+|\=|\-|\@|\#|\%|\&|\*|\<|\>|\,|\\|\/|\"|\`|'n)+(\s)(\w+\.))(\w|\s|\d|\(|\)|\+|\=|\-|\@|\#|\%|\&|\*|\<|\>|\,|\\|\/|\"|\`|'n)+(\s)(\w+\.))(\s)([A-Z])/) {
    if (!exists ($abbreviations{$9})) {
        $hash{$5}++;
    }
    elsif (!exists ($abbreviations{$12})) {
        $hash{$4}++;
    }
}

I tried this, but it still doesn't work.

%abbreviations is a list of known abbreviations.

%hash is where correctly matched sentences are stored

Any help would be appreciated
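
One way to approach the problem described above, offered as a minimal sketch rather than a definitive answer: instead of one large pattern, scan for candidate terminators and check the word in front of each period against %abbreviations. The sub name split_sentences, the sample abbreviation list, and the assumption that abbreviation keys are stored without the trailing period are all illustrative and not taken from the original post.

```perl
use strict;
use warnings;

# Hypothetical abbreviation list; keys are assumed to omit the trailing period.
my %abbreviations = map { $_ => 1 } qw(Mr Mrs Dr Prof);

sub split_sentences {
    my ($text) = @_;
    my @sentences;
    my $start = 0;

    # Candidate boundary: a word, then a terminator, followed by either
    # whitespace plus a capital letter or the end of the text.
    while ($text =~ /(\w+)([.!?])(?=\s+[A-Z]|\s*\z)/g) {
        my ($word, $mark) = ($1, $2);

        # A period directly after a known abbreviation does not end the sentence.
        next if $mark eq '.' && exists $abbreviations{$word};

        my $end = pos($text);
        push @sentences, substr($text, $start, $end - $start);
        $start = $end;
    }

    s/^\s+// for @sentences;    # drop the whitespace left between sentences
    return @sentences;
}

my $in = "I like Mr. Smith's dog. Dr. Jones likes it too! Do you?";
my %hash;
$hash{$_}++ for split_sentences($in);
print "$_\n" for sort keys %hash;
```

For the sample string this should store three sentences, with 'I like Mr. Smith's dog.' kept whole. The sketch cannot tell when an abbreviation really does end a sentence ('I met the Dr. He left.'), so that case would still need extra rules.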

Replies are listed 'Best First'.
Re: Regex matching end of sentence
by tmharish (Friar) on Jan 31, 2013 at 09:34 UTC
Re: Regex matching end of sentence
by Anonymous Monk on Jan 31, 2013 at 10:17 UTC
Re: Regex matching end of sentence
by ww (Archbishop) on Jan 31, 2013 at 18:20 UTC
    ...and what if the writer of the sentence(s) has (as I do) a weakness for using ellipsis?
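
    The ellipsis case can be made explicit with a small sketch like the one below. It is only an illustration under assumptions: the sub name sentence_ends and the $ellipsis_ends switch are hypothetical, and abbreviation handling is left out to keep the example short. A run of two or more periods is matched as a single token, so the caller decides whether it counts as a sentence boundary.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the offsets just past each sentence end.
# $ellipsis_ends is an assumed switch; when true, an ellipsis followed by
# a capitalised word (or the end of the text) is treated as a boundary.
sub sentence_ends {
    my ($text, $ellipsis_ends) = @_;
    my @ends;

    # Match either a run of two or more periods or a single terminator,
    # followed by whitespace plus a capital letter or the end of the text.
    while ($text =~ /(\.{2,}|[.!?])(?=\s+[A-Z]|\s*\z)/g) {
        my $mark = $1;
        next if length($mark) > 1 && !$ellipsis_ends;
        push @ends, pos($text);
    }
    return @ends;
}

my $text = "I wonder... Maybe not.";
print scalar(sentence_ends($text, 0)), " boundary with ellipsis ignored\n";
print scalar(sentence_ends($text, 1)), " boundaries with ellipsis counted\n";
```

    Neither setting is right for every text: 'Well... I guess so.' has a capital after the ellipsis without starting a new sentence, which is the kind of ambiguity this reply points at.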

Node Type: perlquestion [id://1016262]
Approved by Corion