Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer

Regex matching end of sentence

by Dr Manhattan (Beadle)
on Jan 31, 2013 at 09:12 UTC ( #1016262=perlquestion: print w/replies, xml ) Need Help??
Dr Manhattan has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

I need a regex to match sentences ending with a period, but it has to miss abbreviations that might occur in the middle of the sentence.

For instance if I have a sentence 'I like Mr. Smith's dog.', the regex should not only match the 'I like Mr.' part.

if ($in =~ /(\w+)(\!|\?|\.)(\s)((([A-Z])(\w|\s|\d|\(|\)|\+|\=|\-|\@| +\#|\%|\&|\*|\<|\>|\,|\\|\/|\"|\`|'n)+(\s)(\w+\.))(\w|\s|\d|\(|\)|\+|\ +=|\-|\@|\#|\%|\&|\*|\<|\>|\,|\\|\/|\"|\`|'n)+(\s)(\w+\.))(\s)([A-Z])/ +) { if (!exists ($abbreviations{$9})) { $hash{$5}++; } elsif (!exists ($abbreviations{$12})) { $hash{$4}++; } }

I tried this, but it still doesn't work.

%abbreviations is a list of known abbreviations.

%hash is where correctly matched sentences are stored

Any help would be appreciated

Replies are listed 'Best First'.
Re: Regex matching end of sentence
by tmharish (Friar) on Jan 31, 2013 at 09:34 UTC
Re: Regex matching end of sentence
by Anonymous Monk on Jan 31, 2013 at 10:17 UTC
Re: Regex matching end of sentence
by ww (Bishop) on Jan 31, 2013 at 18:20 UTC
    ...and what if the writer of the sentence(s) has (as I do) a weakness for using ellipsis?

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1016262]
Approved by Corion
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others musing on the Monastery: (3)
As of 2016-10-24 01:20 GMT
Find Nodes?
    Voting Booth?
    How many different varieties (color, size, etc) of socks do you have in your sock drawer?

    Results (302 votes). Check out past polls.