PerlMonks  

Regex matching end of sentence

by Dr Manhattan (Beadle)
on Jan 31, 2013 at 09:12 UTC (#1016262=perlquestion)
Dr Manhattan has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

I need a regex to match sentences ending with a period, but it has to skip over abbreviations that might occur in the middle of a sentence.

For instance, given the sentence 'I like Mr. Smith's dog.', the regex should match the whole sentence, not just the 'I like Mr.' part.

if ($in =~ /(\w+)(\!|\?|\.)(\s)((([A-Z])(\w|\s|\d|\(|\)|\+|\=|\-|\@|\#|\%|\&|\*|\<|\>|\,|\\|\/|\"|\`|'n)+(\s)(\w+\.))(\w|\s|\d|\(|\)|\+|\=|\-|\@|\#|\%|\&|\*|\<|\>|\,|\\|\/|\"|\`|'n)+(\s)(\w+\.))(\s)([A-Z])/) {
    if (!exists ($abbreviations{$9})) {
        $hash{$5}++;
    }
    elsif (!exists ($abbreviations{$12})) {
        $hash{$4}++;
    }
}

I tried this, but it still doesn't work.

%abbreviations is a hash of known abbreviations.

%hash is where correctly matched sentences are stored.

Any help would be appreciated
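
One way to approach the problem described above is to avoid a single large regex: walk through the text in chunks that end in a terminator, and only close a sentence when the word before the period is not a known abbreviation. The following is a minimal sketch under stated assumptions, not a definitive solution. The sub name split_sentences, the sample contents of %abbreviations, and the choice to store abbreviations without their trailing period are all assumptions made for illustration. Abbreviations with internal periods (such as 'e.g.') are not handled, and text after the last terminator is ignored.

```perl
use strict;
use warnings;

# Hypothetical abbreviation list; keys are stored without the trailing period.
my %abbreviations = map { $_ => 1 } qw(Mr Mrs Dr Prof etc);

# Split text into sentences, treating a period that follows a known
# abbreviation as part of the sentence rather than its end.
sub split_sentences {
    my ($text) = @_;
    my @sentences;
    my $current = '';

    # Each chunk is the shortest run ending in ., ! or ? that is
    # followed by whitespace or the end of the text.
    while ($text =~ /\G\s*(.*?[.!?])(?=\s|\z)/gs) {
        my $chunk = $1;
        $current .= ($current eq '' ? '' : ' ') . $chunk;

        # Word directly before a final period, if the chunk ends in one.
        my ($last_word) = $chunk =~ /(\w+)\.\z/;

        # A known abbreviation does not end the sentence; keep collecting.
        next if defined $last_word && $abbreviations{$last_word};

        push @sentences, $current;
        $current = '';
    }

    # Keep whatever was collected if the text ended on an abbreviation.
    push @sentences, $current if $current ne '';
    return @sentences;
}

my @s = split_sentences("I like Mr. Smith's dog. It barks a lot! Does Dr. Jones mind?");
print "$_\n" for @s;
```

Chunks are rejoined with a single space, so the original whitespace inside a sentence is not preserved. Keeping the abbreviation check in ordinary code rather than inside the pattern means the list can grow without touching the regex.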

Re: Regex matching end of sentence
by tmharish (Friar) on Jan 31, 2013 at 09:34 UTC
Re: Regex matching end of sentence
by Anonymous Monk on Jan 31, 2013 at 10:17 UTC
Re: Regex matching end of sentence
by ww (Bishop) on Jan 31, 2013 at 18:20 UTC
    ...and what if the writer of the sentence(s) has (as I do) a weakness for using ellipsis?
    :)
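
    The ellipsis case can be illustrated with a small heuristic: do not split after three periods when the next word starts with a lower-case letter. This is only a sketch of that one assumption. The sub name split_with_ellipsis is hypothetical, the lower-case test is a guess about the writer's habits, and abbreviations are not handled here at all.

```perl
use strict;
use warnings;

# Split on whitespace that follows ., ! or ?, except when the terminator
# is an ellipsis ("...") and the next word begins with a lower-case letter.
sub split_with_ellipsis {
    my ($text) = @_;
    return split /(?<=[.!?])(?!(?<=\.\.\.)\s+[a-z])\s+/, $text;
}

my @parts = split_with_ellipsis("Well... maybe not. I think... Yes! Done.");
print "$_\n" for @parts;
```

    An ellipsis followed by a capitalised word is still treated as the end of a sentence, which will be wrong for proper nouns.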
