PerlMonks  

Regex matching end of sentence

by Dr Manhattan (Beadle)
on Jan 31, 2013 at 09:12 UTC (#1016262)
Dr Manhattan has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

I need a regex to match sentences ending with a period, but it must not treat abbreviations that occur in the middle of a sentence as the end of that sentence.

For instance, given the sentence 'I like Mr. Smith's dog.', the regex should match the whole sentence, not just the 'I like Mr.' part.

if ($in =~ /(\w+)(\!|\?|\.)(\s)((([A-Z])(\w|\s|\d|\(|\)|\+|\=|\-|\@|\#|\%|\&|\*|\<|\>|\,|\\|\/|\"|\`|'n)+(\s)(\w+\.))(\w|\s|\d|\(|\)|\+|\=|\-|\@|\#|\%|\&|\*|\<|\>|\,|\\|\/|\"|\`|'n)+(\s)(\w+\.))(\s)([A-Z])/) {
    if (!exists ($abbreviations{$9})) {
        $hash{$5}++;
    }
    elsif (!exists ($abbreviations{$12})) {
        $hash{$4}++;
    }
}

I tried this, but it still doesn't work.

%abbreviations is a hash whose keys are known abbreviations.

%hash is where correctly matched sentences are stored.

Any help would be appreciated.

Re: Regex matching end of sentence
by tmharish (Friar) on Jan 31, 2013 at 09:34 UTC
Re: Regex matching end of sentence
by Anonymous Monk on Jan 31, 2013 at 10:17 UTC
Re: Regex matching end of sentence
by ww (Bishop) on Jan 31, 2013 at 18:20 UTC
    ...and what if the writer of the sentence(s) has (as I do) a weakness for using ellipsis?
    :)
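
For what it's worth, one way to approach the question without a single large regex is to walk the text token by token and close a sentence only when a token ends in '.', '!' or '?' and is not a known abbreviation. The following is only a minimal sketch under some assumptions: the contents of %abbreviations here are made up (the original hash is not shown), abbreviations are assumed to be stored with their trailing period, split_sentences is a hypothetical helper name, and an ellipsis is arbitrarily treated as not ending a sentence.

```perl
use strict;
use warnings;

# Hypothetical abbreviation list; the original %abbreviations is not shown.
# Keys are assumed to include the trailing period.
my %abbreviations = map { $_ => 1 } qw(Mr. Mrs. Dr. Prof. etc.);

# Hypothetical helper: returns the sentences found in $text.
sub split_sentences {
    my ($text) = @_;
    my @sentences;
    my $current = '';

    # split ' ' splits on runs of whitespace and ignores leading whitespace.
    for my $token (split ' ', $text) {
        $current .= ($current eq '' ? '' : ' ') . $token;

        # Candidate sentence end: ., ! or ? optionally followed by a
        # closing quote or parenthesis.
        next unless $token =~ /[.!?]["')]*\z/;

        # Assumption: an ellipsis ("Well...") does not end a sentence.
        next if $token =~ /\.{2,}["')]*\z/;

        # Known abbreviations such as "Mr." do not end a sentence.
        next if exists $abbreviations{$token};

        push @sentences, $current;
        $current = '';
    }

    # Keep any trailing text that has no terminator.
    push @sentences, $current if $current ne '';
    return @sentences;
}

# Count sentences the same way the original %hash is used.
my %hash;
$hash{$_}++ for split_sentences("I like Mr. Smith's dog. It barks at Dr. Jones! Why?");
print "$_\n" for sort keys %hash;
```

This will still get some cases wrong, for example a sentence that really does end with an abbreviation ("We invited Smith, Jones, etc.") or an ellipsis that closes a sentence, so the abbreviation list and the ellipsis rule would need tuning for real text.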
