PerlMonks  

regex for text manipulation

by megaurav2002 (Monk)
on Nov 07, 2007 at 05:49 UTC (#649399, perlquestion)
megaurav2002 has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I am trying to write a regex that converts the first letter of a string, and any letter following a full stop, to uppercase, and all other characters to lowercase. For example:
"today is wednesday.tomorrow IS THURSDAY." => "Today is wednesday.Tomorrow is thursday."
Any help will be appreciated.

Thanks, Gaurav
"Wisdom begins in wonder" - Socrates, philosopher

Re: regex for text manipulation
by graff (Chancellor) on Nov 07, 2007 at 06:27 UTC
    What have you tried so far? Do you know about the built-in functions ucfirst and lc ? You might also find the \L regex operator useful:
    my @words = qw/mY WoRd THIS is A mEsS/;
    for ( @words ) {
        (my $downcased = $_) =~ s/(\w+)/\L$1/;
        print join( " ", $_, $downcased, ucfirst(lc), uc ), "\n";
    }
    __OUTPUT__
    mY my My MY
    WoRd word Word WORD
    THIS this This THIS
    is is Is IS
    A a A A
    mEsS mess Mess MESS
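As a rough sketch of how those two built-ins could be applied to the original question: split the string on full stops, run each piece through lc and ucfirst, and join the pieces back together. The helper name sentence_case is made up for illustration, and the sketch assumes the letter to capitalize comes immediately after the full stop (no intervening whitespace), as in the asker's example.

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase everything, then capitalize the start of
# each full-stop-delimited chunk using the built-ins lc and ucfirst.
sub sentence_case {
    my ($text) = @_;
    # The -1 limit keeps trailing empty fields, so a final full stop survives.
    return join '.', map { ucfirst( lc($_) ) } split /\./, $text, -1;
}

print sentence_case("today is wednesday.tomorrow IS THURSDAY."), "\n";
# prints: Today is wednesday.Tomorrow is thursday.
```

If a chunk starts with whitespace (". tomorrow"), ucfirst leaves it unchanged, so that case needs a regex-based approach instead.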
Re: regex for text manipulation
by snoopy (Deacon) on Nov 07, 2007 at 06:39 UTC
    Here's one possible solution:
    #!/usr/bin/perl
    use warnings;
    use strict;

    foreach (<DATA>) {
        s{
            (^|\.)       # Start of string, or last full-stop
            (.*?)        # Non-alpha characters (non-greedy)
            ([a-zA-Z])   # First alphabetic character
            ([^\.]*)     # Through to next full stop
        }{
            $1.$2.uc($3).lc($4)
        }gex;
        print;
    }
    __DATA__
    today is wednesday.tomorrow IS THURSDAY.
    I am trying to write a regex to convert the first letter and any letter following a full stop in a string to uppercase .and . all the other characters to lowercase.
Re: regex for text manipulation
by Praveen (Friar) on Nov 07, 2007 at 07:16 UTC
    Try This
    $str = "today is wednesday.tomorrow IS THURSDAY.";
    print "$str\n";
    $str =~ s/(.*)/\L$1/gis;
    $str =~ s/\.(\w)/\.\u$1/gis;
    $str =~ s/\A(\w)/\u$1/gis;
    print "\n$str\n";
Re: regex for text manipulation
by gube (Parson) on Nov 07, 2007 at 07:27 UTC
    #!/usr/local/bin/perl
    my $str = "today is wednesday.tomorrow IS THURSDAY";
    $str =~ s/^(\w)(.*?)([.])(\w)(.*)/uc($1).lc($2).lc($3).uc($4).lc($5)/gei;
    print $str;
    o/p: Today is wednesday.Tomorrow is thursday

      Hi,

      The third capture is just the dot, so why is lc($3) needed? I think $3 alone is enough:

      $str =~ s/^(\w)(.*?)([.])(\w)(.*)/uc($1).lc($2).($3).uc($4).lc($5)/gei;
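
Note that because this pattern is anchored with ^, it can match only once, so it capitalizes after the first full stop and nothing after later ones. A minimal sketch of one way to handle any number of full stops is to lowercase the whole string first and then uppercase each letter that follows the start of the string or a full stop. The helper name cap_after_stops is hypothetical, and the third sentence in the sample input is invented for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical helper; generalizes the idea to any number of full stops.
sub cap_after_stops {
    my ($text) = @_;
    $text = lc $text;                  # lowercase everything first
    $text =~ s/(^|\.)(\w)/$1\u$2/g;    # uppercase after string start or "."
    return $text;
}

print cap_after_stops("today is wednesday.tomorrow IS THURSDAY.next IS friday"), "\n";
# prints: Today is wednesday.Tomorrow is thursday.Next is friday
```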

Re: regex for text manipulation
by Anonymous Monk on Nov 07, 2007 at 09:32 UTC

    C:\@Work\Perl>perl -wMstrict -e "print qq(\no/p: \n); for (@ARGV) { print qq(\"$_\" \n); s{ ( (?: \A | [.!?]) \s*)? ([a-zA-Z]) } { ($1 || '') . (defined $1 ? uc $2 : lc $2) }egxms; print qq('$_' \n\n) }" "today is wednesday.tomorrow IS THURSDAY." " is tODAY wednesday? yes, but yesterDAY was tuesday! i See. " "i thought today was... never mind." "what a day!"

    o/p:
    "today is wednesday.tomorrow IS THURSDAY."
    'Today is wednesday.Tomorrow is thursday.'

    " is tODAY wednesday? yes, but yesterDAY was tuesday! i See. "
    ' Is today wednesday? Yes, but yesterday was tuesday! I see. '

    "i thought today was... never mind."
    'I thought today was... Never mind.'

    "what a day!"
    'What a day!'
Re: regex for text manipulation
by KurtSchwind (Hermit) on Nov 08, 2007 at 03:43 UTC
    Just a potential gotcha to think about: if you capitalize the first letter after every period, you have to watch out for abbreviations.
    You might turn something like
    "Which St. do you live on?" into "Which St. Do you live on?"
    I'm sure there are other examples, but you get what I mean, right?
    --
    I used to drive a Heisenbergmobile, but every time I looked at the speedometer, I got lost.
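
One rough way to guard against the abbreviation problem is to check the word before each full stop against a list of known abbreviations and leave the following letter alone when it matches. The sketch below is only an illustration under stated assumptions: the abbreviation list is a tiny made-up sample, the helper name cap_sentences is hypothetical, only "." is treated as a sentence end, and the rest of the text is left as typed so that "St." keeps its capital.

```perl
use strict;
use warnings;

# Hypothetical, deliberately tiny abbreviation list (lowercase keys).
my %abbrev = map { $_ => 1 } qw(st mr mrs dr etc);

sub cap_sentences {
    my ($text) = @_;
    # Capitalize the very first letter of the string.
    $text =~ s/\A(\s*)([a-z])/$1\u$2/;
    # $1 = word before the full stop, $2 = full stop plus any whitespace,
    # $3 = the lowercase letter that follows.
    $text =~ s{(\w*)(\.\s*)([a-z])}{
        $abbrev{lc $1} ? "$1$2$3" : "$1$2\u$3"
    }gex;
    return $text;
}

print cap_sentences("which St. do you live on. the one near the park."), "\n";
# prints: Which St. do you live on. The one near the park.
```

This heuristic still gets it wrong when an abbreviation really does end a sentence, and it would capitalize inside forms such as "e.g.", so treat it as a starting point rather than a fix.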
