Parenthesis Match

by fourmi (Scribe)
on May 27, 2004 at 08:08 UTC

fourmi has asked for the wisdom of the Perl Monks concerning the following question:

I want to capitalise the leading letter of each word, where words are delimited by hyphens, spaces, and the start of the string. The first piece of code works well, but I neglected to include the opening parenthesis in the set of word delimiters. I simply cannot get a match caught by my second effort, and random guesses hasn't worked either. Any ideas which piece I have wrong?
1st works well
s/(-| |^)(.)/$1\u$2/g;

2ndary effort
s/(-| |^|\()(.)/$1\u$2/g;

Replies are listed 'Best First'.
Re: Parenthesis Match
by EdwardG (Vicar) on May 27, 2004 at 08:31 UTC

    and random guesses hasn't worked either

    You obviously don't have enough monkeys.

    But seriously, here's what the documentation says about capitalisation.

    d:\>perldoc -q capitalize
    Found in D:\Perl\lib\pod\perlfaq4.pod
      How do I capitalize all the words on one line?

        To make the first letter of each word upper case:

            $line =~ s/\b(\w)/\U$1/g;

    There's also good information at perldoc -f boundary, which might help clarify your thinking about what constitutes a word boundary.


      Ahh, brilliant, thanks very much, have plenty of monkeys, but seeing as they failed will now have to go and spank them. much appreciated!
Re: Parenthesis Match
by Somni (Friar) on May 27, 2004 at 08:51 UTC

    First of all, it helps to have some decent code to test with; several inputs, and your various solutions for converting them.

    foreach my $string (
        'i am the very model of a modern major-general.',
        'this is one sentence. this is two.',
        'foo bar',
        'outside parens (inside them)',
        'outside brackets [inside them], now braces {inside}',
    ) {
        (my $two_capture = $string) =~ s/ (^|[-\s(]) (\w) / $1 . uc $2 /egx;
        (my $boundary    = $string) =~ s/ \b (\w) / uc $1 /egx;
        (my $alternation = $string) =~ s/ (-|\s|^|\() (.) / $1 . uc $2 /egx;

        print(
            "string: $string\n",
            "two_capture: $two_capture\n",
            "boundary: $boundary\n",
            "alternation: $alternation\n",
            "\n"
        );
    }

    While it's good you defined what you consider a word, it's probably a little too narrow. Consider the bracket and brace examples, and the ease of simply using \b, rather than defining every character you consider a word delimiter. Also, (.) is probably wrong; you want word characters you can upcase (\w), not just any character.

    Anyways, as the code shows, your attempts work just fine. What problem did you encounter when you tested them that made you think they didn't?

Re: Parenthesis Match
by BrowserUk (Pope) on May 27, 2004 at 08:49 UTC

    The problem with your regex is that if the open paren is preceded by a space, then your regex will match ' (' and attempt to upper the left paren. Having matched those two characters successfully, the next match attempt starts with the first character following the paren, and so will skip the <paren><lowercase> pairing.

    One way to fix this is to make sure that the regex doesn't match ' (', by only matching where the second character is a lowercase char.

    s/(-| |^|\()([a-z])/$1\u$2/g;
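    A minimal sketch of the difference between the two substitutions. The test string is made up for illustration and is not from the original post:

```perl
use strict;
use warnings;

# Hypothetical test string, not taken from the original question.
my $input = 'major-general (inside them)';

# Original attempt: after the space, (.) consumes the paren itself,
# so the letter following the paren is never examined as a word start.
(my $any_char = $input) =~ s/(-| |^|\()(.)/$1\u$2/g;

# Suggested fix: the second group only accepts a lowercase letter,
# so ' (' cannot match and the paren is free to act as the delimiter.
(my $lower_only = $input) =~ s/(-| |^|\()([a-z])/$1\u$2/g;

print "any char:   $any_char\n";     # Major-General (inside Them)
print "lower only: $lower_only\n";   # Major-General (Inside Them)
```

    Note that [a-z] only covers unaccented ASCII letters, which may or may not be enough for your data.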

    Perhaps a better way would be to use \b to detect the start of words.


    This is slightly different in that any lowercase character preceded by a non-word char will be uppered. This may or may not fit your definition of a word.
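    The \b snippet itself does not appear in the post as preserved here. One form consistent with the description above, offered only as an assumed reconstruction, would be:

```perl
use strict;
use warnings;

# Test string borrowed from another reply in this thread.
my $input = 'outside brackets [inside them], now braces {inside}';

# Assumed reconstruction: uppercase any lowercase ASCII letter
# that sits immediately after a word boundary.
(my $boundary = $input) =~ s/\b([a-z])/\u$1/g;

print "$boundary\n";   # Outside Brackets [Inside Them], Now Braces {Inside}
```

    With this form there is no need to list brackets, braces, or parens as delimiters, since any non-word character in front of the letter creates the boundary.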

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
Re: Parenthesis Match
by tune (Curate) on May 27, 2004 at 08:34 UTC
      cool, thanks a lot!
Re: Parenthesis Match
by Happy-the-monk (Canon) on May 27, 2004 at 08:35 UTC

    Your second piece of code   s/(-| |^|\()(.)/$1\u$2/g;   worked perfectly when I tested it. What kind of error are you encountering?

    Cheers, Sören

      Hmm okay, could be activestate's wperl interpretation going off then, thanks for checking that for me though!
      No specific error, just doesn't catch, will try a '\b' work around methinks!
Re: Parenthesis Match
by halley (Prior) on May 27, 2004 at 14:53 UTC
    Two notes about "capitalizing all the words on one line":

    Here'S An Example Which Isn'T Proper Form, But \B Wouldn'T'Ve Prevented
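    A small sketch of why this happens, plus one possible workaround. The lookbehind pattern is an assumption added for illustration, not something from the FAQ or from the code further down:

```perl
use strict;
use warnings;

my $input = "here's an example which isn't proper form";

# \b sees a word boundary after each apostrophe, so the letter
# that follows it is capitalised as well.
(my $boundary = $input) =~ s/\b(\w)/\u$1/g;

# Assumed alternative: only capitalise a word character that is not
# preceded by another word character or by an apostrophe.
(my $lookbehind = $input) =~ s/(?<![\w'])(\w)/\u$1/g;

print "$boundary\n";     # Here'S An Example Which Isn'T Proper Form
print "$lookbehind\n";   # Here's An Example Which Isn't Proper Form
```

    The trade-off is that a word which begins right after an apostrophe or an opening single quote is left in lowercase.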

    Also, if you're trying to perform capitalization for captions or titles, please remember that the English rules are more complex than just capitalizing every word. These vary between different prominent style guides, but this is the basic definition:

      First and Last words are always capitalized. Other words are NOT capitalized if they're not "important." A good guide for non-important words are articles, conjunctions and prepositions.

    This code is something I use on my own site. It's not particularly efficient or clever or even completely rigorous, but it makes most titles look right.

    sub titlecase {
        my @P = qw(a an the and or nor of with under over from for behind
                   on in beside at to within de del la las los);
        my $name = shift;
        my @words = split /[._ ]/, $name;
        my $particles = join('|', @P);
        foreach (@words) {
            $_ = ucfirst($_) if not /^($particles)$/i;
        }
        $words[0] = ucfirst($words[0]);
        $words[-1] = ucfirst($words[-1]);
        return join(' ', @words);
    }

    • All Quiet on the Western Front
    • Of Mice and Men
    • For Whom the Bell Tolls
    • Histoire de St. Louis, Roi de France
    • Crossing Over

    Of course, it leads you back to your original question: what is a word? My code would not handle 'Tis properly, nor parentheses.

    [ e d @ h a l l e y . c c ]
