Newbie Regex Problem

by Anonymous Monk
on Oct 19, 2007

Hello All

I am trying to use a regex to return the text "MyPc" in the example below. Instead of getting just MyPc, I am getting "MyPc,OU=MyOu,DC=NyDomain". A quick hand would be greatly appreciated.
$test="LDAP://CN=MyPc,OU=MyOu,DC=NyDomain,DC=com";
if ($test =~ "LDAP://CN=(.*),.*") {
    print "$1\n";
}
Thank you.

Re: Newbie Regex Problem
by Joost (Canon) on Oct 19, 2007 at 21:20 UTC
    As others have noted, * is greedy: it matches as much as it can. Applied to . that means as many characters as possible (except newlines; see perlretut for the details).

    You can either make your .* match non-greedily, using .*? which makes it match the shortest substring that makes the overall match work, or you can explicitly exclude the , character from the part you want to match:

    if ($test =~ m#LDAP://CN=([^,]*)#) { }
    Note that I used m# ... # syntax instead of "" quotes, since using a plain string as a regex can cause additional headaches, and using the usual // delimiters requires you to escape the / characters in your match. See perlop (Quote and Quote-like Operators) for more information on Perl's special quoting operators.

    updated: forgot to add the * to [^,]*
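
To see the difference side by side, here is a minimal sketch that runs the greedy, non-greedy, and negated-class patterns from this thread against the original string. The variable names ($greedy, $lazy, $negated) are only illustrative.

```perl
use strict;
use warnings;

my $test = "LDAP://CN=MyPc,OU=MyOu,DC=NyDomain,DC=com";

# Greedy: .* grabs everything, then backs off only as far as the LAST comma.
my ($greedy) = $test =~ m{LDAP://CN=(.*),};

# Non-greedy: .*? grows one character at a time and stops at the FIRST comma.
my ($lazy) = $test =~ m{LDAP://CN=(.*?),};

# Negated class: [^,]* can never cross a comma, so no trailing comma is needed.
my ($negated) = $test =~ m{LDAP://CN=([^,]*)};

print "greedy:  $greedy\n";    # MyPc,OU=MyOu,DC=NyDomain
print "lazy:    $lazy\n";      # MyPc
print "negated: $negated\n";   # MyPc
```

The greedy result is exactly the unwanted output from the original question; the other two both give MyPc.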

Re: Newbie Regex Problem
by almut (Canon) on Oct 19, 2007 at 21:11 UTC

    You need a non-greedy match:

    if ($test =~ "LDAP://CN=(.*?),") {

    (note the question mark)

Re: Newbie Regex Problem
by andyford (Curate) on Oct 19, 2007 at 21:12 UTC

    The '*' is greedy, matching as much as it can. To make it non-greedy, put a question mark after it.

    if ($test =~ "LDAP://CN=(.*?),.*") {

    non-Perl: Andy Ford

Re: Newbie Regex Problem
by dsheroh (Monsignor) on Oct 19, 2007 at 21:24 UTC
    Alternatively, instead of going non-greedy, you could specify more exactly what you're looking for, namely that you only want non-comma characters:

    if ($test =~ "LDAP://CN=([^,]*),") {
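
The two approaches give the same answer here, but they are not interchangeable once more pattern follows the capture. The sketch below uses a hypothetical stricter pattern (requiring ",DC=" right after the name, which nobody in the thread asked for) purely to show the difference: .*? will stretch across commas to make the overall match succeed, while [^,]* refuses to.

```perl
use strict;
use warnings;

my $test = "LDAP://CN=MyPc,OU=MyOu,DC=NyDomain,DC=com";

# Hypothetical pattern for illustration: the capture must be followed by ",DC=".
# In list context a failed match returns an empty list, leaving the variable undef.
my ($lazy)    = $test =~ m{LDAP://CN=(.*?),DC=};
my ($negated) = $test =~ m{LDAP://CN=([^,]*),DC=};

# .*? keeps growing past ",OU=MyOu" until ",DC=" follows.
print "lazy:    ", (defined($lazy)    ? $lazy    : "(no match)"), "\n";

# [^,]* stops at the first comma, where ",OU=" follows instead, so the match fails.
print "negated: ", (defined($negated) ? $negated : "(no match)"), "\n";
```

With the negated class a mismatch shows up as a failed match rather than as a capture that silently contains extra fields.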

Re: Newbie Regex Problem
by FunkyMonk (Chancellor) on Oct 19, 2007 at 21:12 UTC
    * is greedy, which means it will match the longest substring it can. You can force * to be non-greedy by following it with ?
    if ($test =~ m{LDAP://CN=(.*?),}) {

    You'll also note that I've taken out the last .*, as it serves no purpose, and changed your string to a regexp.

    See perlre and perlretut for more information.
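
One reason to prefer a real regexp over a double-quoted string: the string is processed as a string first, so backslash sequences can be altered before the regex engine ever sees them. A small sketch, using a made-up input that is unrelated to the LDAP example:

```perl
use strict;
use warnings;

my $input = "abc123";

# Regex literal: \d+ reaches the regex engine intact and matches the digits.
my $literal_matches = ($input =~ m{\d+}) ? 1 : 0;

# Double-quoted string: "\d" is not a string escape, so Perl drops the
# backslash (and warns about it). The pattern that is left is "d+".
my $pattern = "\d+";
my $string_matches = ($input =~ $pattern) ? 1 : 0;

print "pattern as stored: $pattern\n";
print "literal matches:   $literal_matches\n";
print "string matches:    $string_matches\n";
```

The original pattern had no backslashes, so it happened to work as a string, but m{...} avoids the trap altogether.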

Re: Newbie Regex Problem
by omouse (Initiate) on Oct 20, 2007 at 15:28 UTC
    If $test isn't part of a larger string, you can just slice it instead of using a regular expression.
    $test="LDAP://CN=MyPc,OU=MyOu,DC=NyDomain,DC=com";
    $CN = substr($test, 10, index($test, ",") - 10);
    print "$CN\n";
    The 10 is the length of "LDAP://CN=", which you can skip since you know it'll always be there; index then finds the first comma, which is where the slice ends.
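
A slightly more defensive sketch of the same idea, in case the input is not always well formed. The helper name extract_cn is hypothetical, not something from the thread. It checks for the prefix and for the comma, because index returns -1 when nothing is found, and substr treats a negative length as "leave that many characters off the end" rather than as an error.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the CN value, or undef if the string does not
# start with the expected prefix or has no comma after it.
sub extract_cn {
    my ($dn) = @_;
    my $prefix = "LDAP://CN=";

    return undef unless index($dn, $prefix) == 0;   # prefix must be at the start

    my $start = length($prefix);                    # 10 for this prefix
    my $comma = index($dn, ",", $start);            # first comma after the prefix
    return undef if $comma < 0;                     # index gives -1 when not found

    return substr($dn, $start, $comma - $start);
}

print extract_cn("LDAP://CN=MyPc,OU=MyOu,DC=NyDomain,DC=com"), "\n";   # MyPc
```

Computing the offset with length($prefix) instead of hard-coding 10 also keeps the code correct if the prefix ever changes.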

