Regex Optimization Question

by gawatkins (Monsignor)
on Mar 23, 2006 at 18:47 UTC (#538822=perlquestion)
gawatkins has asked for the wisdom of the Perl Monks concerning the following question:

I have a set of strings returned from a function depicting Active Directory user accounts. The string is in the form of:

Win32_Account.Domain="ADDomain",Name="aduser1"
Win32_Account.Domain="ADDomain",Name="aduser2"
Win32_Account.Domain="ADDomain",Name="aduser3"
Win32_Account.Domain="ADDomain",Name="aduser4"
I would like to change the format to read:
ADDomain\aduser1
ADDomain\aduser2
ADDomain\aduser3
ADDomain\aduser4
so that the accounts are easy to read on a report. I am currently using the following code to perform this task.
#!/usr/bin/perl -w
use strict;
use warnings;

my $string = q/ Win32_Account.Domain="ADDomain",Name="aduser1"/;
my ( $domain, $uname ) = split( ",", $string );
$domain =~ s/\"//ig;
$domain =~ s/Win32_Account.Domain=//ig;
$uname =~ s/\"//ig;
$uname =~ s/Name=//ig;

#concat the parts or replace with Account Deleted if SID not found
my $user = ( $uname eq "" ? "Account Deleted" : $domain . "\\" . $uname );
print $user;
Although the code works, is there a more efficient way to do the work? Any advice would be welcomed.

Thank you,
Greg W.

Re: Regex Optimization Question
by duff (Vicar) on Mar 23, 2006 at 18:54 UTC

    I think I'd probably do it like this:

    $string =~ /Win32_Account.Domain="(.*?)",Name="(.*?)"/;
    print $2 eq "" ? "Account Deleted" : "$1\\$2";
    But I don't know if that's more efficient as you haven't told us what axis of efficiency you're interested in. There's efficiency in execution time, efficiency in memory usage, efficiency in programmer time, etc. Generally programmer time is the most important thing to optimize for, so if you understand my version and it does what you want, then maybe it's more efficient. :-)
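If execution time is the axis of interest, the core Benchmark module can compare the approaches directly. The following is only a sketch: the sub names `split_and_strip` and `single_regex` are made up for illustration, the sample record has no leading space, and the iteration count is arbitrary. Timings will vary by machine and Perl version, so none are quoted here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $record = 'Win32_Account.Domain="ADDomain",Name="aduser1"';

# Hypothetical name: the split-and-substitute approach from the question.
sub split_and_strip {
    my ($string) = @_;
    my ( $domain, $uname ) = split /,/, $string;
    $uname = '' unless defined $uname;    # no comma in the input
    $domain =~ s/"//g;
    $domain =~ s/Win32_Account\.Domain=//;
    $uname  =~ s/"//g;
    $uname  =~ s/Name=//;
    return $uname eq '' ? 'Account Deleted' : "$domain\\$uname";
}

# Hypothetical name: the single-regex approach, with the match checked first.
sub single_regex {
    my ($string) = @_;
    if ( $string =~ /Win32_Account\.Domain="(.*?)",Name="(.*?)"/ ) {
        return $2 eq '' ? 'Account Deleted' : "$1\\$2";
    }
    return 'Account Deleted';
}

# Run each version the same number of times and print a comparison table.
cmpthese( 200_000, {
    split_and_strip => sub { split_and_strip($record) },
    single_regex    => sub { single_regex($record) },
} );
```

Both subs should return the same string for the same input, so any difference in the table reflects speed only.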
      Using $1, $2, etc without knowing if the regexp matched is dangerous. For example,
      foreach ('foo', 'bar') {
          /(oo)/;
          print("$1\n");
      }
      outputs
      oo
      oo
        True. I've found that in these cases, doing the regex with pattern memory in its own code block works.
        foreach ('foo', 'bar') {
            {
                /(oo)/;
                print("$1\n");
            }
        }
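A more direct guard than relying on block scoping is to test whether the match succeeded before reading any capture. Here is a minimal sketch using the same 'foo'/'bar' example; `first_oo` is a hypothetical helper name, not something from the posts above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the captured text, or undef when nothing matched.
sub first_oo {
    my ($text) = @_;
    # In list context a successful match returns its captures;
    # a failed match returns an empty list, so $hit stays undef.
    my ($hit) = $text =~ /(oo)/;
    return $hit;
}

for my $word ( 'foo', 'bar' ) {
    my $hit = first_oo($word);
    print defined $hit ? "$word: $hit\n" : "$word: no match\n";
}
```

Because the capture is copied into a lexical variable only on success, a stale $1 from an earlier match cannot leak into the result.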
      "There's efficiency in execution time, efficiency in memory usage, efficiency in programmer time, etc. Generally programmer time is the most important thing to optimize for,"
      IMO, it's actually very, very rare to find a situation where you can pinpoint "the most important thing to optimize for".

      In fact, I cannot think of any.

      Generally, they are all important, and it takes a good programmer/manager/project leader to strike the right balance.

        Indeed. However, not having any information about where to focus on efficiencies, you can't go wrong on "programmer time" as a sane default thing to optimize. That's why it's generally the most important thing and not specifically the most important thing. :-)

Re: Regex Optimization Question
by Roy Johnson (Monsignor) on Mar 23, 2006 at 18:57 UTC
    This seems to do basically what you want. I just extract what's between quotes and join them together with a backslash. Updated to handle "Account Deleted".
    use strict;
    use warnings;

    my $string = q/ Win32_Account.Domain="ADDomain",Name="aduser1"/;
    for my $string (q/ Win32_Account.Domain="ADDomain",Name="aduser1"/,
                    q/ Win32_Account.Domain="ADDomain",Name=""/) {
        my @fields = $string =~ /"(.*?)"/g;
        printf "New string is (%s)\n",
            $fields[1] eq '' ? 'Account deleted' : join '\\', @fields;
    }

    Caution: Contents may have been coded under pressure.
Re: Regex Optimization Question
by davidrw (Prior) on Mar 23, 2006 at 18:59 UTC
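    Can you just do it in one regex (I'm going to assume the domain and name don't have double quotes)?
    my $string = q/ Win32_Account.Domain="ADDomain",Name="aduser1"/;
    # The outer parentheses keep the whole ?: expression as print's argument;
    # "print (...) ? ... : ..." would print only the parenthesized match result.
    print( ( $string =~ /Win32_Account.Domain="(.+?)",Name="(.+?)"/i )
        ? $1 . "\\" . $2
        : "Account Deleted" );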
Re: Regex Optimization Question
by ikegami (Pope) on Mar 23, 2006 at 19:59 UTC

    #/usr/bin/perl -w
    should be
    #!/usr/bin/perl -w

    Your -w is being ignored because of the missing !.

      Thank you, I lost my line during the cut/paste exercise and had to retype it. The -w is equivalent to the use warnings; isn't it? I often just leave the #! line out from my win32 scripts.

      Thank you,
      Greg W.
        It's not quite the same. -w is actually equivalent to BEGIN { $^W = 1; }. The biggest difference is that -w will affect included modules, whereas use warnings; only affects the current scope (block or file).
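The scoping difference can be seen in a small experiment. This is a sketch under stated assumptions: `count_warnings`, `legacy_concat` and `lexical_concat` are hypothetical names, and setting `local $^W` at run time stands in for the -w switch. It also assumes the file itself has no `use warnings;` at the top, so that `legacy_concat` is compiled outside the pragma.

```perl
use strict;

# Count the warnings emitted while running a code reference.
sub count_warnings {
    my ($code) = @_;
    my $count = 0;
    local $SIG{__WARN__} = sub { $count++ };
    $code->();
    return $count;
}

# Compiled with no warnings pragma in scope, like an older module:
# whether it warns is decided by $^W at run time.
sub legacy_concat { my $x; return "value: " . $x }

{
    use warnings;
    # Compiled under the lexical pragma: $^W is ignored here.
    sub lexical_concat { my $x; return "value: " . $x }
}

printf "%-28s %d warning(s)\n", 'legacy sub, $^W = 0:',
    count_warnings( sub { local $^W = 0; legacy_concat() } );
printf "%-28s %d warning(s)\n", 'legacy sub, $^W = 1:',
    count_warnings( sub { local $^W = 1; legacy_concat() } );
printf "%-28s %d warning(s)\n", 'lexical sub, $^W = 0:',
    count_warnings( sub { local $^W = 0; lexical_concat() } );
```

The sub compiled outside any pragma follows the global flag, while the one compiled under `use warnings;` follows only its own lexical scope.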
Re: Regex Optimization Question
by graff (Chancellor) on Mar 24, 2006 at 02:38 UTC
    If the quotation marks are really reliable, you could do it like this:
    my @strings = (
        qq/Win32_Account.Domain="ADDomain",Name="aduser1"/,
        qq/Win32_Account.Domain="foobar",Name=""/,
        qq/Win32_Account.Domain="ADDomain",Name="aduser2"/,
    );
    for ( @strings ) {
        my $user = join( "\\", (split /"/)[1,3] );
        $user = "Account Deleted" unless ( $user =~ /\S+\\\S+/ );
        print "$user\n";
    }
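One detail of the split approach is worth spelling out: without a LIMIT argument, split drops trailing empty fields, so for a record ending in Name="" the slice index 3 is undef, and join will complain about an uninitialized value if warnings are enabled. A possible variation, offered only as a sketch (`account_from_split` is a hypothetical name), passes a LIMIT of -1 to keep the empty fields and checks the two parts explicitly:

```perl
use strict;
use warnings;

# Hypothetical helper: build "Domain\Name" from one record using split.
sub account_from_split {
    my ($record) = @_;
    # LIMIT of -1 keeps trailing empty fields, so Name="" yields '' rather than undef.
    my ( $domain, $name ) = ( split /"/, $record, -1 )[ 1, 3 ];
    return 'Account Deleted'
        unless defined $domain && length $domain
            && defined $name   && length $name;
    return "$domain\\$name";
}

for my $record (
    'Win32_Account.Domain="ADDomain",Name="aduser1"',
    'Win32_Account.Domain="foobar",Name=""',
) {
    print account_from_split($record), "\n";
}
```

Like the original, this still assumes the quotation marks in each record are reliable.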
