PerlMonks  

unpacking wmic command's unicode output

by goibhniu (Hermit)
on Nov 11, 2008 at 19:38 UTC ( [id://722956]=perlquestion )

goibhniu has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to use unpack (or let me know if there's a better tool) to get at just a few columns of data from the output of Windows' wmic command:

Here's the header line only:

C:\CHAS_S~1\COLUMN~1> wmic process|find /i "Caption"
Caption CommandLine CreationClassName CreationDate CSCreationClassName CSName Description ExecutablePath ExecutionState Handle HandleCount InstallDate KernelModeTime MaximumWorkingSetSize MinimumWorkingSetSize Name OSCreationClassName OSName OtherOperationCount OtherTransferCount PageFaults PageFileUsage ParentProcessId PeakPageFileUsage PeakVirtualSize PeakWorkingSetSize Priority PrivatePageCount ProcessId QuotaNonPagedPoolUsage QuotaPagedPoolUsage QuotaPeakNonPagedPoolUsage QuotaPeakPagedPoolUsage ReadOperationCount ReadTransferCount SessionId Status TerminationDate ThreadCount UserModeTime VirtualSize WindowsVersion WorkingSetSize WriteOperationCount WriteTransferCount

I only want Caption, ParentProcessId, ProcessId and CommandLine from this.

It seemed to be fixed-width data instead of delimited data, but I looked in a hex editor to see if those weren't tab delimiters. It turns out to be worse than that: everything is unicode:

C:\CHAS_S~1\COLUMN~1> perl -ne "print" header.bin
 ■C a p t i o n C o m m a n d L i n e C r e a t i o n C l a s s N a m e C r e a t i o n D a t e C S C r e a t i o n C l a s s N a m e C S N a m e D e s c r i p t i o n E x e c u t a b l e P a t h E x e c u t i o n S t a t e H a n d l e H a n d l e C o u n t I n s t a l l D a t e K e r n e l M o d e T i m e M a x i m u m W o r k i n g S e t S i z e M i n i m u m W o r k i n g S e t S i z e N a m e O S C r e a t i o n C l a s s N a m e O S N a m e O t h e r O p e r a t i o n C o u n t O t h e r T r a n s f e r C o u n t P a g e F a u l t s P a g e F i l e U s a g e P a r e n t P r o c e s s I d P e a k P a g e F i l e U s a g e P e a k V i r t u a l S i z e P e a k W o r k i n g S e t S i z e P r i o r i t y P r i v a t e P a g e C o u n t P r o c e s s I d Q u o t a N o n P a g e d P o o l U s a g e Q u o t a P a g e d P o o l U s a g e Q u o t a P e a k N o n P a g e d P o o l U s a g e Q u o t a P e a k P a g e d P o o l U s a g e R e a d O p e r a t i o n C o u n t R e a d T r a n s f e r C o u n t S e s s i o n I d S t a t u s T e r m i n a t i o n D a t e T h r e a d C o u n t U s e r M o d e T i m e V i r t u a l S i z e W i n d o w s V e r s i o n W o r k i n g S e t S i z e W r i t e O p e r a t i o n C o u n t W r i t e T r a n s f e r C o u n t
C:\CHAS_S~1\COLUMN~1>

So I tried to teach myself pack and unpack real quick. This reminds me of the first time I ran into Regular Expressions; the learning curve seems rather steep.

I couldn't get the W pattern to work, it turns out because I'm on 5.8.8 instead of 5.10 (and so is the prod server it will run on).

Now I'm at:

C:\CHAS_S~1\COLUMN~1> perl -ne "($caption,$commandline)=unpack('@2U[42] U[270]',$_);print $caption;" header.bin
67

which is sorta correct (67 is 'C'), but what I want is the whole word. A and a aren't quite it either:
C:\CHAS_S~1\COLUMN~1> perl -ne "($caption,$commandline)=unpack('@2A[42] A[270]',$_);print $caption;" header.bin
C a p t i o n
C:\CHAS_S~1\COLUMN~1> perl -ne "($caption,$commandline)=unpack('@2a[42] a[270]',$_);print $caption;" header.bin
C a p t i o n

Either how do I get U to give me something readable instead of a code, or how do I get print to turn 'C a p t i o n' into 'Caption'?


#my sig used to say 'I humbly seek wisdom. '. Now it says:
use strict;
use warnings;
I humbly seek wisdom.

Replies are listed 'Best First'.
Re: unpacking wmic command's unicode output
by almut (Canon) on Nov 11, 2008 at 19:50 UTC

    It's probably easier to open the file as UTF-16

    open my $fh, "<:encoding(UTF-16)", $filename or ...

    and then operate on the resulting text string as usual...

    (Use UTF-16LE, in case the file should have no BOM.)
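For data that is already in a string rather than a file, the same idea can be sketched as follows. This is only an illustration: `$bytes` is made-up sample data standing in for the first bytes of header.bin, and the variable names are hypothetical. It shows both an unpack route (16-bit little-endian values via `v`, which should also be available on 5.8.8) and the Encode route.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Made-up stand-in for the start of header.bin: a little-endian BOM
# followed by UTF-16LE text.
my $bytes = "\xFF\xFE" . encode('UTF-16LE', 'Caption  CommandLine');

# Skip the 2-byte BOM, read 16-bit little-endian values, and turn each
# code back into a character (fine for the BMP, no surrogate handling).
my $via_unpack = join '', map { chr($_) } unpack('@2 v*', $bytes);

# Or let Encode do it; plain 'UTF-16' reads the BOM to pick the byte order.
my $via_decode = decode('UTF-16', $bytes);

print "$via_unpack\n";
print "$via_decode\n";
```

Once the text is decoded, the A and a templates see one character per column position instead of a letter followed by a NUL byte.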

      That's a good idea and I may resort to that. I had it in my head to get this piped from the output. Can I open a pipe as UTF-16?

      Update: I was thinking of a syntax like:
      wmic process | find /i "myprog.exe" | getmyfields.pl
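One way this might look, as a rough sketch: put the encoding layer on the handle with binmode instead of open. It assumes the bytes arriving on the handle really are UTF-16LE, which this thread does not establish for wmic output that has passed through find. `read_utf16le_lines` is a hypothetical helper name, and the demo uses an in-memory handle with made-up bytes so it can run anywhere.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Read every line from a handle whose bytes are UTF-16LE, returning
# decoded lines without the BOM or the CRLF line ending.
sub read_utf16le_lines {
    my ($fh) = @_;
    # :raw first, so no CRLF translation touches the two-byte code units.
    binmode($fh, ':raw:encoding(UTF-16LE)') or die "binmode failed: $!";
    my @lines;
    while (defined(my $line = <$fh>)) {
        $line =~ s/^\x{FEFF}// unless @lines;   # drop a leading BOM
        $line =~ s/\r?\n\z//;
        push @lines, $line;
    }
    return @lines;
}

# Demo on an in-memory handle holding made-up sample bytes; in a pipeline
# the call would be read_utf16le_lines(\*STDIN) instead.
my $sample = "\xFF\xFE"
           . encode('UTF-16LE', "Caption  ProcessId\r\nperl.exe 5678\r\n");
open my $in, '<', \$sample or die "cannot open in-memory handle: $!";
print "$_\n" for read_utf16le_lines($in);
close $in;
```

If the piped data turns out to be in the console code page instead, the encoding name in the binmode call is the only part that would need to change.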


Re: unpacking wmic command's unicode output
by cmv (Chaplain) on Nov 13, 2008 at 18:32 UTC
    goibhniu-

    I wrestled with wmic a while back and came up with a very perl-friendly way to access the data.

    The key is using list /format:value on the command line.

    -Craig

    use strict;
    use Tk;
    use Data::Dumper;

    my %DATA;

    # Get info via wmic (not available on all versions of windows)...
    my @data = split(/^\s*\cM\n/m, `wmic process list /format:value`);
    shift(@data); pop(@data); # Remove first/last blank lines
    my %procs;

    # Iterate through each element...
    foreach my $p (@data) {
        $p =~ s/\cM//g; # Grrrr, rotten windows
        my %child = split(/[=\n]/, $p); # Hashify information
        $procs{$child{ProcessId}} = \%child;
    }
    print Dumper(\%procs), "\n";

      Very good, but it doesn't quite meet our use case. Our Admin knows the exe name he's looking for, but there could be several running at one time differentiated by the command line arguments. Task Manager shows them all, but not the command line arguments, so the admin doesn't know which to kill. wmic process list full produces output similar to your /format:value, so we did things like wmic process list full | find /i "ourprog.exe", but when more than one was running, couldn't tell the pid (because it's on another line). Just plain wmic process put all the data on one line so that's what I was thinking about and it colored my design space.

      I actually like your hash-ification better, and if I were writing a larger program (I see you have Tk in your use statements) that's exactly how I'd go.

      Thanks again.
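For the use case described above (several instances of one exe, told apart by command line), the hash approach could be narrowed down roughly like this. It is a sketch, not cmv's code: `parse_value_records` and `find_by_name` are hypothetical helpers, the sample text is invented, and it assumes the Key=Value records are separated by blank lines. It splits each line on the first `=` only, since a command line may itself contain `=`.

```perl
use strict;
use warnings;

# Parse "Key=Value" records separated by blank lines (the assumed shape of
# `wmic process list /format:value` output) into { pid => { field => value } }.
sub parse_value_records {
    my ($raw) = @_;
    $raw =~ s/\r//g;
    my %procs;
    for my $block (split /\n{2,}/, $raw) {
        my %rec;
        for my $line (split /\n/, $block) {
            # Limit of 2: command lines may themselves contain '='.
            my ($key, $value) = split /=/, $line, 2;
            $rec{$key} = $value if defined $value;
        }
        $procs{ $rec{ProcessId} } = \%rec if defined $rec{ProcessId};
    }
    return \%procs;
}

# Return the records whose Name matches, case-insensitively, sorted by pid.
sub find_by_name {
    my ($procs, $name) = @_;
    return sort { $a->{ProcessId} <=> $b->{ProcessId} }
           grep { lc($_->{Name} || '') eq lc($name) } values %$procs;
}

# Made-up sample; a real run would pass in the captured wmic output.
my $raw = join "\r\n",
    '', 'CommandLine=ourprog.exe --job=alpha', 'Name=ourprog.exe', 'ProcessId=200',
    '', 'CommandLine=other.exe',               'Name=other.exe',   'ProcessId=300',
    '', 'CommandLine=ourprog.exe --job=beta',  'Name=OurProg.exe', 'ProcessId=100',
    '', '';

printf "%s\t%s\n", $_->{ProcessId}, $_->{CommandLine}
    for find_by_name(parse_value_records($raw), 'ourprog.exe');
```

Each output line pairs a pid with its full command line, which is the piece that find on multi-line output could not provide.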


Re: unpacking wmic command's unicode output
by goibhniu (Hermit) on Nov 12, 2008 at 17:25 UTC

    I learned some things while working on this little problem, so I thought I'd share. update: changed $ARGV[0] for @ARGV per discussion.

    The first thing is that wmi doesn't necessarily use the same fixed headers on different machines. My first attempt was a simple batch file and worked on my machine:


    but not on either the machine of the admin I was doing this for or on the prod machine he needed it for. The static unpack template was off for the set of fields wmi was using on the other machines (to be semantically honest I didn't investigate enough to blame this on wmi; it was just off).

    So, I ended up turning it inside-out. Instead of using perl in a batch file I used the wmic command in a perl script. I ended up opening a file (well, pipe) like almut suggested, but using BrowserUK's command line option to cover the unicode issue.

    If I were to spend more time on it I would get the common code (to get the position of a field and also to construct the output lines) out into functions. Also, I've rethunk how to get the length of the fields by looking for something like /\G(\b)/gc, but it's a fairly straightforward little throw-away script, so I'm not going to spend that much more time on it.
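A function along those lines might look like the sketch below. It is not the script from this post: `column_layout` is a hypothetical name, the header is a shortened made-up one, and it uses a plain `/(\S+)\s*/g` scan with `@-` and `@+` rather than the `\G` idea mentioned above. It assumes column names contain no spaces and that each column runs up to the start of the next name.

```perl
use strict;
use warnings;

# Map each column name in a fixed-width header line to [start, width].
# A column runs from the first character of its name up to the start of
# the next name, so the padding after the name counts toward the width.
sub column_layout {
    my ($header) = @_;
    my %layout;
    while ($header =~ /(\S+)\s*/g) {
        my $start = $-[0];
        $layout{$1} = [ $start, $+[0] - $start ];
    }
    return \%layout;
}

# Hypothetical, shortened header; real wmic output has many more columns.
my $header = sprintf '%-13s%-17s%-11s', qw(Caption ParentProcessId ProcessId);
my $layout = column_layout($header);

printf "%-16s start=%2d width=%2d\n", $_, @{ $layout->{$_} }
    for sort { $layout->{$a}[0] <=> $layout->{$b}[0] } keys %$layout;
```

Because the layout is computed from whatever header the machine produces, a differing set of fields on another machine would not throw the offsets off. The last column's width is only as wide as the header's trailing padding, so a template might want A* there.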

    Back to lessons learned:
    - if you're searching for 'ProcessID' you'll find it in 'ParentProcessID' (duh!); this messed me up for a little bit and drove me to put all that $nextField logic in (again, I've rethunk it and think it would be more elegant with \G).
    - -CS worked on the command line, but when I added it to the she-bang (#!/usr/bin/perl -CS -W) it complained about an 'Unknown Unicode option letter'. I finally figured out I had to combine things in the command options as #!/usr/bin/perl -WCS. In retrospect, there are other problems in my past that are explained by this and I never knew it.
    - I had to pos=undef; to get the search to reset, since it didn't search backwards from the previous match.
    - $#ARGV is 0 whether there are zero or 1 arguments. This led me to the $ARGV[0] ? : solution.
    - I was worried about getting the output fields in the order I want and was happy to learn that using @location in the template let me bounce around in the string and still return the fields in the order they appeared in the template.
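Two of these lessons, the ProcessId/ParentProcessId collision and the @location template, can be illustrated with a small sketch. The header and row here are invented and much narrower than real wmic output, and the offsets are hard-coded to match that sample only.

```perl
use strict;
use warnings;

# Made-up three-column header and one matching data row.
my $header = sprintf '%-13s%-17s%-11s', qw(Caption ParentProcessId ProcessId);
my $row    = sprintf '%-13s%-17s%-11s', qw(perl.exe 1234 5678);

# A plain substring search hits the tail of 'ParentProcessId' first.
my $naive = index($header, 'ProcessId');

# Anchoring on word boundaries finds the real column instead.
my $start;
$start = $-[0] if $header =~ /\bProcessId\b/;

print "naive=$naive anchored=$start\n";

# '@N' jumps to absolute offset N, so the fields come back in template
# order, not in the order they occur in the row.
my ($pid, $ppid, $caption) = unpack '@30 A11 @13 A17 @0 A13', $row;
print "$caption pid=$pid ppid=$ppid\n";
```

The A format strips the trailing padding, so the values come back without the column filler.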

    I'm sure there are other lessons here and many ways to improve my little script, but as I said, I probably won't spend any more time on it (the script that is; I'll happily spend time learning from any feedback I get here). I hope the lessons learned are helpful.



      $#ARGV is 0 whether there are zero or 1 arguments.

      That's not true. $#ARGV is -1 when there are no arguments.

      And by the way, I think -CS is unnecessary and even harmful. wmic probably outputs characters based on your locale, so use open ':std', ':locale'; would be more appropriate.

        Hrmmm . . . I copied only the relevant code from my script, above:

        C:\chas_sandbox\columns-by-name> copy con ARGVtest.pl
        #!/usr/bin/perl -WCD

        use strict;
        use warnings;
        use Data::Dumper;

        $\ = $/;
        my $debug = 1;

        #array of fields to display
        my @processFields = ('Caption','ParentProcessId','ProcessId','CommandLine');

        #ARGV processing
        my $searchfor = $ARGV[0] ? join(' ',@ARGV) : die("I need a process to look for." );
        ^Z
                1 file(s) copied.

        C:\chas_sandbox\columns-by-name> notepad ARGVtest.pl

        and changed it a little (in notepad, above) and tested it:

        C:\chas_sandbox\columns-by-name> type ARGVtest.pl
        #!/usr/bin/perl -WCD

        use strict;
        use warnings;
        use Data::Dumper;

        $\ = $/;
        my $debug = 1;

        #array of fields to display
        my @processFields = ('Caption','ParentProcessId','ProcessId','CommandLine');

        #ARGV processing
        my $searchfor = $#ARGV ? join(' ',@ARGV) : die("I need a process to look for.");
        #               ^^^^^^ change here

        C:\chas_sandbox\columns-by-name> ARGVtest.pl
        I need a process to look for. at C:\chas_sandbox\columns-by-name\ARGVtest.pl line 14.

        C:\chas_sandbox\columns-by-name> ARGVtest.pl chas
        I need a process to look for. at C:\chas_sandbox\columns-by-name\ARGVtest.pl line 14.

        and added some debug (in notepad again) and tested again:

        C:\chas_sandbox\columns-by-name> type ARGVtest.pl
        #!/usr/bin/perl -WCD

        use strict;
        use warnings;
        use Data::Dumper;

        $\ = $/;
        my $debug = 1;

        #array of fields to display
        my @processFields = ('Caption','ParentProcessId','ProcessId','CommandLine');

        #ARGV processing
        print Dumper(\@ARGV);
        print $#ARGV;
        my $searchfor = $#ARGV ? join(' ',@ARGV) : die("I need a process to look for.");

        C:\chas_sandbox\columns-by-name> ARGVtest.pl
        $VAR1 = [
                  ''
                ];
        0
        I need a process to look for. at C:\chas_sandbox\columns-by-name\ARGVtest.pl line 16.

        C:\chas_sandbox\columns-by-name> ARGVtest.pl chas
        $VAR1 = [
                  ' chas'
                ];
        0
        I need a process to look for. at C:\chas_sandbox\columns-by-name\ARGVtest.pl line 16.

        . . . and it sure looks like $#ARGV is 0 either way. I totally trust your experience, so I conclude that either my test is wrong or my conclusions are. Do the she-bang line arguments mess this up or something?
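Reading the Dumper output above, @ARGV holds exactly one element in both runs (an empty string, then ' chas'), and a one-element array always has a last index of 0, so `$#ARGV ? ... : die(...)` takes the die branch each time. Why an empty argument arrives when none is typed is not established in this thread. The sketch below only shows the index values and a more defensive way to build the search string; `search_term` is a hypothetical helper.

```perl
use strict;
use warnings;

# $#array is the last index: -1 for an empty array, 0 for one element.
my @none = ();
my @one  = ('chas');
print $#none, " ", $#one, "\n";

# Hypothetical helper: build the search string from the arguments,
# ignoring empty or whitespace-only ones; returns undef if none remain.
sub search_term {
    my @args = grep { defined $_ && /\S/ } @_;
    s/^\s+|\s+$//g for @args;
    return @args ? join(' ', @args) : undef;
}

# Try it on the argument lists seen in the test runs, plus a normal one.
for my $argv ([], [''], [' chas'], ['my', 'prog.exe']) {
    my $term = search_term(@$argv);
    print defined $term ? "search for '$term'\n" : "nothing to search for\n";
}
```

In a script the call would be `search_term(@ARGV)`, followed by a die when it returns undef. Testing the element count (`@ARGV` in boolean context) rather than `$#ARGV` avoids the inverted truth values, since -1 is true and 0 is false.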



        Thanks and ++ for the tip about :std or :locale. It's working in place as is, so I'll file this away for the next time I'm working on it. Again, it's a one-off script, but this is a good place to document the caveats for others who might be researching similar things.



Node Type: perlquestion [id://722956]
Approved by Corion