Three reasons to deobfuscate vladb's signature. First, fun and profit, as pointed out by jmcnamara. Second, vladb monkself said, "What I think would be useful, in addition to your idea, is to have more authors of original obfuscations to submit a link somewhere in their post to a spoiler page," but the only spoiler I could find relating to the sig was in the original post. Finally, vladb said that he is still "...not able to de-obfuscate much of the code..." Me neither, and I need to start somewhere! :)
1. The original
2. Apply a bit of formatting. (Or, in retrospect, why I'm a lamo.)
I'm unsure whether that grep;;$,=q"grep"; should really be one line or two... I'll assume one for now.
3. Add in some line numbering.
At first glance, line 1 seems to set $", the list separator used when an array is interpolated into a quoted string, to the letter 'q'. The tinkering with $" makes me think that vladb is going to use an array somewhere later on; and thinking that $"=q; meant $"='q';, one might start thinking about the upcoming array...
I'm kind of slow with coding, so I just ran a quick oneliner to see whether the first line really does what I thought:
This is surprising; obviously two things are happening here:
1. $"=q; is actually using the q operator to quote, umm, something...
2. For some reason, when $" is set to that, err, something, the second print statement fails.
Line 1 confused me! B::Deparse gets a lot of mention in the monastery; maybe it can help me here.
which is the equivalent of this:
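For anyone who wants to check the parse without B::Deparse, here is a minimal sketch of what I take line 1 to be doing. The @words array and the variable names are made up for illustration; only the q;grep; construct comes from the sig.

```perl
use strict;
use warnings;

# q with ';' as its delimiter: q;grep; is just the string 'grep'.
# The second ';' ends the statement.
my $quoted = q;grep;;

my @words = qw(one two three);   # hypothetical sample data
my $joined;
{
    # The list separator is used when an array is interpolated
    # into a double-quoted string.
    local $" = $quoted;
    $joined = "@words";
}

print "$quoted\n";   # grep
print "$joined\n";   # onegreptwogrepthree
```

So the 'q' is the quoting operator, not a letter being assigned, and the semicolons are its delimiters.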
Cut down a tree with a herring? Sure, I'll try, but only if it's red...
If I had been running the modified signature as I, um, modified it, I would've caught my mistake sooner. As it is, vladb's misdirection waylaid me for an hour (actually, I gave up, but while eating lunch figured out my mistake). But this time gave me a chance to read up on $" and $, in perlvar.
I'm a big fan of $", actually; I do this in oneliners a lot: $"=$/; print "@a\n"; which prints each element of @a on its own line. ($/, incidentally, is the "input record separator"; its default value is \n.)
But I don't use $,
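Since the two variables are easy to mix up, here is a small sketch of the difference as perlvar describes it. The array contents and the in-memory filehandle are just for illustration.

```perl
use strict;
use warnings;

my @a = qw(x y z);    # hypothetical sample data

# The list separator applies when an array is interpolated into a string.
my $interpolated;
{
    local $" = '-';
    $interpolated = "@a";
}

# The output field separator applies between the arguments of print.
my $printed = '';
{
    local $, = '+';
    open my $fh, '>', \$printed or die "cannot open in-memory handle: $!";
    print {$fh} @a;
    close $fh;
}

print "$interpolated\n";   # x-y-z
print "$printed\n";        # x+y+z
```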
4. Code re-write, using discovery above
Whew! At this point, we've only looked at the first two lines of code! Fortunately, lines 4-7 are fairly straightforward.
Line 4: Setting $" and $, to 'grep' is a clue that vladb's signature is a Unix utility of some sort; the for ( `find ...`) clinches it. (Uhh, not to mention the original description!)
Line 4 runs a shell command (the Unix command "find") and, for each line that is returned, processes it according to lines 5-9.
This particular find command is going to search the current directory and, since find recurses by default, its subdirectories for files that match a particular naming convention. The regex for these filenames would be something like /^\.saves.*?~$/, if that helps you. Otherwise, here are a few examples:
foo.saves_blah~   # no match
.saves_foo        # no match
.saves_foo~       # match!

On Unix and Linux, a filename that starts with a dot (.) is a "hidden" file, which can only be seen if you use an extra flag on 'ls' (same function as DOS 'dir' command). So the find command is going to find a bunch of "hidden" files that start '.saves', continue with whatever text describes what file is saved, and end with a tilde (~). An example might be .saves_Big_Project_backup_27~
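The three example names can be checked against the regex directly. Keep in mind this is only a sketch: the regex is my approximation of the naming convention, find itself matches names with shell-style globs rather than Perl regexes, and the names are made up.

```perl
use strict;
use warnings;

# Approximation of the naming convention: starts with '.saves', ends with '~'.
my $pattern = qr/^\.saves.*?~$/;

my @names   = ('foo.saves_blah~', '.saves_foo', '.saves_foo~');
my @matched = grep { $_ =~ $pattern } @names;

for my $name (@names) {
    printf "%-16s %s\n", $name, ($name =~ $pattern ? 'match!' : 'no match');
}
```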
Chances are you don't have any of these in your directory on your machine, so the find command would return nothing. And with no data to apply the for block to, perl just skips the block in toto.
Well, that's pretty boring stuff. I wonder what happens when vladb uses this tool on his machine? Presumably the find command returns some data, so lines 5-9 get to kick in.
Line 5: I didn't bother reformatting this; we can do so now. s;$/;;; In a substitution (s///), one can choose an alternate delimiting character. This is useful if you have a lot of '/' that you are processing, and find yourself escaping them all the time: '\/'. Consider if you wanted to remove all '//' from a line:
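Something along these lines, as a sketch (the sample string and variable names are invented for illustration):

```perl
use strict;
use warnings;

my $line = 'a//b//c';   # hypothetical input

# First form: the default '/' delimiter, so every slash must be escaped.
(my $escaped = $line) =~ s/\/\///g;

# Second form: braces as delimiters, so the slashes need no escaping.
(my $braces = $line) =~ s{//}{}g;

print "$escaped\n";   # abc
print "$braces\n";    # abc
```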
Notice how much cleaner the second form is.
vladb is doing the same thing: using an alternate delimiter on his s///. He's using ';', though, because he figures that he might be able to catch overzealous deobfuscators out a second time (remember the "a lamo"!). But we're on to his semicolon madness, and know immediately that line 5 is removing the first $/ it finds in $_ (there is no /g flag; the last ';' just ends the statement). Since $/ defaults to \n, and since vladb hasn't changed it, we know we're really removing a newline from $_. find is only going to return one newline per line of output, at the very end, so really line 5 is the same as chomp;
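A quick way to convince yourself is to run both forms on the same string. This is only a sketch: the sample line stands in for one line of find output and is made up.

```perl
use strict;
use warnings;

my $from_find = "./.saves-1234-~\n";   # hypothetical line of find output

# vladb's form: s with ';' delimiters, the input record separator as the
# pattern, an empty replacement, and a final ';' to end the statement.
(my $subst = $from_find) =~ s;$/;;;

# The conventional form.
chomp(my $chomped = $from_find);

print $subst eq $chomped ? "same\n" : "different\n";   # same
```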
Line 6 is a simple pattern match: perl actually lets you comment your regexes if you want, so let's try that out.
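Here is a sketch of what a commented regex looks like with the /x flag. The pattern is my guess at the shape of line 6 (a prefix, a hyphen, some digits captured into the second group, a hyphen), not vladb's exact regex, and the file name is invented.

```perl
use strict;
use warnings;

my $file = './.saves-1234-~';   # hypothetical file name
my ($prefix, $pid);

if ($file =~ m{
        ( \.saves )   # capture 1: the literal prefix
        -             # a hyphen
        ( \d+ )       # capture 2: one or more digits
        -             # another hyphen
    }x)
{
    ($prefix, $pid) = ($1, $2);
}

print "$prefix\n";   # .saves
print "$pid\n";      # 1234
```

With /x, whitespace and # comments inside the pattern are ignored, which is what makes this layout possible.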
Right away this tells me that I'd misguessed the naming convention that vladb is using: my previous example, .saves_Big_Project_backup_27~, wouldn't have succeeded at all: the regex says there must be a hyphen, some digits, and a hyphen; the example actually doesn't have any hyphens surrounding the digits. (Oh well, the example served its purpose: to get me thinking about the data.)
The naming convention is probably .saves-$$-~ where "$$" is the process id number of the program that created the save file. Putting the process id, or pid, into a temporary file's name is useful for two reasons: first, generally your OS doesn't cycle pids very quickly, so it's a lazy way of making sure your temp file names are unique; second, you can identify the owner of the temp file, and if the owner isn't running anymore, you can remove the old file.
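To make the pid trick concrete, here is a sketch that builds a name in the guessed .saves-$$-~ style, pulls the pid back out, and asks whether the owner is alive. Note the swap: it uses Perl's built-in kill with signal 0 for the liveness check instead of the ps pipeline from the sig, and the naming scheme itself is an assumption.

```perl
use strict;
use warnings;

# Build a temp-file name that embeds our own process id.
my $name = sprintf '.saves-%d-~', $$;

# Recover the owner's pid from the name.
my ($owner) = $name =~ /-(\d+)-/;

# Signal 0 sends nothing; it only reports whether the process can be signalled.
my $alive = kill(0, $owner) ? 1 : 0;

print "$name\n";
print $alive ? "owner $owner is still running\n" : "owner $owner is gone\n";
```

Since the owner here is the script itself, the check should report that it is still running.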
Line 7 made my eyes water. It looks like a shell command is being built, but to do what? Remember that line 6 stuffed a pid into $2. Line 7 is going to use that stored data and build a ps command that checks whether that pid is still around.
$_=["ps -e -o pid | "," $2 | "," -v "," "];
First off, we've got what I call "the anonymous array square brackets". (It ain't catchy but it sure helps me remember what they do.)
If we de-obfuscate this line a bit, it becomes:
Where did those 'grep's come from? Remember back to line 2:
$, = 'grep';
So where you see a comma in line 7, you can mentally think "grep" instead.
But what does the @command do? Let's look.
ps -e -o pid    # use the 'ps' command to look at the process stack;
                # the -e flag says to look at all running processes;
                # the '-o pid' flag specifies to return their process ids.
|               # take the output from the previous command and use it
                # as input for this next command
grep $2         # look for the pid that we found in line 6; this pid,
                # remember, comes from the tempfile name, and tells us
                # who the owner of $_ is.
|               # take the output from the previous command and use it
                # as input for this next command
grep -v grep    # Right now there might be two lines in the process stack
                # that have $2 in them: first is our grep line from earlier
                # in this pipeline; second is the process whose pid really
                # is $2. We want to ignore the grep lines; this way we avoid
                # a situation where we see $2 in the process stack and think
                # it's the process we're looking for when really it's just us!

In line 8 vladb will actually run this command; for now if you only take one thing away from this, it should be this: the output of the command will be either 0 lines of data, in which case the process isn't running, or it will be 1 line of data, in which case the process still is running.
Remember, though, that vladb didn't want to give away the whole bag at once, so instead of writing:
$_ = "ps -e -o pid | grep $2 | grep -v grep";
he instead wrote
$_=["ps -e -o pid | "," $2 | "," -v "," "];
And one of the consequences of this is that $_ isn't actually the full command that we want; it's a reference to an anonymous array, and the anonymous array is what contains the real command!
So in Line 8, when vladb actually wants to check the process stack for those running processes, he must first dereference the array.
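Here is a sketch of how the pieces come back together. When an array is interpolated into a quoted string, Perl joins its elements with the list separator, so this example sets that to 'grep'. The pid 1234 is a stand-in for $2, and $ref and $command are names I made up.

```perl
use strict;
use warnings;

my $pid = 1234;   # stands in for the pid captured in line 6

# Square brackets build an anonymous array and return a reference to it.
my $ref = ["ps -e -o pid | ", " $pid | ", " -v ", " "];

my $command;
{
    local $" = 'grep';      # joins elements of an interpolated array
    $command = "@$ref";     # dereference, then interpolate
}

print ref($ref), "\n";   # ARRAY
print "$command\n";      # ps -e -o pid | grep 1234 | grep -v grep (plus a trailing space)
```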
As TheDamian wrote in an oldish article archived at perl.com, "...A reference is like the traditional Zen idea of the 'finger pointing at the moon'. It's something that identifies a variable, and allows us to locate it. And that's the stumbling block most people need to get over: the finger (reference) isn't the moon (variable); it's merely a means of working out where the moon is."
(N.B.: if you haven't searched perl.com for your favorite authors and personalities that hang out on perlmonks, why haven't you? Many have written articles that will improve your understanding and use of perl almost within seconds of reading!)
The dereferencing is done in line 8 by simply tossing an at-sign, @, in front of $_.
Like line 4, line 8 uses backticks `` to run an external command and feed its output back into the program. We know from the discussion of line 7 what the command is - a search of running process IDs - and what the expected output is (either nothing or a process ID).
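To see a dereferenced array inside backticks without touching the process table, here is a harmless sketch that runs echo instead of the ps pipeline. It assumes a Unix-like shell with echo available; the words in the array are invented.

```perl
use strict;
use warnings;

my $output;
{
    local $" = ' ';                            # join with plain spaces here
    local $_ = ['echo', 'still', 'running'];   # reference to an anonymous array
    $output = `@$_`;                           # dereference inside backticks and run it
}

print $output;   # still running
```

Backticks interpolate the same way double quotes do, so the array is flattened into one command string before the shell ever sees it.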
Line 8 also uses a ternary conditional: this is a fancy way of writing an if-else statement in just one line.
This is the same as:
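As a generic sketch of the equivalence (this is not vladb's line 8; the @lines array and the two messages are made up):

```perl
use strict;
use warnings;

my @lines = ();   # pretend the ps pipeline printed nothing

# Ternary form: condition ? value-if-true : value-if-false
my $status = @lines ? 'still running' : 'gone';

# The same decision written as if-else.
my $status_long;
if (@lines) {
    $status_long = 'still running';
}
else {
    $status_long = 'gone';
}

print "$status\n";        # gone
print "$status_long\n";   # gone
```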
We can re-write line 8 a little:
And we can re-write it a little more:
Line 9 prints $\, which, umm, defaults to nothing; here it looks like it's being treated as a newline, though, doesn't it? I've gotta admit: I'm not sure where $\ gets set to \n...
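For reference, here is a sketch of what the output record separator does once it has a value. One documented way it gets set without appearing in the code is perl's -l switch, but I don't know whether vladb's sig relies on that, so treat it as a guess. The in-memory filehandle is just for illustration.

```perl
use strict;
use warnings;

my $out = '';
{
    local $\ = "\n";   # appended after every print while in effect
    open my $fh, '>', \$out or die "cannot open in-memory handle: $!";
    print {$fh} 'hello';
    close $fh;
}

print $out eq "hello\n" ? "newline was appended\n" : "no newline appended\n";
```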
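Another thing I wasn't sure of is why both $" and $, were set to 'grep'. Only one of them can be doing the work in line 7, and as far as I can tell it is $": backticks interpolate like double quotes, and an array interpolated into a quoted string is joined with $", so the @$_ in line 8 picks up its greps from $" rather than from $,. (The rule of thumb for line 7 still works, a comma means "grep", but the variable behind it is $".) $, only matters to print, and vladb only builds one array, in line 7, and never prints it as a list. So $, looks like the likelier bit of misdirection on vladb's behalf.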
And for my own fun, here's the de-obfuscated tool.
Hopefully this will be useful to some other monks as an example of how to start de-obfuscating. This is my first turn at writing a spoiler, and I gotta admit: it was pretty fun to figure this stuff out. Although (because?) I made a few wrong turns in my assumptions about the code, this exercise also helped me learn a little bit more about Perl. Thanks jmcnamara for the thread and vladb for the spoiler opportunity.
In reply to Obfu spoiler: vladb's sig (was Re: Deobfuscation for fun and profit)