PerlMonks  

Cleaning the Log

by Sigmund (Pilgrim)
on Sep 02, 2002 at 13:55 UTC ( [id://194577]=perlquestion )

Sigmund has asked for the wisdom of the Perl Monks concerning the following question:

hello,
this may be silly to ask, worse when asked by a level 5 monk who needs only 75 more points to reach level 6, but I cannot solve my problem, so...
here's the plain question:
I often use a keylogger (please don't ask why, how, or where); what really matters is that its log looks similar to this:
------cuthere------------
hee{BS}llo, I'll be bnack{BS}{BS}{BS}{BS}ack next saturday.
{BS}{BS}{BS}{BS}{BS}{BS}{BS}{BS}{BS}{BS}{BS}{BS}www.yahoo.com
"keyloggers" + "linux"
{BS}{BS}{BS}
-----cuthere---------------

and so on.
I wanted to write a Perl script to get rid of all those {BS} markers and apply them as real backspaces, so that I could see what was REALLY written.
so I tried this:

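#!/usr/bin/perl -w
use strict;

my $output = $ARGV[0] . ".clean";
open(INF, "< $ARGV[0]") or die "Cannot open $ARGV[0]: $!";
open(OUF, "> $output")  or die "Cannot open $output: $!";
while (<INF>) {
    s/\{BS\}/\b/g;    # each {BS} becomes a literal 0x08 character
    print OUF $_;
}
close(INF);
close(OUF);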
but it fills the log with 0x08 characters instead of backspacing over the previous character. useless.
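
A minimal sketch of why this happens: in a double-quoted string, and in the replacement half of s///, \b is simply the single character 0x08. Nothing acts on it until a terminal displays it, so the substitution adds a control character rather than removing anything. Inside a pattern \b means a word boundary instead, which is presumably why the replies further down match the character as \010 or \cH.

```perl
use strict;
use warnings;

# Five spellings of the same character: ASCII 8, the backspace control code.
my @spellings = ("\b", "\010", "\cH", "\x08", chr(8));
print join(" ", map { sprintf("0x%02X", ord($_)) } @spellings), "\n";

# Substituting it in deletes nothing; the string just contains a 0x08.
(my $line = 'hee{BS}llo') =~ s/\{BS\}/\b/g;
printf "%d characters, backspace at index %d\n", length($line), index($line, "\b");
```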
so I thought of this regexp:

s/.{BS}//g

but it didn't work, and at first I couldn't figure out why.
then I understood: after finding the first occurrence of .{BS} it deletes it, lands on the following "{", starts searching again from there, and finds the next match this way:

{BS}{BS}
|
------- this is current position
{BS}{BS}
    ---- this is what it finds!
{BS}{BS}
   ----- and this is what it deletes, leaving me with a {BS{BS

and so on.

the real question is:
how can I tell Perl that it must start searching FROM THE BEGINNING each time, and not from the last position, so that every backspace is applied as a REAL backspace?
thanks, SiG
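
The behaviour described in the question can be reproduced with a short sketch. It lists where the pattern matches during one left-to-right /g pass over a fragment of the sample log, then shows what s///g leaves behind. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $typed = 'bnack{BS}{BS}{BS}{BS}ack';

# List every match found in a single /g pass. Each search resumes where
# the previous match ended; text to the left is never examined again.
while ($typed =~ /.\{BS\}/g) {
    printf "matched '%s' at offset %d\n", $&, $-[0];
}

# s///g removes exactly those matches and nothing else.
(my $result = $typed) =~ s/.\{BS\}//g;
print "$result\n";    # bnac{BS{BS}ack
```

Only two of the four markers are consumed: the second match starts on the closing brace of one marker and swallows the next one, which is the {BS{BS debris described above.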

Replies are listed 'Best First'.
Re: Cleaning the Log
by demerphq (Chancellor) on Sep 02, 2002 at 15:55 UTC
    Actually this is more complicated than a pure regex can handle (Perl regexes _may_ be able to handle this, but the code would be scaaaary). This problem is even more difficult than handling balanced constructs and cannot be solved using a formal regular expression.

    This is because we can't simply look for .{BS}, as is clear from your sample data such as bnack{BS}{BS}{BS}{BS}, where the last {BS} actually removes the 'n'.

    But only a slight amount of additional code will allow a working solution....

    my $logdata=<<EOLOG;
    pass{BS}{BS}{BS}{BS}{BS}{BS}
    hee{BS}llo, I'll be bnack{BS}{BS}{BS}{BS}ack next saturday.{BS}{BS}{BS}{BS}{BS}{BS}{BS}{BS}{BS}{BS}{BS}{BS}
    www.yahoo.com
    "keyloggers" + "linux"{BS}{BS}{BS}
    EOLOG
    my $clean = '';
    # Walk the log one token at a time: either a {BS} marker or any single character.
    while ($logdata =~ /\G(\{BS\}|.)/sg) {
        if ($1 eq '{BS}') {
            chop $clean;      # remove the last kept character; no-op when empty
        }
        else {
            $clean .= $1;
        }
    }
    print $clean;
    YMMV

    Yves / DeMerphq
    ---
    Software Engineering is Programming when you can't. -- E. W. Dijkstra (RIP)
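
To tie the token-by-token loop from the reply above back to the original goal (read a log, write a .clean file next to it), here is a hedged sketch. The helper name strip_backspaces, the script name cleanlog.pl and the file name keys.log are assumptions for illustration and do not appear in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the token-by-token loop.
sub strip_backspaces {
    my ($text) = @_;
    my $clean = '';
    while ($text =~ /\G(\{BS\}|.)/sg) {
        if ($1 eq '{BS}') { chop $clean }      # drop the last kept character, if any
        else              { $clean .= $1 }
    }
    return $clean;
}

# Usage (hypothetical names): perl cleanlog.pl keys.log   writes keys.log.clean
if (@ARGV) {
    my $in  = $ARGV[0];
    my $out = "$in.clean";
    open my $ifh, '<', $in  or die "Cannot read $in: $!";
    open my $ofh, '>', $out or die "Cannot write $out: $!";
    local $/;                                  # slurp: a {BS} may reach across a line end
    print {$ofh} strip_backspaces(scalar <$ifh>);
    close $ofh or die "Cannot close $out: $!";
}
```

Reading the whole file at once is a design choice: it lets a {BS} at the start of one line delete the newline before it, as the /s flag in the reply implies. Processing line by line would be the alternative if backspaces should never cross line ends.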

      perl regexes _may_ be able to handle this, but the code would be scaaaary
      I felt like doing something scary. *grin* The trick is to use sexeger - a regex that operates on the reversed string.
      s/{BS}/\b/g;
      $_ = reverse $_;
      my $k = 0;
      s/(\010+)(??{ $k += length $+; "([^\010]{0,$k})" })(?{ $k -= length $+ })//g;
      $_ = reverse $_;
      Don't forget to tune in for the next issue of H.R. Giger meets Perl. *grin*

      Makeshifts last the longest.
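
The idea of working from the end of the string can also be shown without regex code blocks. The following is a plain-loop sketch of that idea, not the reply's method: the helper name and the pending counter are assumptions introduced here.

```perl
use strict;
use warnings;

# Hypothetical helper: scan the tokens from the end of the text. Every {BS}
# seen adds one pending deletion; every ordinary character either pays off
# a pending deletion (and is dropped) or is kept.
sub strip_backspaces_backwards {
    my ($text) = @_;
    my @tokens  = $text =~ /\{BS\}|./sg;
    my $pending = 0;
    my @kept;
    for my $tok (reverse @tokens) {
        if    ($tok eq '{BS}') { $pending++ }
        elsif ($pending > 0)   { $pending-- }
        else                   { push @kept, $tok }
    }
    return join '', reverse @kept;
}

print strip_backspaces_backwards('bnack{BS}{BS}{BS}{BS}ack'), "\n";    # back
```

Seen from the right, a run of markers is simply a count of how many characters to its left must disappear, which is why reversing the string makes the problem tractable.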

Re: Cleaning the Log
by fglock (Vicar) on Sep 02, 2002 at 14:00 UTC

    Use  while() to repeat the search until all is done:

     perl -e '$a = "abcde{BS}{BS}f"; while($a =~ s/.{BS}//) {}; print $a;'
      Try this with these:
      abcde{BS}{BS}{BS}{BS}{BS}{BS}f
      {BS}{BS}abcde{BS}{BS}{BS}{BS}{BS}c{BS}fat
      A{B}{BS}S}bout
      It should output:
      f
      fat
      A{BS}bout
      :-)

      Yves / DeMerphq
      ---
      Software Engineering is Programming when you can't. -- E. W. Dijkstra (RIP)
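
A small sketch makes the objection concrete by running the restart-until-done loop from the parent reply over the three test strings. The helper name restart_loop is hypothetical.

```perl
use strict;
use warnings;

# The restart-until-done loop, wrapped so it can be run over several inputs.
sub restart_loop {
    my ($text) = @_;
    1 while $text =~ s/.\{BS\}//;
    return $text;
}

for my $input ('abcde{BS}{BS}{BS}{BS}{BS}{BS}f',
               '{BS}{BS}abcde{BS}{BS}{BS}{BS}{BS}c{BS}fat',
               'A{B}{BS}S}bout') {
    printf "%-45s => %s\n", $input, restart_loop($input);
}
```

The results are {BS}f, {BSfat and bout rather than f, fat and A{BS}bout. Two things go wrong: a marker with no character in front of it is never matched, and deleting a character can make the surrounding text form a new {BS} that was never typed as a backspace.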

Re: Cleaning the Log
by Arien (Pilgrim) on Sep 02, 2002 at 21:53 UTC

    Here's to Golf!

    perl -pe's/\{BS\}/\b/g;1while s/.?\010//' file

    — Arien

    Edit: Good catch, BrowserUk++. I forgot about that one, but I now remember I don't need the first zero. To tie: ;-)

    perl -pe's/\{BS\}/\b/g;1while s/.?\10//' file

      Revenge! -1 :8D

      perl -pe's/\{BS\}/\b/g;1while s/.?\cH//' file

      Update: In that case, we might as well do away with that redundant '?'. If there isn't a char before it, then there is nothing to backspace over!

      Revenge again! :) -1 more.

      In private discussion, Arien pointed out that whilst the cursor may not move back past the left-hand edge of the screen if you keep pounding the backspace key, it is unlikely that the keystroke logger would have the smarts to realise that, and so it would likely keep recording the useless keystrokes.

      I therefore willingly accept his offer of a draw. (Drat! and double-drat! Thought I had'im that time!)

      perl -pe's/\{BS\}/\b/g;1while s/.\cH//' file
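
The point about surplus backspaces can be checked with a short sketch comparing the two golfed variants. The helper names are hypothetical, and the input is an invented line with one more backspace than characters typed before it.

```perl
use strict;
use warnings;

# Variant with the optional character: .?\cH
sub with_optional_char {
    my ($line) = @_;
    $line =~ s/\{BS\}/\cH/g;
    1 while $line =~ s/.?\cH//;
    return $line;
}

# Variant without the '?': .\cH
sub with_required_char {
    my ($line) = @_;
    $line =~ s/\{BS\}/\cH/g;
    1 while $line =~ s/.\cH//;
    return $line;
}

my $line = 'ab{BS}{BS}{BS}cd';
printf "%vd\n", with_optional_char($line);    # 99.100    ("cd")
printf "%vd\n", with_required_char($line);    # 8.99.100  (a stray 0x08 stays in front of "cd")
```

The %vd format prints each character's code point, which makes the leftover control character visible. With the '?', a backspace at the start of the line is consumed on its own; without it, this input keeps one 0x08.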


      Well It's better than the Abottoire, but Yorkshire!

        I messed up pretty bad before, but nobody took advantage... -11!

        perl -pe'1while s/.?\{BS\}//' file

        — Arien

Node Type: perlquestion [id://194577]
Approved by Stegalex