PerlMonks  

Wanted: humanly readable `script` output

by mrslopenk (Acolyte)
on May 04, 2008 at 18:42 UTC

mrslopenk has asked for the wisdom of the Perl Monks concerning the following question:

Sometimes I want a transcript of my terminal sessions. My terminal buffer fills up too quickly, and copy/pasting from the terminal into an emacs buffer is also a hassle.

The unix command 'script' almost does what I want; it outputs a file called 'typescript' by default. But typescript has lots of terminal control characters (like backspaces) in it.

What perl module can process the typescript file so I get a humanly readable ascii file?

Replies are listed 'Best First'.
Re: Wanted: humanly readable `script` output
by pc88mxer (Vicar) on May 04, 2008 at 19:55 UTC
    If it is only backspaces you want to remove, try this:
    my $text = "abcDE\x08\x08xyz\x08Z"; # \x08 is the backspace character
    while ($text =~ m/\x08/g) {
        substr($text, pos($text)-2, 2, '');
    }
    print $text, "\n"; # emits: abcxyZ
    Actually, I'm a little surprised that this works. I would think that modifying $text before the last match position in a m//g loop would screw things up.

    Update: A slightly more optimized version:

    while ($text =~ m/(\x08+)/g) {
        substr($text, pos($text)-2*length($1), 2*length($1), '');
    }
    print $text, "\n";
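    Both loops above index backwards from the match position, which goes wrong when a run of backspaces is longer than the text before it. A minimal sketch of a character-by-character alternative follows. The helper name apply_backspaces is hypothetical, and it assumes, as the rest of this thread does, that a backspace erases the previous character and never crosses a newline.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): treat each backspace as
# "erase the previous character on the current line".
sub apply_backspaces {
    my ($text) = @_;
    my @out;
    for my $ch (split //, $text) {
        if ($ch eq "\x08") {
            # Erase one character, but never past the start of the
            # text or past a newline.
            pop @out if @out && $out[-1] ne "\n";
        }
        else {
            push @out, $ch;
        }
    }
    return join '', @out;
}

print apply_backspaces("abcDE\x08\x08xyz\x08Z"), "\n";   # prints: abcxyZ
```

    Extra backspaces at the start of a line are simply dropped, so no offset can go negative.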
      Thanks for the suggestions everyone, but there are a lot of other control characters besides the backspace. If you are at a unix terminal, try this:
      script
      ls
      exit
      Then look in typescript. You'll see what I mean. Actually, there is a semi-passable solution that does not involve perl:
      TERM=dumb script
      ls
      exit
      cat typescript | col -b > readable.txt
      This seems to work as long as all the commands you type after script work on a dumb terminal. I was just wondering if perl had a more robust solution. I had a look at Term::Cap, but I fail to understand how to use it. I feel there must be a solution because, after all, I can see the very text I want on the terminal screen -- it's all right there! -- until my buffer size is reached. Usually copy/paste with the mouse from terminal to emacs works ok, but it didn't work the other night when I was compiling a kernel -- the output exceeded my buffer size.
        To capture the output of a kernel compile, why not just redirect the output?
        make zImage... > compile-output 2>&1
        Also, I usually put such commands in the background and tail the output file occasionally to see how things are going,
        #!/usr/bin/perl
        use strict;
        use warnings;
        open(FILE,"<","typescript") || die;
        while (my $line=<FILE>) {
            $line=~s/.\x08//g;
            $line=~s/\p{IsC}//g;
            print "$line\n";
        }
        close(FILE);
        This left behind a lot of funny things, which, to my eyes look like:
        [00m, [01;34m, [01;35m, [m]0;
        characters.
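        Those leftovers are the printable tails of escape sequences: stripping control characters removes only the ESC byte and leaves the rest (for example [01;34m) behind. A rough sketch that removes whole sequences first is below. The helper name clean_line is hypothetical, and the two patterns assume only CSI (ESC [ ...) and OSC (ESC ] ... BEL) sequences occur; anything else falls through to the final control-character pass.

```perl
use strict;
use warnings;

# Hypothetical helper: clean one line of a typescript file.
sub clean_line {
    my ($line) = @_;

    # OSC sequences (e.g. window titles): ESC ] ... ended by BEL or ESC \
    $line =~ s/\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)//g;

    # CSI sequences (colours, cursor movement): ESC [ params, intermediates, final byte
    $line =~ s/\x1b\[[\x30-\x3f]*[\x20-\x2f]*[\x40-\x7e]//g;

    # Apply backspaces one at a time so runs of them erase several characters.
    1 while $line =~ s/[^\x08]\x08//;

    # Drop the remaining control characters, keeping tab and newline.
    $line =~ s/[\x00-\x08\x0b-\x1f\x7f]//g;

    return $line;
}

print clean_line("\x1b[01;34mbin\x1b[00m  \x1b]0;title\x07ab\x08c\r\n");   # prints: bin  ac
```

        To process a whole file, call clean_line on each line read from typescript. Cursor-movement sequences are deleted rather than interpreted, so text that was overwritten on screen will still show up in the output.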

        This seems to work as long as all the commands you type after script work on a dumb terminal. I was just wondering if perl had a more robust solution

        How could removing terminal-specific control sequences be any more robust than not having them in the first place?

        The next best solution would be to have some program that processes your terminal's control sequences to generate a flat file. However, emulating a device is rather complex and error-prone (hardly robust), and you'll have to make some concession or another in order to handle text being overwritten.
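        To give a feel for what even a tiny emulator involves, here is a sketch that renders a single line while tracking a cursor column. The name render_line is hypothetical. It assumes a terminal where carriage return and backspace only move the cursor and where CSI K erases to the end of the line; every other CSI sequence is ignored, and anything else a real terminal understands is not handled.

```perl
use strict;
use warnings;

# Hypothetical sketch: replay one line of raw terminal output into a
# row of character cells and return what would be visible at the end.
sub render_line {
    my ($raw) = @_;
    my @cells;      # characters currently on the line
    my $col = 0;    # cursor column

    # Tokenise into CSI sequences ($1 = parameters, $2 = final byte)
    # or single characters ($3).
    while ($raw =~ /\G(?:\x1b\[([\x30-\x3f]*)[\x20-\x2f]*([\x40-\x7e])|(.))/gs) {
        my ($params, $final, $ch) = ($1, $2, $3);
        if (defined $final) {
            # CSI K / CSI 0 K: erase from the cursor to the end of the line.
            if ($final eq 'K' && ($params eq '' || $params eq '0')) {
                $#cells = $col - 1 if $col <= $#cells;
            }
            next;   # all other CSI sequences (colours etc.) are dropped
        }
        if    ($ch eq "\r")   { $col = 0 }              # back to column 0
        elsif ($ch eq "\x08") { $col-- if $col > 0 }    # cursor left, no erase
        elsif ($ch eq "\n")   { last }                  # end of this line
        elsif ($ch ge ' ' && $ch ne "\x7f") {
            $cells[$col++] = $ch;                       # overwrite at the cursor
        }
    }
    return join '', map { defined $_ ? $_ : ' ' } @cells;
}

print render_line("make: done\rMAKE\n"), "\n";   # prints: MAKE: done
```

        Overwritten text is the concession mentioned above: only the final state of the line survives, and whatever was there before is lost.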

      You can replace the while loop with this little regex:
      $text =~ s/.\x08//g;

        Not really. Consider:

        my $text = "No \x08\x08\x08 good";
        $text =~ s/.\x08//g;
        print $text;

        Prints:

        No good

        Perl is environmentally friendly - it saves trees
Re: Wanted: humanly readable `script` output
by shmem (Chancellor) on May 05, 2008 at 01:22 UTC
    What perl module can process the typescript file so I get a humanly readable ascii file?

    To what end? Terminal captures include chars that are typed then erased, cursor positioning sequences, text colouring and so on. You can't convert terminal output into a "human readable ascii file" without losing information which might be important to you, and you have a hard time converting terminal output into a human readable stream of information.

    You could view the typescript file with less -R which renders escape sequences directly, but cursor movement sequences will blow it. Terminal output has been produced by the terminal, and your terminal is the program which groks the captured output, so use that.

    If what you want is reading the terminal output again at a later time, capture that with

    script -t typescriptfile 2> timingfile

    to include timing information, then view it with 'scriptreplay'. Between <readmore> tags is the slightly hacked version I use to that end. Speeding up the output with a divisor lets you skim typescript files fast. I don't know of any better pager for typescript files. Update: adapted for use without timing file.

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: Wanted: humanly readable `script` output
by CountZero (Bishop) on May 04, 2008 at 19:37 UTC
    Without having seen your 'typescript'-file, I guess that filtering it through a regular expression which only allows printable characters through is the favoured solution.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: Wanted: humanly readable `script` output
by swampyankee (Parson) on May 04, 2008 at 19:50 UTC

    It shouldn't be too difficult to write a Perl script to strip control characters; I doubt if a module is needed. Look in perlre, perlretut, and perlrecharclass.


    Information about American English usage here and here. Floating point issues? Please read this before posting. — emc

      What about cat -v ?

        The copy of the man pages I have for cat (GNU coreutils 6.9.92.4-f088d-dirt January 2008) says this: "-v, --show-nonprinting: use ^ and M- notation, except for LFD and TAB". It doesn't strip control characters, but maps them to a different form. The FreeBSD entry for cat has a similar, but not identical, description for the -v option.
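        For anyone who wants that mapping from inside a Perl script, for instance to see exactly which control sequences a typescript contains before deciding what to strip, here is a small sketch of the ^ and M- notation. The name show_nonprinting is hypothetical, and the code assumes the input is a byte string (no characters above 0xFF).

```perl
use strict;
use warnings;

# Hypothetical helper: make non-printing bytes visible using ^ and M-
# notation, leaving tab and newline untouched.
sub show_nonprinting {
    my ($bytes) = @_;
    $bytes =~ s{([^\x09\x0a\x20-\x7e])}{
        my $c      = ord $1;
        my $prefix = '';
        if ($c >= 128) { $prefix = 'M-'; $c -= 128 }   # high bit set
        $prefix . ($c < 32 ? '^' . chr($c + 64)        # control char: ^@ .. ^_
                 : $c == 127 ? '^?'                    # DEL
                 : chr $c);
    }ge;
    return $bytes;
}

print show_nonprinting("a\x08b\x1b[0m\tz\n");   # prints: a^Hb^[[0m<TAB>z
```

        Nothing is removed, so the output is no more readable than the original, but the escape sequences become plain ASCII that can be searched for.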


        Information about American English usage here and here. Floating point issues? Please read this before posting. — emc

Re: Wanted: humanly readable `script` output
by mrslopenk (Acolyte) on May 05, 2008 at 19:28 UTC
    Thank you ikegame and shmem for your replies. You both seemed to be asking what I want this for. Indeed, from your responses I see I forgot to state clearly what criteria I'm looking for in a solution.

    I want a file which (1) is a snapshot (ignoring color and font type) of the terminal at a given moment in time (rather than a movie, which is what script captures.) and (2) as if the terminal had no buffer size limitation, (3) is editable and searchable with a text editor, (4) is the output of a command-line program, not the result of copy/pasting with a mouse.

    Up to now, my usual practice has been to copy/paste stuff from a terminal into an emacs buffer using a mouse. I lose coloring and bold face text, but I obtain a snapshot that respects erased characters and cursor positioning up to a certain point in time. That satisfies criteria (1) and (3), but not (2) and (4).

    Thank you all very much for your thoughtful suggestions, but due to my inaccurate description of what I'm looking for, none of the suggestions so far meet my criteria.
      Both ikegame and shmem also explained that what you are looking for isn't easily done. Yes perl (or any other language) can do it, and you're barking up the right tree looking at Term::Cap if you're determined to pursue actually creating this beast.

      Let me attempt to shed some light on why this problem is "hard". Your local Linux /etc/termcap file is the configuration file for all the myriad "terminals" that have ever existed. By using libraries and this configuration file, it is possible to create your own display engine, and this is roughly how all the terminal emulation programs you might use work, whether that is putty, SecureCRT (my fav.), or xterm (or not) on the unix console.

      In my local copy there are over 1500 lines with terminal names/types. So, if you want this to be truly robust you have to be able to handle all those terminal types. Not easy, but with the help of libraries and the termcap file, it isn't as large a mountain as it might seem; but it's certainly not child's play.

      Now, in your favor is the fact that in the real world, you're typically only going to care about a small subset of terminals, namely vt100/200/220 and xterm. Those are de facto standards today, but Linux and ANSI are two others that might also be widely used. That's still 6 different terminal protocols you have to deal with 100% correctly in order to be robust.

      If you're still determined to sink a lot of time into this, the Putty source code is available; I suggest using that as a guide. Good luck.

      -Scott

      Many Emacsen have shell-mode; give that a try. I use GNU Emacs, in which typing M-x shell creates a new (or switches to an existing) buffer in shell-mode within which a shell is run. To run multiple shells, rename each shell-mode buffer and reinvoke M-x shell.

      Invest time researching the capabilities of Emacs. The payoff is great.

      (1) is a snapshot (ignoring color and font type) of the terminal at a given moment in time (rather than a movie, which is what script captures.)

      A snapshot is a still image of a movie. The movie is typescript. The scriptreplay I posted features pausing and continuing the script replay with a space bar press. There's your snapshot; there is no other way, since terminal output is about more than color: size, movements and erasing.

      You cannot get at some movie snapshot without scrolling through it. For fast scrolling, hit the + key which will increase the divisor; to slow it down, hit the - key.

      (2) as if the terminal had no buffer size limitation

      The buffer grows and shrinks. Is the buffer state to be considered before or after a user hit <Ctrl>L or typed "clear"? The buffer also remains static while copious typing is sent to the terminal - e.g. editing text with vi. No scrollback buffer involved, it all happens at the same buffer location.

      (3) is editable and searchable with a text editor, (4) is the output of a command-line program, not the result of copy/pasting with a mouse.

      Your wish (3) cannot be fulfilled, since terminal output isn't text - it's not linear. A terminal is a screen area in time. Take command line editing, going back and forth inserting and deleting chars. At which moment in time is the searched text ready to be found? Wrt (4), any command line tool which converts terminal output into streaming text will produce something between "clean text" where significant information might be lost, and the "garbage" typescript recorded.

      --shmem

      _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                    /\_¯/(q    /
      ----------------------------  \__(m.====·.(_("always off the crowd"))."·
      ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: Wanted: humanly readable `script` output
by repellent (Priest) on May 08, 2008 at 08:11 UTC
    This is a best-effort to strip out most ANSI escape sequences:
    cat typescript | perl -pe 's/\e([^\[\]]|\[.*?[a-zA-Z]|\].*?\a)//g' | col -b

      Why, oh why do I always see people explaining away a tough problem as 'impossible to answer'? We're not discussing the halting problem or anything like that. I don't think it proper for monks (Perl or otherwise) to simply shrug off a tough problem as impossible to solve. Rather argue about it in order to find a solution (like Tibetan monks do, with great passion :-) )

      First: let's get the original poster's question and later clarification right: in simple terms of analogy: if you consider (scrolling) terminal output to be a movie of a special kind of printer which prints (and occasionally erases) text on the screen, and scrolling up the screen as feeding the (continuous feed) 'paper' out the top of the screen, then what he wants (and also what he doesn't need, e.g. full state recovery from every point in time) becomes very clear. The very fact that scrollback buffers exist and can be copied (although not very conveniently) corroborates this.

      That analogy should put your mind on the right track: printer, paper...hardcopy!

      One simple Google session then quickly brings up the solution: use GNU screen for your session. Start it with $ screen -h <scrollback buffer size>. At the shell prompt that appears, type all the commands (including whatever typos / editing), then when you're done issue <Ctrl-A> :hardcopy -h <dumpfile>.

      That's it. You can use whatever full-screen application (vi(m), mc, emacs) you like inside the session, only the last screenful from that will be saved.

      More information here: http://www.samsarin.com/blog/2007/03/11/gnu-screen-working-with-the-scrollback-buffer/

        Why, oh why do I always see people explaining away a tough problem as 'impossible to answer'?

        Because you're hallucinating
