Re: Wanted: humanly readable `script` output
by pc88mxer (Vicar) on May 04, 2008 at 19:55 UTC
|
If it is only backspaces you want to remove, try this:
my $text = "abcDE\x08\x08xyz\x08Z"; # \x08 is the backspace character
while ($text =~ m/\x08/g) {
substr($text, pos($text)-2, 2, '');
}
print $text, "\n"; # emits: abcxyZ
Actually, I'm a little surprised that this works. I would think that modifying $text before the last match position in a m//g loop would screw things up.
Update: A slightly more optimized version:
while ($text =~ m/(\x08+)/g) {
substr($text, pos($text)-2*length($1), 2*length($1), '');
}
print $text, "\n";
|
Thanks for the suggestions everyone, but there are a lot of other control characters besides the backspace. If you are at a unix terminal, try this:
script
ls
exit
Then look in typescript. You'll see what I mean.
Actually, there is a semi-passable solution that does not involve perl:
TERM=dumb script
ls
exit
cat typescript | col -b > readable.txt
This seems to work as long as all the commands you type after script work on a dumb terminal. I was just wondering if perl had a more robust solution. I had a look at Term::Cap, but I fail to understand how to use it.
I feel there must be a solution because, after all, I can see the very text I want on the terminal screen -- it's all right there! -- until my buffer size is reached. Usually copy/paste with the mouse from terminal to emacs works ok, but it didn't work the other night when I was compiling a kernel -- the output exceeded my buffer size.
|
To capture the output of a kernel compile, why not just redirect the output?
make zImage... > compile-output 2>&1
Also, I usually put such commands in the background and tail the output file occasionally to see how things are going.
|
|
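#!/usr/bin/perl
use strict;
use warnings;
# Read the raw typescript and print a filtered copy to STDOUT.
open(my $fh, "<", "typescript") or die "Cannot open typescript: $!";
while (my $line = <$fh>) {
    $line =~ s/.\x08//g;      # drop each backspace and the character before it
    $line =~ s/\p{IsC}//g;    # drop everything in Unicode category C (newline included)
    print "$line\n";          # so the newline is put back here
}
close($fh);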
This left behind a lot of funny things, which, to my eyes look like:
[00m, [01;34m, [01;35m, [m]0;
characters.
|
|
This seems to work as long as all the commands you type after script work on a dumb terminal. I was just wondering if perl had a more robust solution
How could removing terminal-specific control sequences be any more robust than not having them in the first place?
The next best solution would be to have some program which processes your terminal's control sequences to generate a flat file. However, emulating a device is rather complex and error-prone (hardly robust), and you'll have to make some concession or another in order to handle text being overwritten.
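To give a feel for what "processing the control sequences" involves, here is a minimal sketch that treats one line as a row of screen cells and interprets only backspace and carriage return. The name render_line is made up for illustration; escape sequences, tabs, line wrapping and everything else a real terminal does are deliberately left out, so this is nowhere near an emulator.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): render one line the way a very
# dumb terminal would. Only backspace (\x08) and carriage return (\r) are
# interpreted; all other control characters are dropped.
sub render_line {
    my ($line) = @_;
    my @cells;      # characters currently "on screen" for this line
    my $col = 0;    # cursor column
    for my $ch (split //, $line) {
        if    ($ch eq "\x08")        { $col-- if $col > 0 }     # cursor left
        elsif ($ch eq "\r")          { $col = 0 }               # back to column 0
        elsif ($ch =~ /[[:cntrl:]]/) { next }                   # ignore the rest
        else                         { $cells[$col++] = $ch }   # overwrite in place
    }
    return join '', @cells;
}

print render_line("abcDE\x08\x08xyz\x08Z"), "\n";          # abcxyZ
print render_line("progress 10%\rprogress 99%"), "\n";     # progress 99%
```

Note the concession made here: backspace only moves the cursor, so text that was typed and then backed over survives unless something is written on top of it.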
|
You can replace the while loop with this little regex:
$text =~ s/.\x08//g;
|
my $text = "No \x08\x08\x08 good";
$text =~ s/.\x08//g;
print $text;
Prints:
No good
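The single-pass regex goes wrong on runs of backspaces because the dot happily matches a backspace. One possible workaround, sketched here under the assumption that every backspace is meant to erase the character before it, is to remove one character/backspace pair at a time until none is left. The name apply_backspaces is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): apply backspaces one at a time,
# so that a run of N backspaces erases the N characters before it.
sub apply_backspaces {
    my ($text) = @_;
    # Repeatedly delete one ordinary character (not a backspace, not a
    # newline) together with the backspace that follows it.
    1 while $text =~ s/[^\x08\n]\x08//;
    # Backspaces with nothing left to erase are simply dropped.
    $text =~ tr/\x08//d;
    return $text;
}

print apply_backspaces("No \x08\x08\x08 good"), "\n";     # prints " good"
print apply_backspaces("abcDE\x08\x08xyz\x08Z"), "\n";    # prints "abcxyZ"
```

Each pass rescans the string from the start, so this is quadratic in the worst case; for a typescript processed line by line that should rarely matter.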
Perl is environmentally friendly - it saves trees
|
Re: Wanted: humanly readable `script` output
by shmem (Chancellor) on May 05, 2008 at 01:22 UTC
|
What perl module can process the typescript file so I get a humanly readable ascii file?
To what end? Terminal captures include chars that are typed then erased, cursor positioning sequences, text colouring and so on. You can't convert terminal output into a "human readable ascii file" without losing information which might be important to you, and you have a hard time converting terminal output into a human readable stream of information.
You could view the typescript file with less -R which renders escape sequences directly, but cursor movement sequences will blow it. Terminal output has been produced by the terminal, and your terminal is the program which groks the captured output, so use that.
If what you want is reading the terminal output again at a later time, capture that with
script -t typescriptfile 2> timingfile
to include timing information, then view it with 'scriptreplay'. Between <readmore> tags is the slightly hacked version I use to that end. Speeding up the output with a divisor lets you skim typescript files fast. I don't know of any better pager for typescript files. Update: adapted for use without timing file.
--shmem
_($_=" "x(1<<5)."?\n".q·/)Oo. G°\ /
/\_¯/(q /
---------------------------- \__(m.====·.(_("always off the crowd"))."·
");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: Wanted: humanly readable `script` output
by CountZero (Bishop) on May 04, 2008 at 19:37 UTC
|
Without having seen your 'typescript'-file, I guess that filtering it through a regular expression which only allows printable characters through is the favoured solution.
CountZero "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
|
The GNU "strings" program should do exactly that. It's installed on all Linux machines I know. Note that it ignores delete characters etc (instead of deleting the previous character), but that's /usually/ not much of a problem.
Re: Wanted: humanly readable `script` output
by swampyankee (Parson) on May 04, 2008 at 19:50 UTC
|
It shouldn't be too difficult to write a Perl script to strip control characters; I doubt if a module is needed. Look in perlre, perlretut, and perlrecharclass.
Information about American English usage here and here. Floating point issues? Please read this before posting. — emc
|
The copy of the man page I have for cat (GNU coreutils 6.9.92.4-f088d-dirt January 2008) says this:
"-v, --show-nonprinting
use ^ and M- notation, except for LFD and TAB"
So it doesn't strip control characters, but maps them to a different form. The FreeBSD entry for cat has a similar, but not identical, description of the -v option.
Re: Wanted: humanly readable `script` output
by mrslopenk (Acolyte) on May 05, 2008 at 19:28 UTC
|
Thank you ikegame and shmem for your replies. You both seemed to be asking what I want this for. Indeed, from your responses I see I forgot to state clearly what criteria I'm looking for in a solution.
I want a file which (1) is a snapshot (ignoring color and font type) of the terminal at a given moment in time (rather than a movie, which is what script captures.) and (2) as if the terminal had no buffer size limitation, (3) is editable and searchable with a text editor, (4) is the output of a command-line program, not the result of copy/pasting with a mouse.
Up to now, my usual practice has been to copy/paste stuff from a terminal into an emacs buffer using a mouse. I lose coloring and bold face text, but I obtain a snapshot that respects erased characters and cursor positioning up to a certain point in time. That satisfies criteria (1) and (3), but not (2) and (4).
Thank you all very much for your thoughtful suggestions, but due to my inaccurate description of what I'm looking for, none of the suggestions so far meet my criteria.
|
Both ikegame and shmem also explained that what you are looking for isn't easily done. Yes perl (or any other language) can do it, and you're barking up the right tree looking at Term::Cap if you're determined to pursue actually creating this beast.
Let me attempt to shed some light on why this problem is "hard". Look at your local Linux /etc/termcap file: it is the configuration file describing the myriad "terminals" that have existed over the years. Using libraries and this configuration file, it is possible to create your own display engine, and this is roughly how the terminal emulation programs you might use work, whether that is PuTTY, SecureCRT (my fav.), or xterm (or not) on the unix console.
In my local copy there are over 1500 lines with terminal names/types. So, if you want this to be truly robust you have to be able to handle all those terminal types. Not easy, but with the help of libraries and the termcap file, it isn't as large a mountain as it might seem; but it's certainly not child's play.
Now, in your favor is the fact that in the real world, you're typically only going to care about a small subset of terminals, namely vt100/200/220 and xterm. Those are de facto standards today, but Linux and ANSI are two others that might also be widely used. That's still 6 different terminal protocols you have to deal with 100% correctly in order to be robust.
If you're still determined to sink a lot of time into this, the PuTTY source code is available; I suggest using that as a guide. Good luck. -Scott
|
(1) is a snapshot (ignoring color and font type) of the terminal at a given moment in time (rather than a movie, which is what script captures.)
A snapshot is a still image of a movie. The movie is typescript. The scriptreplay I posted features pausing and continuing the replay with a press of the space bar. There's your snapshot; there is no other way, since there is more to terminal output than color: size, movements and erasing.
You cannot get at some movie snapshot without scrolling through it. For fast scrolling, hit the + key which will increase the divisor; to slow it down, hit the - key.
(2) as if the terminal had no buffer size limitation
The buffer grows and shrinks. Is the buffer state to be considered before or after a user hit <Ctrl>L or typed "clear"? The buffer also remains static while copious typing is sent to the terminal - e.g. editing text with vi. No scrollback buffer involved, it all happens at the same buffer location.
(3) is editable and searchable with a text editor, (4) is the output of a command-line program, not the result of copy/pasting with a mouse.
Your wish (3) cannot be fulfilled, since terminal output isn't text - it's not linear. A terminal is a screen area in time. Take command line editing, going back and forth inserting and deleting chars. At which moment in time is the searched text ready to be found? Wrt (4), any command line tool which converts terminal output into streaming text will produce something between "clean text" where significant information might be lost, and the "garbage" typescript recorded.
--shmem
Re: Wanted: humanly readable `script` output
by repellent (Priest) on May 08, 2008 at 08:11 UTC
|
This is a best-effort to strip out most ANSI escape sequences:
cat typescript | perl -pe 's/\e([^\[\]]|\[.*?[a-zA-Z]|\].*?\a)//g' | col -b
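For anyone who wants to adapt that regex, here is the same pattern spelled out with /x and comments. This is only a sketch: the name strip_ansi is made up, and the labels CSI and OSC are my reading of what the alternatives are meant to catch.

```perl
use strict;
use warnings;

# Hypothetical helper: the one-liner's regex, wrapped in a sub and commented.
sub strip_ansi {
    my ($text) = @_;
    $text =~ s{
        \e                      # ESC starts every sequence
        (?:
            [^\[\]]             # a two-character escape: ESC plus one other char
          | \[ .*? [a-zA-Z]     # CSI: ESC [ ... up to the first letter
          | \] .*? \a           # OSC: ESC ] ... up to BEL, such as a title change
        )
    }{}gx;
    return $text;
}

print strip_ansi("\e[01;34mbin\e[00m  \e[01;35mlogo.png\e[00m"), "\n";   # bin  logo.png
print strip_ansi("\e]0;user\@host: ~\aprompt> ls"), "\n";                # prompt> ls
```

It stays best-effort: sequences that end in something other than a letter or BEL will slip through, and backspaces still have to be handled separately (col -b in the pipeline above).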
|
Why, oh why do I always see people explaining away a tough problem as 'impossible to answer'? We're not discussing the halting problem or anything like that. I don't think it proper for monks (Perl or otherwise) to simply shrug off a tough problem as impossible to solve. Rather argue about it in order to find a solution (like Tibetan monks do, with great passion :-) )
First: let's get the original poster's question and later clarification right: in simple terms of analogy: if you consider (scrolling) terminal output to be a movie of a special kind of printer which prints (and occasionally erases) text on the screen, and scrolling up the screen as feeding the (continuous feed) 'paper' out the top of the screen, then what he wants (and also what he doesn't need, e.g. full state recovery from every point in time) becomes very clear. The very fact that scrollback buffers exist and can be copied (although not very conveniently) corroborates this.
That analogy should put your mind on the right track: printer, paper...hardcopy!
One simple Google session then quickly brings up the solution: use GNU screen for your session. Start it with
$ screen -h <scrollback buffer size>
At the shell prompt that appears, type all the commands (including whatever typos / editing), then when you're done issue
<Ctrl-A> :hardcopy -h <dumpfile>
That's it. You can use whatever full-screen application (vi(m), mc, emacs) you like inside the session; only the last screenful from that will be saved.
More information here: http://www.samsarin.com/blog/2007/03/11/gnu-screen-working-with-the-scrollback-buffer/
|
Why, oh why do I always see people explaining away a tough problem as 'impossible to answer'?
Because you're hallucinating
|