stevieb has asked for the wisdom of the Perl Monks concerning the following question:
I've never had a need to use seek() in my Perl years, so after I got looking at a piece of code BrowserUK posted earlier today, I thought I'd ask about it.
In the code below, I expected the line seek $fh, $_ *80, 0; to set the cursor to column 80 (times the current value of $_), but it then proceeds to print 78 chars to the file. When I look at the file, all lines start at column 0.
#! perl -slw
use strict;
use threads;
open my $fh, '>', 'junk.dat' or die $!;
$_->join for map {
async {
seek $fh, $_ *80, 0;
print $fh $_ x 78;
};
} 1 .. 4;
close $fh;
open $fh, '<', 'junk.dat' or die $!;
print while <$fh>;
close $fh;
perldoc -f seek doesn't help it click for me. Can someone advise what I'm missing in the above code? I did modify the code to try to understand, but the reason why things start at col 0 eludes me.
-stevieb
Re: Position in seek() confusion
by BrowserUk (Patriarch) on Jul 01, 2015 at 02:57 UTC
That was a crude demonstration knocked up in haste. If you run it, you'll find that in addition to what you've noticed, it also prints the lines double spaced.
This is due to the use of -l on the shebang line. Newlines are added when the lines are printed to the file and then again when they are read back and printed to STDOUT.
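The double spacing can be reproduced without threads or a real file. The following is a minimal sketch, assuming the default $/ of "\n"; the sub name roundtrip_with_ors and the in-memory filehandles are only for illustration. Without -n or -p, -l does not chomp input, it only sets $\, so each line picks up one newline on the way into the file and a second one on the way out:

```perl
use strict;
use warnings;

# Emulates the output side of -l by setting $\ (hypothetical helper).
sub roundtrip_with_ors {
    my ($text) = @_;
    local $\ = "\n";               # what -l sets when $/ is the default "\n"

    my $file = '';
    open my $out, '>', \$file or die $!;
    print $out $text;              # stored as "$text\n"
    close $out;

    my $shown = '';
    open my $in,  '<', \$file  or die $!;
    open my $tty, '>', \$shown or die $!;
    while ( my $line = <$in> ) {   # $line still ends in "\n" (nothing chomps it)
        print $tty $line;          # print appends a second "\n"
    }
    close $in;
    close $tty;
    return ( $file, $shown );
}

my ( $file, $shown ) = roundtrip_with_ors( '1' x 5 );
```

Here $file ends with one newline and $shown with two, which is the blank line seen between the lines on STDOUT.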
I took the liberty of discarding the spaces when posting. If you want a better version, try:
#! perl -sw
use strict;
use threads;
use threads::shared;
my $sem :shared;
open my $fh, '>:raw', 'junk.dat' or die $!;
$_->join for map {
async {
for my $i ( 0 .. 999 ) {
lock $sem;
seek $fh, $i*320 + ($_-1)*80, 0;
print $fh $_ x 79, "\n";
}
};
} 1 .. 4;
close $fh;
open $fh, '<', 'junk.dat' or die $!;
print while <$fh>;
close $fh;
__END__
C:\test>t-write.pl
1111111111111111111111111111111111111111111111111111111111111111111111
+11111111
2222222222222222222222222222222222222222222222222222222222222222222222
+22222222
3333333333333333333333333333333333333333333333333333333333333333333333
+33333333
4444444444444444444444444444444444444444444444444444444444444444444444
+44444444
1111111111111111111111111111111111111111111111111111111111111111111111
+11111111
2222222222222222222222222222222222222222222222222222222222222222222222
+22222222
3333333333333333333333333333333333333333333333333333333333333333333333
+33333333
4444444444444444444444444444444444444444444444444444444444444444444444
+44444444
1111111111111111111111111111111111111111111111111111111111111111111111
+11111111
2222222222222222222222222222222222222222222222222222222222222222222222
+22222222
3333333333333333333333333333333333333333333333333333333333333333333333
+33333333
4444444444444444444444444444444444444444444444444444444444444444444444
+44444444
1111111111111111111111111111111111111111111111111111111111111111111111
+11111111
2222222222222222222222222222222222222222222222222222222222222222222222
+22222222
3333333333333333333333333333333333333333333333333333333333333333333333
+33333333
4444444444444444444444444444444444444444444444444444444444444444444444
+44444444
1111111111111111111111111111111111111111111111111111111111111111111111
+11111111
2222222222222222222222222222222222222222222222222222222222222222222222
+22222222
3333333333333333333333333333333333333333333333333333333333333333333333
+33333333
4444444444444444444444444444444444444444444444444444444444444444444444
+44444444
1111111111111111111111111111111111111111111111111111111111111111111111
+11111111
2222222222222222222222222222222222222222222222222222222222222222222222
+22222222
3333333333333333333333333333333333333333333333333333333333333333333333
+33333333
4444444444444444444444444444444444444444444444444444444444444444444444
+44444444
1111111111111111111111111111111111111111111111111111111111111111111111
+11111111
2222222222222222222222222222222222222222222222222222222222222222222222
+22222222
3333333333333333333333333333333333333333333333333333333333333333333333
+33333333
4444444444444444444444444444444444444444444444444444444444444444444444
+44444444
1111111111111111111111111111111111111111111111111111111111111111111111
+11111111
2222222222222222222222222222222222222222222222222222222222222222222222
+22222222
3333333333333333333333333333333333333333333333333333333333333333333333
+33333333
4444444444444444444444444444444444444444444444444444444444444444444444
+44444444
1111111111111111111111111111111111111111111111111111111111111111111111
+11111111
2222222222222222222222222222222222222222222222222222222222222222222222
+22222222
3333333333333333333333333333333333333333333333333333333333333333333333
+33333333
4444444444444444444444444444444444444444444444444444444444444444444444
+44444444
1111111111111111111111111111111111111111111111111111111111111111111111
+11111111
2222222222222222222222222222222222222222222222222222222222222222222222
+22222222
3333333333333333333333333333333333333333333333333333333333333333333333
+33333333
4444444444444444444444444444444444444444444444444444444444444444444444
+44444444
...
Note: I truncated some of the output when posting.
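The offset expression $i*320 + ($_-1)*80 in the code above may be easier to follow when spelled out. This is only a sketch of the arithmetic; the names RECORD_LEN, WRITERS and record_offset are made up here and are not part of the posted script. Each record is 80 bytes (79 characters plus "\n" under :raw), and each pass of the four threads covers 4 * 80 = 320 bytes:

```perl
use strict;
use warnings;

use constant RECORD_LEN => 80;   # 79 characters plus "\n"
use constant WRITERS    => 4;    # threads numbered 1 .. 4

# Byte offset of the record that writer $id (1-based) writes on pass $i (0-based).
sub record_offset {
    my ( $i, $id ) = @_;
    return $i * WRITERS * RECORD_LEN + ( $id - 1 ) * RECORD_LEN;
}

my @first_pass  = map { record_offset( 0, $_ ) } 1 .. WRITERS;   # 0 80 160 240
my @second_pass = map { record_offset( 1, $_ ) } 1 .. WRITERS;   # 320 400 480 560
```

Because of the ($_-1), thread 1 starts at byte 0, so unlike the first version nothing is left unwritten at the start of the file.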
Re: Position in seek() confusion
by Anonymous Monk on Jul 01, 2015 at 02:02 UTC
BEGIN { $^W = 1; }
BEGIN { $/ = "\n"; $\ = "\n"; }
use threads;
use strict;
die $! unless open my $fh, '>', 'junk.dat';
$_->join foreach (map {async sub {
seek $fh, $_ * 80, 0;
print $fh $_ x 78;
}
;} 1..4);
close $fh;
die $! unless open $fh, '<', 'junk.dat';
print $_ while defined($_ = <$fh>);
close $fh;
So on Windows "\n" becomes the bytes "\r\n" when you print, so 78 + 2 = 80; that's the length of each line.
so seek to 1*80 from zero print "1" 78 times followed by "\r\n"
so seek to 2*80 from zero print "2" 78 times followed by "\r\n"
so seek to 3*80 from zero print "3" 78 times followed by "\r\n"
so seek to 4*80 from zero print "4" 78 times followed by "\r\n"
Then when reading the file, the first 80 chars are null ("\0"), then it's 1111...222...333...4444
That's the Basic debugging checklist for you.
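The layout described above can be checked single-threaded. This is a sketch, not the original script: it uses a temporary file from File::Temp and writes the "\r\n" explicitly in binary mode, so the byte counts are the same on any platform:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write four 80-byte records at byte offsets 80, 160, 240 and 320.
my ( $fh, $path ) = tempfile( UNLINK => 1 );
binmode $fh;                                  # no "\n" to "\r\n" translation
for my $n ( 1 .. 4 ) {
    seek $fh, $n * 80, 0 or die "seek: $!";   # 0 = SEEK_SET, absolute byte offset
    print {$fh} $n x 78, "\r\n";              # 78 + 2 = 80 bytes
}
close $fh or die $!;

# Read the whole file back as raw bytes.
open my $in, '<:raw', $path or die $!;
my $data = do { local $/; <$in> };
close $in;

my $size          = length $data;             # 4 * 80 written + 80 skipped
my ($gap)         = $data =~ /\A(\0*)/;       # run of NUL bytes at the start
my $leading_nuls  = length $gap;
my $first_visible = substr $data, 80, 78;     # the record written by "1"
```

The file comes out at 400 bytes: bytes 0 .. 79 were never written and read back as "\0", and the first visible record starts at byte 80. It still appears at column 0 on screen because the null bytes in front of it are not displayed.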
Re: Position in seek() confusion
by Anonymous Monk on Jul 01, 2015 at 08:23 UTC
It's not at all clear to me how B::Deparse makes it obvious for people not familiar with seek... OTOH, Devel::Peek does. This is on Linux:
open my $fh, '>', 'junk.dat' or die $!;
for ( 1 .. 4 ) {
seek $fh, $_ * 80, 0;
print $fh $_ x 78, "\r\n";
}
open $fh, '<', 'junk.dat' or die $!;
while ( <$fh> ) {
use Devel::Peek;
Dump $_;
}
output:
SV = PV(0x189fd70) at 0x18c0a38
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0x18a4610 "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0
+\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\
+0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0001111111111111111111111111111
+11111111111111111111111111111111111111111111111111\r\n"\0
CUR = 160
LEN = 200
SV = PV(0x189fd70) at 0x18c0a38
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0x18a4610 "2222222222222222222222222222222222222222222222222222
+22222222222222222222222222\r\n"\0
CUR = 80
LEN = 200
SV = PV(0x189fd70) at 0x18c0a38
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0x18a4610 "3333333333333333333333333333333333333333333333333333
+33333333333333333333333333\r\n"\0
CUR = 80
LEN = 200
SV = PV(0x189fd70) at 0x18c0a38
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0x18a4610 "4444444444444444444444444444444444444444444444444444
+44444444444444444444444444\r\n"\0
CUR = 80
LEN = 200
So, when you seek past the end of file, it's as if the filesystem fills that part of file (0 .. 79 bytes in our case) with zero bytes (actually it creates a 'hole' in the file, which, when read, returns zeros - apparently NTFS also does that?). These zero bytes are then actually printed, just not displayed on the terminal:
$ perl -E 'say "\0\0\0ABC"'
ABC
$ perl -E 'say "\0\0\0ABC"' | perl -nE 'printf "%vx\n", $_'
0.0.0.41.42.43.a
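When inspecting a file like this from inside a script, it can help to escape anything that is not printable ASCII before printing it. A small sketch; the helper name visible is made up for this example:

```perl
use strict;
use warnings;

# Replace every byte outside printable ASCII with a \xNN escape.
sub visible {
    my ($bytes) = @_;
    ( my $out = $bytes ) =~ s/([^\x20-\x7e])/sprintf '\\x%02x', ord $1/ge;
    return $out;
}

my $shown = visible("\0\0\0ABC\n");   # \x00\x00\x00ABC\x0a
print "$shown\n";
```

This shows the same bytes as the %vx output above, but keeps the printable characters readable.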
Re: Position in seek() confusion
by Anonymous Monk on Jul 01, 2015 at 01:34 UTC
Re: Position in seek() confusion
by sundialsvc4 (Abbot) on Jul 01, 2015 at 12:35 UTC
|
You folks seriously told the OP to “de-parse” a Perl subroutine into its underlying bytecode-tree, assuring him or her that thereby it would be “obvious?” Really?? ... :-/ ... Gee, you people really are nerds!
The one and only thing to remember about seek(), in any programming language at all, is that it works strictly on byte-positions within the file. It does not know about newline sequences (in any of their one- or two-byte flavors ...), nor does it know about Unicode. The file consists of nothing more and nothing less than a collection of zero-or-more bytes, and seek() sets the read/write cursor to an absolute or relative byte position within that collection.
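To illustrate the point about bytes versus characters, here is a sketch that writes one non-ASCII character through a UTF-8 layer and then seeks in the raw file. The sample string and the temporary file are assumptions made for the example:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ( $fh, $path ) = tempfile( UNLINK => 1 );
binmode $fh, ':raw:encoding(UTF-8)';
my $text = "na\x{EF}ve";              # 5 characters; U+00EF is 2 bytes in UTF-8
print {$fh} $text;
close $fh or die $!;

my $chars = length $text;             # counted in characters
my $bytes = -s $path;                 # counted in bytes on disk

# seek counts bytes: offset 3 is the second byte of the 2-byte character,
# not the fourth character of the string.
open my $in, '<:raw', $path or die $!;
seek $in, 3, 0 or die "seek: $!";
read $in, my $byte, 1;
close $in;
my $hex = sprintf '%02x', ord $byte;  # af, the trailing byte of C3 AF
```

The string has 5 characters but the file has 6 bytes, and seeking to offset 3 lands in the middle of a character.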
You seriously replied to the OP indirectly? Not every context is a beginner's context. In this case, the OP was looking for deep knowledge and not just the Cliffs Notes. Can you not see this?
sundialsvc4:
You folks seriously told the OP to “de-parse” a Perl subroutine into its underlying bytecode-tree, assuring him or her that thereby it would be “obvious?” Really?? ... :-/ ... Gee, you people really are nerds!
The one and only thing to remember about seek(), in any programming language at all, is that it works strictly on byte-positions within the file. It does not know about newline sequences (in any of their one- or two-byte flavors ...), nor does it know about Unicode. The file consists of nothing more and nothing less than a collection of zero-or-more bytes, and seek() sets the read/write cursor to an absolute or relative byte position within that collection.
Poor flushells, you really don't know anything about B::Deparse, print, $/, -l
Why starts at "column 0"? Why 78/80? Because \r\n is added