PerlMonks  

Re^3: Get record separator of a file

by karlgoethebier (Curate)
on Nov 13, 2012 at 22:38 UTC


in reply to Re^2: Get record separator of a file
in thread Get record separator of a file

OK, I will try my best to explain.

I can see the recsep of a file with:

Karls-Mac-mini:Desktop karl$ hexdump -c -n 8 file.txt
0000000   f   o   o   ;   b   a   r  \n
0000008

Or at least I hope so.

But I really don't want to check it this way.

I have many larger files with \r\n or \n as recsep.

So I thought about efficiency and figured out that Tie::File is faster than IO::File for my needs (I benchmarked it, but that is another issue).

But when I tied my @array to the original file, all the data ended up in the first slot of my @array. After setting the recsep option of Tie::File to \n, everything was fine.
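For reference, a minimal sketch of passing recsep explicitly when tying. The sample data and the temporary file are made up for illustration; only the recsep option itself comes from the post above.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw( tempfile );

# Build a small sample file with "\n" line endings (made-up data).
my ($tmp_fh, $path) = tempfile( UNLINK => 1 );
binmode $tmp_fh;
print {$tmp_fh} "foo;bar\n", "baz;qux\n";
close $tmp_fh or die "close failed: $!";

# Tell Tie::File explicitly which separator ends a record.
tie my @records, 'Tie::File', $path, recsep => "\n"
    or die "Cannot tie $path: $!";

my $count = scalar @records;   # number of records in the file
my $first = $records[0];       # separator is stripped (autochomp is on by default)

print "records: $count, first: $first\n";

untie @records;
```

If the separator passed here does not match what is actually in the file, the records will not be split the way you expect, which is why knowing the separator up front matters.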

So I thought it would be a good idea to do something like the hexdump command in Perl to get the recsep - without losing the performance boost that Tie::File gives me.
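One way this could look, as a rough sketch: read only the first few kilobytes in raw mode and check whether the first line feed is preceded by a carriage return. The helper name detect_recsep and the 4096-byte default are assumptions made for this example, and it only distinguishes \r\n from \n.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Hypothetical helper: returns "\r\n", "\n", or undef if the first
# $limit bytes of the file contain no line feed at all.
sub detect_recsep {
    my ($path, $limit) = @_;
    $limit ||= 4096;

    # :raw so that no layer translates the line endings before we see them.
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $got = read $fh, my $chunk, $limit;
    defined $got or die "Cannot read $path: $!";
    close $fh;

    # Find the first line feed and capture an optional CR right before it.
    return $chunk =~ /(\r?\n)/ ? $1 : undef;
}

# Demo with two throwaway files (sample data is made up).
my %sample = ( unix => "foo;bar\nbaz\n", dos => "foo;bar\r\nbaz\r\n" );
my %found;
for my $kind (sort keys %sample) {
    my ($out, $path) = tempfile( UNLINK => 1 );
    binmode $out;
    print {$out} $sample{$kind};
    close $out or die "close failed: $!";
    $found{$kind} = detect_recsep($path);
}
```

The detected value could then be handed to Tie::File as recsep => $sep. Since only the start of the file is read, the check costs about the same no matter how large the file is; a file with mixed line endings would be reported according to its first line only.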

I hope very much that this is a better explanation of what I wanted to do.

Thank you very much for your patience and help.

Regards, Karl

«The Crux of the Biscuit is the Apostrophe»


Re^4: Get record separator of a file
by davido (Archbishop) on Nov 14, 2012 at 00:08 UTC

    Now we're getting somewhere (I think). You should be able to take advantage of Perl's :crlf IO layer to handle the problem for you. I've tested this, and here is how it works out.

    First, Tie::File seems to be "layers" unaware, which is fine, except that you'll have to open the file explicitly, and close it again when you're done, rather than letting Tie::File handle those operations. This gives you control over what layers are applied to the file handle.

    use strict;
    use warnings;
    use Tie::File;
    use Scalar::Util qw( weaken );

    open my $fh, '+<:crlf', 'filename.ext' or die $!;

    my @array;
    my $t = tie @array, 'Tie::File', $fh;
    weaken $t; # tie holds its own ref. We don't want a mem leak.

    # Work, work, work...

    untie @array;
    close $fh or die $!;

    The relevant explanation of ':crlf' from the POD is: " On read converts pairs of CR,LF to a single "\n" newline character. On write converts each "\n" to a CR,LF pair." Since this happens behind the scenes, it should play nice with Tie::File, but I would test on some copies of the files first to be sure.
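To see the read-side behaviour described in that POD quote in isolation, here is a small sketch that reads the same CRLF file once raw and once through :crlf. The file contents are made up, and this only shows plain filehandle reads, not Tie::File itself.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Write a file with explicit CRLF line endings, bypassing any translation.
my ($out, $path) = tempfile( UNLINK => 1 );
binmode $out;
print {$out} "foo;bar\r\nbaz;qux\r\n";
close $out or die "close failed: $!";

# Read it raw: the CR stays in front of each newline.
open my $raw, '<:raw', $path or die "Cannot open $path: $!";
my @raw_lines = <$raw>;
close $raw;

# Read it through :crlf: each CR,LF pair arrives as a single "\n".
open my $crlf, '<:crlf', $path or die "Cannot open $path: $!";
my @crlf_lines = <$crlf>;
close $crlf;

printf "raw line length: %d, crlf line length: %d\n",
    length $raw_lines[0], length $crlf_lines[0];
```

With the layer in place the program only ever sees "\n", so the same code can handle both kinds of files, as long as it is the one opening the handle.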

    Updated: Added weaken to eliminate a potential memory leak, since tie also holds a ref to its own object.


    Dave

      Update:

      Sorry, I saw it too late. Cool, I didn't know this.

      Works when called with $fh. Very nice!

      Thank you very much and best regards, Karl

      «The Crux of the Biscuit is the Apostrophe»

      Just thank you very much, Karl

      «The Crux of the Biscuit is the Apostrophe»
