PerlMonks  

Problem with files read to array split on empty lines

by jhoop (Acolyte)
on Aug 03, 2012 at 05:39 UTC ( #985148=perlquestion )
jhoop has asked for the wisdom of the Perl Monks concerning the following question:

I am puzzled. I have a script that should split the contents of a bunch of input files on empty lines. However, the same script that works on my local machine doesn't work on the same files when executed from the server install I want to run it from. When run from the server, the code below splits every file into only 2 parts: the split occurs only at the first of the several blank-line breaks between text blocks. When run from my local install it operates properly. I'm new to running scripts from a remote install, and I can't figure out what the issue is. Any help much appreciated!

    my @mailfiles = </path/to/files/*>;
    foreach my $file (@mailfiles){
        my @text_blocks;
        open TEXT, '<', $file or die "could not open $file";
        my $text;
        while (<TEXT>){
            $text .= $_;
        }
        @text_blocks = split(/\n{2,}/, $text);
        close TEXT;
        #also tried the method below with similar results
        #{
        #    local $/ = '';
        #    @text_blocks = <TEXT>;
        #}
        #close TEXT;
        print scalar(@text_blocks);
    }

This prints '2222' when run on the server, and '4426' when run at home on the same four files.

Edit: I should have mentioned that the break between the first and second text block (the only point where the server script splits properly) is the break between the header and body of an email message written to the file by Mail::IMAPClient. The rest of the blank lines are inside the body of the original email message.

Re: Problem with files read to array split on empty lines
by frozenwithjoy (Curate) on Aug 03, 2012 at 05:45 UTC
    Can you please tell us which versions of perl you have locally vs remotely? ( perl -v )

      local: 5.14.2 - remote: 5.008008

Re: Problem with files read to array split on empty lines
by tobyink (Abbot) on Aug 03, 2012 at 05:59 UTC

    Are you sure the files are identical? Look at their file sizes. Do they perhaps have different line endings? \n versus \r\n.

    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'

      I believe so, but I can't say for sure. The copies on the server are the originals; I pulled them to the home machine to test them after the script didn't work as expected. Windows does claim two of them are slightly smaller (-.01k) in size.

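        How did you transfer them? FTP? Many FTP clients will choose to transfer some files (often using the file name as a hint as to whether it's appropriate) in "ASCII mode", which means that the line endings are changed to the standard ones for the local system during transfer.

        Try replacing your /\n{2,}/ regular expression with:

        /(?>\r\n|\n|\r){2,}/

        ... which should match two or more consecutive occurrences of any common line ending. The atomic group (?>...) matters: with a plain (?:\r?\n|\r) group, the regex engine can backtrack and count a single \r\n as two endings (a \r followed by a \n), so every CRLF line break would look like a blank line.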
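
        A minimal sketch of that kind of split, for illustration only. The helper name split_blocks and the sample string are hypothetical, not from the original script. The pattern uses an atomic group, (?>...), so that a lone \r\n is never counted as two line endings; atomic groups are available in both Perl versions mentioned in this thread.

```perl
use strict;
use warnings;

# Split a string into blocks separated by one or more blank lines,
# accepting LF, CRLF, or bare CR line endings.
# (split_blocks is a hypothetical helper name used for illustration.)
sub split_blocks {
    my ($text) = @_;
    # The atomic group stops the engine from backtracking into a matched
    # "\r\n" and re-reading it as "\r" followed by "\n".
    return split /(?>\r\n|\n|\r){2,}/, $text;
}

# Made-up sample: a header line and two body paragraphs, CRLF endings.
my $crlf = "Subject: hi\r\n\r\nfirst paragraph\r\n\r\nsecond paragraph\r\n";

my @naive  = split /\n{2,}/, $crlf;   # CRLF text has no two adjacent \n
my @robust = split_blocks($crlf);

printf "naive: %d, robust: %d\n", scalar @naive, scalar @robust;
```

        On CRLF data the original /\n{2,}/ pattern finds nothing to split on, because a \r always sits between two \n characters, while the line-ending-agnostic pattern still finds each blank line.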

Re: Problem with files read to array split on empty lines
by davido (Archbishop) on Aug 03, 2012 at 07:39 UTC

    What operating system were the files created on? What operating system is your local server running, and what operating system is the remote system running?

    Write a small script that slurps a file, and then uses tr/\015// and tr/\012// to count independently how many carriage returns and line-feeds the files have. You might find that at least one of the files has line endings that are incompatible with one of the operating systems you're using.
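
    A minimal sketch of such a counting script, assuming the file names are passed on the command line. The helper name count_line_endings is made up for illustration. It reads in binary mode so that no I/O layer translates the line endings before they are counted.

```perl
use strict;
use warnings;

# Count carriage returns (\015) and line feeds (\012) in a string.
# tr/// with an empty replacement list and no /d modifier leaves the
# string unchanged and returns the number of matching characters.
# (count_line_endings is a hypothetical helper name.)
sub count_line_endings {
    my ($data) = @_;
    my $cr = ($data =~ tr/\015//);
    my $lf = ($data =~ tr/\012//);
    return ($cr, $lf);
}

# Slurp each file named on the command line and report its counts.
for my $file (@ARGV) {
    open my $fh, '<', $file or die "could not open $file: $!";
    binmode $fh;                        # raw bytes, no CRLF translation
    my $data = do { local $/; <$fh> };  # undef $/ reads the whole file
    close $fh;
    my ($cr, $lf) = count_line_endings($data);
    print "$file: CR=$cr LF=$lf\n";
}
```

    Equal CR and LF counts suggest CRLF line endings, a CR count of zero suggests plain LF, and anything else points to mixed endings.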

    Perl's \n is a logical newline whose meaning depends on which OS your script is running under. It is \012 on *nix and \015 on classic Mac OS. On Windows it is \012 in memory, but files read or written in text mode have \015\012 translated to and from \n. If the file was created or edited on an OS that uses different line endings, you could get errant behavior when reading and dealing with the file. See perlport.


    Dave

      Thanks very much for the input. The files were created on the same Linux server on which the script was originally attempted. They were downloaded to my Win7 machine via the cPanel web UI. I will test your suggestions on the remote system and report back. I do not imagine it would do much good to test it on the local system, because if tobyink is correct the line endings could have been converted upon downloading the files.

Node Type: perlquestion [id://985148]
Approved by tobyink