PerlMonks  

splitting on win32 linefeeds aka pass the crackpipe!

by mr.dunstan (Monk)
on Feb 14, 2002 at 01:42 UTC (#145333)

mr.dunstan has asked for the wisdom of the Perl Monks concerning the following question:

Hi - I'm 3 hours into this problem and I'm getting bummed out. Here's what I'm trying to do:

I have a CGI script I'm working on that lets me upload a file into the Perl script using a file input element, CGI.pm, and a multipart form. No problem there, been there done that.

The problem is - once I have read the file contents into memory, I want to parse it line by line, ideally by reading each line into an array. (This is fixed-width data mixed in with some unstructured data, thanks silly vendor, and the document itself is only vaguely structured, but if I can read it line by line I'm good to go.) The thing is - once I've read it into a scalar, the line feeds seem to disappear, BUT if I write it out to a file on the local filesystem (not an option in production BTW), the line feeds are there?!

I've tried split on the line feeds but they're not there - see example below. Way confused. Anyway here's what I'm trying to do, in short. Yes I am using strict ...

use strict, CGI, blah blah ...

my $cgi = new CGI;
my $filename = $cgi->param('uploaded_file');
my $file = $filename;        # can't remember why I do this, always have
$file =~ s!^.*(\\|\/)!!;     # something about cleaning stuff up
my $buffer;
my $content_to_parse;
while (my $bytesread = read($filename,$buffer,1024)) {
    $content_to_parse = $content_to_parse . $buffer;
}
# ok now file contents are in $content_to_parse
# try splitting on ^M (windoze newlines)
my @contentarray = split /\015$/, $content_to_parse;
# doesn't work - @contentarray is empty
# scratch head, get another diet coke
# now do line by line parsing ... the end

I'm sure there's a better way, like reading the file into an array somehow. Oh perlmonks deliver me from this pain!!!!

-mr.dunstan

Replies are listed 'Best First'.
Re: splitting on win32 linefeeds aka pass the crackpipe!
by little (Curate) on Feb 14, 2002 at 01:53 UTC
    Better to split this way, so you can also split files saved in Unix format:

    # try splitting on ^M (windoze newlines)
    my @contentarray = split /\r?\n/, $content_to_parse;

    Have a nice day
    All decision is left to your taste
      Wow. I feel dumb. Thanks!

      -mr.dunstan
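
For anyone wondering why the original split left everything in one piece: without the /m modifier, `$` only matches at the very end of the string (or just before a final newline), so `/\015$/` cannot match at the line breaks in the middle of the data. little's `/\r?\n/` matches an optional carriage return followed by a line feed, so it handles both Windows and Unix line endings. Here is a minimal sketch of the difference; the sample string is made up for illustration and just stands in for the uploaded file contents.

```perl
use strict;
use warnings;

# Hypothetical sample standing in for the uploaded file contents:
# two CRLF-terminated lines followed by one LF-terminated line.
my $content_to_parse = "first line\015\012second line\015\012third line\012";

# Original pattern: without /m, `$` only matches at the end of the string
# (or just before a final newline), so the CR LF pairs in the middle of
# the data never match and nothing gets separated.
my @original = split /\015$/, $content_to_parse;

# little's pattern: an optional CR followed by LF matches every line ending.
my @lines = split /\r?\n/, $content_to_parse;

printf "original pattern: %d piece(s)\n", scalar @original;
printf "\\r?\\n pattern:    %d line(s)\n", scalar @lines;
print "[$_]\n" for @lines;
```

With this sample, the first split hands back the whole content as a single piece, while the second yields the three lines with their line endings removed.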
Re: splitting on win32 linefeeds aka pass the crackpipe!
by buckaduck (Chaplain) on Feb 14, 2002 at 02:25 UTC
    Update: Mea culpa. I wasn't paying attention to the .* in the regex. Sigh. But I'll leave my post intact, to preserve my mistake for posterity...

    Off topic, but:

    my $file = $filename;        # can't remember why I do this, always have
    $file =~ s!^.*(\\|\/)!!;     # something about cleaning stuff up
    ...is pretty ugly. If you're trying to trim a trailing slash, it's easier to read like this:
    my $file = $filename;
    $file =~ s![\\/]$!!;
    If you can't tell what your own regex is doing, it's time to clean it up.

    buckaduck

      The original substitution doesn't actually remove a trailing slash; rather, it removes the directory path leaving just the name of the file. If $filename is '/usr/local/bin/perl', $file will be 'perl'.

      This is better done with File::Basename.

      chipmunk++ for pointing out File::Basename.

      mr.dunstan, you'd probably be better off with this:
      my ($actualFileName, $filePath) = fileparse ($filename);
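
One caveat on the File::Basename suggestion, offered as a sketch rather than the definitive fix: fileparse() splits paths using the rules of the OS the script runs on, so on a Unix server it will not treat backslashes as separators. If the browser sends a full Windows path (an assumption here, and the sample path below is made up), fileparse_set_fstype() can switch the parsing rules. The example compares that with the original substitution.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse fileparse_set_fstype);

# Hypothetical client-side path, as a Windows browser might send it.
my $filename = 'C:\\Documents\\vendor\\report.txt';

# The original substitution: strip everything up to the last \ or /.
(my $by_regex = $filename) =~ s!^.*(\\|\/)!!;

# fileparse() uses the separators of the current OS by default, so switch
# it to Windows rules for this call, then restore the previous setting.
my $previous_type = fileparse_set_fstype('MSWin32');
my ($name, $dir) = fileparse($filename);
fileparse_set_fstype($previous_type);

print "regex:     $by_regex\n";
print "fileparse: $name (directory part: $dir)\n";
```

Both approaches give the bare file name for this path. The fileparse() version also returns the directory part, and it makes the intent obvious to whoever reads the script next.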
