http://www.perlmonks.org?node_id=6510

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm using a flat file database and getting problems when carriage returns (ASCII 0x0D) occur in the middle of lines. This happens, for example, when the text comes from a web form TEXTAREA field. How can I strip such carriage return characters from a string?

Re: How do I handle mid-line carriage returns in a flatfile database?
by cianoz (Friar) on Aug 26, 2000 at 16:32 UTC
    Usually I prefer to URI-encode multiline fields so I can preserve LF and CR. You can do it this way:
    use URI::Escape;
    my $safe_field = uri_escape($field);
    ## now you can put $safe_field in the db
    ## without worrying about CR/LF
    ## to restore it:
    $field = uri_unescape($safe_field);
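
To make the idea concrete, here is a minimal sketch of a round trip through a pipe-delimited record. It hand-rolls the percent-encoding with core Perl only, so that it runs without any CPAN module; treat it as a stand-in for illustration rather than as URI::Escape's implementation. The sub names encode_field and decode_field, the '|' delimiter, and the sample data are assumptions, and the sketch assumes byte strings (no characters above 0xFF).

```perl
use strict;
use warnings;

# Hypothetical helpers: percent-encode every byte outside a small safe set,
# so CR, LF and the '|' field delimiter can never appear in a stored field.
sub encode_field {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.~-])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Reverse the encoding: turn each %XX back into the byte it stands for.
sub decode_field {
    my ($text) = @_;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

my $field  = "first line\r\nsecond | line";
my $stored = encode_field($field);

# One physical line in the flat file: an id, then the encoded field.
my $record = join('|', 'id42', $stored);

# Reading it back: split on the delimiter, then decode the field.
my (undef, $raw) = split /\|/, $record;
my $restored = decode_field($raw);
```

Because '%' itself is outside the safe set, it is encoded as well, which is what keeps the decoding unambiguous.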
Re: How do I handle mid-line carriage returns in a flatfile database?
by btrott (Parson) on Mar 30, 2000 at 09:47 UTC
    I believe you should be able to just do something like this:
    $entry =~ tr/\r//d;
    This will strip every carriage return from $entry. Alternatively, you may wish to replace carriage returns with spaces (so that you don't get words running together):
    $entry =~ tr/\r/ /;
    I'd recommend the latter.
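
A small sketch of the difference between the two tr forms above, using a made-up sample string:

```perl
use strict;
use warnings;

# Sample data (an assumption): two lines joined by a bare carriage return.
my $entry = "first line\rsecond line";

# Delete every carriage return: the words run together.
(my $deleted = $entry) =~ tr/\r//d;

# Replace every carriage return with a space: the words stay separated.
(my $replaced = $entry) =~ tr/\r/ /;

print "$deleted\n";    # first linesecond line
print "$replaced\n";   # first line second line
```

Note that if the data contains "\r\n" pairs, the replace form leaves a space in front of each newline, so the delete form may be the better fit for that kind of input.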
      I think it's worthwhile pointing out that "\n" really means the local end of line marker, local as in the computer the script is run on.

      This is not a fixed standard as the following shows:
      Unix/Linux/OSX use \012
      Win/DOS usually uses \015\012 for text IO

      So running a script under (Uni|Linu|OS)X containing tr/\n//d on a file written on Windows won't behave the way you might think: it removes each \012 but leaves the \015 behind.
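
A minimal sketch of that pitfall, assuming the script runs on a Unix-like system and the line came from a DOS/Windows program (the sample string is made up). Writing the octal escapes explicitly avoids depending on what "\n" means locally:

```perl
use strict;
use warnings;

# A line as a DOS/Windows program would have written it: CR LF at the end.
my $line = "some field\015\012";

# chomp removes the value of $/ (by default "\n"), so on Unix only the
# \012 goes away and the \015 stays on the end of the string.
my $chomped = $line;
chomp $chomped;

# Stripping the ending explicitly handles both CR LF and bare LF.
(my $clean = $line) =~ s/\015?\012\z//;
```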

      You spend twenty years learning the spell that makes nude virgins appear in your bedroom, and then you're so poisoned by quicksilver fumes and half-blind from reading old grimoires that you can't remember what happens next.

Re: How do I handle mid-line carriage returns in a flatfile database?
by extremely (Priest) on Dec 30, 2000 at 06:14 UTC
    The best way is to stop using return to end database records. =) Set $/ = "\033\n"; and write each record with that sequence on the end. Using a two-character terminator gives you a plain return at the end of each record for eyeballing the files, plus a nasty binary character (ESC) that makes the end of a record distinguishable from any return inside your data.
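
A minimal sketch of that approach. It uses an in-memory filehandle in place of a real database file so that it is self-contained; the '|' field delimiter, the variable names, and the sample records are assumptions.

```perl
use strict;
use warnings;

my $EOR = "\033\n";    # ESC + newline, the record terminator suggested above

# Write two records, one with an embedded CR/LF, to an in-memory "file".
my $file = '';
open my $out, '>', \$file or die "cannot open in-memory file: $!";
print {$out} "alice|likes perl$EOR";
print {$out} "bob|line one\r\nline two$EOR";
close $out;

# Read them back one record at a time.
my @records;
open my $in, '<', \$file or die "cannot open in-memory file: $!";
{
    local $/ = $EOR;               # records now end at ESC + newline
    while (my $record = <$in>) {
        chomp $record;             # removes the whole "\033\n" terminator
        push @records, $record;
    }
}
close $in;
```

The embedded "\r\n" in the second record survives untouched, because a newline only ends a record when it directly follows an ESC.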
Re: How do I handle mid-line carriage returns in a flatfile database?
by gryphon (Abbot) on Dec 02, 2000 at 04:22 UTC

    While this is probably not the best way to go about this, the following has worked for me on many an occasion:

    use CGI qw(header param);
    print header;
    foreach (param) {
        ($$_ = param($_)) =~ s/\r\n|\n/<BR>/g;
    }

    Note: This relies on symbolic references ($$_), so it won't work with strict, and therefore probably shouldn't be used as is. Something a little safer might be:

    (my $some_variable = param('textbox-name')) =~ s/\r\n|\n/<BR>/g;

    The key thing is that depending on the user's OS, browser client, or other unknown something, you might end up with your params having just /\n/ or /\r\n/ in them. (Blame Micro$oft. Works for me.) ;) When I'm storing stuff in a flat-file "database," I'll usually store carriage returns as either /<BR>/ or /\\n/ and have Perl, JavaScript, or whatever deal with converting it to something else later.
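
A minimal sketch of the "store it as \\n and convert later" idea. The sub names escape_newlines and unescape_newlines are hypothetical, and doubling existing backslashes is an addition of this sketch so that the conversion can be reversed without ambiguity.

```perl
use strict;
use warnings;

# Turn real line endings into the two characters backslash + n.
sub escape_newlines {
    my ($text) = @_;
    $text =~ s/\\/\\\\/g;          # protect backslashes already in the data
    $text =~ s/\r\n|\r|\n/\\n/g;   # CR LF, bare CR or bare LF -> literal \n
    return $text;
}

# Reverse it: backslash + n becomes a newline, a doubled backslash becomes one.
sub unescape_newlines {
    my ($text) = @_;
    $text =~ s/\\(.)/$1 eq 'n' ? "\n" : $1/ge;
    return $text;
}

my $from_form = "line one\r\nline two\nline three";
my $stored    = escape_newlines($from_form);      # safe on one physical line
my $restored  = unescape_newlines($stored);       # line endings are now all \n
```

The round trip normalises every line ending to "\n", which is usually what you want for text that came from a TEXTAREA.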

    Shoot me an email if this doesn't help or answer what you're looking for.

Re: How do I handle mid-line carriage returns in a flatfile database?
by Anonymous Monk on Dec 30, 2000 at 05:33 UTC
    There's a whole bunch of stuff in the FAQ about matching over more than one line, like the /s modifier at the end of a regular expression, which lets "." match a newline, so "the stuff you find doesn't have to be on the same line".
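
A small sketch of what /s changes, using a made-up two-line string:

```perl
use strict;
use warnings;

my $text = "start\nend";

# Without /s, "." refuses to match the newline, so this match fails.
my $without_s = ($text =~ /start.end/)  ? 1 : 0;

# With /s, "." matches any character including newline, so this succeeds.
my $with_s    = ($text =~ /start.end/s) ? 1 : 0;

print "without /s: $without_s, with /s: $with_s\n";
```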

    It also says that if you set $/ = '' then Perl will read in paragraphs at a time, not lines at a time.
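
A minimal sketch of paragraph mode, reading from an in-memory filehandle so that it is self-contained (the sample data is made up):

```perl
use strict;
use warnings;

# Two "paragraphs" separated by a run of blank lines.
my $data = "line 1\nline 2\n\n\nline 3\n";

my @paragraphs;
open my $in, '<', \$data or die "cannot open in-memory file: $!";
{
    local $/ = '';                 # paragraph mode: blank lines end a record
    while (my $para = <$in>) {
        chomp $para;               # in paragraph mode, strips all trailing newlines
        push @paragraphs, $para;
    }
}
close $in;
```

Each element of @paragraphs still contains the single newlines inside the paragraph; only the blank-line separators are consumed.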

    Does that help?

    But obviously, killing the carriage returns on the way in to your program is going to solve it. The equivalent to the $/ = '' thing above is surely to

    * replace all double returns with a holding pattern like ||||

    * replace all single returns with spaces

    * replace all |||| with returns
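
The three steps above can be sketched as follows. The sub name unwrap_paragraphs is hypothetical, the sketch assumes the placeholder |||| never occurs in the real data, and the first substitution (normalising CR LF and bare CR to LF) is an addition so that the later steps only have to deal with one kind of return.

```perl
use strict;
use warnings;

sub unwrap_paragraphs {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;        # added step: normalise CR LF and bare CR to LF
    $text =~ s/\n{2,}/||||/g;     # 1. double (or longer) returns -> placeholder
    $text =~ s/\n/ /g;            # 2. remaining single returns   -> spaces
    $text =~ s/\|\|\|\|/\n/g;     # 3. placeholder                -> one return
    return $text;
}

my $input  = "one\r\ntwo\r\n\r\nthree";
my $output = unwrap_paragraphs($input);

print "$output\n";
```

With this input the first two lines are joined with a space and the blank line becomes a single return, so each paragraph ends up on one physical line.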