converting carriage returns to <br> tags (was: Simple Question for you guys)

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: Simple Question for you guys. by AidanLee (Chaplain) on May 18, 2001 at 17:19 UTC
You just need to do a global search and replace on the field, with a regular expression that swaps each newline for the tag: `$fieldtext =~ s|\n|<br />|g; # self-terminating tag for XHTML compliance` Note that I've changed the regex delimiter to a pipe ( | ) so I don't have to escape the '/' character in the `br` tag.
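A minimal sketch of the substitution described above, wrapped in a sub so it can be tried on sample input. The helper name `newlines_to_br` and the sample string are assumptions for illustration, not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every newline with an XHTML-style <br /> tag.
sub newlines_to_br {
    my ($fieldtext) = @_;
    # Pipe delimiters mean the '/' inside '<br />' needs no escaping.
    $fieldtext =~ s|\n|<br />|g;
    return $fieldtext;
}

print newlines_to_br("line one\nline two\n"), "\n";
```

The sub works on a copy of its argument, so the caller's original string is left untouched.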
Re: Simple Question for you guys. by blue_cowdawg (Monsignor) on May 18, 2001 at 17:24 UTC
`my $text = $cgi->param('mytextfield'); $text =~ s/\n/\<br\>/g;`
Peter L. Berghold --- Peter@Berghold.Net "Those who fail to learn from history are condemned to repeat it."
Re: Re: Simple Question for you guys. by AidanLee (Chaplain) on May 18, 2001 at 17:30 UTC
Why have you escaped the < and > symbols? AFAIK they are not special inside a regex.
Re: Re: Re: Simple Question for you guys. by blue_cowdawg (Monsignor) on May 18, 2001 at 17:41 UTC
Simple. I felt like it. When in doubt, escape. It doesn't cost much and it certainly doesn't hurt.
Re: Re: Re: Re: Simple Question for you guys. by davorg (Chancellor) on May 18, 2001 at 17:44 UTC
Re: Re: Re: Re: Re: Simple Question for you guys. by chipmunk (Parson) on May 18, 2001 at 18:20 UTC
Re: Re: Re: Re: Re: Simple Question for you guys. by blue_cowdawg (Monsignor) on May 18, 2001 at 17:52 UTC
Some notes below your chosen depth have not been shown here
Re: Simple Question for you guys. by voyager (Friar) on May 18, 2001 at 20:13 UTC
As I learned when I posted a similar question, the textarea is likely to give you both newlines and carriage returns if the client is a PC. So you might want something like: `$textarea =~ s|[\r\n]|<br />|g;` Note: not sure if it's \r\n or \n\r.
Re: Re: Simple Question for you guys. by chipmunk (Parson) on May 18, 2001 at 23:05 UTC
If the textarea is returned with \r\n line endings, then that substitution will insert two <br /> tags at the end of each line. I prefer something like this: `$textarea =~ s,\r\n?|\n\r?,<br />\n,g;` I like to keep a newline after the BR tag, to make the HTML easier to read.
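To make the difference concrete, here is a small sketch that runs both substitutions on the same CRLF-terminated input. The sub names `br_char_class` and `br_alternation` and the sample string are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

# Character-class version: every \r and every \n is replaced separately.
sub br_char_class {
    my ($text) = @_;
    $text =~ s|[\r\n]|<br />|g;
    return $text;
}

# Alternation version: a CR/LF pair is consumed as a single line ending.
sub br_alternation {
    my ($text) = @_;
    $text =~ s,\r\n?|\n\r?,<br />\n,g;
    return $text;
}

my $crlf = "one\r\ntwo";
print br_char_class($crlf),  "\n";   # two tags appear between "one" and "two"
print br_alternation($crlf), "\n";   # one tag, followed by a newline
```

With bare \n or bare \r input the two versions produce the same number of tags; they only diverge when both characters appear together.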
Re: Re: Simple Question for you guys. by Buckaroo Buddha (Scribe) on May 18, 2001 at 21:16 UTC
Why are you using `s|[\r\n]|<br />|g;` instead of `s/[\r\n]/\<br \>/g;`? Are the pipes somehow more efficient in this case, or just more readable? The other thing I'm wondering about is the square brackets. Does that mean `/[asdf\.]/` would search for each of those characters (a, s, d, f and period) versus the string 'asdf.'? I know they're silly questions, but I'm trying to get back up to speed with reading Perl after about 8 months in Visual Basic for Applications. There were so many REALLY nice things about working in VBA; for example, the editor will automatically display a list of the sub-objects and methods of the object that you're working with. But then again, the number of times I've struggled with the long way around a hash table or an array makes me really glad to be back in PerlScript (with a bit of Win32::OLE). Anyway, I'm rambling :)
Re: Re: Re: Simple Question for you guys. by AidanLee (Chaplain) on May 18, 2001 at 22:01 UTC
Pipes are no more efficient as regex delimiters. You have it exactly right on your latter guess, though: it's a whole lot more readable than `s/[\r\n]/<br \/>/g;` You've also guessed right about the brackets. They specify a set of characters, any one of which may match at that position, regardless of the order in which they are listed.
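The distinction between a bracketed character class and a literal string can be sketched as follows. The helper names and test words are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only.
# True if the string contains ANY ONE of: a, s, d, f or a period.
sub has_any_of_class { my ($s) = @_; return $s =~ /[asdf.]/ ? 1 : 0; }
# True only if the string contains the exact run 'asdf.' (dot escaped to be literal).
sub has_literal_run  { my ($s) = @_; return $s =~ /asdf\./  ? 1 : 0; }

for my $word ('fig', 'xyz', 'asdf.') {
    printf "%-6s class=%d literal=%d\n",
        $word, has_any_of_class($word), has_literal_run($word);
}
```

Inside the brackets the period is already literal, so the backslash in `[asdf\.]` is harmless but not required.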
Re: Re: Re: Simple Question for you guys. by voyager (Friar) on May 18, 2001 at 21:57 UTC
You can choose your own delimiters. Any time there are forward slashes or backslashes in the regexp, it's better to use a different delimiter. "Leaning toothpick syndrome" is the name for the problem you're avoiding.
Re: Re: Simple Question for you guys. by AidanLee (Chaplain) on May 18, 2001 at 20:40 UTC
A good suggestion for getting the job done right (which I'm always a fan of), but it doesn't hurt to note that the browser will display things okay without converting both, as long as a `<br />` tag is there.
Re: Re: Re: Simple Question for you guys. by voyager (Friar) on May 18, 2001 at 21:00 UTC
True. What caused me pain was wanting to convert two newlines to two BR tags, but leave single newlines alone. Trying to match on `\n\n` was not working because what was there was really `\n\r\n\r`.
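One possible way around that problem, sketched under the assumption that the submitted text uses CRLF pairs: normalise every line ending to a single LF first, then match doubled LFs. This is not necessarily what the poster ended up doing, and the sub name `paragraph_breaks_to_br` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: turn blank-line paragraph breaks into two <br /> tags
# while leaving single line breaks alone.
sub paragraph_breaks_to_br {
    my ($text) = @_;
    # Step 1: collapse CRLF, lone CR and lone LF to a plain LF (octal 012).
    $text =~ s/\015\012|\015|\012/\012/g;
    # Step 2: only a doubled LF becomes a pair of tags.
    $text =~ s{\012\012}{<br /><br />}g;
    return $text;
}

print paragraph_breaks_to_br("para one\015\012\015\012para two\015\012same para"), "\n";
```

Normalising first means the second pattern never has to care which line-ending style the client sent.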
Re: converting carriage returns to br tags (was: Simple Question for you guys) by tachyon (Chancellor) on May 20, 2001 at 06:29 UTC
A few comments on all these comments! First, this is really all you need for most circumstances:

    $textarea =~ s/\n/<BR>\n/g;

We substitute <BR>\n so that we get the effect:

    Was:
    blah
    blah
    Now:
    blah<BR>
    blah

If we sub just <BR> instead of <BR>\n we will get

    blah<BR>blah

If you prefer to get

    blah
    <BR>
    blah

then use \n<BR>\n as the replacement.

Depending on platform, the \n sequence is converted by perl to:

    Unix: octal \012      hex 0xA      dec 10     LF    may be \n
    DOS:  octal \015\012  hex 0xD 0xA  dec 13 10  CRLF  may be \r\n
    Mac:  octal \015      hex 0xD      dec 13     CR    may be \r

Although perl works for you, trying to allow you to just use \n as your newline delimiter and let it sort out the platform-dependent details, many common internet protocols specify the \015\012 sequence, and unfortunately the values of Perl's \n and \r are not reliable since they can and do vary from system to system. I suspect that $textarea is named from its HTML source, so you will probably want to use a truly portable solution like this:

    $textarea =~ s/\015\012|\015|\012/<br>\n/g;

If you prefer hex to octal :-)

    $textarea =~ s/\xD\xA|\xD|\xA/<br>\n/g;

If you are confused by the \012 or \xA notation, all this is saying to perl is: what I want you to match is the ASCII char decimal 10 == octal 12 == hex A == binary 1010.

In expanded, commented /x form:

    $textarea =~ s/       # substitute
        \015\012          # a CRLF sequence (DOS, MIME...)
        |                 # or
        \015              # a lone CR (mac)
        |                 # or
        \012              # a lone LF (unix)
        /<br>\n           # with literal '<br>' plus newline
        /xg;              # /x allow comments, /g do globally

There are flaws, both major and minor, with all solutions posted:

    s|\n|<br />|g
    # you don't need the unnecessary space or the / before the >
    # since \ (not /) is the escape char, this will sub '<br />' for \n!
    # rather than escape the >, making it a literal, which it is anyway.

    tr/\n/<BR>/s
    # you still need /g, not /s, even allowing for using s instead of tr

    s/\n/\<br\>/g;
    # the escapes are harmless but both unnecessary. This is the first
    # suggestion that will actually work (most of the time)

    s|[\r\n]|<br />|g;
    # this is wrong. Leaving aside the problems with using \r and \n
    # and the fact it will sub '<br />', the problem is this:
    # if we have \r\n we will get <br><br> (assuming we fix the sub);
    # with \r or \n we will get <br>, so we get a different and platform
    # dependent result. This is partially fixed by changing to:

    s|[\r\n]+|<br>|g;
    # however if we have \r\n\r\n or \n\n or \r\r we get just one <br>
    # replacing a series of line breaks, probably not what we want

    s,\r\n?|\n\r?,<br />\n,g;
    # this suffers from \r \n problems, matches \n\r which is not a
    # desired result, and subs in '<br />' again -> not an HTML tag

Phew, I feel better now I've got that off my chest.

Finally, for those who are not familiar with the concept: you may use almost any non-alphanumeric char as a regex delimiter. Thus we could use paired brackets:

    $textarea =~ s(\015\012|\015|\012)(<br>\n)g;

Unpaired brackets:

    $textarea =~ s{\015\012|\015|\012}<<br>\n>g;

Brackets, then a pair of something else, even # chars:

    $textarea =~ s[\015\012|\015|\012]#<br>\n#g;

With brackets we can split onto two lines:

    $textarea =~ s (\015\012|\015|\012)
                   [<br>\n]g;

If using non-brackets we can even use ; if you are into obfuscation:

    $textarea =~ s;\015\012|\015|\012;<br>\n;g;

If our delimiter is included as a literal in the pattern, we need to backslash it (escape it) to make it take on a literal meaning and match itself within the pattern, rather than be taken by perl as one of the regex delimiters.

In a regex only these 12 characters need escaping, although when in doubt it generally does no harm to escape a character:

    \ | ( ) [ { ^ $ * + ? .

All these chars have special meaning in a regex, and if you do much with regexes you will soon get to know them by heart.

Cheers

tachyon
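The octal-escape substitution from the post above can be exercised on input that mixes all three line-ending styles. This is only a sketch: the sub name `line_endings_to_br` and the sample string are assumptions, and the `\n` in the replacement is whatever the local platform uses.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the portable substitution.
sub line_endings_to_br {
    my ($textarea) = @_;
    # CRLF is listed first so a CR/LF pair is consumed as one line ending
    # before the lone-CR and lone-LF alternatives are tried.
    $textarea =~ s/\015\012|\015|\012/<br>\n/g;
    return $textarea;
}

# LF (Unix), CRLF (DOS), CR (old Mac), then a blank line made of two CRLFs.
my $mixed = "unix\012dos\015\012mac\015blank\015\012\015\012end";
print line_endings_to_br($mixed), "\n";
```

Each line ending yields exactly one tag regardless of style, and the doubled CRLF still yields two tags, which the `[\r\n]+` variant would have collapsed into one.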
Re: Re: converting carriage returns to br tags (was: Simple Question for you guys) by chipmunk (Parson) on May 20, 2001 at 20:12 UTC
FYI, the use of <br /> is intentional; it is an XHTML tag. XHTML is a rewrite of HTML conforming to the rules of XML. By the way, I don't think that there would be any problems with using this substitution in practice: `s/\r\n?|\n\r?/<br>/g;` It is true that the match of \012\015 may be avoided with the substitution you suggested: `s/\015\012|\015|\012/<br>/g;` But with the former, you don't have to remember whether \015 or \012 is supposed to come first. :)
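The only input on which the two substitutions differ is an LF followed by a CR, which the sketch below feeds to both. It assumes a platform where `\n` is \012 and `\r` is \015; the sub names `br_symbolic` and `br_octal` are hypothetical.

```perl
use strict;
use warnings;

# Symbolic-escape version: also treats LF followed by CR as one line ending.
sub br_symbolic {
    my ($text) = @_;
    $text =~ s/\r\n?|\n\r?/<br>/g;
    return $text;
}

# Octal version: only CRLF is paired; LF then CR counts as two line endings.
sub br_octal {
    my ($text) = @_;
    $text =~ s/\015\012|\015|\012/<br>/g;
    return $text;
}

my $lf_cr = "a\012\015b";
print br_symbolic($lf_cr), "\n";
print br_octal($lf_cr),    "\n";
```

On ordinary CRLF, lone CR, or lone LF input the two subs return identical strings.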