http://www.perlmonks.org?node_id=788884


in reply to Re^3: remove line feed at the end
in thread remove line feed at the end

<ROOT>
<FILE>sourcetag1</FILE>
<NUMBER>00000 11111</NUMBER>
<SOURCE>source1</SOURCE>
<AUTHOR>author1 staff1</AUTHOR>
<HEADLINE>DISPOSABLE DECOR: THE CUTTING EDGE DULLS FASTTYLE AT A SPEED USUALLY ASSOCIATED WITH WARDROBE ITEMS</HEADLINE>
</ROOT>

Sorry, it is not only for lines that have \ at the end. The newline should be removed from every line that does not match {sometext} at the start. I merged my code with the sample you provided, but it works only for the {HEADLINE} line. The merged code is below:
#!/usr/bin/perl
use strict;
use warnings;

my $output = '';
my $tag;
my $fh;

LINE:
while ( my $line = <DATA>) {
    chomp $line;

    # if line ended with a '\', remove '\' and save line for later output
    if ( $line =~ s/\\$// || $line =~ s/\s$//) {
        # clean indentation
        $line =~ s/^\s+/ /;
        # save line
        $output .= $line;
        next LINE;
    }

    # we have previous line(s) to consider?
    if ( length $output ) {
        # clean indentation
        $line =~ s/^\s+/ /;
        $line = $output . $line;
        $output = '';
    }

    if($line =~ /^{(.*)}/) {
        $tag = $1;
    }else {
        if($tag eq 'FILE') {
            if(defined($fh)){
                print $fh "</ROOT>";
                close($fh);
            }
            my $filename = $line;
            open($fh, '>', "$filename.xml") or die "$filename: $!";
            print $fh '<?xml version="1.0"?>',"\n";
            print $fh "<ROOT>\n";
            print $fh "<FILE>$filename</FILE>\n";
        }
        elsif(defined($fh)) {
            if($line ne ''){
                $line =~ s/\\//gi;
                print "<$tag>$line</$tag>\n";
            }
        }
    }
    print $line, $/;
}
__DATA__
{FILE}
sourcetag1
{NUMBER}
00000
11111
{SOURCE}
source1
{KEYWORD}
{AUTHOR}
author1
staff1
{HEADLINE}
DISPOSABLE DECOR: THE CUTTING EDGE DULLS FAST\
TYLE AT A SPEED USUALLY ASSOCIATED WITH WARDROBE ITEMS
{FILE}
sourcetag2
{NUMBER}
00002
{SOURCE}
sourcenam2
{KEYWORD}
{AUTHOR}
author2
staff2

Re^5: remove line feed at the end
by ig (Vicar) on Aug 15, 2009 at 16:27 UTC
    use strict;
    use warnings;

    my ($output, $tag, $fh);

    while (<DATA>) {
        chomp;
        s/^\s+//;
        s/\s+$//;
        if(/^{(.*)}$/) {    # {TAG} line
            $fh = output($output, $tag, $fh);
            $output = "";
            $tag = $1;
        } else {            # not a {TAG} line
            next unless($tag);
            next if(/^\s*$/);
            s/\\//g;
            $output .= ($output) ? " $_" : "<$tag>$_";
        }
    }
    $fh = output($output, $tag, $fh);
    if($fh) {
        print $fh "</ROOT>\n";
        close($fh);
    }

    exit(0);

    sub output {
        my ($output, $tag, $fh) = @_;
        if($output) {
            if($output =~ m/<FILE>(.*)/) {
                if($fh) {
                    print $fh "</ROOT>\n";
                    close($fh);
                }
                open($fh, '>', "$1.xml") or die "$1.xml: $!";
                print $fh "<?xml version=\"1.0\"?>\n<ROOT>\n";
            }
            print $fh "$output</$tag>\n";
        }
        return($fh);
    }

    __DATA__
    ^B^B^B^B^B^B
    {FILE}
    sourcetag1
    {NUMBER}
    00000
    {SOURCE}
    source1
    {KEYWORD}
    {AUTHOR}
    author1
    staff1
    {HEADLINE}
    DISPOSABLE DECOR: THE CUTTING EDGE DULLS FAST\
    STYLE AT A SPEED USUALLY ASSOCIATED WITH WARDROBE ITEMS.
    {FILE}
    sourcetag2
    {NUMBER}
    00002
    {SOURCE}
    sourcenam2
    {KEYWORD}
    {AUTHOR}
    author2
    staff2
Re^5: remove line feed at the end
by Anonymous Monk on Aug 15, 2009 at 14:14 UTC
    It seems your code concatenates lines only for the last entry before the {FILE} tag. Figure out a way to solve that.
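
One possible way to approach that, as a minimal sketch rather than a definitive fix: collect every non-tag line under the most recent {TAG} and emit the field only when the next {TAG} (or end of input) arrives. The helper name collect_fields is hypothetical. The joining rule is an assumption inferred from the sample output in the question: a line ending in \ is joined to the next line with no space, and any other line break becomes a single space. Tags with no value, such as {KEYWORD}, are skipped.

```perl
use strict;
use warnings;

# Hypothetical helper: reads {TAG}-delimited text from a filehandle and
# returns a list of [tag, value] pairs with multi-line values joined.
sub collect_fields {
    my ($in) = @_;
    my ( @fields, $tag, $glue );
    my $value = '';

    while ( my $line = <$in> ) {
        chomp $line;
        $line =~ s/^\s+//;
        $line =~ s/\s+$//;

        if ( $line =~ /^\{(.*)\}$/ ) {
            # A new {TAG} line closes the previous field, if it has a value.
            push @fields, [ $tag, $value ] if defined($tag) && length($value);
            ( $tag, $value, $glue ) = ( $1, '', '' );
            next;
        }

        next unless defined $tag;     # ignore anything before the first tag
        next unless length($line);    # ignore blank lines

        # Assumption: a trailing backslash means "join the next line
        # with no space"; otherwise lines are joined with one space.
        my $continued = ( $line =~ s/\\$// );
        $value .= ( length($value) ? $glue : '' ) . $line;
        $glue = $continued ? '' : ' ';
    }

    # Flush the last field at end of input.
    push @fields, [ $tag, $value ] if defined($tag) && length($value);
    return @fields;
}

# Usage: read from an in-memory string instead of a real file.
my $sample = "{NUMBER}\n00000\n11111\n{HEADLINE}\nDULLS FAST\\\nTYLE AT A SPEED\n";
open my $in, '<', \$sample or die "cannot open in-memory handle: $!";
print "<$_->[0]>$_->[1]</$_->[0]>\n" for collect_fields($in);
close $in;
```

Because the joining is kept separate from the file handling, the per-{FILE} open/close logic from the scripts above can be layered on top of the returned pairs without touching the line-merging rule.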