http://www.perlmonks.org?node_id=788868


in reply to Re: remove line feed at the end
in thread remove line feed at the end

#!/usr/bin/perl
use strict;
use warnings;

my $fh;
my $tag;
my %hash;
my @metadata_tags;
my $flag;

while (<DATA>) {
    chomp $_;
    if (/^{(.*)}/) {
        $tag = $1;
        push(@metadata_tags, $tag);
    }
    else {
        if ($tag eq 'FILE') {
            if (defined($fh)) {
                print $fh "</ROOT>";
                close($fh);
            }
            my $filename = $_;
            open($fh, '>', "$filename.xml") or die "$filename: $!";
            print $fh '<?xml version="1.0"?>', "\n";
            print $fh "<ROOT>\n";
            print $fh "<FILE>$filename</FILE>\n";
        }
        elsif (defined($fh)) {
            if ($_ ne '') {
                if ($_ !~ m/^{/) {
                    $_ =~ s/\\//gi;
                    chomp;
                    print "<$tag>$_</$tag>\n";
                }
            }
        }
    }
}

__DATA__
{FILE}
sourcetag1
{NUMBER}
00000
11111
{SOURCE}
source1
{KEYWORD}
{AUTHOR}
author1 staff1
{HEADLINE}
DISPOSABLE DECOR: THE CUTTING EDGE DULLS FAST\
STYLE AT A SPEED USUALLY ASSOCIATED WITH WARDROBE ITEMS.
{FILE}
sourcetag2
{NUMBER}
00002
{SOURCE}
sourcenam2
{KEYWORD}
{AUTHOR}
author2 staff2
But this produces output like this:
<NUMBER>00000</NUMBER>
<NUMBER>11111</NUMBER>
<SOURCE>source1</SOURCE>
<AUTHOR>author1 staff1</AUTHOR>
<HEADLINE>DISPOSABLE DECOR: THE CUTTING EDGE DULLS FAST</HEADLINE>
<HEADLINE>STYLE AT A SPEED USUALLY ASSOCIATED WITH WARDROBE ITEMS.</HEADLINE>
<NUMBER>00002</NUMBER>
<SOURCE>sourcenam2</SOURCE>
<AUTHOR>author2 staff2</AUTHOR>
instead of this:
<NUMBER>00000</NUMBER>
<NUMBER>11111</NUMBER>
<SOURCE>source1</SOURCE>
<AUTHOR>author1 staff1</AUTHOR>
<HEADLINE>DISPOSABLE DECOR: THE CUTTING EDGE DULLS FAST STYLE AT A SPEED USUALLY ASSOCIATED WITH WARDROBE ITEMS.</HEADLINE>
<NUMBER>00002</NUMBER>
<SOURCE>sourcenam2</SOURCE>
<AUTHOR>author2 staff2</AUTHOR>
Can anyone please help me?

Re^3: remove line feed at the end
by linuxer (Curate) on Aug 15, 2009 at 12:51 UTC

    That's good. Looks like it works as designed.

    If you want to do special things when \ appears at the end of a line, then you should recognize it and do special things.

    You only remove it from the line and then carry on with the standard action...

    Consider something like this:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $output = '';

    LINE:
    while ( my $line = <DATA> ) {
        chomp $line;

        # if line ended with a '\', remove '\' and save line for later output
        if ( $line =~ s/\\$// ) {
            # clean indentation
            $line =~ s/^\s+/ /;
            # save line
            $output .= $line;
            next LINE;
        }
        # we have previous line(s) to consider?
        elsif ( length $output ) {
            # clean indentation
            $line =~ s/^\s+/ /;
            $line = $output . $line;
            $output = '';
        }

        print $line, $/;
    }

    __DATA__
    {FILE}
    sourcetag1
    {NUMBER}
    00000
    11111
    {SOURCE}
    source1
    {KEYWORD}
    {AUTHOR}
    author1 staff1
    {HEADLINE}
    DISPOSABLE DECOR: THE CUTTING EDGE DULLS FAST\
    STYLE AT A SPEED USUALLY ASSOCIATED WITH WARDROBE ITEMS.
    {FILE}
    sourcetag2
    {NUMBER}
    00002
    {SOURCE}
    sourcenam2
    {KEYWORD}
    {AUTHOR}
    author2 staff2
    a\
    b\
    c

    It's your exercise to combine this with your code ;o)

    BTW: why do you use the /i modifier, when replacing spaces, newlines or the backslash?
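To illustrate the point about /i: the modifier only changes how letters are matched, so it makes no difference when the pattern is a backslash, a space or a newline. A minimal sketch, assuming a hypothetical helper named strip_backslashes that is not part of the original code:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): remove every backslash
# from a string, either with or without the /i modifier.
sub strip_backslashes {
    my ($text, $use_i) = @_;
    if ($use_i) {
        $text =~ s/\\//gi;    # /i has nothing to fold here: '\' has no case
    }
    else {
        $text =~ s/\\//g;
    }
    return $text;
}

my $sample = 'DULLS FAST\\';    # single quotes: the string ends in one backslash

print strip_backslashes($sample, 1), "\n";    # prints "DULLS FAST"
print strip_backslashes($sample, 0), "\n";    # prints "DULLS FAST"
```

Both calls give the same result, so the /i can simply be dropped.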

      <ROOT>
      <FILE>sourcetag1</FILE>
      <NUMBER>00000 11111</NUMBER>
      <SOURCE>source1</SOURCE>
      <AUTHOR>author1 staff1</AUTHOR>
      <HEADLINE>DISPOSABLE DECOR: THE CUTTING EDGE DULLS FASTTYLE AT A SPEED USUALLY ASSOCIATED WITH WARDROBE ITEMS</HEADLINE>
      </ROOT>

      Sorry... It is not only for the lines that have \ at the end. It should remove the newline character from every line that does not match {sometext} at the start. I merged my code with the sample you provided, but it works only for the {HEADLINE} line. The merged code is below:
      #!/usr/bin/perl
      use strict;
      use warnings;

      my $output = '';
      my $tag;
      my $fh;

      LINE:
      while ( my $line = <DATA> ) {
          chomp $line;

          # if line ended with a '\', remove '\' and save line for later output
          if ( $line =~ s/\\$// || $line =~ s/\s$// ) {
              # clean indentation
              $line =~ s/^\s+/ /;
              # save line
              $output .= $line;
              next LINE;
          }

          # we have previous line(s) to consider?
          if ( length $output ) {
              # clean indentation
              $line =~ s/^\s+/ /;
              $line = $output . $line;
              $output = '';
          }

          if ($line =~ /^{(.*)}/) {
              $tag = $1;
          }
          else {
              if ($tag eq 'FILE') {
                  if (defined($fh)) {
                      print $fh "</ROOT>";
                      close($fh);
                  }
                  my $filename = $line;
                  open($fh, '>', "$filename.xml") or die "$filename: $!";
                  print $fh '<?xml version="1.0"?>', "\n";
                  print $fh "<ROOT>\n";
                  print $fh "<FILE>$filename</FILE>\n";
              }
              elsif (defined($fh)) {
                  if ($line ne '') {
                      $line =~ s/\\//gi;
                      print "<$tag>$line</$tag>\n";
                  }
              }
          }

          print $line, $/;
      }

      __DATA__
      {FILE}
      sourcetag1
      {NUMBER}
      00000
      11111
      {SOURCE}
      source1
      {KEYWORD}
      {AUTHOR}
      author1 staff1
      {HEADLINE}
      DISPOSABLE DECOR: THE CUTTING EDGE DULLS FAST\
      TYLE AT A SPEED USUALLY ASSOCIATED WITH WARDROBE ITEMS
      {FILE}
      sourcetag2
      {NUMBER}
      00002
      {SOURCE}
      sourcenam2
      {KEYWORD}
      {AUTHOR}
      author2 staff2
        use strict;
        use warnings;

        my ($output, $tag, $fh);

        while (<DATA>) {
            chomp;
            s/^\s+//;
            s/\s+$//;
            if (/^{(.*)}$/) {
                # {TAG} line
                $fh = output($output, $tag, $fh);
                $output = "";
                $tag = $1;
            }
            else {
                # not a {TAG} line
                next unless($tag);
                next if(/^\s*$/);
                s/\\//g;
                $output .= ($output) ? " $_" : "<$tag>$_";
            }
        }

        $fh = output($output, $tag, $fh);
        if ($fh) {
            print $fh "</ROOT>\n";
            close($fh);
        }
        exit(0);

        sub output {
            my ($output, $tag, $fh) = @_;
            if ($output) {
                if ($output =~ m/<FILE>(.*)/) {
                    if ($fh) {
                        print $fh "</ROOT>\n";
                        close($fh);
                    }
                    open($fh, '>', "$1.xml") or die "$1.xml: $!";
                    print $fh "<?xml version=\"1.0\"?>\n<ROOT>\n";
                }
                print $fh "$output</$tag>\n";
            }
            return($fh);
        }

        __DATA__
        ^B^B^B^B^B^B
        {FILE}
        sourcetag1
        {NUMBER}
        00000
        {SOURCE}
        source1
        {KEYWORD}
        {AUTHOR}
        author1 staff1
        {HEADLINE}
        DISPOSABLE DECOR: THE CUTTING EDGE DULLS FAST\
        STYLE AT A SPEED USUALLY ASSOCIATED WITH WARDROBE ITEMS.
        {FILE}
        sourcetag2
        {NUMBER}
        00002
        {SOURCE}
        sourcenam2
        {KEYWORD}
        {AUTHOR}
        author2 staff2
        Your code seems to concatenate lines only for the last statement before the {FILE} tag. Figure out a way to solve that.
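One possible way to make the joining apply to every tag, not just one, is to separate the grouping from the printing. The sketch below is only an illustration: group_by_tag is a hypothetical helper, the sample lines are a shortened version of the thread's data, and it follows the later requirement that every non-tag line be appended to the most recent {TAG}. File handling is left out on purpose.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): turn raw input lines into a
# list of [tag, text] pairs, joining every non-tag line under the most
# recent {TAG} line with a single space.
sub group_by_tag {
    my @lines = @_;
    my @records;
    for my $line (@lines) {
        $line =~ s/^\s+|\s+$//g;           # trim both ends, newline included
        if ( $line =~ /^\{(.*)\}$/ ) {
            push @records, [ $1, '' ];     # a {TAG} line starts a new record
            next;
        }
        next unless @records;              # skip anything before the first tag
        next if $line eq '';               # skip empty lines
        $line =~ s/\\$//;                  # drop a trailing continuation backslash
        my $rec = $records[-1];
        $rec->[1] .= ( length $rec->[1] ? ' ' : '' ) . $line;
    }
    return @records;
}

# Shortened sample input, assumed to have one value per line.
my @records = group_by_tag(
    "{NUMBER}\n",   "00000\n",        "11111\n",
    "{HEADLINE}\n", "DULLS FAST\\\n", "STYLE AT A SPEED\n",
);

# Print one element per tag, skipping tags that collected no text.
printf "<%s>%s</%s>\n", $_->[0], $_->[1], $_->[0]
    for grep { length $_->[1] } @records;
```

With the sample input above this prints `<NUMBER>00000 11111</NUMBER>` followed by `<HEADLINE>DULLS FAST STYLE AT A SPEED</HEADLINE>`. Opening a new .xml file whenever the tag is FILE could then be done in the printing loop.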