http://www.perlmonks.org?node_id=1039549

freekngeek has asked for the wisdom of the Perl Monks concerning the following question:

Hi, All

I have a text file which looks like this:

*|DSPF 1.0
*
*|DESIGN "TOP_1b_1a"
*|DATE "Tue Jun 18 11:42:12 2013"
*|VENDOR "Mentor Graphics Corp."
*|PROGRAM "Calibre xRC v2012.2_36.25"
*|DIVIDER /
*|DELIMITER :
* Nominal Temperature: 25C
* Circuit Temperature: 25C
*
.subckt TOP_1b_1a PC_M1_M2_50_500_50_500_50_500_50_500_50_bot
+ PC_M1_M2_50_100_50_100_50_100_50_100_50_bot
+ PC_M1_M2_500_250_500_250_500_250_500_250_500_ctr
* PC_M1_M2_500_250_500_250_500_250_500_250_500_low
+ PC_M1_M2_75_500_75_500_75_500_75_500_75_bot
+ PC_M1_M2_75_100_75_100_75_100_75_100_75_bot
* PC_M1_M2_150_50_150_50_150_50_150_50_150_high
+ PC_M1_M2_100_500_100_500_100_500_100_500_100_bot
+ PC_M1_M2_100_100_100_100_100_100_100_100_100_bot
rM6_M7_JA_50_100_50_100_50_100_50_100_50_ctr/0 R435:pos
+ M6_M7_JA_50_100_50_100_50_100_50_100_50_ctr 238.22
rM6_M7_JA_50_100_50_100_50_100_50_100_50_ctr/1 R434:neg
+ M6_M7_JA_50_100_50_100_50_100_50_100_50_ctr 238.22
*
*|NET M6_M7_JA_50_100_50_100_50_100_50_100_50_high 0.0
*|I (R435:neg R435 neg B 0.0 10 3151)
R0 R0:pos R0:neg lvsres R=0.001 $X=9.975 $Y=10.999
R1 R1:pos R1:neg lvsres R=0.001 $X=9.975 $Y=122
R2 R2:pos R2:neg lvsres R=0.001 $X=9.975 $Y=243.999
R3 R3:pos R3:neg lvsres R=0.001 $X=9.975 $Y=355
R4 R4:pos R4:neg lvsres R=0.001 $X=159.963 $Y=10.999
R5 R5:pos R5:neg lvsres R=0.001 $X=159.963 $Y=122

What I want is to edit my text file using a Perl script. I want all the lines to be in one line which start with '+' sign, and leave the rest of the lines as they are. So, whenever it encounters a '+' sign, the script should print all the lines in one single line and then removes the '+' sign.

I would appreciate any kind of help or ideas for building this script. Thank you.

Re: Editing text file
by kcott (Archbishop) on Jun 18, 2013 at 13:03 UTC

    G'day freekngeek,

    I think this is probably the algorithm you're after but see notes at the end.

    #!/usr/bin/env perl

    use strict;
    use warnings;

    my $in_plus_lines = 0;

    while (<DATA>) {
        chomp;
        if (/^\+(.*)$/) {
            print "\n" unless $in_plus_lines;
            $in_plus_lines = 1;
            print $1;
        }
        else {
            $in_plus_lines = 0;
            print "\n" if $. > 1;
            print "$_";
        }
    }

    print "\n";

    __DATA__
    * asterisk_line_1
    *
    .dot_line
    + plus_line_1
    + plus_line_2
    * asterisk_line_3
    + plus_line_3
    + plus_line_4
    * asterisk_line_4
    + plus_line_5
    + plus_line_6
    plain_text_line_1
    + plus_line_7
    plain_text_line_2
    + plus_line_8
    *
    *asterisk_line_6
    *asterisk_line_7
    plain_text_line_3

    Output:

    $ pm_cat_plus_lines.pl
    * asterisk_line_1
    *
    .dot_line
     plus_line_1 plus_line_2
    * asterisk_line_3
     plus_line_3 plus_line_4
    * asterisk_line_4
     plus_line_5 plus_line_6
    plain_text_line_1
     plus_line_7
    plain_text_line_2
     plus_line_8
    *
    *asterisk_line_6
    *asterisk_line_7
    plain_text_line_3

    Notes:

    • Your sample input data could be substantially reduced: quite a few lines at the beginning and end could be removed; the length of lines could also be reduced as they can become hard to read when concatenated (due to wrapping).
    • Expected output would have been useful. I don't believe your prose description of the wanted output is correct; e.g. "I want all the lines to be in one line which start with '+' sign" doesn't look right. I think you want groups of lines starting with '+', not all lines.
    • The differences between consecutive lines of data are hard to discern, e.g. it takes a fair amount of effort to tell "PC_M1_M2_75_500_75_500_75_500_75_500_75_bot" and "PC_M1_M2_75_100_75_100_75_100_75_100_75_bot" apart.
    • Following "and then removes the '+' sign" literally, you end up with a leading space on each piece when the "+" lines are concatenated. Is that what you wanted? (Expected output would have made that obvious.)

    As you can see, I've used completely different input; hopefully, the output is easier to read. Given the points above, my algorithm may be incorrect: there may be enough for you to correct it for your needs; if not, please address those points before asking follow-up questions.

    -- Ken

Re: Editing text file
by hdb (Monsignor) on Jun 18, 2013 at 13:14 UTC

    or like this:

    my $data;
    {
        $/=undef;
        $data = <DATA>;
    }
    $data =~ s/\n\+//g;
    print $data;
      One small correction to make it even better:
      local $/=undef;

      CountZero

      "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      My blog: Imperial Deltronics
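
For anyone following along, here is a minimal sketch of hdb's snippet with CountZero's correction applied. The sub name `join_continuations` and the sample text are made up for illustration and do not come from the thread. The point of `local` is that the change to `$/` is undone when the enclosing block exits, so later reads elsewhere in the program still work line by line.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): slurp a filehandle and
# fold every line that starts with '+' onto the line before it.
sub join_continuations {
    my ($fh) = @_;
    my $data = do {
        local $/ = undef;    # slurp mode, restored when this block exits
        <$fh>;
    };
    $data = '' unless defined $data;    # guard against empty input
    $data =~ s/\n\+//g;      # drop the newline and the '+', as in hdb's regex
    return $data;
}

# Illustrative input only; an in-memory filehandle stands in for a real file.
my $sample = "line_a\n+ cont_1\n+ cont_2\nline_b\n";
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
print join_continuations($fh);
close $fh;
```

Note that this variant attaches each '+' line to the line before it, which is not the same output shape as kcott's version.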
Re: Editing text file
by hbm (Hermit) on Jun 18, 2013 at 12:00 UTC

    Open the file and a new file. Read the original file line by line. If the line starts with '+', remove the plus and the newline and accumulate it. If the line doesn't start with '+', print any accumulated lines and the current line to the new file.
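
The steps described above could be sketched roughly as follows. This is only one reading of them: it assumes each run of '+' lines is written out as a single line of its own, ahead of the next non-'+' line, and it flushes any run left over at end of file (a case the description does not mention). The sub name `fold_plus_lines` and the two command-line file names are hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Sketch of the accumulate-then-flush idea. Assumption: a run of '+'
# lines becomes one output line, printed before the next non-'+' line.
sub fold_plus_lines {
    my ($in, $out) = @_;
    my $pending;    # undef while no run of '+' lines is open

    while (my $line = <$in>) {
        chomp $line;
        if ($line =~ /^\+(.*)$/) {
            # Keep the text after the '+'; the newline is already gone.
            $pending = (defined $pending ? $pending : '') . $1;
            next;
        }
        if (defined $pending) {        # a run just ended: flush it
            print {$out} "$pending\n";
            undef $pending;
        }
        print {$out} "$line\n";
    }
    print {$out} "$pending\n" if defined $pending;    # run at end of file
    return;
}

# Hypothetical usage: script.pl original.txt new.txt
if (@ARGV == 2) {
    my ($src, $dst) = @ARGV;
    open my $in,  '<', $src or die "Cannot read '$src': $!";
    open my $out, '>', $dst or die "Cannot write '$dst': $!";
    fold_plus_lines($in, $out);
    close $out or die "Cannot close '$dst': $!";
    close $in;
}
```

Because the text after the '+' is kept verbatim, each joined piece still carries its leading space.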

      No need to accumulate. You can print every line, having possibly removed the newline and the + if needed.
      لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
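
A minimal sketch of the no-accumulation idea from the reply above, under the assumption that a '+' line should be attached to the line printed just before it. Nothing is buffered: every line is printed without its newline, and the newline is emitted only once the next line turns out not to start with '+'. The sub name `stream_join` is made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Sketch only: print each line as it arrives, delaying the newline
# until we know the following line is not a '+' continuation.
sub stream_join {
    my ($in, $out) = @_;
    my $printed_any = 0;

    while (my $line = <$in>) {
        chomp $line;
        if ($line =~ s/^\+//) {
            print {$out} $line;                  # extends the line still open
        }
        else {
            print {$out} "\n" if $printed_any;   # close the previous line
            print {$out} $line;
        }
        $printed_any = 1;
    }
    print {$out} "\n" if $printed_any;           # close the final line
    return;
}

# Hypothetical usage: script.pl original.txt > new.txt
if (@ARGV) {
    open my $in, '<', $ARGV[0] or die "Cannot read '$ARGV[0]': $!";
    stream_join($in, \*STDOUT);
    close $in;
}
```

The trade-off against accumulating is that no state beyond a single flag is kept, at the cost of the output line being left "open" between iterations.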