PerlMonks  

join two lines in a file?

by happyrainb (Novice)
on Jan 17, 2008 at 00:54 UTC ( [id://662781]=perlquestion )

happyrainb has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a text file with a Book Title, Description and Book Number on each line, separated by ";". But some Descriptions are broken across two lines, with several spaces at the beginning of the second line. How can I join the two lines of the Description into a single line? Example:
The Road Ahead1;Completely Revised and Up-to-Date; 002564418
road ahead2; America’a creeping 
    revolution;00345678
The road ahead3;[Address made before the Regional Foreign Policy 
    Conference; 004561963

My Perl code reads each line into an array, but how can I join only the two lines of the Description?

============================================

open IN,"In.txt" or die "Can't read \n";
my @array=<IN>;
chomp;
for my $array (@array){
    $array=~s/^\s*$//;
    my @field=split(/";"/, $array);
    for my $field (@field){
        .....
    }
}
close IN;

Replies are listed 'Best First'.
Re: join two lines in a file?
by olus (Curate) on Jan 17, 2008 at 01:32 UTC
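    my @array = <DATA>;
    chomp;    # no-op here: this chomps $_, not @array
    for my $array (@array){
        # a line with fewer than two ';' that starts with a word character
        # is the first half of a broken record: drop its newline
        $array =~ s/\n$// if ($array !~ /;[^;]*;/) && ($array =~ /^\w/);
        # strip the leading whitespace of continuation lines
        $array =~ s/^\s*([^\s])/$1/;
    }
    print @array;
    __DATA__
    The Road Ahead1;Completely Revised and Up-to-Date; 002564418
    road ahead2; America’a creeping 
        revolution;00345678
    The road ahead3;[Address made before the Regional Foreign Policy 
        Conference; 004561963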
    prints
    The Road Ahead1;Completely Revised and Up-to-Date; 002564418
    road ahead2; America’a creeping revolution;00345678
    The road ahead3;[Address made before the Regional Foreign Policy Conference; 004561963
      Thank you very much!
Re: join two lines in a file?
by GrandFather (Saint) on Jan 17, 2008 at 01:28 UTC
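    use warnings;
    use strict;

    my @records;

    $/ = ';';    # read up to each ';' instead of up to each newline
    while (<DATA>) {
        chomp;                     # removes the trailing ';'
        s/\s*(?<!^)\n\s*/ /mg;     # collapse each line break that is not at the start of a line, plus surrounding whitespace, into one space
        push @records, $_;
    }

    print join "\n", @records;

    __DATA__
    The Road Ahead1;Completely Revised and Up-to-Date; 002564418
    road ahead2; America’a creeping 
        revolution;00345678
    The road ahead3;[Address made before the Regional Foreign Policy 
        Conference; 004561963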

    Prints:

    The Road Ahead1
    Completely Revised and Up-to-Date
     002564418 road ahead2
     America’a creeping revolution
    00345678 The road ahead3
    [Address made before the Regional Foreign Policy Conference
     004561963

    which may provide a useful starting point.


    Perl is environmentally friendly - it saves trees
Re: join two lines in a file?
by ambrus (Abbot) on Jan 17, 2008 at 09:23 UTC

    I recommend the idiom $_ .= <>. For example, in this case this would be something like:

    use warnings;
    use strict;

    while (<DATA>) {
        # keep appending lines until the record contains two ';' separators
        while (!/(?:[^;]*+;){2}/) {
            $_ .= <DATA>;
        }
        print "read one record: \n(($_))\n";
    }

    __DATA__
    The Road Ahead1;Completely Revised and Up-to-Date; 002564418
    road ahead2; America’a creeping 
        revolution;00345678
    The road ahead3;[Address made before the Regional Foreign Policy 
        Conference; 004561963

    You may need to modify the regular expression to match whatever you recognize as a full record.

    If the continuation lines are marked by whitespace at the beginning of the next line (as in mail headers) rather than by something in the incomplete line (as in SMTP commands), then you can't use this simple idiom. In this case, I recommend a one-line buffer, e.g.

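    use warnings;
    use strict;

    my($buf, $bufq);    # one-line buffer, and a flag saying whether it is filled

    # return the next line without consuming it
    sub peekline {
        $bufq or $buf = <DATA>;
        $bufq = 1;
        $buf;
    }

    # return the next line and consume it
    sub getline {
        $bufq or $buf = <DATA>;
        $bufq = ();
        $buf;
    }

    while (defined($_ = getline)) {
        # a following line that starts with whitespace continues this record
        while (defined(peekline) and peekline =~ /^\s/) {
            $_ .= getline;
        }
        print "read one record: \n(($_))\n";
    }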
      Yes, Ambrus. I think the best way is to recognize a full record and treat whatever is in between as one record. The input file actually looks like the sample below; the first element of each record should be a file name:
      file1.xml;Description A, B
         C, D, E...;SystemNumber
      file2.txt;Description X,Y,Z;SystemNumber
      file3.xml;Description M,N,
         O,P,Q;SystemsNumber
      I think it might be easy if I first delete all the white space in each line, then read all the lines into an array, then split the array at the ".xml" or ".txt" file names and make each one the first element of a child array. Is that the right track?
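
The idea in this follow-up (a record starts at a file name, and everything up to the next file name belongs to it) can be sketched roughly as below. This is only a sketch under assumptions taken from the sample above, not a tested solution for the real file: it assumes every record starts with a name that contains no spaces and ends in .xml or .txt followed by ";". The name join_records is made up for illustration. Note that it strips only the indentation of continuation lines; deleting all white space first would also remove the spaces inside the descriptions.

```perl
use strict;
use warnings;

# Hypothetical helper: glue continuation lines onto the record they belong to.
# Assumption: a new record always starts with a space-free name ending in
# .xml or .txt followed by ';' (inferred from the sample, not a real spec).
sub join_records {
    my @lines = @_;
    my @records;
    for my $line (@lines) {
        chomp(my $text = $line);
        next unless $text =~ /\S/;             # skip blank lines
        if ($text =~ /^\S+\.(?:xml|txt);/ or !@records) {
            push @records, $text;              # start of a new record
        }
        else {
            $text =~ s/^\s+//;                 # drop the continuation's indentation
            $records[-1] =~ s/\s+$//;          # and trailing blanks before the break
            $records[-1] .= " $text";          # join with a single space
        }
    }
    return @records;
}

my @sample = (
    "file1.xml;Description A, B\n",
    "   C, D, E...;SystemNumber\n",
    "file2.txt;Description X,Y,Z;SystemNumber\n",
    "file3.xml;Description M,N,\n",
    "   O,P,Q;SystemsNumber\n",
);
print "$_\n" for join_records(@sample);
```

Each returned record can then be passed to split /;/ to get the three fields.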
Re: join two lines in a file
by aquarium (Curate) on Jan 17, 2008 at 03:21 UTC
    Do you really want to do this in Perl? Because you could:
    a) get the file from its origin again, making sure no line break is introduced, by choosing a different transmission method/text editor/etc. to fetch the file
    b) use a simple regex or two in vi/vim/your_favorite_regex_editor. The regex is something like: use the Join command on all lines that start with a space
    the hardest line to type correctly is: stty erase ^H
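
For comparison, the "join every line that starts with a space" rule can also be written as a single Perl substitution over the whole text. This is a minimal sketch, assuming the file is small enough to hold in memory and that only continuation lines start with a space or tab; the name join_indented is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: join every line that starts with a space or tab
# onto the line before it, leaving a single space at the join.
sub join_indented {
    my ($text) = @_;
    # a newline followed by indentation marks a continuation line;
    # trailing blanks before the newline are swallowed as well
    $text =~ s/[ \t]*\n[ \t]+/ /g;
    return $text;
}

my $text = join '',
    "The Road Ahead1;Completely Revised and Up-to-Date; 002564418\n",
    "The road ahead3;[Address made before the Regional Foreign Policy\n",
    "    Conference; 004561963\n";
print join_indented($text);
```

A whitespace-only line would be treated as a continuation too, so such lines would need to be removed or handled separately first.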
