http://www.perlmonks.org?node_id=939487


in reply to string related query

The first thing I'd suggest is that you drop the do/while and do/until stuff, and learn more perlish ways to loop through files. Yes, "there is more than one way to do it," but sometimes one way really isn't as good as another.

If I understand the requirements correctly, you want to go through multiple lines (I'm assuming that's what the \n's signify), and put together the text in capital letters that follows the same "name" number before the first period. So you'd put together the texts that follow name1, name1.1, and name1.1.1, and then those that follow name2, name2.1, and so on. Assuming they'll come in numbered order like you show, and all lines will match your pattern, it's not too complicated. Loop through the lines, plucking out the digits following 'name' and the text following the first space. Check the number against a counter that you're keeping: if it has changed, start a new line for that number; if it hasn't changed, add your text to the line you're already working on. In code (which is pretty basic and would need to be expanded with error checking and probably a stricter regex for a real-world task):

abaugher@bannor:~/work/perl/monks$ cat 939472.pl
#!/usr/bin/perl
use Modern::Perl;

open my $in, '<', '939472.txt' or die $!;
my $n = 0;                          # current number
while(my $line = <$in>){
    chomp $line;
    # capture any digits directly following 'name'
    # and the text following the first space
    my( $nn, $text ) = $line =~ /name(\d+)\S*\s+(.+)$/;
    if( $nn == $n ){                # already working on this name?
        print " $text";             # if so, then print what we've got
    } else {                        # if not, start a new line
        print "\n" unless $n == 0;  # unless this is the first line
        print "name$nn $text";      # print the new number and text
        $n = $nn;                   # and update the name counter
    }
}
print "\n";

abaugher@bannor:~/work/perl/monks$ cat 939472.txt
name1 TOM RAT
name1.1 AND
name1.1.1 JERRY
name2 BAT MAN
name2.1 CAN
name2.1.1 FLY
name2.1.2 ANYWHERE

abaugher@bannor:~/work/perl/monks$ perl 939472.pl
name1 TOM RAT AND JERRY
name2 BAT MAN CAN FLY ANYWHERE
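As for the error checking and stricter regex mentioned above, here is one way it might look. This is only a sketch under a few assumptions of mine: the helper name group_lines is made up for illustration, the labels are assumed to look like 'name' plus a number plus optional ".number" parts at the start of the line, and lines that don't fit are skipped with a warning rather than treated as fatal. It uses plain strict and warnings so it runs without Modern::Perl installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): join the text of
# consecutive lines that share the same leading "name" number.
sub group_lines {
    my @lines = @_;
    my @out;        # finished output lines, in input order
    my $current;    # number of the group being built; undef before the first

    for my $line (@lines) {
        chomp $line;
        next if $line =~ /^\s*$/;    # ignore blank lines

        # Stricter pattern: 'name' anchored at the start of the line, a
        # number, optional ".number" parts, whitespace, then the text.
        my( $num, $text ) = $line =~ /^name(\d+)(?:\.\d+)*\s+(.+?)\s*$/;
        unless( defined $num ){
            warn "Skipping unrecognized line: $line\n";
            next;
        }

        if( defined $current && $num == $current ){
            $out[-1] .= " $text";           # same number: append to current line
        } else {
            push @out, "name$num $text";    # new number: start a new line
            $current = $num;
        }
    }
    return @out;
}

# Same sample data as 939472.txt, inlined so the sketch is self-contained.
my @sample = (
    "name1 TOM RAT", "name1.1 AND", "name1.1.1 JERRY",
    "name2 BAT MAN", "name2.1 CAN", "name2.1.1 FLY", "name2.1.2 ANYWHERE",
);
print "$_\n" for group_lines(@sample);
```

With the sample data this prints the same two lines as the original script. Collecting the results in an array instead of printing as you go also removes the need for the "unless this is the first line" special case, and a line that doesn't match no longer triggers an uninitialized-value warning in the numeric comparison.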

Aaron B.
My Woefully Neglected Blog, where I occasionally mention Perl.