PerlMonks  

string related query

by viktor (Acolyte)
on Nov 22, 2011 at 15:18 UTC ( #939472=perlquestion )
viktor has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks, I have a file similar to this: name1 TOM RAT\n name1.1 AND\n name1.1.1 JERRY\n name2 BAT MAN\n name2.1 CAN\n name2.1.1 FLY\n name2.1.2 ANYWHERE\n and I want output like this: name1 TOM RAT AND JERRY\n name2 BAT MAN CAN FLY ANYWHERE\n

I tried to write a script, but it doesn't work. I am new to Perl; please let me know, if possible, how I can do this.

#!/usr/bin/perl -w
use strict;

open(FILE,"test.txt");
my $line=<FILE>;
my $prev_name;
my $prev_id;
my ($new_id,$new_name);
do{
    ($prev_id,$prev_name)=split (' ',$line);
    do{
        $line=<FILE>;
        ($new_id,$new_name)=split (' ',$line);
        $new_id=~/^(n.+?)\.\d+/;
        my $check=$1;
        if ($check=~/$prev_id/){
            print "$prev_id\n";
            $prev_name.=$new_name;
        }
        else{
            print "$prev_id $prev_name" ;
        }
    }while ($prev_id=~/$new_id/);
}until eof(FILE);

Re: string related query
by Eliya (Vicar) on Nov 22, 2011 at 15:47 UTC

    There are many ways to do this.  Here's one suggestion:

    my %out;

    while (<DATA>) {
        chomp;
        my ($id, $name) = split ' ', $_, 2;   # split line into two parts
        ($id) = $id =~ /^([^.]+)/;            # extract "name1", "name2" as $id
        $out{$id} .= " $name";                # assemble stuff by $id (in a hash)
    }

    for my $id (sort keys %out) {
        print $id, $out{$id}, "\n";
    }

    __DATA__
    name1 TOM RAT
    name1.1 AND
    name1.1.1 JERRY
    name2 BAT MAN
    name2.1 CAN
    name2.1.1 FLY
    name2.1.2 ANYWHERE

    (feel free to ask if you need more detailed explanations)
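One thing to watch with the hash approach above: `sort keys %out` sorts as strings, so an id such as name10 would print before name2. If the output should follow the order of the input file instead, a possible variation is to record each id the first time it appears. This is only a sketch; `group_lines` and `@order` are hypothetical names chosen for illustration and are not part of the reply's code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# group_lines is a hypothetical helper (not from the reply above).
# It takes a list of input lines and returns one merged line per
# top-level id, in the order each id was first seen.
sub group_lines {
    my @lines = @_;
    my (%out, @order);
    for my $line (@lines) {
        chomp $line;
        my ($id, $name) = split ' ', $line, 2;     # id, then the rest of the line
        next unless defined $id && defined $name;  # skip blank or one-word lines
        my ($key) = $id =~ /^([^.]+)/ or next;     # "name2.1.1" -> "name2"
        push @order, $key unless exists $out{$key}; # remember first appearance
        $out{$key} .= " $name";
    }
    return map { "$_$out{$_}" } @order;
}

print "$_\n" for group_lines(
    "name1 TOM RAT\n",
    "name1.1 AND\n",
    "name2 BAT MAN\n",
    "name10 SUPER\n",
    "name10.1 MAN\n",
);
```

With this sample input the lines come out in file order (name1, name2, name10) rather than in string-sorted order.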

Re: string related query
by aaron_baugher (Deacon) on Nov 22, 2011 at 16:26 UTC

    The first thing I'd suggest is that you drop the do/while and do/until stuff, and learn more perlish ways to loop through files. Yes, "there is more than one way to do it," but sometimes one way really isn't as good as another.

    If I understand the requirements correctly, you want to go through multiple lines (I'm assuming that's what the \n's signify), and put together the text in capital letters that follows the same "name" number before the first period. So you'd put together the texts that follow name1, name1.1, and name1.1.1, and then those that follow name2, name2.1, and so on. Assuming they'll come in numbered order like you show, and all lines will match your pattern, it's not too complicated. Loop through the lines, plucking out the digits following 'name' and the text following the first space. Check the number against a counter that you're keeping, and if it has changed, start a new line for that number; if it hasn't changed, add your text to the line you're already working on. In code (which is pretty basic and would need to be expanded with error checking and probably a stricter regex for a real-world task):

    abaugher@bannor:~/work/perl/monks$ cat 939472.pl
    #!/usr/bin/perl
    use Modern::Perl;

    open my $in, '<', '939472.txt' or die $!;
    my $n = 0; # current number
    while(my $line = <$in>){
        chomp $line;
        # capture any digits directly following 'name'
        # and the text following the first space
        my( $nn, $text ) = $line =~ /name(\d+)\S*\s+(.+)$/;
        if( $nn == $n ){                # already working on this name?
            print " $text";             # if so, then print what we've got
        } else {                        # if not, start a new line
            print "\n" unless $n == 0;  # unless this is the first line
            print "name$nn $text";      # print the new number and text
            $n = $nn;                   # and update the name counter
        }
    }
    print "\n";

    abaugher@bannor:~/work/perl/monks$ cat 939472.txt
    name1 TOM RAT
    name1.1 AND
    name1.1.1 JERRY
    name2 BAT MAN
    name2.1 CAN
    name2.1.1 FLY
    name2.1.2 ANYWHERE

    abaugher@bannor:~/work/perl/monks$ perl 939472.pl
    name1 TOM RAT AND JERRY
    name2 BAT MAN CAN FLY ANYWHERE

    Aaron B.
    My Woefully Neglected Blog, where I occasionally mention Perl.

      thanks
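Aaron's reply notes that his loop would need error checking and a stricter regex for real-world use. One way that might look is sketched below, under a few assumptions that are not in the thread: ids have the form "name" plus digits plus optional ".digits" groups, unrecognized lines should be skipped with a warning, and `merge_stream` is a hypothetical helper name. It also tracks the current number with an undefined starting value rather than 0, so an id of name0 would not be mistaken for "no name seen yet".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# merge_stream is a hypothetical helper (not from the thread). It reads
# lines from a filehandle and returns the merged lines, assuming, as the
# reply does, that all lines for one name arrive together.
sub merge_stream {
    my ($in) = @_;
    my (@merged, $current);
    while (my $line = <$in>) {
        chomp $line;
        next if $line =~ /^\s*$/;    # ignore blank lines
        # Anchored pattern: "name", digits, optional ".digits" groups,
        # whitespace, then the text to collect.
        my ($nn, $text) = $line =~ /^name(\d+)(?:\.\d+)*\s+(.+)$/;
        unless (defined $nn) {
            warn "line $.: skipping unrecognized line: $line\n";
            next;
        }
        if (defined $current && $nn == $current) {
            $merged[-1] .= " $text";          # continue the current name
        } else {
            push @merged, "name$nn $text";    # start a new name
            $current = $nn;
        }
    }
    return @merged;
}

# Usage with an in-memory filehandle standing in for 939472.txt.
my $data = "name1 TOM RAT\nname1.1 AND\nname1.1.1 JERRY\n"
         . "bogus line\nname2 BAT MAN\nname2.1 CAN\n";
open my $in, '<', \$data or die "cannot open in-memory file: $!";
print "$_\n" for merge_stream($in);
close $in;
```

Returning the merged lines instead of printing as it goes keeps the parsing separate from the output, which makes the loop easier to check against sample data.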
