PerlMonks  

grep return the entire file not the line which matches

by vikashiiitdm (Novice)
on Aug 03, 2011 at 14:49 UTC ( [id://918288]=perlquestion )

vikashiiitdm has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks, in light of your suggestions I've modified the code, and it no longer returns the entire file. But there is one serious problem now, and even after spending two hours debugging it I am unable to fix it. The code I wrote is as follows.

    #!/usr/bin/perl
    open(INFILE, "$ARGV[0]") or die $!;
    open (ALL_OUT, ">Output_File") or die $!;
    while($line=<INFILE>)
    {
        @english=qw//;
        @hindi=qw//;
        chomp $line;
        print ALL_OUT "########################".$line."###############################\n";
        ($en_line,$hnd_line)=split(/\|/,$line);
        if($en_line =~m/:/i)
        {
            @english=split(/:/,$en_line);
            for($i=0;$i<=$#english;$i++)
            {
                $englan=substr @english[$i],3;
                $eng_dat.=`grep -w '$englan' /home/vikash/pro_1/en_1000`;
            }
            print ALL_OUT $eng_dat;
            if($hnd_line=~m/:/i)
            {
                @hindi=split(/:/,$hnd_line);
                for($j=0;$j<=$#hindi;$j++)
                {
                    $hindlan=@hindi[$j],3;
                    $hindi_dat.=`grep -w '$hindlan' /home/vikash/pro_1/HI_1000`;
                }
                print ALL_OUT $hindi_dat;
            }
            else{
                $hindlane=substr $hnd_line,3;
                $hindi_dat=`grep -w '$hindlane' /home/vikash/pro_1/HI_1000`;
                print ALL_OUT $hindi_dat;
            }
        }
        else{
            $englane=substr $en_line,3;
            $eng_dat=`grep -w '$englane' /home/vikash/pro_1/en_1000`;
            print ALL_OUT $eng_dat;
            if($hnd_line=~m/:/i)
            {
                @hindi=split(/:/,$hnd_line);
                for($j=0;$j<=$#hindi;$j++)
                {
                    $hindlan=substr @hindi[$j],3;
                    $hindi_dat.=`grep -w '$hindlan' /home/vikash/pro_1/HI_1000`;
                }
                print ALL_OUT $hindi_dat;
            }
            else{
                $hindlane=substr $hnd_line,3;
                $hindi_dat=`grep -w '$hindlane' /home/vikash/pro_1/HI_1000`;
                print ALL_OUT $hindi_dat;
            }
        }
    }

Efficiency is not at all my concern right now; I just want the desired output. The problem is that when I run it with my labels file and an entry has not one key but multiple keys separated by ':', so that split(/:/) is called, it also prints the result from the previous $line. I am pasting a few lines of the output and the input file.

########################EN-1000-0050-2|HI-1000-0050-2################# +############## <p id=EN--1000-0050-2> A medical team opened the brain with great care +, attempting to assess the hemispheres and lobes of this strange brai +n. Was it in any way different? Was it for instance, bigger? No, it w +as decided, in fact if anything, it was a little smaller than normal. + But still intelligence is a function of the entire brain. Yes, the f +rontal area was well developed, and the thalamus and hypothalamus wer +e well developed. All the neural connections se .n.ed particularly fi +rm. In fact it was a normal brain, but\97 'Yes?' asked one of the do +ctors. </p> <p id=HI--1000-0050-2> &#2319;&#2325; &#2350;&#2375;&#2337;&#2367;&#2 +325;&#2354; &#2342;&#2354; &#2344;&#2375; &#2348;&#2337;&#2368; &#236 +0;&#2366;&#2357;&#2343;&#2366;&#2344;&#2368; &#2360;&#2375; &#2313;&# +2360;&#2325;&#2375; &#2350;&#2360;&#2381;&#2340;&#2367;&#2359;&#2381; +&#2325; &#2325;&#2379; &#2326;&#2379;&#2354;&#2366; &#2340;&#2366;&#2 +325;&#2367; &#2311;&#2360; &#2357;&#2367;&#2330;&#2367;&#2340;&#2381; +&#2352; &#2350;&#2360;&#2381;&#2340;&#2367;&#2359;&#2381;&#2325; &#23 +25;&#2375; &#2361;&#2375;&#2350;&#2367;&#2360;&#2381;&#2347;&#2368;&# +2351;&#2352;&#2381;&#2360; &#2340;&#2341;&#2366; &#2354;&#2377;&#2348 +;&#2381;&#2360; &#2325;&#2368; &#2332;&#2377;&#2330; &#2346;&#2352;&# +2326; &#2325;&#2368; &#2332;&#2366; &#2360;&#2325;&#2375;| &#2325;&#2 +381;&#2351;&#2366; &#2351;&#2361; &#2325;&#2367;&#2360;&#2368; &#2340 +;&#2352;&#2361; &#2349;&#2367;&#2344;&#2381;&#2344; &#2341;&#2366;| & +#2350;&#2360;&#2354;&#2344;, &#2325;&#2369;&#2331; &#2348;&#2337;&#23 +66; &#2341;&#2366;| &#2344;&#2361;&#2368;, &#2320;&#2360;&#2366; &#23 +49;&#2368; &#2344;&#2361;&#2368;&#2306; &#2325;&#2361;&#2366; &#2332; +&#2366; &#2360;&#2325;&#2340;&#2366;| &#2309;&#2360;&#2354; &#2350;&# +2375;&#2306; &#2340;&#2379; &#2360;&#2366;&#2350;&#2366;&#2344;&#2381 +;&#2351; &#2360;&#2375; &#2325;&#2369;&#2331; 
&#2331;&#2379;&#2335;&# +2366; &#2361;&#2368; &#2341;&#2366;| &#2346;&#2352; &#2348;&#2369;&#2 +342;&#2381;&#2357;&#2367;&#2350;&#2340;&#2381;&#2340;&#2366; &#2340;& +#2379; &#2346;&#2370;&#2352;&#2375; &#2350;&#2360;&#2381;&#2340;&#236 +7;&#2359;&#2381;&#2325; &#2325;&#2366; &#2325;&#2366;&#2350; &#2361;& +#2379;&#2340;&#2366; &#2361;&#2376;&#2306;| &#2361;&#2366;&#2306; &#2 +310;&#2327;&#2375; &#2325;&#2366; &#2349;&#2366;&#2327; &#2346;&#2370 +;&#2352;&#2368; &#2340;&#2352;&#2361; &#2357;&#2367;&#2325;&#2360;&#2 +367;&#2340; &#2341;&#2366; &#2324;&#2352; &#2341;&#2376;&#2354;&#2375 +;&#2350;&#2360; &#2357; &#2361;&#2366;&#2311;&#2346;&#2379;&#2341;&#2 +376;&#2354;&#2375;&#2350;&#2360; &#2349;&#2368; &#2309;&#2330;&#2381; +&#2331;&#2368; &#2340;&#2352;&#2361; &#2357;&#2367;&#2325;&#2360;&#23 +67;&#2340; &#2341;&#2375;&#2306;| &#2346;&#2370;&#2352;&#2368; &#2340 +;&#2306;&#2340;&#2381;&#2352;&#2367;&#2325; &#2346;&#2381;&#2352;&#23 +39;&#2366;&#2354;&#2368; &#2325;&#2366;&#2347;&#2368; &#2350;&#2332;& +#2348;&#2370;&#2340; &#2346;&#2381;&#2352;&#2340;&#2368;&#2340; &#236 +1;&#2379; &#2352;&#2361;&#2368; &#2341;&#2368;| &#2342;&#2352;&#2309; +&#2360;&#2354; &#2351;&#2361; &#2319;&#2325; &#2360;&#2366;&#2350;&#2 +366;&#2344;&#2381;&#2351; &#2350;&#2360;&#2381;&#2340;&#2367;&#2359;& +#2381;&#2325; &#2341;&#2366;, &#2346;&#2352;- &#2346;&#2352;, &#231 +9;&#2325; &#2337;&#2366;&#2325;&#2381;&#2335;&#2352; &#2344;&#2375; & +#2346;&#2370;&#2331;&#2366;| </p> ########################EN-1000-0050-3|HI-1000-0050-3:HI-1000-0052-1:H +I-1000-0052-2:HI-1000-0052-3############################### <p id=EN--1000-0050-3> 'Ramu's brain clearly indicates that the evolut +ion of the human brain has not stopped,' the chief of the team said. +'We can say that any brain has room for anything we chose to put into + it. But arrangement and memory and expectations are the channels th +rough which it is done. 
Once the process of intelligence started, eac +h of the three children bore down on the others with a weight of expe +ctation and a piling on of intellectual memory.' 'And the emotions n +ever developed, not even with Tarangini?' 'No. They had lost their i +mportance. Perhaps we make too much of them.' Captain Uttama sighed. + He had been so close to the children; he had loved them, but had the +y ever loved him? Would he ever know? </p> <p id=HI--1000-0050-2> &#2319;&#2325; &#2350;&#2375;&#2337;&#2367;&#2 +325;&#2354; &#2342;&#2354; &#2344;&#2375; &#2348;&#2337;&#2368; &#236 +0;&#2366;&#2357;&#2343;&#2366;&#2344;&#2368; &#2360;&#2375; &#2313;&# +2360;&#2325;&#2375; &#2350;&#2360;&#2381;&#2340;&#2367;&#2359;&#2381; +&#2325; &#2325;&#2379; &#2326;&#2379;&#2354;&#2366; &#2340;&#2366;&#2 +325;&#2367; &#2311;&#2360; &#2357;&#2367;&#2330;&#2367;&#2340;&#2381; +&#2352; &#2350;&#2360;&#2381;&#2340;&#2367;&#2359;&#2381;&#2325; &#23 +25;&#2375; &#2361;&#2375;&#2350;&#2367;&#2360;&#2381;&#2347;&#2368;&# +2351;&#2352;&#2381;&#2360; &#2340;&#2341;&#2366; &#2354;&#2377;&#2348 +;&#2381;&#2360; &#2325;&#2368; &#2332;&#2377;&#2330; &#2346;&#2352;&# +2326; &#2325;&#2368; &#2332;&#2366; &#2360;&#2325;&#2375;| &#2325;&#2 +381;&#2351;&#2366; &#2351;&#2361; &#2325;&#2367;&#2360;&#2368; &#2340 +;&#2352;&#2361; &#2349;&#2367;&#2344;&#2381;&#2344; &#2341;&#2366;| & +#2350;&#2360;&#2354;&#2344;, &#2325;&#2369;&#2331; &#2348;&#2337;&#23 +66; &#2341;&#2366;| &#2344;&#2361;&#2368;, &#2320;&#2360;&#2366; &#23 +49;&#2368; &#2344;&#2361;&#2368;&#2306; &#2325;&#2361;&#2366; &#2332; +&#2366; &#2360;&#2325;&#2340;&#2366;| &#2309;&#2360;&#2354; &#2350;&# +2375;&#2306; &#2340;&#2379; &#2360;&#2366;&#2350;&#2366;&#2344;&#2381 +;&#2351; &#2360;&#2375; &#2325;&#2369;&#2331; &#2331;&#2379;&#2335;&# +2366; &#2361;&#2368; &#2341;&#2366;| &#2346;&#2352; &#2348;&#2369;&#2 +342;&#2381;&#2357;&#2367;&#2350;&#2340;&#2381;&#2340;&#2366; &#2340;& +#2379; &#2346;&#2370;&#2352;&#2375; 
&#2350;&#2360;&#2381;&#2340;&#236 +7;&#2359;&#2381;&#2325; &#2325;&#2366; &#2325;&#2366;&#2350; &#2361;& +#2379;&#2340;&#2366; &#2361;&#2376;&#2306;| &#2361;&#2366;&#2306; &#2 +310;&#2327;&#2375; &#2325;&#2366; &#2349;&#2366;&#2327; &#2346;&#2370 +;&#2352;&#2368; &#2340;&#2352;&#2361; &#2357;&#2367;&#2325;&#2360;&#2 +367;&#2340; &#2341;&#2366; &#2324;&#2352; &#2341;&#2376;&#2354;&#2375 +;&#2350;&#2360; &#2357; &#2361;&#2366;&#2311;&#2346;&#2379;&#2341;&#2 +376;&#2354;&#2375;&#2350;&#2360; &#2349;&#2368; &#2309;&#2330;&#2381; +&#2331;&#2368; &#2340;&#2352;&#2361; &#2357;&#2367;&#2325;&#2360;&#23 +67;&#2340; &#2341;&#2375;&#2306;| &#2346;&#2370;&#2352;&#2368; &#2340 +;&#2306;&#2340;&#2381;&#2352;&#2367;&#2325; &#2346;&#2381;&#2352;&#23 +39;&#2366;&#2354;&#2368; &#2325;&#2366;&#2347;&#2368; &#2350;&#2332;& +#2348;&#2370;&#2340; &#2346;&#2381;&#2352;&#2340;&#2368;&#2340; &#236 +1;&#2379; &#2352;&#2361;&#2368; &#2341;&#2368;| &#2342;&#2352;&#2309; +&#2360;&#2354; &#2351;&#2361; &#2319;&#2325; &#2360;&#2366;&#2350;&#2 +366;&#2344;&#2381;&#2351; &#2350;&#2360;&#2381;&#2340;&#2367;&#2359;& +#2381;&#2325; &#2341;&#2366;, &#2346;&#2352;- &#2346;&#2352;, &#231 +9;&#2325; &#2337;&#2366;&#2325;&#2381;&#2335;&#2352; &#2344;&#2375; & +#2346;&#2370;&#2331;&#2366;| </p> <p id=HI--1000-0050-3> &#145;&#2352;&#2366;&#2350;&#2370; &#2325;&#23 +66; &#2350;&#2360;&#2381;&#2340;&#2367;&#2359;&#2381;&#2325; &#2311;& +#2360; &#2323;&#2352; &#2360;&#2381;&#2346;&#2359;&#2381;&#2335; &#23 +60;&#2306;&#2325;&#2375;&#2340; &#2325;&#2352;&#2340;&#2366; &#2361;& +#2376;&#2306; &#2325;&#2367; &#2350;&#2366;&#2344;&#2357;-&#2350;&#23 +60;&#2381;&#2340;&#2367;&#2359;&#2381;&#2325; &#2325;&#2366; &#2357;& +#2367;&#2325;&#2366;&#2360; &#2309;&#2349;&#2368; &#2352;&#2369;&#232 +5;&#2366; &#2344;&#2361;&#2368;&#2306; &#2361;&#2376;&#2306;,&#146; & +#2342;&#2354; &#2325;&#2375; &#2346;&#2381;&#2352;&#2350;&#2369;&#232 +6; &#2344;&#2375; &#2325;&#2361;&#2366;, &#2361;&#2350; &#2325;&#2361 +; 
&#2360;&#2325;&#2340;&#2375; &#2361;&#2376;&#2306; &#2325;&#2367; & +#2325;&#2367;&#2360;&#2368; &#2349;&#2368; &#2350;&#2360;&#2381;&#234 +0;&#2367;&#2359;&#2381;&#2325; &#2350;&#2375;&#2306; &#2361;&#2350; & +#2309;&#2348; &#2309;&#2346;&#2344;&#2368; &#2311;&#2330;&#2381;&#233 +1;&#2367;&#2340; &#2330;&#2368;&#2332; &#2352;&#2326; &#2360;&#2325;& +#2340;&#2375; &#2361;&#2376;| &#2346;&#2352; &#2340;&#2352;&#2381;&#2 +325;, 52 &#2309;&#2306;&#2340;&#2352;&#2367;&#2325;&#2381;&#2359; &# +2325;&#2366; &#2357;&#2352;&#2342;&#2366;&#2344; &#2360;&#2381;&#2350 +;&#2352;&#2339; &#2358;&#2325;&#2381;&#2340;&#2367; &#2324;&#2352; &# +2310;&#2325;&#2366;&#2306;&#2325;&#2381;&#2359;&#2366;&#2323;&#2306; +&#2325;&#2375; &#2350;&#2366;&#2343;&#2381;&#2351;&#2350; &#2360;&#23 +75; &#2361;&#2368; &#2351;&#2361; &#2325;&#2366;&#2350; &#2325;&#2367 +;&#2351;&#2366; &#2332;&#2366; &#2360;&#2325;&#2340;&#2366; &#2361;&# +2376;&#2306; &#2332;&#2376;&#2360;&#2375; &#2361;&#2368; &#2348;&#236 +9;&#2342;&#2381;&#2357;&#2367; &#2325;&#2368; &#2346;&#2381;&#2352;&# +2325;&#2381;&#2352;&#2367;&#2351;&#2366; &#2358;&#2369;&#2352;&#2369; + &#2361;&#2369;&#2312;, &#2351;&#2375; &#2340;&#2368;&#2344;&#2379;&# +2306; &#2348;&#2330;&#2381;&#2330;&#2375;&#2306; &#2310;&#2325;&#2366 +;&#2325;&#2381;&#2359;&#2366;&#2323;&#2306; &#2325;&#2368; &#2358;&#2 +325;&#2381;&#2340;&#2367; &#2324;&#2352; &#2348;&#2369;&#2342;&#2381; +&#2357;&#2367;&#2350;&#2366;&#2344; &#2360;&#2381;&#2350;&#2371;&#234 +0;&#2367; &#2325;&#2375; &#2360;&#2381;&#2340;&#2306;&#2349; &#2325;& +#2375; &#2360;&#2361;&#2366;&#2352;&#2375; &#2324;&#2352;&#2379;&#230 +6; &#2360;&#2375; &#2310;&#2327;&#2375; &#2344;&#2367;&#2325;&#2354; +&#2327;&#2351;&#2375;| </p> <p id=HI--1000-0052-1> &#2324;&#2352; &#2349;&#2366;&#2357;&#2344;&#2 +366;&#2317; &#2340;&#2379; &#2313;&#2344;&#2350;&#2375;&#2306; &#2325 +;&#2349;&#2368; &#2357;&#2367;&#2325;&#2360;&#2367;&#2340; &#2361;&#2 +368; &#2344;&#2361;&#2368;&#2306; &#2361;&#2369;&#2312;| 
&#2340;&#235 +2;&#2306;&#2327;&#2367;&#2344;&#2368; &#2350;&#2375;&#2306; &#2349;&# +2368; &#2344;&#2361;&#2368;| </p> <p id=HI--1000-0052-2> &#2344;&#2361;&#2368;&#2306; &#2313;&#2344;&#2 +325;&#2375; &#2354;&#2367;&#2319; &#2349;&#2366;&#2357;&#2344;&#2366; +&#2323;&#2306; &#2325;&#2366; &#2350;&#2361;&#2340;&#2381;&#2357; &#2 +326;&#2340;&#2381;&#2350; &#2361;&#2379; &#2327;&#2351;&#2366; &#2341 +;&#2366;| &#2360;&#2306;&#2349;&#2357;&#2340;: &#2361;&#2350;&#2366;& +#2352;&#2375; &#2349;&#2368;&#2340;&#2352; &#2351;&#2375; &#2325;&#23 +69;&#2331; &#2332;&#2381;&#2351;&#2366;&#2342;&#2366; &#2361;&#2368; +&#2361;&#2376;&#2306;| </p> <p id=HI--1000-0052-3> &#2325;&#2346;&#2381;&#2340;&#2366;&#2344; &#2 +313;&#2340;&#2381;&#2340;&#2350; &#2333;&#2375;&#2306;&#2346; &#2327; +&#2351;&#2366;| &#2357;&#2361; &#2348;&#2330;&#2381;&#2330;&#2379;&#2 +306; &#2325;&#2379; &#2311;&#2340;&#2344;&#2366; &#2344;&#2367;&#2325 +;&#2335; &#2352;&#2361;&#2366; &#2341;&#2366; &#2357;&#2361; &#2313;& +#2344;&#2360;&#2375; &#2346;&#2381;&#2351;&#2366;&#2352; &#2325;&#235 +2;&#2344;&#2375; &#2354;&#2327;&#2366; &#2341;&#2366;, &#2346;&#2352; + &#2325;&#2381;&#2351;&#2366; &#2313;&#2344;&#2381;&#2361;&#2379;&#23 +06;&#2344;&#2375; &#2349;&#2368; &#2325;&#2349;&#2368; &#2313;&#2360; +&#2360;&#2375; &#2346;&#2381;&#2351;&#2366;&#2352; &#2325;&#2367;&#23 +51;&#2366; &#2361;&#2379;&#2327;&#2366;| &#2325;&#2351;&#2366; &#2357 +;&#2361; &#2325;&#2349;&#2368; &#2351;&#2361; &#2332;&#2366;&#2344; & +#2346;&#2366;&#2351;&#2375;&#2327;&#2366;| </p>
EN-1000-0050-2|HI-1000-0050-2
EN-1000-0050-3|HI-1000-0050-3:HI-1000-0052-1:HI-1000-0052-2:HI-1000-0052-3
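
One likely cause of the symptom described above, where the second entry also prints the paragraph found for the first: $eng_dat and $hindi_dat are globals that the script appends to with `.=` but never clears at the top of the while loop, so whatever the previous label line stored is still there. Below is a minimal sketch of the idea, not a drop-in replacement. The names find_paragraphs, lookup_side and @hindi_corpus are hypothetical, and an in-memory array stands in for the external `grep -w` call so that the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the HI_1000 corpus file: one paragraph per element.
my @hindi_corpus = (
    "<p id=HI--1000-0050-2> paragraph A </p>\n",
    "<p id=HI--1000-0050-3> paragraph B </p>\n",
    "<p id=HI--1000-0052-1> paragraph C </p>\n",
);

# Return every corpus line containing the key as a whole "word", roughly
# what grep -w does. The key is the label minus its three-character
# "HI-" / "EN-" prefix, as in the original script.
sub find_paragraphs {
    my ($label, $corpus) = @_;
    my $key = substr $label, 3;
    return grep { /(?<!\w)\Q$key\E(?!\w)/ } @$corpus;
}

# Collect the output for one side of a label line ("A:B:C" or just "A").
sub lookup_side {
    my ($side, $corpus) = @_;
    my $out = '';    # declared per call, so it always starts empty
    $out .= join '', find_paragraphs($_, $corpus) for split /:/, $side;
    return $out;
}

for my $line ('EN-1000-0050-2|HI-1000-0050-2',
              'EN-1000-0050-3|HI-1000-0050-3:HI-1000-0052-1') {
    my ($en_side, $hi_side) = split /\|/, $line;
    print "#### $line ####\n", lookup_side($hi_side, \@hindi_corpus);
}
```

Because `split /:/` on a string with no colon returns that string as a single element, the same loop also covers the single-key case, so the separate if/else branches are not needed in this sketch.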

Replies are listed 'Best First'.
Re: grep return the entire file not the line which matches
by moritz (Cardinal) on Aug 03, 2011 at 14:52 UTC

      Sir, that's not subtraction. Actually, I forgot to add the quotes around it. Please see my updated post and give your valuable comments.

        Never re-type the code, always use copy&paste when presenting it to others. A single character often makes the difference between a working and a broken program.

        Where's this update you speak of?
Re: grep return the entire file not the line which matches
by FunkyMonk (Chancellor) on Aug 03, 2011 at 15:41 UTC
    After fixing your lack of strictures, as suggested by moritz and Corion, you still have a problem. The first argument of grep is not a string to search for, rather, it should be an expression (/$en/, for example).

    You'll end up with something like:

    open(ENG,"<en_1000") or die $!;
    my @eng_dat=<ENG>;
    close ENG;
    my $en="EN-1000-0003-3";
    my @data= grep /$en/,@eng_dat;
    print @data;
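
    To illustrate the point about grep's first argument, here is a small sketch using made-up in-memory lines rather than the real en_1000 file. A plain non-empty string is simply a true value for every element, so everything passes; a match expression is evaluated against each element in turn.

```perl
use strict;
use warnings;

# Made-up sample lines standing in for the contents of en_1000.
my @eng_dat = (
    "<p id=EN--1000-0003-2> first paragraph </p>\n",
    "<p id=EN--1000-0003-3> second paragraph </p>\n",
);
my $en = "1000-0003-3";

# A plain scalar is not matched against anything; it is just tested for
# truth once per element, so every element is returned.
my @all = grep $en, @eng_dat;

# A match expression is tested against each element ($_) in turn.
# \Q...\E quotes any regex metacharacters in the id.
my @hits = grep /\Q$en\E/, @eng_dat;

print scalar(@all), " vs ", scalar(@hits), "\n";    # prints "2 vs 1"
```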

      Dear monk, thanks for the correction. My en_1000 file is made up of lines like the following, and I need to extract the whole line based on the id provided.

       <p id=EN--1000-0003-3> The object turned out to be a big meteorite. Uttama was delighted. He had never seen anything like it on sea or land before. Despite its journey in space and stay in water, it had retained its shape and colour.  </p>

        You still have the same problem I told you about a week ago. "EN--1000-0003-3" is not the same as "EN-1000-0003-3". Don't you see that in the file there are two "-" after the EN, whereas in the search string there is only one "-"?

        To make it even more obvious:

        EN--1000-0003-3 is not the same as EN-1000-0003-3
        The example I gave will match all of the lines that contain the search phrase. Perhaps you should try my example with your data and see what happens.
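
        The mismatch described above is easy to check directly. The sketch below uses a shortened copy of the sample line from this thread. The `tr/-//s` normalisation is just one possible workaround (an assumption, not something proposed in the thread): it squeezes repeated hyphens in a copy of the line before comparing.

```perl
use strict;
use warnings;

# Shortened copy of the sample line quoted earlier in the thread.
my $line = "<p id=EN--1000-0003-3> The object turned out to be a big meteorite. </p>";
my $id   = "EN-1000-0003-3";

# The id has one hyphen after EN, the file has two, so this does not match.
print "direct match: ", ($line =~ /\Q$id\E/ ? "yes" : "no"), "\n";

# One possible workaround: squeeze runs of hyphens in a copy of the line.
(my $squeezed = $line) =~ tr/-//s;
print "after squeezing hyphens: ", ($squeezed =~ /\Q$id\E/ ? "yes" : "no"), "\n";
```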
Re: grep return the entire file not the line which matches
by Corion (Patriarch) on Aug 03, 2011 at 14:53 UTC

    What is this line supposed to do:

    $en=EN-1000-0003-3;

    Please use strict and warnings to prevent such errors. Also, please test your code to make sure it exhibits your problem before posting.

      #!/usr/bin/perl
      $en=EN-1000-0003-3;
      print "\$en is $en\n";
      ---------
      $en is -1006

      Since -1006 is a true value, the grep in the OP returns every element of the source array, as stated.
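
      A small sketch of why strict helps here. It uses string eval only so that both behaviours can be shown in one runnable script; in a real program the strict failure would simply be a compile-time error.

```perl
use strict;
use warnings;

# Without strictures the bareword EN is taken as the string "EN", which is
# 0 in numeric context, and 0003 is an octal literal equal to 3.
my $unquoted = do {
    no strict 'subs';
    no warnings;
    eval q{ EN-1000-0003-3 };
};
print "unquoted: $unquoted\n";    # unquoted: -1006

# Under strict the same text does not even compile.
my $compiled = eval q{ my $en = EN-1000-0003-3; 1 };
print "strict says: $@" unless $compiled;

# What was intended: a quoted string.
my $en = "EN-1000-0003-3";
print "quoted: $en\n";
```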

Re: grep return the entire file not the line which matches
by GotToBTru (Prior) on Aug 03, 2011 at 15:45 UTC
    Check your syntax on the grep command as well. It should probably be:
    @data=grep /$en/,@eng_dat;
      This is correct. The syntax would be:
      @data=grep /$en/,@eng_dat;
      or it can be equivalently:
      @data=grep (/$en/,@eng_dat);
      as mentioned here: http://programmingbulls.com/perl-grep
