Re: find a string and count of its occurence in a text file
by GrandFather (Saint) on Nov 14, 2007 at 05:29 UTC
There is a lot of needless code in your sample, and a couple of foibles. Assuming you want to do something other than just print out the contents of the matching line, something like the following sample may be what you are after:
use strict;
use warnings;
my $fileContent = <<DATA;
abc:AB CD:100
def:DE FG:101
ghi:GH IJ:102
abc:AB CD:100
ghi:GH IJ:103
DATA
open FILE, '<', \$fileContent;
while (<FILE>) {
    chomp; # chomp not chop. $_ is default so omit
    my @elements = split /:/, $_;
    next unless $elements[0] eq 'abc';
    print "Matched: ", join ('|', @elements), "\n";
}
close FILE;
Prints:
Matched: abc|AB CD|100
Matched: abc|AB CD|100
Note that for purposes of the sample the "file" is actually just a string, although Perl allows it to be opened and manipulated as a file.
Generally it is a bad idea to rely on the contents of $_ remaining unaltered over more than a couple of lines of code. You are better off using an explicit variable in such cases, so that the intent of the code is clearer and the value doesn't get altered in unexpected ways.
Use the three-parameter open to make the intent clearer and for safety (what happens if the file name starts with '>' in your sample?).
grep on a single element can be replaced with an if.
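The last three points can be combined in one loop. This is a minimal sketch, assuming the same colon-separated records as the sample above; the names $data, $fh, $line and $key are illustrative only and do not come from the original question:

```perl
use strict;
use warnings;

# Hypothetical input: an in-memory "file", as in the sample above.
my $data = "abc:AB CD:100\ndef:DE FG:101\nabc:AB CD:100\n";

# Three-parameter open with a lexical handle: the mode can never be
# taken from the file name, whatever characters the name starts with.
open my $fh, '<', \$data or die "Can't open in-memory file: $!";

my $count = 0;
while (my $line = <$fh>) {          # explicit variable instead of $_
    chomp $line;
    my ($key) = split /:/, $line;   # first field only
    next unless defined $key && $key eq 'abc';   # plain test, no grep
    $count++;
}
close $fh;

print "Matched $count times\n";     # prints "Matched 2 times"
```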
Perl is environmentally friendly - it saves trees
Re: find a string and count of its occurence in a text file
by ysth (Canon) on Nov 14, 2007 at 06:19 UTC
$/="abc";
my $count = chomp(@/=<FILE>)/$/=~y///c;
So I ran perldoc -f chomp and saw this (in a page that is a good read in its entirety):
If you chomp a list, each element is chomped, and the total number of characters removed is returned.
and then I looked up the transliteration operator (y///, also spelled tr///) in perldoc perlop, searched inside the page for "Transliterates" and saw these:
- Options: c => Complement the SEARCHLIST.
- It returns the number of characters replaced or deleted
I figure, @/ is an ordinary array, just like @records, say. We already have a glob, */, and we are even using its special-purpose scalar portion in this example, so why not use its array slot, too? I wondered whether I could use chomp(()=<FILE>) instead, but no, it doesn't work. The assignment to the empty list probably succeeds in executing <FILE> in list context, but then throws away the results and does not provide chomp() an lvalue to work with.
The $/ =~ y///c bit, then, counts the number of characters in $/, the input-record-separator, by replacing everything, all chars in the input-record-separator (here described as the complement of nothing) with nothing and returning the number of chars thus replaced. You could just replace that whole expression with 3 in this particular case, as that's the number of characters in "abc", the value of the input-record-separator, but the counting makes the code portable to other input-record-separators.
The program is probably memory-hungry. Although it is not in slurp mode, all the lines seem to get stored in the @/ array, before chomp(LIST) has a chance to work on them.
In any case, chomp() cuts off all trailing occurrences of "abc" in @/ and returns the number of chars it thus cut off. Dividing that by three (that is, by $/=~y///c) then gives you how many times "abc" occurs in the file.
Is that right, monks?
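Spelled out with ordinary names, the one-liner appears to do something like the following. This is a sketch only, assuming an in-memory string in place of the real file; $data, @records and $removed are made-up names, and length is used where the original uses $/ =~ y///c:

```perl
use strict;
use warnings;

# Hypothetical data in which "abc" occurs three times.
my $data = "xxabcyyabczz abc\n";
open my $fh, '<', \$data or die "Can't open in-memory file: $!";

my ($removed, $count);
{
    local $/ = "abc";                 # records now end at each "abc"
    my @records = <$fh>;              # list context: every record is read into memory
    $removed = chomp(@records);       # total number of characters chomped off
    $count   = $removed / length $/;  # characters removed / separator length
}
close $fh;

print "removed $removed chars, so $count occurrences\n";
# prints "removed 9 chars, so 3 occurrences"
```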
The $/ =~ y///c bit, then, counts the number of characters in $/, the input-record-separator, by replacing everything, all chars in the input-record-separator (here described as the complement of nothing) with nothing and returning the number of chars thus replaced.
Not quite; since the REPLACEMENTLIST defaults to the (post-complementing) SEARCHLIST (except with /d), all the chars are replaced with themselves, not nothing. So $/ is unchanged. (Actually, tr aka y recognizes when it's only being used to count and can even be used on readonly strings then.)
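A short check of that behaviour, as a sketch with a throwaway variable ($sep is illustrative and stands in for $/):

```perl
use strict;
use warnings;

my $sep = "abc";

# Empty SEARCHLIST, complemented: matches every character. With no
# REPLACEMENTLIST and no /d, nothing in the string is changed.
my $n = ($sep =~ y///c);
print "$n characters, still '$sep'\n";   # prints "3 characters, still 'abc'"

# Counting-only tr also works on a constant, which could not be modified.
my $vowels = ("input record separator" =~ tr/aeiou//);
print "$vowels vowels\n";
```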
Re: find a string and count of its occurence in a text file
by narainhere (Monk) on Nov 14, 2007 at 05:27 UTC
use strict;
use warnings;
sub retriver();
my @lines;
my $lines_ref;
my $count;
$lines_ref=retriver();
@lines=@$lines_ref;
$count=@lines;
print "Count :$count\nLines\n";
print join "\n",@lines;
sub retriver()
{
    my $file='source_data\data.txt';
    open FILE, $file or die "FILE $file NOT FOUND - $!\n";
    my @contents=<FILE>;
    my @filtered=grep(/abc:/,@contents);
    return \@filtered;
}
The world is so big for any individual to conquer
Re: find a string and count of its occurence in a text file
by oha (Friar) on Nov 14, 2007 at 10:26 UTC
As you did, you must open the file; then for every line of the file you must check whether it starts with abc, and then you can increment a variable or print out what you need.
open FILE, $file or die "can't open $file: $!\n";
while(<FILE>)
{
    next unless /^abc:/;
    $counter++;
    chomp;
    print "$_\n"; # the current line is in $_
    # whatever you need
}
close FILE;
Doing it this way, you never load all the file's lines at once but parse them one by one.
As someone noticed, grep works on lists, so to use it you must load all the lines into an array (@array = <FILE>), which leads to memory issues if the file is big.
Oha
PS: Perl has the poetry of next unless, which is so much more beautiful than if(! COND) { continue } that I can't avoid posting it! :)
Re: find a string and count of its occurence in a text file
by Anonymous Monk on Nov 14, 2007 at 05:38 UTC
while ( <FILE> ) {
    print if 'abc' eq ( split /:/ )[ 0 ];
}
Or possibly this:
while ( <FILE> ) {
    print if /^abc:/;
}
And to chuck in an (incredibly primitive) idea for a line count as well:
_data.txt_
abc:AB CD:100
def:DE FG:101
ghi:GH IJ:102
abc:AB CD:100
ghi:GH IJ:103
my $count = 0; # start at 0 so the final print never sees undef
while ( <FILE> ) {
    print if /^abc:/;
    $count++ if /^abc:/; # same test as the print above
}
print "Matched $count times\n";
_output_
abc:AB CD:100
abc:AB CD:100
Matched 2 times
Update: I'm silly; added condition for incrementing $count.
i read that using grep is not advisable for huge files
Someone lied to you.
Maybe "lied" is a bit strong; they were probably thinking of this:
grep /fred/,<FILE>;
which loads the entire file into memory (or at least, tries to).
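The difference lies only in how the lines reach grep. A rough sketch, assuming a small in-memory string standing in for a large file ($data, $fh, $slurped and $streamed are placeholder names):

```perl
use strict;
use warnings;

my $data = "fred\nbarney\nfred and wilma\n";

# List context: <$fh> returns every line at once, before grep sees any.
open my $fh, '<', \$data or die "Can't open in-memory file: $!";
my $slurped = grep { /fred/ } <$fh>;    # grep in scalar context returns a count
close $fh;

# One line at a time: only the current line is held in memory.
open $fh, '<', \$data or die "Can't open in-memory file: $!";
my $streamed = 0;
while (my $line = <$fh>) {
    $streamed++ if $line =~ /fred/;
}
close $fh;

print "$slurped $streamed\n";           # prints "2 2"
```

Both give the same count; only the first needs room for the whole file.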
I thought he meant grep(1) instead of perldoc -f grep :-)
Re: find a string and count of its occurence in a text file
by TheForeigner (Initiate) on Nov 14, 2007 at 15:22 UTC
It looks like you have plenty to work with, but here's my solution:
open my $file, '<', 'in.txt' or die "Couldn't open file: $!\n"; # better way to make file handles; "or", not "||", so die can fire
my @matches;
foreach (<$file>) {                  # for each line
    push @matches, $_ if (/abc/);    # keep them in an array if they match
}
print @matches;                      # print the matches
print "Total matches: ", scalar(@matches), "\n"; # print the number of matches
Re: find a string and count of its occurence in a text file
by sundialsvc4 (Abbot) on Nov 15, 2007 at 03:28 UTC
Do not overlook any opportunity for using all of the tools that may be available to you. For instance, this particular requirement might be easily met by awk, without the use of any Perl programming at all!
And if that be the case... "cool!"
For instance, this particular requirement might be easily met by awk
In fact, there's a general Linux recipe for just these things, generally introduced right around the time whatever book/tute/etc. decides to introduce pipes:
$ egrep ^abc <filename> | wc -l
Disclaimer: This wouldn't work if the OP had, say, wanted to find the total number of occurrences of a given string that happened to occur more than once per line in the data; as it is, the OP doesn't (or at least, the data doesn't contain the sort of case that would prevent this working) :)
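For the case that the pipeline misses, a global match in list context counts every occurrence. A small Perl sketch, assuming made-up data in which the first line contains the string three times ($text, $occurrences and $lines are illustrative names):

```perl
use strict;
use warnings;

# Hypothetical data: "abc" appears three times on line 1 and once on line 3.
my $text = "abc:abc abc:100\ndef:DE FG:101\nabc:AB CD:100\n";

# Assigning the match list to () and then to a scalar yields the match count.
my $occurrences = () = $text =~ /abc/g;     # every occurrence, anywhere
my $lines       = () = $text =~ /^abc/mg;   # only lines that start with abc

print "$occurrences occurrences on $lines lines\n";
# prints "4 occurrences on 2 lines"
```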
Re: find a string and count of its occurence in a text file
by arasu (Initiate) on Apr 23, 2012 at 11:21 UTC
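open FH, '<', "inputDatafile.txt" or die "Can't open inputDatafile.txt: $!\n";
local $/; ## undef the input record separator to slurp the whole file ("" would be paragraph mode)
my $line = <FH>;
close (FH);
my $count = () = $line =~ /abc/g; ## count the matches without rewriting $line
print "count is : $count\n";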
"There is a solution for all problems but we need to find a direction"