
Text comparisons

by ScarryJerry (Initiate)
on Mar 24, 2004 at 19:15 UTC (#339518=perlquestion)

ScarryJerry has asked for the wisdom of the Perl Monks concerning the following question:

I am new to Perl.
I have looked at the FAQs, online, and at any other source I can find. I suspect there is a simple solution, but I am stuck.

I have a list of strings (function names) that I am trying to count in source files. I have the function names in one list, and I read the source file into another list. I have tried various ways of comparing them, for example:

foreach $sourceLine (@SourceData) { if ($functionName =~ /$sourceLine/) {$counter++;} }
$counter = grep (/$functionName/ , @SourceData);

Either way I always get a value of 0 for the counter.

Any suggestions as to what I am doing wrong?



Replies are listed 'Best First'.
Re: Text comparisons
by QM (Parson) on Mar 24, 2004 at 19:23 UTC
    I think your first example is backwards. Try
    foreach my $sourceLine ( @SourceData ) { if ( $sourceLine =~ /\b$functionName\b/ ) { $counter++; } }
    Note that \b matches a "word boundary", which prevents matching on non-function names embedded in other "words" (including other function names). But it doesn't handle the problem where function names are mentioned in comments.
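A minimal sketch of the word-boundary match described above, separating "how many lines mention the function" from "how many times it occurs". The helper names and the sample data are made up for illustration, and \Q...\E is added on the assumption that a name might contain regex metacharacters.

```perl
use strict;
use warnings;

# Hypothetical helper: count the lines that mention $function_name as a whole word.
sub count_matching_lines {
    my ($function_name, @source_lines) = @_;
    # \Q...\E quotes regex metacharacters in the name; \b anchors on word boundaries.
    return scalar grep { /\b\Q$function_name\E\b/ } @source_lines;
}

# Hypothetical helper: count every occurrence, even several on one line.
sub count_occurrences {
    my ($function_name, @source_lines) = @_;
    my $total = 0;
    for my $line (@source_lines) {
        # A /g match in list context returns all matches; "() =" counts them.
        my $matches_on_line = () = $line =~ /\b\Q$function_name\E\b/g;
        $total += $matches_on_line;
    }
    return $total;
}

# Made-up sample data.
my @source = (
    "x = read(fd); y = read(fd2);\n",
    "lines_read++;\n",
    "print(x);\n",
);

print count_matching_lines('read', @source), "\n";   # prints 1
print count_occurrences('read', @source), "\n";      # prints 2
```

The variable lines_read is not counted because the underscore is a word character, so there is no word boundary in front of "read".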

    Quantum Mechanics: The dreams stuff is made of

Re: Text comparisons
by CountZero (Bishop) on Mar 24, 2004 at 19:45 UTC
    A question: do you need to have the count per function or do you want to have an aggregate value (= how many times do all of the functions together appear in the source)?

    In the first case you need to have some form of variable for each function-name to keep the tally (an array or --even better-- a hash will do the trick).

    In the other case, you can build one big regex (all function names joined together with |, which is "or" in a regex) and feed your source to that regex.

    Warning: there can be many subtle traps and pitfalls:

    • Make sure that no function is a substring of another function, or you will count the shorter string twice (e.g. "print" and "printer"; and if you have a function "int", your tally will be way off!).
    • Variables that "contain" a function name will be counted as well if you're not careful (e.g. variable "lines_read" and function "read").
    • ...
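One way the two cases and the traps above might be handled together, as a sketch only: a hash keeps the per-function tally, a single alternation drives the aggregate count, and \b guards against substrings and variable names. The function names and source lines are invented sample data.

```perl
use strict;
use warnings;

# Made-up input: function names to look for and the source lines to scan.
my @function_names = qw(print printer int read);
my @source_lines = (
    "printer(page); print(page);\n",
    "lines_read = read(fd);\n",
    "n = int(x); print(n);\n",
);

# Build one alternation. Sorting longest-first is a precaution in case the \b
# anchors are ever dropped; with them, only whole words can match anyway.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } @function_names;
my $any_function = qr/\b($alternation)\b/;

my %count_for = map { $_ => 0 } @function_names;   # per-function tally
my $total     = 0;                                 # aggregate tally

for my $line (@source_lines) {
    # /g in scalar context walks through every match on the line.
    while ($line =~ /$any_function/g) {
        $count_for{$1}++;
        $total++;
    }
}

print "$_: $count_for{$_}\n" for sort keys %count_for;
print "total: $total\n";
```

With this sample data the output is int: 1, print: 2, printer: 1, read: 1, total: 5. Neither "printer" nor "lines_read" inflates the counts for "print", "int", or "read". Function names mentioned in comments or strings are still counted.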


    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: Text comparisons
by Vautrin (Hermit) on Mar 24, 2004 at 19:21 UTC

    Do you have:

    use strict; use warnings;

    at the top of your script? If you don't, try putting it in. If something is going wrong (e.g. the line you're reading in is undef), it will tell you. Also, can you post more of the script? There are a lot of possibilities for what is going wrong. For instance, if you are just using open ("FILE", "< ./file"); without an or die("Can't open the file $!"); and the open fails, you would always get an undef (or maybe a "") when you read in the lines.
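A sketch of the checked open mentioned above, written with a lexical filehandle and the three-argument form of open. The helper name read_lines is hypothetical, and the temporary file exists only so the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the lines of a file with line endings removed,
# or die with the operating system's error message if the open fails.
sub read_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Can't open $path: $!";
    chomp(my @lines = <$fh>);
    close($fh);
    return @lines;
}

# Demonstration with a throwaway file.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "foo();\nbar();\n";
close($tmp_fh);

my @source_data = read_lines($tmp_path);
print scalar(@source_data), " lines read\n";

# A missing file is reported instead of silently yielding no data.
my @missing = eval { read_lines("$tmp_path.does-not-exist") };
my $error = $@;
print "error: $error" if $error;
```

Without the or die, a failed open leaves an empty @source_data, and every count comes out as 0 with no hint as to why.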

    Want to support the EFF and FSF by buying cool stuff? Click here.
Re: Text comparisons
by McMahon (Chaplain) on Mar 24, 2004 at 19:31 UTC
    Hi Jerry...
    Try two things:
    First, depending on where you get $sourceLine and $functionName from, you might have invisible newlines (especially in $sourceLine) that are making your comparisons fail. The code below is probably overkill, but try doing this and see what happens:
    foreach $sourceLine (@SourceData) { chomp $sourceLine; chomp $functionName; if ($functionName =~ /$sourceLine/) {$counter++;} }
    Also, use the Poor Man's Debugger, and stick print statements where you suspect your data:
    foreach $sourceLine (@SourceData) { print "$sourceLine\n"; print "$functionName\n"; print "\n"; if ($functionName =~ /$sourceLine/) {$counter++;} }
    And of course, use warnings; and use strict;. Hope that helps...
    -Chris
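A possible refinement of the print-statement approach, sketched under the assumption that the trouble is invisible whitespace: wrap each value in brackets and spell out newlines, carriage returns, and tabs. The helper name visible is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: make invisible characters visible for debug output.
sub visible {
    my ($text) = @_;
    $text =~ s/\n/\\n/g;   # newline         -> literal \n
    $text =~ s/\r/\\r/g;   # carriage return -> literal \r
    $text =~ s/\t/\\t/g;   # tab             -> literal \t
    return "[$text]";
}

my $function_name = "someFunction  \n";   # made-up value with trailing spaces and a newline
print visible($function_name), "\n";      # prints [someFunction  \n]
```

The brackets show exactly where the string ends, so trailing spaces stand out at a glance.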
Re: Text comparisons
by ScarryJerry (Initiate) on Mar 25, 2004 at 14:10 UTC
    This is what I found out was happening. The list of function names had spaces at the end of each function name,

    i.e. myCamelCaseName had spaces after the last 'e'.
    Once I removed the spaces at the end, everything worked fine.
    Now, I did this through a search/replace, but is there something that will return the string with the trailing spaces stripped off?

    Thanks!
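One common way to strip trailing whitespace is a substitution anchored at the end of the string. The sketch below wraps it in a hypothetical rtrim helper that returns a trimmed copy; the sample names are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the string without trailing whitespace.
sub rtrim {
    my ($text) = @_;
    $text =~ s/\s+\z//;   # \s+ covers spaces, tabs, and newlines; \z is end of string
    return $text;
}

my @function_names = ("myCamelCaseName   ", "otherName\t\n", "clean");
my @trimmed = map { rtrim($_) } @function_names;
print "[$_]\n" for @trimmed;
```

To trim the list in place instead of making copies, s/\s+\z// for @function_names; does the same job, because the loop variable aliases each element.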

Node Type: perlquestion [id://339518]
Approved by Corion