PerlMonks  

Question regarding exact regexp matching

by Ekimino (Novice)
on Jun 28, 2012 at 21:35 UTC

Ekimino has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I have just finished reading the Llama and started a project to hone my Perl skills. My problem in the following code is that my regexp does not match a line exactly: it matches every logfile line that starts the same way, so it gives me wrong results. I wonder if it has to do with the (.*) line starter. I would also like to know what you think of my "skeleton_player" hash copying. Thank you for your time.
Logfile Example

    "ct1<4><BOT><>" entered the game
    "ct1<4><BOT><>" joined team "CT"
    "terror1<1><BOT><TERRORIST>" attacked "ct1<4><BOT><CT>" with "p228" (damage "26") (damage_armor "0") (health "74") (armor "0")
    "terror1<1><BOT><TERRORIST>" attacked "ct1<4><BOT><CT>" with "p228" (damage "27") (damage_armor "0") (health "47") (armor "0")
    "terror1<1><BOT><TERRORIST>" attacked "ct1<4><BOT><CT>" with "p228" (damage "27") (damage_armor "0") (health "20") (armor "0")
    "terror1<1><BOT><TERRORIST>" attacked "ct1<4><BOT><CT>" with "p228" (damage "28") (damage_armor "0") (health "-8") (armor "0")
    "terror1<1><BOT><TERRORIST>" killed "ct1<4><BOT><CT>" with "p228"
    "ct1<4><BOT><CT>" attacked "terror2<2><BOT><TERRORIST>" with "fiveseven" (damage "12") (damage_armor "2") (health "88") (armor "98")
    "ct1<4><BOT><CT>" attacked "terror3<3><BOT><TERRORIST>" with "fiveseven" (damage "13") (damage_armor "0") (health "2") (armor "0")

This is the code that matches wrongly.

    # kill / killed logging
    # $1 = killer
    # $2 = killed
    # $3 = weapon used
    if($_ =~ m/^"(.+)<.+><.+><.+>" killed "(.+)<.+><.+><.+>" with "(.+)"/){
        $players{$1}{"frags"} += 1;
        $players{$2}{"deaths"} += 1;
    }

As you can see in the logfile example, the player named "ct1" didn't kill anyone, but its name appears 4 times at the start of a line, which my regular expression treats as 4 kills. Is the error in starting my regexp with (.*)? If it's not too much to ask, I would also like to know what you think of this part: is it correct to just copy the hash?

    # The skeleton of each player hash
    my %skeleton_player = (
        "team_side"    => "0",
        "frags"        => "0",
        "deaths"       => "0",
        "suicides"     => "0",
        "dmg_done"     => "0",
        "dmg_received" => "0",
        "team_damage"  => "0",
        "team_kills"   => "0",
    );

    my %players = ();
    my @file = <>;

    foreach(@file){
        # Deletes the time information from each log line
        $_ =~ s/L [0-9]+\/[0-9]+\/[0-9]{4} - [0-9]+\:[0-9]+\:[0-9]+\: //g;

        # Creates a hash of a hash for players
        #
        # And Copies %skeleton_player to it
        #
        # %players
        #   - %player 1
        #       -frags
        #       -deaths
        #       -etc..
        #   - %player 2
        #       -etc..
        if($_ =~ m/"(.*)<.*><.*><.*>" entered the game/){
            unless(exists $players{$1}){
                $players{$1} = %skeleton_player;
            }
        }

Thank you for taking the time to read and think about it.

Replies are listed 'Best First'.
Re: Question regarding exact regexp matching
by jethro (Monsignor) on Jun 28, 2012 at 22:15 UTC
    $players{$1} = %skeleton_player;

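    You are trying to assign a hash to a scalar. A hash in scalar context does not give you its contents: on Perls older than 5.26 it returns a string that tells you something about the fill ratio of your hash, and on 5.26 and later it returns the number of keys. Try it out:

    perl -e ' %h=(1,5,2,8); $d=%h; print $d;' #prints "2/8" (or "2" on Perl 5.26 and later)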

    You want this instead:

    %{$players{$1}} = %skeleton_player;

    As you can see, now you are assigning a hash to a hash.
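
A minimal sketch of the copy described above, using a shortened skeleton with only two keys. The names `terror1` and `ct1` come from the sample log; the `{ %skeleton_player }` form is an equivalent alternative that is not in the original reply. Both forms make a shallow copy, which is enough here because every value in the skeleton is a plain scalar.

```perl
use strict;
use warnings;

# Shortened version of the skeleton from the question (assumption: two keys only).
my %skeleton_player = ( frags => 0, deaths => 0 );
my %players;

# Form from the reply: assign the hash into the dereferenced slot.
%{ $players{ct1} } = %skeleton_player;

# Equivalent form: store a reference to a fresh anonymous copy.
$players{terror1} = { %skeleton_player };

# Each player now has its own counters.
$players{terror1}{frags} += 1;

print "ct1 frags:      $players{ct1}{frags}\n";
print "terror1 frags:  $players{terror1}{frags}\n";
print "skeleton frags: $skeleton_player{frags}\n";
```

Incrementing one player's counter leaves the other player and the skeleton untouched, which is the behaviour the question was after.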

    PS: It helps with debugging to print out values of variables even when you think you know what is in there. Very helpful for this is Data::Dumper. Try the following before and after your logging loop and see what your data looks like (you also have to add "use Data::Dumper" at the start of your script):

    print Dumper(\%players),"\n";

    It is also very helpful to print out whatever gets matched in your logging loop, i.e. put this inside the match-success if-case:

    print "I matched $1 and $2\n";
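
Put together, the two debugging aids above might look like the following sketch. It assumes a shortened two-key skeleton and a single log line taken from the sample; `$Data::Dumper::Sortkeys` is a standard Data::Dumper setting, used here only to keep the key order stable between runs.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # print hash keys in sorted order

# Shortened skeleton (assumption: two keys only).
my %skeleton_player = ( frags => 0, deaths => 0 );
my %players;
$players{ct1} = { %skeleton_player };

print "Before:\n", Dumper( \%players ), "\n";

my $line = '"terror1<1><BOT><TERRORIST>" killed "ct1<4><BOT><CT>" with "p228"';
if ( $line =~ m/^"(.+)<.+><.+><.+>" killed "(.+)<.+><.+><.+>" with "(.+)"/ ) {
    print "I matched $1 and $2\n";
    $players{$1}{frags}  += 1;
    $players{$2}{deaths} += 1;
}

my $dump = Dumper( \%players );
print "After:\n", $dump, "\n";
```

Comparing the two dumps shows at a glance whether each player is a real nested hash and which counters changed.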
      Assigning the hash correctly has solved the problem. Thank you.
Re: Question regarding exact regexp matching
by aaron_baugher (Curate) on Jun 28, 2012 at 22:23 UTC
    Is the error in starting my regexp with (.*)?.

    Your regexp doesn't start with that. Did you mean to type (.+) perhaps? Either way, yes, that's probably the issue. All of your .+ elements will greedily match as much text as possible while still allowing the entire thing to match. When you have several of them in the same regexp, there's a good chance they will match more of your text than you expect. It may help to make them non-greedy by changing them to .+? , or make sure they can't match certain characters by using an exclusionary character class like ([^"]+) to match as many non-double-quote characters as possible. Or find a different way to parse out the data you need.
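
To illustrate the greedy, non-greedy and character-class variants, here is a small sketch that extracts only the weapon name from one of the "attacked" lines in the sample log. The shortened patterns are assumptions made for the demonstration; they are not the regexp from the question.

```perl
use strict;
use warnings;

my $line = '"terror1<1><BOT><TERRORIST>" attacked "ct1<4><BOT><CT>" with "p228" '
         . '(damage "26") (damage_armor "0") (health "74") (armor "0")';

# Greedy: .+ runs to the end of the line, then backs up to the LAST double quote.
my ($greedy) = $line =~ /with "(.+)"/;

# Non-greedy: .+? stops at the FIRST double quote that lets the match succeed.
my ($lazy) = $line =~ /with "(.+?)"/;

# Negated character class: the capture can never contain a double quote.
my ($negated) = $line =~ /with "([^"]+)"/;

print "greedy:  $greedy\n";
print "lazy:    $lazy\n";
print "negated: $negated\n";
```

The greedy capture swallows everything up to the final quote on the line, while the other two stop at `p228`. The negated class is usually the safest of the three because it states exactly which characters are allowed in the field.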

    Aaron B.
    Available for small or large Perl jobs; see my home node.

Re: Question regarding exact regexp matching
by CountZero (Bishop) on Jun 28, 2012 at 22:22 UTC
    There must be something else wrong somewhere in your code, because your regex does not match those "ct1" lines at all. It does match the "killed" lines.
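
This is easy to check by running the regexp from the question against a few lines of the sample log. The sketch below assumes the timestamp prefix has already been stripped and uses a correctly autovivified `%players` hash, so it tests the regexp alone and not the hash copying.

```perl
use strict;
use warnings;

my @log = (
    '"ct1<4><BOT><>" entered the game',
    '"ct1<4><BOT><>" joined team "CT"',
    '"terror1<1><BOT><TERRORIST>" attacked "ct1<4><BOT><CT>" with "p228" (damage "28") (damage_armor "0") (health "-8") (armor "0")',
    '"terror1<1><BOT><TERRORIST>" killed "ct1<4><BOT><CT>" with "p228"',
    '"ct1<4><BOT><CT>" attacked "terror2<2><BOT><TERRORIST>" with "fiveseven" (damage "12") (damage_armor "2") (health "88") (armor "98")',
);

my %players;
my $matches = 0;
for my $line (@log) {
    # The regexp exactly as posted in the question.
    if ( $line =~ m/^"(.+)<.+><.+><.+>" killed "(.+)<.+><.+><.+>" with "(.+)"/ ) {
        $matches++;
        $players{$1}{frags}  += 1;
        $players{$2}{deaths} += 1;
        print "matched: killer=$1 victim=$2 weapon=$3\n";
    }
}
print "lines matched: $matches\n";
```

Only the "killed" line matches, so terror1 gets one frag, ct1 gets one death, and ct1 gets no frags at all.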

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics
