PerlMonks  

Quick Fix for PerlMonks modules

by pope (Friar)
on Sep 24, 2002 at 11:17 UTC (#200334=monkdiscuss)

Greetings, fellow monks,

I was about to ask something in the chatterbox using Shendal's monkchat when I found that it simply didn't work: it showed neither the users currently online nor any messages. "This must be because of the recent changes to the site layout", I thought, so I grabbed the latest PerlMonks modules (version 2.2) and installed them, but the problem was still there.

After "super searching", I found only one report of the same problem (Does anybody else have this problem?), and it had received no response.

After looking into the PerlMonks modules, I found the fundamental problem: the XML is parsed with regexes in a fragile way, which is why it broke. Here is a quick fix to PerlMonks/Chat.pm and PerlMonks/Users.pm that simply replaces the parsing with XML::Simple's XMLin().

--- Chat.pm-old Tue Sep 24 17:45:12 2002
+++ Chat.pm Tue Sep 24 17:32:07 2002
@@ -11,6 +11,7 @@
 use strict;
 use vars qw(@ISA);
 use HTML::Entities;
+use XML::Simple;
 use PerlMonks;
 use PerlMonks::NewestNodes;
@@ -58,7 +59,15 @@
   # Get general chat messages
   if ($c=$self->getpage(CHAT_URL)) {
     $c=~s/[\r\n\t]//g;
-    my @msgs=($c=~/message\s+author="([^\"]+)"[^>]+>\s*(.*?)\s*<\/message>/g);
+
+    # problematic
+    # my @msgs=($c=~/message\s+author="([^\"]+)"[^>]+>\s*(.*?)\s*<\/message>/g);
+
+    my $msgs = XMLin($c, forcearray => 1)->{message};
+
+    my @msgs = map { $_->{author} => $_->{content} }
+      sort { $a->{time} <=> $b->{time} } $msgs ? @$msgs : ();
+
     if (@msgs) {
       while (@msgs) {
         my ($author, $msg)=(shift(@msgs),shift(@msgs));

--- Users.pm-old Tue Sep 24 17:45:18 2002
+++ Users.pm Tue Sep 24 17:37:28 2002
@@ -14,6 +14,7 @@
 use strict;
 use vars qw(@ISA);
+use XML::Simple;
 use PerlMonks;
 @ISA=qw(PerlMonks);
@@ -63,7 +64,13 @@
   my $self=shift;
   if ( (time() - $self->{cache_users_ts}) > USERS_REFRESH) {
     if (my $c=$self->getpage(USERS_URL)) {
-      my %users=($c=~/user\s+username="([^\"]+)"\s+user_id="(\d+)"/g);
+
+      # problematic
+      # my %users=($c=~/user\s+username="([^\"]+)"\s+user_id="(\d+)"/g);
+
+      my $users = XMLin($c, forcearray => 1)->{user};
+      my %users = map { $_->{username} => $_->{user_id} } $users ? @$users : ();
+
       $self->{cache_users}=\%users;
       $self->{cache_users_ts}=time();
     }
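
For readers who want to try the XMLin() approach outside the modules, here is a minimal standalone sketch. The sample XML is hypothetical: the root element name, the attribute set and the timestamp format are assumptions for illustration, not a copy of the real chatterbox feed. It also passes KeyAttr => [], which the patch above does not, to switch off XML::Simple's default folding on 'name', 'key' and 'id' attributes.

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical sample feed; the real one may use other names and attributes.
my $xml = <<'XML';
<CHATTER>
  <message author="pope" time="20020924111702">second</message>
  <message author = "Joost" time="20020924111500">first</message>
</CHATTER>
XML

# ForceArray => 1 wraps child elements in array refs, so a feed with one
# <message> has the same shape as a feed with many.
my $data = XMLin($xml, ForceArray => 1, KeyAttr => []);
my $msgs = $data->{message} || [];

# Element text is stored under the 'content' key when attributes are present.
my @sorted = sort { $a->{time} <=> $b->{time} } @$msgs;

print "<$_->{author}> $_->{content}\n" for @sorted;
```

Note that the second message has spaces around the equals sign, which is legal XML and is handled by the parser without any special care.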

Re: Quick Fix for PerlMonks modules
by Joost (Canon) on Sep 24, 2002 at 14:50 UTC
    There are already a bunch of fixes available at PerlMonks modules 2.0. Zzamboni recently moved the modules to sourceforge and asked whether anyone wanted them continued. Some people did, some didn't, but as far as I could see he did not fix them.

    Also, you might want to get in contact with Jouke, who expressed interest in taking over the modules from Zzamboni, but I haven't heard from either of them since.

    -- Joost downtime n. The period during which a system is error-free and immune from user input.
      Thanks for the info, Joost. I knew it must have been discussed already somewhere in the monastery; I just couldn't find it.

      However, after looking at the fixes posted at PerlMonks modules 2.0, I notice that none of them touches the real problem: the XML is not being parsed in a sane way. For example, they will all simply fail if there are spaces between an attribute and its value. With some refinement, one could probably end up with a regex that really works, but I think it's much simpler to let a good XML module do the job.
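
The fragility is easy to reproduce with core Perl alone. The sketch below runs the original Users.pm pattern against a few hypothetical <user> elements (invented for illustration) that an XML parser would treat as equivalent; only the first spelling is matched by the regex.

```perl
use strict;
use warnings;

# The pattern from the original Users.pm.
my $re = qr/user\s+username="([^\"]+)"\s+user_id="(\d+)"/;

# Hypothetical inputs: the same data written in different, equally valid ways.
my %samples = (
    'as expected'        => '<user username="pope" user_id="1"/>',
    'spaces around ='    => '<user username = "pope" user_id = "1"/>',
    'attributes swapped' => '<user user_id="1" username="pope"/>',
    'single quotes'      => "<user username='pope' user_id='1'/>",
);

# Record whether the regex matches each variant, then report it.
my %matched = map { $_ => ($samples{$_} =~ $re ? 1 : 0) } keys %samples;

for my $name (sort keys %matched) {
    printf "%-20s %s\n", $name, $matched{$name} ? 'matches' : 'no match';
}
```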

      -- pope who is not a pope, or the pope

        I know. I started to replace some of the matching with XML::Simple but simply got bored with it, especially because I think it is time for a complete rewrite of these modules.
        -- Joost downtime n. The period during which a system is error-free and immune from user input.

Node Type: monkdiscuss [id://200334]
Approved by valdez