Weird behaviour with HTML::Entities::decode

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Here's my code:

BEGIN { $ENV{LC_ALL} = $ENV{LANG} = 'sv_SE.UTF-8' }

use strict;
use warnings;
use utf8;

use open ':std', ':locale';

use LWP::UserAgent qw( );
use HTML::Strip    qw( );
use HTML::Entities qw( decode_entities );
use Text::Sentence qw( split_sentences );

my $userAgent = LWP::UserAgent->new();
#$userAgent->agent('Mozilla/5.0');

my $url = "http://www.expressen.se";

my $response = $userAgent->get($url);
die "Can't get $url: ", $response->status_line
    unless $response->is_success;

my $hs = HTML::Strip->new( decode_entities => 0 );
my $parsedContent = $hs->parse( $response->content );
utf8::decode( my $decodedParsedContent = $parsedContent );
$decodedParsedContent =~ s/\s+/ /g; # collapse each run of whitespace into one space
decode_entities(my $decodedParsedContentWithDecodeEntities = $decodedParsedContent);

my @sentences = split_sentences( $decodedParsedContentWithDecodeEntities );
foreach my $sentence (@sentences) 
{
    #$sentence =~ s/^\s+//; # remove leading whitespace
    #$sentence =~ s/\s+$//; # remove trailing whitespace
    decode_entities(my $sentenceDecodeEntities = $sentence);
    
    while ($sentenceDecodeEntities =~ /(\w+)/g) 
    {
        print "$1 : ".$sentenceDecodeEntities."\n";
    }
}

One of my output lines is:

gor : BLOGG Europe Turnéblogg Mic i vår replokal "The Dungeon" BLOGG Lotta Gröning Krönikör Demokratifiasko...

That looks good. However, if I comment out either of the two calls to decode_entities(), I end up with:

gor : BLOGG Europe Turnéblogg Mic i vår replokal &quot;The Dungeon&quot; BLOGG Lotta Gröning  Krönikör Demokratifiasko...

Why do I need both calls to decode_entities()?
Thanks very much for your help!
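
For reference, here is a minimal sketch of one situation in which two passes would be needed. It assumes, without having checked the actual page source, that the text contains doubly-escaped entities such as &amp;quot; rather than &quot;. The sample string and the variable names $raw, $once and $twice are made up for illustration; only decode_entities() comes from the script above.

```perl
use strict;
use warnings;
use HTML::Entities qw( decode_entities );

# Hypothetical input: the quote marks have been escaped twice (&amp;quot;).
my $raw = 'replokal &amp;quot;The Dungeon&amp;quot; BLOGG';

# In non-void context decode_entities() returns a decoded copy.
# The first pass only turns &amp; into &, leaving a literal &quot; behind.
my $once = decode_entities($raw);

# The second pass turns the remaining &quot; into a real quote character.
my $twice = decode_entities($once);

print "once:  $once\n";
print "twice: $twice\n";
```

If the input really is escaped twice, a single call would leave &quot; in the output, which would match what I see when either call is commented out.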
