
Mojo::URL returns incorrect absolute path

by mr_p (Scribe)
on May 16, 2013 at 16:52 UTC ( #1033868=perlquestion )
mr_p has asked for the wisdom of the Perl Monks concerning the following question:

Hello Everyone,

I am running into an issue where Mojo::URL's to_abs returns an incorrect absolute path. Can someone help me with this?

Results are: They should be:
#!/usr/bin/perl
use 5.010;
use open qw( :std :utf8 );
use strict;
use utf8;
use warnings qw(all);

use Data::Dumper;
use Mojo::UserAgent;

# FIFO queue
my $linkUrl = "";
my $ua = Mojo::UserAgent->new(max_redirects => 2)->detect_proxy;
my $tx = $ua->get($linkUrl);

for my $e ($tx->res->dom('a[href]')->each) {
    my $link = Mojo::URL->new($e->{href});
    next if 'Mojo::URL' ne ref $link;
    $link = $link->to_abs($tx->req->url)->fragment(undef);
    next unless grep { $link->protocol eq $_ } qw(http https);
    if ($link->to_string =~ /rss/ ) {
        print $link->to_abs;
        print "\n";
    }
}

Replies are listed 'Best First'.
Re: Mojo::URL returns incorrect absolute path
by McA (Priest) on May 17, 2013 at 02:15 UTC


    I looked at your problem and found the following:

    You fetch the URL. In the resulting HTML document there is a link with href="page/rss". If you built the resulting URL manually, you would take the URL you requested and resolve the relative URL page/rss against it. The result is what you get from Mojolicious.

    Now the big "BUT":

    In that HTML document there is a tag <base href="" /> stating that every relative URL should be resolved against that base URL. This means your href="page/rss" is resolved against the href of the <base> tag instead of the request URL, and that gives the URL you want.

    The question remains: should Mojolicious respect a base tag on its own, or are you responsible for extracting the base tag and passing it to your absolute-URL-generating code?
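    If it turns out to be your responsibility, a minimal sketch of a workaround could look like the following. The helper name resolve_links, the example.com URLs, and the HTML are made up for illustration and are not from the original post; only Mojo::DOM's at/find/attr and Mojo::URL's to_abs are real library calls.

```perl
use strict;
use warnings;
use Mojo::DOM;
use Mojo::URL;

# Hypothetical helper: resolve every a[href] in $html to an absolute URL,
# using <base href> as the base when the document has one and falling
# back to the request URL otherwise.
sub resolve_links {
    my ($html, $request_url) = @_;

    my $dom  = Mojo::DOM->new($html);
    my $base = Mojo::URL->new($request_url);

    # A <base href> may itself be relative, so resolve it against the
    # request URL first.
    if (my $tag = $dom->at('base[href]')) {
        my $href = $tag->attr('href');
        $base = Mojo::URL->new($href)->to_abs($base)
            if defined $href && length $href;
    }

    return map { Mojo::URL->new($_->attr('href'))->to_abs($base)->to_string }
        $dom->find('a[href]')->each;
}

# Made-up document and request URL, for illustration only.
my $html = <<'HTML';
<html><head><base href="http://example.com/"></head>
<body><a href="page/rss">RSS</a></body></html>
HTML

print "$_\n" for resolve_links($html, 'http://example.com/section/index.html');
```

    With the base tag present, the link is resolved against http://example.com/ rather than against the /section/ directory of the request URL.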

    Since I took the time to look at your problem, I would like to ask you to put the question to the Mojolicious maintainers whether this behaviour is intentional.

    Best regards

      Thanks for the explanation of the problem.

      I understand what you are saying. I was just expecting Mojolicious to behave the same way a browser does.


      FYI: I have posted the question to the Mojolicious maintainers.

      Do you know a workaround for this issue?

Re: Mojo::URL returns incorrect absolute path
by Anonymous Monk on May 16, 2013 at 17:18 UTC
    um, how about code without all that useragent/DOM stuff?
      I need to use the dom for the links, don't I?

        I need to use the dom for the links, don't I?

        You say there is a problem with Mojo::URL, that it returns an incorrect absolute path, meaning my $link = Mojo::URL->new($e->{href}); gives you the wrong thing.

        So employ the Basic debugging checklist and How do I post a question effectively? to test your hypothesis :)

        Use Data::Dumper, dump the href, and let's see what it does.

        You think Mojo::URL is the problem? Fantastic, let's check.
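        A stripped-down check along those lines, with no user agent or DOM involved, might look like this. Both the href and the base URL are hypothetical stand-ins, since the real values were not posted.

```perl
use strict;
use warnings;
use Data::Dumper;
use Mojo::URL;

# Hypothetical inputs: substitute the href you actually extracted and
# the URL you actually requested.
my $href = 'page/rss';
my $base = Mojo::URL->new('http://example.com/section/index.html');

# Step 1: look at the raw href exactly as it came out of the document.
print Dumper($href);

# Step 2: resolve it against the base with nothing else in the way.
my $abs = Mojo::URL->new($href)->to_abs($base);
print $abs->to_string, "\n";
```

        If this prints the URL you would expect for that base, Mojo::URL is doing its job and the base you pass in is the thing to look at.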
