PerlMonks  

Re: Get non transformed XML

by erroneousBollock (Curate)
on Nov 22, 2007 at 08:49 UTC (#652327)


in reply to Get non transformed XML

Is there a way to use LWP::Simple to get the source of an XML document without the XSL transformation?
I doubt LWP::Simple has anything to do with the XSL transformation of an XML document served by a webserver.

If I go to the site and hit view source, I can see the XML with no problem.
My intuition is that the webserver is inspecting the browser "agent" string and has decided that your client (LWP::Simple) can't apply the stylesheet itself, so the webserver is doing the transformation server-side for you.
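One way to test that guess is to request the same URL with two different agent strings and compare what comes back. The following is only a sketch: the `compare_agents` helper and the `%agents` table are names made up for illustration, and the URL is taken from the command line rather than hard-coded.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Agent strings to compare: LWP's own default and the Firefox string
# quoted in this thread. (%agents is a hypothetical name for this sketch.)
my %agents = (
    lwp_default => LWP::UserAgent->new->agent,
    firefox     => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/2006120418 Firefox/2.0.0.1',
);

# Fetch $url once per agent string and report status, content type and size.
sub compare_agents {
    my ($url) = @_;
    for my $name ( sort keys %agents ) {
        my $ua = LWP::UserAgent->new( timeout => 10 );
        $ua->agent( $agents{$name} );
        my $res  = $ua->get($url);
        my $type = $res->header('Content-Type') || 'unknown';
        printf "%-12s %s  type=%s  bytes=%d\n",
            $name, $res->status_line, $type, length( $res->content );
    }
}

# Only touch the network when a URL is given on the command line.
compare_agents( $ARGV[0] ) if @ARGV;
```

If the two lines differ in content type or size, the server is most likely branching on the agent string (or on another request header).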

Try using LWP::UserAgent and:

$ua->agent('Mozilla/5.0');

$ua->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/2006120418 Firefox/2.0.0.1');

Update: fixed agent string, thanks Gangabass.

-David


Re^2: Get non transformed XML
by Danikar (Novice) on Nov 22, 2007 at 08:57 UTC
    I just tried the code below and received the same thing =(
    require LWP::UserAgent;
    
    my $ua = LWP::UserAgent->new;
    $ua->timeout(10);
    $ua->env_proxy;
    $ua->agent('Mozilla/5.0');
    
    my $response = $ua->get('http://www.wowarmory.com/');
    
    if ($response->is_success) 
    {
    	print $response->content;  # or whatever
    }
    else 
    {
    	die $response->status_line;
    }

      I think this is not enough.

      Try this UserAgent:

      $ua->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/2006120418 Firefox/2.0.0.1');

      If this does not help, then try the User-Agent string that your own browser sends to the target site (you can see it with HTTP::Proxy).
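For the HTTP::Proxy suggestion, a minimal sketch might look like the following. It assumes HTTP::Proxy is installed from CPAN and that the browser is configured to use localhost:8080 as its HTTP proxy; the port number and the `run` command-line switch are arbitrary choices for this example.

```perl
use strict;
use warnings;
use HTTP::Proxy;
use HTTP::Proxy::HeaderFilter::simple;

# Port 8080 is an assumption; pick any free port and point the browser at it.
my $proxy = HTTP::Proxy->new( port => 8080 );

# A header filter is called with ($self, $headers, $message) for each request.
# Here it only logs the User-Agent header and leaves the request unchanged.
my $log_agent = HTTP::Proxy::HeaderFilter::simple->new(
    sub {
        my ( $self, $headers, $message ) = @_;
        my $agent = $headers->header('User-Agent') || '(none)';
        print STDERR $message->uri, " => $agent\n";
    }
);
$proxy->push_filter( request => $log_agent );

# start() blocks and serves requests, so it only runs when asked to.
$proxy->start if @ARGV && $ARGV[0] eq 'run';
```

Run it with `run` as the first argument, load the page in the browser, and copy the logged string into `$ua->agent(...)`. This only shows headers for plain HTTP requests, which is what the site in this thread uses.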

        That worked!

        Thanks a lot.

      Firefox DownThemAll addon retrieves 183 bytes.
      wget retrieves 23k.

      I think it's safe to say it's some sort of header :-)
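To see which default headers LWP would attach for a given agent string, the request can be prepared without being sent. This is a sketch under the assumption that a reasonably recent LWP is installed; it shows only the headers LWP::UserAgent sets itself, not the ones added later at the protocol level (such as Host).

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;
$ua->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/2006120418 Firefox/2.0.0.1');

# Build the request, then let the user agent fill in its default headers.
# prepare_request() modifies the request object but does not send it.
my $req = HTTP::Request->new( GET => 'http://www.wowarmory.com/' );
$ua->prepare_request($req);

# Print the request line and headers as LWP has prepared them.
print $req->as_string;
```

Comparing this output with what a browser or wget sends can help narrow down which header the server is reacting to.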

      Update: fixed in first reply.

      -David
