PerlMonks  

Parsing Framed Web Pages

by CaMelRyder (Pilgrim)
on Aug 05, 2008 at 20:43 UTC ( [id://702488]=perlquestion )

CaMelRyder has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to pull down/parse a web page that has frames in it. LWP::UserAgent returns some page about how I should be using a browser that supports frames. What modules/solutions do you fine people use to handle framed web pages?

I tried LWP::UserAgent::FramesReady to no avail. Couldn't get it to compile on two different machines.

¥peace from CaMelRyder¥

Replies are listed 'Best First'.
Re: Parsing Framed Web Pages
by hangon (Deacon) on Aug 05, 2008 at 21:11 UTC

    What you have is the underlying frameset page. You should be able to parse the HTML for the URLs of the actual content pages that are loaded into the frames, then fetch them directly. It will look something like the following. You want the src= attribute inside the <frame> tags.

    <frameset frameborder="0" framespacing="0" border="0" rows="120,*">
      <frame src="menu.html" name="topmenu" target="body" scrolling="no" noresize>
      <frame src="main.html" name="body" scrolling="auto">
    </frameset>
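A minimal sketch of that approach, assuming the HTML::TokeParser and URI modules are installed (both are normally pulled in alongside LWP). The helper name frame_urls and the command-line handling are made up for illustration and are not from the thread. It collects the src= attribute of each <frame> (and <iframe>) tag, resolves it against the URL of the frameset page, and, if a URL is passed on the command line, fetches each frame directly.

```perl
use strict;
use warnings;
use HTML::TokeParser;
use URI;

# Hypothetical helper: return the absolute URLs named by the src= attributes
# of <frame> and <iframe> tags. $html is the frameset markup, $base is the
# URL that markup was fetched from.
sub frame_urls {
    my ($html, $base) = @_;
    my $parser = HTML::TokeParser->new(\$html)
        or die "Cannot create parser";
    my @urls;
    while (my $tag = $parser->get_tag('frame', 'iframe')) {
        my $src = $tag->[1]{src};          # attribute hash; keys are lower-cased
        next unless defined $src && length $src;
        push @urls, URI->new_abs($src, $base)->as_string;
    }
    return @urls;
}

# Only touch the network when a frameset URL is given on the command line.
if (@ARGV) {
    require LWP::UserAgent;
    my $ua  = LWP::UserAgent->new;
    my $res = $ua->get(shift @ARGV);
    die $res->status_line unless $res->is_success;

    for my $url (frame_urls($res->decoded_content, $res->base)) {
        my $page = $ua->get($url);
        printf "%s: %s\n", $url, $page->status_line;
        # $page->decoded_content holds the frame's HTML for further parsing
    }
}
```

Resolving each src= against the response's base URL matters because framesets usually use relative paths such as menu.html. Nested framesets would need the same step repeated on each fetched page.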
      Nice. When I read your post, I was hit with the "why didn't I think of that" feeling. While I am at work right now and can't try it, I'm sure that is exactly the insight that I needed.
      ¥peace from CaMelRyder¥
Re: Parsing Framed Web Pages
by Perlbotics (Archbishop) on Aug 05, 2008 at 21:04 UTC
    Haven't done that yet, but I guess the server cannot identify your LWP::UserAgent as a browser that supports frames, so it sends a notification about that. The manual says that agent (which becomes the User-Agent: HTTP header) is set to "libwww-perl/#.##" by default. Try changing it to e.g. Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0). HTH
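For reference, a short sketch of overriding the default agent string with LWP::UserAgent's agent method. The browser string is the one suggested above; whether the server actually responds differently to it is an assumption that would need testing against the real site.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# Replace the default "libwww-perl/#.##" User-Agent header value.
$ua->agent('Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)');

# Only fetch when a URL is given on the command line.
if (@ARGV) {
    my $res = $ua->get($ARGV[0]);
    print $res->status_line, "\n";
}
```

The same value can also be passed at construction time as LWP::UserAgent->new(agent => '...').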
Re: Parsing Framed Web Pages
by cormanaz (Deacon) on Aug 05, 2008 at 20:57 UTC
    If you're on Windoze you can try Win32::IEAutomation.

    Also, when you say LWP::UserAgent::FramesReady won't compile, I assume you're trying to do it manually. You might try to install with ppm (on Windoze) or CPAN (on Linux) to be sure you have all the dependencies in order.

    Good luck...Steve

      I'm using Linux, and the perl -MCPAN route didn't work for installing FramesReady.
      ¥peace from CaMelRyder¥
Re: Parsing Framed Web Pages
by spivey49 (Monk) on Aug 05, 2008 at 21:06 UTC

    I've used LWP::UserAgent::FramesReady in the past.

    What issues are you having installing the module? Have you tried ppm or CPAN to make sure you have the required dependencies?

      It would compile, but it would fail so many tests that it's unusable. FramesReady seems to be a lost cause for me.
      ¥peace from CaMelRyder¥

Node Type: perlquestion [id://702488]
Approved by cormanaz