PerlMonks  

Using the contents of

by Popcorn Dave (Abbot)
on Sep 22, 2008 at 06:24 UTC

Popcorn Dave has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,

I've googled around for this but perhaps I'm not looking in the right place.

Consider the following snippet:

my $source = get('http://www.drudgereport.com');
# write temp source file to parse
open FH, '>', 'temp.html' or die "Can't open source file\n";
print FH $source;
close FH;
my $stream = HTML::TokeParser->new('temp.html') || die "Couldn't read HTML file $source";

Is there a way that I can eliminate writing the captured data from $source and just pass it directly to the HTML::TokeParser object?

To me it seems like an unnecessary step to be performing, but like I said I tried to google and apparently wasn't hitting the right keywords.

Thanks in advance!


Revolution. Today, 3 O'Clock. Meet behind the monkey bars.

I would love to change the world, but they won't give me the source code

Re: Using the contents of
by Corion (Patriarch) on Sep 22, 2008 at 06:28 UTC

    The documentation says that, as an alternative to

    $p = HTML::TokeParser->new( $filename, %opt );
    you can use
    $p = HTML::TokeParser->new( \$document, %opt );

    Which is what I'd use then.

Re: Using the contents of
by ikegami (Patriarch) on Sep 22, 2008 at 06:29 UTC

    Yes. To quote the docs:

    If the argument is a reference to a plain scalar, then this scalar is taken to be the literal document to parse.

    my $stream = HTML::TokeParser->new(\$source) || die "Couldn't read HTML file $source";
Re: Using the contents of
by Popcorn Dave (Abbot) on Sep 22, 2008 at 18:37 UTC
    Thanks for your replies!

    I'm just curious, though, whether there's a way to do that generically: when something calls for a file, could you hand it some other data stream you've already generated, rather than having to write the data to a temporary file?



      Don't use tie. Since 5.8.0, you can create a file handle that reads from/writes to a scalar.
      my $data = '...';
      open(my $fh, '<', \$data) or die;
      sub_that_needs_file_handle($fh);

      Before 5.8, IO::Scalar and IO::String provide the functionality (although not as well) by using tie.

      Neither tie nor this method produces a real (system) file handle, so they won't work where a real file handle is needed. For example, you can't use them to capture the output of a child process. (See IPC::Run for a solution in that particular case.)
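
To make the in-memory file handle idea concrete, here is a minimal sketch for Perl 5.8 or later. The count_lines sub and the sample strings are made up for illustration; they stand in for whatever routine of yours expects a file handle. It shows both directions: reading from a scalar and collecting output into one.

```perl
use strict;
use warnings;

# Hypothetical stand-in for any routine that expects a readable file handle.
sub count_lines {
    my ($fh) = @_;
    my $count = 0;
    while (defined(my $line = <$fh>)) {
        $count++;
    }
    return $count;
}

my $data = "first line\nsecond line\nthird line\n";

# Reading: pass a reference to the scalar where a file name would go.
open(my $in, '<', \$data) or die "Can't open in-memory handle: $!";
my $lines = count_lines($in);
close $in;

# Writing: whatever is printed to the handle accumulates in $buffer.
my $buffer = '';
open(my $out, '>', \$buffer) or die "Can't open in-memory handle: $!";
print {$out} "lines seen: $lines\n";
close $out;

print $buffer;
```

The same caveat applies as above: these handles exist only inside the Perl process, so code that needs a real system file descriptor will not accept them.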

      You could tie the filehandle to a scalar, implementing the READLINE, GETC, and READ methods appropriately. See Tying Filehandles. (I thought there must be a CPAN module to do this—sort of the converse of Tie::File—but a cursory search didn't find it.)

      UPDATE: IO::String and IO::Stringy both seem to be the sort of thing that you'd want here. The documentation suggests that they're only useful on older versions that don't support lexical filehandles, but it seems to me that lexical filehandles don't suffice for your situation.

      UPDATE 2: ikegami pointed out what I was missing (i.e., why you shouldn't tie after all). Thanks!
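
For reference, the tie approach described in this reply might look roughly like the sketch below. The ScalarReader package name is hypothetical, only READLINE (scalar context) and GETC are implemented, and READ is left out to keep it short. On 5.8 and later, opening a handle on a scalar reference is simpler and is the better choice; this is only meant to show what tying a handle involves.

```perl
use strict;
use warnings;

# Hypothetical package: a minimal read-only handle over a string.
package ScalarReader;

sub TIEHANDLE {
    my ($class, $string) = @_;
    return bless { data => $string, pos => 0 }, $class;
}

# Called for <FH>; this sketch supports scalar context only.
sub READLINE {
    my ($self) = @_;
    return if $self->{pos} >= length $self->{data};
    my $nl  = index($self->{data}, "\n", $self->{pos});
    my $end = $nl < 0 ? length $self->{data} : $nl + 1;
    my $line = substr($self->{data}, $self->{pos}, $end - $self->{pos});
    $self->{pos} = $end;
    return $line;
}

# Called for getc(FH); returns one character or undef at end of data.
sub GETC {
    my ($self) = @_;
    return if $self->{pos} >= length $self->{data};
    return substr($self->{data}, $self->{pos}++, 1);
}

package main;

tie *FH, 'ScalarReader', "alpha\nbeta\n";

my @lines;
while (defined(my $line = <FH>)) {
    push @lines, $line;
}
untie *FH;

print @lines;
```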
