PerlMonks  

Mojo Dom extract

by ribo75017 (Initiate)
on May 27, 2015 at 08:55 UTC ( [id://1127948]=perlquestion )

ribo75017 has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I want to extract the following with Mojo::DOM, but I have not been able to do it:

- the text from the date, title, cat and lieu classes
- the anchors

Is it possible to get the data like this: for each anchor inside the "ret" div, get (anchor, date elements, title, cat and lieu)?

Thanks a lot

use Mojo::DOM;

my $dom = Mojo::DOM->new(<<'HTML');
<div class="ret">
  <a href="blabla.com/1234" title="Text 1">
    <div class="rtm">
      <div class="date">
        <div>22</div>
        <div>mai</div>
        <div>19:52</div>
      </div>
      <div class="image">
        <div class="imageclass-and-nb"><img src="blabla.com/img1234" alt="Text 1"></div>
      </div>
      <div class="all">
        <h2 class="title">Text 1 Title</h2>
        <div class="cat">Blue</div>
        <div class="lieu">Dourdan</div>
      </div>
    </div>
  </a>
  <a href="blabla.com/1212" title="Text 2">
    <div class="rtm">
      <div class="date">
        <div>22</div>
        <div>mai</div>
        <div>11:55</div>
      </div>
      <div class="image">
        <div class="imageclass"><img src="blabla.com/img1212" alt="Text 2"></div>
      </div>
      <div class="detail">
        <h2 class="title">Text 2 title</h2>
        <div class="cat">Blue</div>
        <div class="lieu">Champigny-sur-Marne</div>
      </div>
    </div>
  </a>
</div>
HTML

print $dom->find('div.date')->map(sub{$_->children->each})->map(sub{$_->text})->each;

Replies are listed 'Best First'.
Re: Mojo Dom extract
by Anonymous Monk on May 27, 2015 at 09:05 UTC

    I wanna extract.... Is it possible...thanks

    Yes, it's possible. You already have a start, so why don't you simply do it?

      Because I don't know how to do it efficiently, i.e. without 10 lines like:

      $dom->find('div.date')->map(sub{$_->children->each})->map(sub{$_->text})->each

      $dom->find('div.lieu')->map(sub{$_->children->each})->map(sub{$_->text})->each ... etc.

      I would like something like:

      Dom->ret->a => anchor

      Dom->ret->rtm->date => date fields ... etc.

        Because I don't know how to do it efficiently, i.e. without 10 lines like:

        First do it any way that works, then reduce it later :)

        Anyway,

        for my $ret ( $dom->find('div.ret')->each ){
            for my $aaaa ( $ret->find('a')->each ){
                my $date = $aaaa->find('div.date')->first->all_text;
                my $lieu = $aaaa->find('div.lieu')->first->all_text;
                my $cat  = $aaaa->find('div.cat')->first->all_text;
                my $titl = $aaaa->find('h2.title')->first->all_text;
                print join( "\t#\t", $date, $lieu, $cat, $titl), "\n";
            }
        }
        __END__
        22 mai 19:52 # Dourdan # Blue # Text 1 Title
        22 mai 11:55 # Champigny-sur-Marne # Blue # Text 2 title
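
The loop above prints the text fields but not the anchors themselves, which the question also asked for. One possible variation, as a rough sketch only: collect one hash per anchor, read the link from the href attribute, and keep the three date parts separate. The helper text_of and the @records array are hypothetical names introduced for illustration, and the markup is a trimmed copy of the question's HTML (the image blocks are left out). The helper collapses whitespace itself because, as far as I know, recent Mojolicious releases no longer trim whitespace in text and all_text.

```perl
use strict;
use warnings;
use Mojo::DOM;

# Trimmed copy of the markup from the question (image blocks omitted).
my $dom = Mojo::DOM->new(q{
<div class="ret">
  <a href="blabla.com/1234" title="Text 1">
    <div class="rtm">
      <div class="date"><div>22</div><div>mai</div><div>19:52</div></div>
      <div class="all">
        <h2 class="title">Text 1 Title</h2>
        <div class="cat">Blue</div>
        <div class="lieu">Dourdan</div>
      </div>
    </div>
  </a>
  <a href="blabla.com/1212" title="Text 2">
    <div class="rtm">
      <div class="date"><div>22</div><div>mai</div><div>11:55</div></div>
      <div class="detail">
        <h2 class="title">Text 2 title</h2>
        <div class="cat">Blue</div>
        <div class="lieu">Champigny-sur-Marne</div>
      </div>
    </div>
  </a>
</div>
});

# Hypothetical helper: text of the first element matching $selector below
# $node, with whitespace collapsed, or '' when nothing matches.
sub text_of {
    my ($node, $selector) = @_;
    my $el = $node->at($selector) or return '';
    my $text = $el->all_text;
    $text =~ s/\s+/ /g;
    $text =~ s/^ //;
    $text =~ s/ $//;
    return $text;
}

# One record per anchor inside div.ret.
my @records;
for my $anchor ($dom->find('div.ret a')->each) {
    my $date_el = $anchor->at('div.date');
    my @date = $date_el
        ? $date_el->children('div')->map(sub { $_->text })->each
        : ();
    push @records, {
        href  => $anchor->attr('href'),
        date  => \@date,
        title => text_of($anchor, 'h2.title'),
        cat   => text_of($anchor, 'div.cat'),
        lieu  => text_of($anchor, 'div.lieu'),
    };
}

# Print one tab-separated line per record.
for my $r (@records) {
    print join("\t", $r->{href}, "@{ $r->{date} }", @$r{qw(title cat lieu)}), "\n";
}
```

Using at() instead of find(...)->first does the same lookup in one call, and checking its result avoids calling a method on undef when an anchor lacks one of the classes.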
