in reply to Re^2: SEO Fixer Part II - Updated
in thread SEO Fixer Part II - Updated
I'll try to explain about Ebola, and I'll try to be clear.
In Re^3: Trying to Insert Alt Tags Programmatically you posted this (I've run it through perltidy):

    my (
        $tree, $title, $titleastext, $newtitle,
        $newtitleh1, $newtitleastexth1, $newtitleastexth1clipped,
        $newtitleh2, $newtitleastexth2, $newtitleastexth2clipped,
        $newtitleh3, $newtitleastexth3, $newtitleastexth3clipped,
        $newtitleh4, $newtitleastexth4, $newtitleastexth4clipped,
        $newtitlep,  $newtitleastextp,  $newtitleastextpclipped,
        $summary, $var, $newmetadescription, $newmetakeywords
    );
    $tree = HTML::Tree->new();
    $tree->parse($html);
    $title       = $tree->look_down( '_tag', 'title' );
    $titleastext = $title->as_text;
    use HTML::Element;

    if ($titleastext) { print "\nTitle: $titleastext\n\n"; }
    else {
        $newtitle   = HTML::Element->new('title');
        $newtitle   = $newtitleh1;
        $newtitleh1 = $tree->look_down( '_tag', 'h1' );
        if ($newtitleh1) { $newtitleastexth1 = $newtitleh1->as_text; }
    }
    if ($newtitleastexth1) {
        $newtitleastexth1clipped = substr( $newtitleastexth1, 0, 65 );
        $html->push_content($newtitleastexth1clipped);
        print
    "\n$url does not have a title. We created one from\n the first 66 characters your first headline tag \<h1\>:\n $newtitleastexth1clipped.\n Please change if desired.\n\n";
    }
You should turn it into a subroutine, which I did, and I called it Ebola (surely you noticed the repeating substr/push_content pattern):

    $newtitleh1 ... $newtitleastexth1 ... $newtitleastexth1clipped ... $html->push_content ...
    ...
    $newtitleh2 ... $newtitleastexth2 ... $newtitleastexth2clipped ... $html->push_content ...
To learn to Ebola you would write a program like this (Ebola.pl):

    #!/usr/bin/perl --
    # Ebola.pl
    use strict;
    use warnings;
    use HTML::Tree;

    Main( @ARGV );
    exit( 0 );

    sub Main {
        my $t = HTML::TreeBuilder->new_from_content(join'','<h4>',0..9,'<h4>');
        my $f = HTML::TreeBuilder->new_from_content('<title>f</title>');
        my $B = $f->look_down( qw' _tag body ' );
        print $t->as_HTML, "\n\n";
        print '-'x33, "\n\n";
        if ( Ebola( $B, 5, eval{ $t->look_down(qw'_tag h1')->as_text } ) ) {
            print "Using h1\n";
        }
        elsif ( Ebola( $B, 5, eval{ $t->look_down(qw'_tag h2')->as_text } ) ) {
            print "Using h2\n";
        }
        elsif ( Ebola( $B, 5, eval{ $t->look_down(qw'_tag h3')->as_text } ) ) {
            print "Using h3\n";
        }
        elsif ( Ebola( $B, 5, eval{ $t->look_down(qw'_tag h4')->as_text } ) ) {
            print "Using h4\n\n";
        }
        print '-' x 33, "\n\n";
        print $f->as_HTML, "\n\n";
    } ## end sub Main

    sub Ebola {
        my ( $html, $clip, $text ) = @_;
        if ( defined $text and length $text ) {
            $text = substr $text, 0, $clip;
            $html->push_content($text);
        }
    } ## end sub Ebola
    __END__

And the output of this program is:

    <html><head></head><body><h4>0123456789</h4><h4></h4></body></html>

    ---------------------------------

    Using h4

    ---------------------------------

    <html><head><title>f</title></head><body>01234</body></html>

This is how you learn to Ebola: by writing a small program (Ebola.pl) to explore how HTML::TreeBuilder/HTML::Element objects behave.
For the next step in the process, you write a program called Ebola.t:

    sub Main {
        AsdfQwerty( '<h1>0123456789</h1>', '01234' );
        AsdfQwerty( '<h1>0123456789</h2>', '01234' );
        AsdfQwerty( '<h3>0123456789</h3>', '01234' );
        AsdfQwerty( '<h4>0123456789</h4>', '01234' );
        AsdfQwerty( '<h5>0123456789</h5>', '01234' );
        AsdfQwerty( '<p>0123456789</p>', '01234' );
    }

Here AsdfQwerty(), like Ebola.pl, puts Ebola() through its paces.
That is, AsdfQwerty() tests Ebola() by feeding it various inputs and checking that the output it produces meets your expectations.
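I never showed the body of AsdfQwerty(), so here is one way it might look. This is only a sketch under assumptions: the helper picks the first heading or paragraph it finds (that selection rule is my guess, not something from your code), clips it to 5 characters with the same Ebola() as in Ebola.pl, and prints ok/not ok.

```perl
#!/usr/bin/perl --
# Ebola.t (sketch; the body of AsdfQwerty is an assumption)
use strict;
use warnings;
use HTML::TreeBuilder;

Main(@ARGV);

sub Main {
    AsdfQwerty( '<h1>0123456789</h1>', '01234' );
    AsdfQwerty( '<h4>0123456789</h4>', '01234' );
    AsdfQwerty( '<p>0123456789</p>',   '01234' );
}

# Ebola() as in Ebola.pl
sub Ebola {
    my ( $html, $clip, $text ) = @_;
    if ( defined $text and length $text ) {
        $text = substr $text, 0, $clip;
        $html->push_content($text);
    }
}

# Hypothetical helper: returns 1 when the clipped text equals $expected
sub AsdfQwerty {
    my ( $input, $expected ) = @_;
    my $t = HTML::TreeBuilder->new_from_content($input);
    my $f = HTML::TreeBuilder->new_from_content('<title>f</title>');
    my $B = $f->look_down( qw' _tag body ' );

    # assumption: use the first heading or paragraph in document order
    my $first = $t->look_down( sub { $_[0]->tag =~ /^(?:h[1-6]|p)$/ } );
    Ebola( $B, 5, $first ? $first->as_text : undef );

    my $got = $B->as_text;
    my $ok  = $got eq $expected ? 1 : 0;
    print $ok ? "ok" : "not ok", " - $input => '$got'\n";
    $_->delete for $t, $f;    # free both trees
    return $ok;
}
```

Each call is one input and one expectation, so when something prints "not ok" you know exactly which case to look at.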
Problems are easier to spot and fix in very small programs.
Forget about your big task (SEO Fixer) for the moment. Write 5 or 10 of these little programs, each dealing with a single small task, a single function (like Ebola).
But don't call it Ebola. Ebola isn't descriptive; it doesn't explain or even hint at what the subroutine is supposed to accomplish, demonstrate, or teach you.
Do you follow what I'm saying?
The reason I chose names like Ebola is because you're supposed to change them.
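For example, a renamed Ebola() could look something like the sketch below. The names append_clipped_text, $element, and $max_length are just my suggestions, and FakeElement is a made-up stand-in for an HTML::Element so the snippet runs without any CPAN modules; in your program you would pass a real element instead.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an HTML::Element: it only records pushed text
package FakeElement;

sub new {
    my $class = shift;
    return bless { content => [] }, $class;
}

sub push_content {
    my $self = shift;
    push @{ $self->{content} }, @_;
    return $self;
}

sub as_text { return join '', @{ $_[0]{content} } }

package main;

# Same idea as Ebola(), but the name and arguments say what happens.
# Returns 1 if text was appended, 0 if there was nothing to append.
sub append_clipped_text {
    my ( $element, $max_length, $text ) = @_;
    return 0 unless defined $text and length $text;
    $element->push_content( substr( $text, 0, $max_length ) );
    return 1;
}

my $body = FakeElement->new;
append_clipped_text( $body, 5, '0123456789' );
print $body->as_text, "\n";    # prints 01234
```

With a name like that, the calling code reads almost like a sentence, and you no longer need a comment to remember what the subroutine does.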
Learning to program is hard. There is a lot of information you have to juggle inside your head, and it only sticks if you write and rewrite the code yourself.
You still have part of my demo program in the post I'm replying to:

    local $\ = $/;
    print $_->as_HTML
      for $tree->look_down( '_tag', 'img', sub { not defined $_[0]->attr('alt') } );
    print '---';
    print $_->as_HTML
      for $tree->look_down( qw' _tag img ', sub { not length $_[0]->attr('alt') } );
    print '---';
    $_->attr( alt => MAlt($_) )
      for $tree->look_down( qw' _tag img ', sub { not length $_[0]->attr('alt') } );
    print $_->as_HTML for $tree->look_down(qw' _tag img ');
    print $tree->as_HTML;
    $tree = $tree->delete;

This code was supposed to demonstrate/teach how look_down works and how to modify any tags you might find. It wasn't meant for direct inclusion in your program. After all, I'm not writing your program :)
It's like learning to juggle:
- First you start with one ball
- and then practice with two balls
- and two in one hand
- then three balls.
- Next you start with one bowling pin
- and then two bowling pins
- and then two bowling pins in one hand
- and then three bowling pins
- Next you start with one ball and one bowling pin
- and then one ball and one bowling pin in one hand
- and then two balls and one bowling pin
- Next you start with two bowling pins and one ball
- And then you start with one knife
- and then two knives
- then two knives in one hand
- and then three knives
- Again you start with one ball and one knife
- then one ball and one knife in one hand
- then one knife and two balls
- Eventually you work your way up to three chainsaws, two bowling pins, two knives, and four tennis balls
I know this must seem overwhelming (hey, I'm typing my fingers bloody here :D), so here is the last surprise: AsdfQwerty is better written using Test::More, like http://cpansearch.perl.org/src/PETEK/HTML-Tree-3.23/t/attributes.t

    #!/usr/bin/perl

    # HTML::TreeBuilder invokes HTML::Entities::decode on the contents of
    # HREF attributes. Some CGI-based sites use lang=en or such for
    # internationalization. When this parameter is after an ampersand,
    # the resulting &lang is decoded, breaking the link. "sub" is another
    # popular one.

    # Test provided by Rocco Caputo

    use warnings;
    use strict;
    use Test::More tests => 1;

    use HTML::TreeBuilder;

    my $tb = HTML::TreeBuilder->new();
    $tb->parse(
        "<a href='http://wherever/moo.cgi?xyz=123&lang=en'>Test</a>"
    );

    my @links = $tb->look_down( sub { $_[0]->tag eq "a" } );
    my $href = $links[0]->attr("href");

    ok($href =~ /lang/, "href should contain 'lang' (is: $href)");
    exit;

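Applied to Ebola, the same style might look roughly like this. Treat it as a sketch: clipped_text() is a hypothetical helper of mine (it assumes you want the first heading or paragraph, clipped to 5 characters), and only is() and the test plan come from Test::More itself.

```perl
#!/usr/bin/perl --
# Ebola.t rewritten with Test::More (sketch)
use strict;
use warnings;
use Test::More tests => 3;
use HTML::TreeBuilder;

# Ebola() as in Ebola.pl
sub Ebola {
    my ( $html, $clip, $text ) = @_;
    if ( defined $text and length $text ) {
        $text = substr $text, 0, $clip;
        $html->push_content($text);
    }
}

# Hypothetical helper: parse $input, clip the first heading or paragraph
# to 5 characters, push it into an empty <body>, and return that body text
sub clipped_text {
    my ($input) = @_;
    my $t = HTML::TreeBuilder->new_from_content($input);
    my $f = HTML::TreeBuilder->new_from_content('<title>f</title>');
    my $B = $f->look_down( qw' _tag body ' );
    my $first = $t->look_down( sub { $_[0]->tag =~ /^(?:h[1-6]|p)$/ } );
    Ebola( $B, 5, $first ? $first->as_text : undef );
    my $got = $B->as_text;
    $_->delete for $t, $f;    # free both trees
    return $got;
}

is( clipped_text('<h1>0123456789</h1>'), '01234', 'h1 is clipped to 5 characters' );
is( clipped_text('<p>0123456789</p>'),   '01234', 'p is clipped to 5 characters' );
is( clipped_text('<title>only a title</title>'),
    '', 'nothing is added when there is no heading or paragraph' );
```

The win over hand-rolled print statements is that is() reports both the value it got and the value it expected when a test fails, and the plan (tests => 3) catches a script that dies halfway through.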
Good luck.