I am using the following code to get the start and end positions of HTML tags. The problem is that HTML::TagReader requires a file as its argument, but I have the HTML as a string in a variable. I don't want to create a file and then delete it just to use this module. Can anyone suggest a better solution where I can use a string instead of a file?
Note: the problem is that HTML::TagReader does not accept a string argument. I am only using this module to get the positions of HTML tags. Is there a better option?
use HTML::TagReader;
# Instead of this file, I want to do the same thing with the HTML held as
# a string in a variable, e.g. $html_string = 'content of test2.html'
my $filename = 'test2.html';
my $p = HTML::TagReader->new($filename);
open(my $fh, '<', $filename) or die "Could not open file '$filename' $!";
# Cumulative character count at the end of each line, keyed by line number
my %line_chars;
my $line_number = 1;
while (my $row = <$fh>) {
    if ($line_number > 1) {
        $line_chars{$line_number} = $line_chars{$line_number - 1} + length($row);
    }
    else {
        $line_chars{$line_number} = length($row);
    }
    $line_number++;
}
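my @atags;        # stack of start positions of currently open <a> tags
my %atagrange;    # start position of each <a> => position of its closing </a>
my $showerr = 0;  # passed through to getbytoken; false here
while (my ($tagOrText, $tagtype, $linenumber, $column) = $p->getbytoken($showerr)) {
    # Characters on all earlier lines plus the column on this line;
    # line 1 has no earlier lines, so its base is 0.
    my $position = ($linenumber > 1 ? $line_chars{$linenumber - 1} : 0) + $column;
    #print "\ntagOrText:" . $tagOrText . "\ntagtype : " . $tagtype . "\nline number :" . $linenumber . "\ncolumn : " . $column . "\nposition : " . $position . "\n";
    if ($tagtype eq "a") {
        push(@atags, $position);
    }
    elsif ($tagtype eq '/a') {
        # Pair this closing tag with the most recent unmatched <a>, if any
        my $a_start_tag_pos = pop(@atags);
        $atagrange{$a_start_tag_pos} = $position if defined $a_start_tag_pos;
    }
}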
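The line-offset half of the code above does not need a file on disk, since core Perl can open a filehandle on a scalar reference. The following is only a minimal sketch of that half: `$html_string` holds made-up sample markup, and `line_offsets` and `position_of` are hypothetical helper names, not part of any module. It does not solve the HTML::TagReader side, whose constructor takes a file name as described above, and whether the module's column is 0- or 1-based should be checked against its documentation.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the contents of test2.html.
my $html_string = "<p>\n<a href=\"x\">link</a>\n</p>\n";

# Build the cumulative per-line character counts from a string.
# Opening a filehandle on a scalar reference is core Perl (5.8 and later).
sub line_offsets {
    my ($text) = @_;
    open(my $fh, '<', \$text) or die "Could not open in-memory handle: $!";
    my %line_chars;
    my $total       = 0;
    my $line_number = 1;
    while (my $row = <$fh>) {
        $total += length($row);
        $line_chars{$line_number++} = $total;
    }
    close($fh);
    return \%line_chars;
}

# Convert a (line, column) pair into an offset, using the same arithmetic
# as the original loop: characters on earlier lines plus the column.
sub position_of {
    my ($line_chars, $linenumber, $column) = @_;
    my $before = $linenumber > 1 ? $line_chars->{ $linenumber - 1 } : 0;
    return $before + $column;
}

my $offsets = line_offsets($html_string);
print position_of($offsets, 2, 0), "\n";    # line 2, column 0 of the sample
```

With the sample string, line 1 is 4 characters long including its newline, so line 2, column 0 maps to offset 4, which is where `<a` begins in `$html_string`.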
Thanks in advance.