Hi,
You can of course use HTML::Parser to do the work for you. That is the best way.
But if you really want your own regexp, I suggest using two passes, just to maintain readability. I am sure a real regexp expert can do it in one line, but here is my attempt.
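my $x = "<html>/something/more words<br/></html>";
# First pass: capture the outer tag name and everything inside it.
if ($x =~ m|<(\w+)>(.*)</\1>|) {
    my ($tag, $inner) = ($1, $2);
    print "first is $tag\n"; # You can put the <> around it here
    # Second pass: split the inner text into /something/ and the rest.
    if ($inner =~ m|(/\w+/)(.*)<br/>|) {
        print "second is $1\n";
        print "third is $2\n";
    }
}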
It is up to you to store them in arrays, but at least it gives you a hint.
This is really quick and dirty...
Update: It all depends on how much flexibility you want anyway; you can easily play around with the separators, etc.
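As a rough sketch of the "store them in arrays" step, assuming the input arrives one line at a time: the helper name split_line and the sample lines are made up for illustration, and the two regexps are the same two passes as above. Lines that do not match are simply skipped.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (tag, first, rest) for one line,
# or an empty list when either pass fails to match.
sub split_line {
    my ($line) = @_;
    # First pass: outer tag name and the text between the tags.
    my ($tag, $inner) = $line =~ m|<(\w+)>(.*)</\1>| or return;
    # Second pass: the /word/ part and whatever precedes <br/>.
    my ($first, $rest) = $inner =~ m|(/\w+/)(.*)<br/>| or return;
    return ($tag, $first, $rest);
}

my @rows;    # one array reference per line that matched
for my $line ('<html>/something/more words<br/></html>', 'not markup') {
    my @pieces = split_line($line) or next;
    push @rows, \@pieces;
}

print join(' | ', @$_), "\n" for @rows;
```

Each element of @rows is a reference to a three-element array, so $rows[0][1] would be the /something/ part of the first matching line.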
---------------------------
Dr. Mark Ceulemans
Senior Consultant
IT Masters, Belgium
my $str ='<html>/something/more words<br/></html>';
my @bits = $str =~ m!(<html>)/([^/]+)/([^/]+)<br/>(</html>)!;
print $_,$/ for @bits;
Gives
<html>
something
more words
</html>
Of course, if what you've asked for is a simplification of your real requirements, then there are probably much better ways of doing what you really want, but you'd need to tell us what that is.
Nah! You're thinking of Simon Templar, originally played by Roger Moore and later by Ian Ogilvy
Could you be more specific, please?
You can do this:
$_=<DATA>;
chomp;
($htmlini, $something, $more, $htmlend)=
m|(<html>)(/.*?/)([^<]*).*(</html>)|i;
print join "\n", ($htmlini, $something, $more, $htmlend);
__DATA__
<html>/something/more words<br/></html>
But I'm afraid we need some more information about the text you need to match.
You can also read the Perl documentation on regular expressions:
perldoc perlre
perldoc perlrequick
Hope this helps
perl -le '$_=$,=q,\,@4O,,s,^$,$\,,s,s,^,b9,s,$_^=q,$\^-]!,,print'