PerlMonks
Boolean operators in PERL regexp?
by lenrobert (Initiate) on Feb 24, 2005 at 18:34 UTC ( [id://434178]=perlquestion )
lenrobert has asked for the wisdom of the Perl Monks concerning the following question:

I am aware of OR ( | ), but is there a logical NOT in the PERL regex syntax?

The task is the following: to extract the relative links (i.e. the href attribute of the "a" element) from an HTML file, even if the value is not enclosed in quotation marks. This means I don't want to retrieve hyperlinks beginning with /, # or javascript:

I would express it as the following sequence, and capture (extract) the content of the second parenthesis:

    ( <a href=" OR <a href= ) THEN NOT( / OR # OR javascript: OR \s OR " ) THEN ( \s OR " )

The best regexp I could come up with is this, but it does not handle the cases of /, #, javascript: etc.

    /(<a href="|<a href=)([^"]*?)(\s|")/gi

Does anyone know the answer and can help me?

Thanks in advance,
Robert
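One possible way to express the NOT described above is a negative lookahead, written (?!...), which succeeds only when the enclosed pattern does not match at the current position. The following is a minimal sketch, not a confirmed answer from the thread. The sample HTML, the variable names ($html, $re, @links), the acceptance of single quotes, and the use of ">" as an extra terminator are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input; not taken from the original post.
my $html = <<'HTML';
<a href="page1.html">one</a>
<a href=/absolute/path>two</a>
<a href="#top">three</a>
<a href="javascript:void(0)">four</a>
<A HREF=docs/page2.html>five</A>
<a href='img/pic.png'>six</a>
HTML

# /x allows whitespace and comments inside the pattern; /i ignores case.
my $re = qr{
    <a \s+ href \s* = \s*        # start of the anchor tag up to the equals sign
    ["']?                        # optional opening quote
    (?! / | \# | javascript: )   # negative lookahead: the NOT part
    ( [^"'\s>]+ )                # capture up to a quote, whitespace, or tag end
}xi;

# In list context, a /g match returns every captured link.
my @links = $html =~ /$re/g;
print "$_\n" for @links;
```

With the sample input above, this should print page1.html, docs/page2.html and img/pic.png, one per line. The sketch only handles href as the first attribute of the tag, and it does not exclude absolute URLs such as http://...; further alternatives could be added inside the lookahead for those. For real-world HTML, a parser module such as HTML::LinkExtor is likely to be more robust than any single regex.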