http://www.perlmonks.org?node_id=928571

Benedict White has asked for the wisdom of the Perl Monks concerning the following question:

For reasons too long to explain, our web servers have changed PHP version, which means we now need to use Perl-compatible regular expressions rather than POSIX ones. We have had major problems that have been solved, but we now need help with one regex. The original POSIX regex (used with the PHP function eregi, which is case-insensitive) was:
\\[table\\]([^\\[]+)\\[/table\\]
This is the input text:
[b]South Coast[/b] <br /> [table] <br /> Boat Link Asking Price Lying<br /> Heritage [url=http://www.someurl.co.uk/home/FS_heritage] Detail on Heritage [/url] Warsash<br /> Firestorm Too [url=http://www.someurl.co.uk/home/FirestormToo] Detail on Firestorm Too [/url] Plymouth<br /> [/table]<br />
I need it to match on the [table] and the [/table] and extract the data in between, which is then passed to another function that constructs the table.

Replies are listed 'Best First'.
Re: Regular expression question
by Neighbour (Friar) on Sep 29, 2011 at 13:25 UTC
    But your current regex only matches everything between [table] and [/table] if that 'everything' does not contain a '[', which, in your example, it does.
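To illustrate the point above, here is a minimal sketch. The two sample strings are shortened, hypothetical stand-ins for the poster's input, not the real data; the pattern is the one from the question, written with m{} delimiters and /i to mimic eregi.

```perl
use strict;
use warnings;

# Shortened, hypothetical sample in the shape of the poster's input.
my $input = '[table] Heritage [url=http://www.someurl.co.uk/home/FS_heritage] Detail [/url] Warsash [/table]';

# The original pattern: [^\[]+ cannot cross the '[' that opens [url=...],
# so the closing [/table] is never reached.
my $matched_original = $input =~ m{\[table\]([^\[]+)\[/table\]}i ? 1 : 0;

# The same pattern succeeds when the content holds no '[' at all.
my $plain = '[table] Heritage Warsash [/table]';
my ($plain_data) = $plain =~ m{\[table\]([^\[]+)\[/table\]}i;

print "with nested tag: ", ($matched_original ? "match" : "no match"), "\n";
print "without nested tag: '$plain_data'\n";
```

The first string fails only because of the [url=...] tag inside the table, which is exactly the case in the posted input.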
Re: Regular expression question
by pvaldes (Chaplain) on Sep 29, 2011 at 13:57 UTC
    if(/\[table\](.*?)\[\/table\]/) {print $1}
      my ($table_data) = $input =~ /\[table\](.*?)\[\/table\]/s;
      You need the "s"-modifier as the table spans several lines.
        Many thanks for that, the /s turned out to be as important as the (.*?). It works well now.
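For anyone reading later, a small sketch of why the /s matters. The sample text is hypothetical and assumes the table content contains real newlines, as the reply above suggests; the [url=x] and [url=y] values are placeholders.

```perl
use strict;
use warnings;

# Hypothetical two-row sample with real newlines between the rows.
my $input = "[b]South Coast[/b]\n[table]\nHeritage [url=x]Detail[/url] Warsash\nFirestorm Too [url=y]Detail[/url] Plymouth\n[/table]\n";

# Without /s, '.' does not match a newline, so the match cannot reach [/table].
my ($without_s) = $input =~ /\[table\](.*?)\[\/table\]/;

# With /s, '.' also matches newlines; the lazy .*? stops at the first [/table].
my ($with_s) = $input =~ /\[table\](.*?)\[\/table\]/s;

print defined $without_s ? "without /s: matched\n" : "without /s: no match\n";
print "with /s:\n$with_s\n";
```

In list context a failed match returns an empty list, so $without_s stays undefined, while $with_s holds everything between the two tags, newlines included.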
Re: Regular expression question
by JavaFan (Canon) on Sep 30, 2011 at 09:14 UTC
    I don't like .*?.
    if (m{\[table\]                            # Match [table]
          ([^[]* (?:\[(?!/table\]) [^[]*)*)    # Match a string not containing [/table]
          \[/table\]                           # Match [/table]
         }x) {print $1}
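As a rough check of the pattern above, this sketch runs it next to the .*? version on a hypothetical input holding two tables. The variable names and sample text are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical input with two tables, to show that matching stops at the first [/table].
my $input = "[table]\nHeritage [url=x]Detail[/url] Warsash\n[/table]\n[table]\nsecond\n[/table]\n";

# The explicit pattern: runs of non-'[' characters, plus any '['
# that is not the start of [/table].
my $unrolled = qr{
    \[table\]                                # Match [table]
    ( [^[]* (?: \[ (?!/table\]) [^[]* )* )   # Content that never contains [/table]
    \[/table\]                               # Match [/table]
}x;

my ($data) = $input =~ $unrolled;
my ($lazy) = $input =~ /\[table\](.*?)\[\/table\]/s;

print "same capture\n" if $data eq $lazy;
print $data;
```

On this input both patterns capture the same text. The explicit version needs no /s, because a negated character class such as [^[] already matches newlines.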