http://www.perlmonks.org?node_id=200595

jmcnamara has asked for the wisdom of the Perl Monks concerning the following question:


In response to a CB request for a regex to match floating point numbers, I suggested the regex from perlfaq4*:

    /^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/

At that point jarich asked whether the look-ahead (?=...) should instead be a clustering (?:...).

In fact the regex seems to contain other unnecessary capturing as well. The following seems to be functionally equivalent but without the capturing:

    /^[+-]?(?:\d|\.\d)\d*(?:\.\d*)?(?:[Ee][+-]?\d+)?$/
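One way to probe the "functionally equivalent" claim is to run both regexes over the same inputs and list only the strings on which they disagree. The following is a minimal sketch, not part of the original test frame: the helper name disagreements and the two extra inputs (.1.2 and -.5.5) are my own additions, chosen so that a second dot follows a leading ".digit".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Both regexes are copied from the post above.
my $faq      = qr/^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/;
my $modified = qr/^[+-]?(?:\d|\.\d)\d*(?:\.\d*)?(?:[Ee][+-]?\d+)?$/;

# Hypothetical helper: return the inputs on which the two regexes
# give different answers.
sub disagreements {
    my @inputs = @_;
    return grep { ($_ =~ $faq ? 1 : 0) != ($_ =~ $modified ? 1 : 0) } @inputs;
}

# The first list repeats the inputs from the post; the second list is
# an assumption of mine, not taken from the original test data.
my @inputs = (
    qw( 0e0 0 +0 -0 1. 0.14 .14 1.24e5 24e5 -24e-5 2.3. 2.3.4 1..2 ),
    qw( .1.2 -.5.5 ),
);

print "differs: $_\n" for disagreements(@inputs);
```

On the inputs from the post the two regexes agree. On the two added inputs they do not: the look-ahead consumes nothing, whereas (?:\d|\.\d) consumes the leading ".digit" and leaves (?:\.\d*)? free to match a second fractional part, so the modified regex accepts .1.2 and -.5.5 while the perlfaq4 regex rejects them.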

Is there any reason for this capturing in a FAQ about matching? Should it be changed?
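To see what the capturing groups actually hold, here is a small sketch that matches one string against the perlfaq4 regex and prints each capture. The helper name float_parts and the variable names are mine, chosen for illustration; the FAQ does not name the groups.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# perlfaq4 regex as quoted in the post, with its four capturing groups.
my $faq = qr/^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/;

# Hypothetical helper: return the four captures for a string, or an
# empty list if it does not match. Groups that did not take part in
# the match are mapped to the empty string.
sub float_parts {
    my ($str) = @_;
    my @parts = $str =~ $faq or return;
    return map { defined $_ ? $_ : '' } @parts;
}

my ($sign, $fraction, $exp_part, $exponent) = float_parts('-1.24e5');
print "sign='$sign' fraction='$fraction' exp_part='$exp_part' exponent='$exponent'\n";
```

For -1.24e5 this prints sign='-' fraction='.24' exp_part='e5' exponent='5'. Note that the integer digits matched by \d* are not captured at all, so the groups do not by themselves split the number into all of its parts.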

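Also, the decimal part of the floating point regex is written differently from the decimal-number regex on the previous line of the same FAQ entry: the former uses a look-ahead followed by \d*(\.\d*)?, the latter an alternation, \d+(?:\.\d*)?|\.\d+. Is there any reason for this inconsistency?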
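The two styles can be compared directly by stripping the sign and exponent from each regex and running what remains over a few mantissa strings. This is only a sketch: the stripped-down patterns, the helper name mismatches, and the sample list are my own, and agreement on a handful of samples is not a proof of equivalence.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Mantissa as written in the perlfaq4 float regex (look-ahead style),
# with sign, exponent and capturing removed for this comparison.
my $lookahead   = qr/^(?=\d|\.\d)\d*(?:\.\d*)?$/;

# Mantissa as written in the decimal regex (alternation style).
my $alternation = qr/^(?:\d+(?:\.\d*)?|\.\d+)$/;

# Hypothetical helper: return the strings on which the two styles differ.
sub mismatches {
    return grep { ($_ =~ $lookahead ? 1 : 0) != ($_ =~ $alternation ? 1 : 0) } @_;
}

# Sample mantissas chosen for illustration, valid and invalid.
my @samples = ('1', '1.', '1.5', '.5', '.', '', '1.2.3', '.1.2');

my @diff = mismatches(@samples);
print @diff ? "differ on: @diff\n" : "no differences on these samples\n";
```

On these samples the two styles accept and reject the same strings, which suggests the difference is one of notation rather than behaviour.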

Here is the test frame I used:

    #!/usr/bin/perl -wl

    use strict;

    my @nums = qw(
        0e0 0 +0 -0 1. 0.14 .14
        1.24e5 24e5 -24e-5
        2.3. 2.3.4 1..2
    );

    for (@nums) {
        # Print only if the match fails

        # perlfaq4 regex
        print "1: ", $_ if ! /^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/;

        # perlfaq4 regex modified
        print "2: ", $_ if ! /^[+-]?(?:\d|\.\d)\d*(?:\.\d*)?(?:[Ee][+-]?\d+)?$/;

        # perlfaq4 decimal regex extended to match floating point
        print "3: ", $_ if ! /^[+-]?(?:\d+(?:\.\d*)?|\.\d+)([Ee][+-]?\d+)?$/;
    }

--
John.

* Regexp::Common was also suggested.