PerlMonks  

Re: Regex hackery

by markkawika (Monk)
on Jun 12, 2009 at 18:33 UTC ( [id://771051] )


in reply to Regex hackery

That regex isn't too bad. Your assumption about : not being allowed in a filename is completely wrong. About the only character not allowed in a filename is a directory separator, such as / on Unix.

But apart from that, I have a few minor suggestions on your regex.

  1. Inside a character class [] a period does not need to be escaped.
  2. You should use /x. It makes your regex easier to read.
  3. \d is usually preferred to [0-9]. It is shorter and more idiomatic, though on Unicode strings it can also match non-ASCII digits.
  4. You have an unnecessary set of parens in your regex: (?: ).
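
For anyone who wants to check points 1 and 3, here is a minimal sketch. The sample strings are made up for illustration and are not taken from the original question.

```perl
use strict;
use warnings;

# Point 1: inside a character class '.' is already literal,
# so [^.] and [^\.] capture the same text.
my $line = "12. some text";    # hypothetical sample input
my ($plain)   = $line =~ /^([^.]+)/;
my ($escaped) = $line =~ /^([^\.]+)/;
print "plain='$plain' escaped='$escaped'\n";    # plain='12' escaped='12'

# Point 3: on plain ASCII input, \d and [0-9] capture the same digits.
my ($d)     = "size 345" =~ /(\d+)/;
my ($range) = "size 345" =~ /([0-9]+)/;
print "d='$d' range='$range'\n";                # d='345' range='345'
```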

Rewritten, it would read:

    /
        ^ [^.]+ \. \s       # Ignore the line numbers
        (.+?)               # Capture the file name
        \s                  # Separator between the file name and the size
        (?:
            (\d) \s         # Capture the optional leading size digit
        ) ?
        (
            \d+ \. \d {2}   # Capture the rest of the size
        )
        \s MB .* $
    /x
After this, your file name is $1, and your size is, as you stated, $2$3.
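
Here is a rough sketch of how the captures might be used. The helper name parse_line and the two sample lines are my own inventions, and I am assuming the input looks like "N. name size MB". The \s after the file-name capture is there so that $1 does not pick up a trailing space. Because the leading digit is optional, $2 can be undef, so it is checked before being joined to $3.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (file name, size in MB), or nothing on no match.
sub parse_line {
    my ($line) = @_;
    return unless $line =~ /
        ^ [^.]+ \. \s       # Ignore the line number
        (.+?)               # Capture the file name
        \s                  # Separator before the size (assumed)
        (?: (\d) \s )?      # Optional leading size digit
        ( \d+ \. \d{2} )    # Rest of the size
        \s MB .* $
    /x;
    # The second capture is undef when the optional group did not match.
    my $size = (defined $2 ? $2 : '') . $3;
    return ($1, $size);
}

# Made-up sample lines, one without and one with a leading size digit.
for my $line ("1. report.pdf 12.34 MB", "2. backup.tar 1 024.50 MB") {
    my ($name, $size) = parse_line($line);
    print "$name => $size\n";
}
# prints:
#   report.pdf => 12.34
#   backup.tar => 1024.50
```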

And yes, there is an ambiguity. If the line were:

1. go 2 123.45 MB
The regex would parse "go" as the file name and "2123.45" as the file size. There's no way around this given the format of the input.
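
To see the ambiguity in action, here is a small sketch. The helper split_line and the second sample line are hypothetical, and the pattern includes a \s between the file name and the size. Because the file-name capture is non-greedy, a name that ends in a space and a single digit loses that digit to the size.

```perl
use strict;
use warnings;

my $re = qr/
    ^ [^.]+ \. \s
    (.+?) \s
    (?: (\d) \s )?
    ( \d+ \. \d{2} )
    \s MB .* $
/x;

# Hypothetical helper: returns (name, size) as the regex sees them.
sub split_line {
    my ($line) = @_;
    return unless $line =~ $re;
    return ($1, (defined $2 ? $2 : '') . $3);
}

# A human could read the first line as file "go 2" of 123.45 MB,
# and the second as file "track 7" of 12.34 MB. The regex cannot.
for my $line ("1. go 2 123.45 MB", "2. track 7 12.34 MB") {
    my ($name, $size) = split_line($line);
    print "name='$name' size='$size'\n";
}
# prints:
#   name='go' size='2123.45'
#   name='track' size='712.34'
```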

Re^2: Regex hackery
by ikegami (Patriarch) on Jun 12, 2009 at 18:46 UTC

    Your assumption about : not being allowed in a filename is completely wrong. About the only character not allowed in a filename is a directory separator

    You're wrong about the OP being completely wrong. In Windows, the colon is the device indicator. On old Macs, the colon is the directory separator.
