PerlMonks  

Re: Regex To Remove File Extension

by jethro (Monsignor)
on Dec 10, 2008 at 19:07 UTC


in reply to Regex To Remove File Extension

There are a lot of ways to skin this cat:

s/\..*+$//; s/\-[^\.]*$//;

Both of these use $ to anchor the regex to the end of the string. The first uses .*+, the non-greedy version of .*; the second uses a character class of all characters except '.' to match only the last suffix.

Another possibility is the Perl module File::Basename. This is probably the best way, because you don't need to worry about getting it right; someone else already did that.

UPDATE: kennethk is right, the first version doesn't work. Obviously the regex engine never matches from right to left, even when anchored to the right.

UPDATE2: Seems to be not my day. Three errors in two lines is quite depressing.


Re^2: Regex To Remove File Extension
by kennethk (Monsignor) on Dec 10, 2008 at 19:17 UTC
    Both of these fail. .*? is the non-greedy version, not .*+, so s/\..*+$// fails to compile (on perls before 5.10; 5.10 and later parse .*+ as a possessive quantifier), and even with .*? it still doesn't work right, because it matches from the first period. Your second expression has a typo (- in place of .), so it should read s/\.[^\.]*$//, as per dreadpiratepeter's post.
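A minimal sketch of the difference between the patterns discussed above, wrapped in small subs so the results can be compared side by side. The sub names and the sample file names are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Non-greedy .*? : the match still begins at the FIRST period, so
# everything from there to the end of the string is removed.
sub strip_from_first_dot {
    my ($name) = @_;
    $name =~ s/\..*?$//;
    return $name;
}

# Corrected second pattern: a period followed only by non-period
# characters up to the end of the string, i.e. just the last suffix.
sub strip_last_suffix {
    my ($name) = @_;
    $name =~ s/\.[^.]*$//;
    return $name;
}

# Hypothetical sample names, for illustration only.
for my $name ('report.txt', 'archive.tar.gz', 'README') {
    printf "%-16s %-10s %s\n",
        $name, strip_from_first_dot($name), strip_last_suffix($name);
}
```

For archive.tar.gz the first sub returns archive and the second returns archive.tar. A name with no period is returned unchanged by both.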
Re^2: Regex To Remove File Extension
by Narveson (Chaplain) on Dec 10, 2008 at 20:14 UTC

    File::Basename says

    $basename = basename($fullname,@suffixlist);

    If @suffixes are given each element is a pattern (either a string or a qr//) matched against the end of the $filename. The matching portion is removed and becomes the $suffix.

    So File::Basename doesn't solve the original problem; it requires the solution before it can be used.

      Of course, if you have a manageable list of extensions, you could populate the suffix list and use File::Basename very easily. The original post says "extension might not always be txt," but that doesn't indicate the scope of potential extensions. Maybe it's just txt, html, htm, pl and cgi (a random group of extensions), in which case I'd lean towards the File::Basename solution rather than creating a regex unique to this script.
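A rough sketch of the suffix-list approach, assuming the five extensions named above are the whole set. It uses fileparse from File::Basename, which treats each suffix as a pattern, so the extensions are wrapped in qr// with the period escaped. The sub name strip_known_suffix and the sample paths are hypothetical.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Assumed list of extensions; adjust to whatever the script really sees.
my @known_suffixes = map { qr/\.\Q$_\E/ } qw(txt html htm pl cgi);

# Returns the file name with any directory part and any known suffix
# removed. Unknown suffixes are left in place.
sub strip_known_suffix {
    my ($path) = @_;
    my ($name, $dir, $suffix) = fileparse($path, @known_suffixes);
    return $name;
}

# Hypothetical sample paths, for illustration only.
for my $path ('/var/www/index.html', 'report.txt', 'notes.doc') {
    printf "%-22s %s\n", $path, strip_known_suffix($path);
}
```

Note that fileparse also splits off the directory, so the directory has to be put back in front of the name if the full path is still needed.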

      Or maybe the AnonyMonk means to be able to remove any extension, in which case File::Basename isn't the best solution. Obviously the AnonyMonk will need to choose the best approach, but I wouldn't discount File::Basename for a limited number of extensions.
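For the "any extension" case, one option that still goes through File::Basename is to pass fileparse a single generic pattern instead of a list of known extensions. This is only a sketch: the sub name split_extension and the sample paths are made up, and the pattern is the same "period plus non-periods" idea as the corrected regex in the thread.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Splits a path into (name without last suffix, last suffix).
# The suffix is the empty string when the name contains no period.
sub split_extension {
    my ($path) = @_;
    my ($name, $dir, $suffix) = fileparse($path, qr/\.[^.]*/);
    return ($name, $suffix);
}

# Hypothetical sample paths, for illustration only.
for my $path ('/tmp/archive.tar.gz', 'report.txt', 'README') {
    my ($name, $suffix) = split_extension($path);
    printf "%-22s name=%-12s suffix=%s\n", $path, $name, $suffix;
}
```

Whether this is preferable to a bare s/// is a judgment call; it mainly helps when the directory part has to be handled as well.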
