http://www.perlmonks.org?node_id=158793

emilford has asked for the wisdom of the Perl Monks concerning the following question:

Okay, so I'm anal about the way my music files are named on my hard drive. I like to have my music files named in a specific format, so after downloading a ton of music, I have a lot of renaming to do.

I thought to myself, why not once again use perl to your advantage. Here's what I'm looking for: I download files with all sorts of naming conventions and I'm looking to rename them to this format:
Music Author - Song of Title
I started thinking of a regular expression that could do this and then realized how many possible cases there are to take care of: capitalization, removing certain elements, adding spaces where need be, etc. I'm not the most experienced programmer when it comes to regular expressions, but from my minimal use, I think this could be done. Any help, monks?

What regular expression would I need (or series of regular expressions) to rename a file like below? Or is this too broad of a task, with too many possibilities to do with reg. exps.?
xecutioners-you_cant_scratch_(skit) to Xecutioners - You Cant Scratch Skit

Replies are listed 'Best First'.
Re: a task for a regular expression expert
by stephen (Priest) on Apr 13, 2002 at 17:23 UTC

    Regular expressions are the vice-grips in the Perl programmer's toolbox. They're important tools. They're useful for just about everything. And overusing them can get you into sticky situations.

    Here's code to transform 'artist_name-title_with_other(chars)' into 'Artist Name - Title With Other Chars'.

    my $song = q{xecutioners-you_cant_scratch_(skit)};

    # Split up artist and title of song
    my ($artist, $title) = split(/-/, $song, 2);

    # Turn the artist and title into human text, then stick
    # them back together with ' - '.
    my $new_song = humanize_text($artist) . ' - ' . humanize_text($title);
    print $new_song, "\n";

    # Runs a piping operation. From back to front, here's what
    # happens:
    # 1. We split 'something_like(this)' into 'something',
    #    'like', 'this'
    # 2. We capitalize the first letter of every word
    # 3. We stick the list back together delimited with spaces
    sub humanize_text {
        return join(' ', map(ucfirst($_), split(/[^a-zA-Z]+/, $_[0])) );
    }

    stephen

      Thanks Stephen, this is a really good start! I ran the code on a few sample file names and it works fairly well. There are, however, still a few kinks that I need to work through.

      1) some song titles have numbers in them - these are removed with current code
      2) what about &'s and apostrophes - these are also removed
      3) I also need to find a way to remove certain key words that don't really belong in the title
      Either way, the code is a great start. Thanks. - Eric
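One way the three kinks above might be handled, as a sketch rather than stephen's actual code: widen the split character class so digits, `&` and apostrophes survive, and filter the words against a junk list. The `%junk` list here is hypothetical; fill it with whatever keywords don't belong in your titles.

```perl
use strict;
use warnings;

# Hypothetical list of junk words to drop; adjust to taste.
my %junk = map { $_ => 1 } qw(skit remix retry);

sub humanize_text {
    my ($text) = @_;
    # Split on runs of anything that is not a letter, digit, & or apostrophe.
    my @words = grep { length } split /[^a-zA-Z0-9&']+/, $text;
    # Drop junk words (compared case-insensitively), then capitalize the rest.
    @words = grep { !$junk{lc $_} } @words;
    return join ' ', map { ucfirst($_) } @words;
}

my $song = q{xecutioners-you_cant_scratch_(skit)};
my ($artist, $title) = split /-/, $song, 2;
print humanize_text($artist), ' - ', humanize_text($title), "\n";
# prints "Xecutioners - You Cant Scratch"
```

Note that a junk word is dropped wherever it appears, so a song that is genuinely called "Remix" would lose its title with this list.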
Re: a task for a regular expression expert
by perlplexer (Hermit) on Apr 13, 2002 at 18:38 UTC
    I doubt you can do that with just a single regex. Even if you come up with something, it'll be ugly.
    Here's how I would approach your problem:
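    use strict;
    use warnings;

    my $folder = $ARGV[0] ? $ARGV[0] : '.';
    my $file;
    opendir DIR, $folder or die "Can't open $folder : $!\n";
    while (defined($file = readdir DIR)){
        next if $file eq '.' or $file eq '..';
        # Underscores, tabs, newlines, semicolons and colons become spaces
        $file =~ tr/_\t\r\n;:/ /;
        # Collapse runs of whitespace
        $file =~ s/\s+/ /g;
        # Drop embedded track numbers such as "- 01 -" or "- Track 06 -"
        $file =~ s/- *(?:track *)?\d+ *-/-/gi;
        # Strip a leading track number: "01-", "04." or "[02]"
        $file =~ s/^\d+ *- *|^\d+ *\. *|^[(\[{] *\d+ *[)\]}] *//;
        # Put spaces around hyphens that have none
        $file =~ s/(\S)-(\S)/$1 - $2/g;
        # Uppercase the first letter of every word
        $file =~ s/(\w+)/\u$1/g;
        print "$file\n";
    }
    closedir DIR;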
    As you can see, the code just print()s the "beautified" version of each file name. You can replace print() with rename() or whatever you like.
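If you do swap print() for rename(), a guard against overwriting an existing file is probably worth having. The following is only a sketch: `safe_rename` is a hypothetical helper, not part of perlplexer's code, and it assumes you pass it the original name and the beautified name separately.

```perl
use strict;
use warnings;

# Rename $old to $new inside $dir, refusing to clobber an existing file.
# Returns 1 on success, 0 if the rename was skipped or failed.
sub safe_rename {
    my ($dir, $old, $new) = @_;
    return 0 if $old eq $new;    # nothing to do
    if (-e "$dir/$new") {
        warn "Skipping $old: $new already exists\n";
        return 0;
    }
    rename "$dir/$old", "$dir/$new"
        or do { warn "Can't rename $old: $!\n"; return 0 };
    return 1;
}
```

Since the loop modifies $file in place, you would need to copy the name returned by readdir into a second variable first and call something like safe_rename($folder, $original, $file). On a case-insensitive filesystem the -e check will also fire when the two names differ only in capitalization, so those files would be skipped.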

    Input
    01-Paul_Oakenfold_-_01_-_HHC_-_We're_Not_Alone.mp3
    04.Paul_Oakenfold_-_04_-_Red_Sun_-_This_Love.mp3
    05. Paul_Oakenfold_-_05_-_Ryuchi_Sakamoto_-_Little_Budha.mp3
    Paul_Oakenfold_-_Track 06_-_Man_With_No_Name_-_Teleport.mp3
    Paul_Oakenfold_-_07_-_Terrorvision_-_Conspiracy.mp3
    Paul_Oakenfold_-_08_-_man_with_no_name_-_Sugar_Rush-retry.mp3
    Paul_Oakenfold_-_09_-_Eric_Serra_-_Cute_Name.mp3
    Paul_Oakenfold_-_10_-_State_of_Emergency_-_Banks_of_Babylon.mp3
    Paul_Oakenfold_-_11_-_Juno_Reactor_-_Jungle_High.mp3
    Paul_Oakenfold_-_12_-_Ennio_Morricone_-_Miscrere.mp3
    Paul_Oakenfold_-_13_-_Virus_-_Moon.mp3
    Paul_Oakenfold_-_14_-_Grace_-_If_I_Could_Fly.mp3
    Paul_Oakenfold_-_15_-_Our_House_-_Floor_Space-retry.mp3
    [02]Paul_Oakenfold_-_02_-_Ryuchi_Sakamoto_-_Merry_Christmas_Mr_L.mp3
    [03] Paul_Oakenfold_-_03_-_Y-Traxx_-_Mystery_Land.mp3
    Output
    Paul Oakenfold - HHC - We'Re Not Alone.Mp3
    Paul Oakenfold - Red Sun - This Love.Mp3
    Paul Oakenfold - Ryuchi Sakamoto - Little Budha.Mp3
    Paul Oakenfold - Man With No Name - Teleport.Mp3
    Paul Oakenfold - Terrorvision - Conspiracy.Mp3
    Paul Oakenfold - Man With No Name - Sugar Rush - Retry.Mp3
    Paul Oakenfold - Eric Serra - Cute Name.Mp3
    Paul Oakenfold - State Of Emergency - Banks Of Babylon.Mp3
    Paul Oakenfold - Juno Reactor - Jungle High.Mp3
    Paul Oakenfold - Ennio Morricone - Miscrere.Mp3
    Paul Oakenfold - Virus - Moon.Mp3
    Paul Oakenfold - Grace - If I Could Fly.Mp3
    Paul Oakenfold - Our House - Floor Space - Retry.Mp3
    Paul Oakenfold - Ryuchi Sakamoto - Merry Christmas Mr L.Mp3
    Paul Oakenfold - Y - Traxx - Mystery Land.Mp3

    --perlplexer
      Paul Oakenfold....nice.

      Well I have to say this code works quite well. I'm in the process of testing it out (working through what all the reg. exprs. do) and adding a few additions here and there.

      Thanks for the help! - Eric
      $_++ for 'Astral Projection', 'Hallucinogen', 'Shakta';

      --Guess who :)
Making consistent file names for MP3s (boo)
by boo_radley (Parson) on Apr 13, 2002 at 19:02 UTC
    I, too, tend to be particular about how my mp3s are titled. If your "music files" are "MP3s", you could try using the assortment of ID3 modules to assemble the information from the id3 tags rather than the file name. This information tends to be a little more reliable than extracting the same from file names, which rely entirely on the whims of the person... (Some of the titles I've ripped have CDDB information that includes the ISBN code and format (CD5) in the title, which makes no sense to me).
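To give a rough idea of what reading tags instead of file names involves, here is a hand-rolled sketch that reads only the old ID3v1 tag, which sits in the last 128 bytes of the file and starts with "TAG". It is not one of the ID3 modules mentioned above, `read_id3v1` is a hypothetical name, and it ignores ID3v2 tags entirely, so a real module is likely the better choice for anything serious.

```perl
use strict;
use warnings;

# Read an ID3v1 tag: the last 128 bytes of the file, starting with "TAG".
# Returns a hash ref, or nothing if the file carries no ID3v1 tag.
sub read_id3v1 {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Can't open $path: $!\n";
    return if (-s $fh) < 128;
    seek $fh, -128, 2 or return;    # whence 2 = relative to end of file
    my $got = read $fh, my $buf, 128;
    close $fh;
    return unless defined $got && $got == 128;
    # Fixed-width fields; the "A" format strips trailing spaces and NULs.
    my ($id, $title, $artist, $album, $year) = unpack 'A3 A30 A30 A30 A4', $buf;
    return unless $id eq 'TAG';
    return { artist => $artist, title => $title, album => $album, year => $year };
}
```

With that in hand, a new file name could be built as "$tag->{artist} - $tag->{title}.mp3", falling back to the file-name approach when no tag comes back. ID3v1 fields are capped at 30 characters, so long titles arrive truncated.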
Re: a task for a regular expression expert
by Juerd (Abbot) on Apr 13, 2002 at 17:10 UTC
Re: a task for a regular expression expert
by emilford (Friar) on Apr 13, 2002 at 17:24 UTC
    What about removing things other than parens from the string? Is there a more general way to remove anything that isn't either a space or a letter? What if the filename is Author - Song_Title_Foo?
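A negated character class is probably the most general answer here: list what you want to keep and delete everything else. This is only a sketch, and it assumes underscores should turn into spaces and that digits and hyphens are worth keeping; `strip_punctuation` is a made-up name.

```perl
use strict;
use warnings;

sub strip_punctuation {
    my ($name) = @_;
    $name =~ tr/_/ /;                  # underscores become spaces
    $name =~ s/[^A-Za-z0-9 -]+//g;     # delete all but letters, digits, spaces, hyphens
    $name =~ s/\s+/ /g;                # collapse runs of spaces
    $name =~ s/^ +| +$//g;             # trim both ends
    return $name;
}

print strip_punctuation('Author - Song_Title_Foo'), "\n";
# prints "Author - Song Title Foo"
```

To keep more characters (apostrophes, say), add them inside the brackets, leaving the hyphen last so it is not read as a range.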