PerlMonks

Optimize file renaming.

by omega_monk (Scribe)
on May 26, 2005 at 11:50 UTC

omega_monk has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

I had some files (~350) that I wanted to rename, and after a bit of work I got them renamed the way I wanted. My question is how I could have done this more efficiently, so here is a bit of background, and then the code I used...

Like I said, I have 350 or so files, named from "1.png" to "350.png", and I simply wanted leading 0's on any base filename with fewer than 3 numeric characters, so "1.png" becomes "001.png".

#!/usr/bin/perl -w
use strict;

opendir(THISDIR,"e:\\perl\\comics\\");
my @comics=readdir(THISDIR);
close(THISDIR);

foreach (reverse(@comics)) {
    my ($name,$ext)=split(/\./,$_);
    if ($name=~/\d{3}/) {
        print '.';
    } elsif ($name=~/\d{2}/) {
        my $old=$name;
        $name=~s/(\d{2})/0$1/;
        rename "$old\.$ext", "$name\.$ext";
    } elsif ($name=~/\d{1}/) {
        my $old=$name;
        $name=~s/(\d{1})/00$1/;
        rename "$old\.$ext", "$name\.$ext";
    }
}


The wisdom of the monks is appreciated.

update: I made the changes that I picked up, and for the moment, I have this. Thanks for the lessons, monks.

#!/usr/bin/perl
use warnings;
use strict;

chdir('e:/perl/comics/');
my @comics = glob('*.png');
foreach(@comics) {
    my $oldname = $_;
    $_ =~ s/(\d+)/sprintf("%03d",$1)/e;
    rename($oldname,$_) unless $oldname eq $_;
}

Replies are listed 'Best First'.
Re: Optimize file renaming.
by blazar (Canon) on May 26, 2005 at 11:59 UTC
    Hint: sprintf

    Update: Analyzing more precisely your script...

    #!/usr/bin/perl -w
    use strict;
    It's better, for various reasons too long to be discussed here, to
    use warnings;
    instead of -w. With recent enough perls, that is...
    opendir(THISDIR,"e:\\perl\\comics\\");
    my @comics=readdir(THISDIR);
    close(THISDIR);
    Well, nothing wrong with this, per se. But I have the impression that people tend to abuse opendir & C. where a simple glob would do. In particular since you say that your script did work, you must have run it from e:\perl\comics. Otherwise you should have chdir'd there or prepended the dirname to the filenames subsequently.

    Also, and this is IMHO important, perl supports "/" as the directory separator under Windows too, which is useful. Quoting backslashes can be confusing and may lead to errors. If you really want to use them, and of course if you don't need interpolation, you can use single quotes, which hopefully will make your life easier.

    All in all you may have had e.g.

    my @comics=glob 'e:/perl/comics/*'; # or 'e:/perl/comics/*.png', according to your description
    foreach (reverse(@comics)) {
    Huh?!? Why reverse?!?
    my ($name,$ext)=split(/\./,$_);
    Nothing wrong with this either, but a very useful module for "this kinda things" is File::Basename. Now I use it even when a simple regex would do. In any case it makes me sure it's doing the right thing.

    <snip rest of code>

    Don't! Use sprintf as hinted above, instead.

    All in all, if this was just a quick hack, I would have done that with a combination of shell cmds and perl, e.g.:

    ls -1 *.png | perl -lne '$o=$_; s/(\d+)/sprintf "%03d", $1/e; rename $o, $_'
      Honestly the reverse is a remnant from trying to figure out how to deal with the list of files. I should have removed it, but didn't.

      This was more learning on my part, than necessity. Appreciate the insight.
        In any case this is a good occasion to remind that readdir returns the filenames in whatever order they're stored in the filesystem.
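Since readdir makes no promise about order, a script that wants to process the files as 1, 2, ..., 350 has to sort the list itself. A minimal sketch, assuming every name starts with digits; sort_by_number is a hypothetical helper name, not something from this thread:

```perl
use strict;
use warnings;

# Hypothetical helper: sort names such as "10.png" by their leading number.
# Names without leading digits are treated as 0 and therefore sort first.
sub sort_by_number {
    my @names = @_;
    return map  { $_->[1] }
           sort { $a->[0] <=> $b->[0] }
           map  { [ (/^(\d+)/ ? $1 : 0), $_ ] } @names;
}

print join(' ', sort_by_number(qw(10.png 2.png 1.png 350.png))), "\n";
```

A plain sort would compare the names as strings and put "10.png" before "2.png", which is why the numeric key is extracted first.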
Re: Optimize file renaming.
by polettix (Vicar) on May 26, 2005 at 11:59 UTC
    • I don't understand why you reverse the file list.
    • Is the print necessary?
    You could pad with zeroes using sprintf:
    $name = sprintf "%03d", $name;
    or, in a more general way:
    my $padstring = '0';
    my $minlength = 3;
    $name = ($padstring x ($minlength - length($name))) . $name;

    Flavio (perl -e 'print(scalar(reverse("\nti.xittelop\@oivalf")))')

    Don't fool yourself.
Re: Optimize file renaming.
by prasadbabu (Prior) on May 26, 2005 at 12:04 UTC

    TIMTOWTDI

    use File::Basename;
    for $a (@a) {
        ($name, $path, $type) = fileparse ($a, qr/\..*/);
        $name = sprintf("%03d", $name);
        $a = "$name"."$type";
    }

    Prasad
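For readers who want to try the fileparse idea under strict, here is a self-contained sketch. It only computes the new name and does not touch the filesystem. pad_basename and its $width parameter are hypothetical names chosen for illustration, and the suffix pattern is narrowed to the last extension only, which is an assumption about the filenames involved:

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: zero-pad an all-digit basename to $width characters.
sub pad_basename {
    my ($file, $width) = @_;

    # Split into basename, directory part, and the final ".ext" suffix.
    my ($name, $path, $suffix) = fileparse($file, qr/\.[^.]*/);

    # Leave anything that is not purely numeric untouched.
    return $file unless $name =~ /^\d+$/;

    # fileparse reports "./" for a bare filename; do not add it to the result.
    my $dir = $file =~ m{/} ? $path : '';

    return $dir . sprintf("%0${width}d", $name) . $suffix;
}

print pad_basename($_, 3), "\n" for qw(e:/perl/comics/7.png 42.png readme.txt);
```

Feeding the result to rename, with an "or die" on failure, would complete the job.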

Re: Optimize file renaming.
by ivancho (Hermit) on May 26, 2005 at 12:02 UTC
    chdir("e:\\perl\\comics\\");
    foreach my $fname (glob("*.png")) {
        my $append = "0" x (7 - length($fname));
        rename($fname, $append.$fname) or die "Cannot rename $fname";
    }
    Update:

    in all honesty, for my usage I'd have written

    perl -e 'rename($_,("0"x(7-length($_))).$_) for glob("*.png")'
    from the appropriate directory

    But it's probably best to maintain some readability when messing around with the filesystem

    Update2:

    rename($_,substr("000$_",-7)) is even groovier ( borrowed from EdwardG ), although it'll mess you up, if there are longer names, while the previous version won't
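The difference between the two variants shows up as soon as a name is longer than 7 characters. A small sketch that compares them without renaming anything; pad_repeat and pad_substr are hypothetical wrapper names, and the guard against a negative repeat count is an addition to keep newer perls from warning:

```perl
use strict;
use warnings;

# The "x" variant: prepend as many zeros as are missing, or none at all.
sub pad_repeat {
    my ($file) = @_;
    my $missing = 7 - length $file;
    return ($missing > 0 ? '0' x $missing : '') . $file;
}

# The substr variant: always keep exactly the last 7 characters.
sub pad_substr {
    my ($file) = @_;
    return substr("000$file", -7);
}

for my $file (qw(1.png 42.png 1234.png)) {
    printf "%-9s repeat: %-9s substr: %s\n",
        $file, pad_repeat($file), pad_substr($file);
}
```

For "1234.png" the substr variant yields "234.png", silently dropping the leading digit, while the repeat variant leaves the name alone.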

Re: Optimize file renaming.
by mirod (Canon) on May 26, 2005 at 12:01 UTC

    Try using sprintf:

    ...
    my $old_name= $_;
    my $new_name= sprintf( "%03d.%s", split(/\./));
    rename $old_name, $new_name unless( $old_name eq $new_name);
    ...
Re: Optimize file renaming.
by Tomtom (Scribe) on May 26, 2005 at 12:00 UTC
    I think you could use sprintf like :
    $result = sprintf("%07s", $name);
    (originally "%07d"), since your new file names would always be 7 characters long.

    Update : too late :(

    Update : As blazar said, "%07d" is wrong, I updated it to "%07s", which seems to work better : sorry for the mistake.
      $ echo '1.png' | perl -lpe '$_=sprintf "%07d", $_'
      0000001
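The two formats can be put side by side to see why "%07d" loses the extension: %d converts the string to a number first, while %s with the 0 flag pads the whole string with zeros. A minimal sketch with the sample name hard-coded:

```perl
use strict;
use warnings;

my $file = '1.png';

# %s keeps the whole string and left-pads it with zeros to 7 characters.
my $as_string = sprintf '%07s', $file;

# %d converts "1.png" to the number 1 first, so the ".png" part is lost.
my $as_number = do {
    no warnings 'numeric';    # silences: Argument "1.png" isn't numeric
    sprintf '%07d', $file;
};

print "%07s gives $as_string\n";
print "%07d gives $as_number\n";
```

Note that the fixed width of 7 only fits names with a three-letter extension; a different extension length would need a different width.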
Re: Optimize file renaming.
by EdwardG (Vicar) on May 26, 2005 at 12:21 UTC
    E:\test>dir
     Volume in drive E has no label.
     Volume Serial Number is 4266-6D09

     Directory of E:\test

    26/05/2005  13:20    <DIR>          .
    26/05/2005  13:20    <DIR>          ..
    26/05/2005  13:05                 0 1.png
    26/05/2005  13:05                 0 2.png
    26/05/2005  13:05                 0 3.png
    26/05/2005  13:05                 0 4.png
    26/05/2005  13:05                 0 5.png
                   5 File(s)              0 bytes
                   2 Dir(s)  60,811,968,512 bytes free

    E:\test>for %f in (*.png) do @perl -e "$new = substr('0'x3 . $ARGV[0], -7); `ren $ARGV[0] $new`;" %f

    E:\test>dir
     Volume in drive E has no label.
     Volume Serial Number is 4266-6D09

     Directory of E:\test

    26/05/2005  13:20    <DIR>          .
    26/05/2005  13:20    <DIR>          ..
    26/05/2005  13:05                 0 001.png
    26/05/2005  13:05                 0 002.png
    26/05/2005  13:05                 0 003.png
    26/05/2005  13:05                 0 004.png
    26/05/2005  13:05                 0 005.png
                   5 File(s)              0 bytes
                   2 Dir(s)  60,811,968,512 bytes free

    E:\test>

     

Re: Optimize file renaming.
by trammell (Priest) on May 26, 2005 at 13:49 UTC
    Nobody has mentioned the rename(1) script (installed into /usr/bin/rename on my Debian machine, part of the perl package) written in Perl. IIRC it's been around since Perl 4 days....
    % touch 1.png 11.png 1111.png
    % rename 's/(\d+)/sprintf("%03d",$1)/e' *.png
    % ls *.png
    001.png  011.png  1111.png
    %

      And if you don't have it and don't feel like grabbing the entire source tarball:

      #!/usr/bin/perl -w
      # rename - Larry's filename fixer
      $op = shift or die "Usage: rename expr [files]\n";
      chomp(@ARGV = <STDIN>) unless @ARGV;
      for (@ARGV) {
          $was = $_;
          eval $op;
          die $@ if $@;
          rename($was,$_) unless $was eq $_;
      }

      Always handy. Don't leave $HOME without it.

      On the system I'm currently on:
      $ file $(which rename)
      /usr/bin/rename: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.0.0, dynamically linked (uses shared libs), stripped
      However from the description you give of perl's rename it seems it lets you run arbitrary code. I hope it uses Safe; although I heard some experienced and knowledgeable perl hacker say that even that isn't really bulletproof...
Re: Optimize file renaming.
by halley (Prior) on May 26, 2005 at 13:30 UTC
    So far, nobody's offered glob(), so I'll throw that in there. Unless you're expecting to load several thousand filenames every few seconds, avoid the extra hassle of a readdir() loop. Since you're not even trying to save memory in your code, you should see no obvious performance difference.

    Replace:

    opendir(THISDIR,"e:\\perl\\comics\\");
    my @comics=readdir(THISDIR);
    close(THISDIR);

    With:

    my @comics = glob('e:/perl/comics/*');

    Many people work in the current directory, and specify specific wildcard masks. The "angle brackets" operator calls glob() for you (if it's not a bareword), so this might be even more readable.

    my @comics = <*.png>;
    Oh, and there are a few cases where Windows still really requires backslashes, but most of the time, it accepts forward slashes, Unix-style. (This has been true since MS-DOS 2.0 when it introduced subdirectories.) I find them easier to type and read.

    --
    [ e d @ h a l l e y . c c ]

Re: Optimize file renaming.
by Grygonos (Chaplain) on May 26, 2005 at 14:59 UTC
    TIMTOWTDI
    #!/Perl/bin/perl
    use strict;
    use warnings;

    opendir(image_data,"C:/testdir") or die "$!";
    map{print scalar(reverse(substr(scalar(reverse "000".$_),0,7)))."\n"} readdir image_data;
    closedir(image_data)
Re: Optimize file renaming.
by blazar (Canon) on May 26, 2005 at 17:13 UTC
    update: I made the changes that I picked up, and for the moment, I have this. Thanks for the lessons, monks.
    All in all looks fine. Only
    $_ =~ s/(\d+)/sprintf("%03d",$1)/e;
    One good point of using $_ is that so many built in functions and operators default to it, so that you can write
    s/(\d+)/sprintf("%03d",$1)/e;
    instead. Now, once you get used to Perl slang, this will become more intuitive and readable than the other way round. OTOH you may have chosen to do
    (my $newname=$_) =~ s/(\d+)/sprintf("%03d",$1)/e;
    instead.
    rename($oldname,$_) unless $oldname eq $_;
    Now, this is not strictly necessary:
    touch zizze; perl -le 'rename "zizze", "zizze" or die $!'
    Of course it helps to avoid an unnecessary system call. But then again, if it is a matter of a quick hack, I wouldn't do it. All in all I don't know if it is convenient efficiency-wise, and whether it is or not depends strongly on the actual filenames. One could try to do a benchmark, but I'm not doing it now as my laziness is currently overwhelming my hubris (BTW: see also Modules that significantly contribute to Laziness and Modules that significantly contribute to {Impatience,Hubris} - free ad! ;-)
Re: Optimize file renaming.
by kscaldef (Pilgrim) on May 26, 2005 at 18:34 UTC
    I'm surprised that no one has asked _why_ you are interested in optimizing this task. Do you expect to do it a lot? Do you expect to do it on many more than 350 files in the future? Is it actually a good use of your time to make this faster?

    That said, the revised version has the virtue of being easier to understand than the original, so that's good. Personally, though, I would do away with the s///e business and just assign the result of sprintf to a temporary variable.

      I did mention that I wanted to optimize it for the sake of learning. ;)
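One way to read the suggestion about dropping s///e is to capture the number with a plain match and build the new name from a sprintf result. A sketch under that assumption; padded_name is a hypothetical helper name, and the rename call itself is left as a comment so the example has no side effects:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the zero-padded name, or the input unchanged
# when it does not start with digits.
sub padded_name {
    my ($old) = @_;
    my ($number, $rest) = $old =~ /^(\d+)(.*)$/s
        or return $old;
    my $padded = sprintf '%03d', $number;    # the temporary variable
    return $padded . $rest;
}

for my $old (qw(1.png 42.png 350.png notes.txt)) {
    my $new = padded_name($old);
    print "$old -> $new\n";
    # A real script would do something like:
    #   rename $old, $new or die "Cannot rename $old: $!" if $new ne $old;
}
```

Compared with s///e, this keeps the formatting step as an ordinary function call, at the cost of a few more lines.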
Re: Optimize file renaming.
by Tanktalus (Canon) on May 26, 2005 at 17:21 UTC

    What I use on Unix:

    $ cat renum
    #! /usr/bin/perl -w
    use strict;

    my $len = 0;
    if ($ARGV[0] =~ /^\d+$/) {
        $len = shift;
    } elsif ($ARGV[$#ARGV] =~ /^\d+$/) {
        $len = pop;
    } else {
        foreach my $f (@ARGV) {
            $f =~ /(\d+)(\D*)$/;
            $len = length $1 if length $1 > $len;
        }
    }
    unless ($len) {
        print "No length given/discovered\n";
        exit 1;
    }
    foreach my $f (@ARGV) {
        (my $newf = $f) =~ s/(\d+)(\D*)$/sprintf "%0${len}d%s", $1, $2/e;
        if ($newf ne $f) {
            print "$f -> $newf\n";
            rename $f, $newf;
        }
    }
    However, if you want this to work cleanly on Windows, too, I'd just add a line near the top like this: @ARGV = map { glob $_ } @ARGV; and then you could run this as perl renum e:/perl/comics/*.png - it would figure out that the largest number is 3 digits long, and rename all of them to be 3 digits long. Slightly more generic.
