Unification of Directories (their contents)

by PetaMem (Priest)
on Apr 25, 2009 at 17:50 UTC
PetaMem has asked for the wisdom of the Perl Monks concerning the following question:


Given n directories D1..Dn with arbitrary content, I would like a function that takes these source directories and copies their contents to a destination directory Dd such that Dd is the union of all files and of the filesystem structure of D1..Dn.

D1:
 - Folder1
   - file1
   - file2
 - Folder2
   - file4

D2:
 - Folder1
   - file3
   - file5
 - Folder3
   - file7

Dd (result):
 - Folder1
   - file1
   - file2
   - file3
   - file5
 - Folder2
   - file4
 - Folder3
   - file7

Right now I can assume that the file sets of D1..Dn are mutually exclusive, but a hook for deciding how to behave on clashes (and for which type of clash: same name only, or same name with differing content) would not hurt.

Does such a thing already exist? I know about UnionFS, but rather than this transparent solution I would prefer a CLI-based batch processing tool, because these merging operations are one-shot only.


Maybe I should start to have LESS Perl on my mind... ;-)

> cp -a D[12]/* Dd/

Does the trick. OK, it's not that portable and it's not Perl, but for the moment...

    All Perl:   MT, NLP, NLU

Re: Unification of Directories (their contents)
by graff (Chancellor) on Apr 26, 2009 at 04:09 UTC
    If I understand it correctly, "cp -a" is Linux/GNU only; BSD (including FreeBSD and Mac OS X) doesn't have a "-a" option for "cp" (though other options in combination will give the same result).

    Note that your usage (cp -a D[12]/* Dd/) will take the "depth 1" contents in ascii-betic order, and deeper contents in "arbitrary" (directory content) order, so in any case, file age doesn't enter into the logic.

    If there might be path/name collisions among the source directories, where differences of age and content matter, you might look at "rsync" (for which you'll find perl modules on CPAN, in case that helps, but you'll probably need to grok the rsync command-line tool options anyway, so try "man rsync").
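To make "differences of age and content" concrete, here is a minimal Perl sketch of the decision a collision handler would have to make. It is not rsync and does not use any rsync module; classify_clash() is a hypothetical name, and the three age labels are an assumption about what a caller might want to distinguish. It relies only on the core File::Compare module and on stat().

```perl
use strict;
use warnings;
use File::Compare qw(compare);

# classify_clash() is a hypothetical helper, not part of any module.
# It describes the source file relative to the file already at the
# destination: 'identical' content, or differing content where the
# source is 'newer', 'older' or of the 'same_age' (by mtime).
sub classify_clash {
    my ($src, $dst) = @_;

    # compare() returns 0 for equal content, 1 for different, -1 on error.
    my $cmp = compare($src, $dst);
    die "cannot compare $src and $dst: $!" if $cmp < 0;
    return 'identical' if $cmp == 0;

    # Element 9 of the stat() list is the modification time.
    my $src_mtime = (stat $src)[9];
    my $dst_mtime = (stat $dst)[9];
    my $order     = $src_mtime <=> $dst_mtime;

    return $order > 0 ? 'newer'
         : $order < 0 ? 'older'
         :              'same_age';
}
```

A caller could, for instance, overwrite only when the answer is 'newer' and report everything except 'identical' for manual review.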

Re: Unification of Directories (their contents)
by atcroft (Monsignor) on Apr 26, 2009 at 09:57 UTC

    My thoughts upon reading this (which I hope prove helpful) were:

    1. Use File::Find to generate a list of files (also potentially at this point exiting or asking for processing options for duplicate file names)
    2. Use the -d dirname test, mkdir() and File::Spec to create subdirectories that may be needed
    3. Use File::Copy's copy() function for the final movement

    At least, that's the process I would likely proceed with (assuming a better solution is not proposed).
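The three steps above can be sketched roughly as follows. This is only one way to wire the modules together, under stated assumptions: merge_dirs() and its $on_clash callback are hypothetical names invented for this sketch, File::Path's make_path() stands in for a hand-written mkdir() loop, symlinks and special files are simply skipped, and the default clash policy (first copy wins) is an arbitrary choice.

```perl
use strict;
use warnings;
use File::Find qw(find);
use File::Spec;
use File::Path qw(make_path);
use File::Copy qw(copy);
use File::Basename qw(dirname);

# merge_dirs() is a hypothetical helper, not an existing CPAN function.
# $on_clash is called as $on_clash->($source_file, $existing_target) and
# returns true to overwrite the target, false to keep it.
sub merge_dirs {
    my ($dest, $sources, $on_clash) = @_;
    $on_clash ||= sub { 0 };    # assumed default: first copy wins

    make_path($dest) unless -d $dest;

    for my $src (@$sources) {
        find({
            no_chdir => 1,      # keep $File::Find::name usable as a path
            wanted   => sub {
                my $path = $File::Find::name;
                return if -l $path;                  # skip symlinks

                # Step 1: path relative to the source root.
                my $rel = File::Spec->abs2rel($path, $src);
                return if $rel eq File::Spec->curdir;
                my $target = File::Spec->catfile($dest, $rel);

                # Step 2: create subdirectories as needed.
                if (-d $path) {
                    make_path($target) unless -d $target;
                    return;
                }
                return unless -f $path;              # skip special files

                # Clash hook: the target already exists in $dest.
                if (-e $target) {
                    return unless $on_clash->($path, $target);
                }

                # Step 3: copy the file into place.
                make_path(dirname($target)) unless -d dirname($target);
                copy($path, $target)
                    or die "copy $path -> $target failed: $!";
            },
        }, $src);
    }
    return;
}

# Possible usage, with the directory names from the question:
#   merge_dirs('Dd', ['D1', 'D2'], sub { warn "clash: $_[1]\n"; 0 });
```

Note that copy() does not preserve permissions or timestamps, so this is not a full replacement for "cp -a".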


Re: Unification of Directories (their contents)
by spx2 (Chaplain) on Aug 07, 2009 at 10:17 UTC

    There is a solution for your problem; however, some corner cases have not been specified in enough detail. For example, what if there are two files, one in D1 and one in D2, that don't sit in the same place (relative to D1 and D2) but have the same content? Would that be considered a duplicate or not?

    If the answer to that question is no, then you can use shlomif's nice method for a recursive diff of two directories, found here, and tailor it to your needs.

    Otherwise, it may be more difficult: you might need to calculate hashes for all files (in both D1 and D2) and check whether each file's content has already been included in Dd.
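A minimal sketch of the hashing idea, assuming SHA-256 is an acceptable notion of "same content" (the post does not name a hash function). content_index() and content_duplicates() are hypothetical names; the sketch uses the core Digest::SHA and File::Find modules and reads every file in full, so it can be slow on large trees.

```perl
use strict;
use warnings;
use Digest::SHA;
use File::Find qw(find);

# content_index() is a hypothetical helper. It walks the given
# directories and returns a hash ref mapping each SHA-256 hex digest
# to the list of plain files that have that content.
sub content_index {
    my (@dirs) = @_;
    my %by_digest;

    find({
        no_chdir => 1,
        wanted   => sub {
            my $path = $File::Find::name;
            return if -l $path;          # skip symlinks
            return unless -f $path;      # plain files only

            # 'b' asks addfile() to read the file in binary mode.
            my $digest = Digest::SHA->new(256)
                                    ->addfile($path, 'b')
                                    ->hexdigest;
            push @{ $by_digest{$digest} }, $path;
        },
    }, @dirs);

    return \%by_digest;
}

# content_duplicates() is also hypothetical: it returns the groups
# (array refs of paths) whose content occurs more than once.
sub content_duplicates {
    my ($index) = @_;
    return grep { @$_ > 1 } values %$index;
}
```

With such an index built over D1 and D2, a merge could copy only the first path of each group into Dd, whatever its relative location.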
