Are there CPAN modules that can help write realtime software catalogs
by misterperl (Pilgrim)
on Jun 06, 2023 at 18:42 UTC
misterperl has asked for the wisdom of the Perl Monks concerning the following question:
We have several thousand .pm, .pl, and .cgi files in one mega directory. The Pareto principle suggests that only twenty percent or so get frequent use, but we don't know, and it would be useful to find out. My idea is to have a code repository and an active code dir. If a file is needed, it can be pulled from the repository.
A smaller set of files in the active dir would be much easier to manage, especially for newbies.
I'm thinking of an SQL table with one row per file access, recording the access type: edited the file, ran the file, "required" or "used" the file, chmodded it, etc. Maybe once a day, a job could read the access info for each file and move it to or from the active code directory depending on when it was last used. Or maybe the decision should be based on frequency of use rather than last use.
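To make the daily job a bit more concrete, here is a rough sketch of the "last used" decision only. Everything in it is an assumption for illustration: the name plan_moves, the 30-day cutoff, and the idea that the access table has already been boiled down to one "most recent access" time per file. It uses core Perl only and does not touch the filesystem.

```perl
use strict;
use warnings;

# Split files into "keep active" and "send back to the repository" lists.
# $last_used:     hashref of file name => epoch seconds of most recent access
#                 (hypothetical: e.g. the result of one MAX(access_time)
#                 query per file against the access table).
# $max_idle_days: how long a file may sit unused before it is archived.
sub plan_moves {
    my ($last_used, $max_idle_days, $now) = @_;
    $now //= time;
    my $cutoff = $now - $max_idle_days * 86_400;    # 86,400 seconds per day

    my (@active, @archive);
    for my $file (sort keys %$last_used) {
        if ($last_used->{$file} >= $cutoff) { push @active,  $file }
        else                                { push @archive, $file }
    }
    return (\@active, \@archive);
}

# Made-up sample data: one file used yesterday, one used 90 days ago.
my $now       = time;
my %last_used = (
    'report.cgi'    => $now - 1 * 86_400,
    'old_import.pl' => $now - 90 * 86_400,
);
my ($active, $archive) = plan_moves(\%last_used, 30, $now);
print "active:  @$active\n";
print "archive: @$archive\n";
```

A frequency-based version would take a count per file instead of a timestamp and compare it against a threshold in the same way.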
So I'm wondering: are there CPAN modules to assist with this? I can have the repository as part of @INC, and as files are used, a MySQL row is added. Starting with 100% of the files in the repository, after weeks of running, the source files we need would be in active. The rest would be inactive and maybe, after a long enough time, removed. And since this table has the potential to get large, we might have to cull old rows. A thousand files invoked a thousand times a month is a million rows a month, which could be a BIG table!
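For the @INC part, one thing core Perl already allows is putting a code reference into @INC; perl calls it for each require/use lookup that reaches it. Below is a minimal sketch of using that as the logging point. The names log_require and @ACCESS_LOG are made up, and the in-memory array stands in for the MySQL insert, so treat it as an illustration rather than a finished design.

```perl
use strict;
use warnings;

# Hypothetical in-memory log; the real thing would INSERT into the access table.
our @ACCESS_LOG;

# Called by perl for every require/use lookup that reaches this @INC entry.
# Returning an empty list declines the request, so perl carries on with the
# remaining @INC directories and loads the file normally.
sub log_require {
    my ($self, $file) = @_;    # $file looks like "Foo/Bar.pm"
    push @ACCESS_LOG, {
        file   => $file,
        script => $0,
        time   => time,
        type   => 'required',
    };
    return;
}

# Install the hook at compile time so that later "use" statements see it too.
BEGIN { unshift @INC, \&log_require }

use Text::ParseWords ();    # looked up via the hook first, then loaded as usual

printf "%s asked for %s\n", $_->{script}, $_->{file} for @ACCESS_LOG;
```

Modules that are already in %INC are not looked up again, so the hook only sees the first load of each file per process.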
I'm thinking of modules that I could add to the top of every source file to signal a write to the access table. And maybe modules that help me tell which classes (.pm files) are actually invoked or resourced, rather than just "used" with no real use. Or a CPAN module that can generate an array of all the used or required files in the hierarchy under the initial .pl or .cgi.
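For the "array of everything loaded" part, the closest thing I know of in core Perl is the %INC hash, which records each file loaded via use or require in the current process. A small sketch, assuming a made-up helper name loaded_files and printing to STDERR where the real version would write to the access table:

```perl
use strict;
use warnings;

# Return every file loaded so far via use/require as
# [ "Foo/Bar.pm", "/full/path/to/Foo/Bar.pm" ] pairs.
sub loaded_files {
    return map { [ $_, $INC{$_} // '(failed to load)' ] } sort keys %INC;
}

# Report once, when the program exits.
END {
    printf STDERR "%s loaded %s from %s\n", $0, @$_ for loaded_files();
}
```

This only shows what was loaded during one particular run, not whether anything in the module was actually called, and not what a static scan of the source would find.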
I'm pretty sure I can write all the functionality from scratch, but if CPAN modules exist that facilitate this, I'm all in. I guess in summary I want something like a meta library that tells me a bunch of things about what is running, what it's using, where it came from, maybe how many milliseconds it ran, what user ran it, etc. I see potential concurrency issues with a running program that has one of its .pm files moved mid-run; I'm not sure how to deal with that since we operate 24x7. And there are other concerns. I'm still in the dreaming stage.
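As a rough idea of the per-run record I'm picturing (script, user, elapsed milliseconds, number of files loaded), here is a sketch using only core modules. The name run_report and the field names are invented, and the getpwuid call assumes a Unix-like system.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);    # sub-second resolution for time()

my $started = time;

# Build one record describing this run; $start and $end are epoch seconds.
sub run_report {
    my ($start, $end) = @_;
    return {
        script     => $0,
        user       => scalar(getpwuid($<)) // $<,    # fall back to the numeric uid
        elapsed_ms => int(($end - $start) * 1000),
        files      => scalar(keys %INC),             # files loaded via use/require
    };
}

# Emit the record at exit; a real version would write a row to the table.
END {
    my $r = run_report($started, time);
    printf STDERR "%s run by %s: %d ms, %d files loaded\n",
        @{$r}{qw(script user elapsed_ms files)};
}
```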
Best Regards Monks!