Re^2: exports -- which module exports are used? (parsing)
by tye (Cardinal) on Sep 17, 2013 at 14:45 UTC
I mentioned this in my write-up. So far I actually like seeing the false positives (except sometimes for POSIX.pm because it exports such a huge number of simple English words) because sometimes they call out out-of-date comments, out-of-date logging, or commented-out code. Even in the case of POSIX, the overhead of stripping out the false positives by hand is little work, IME.
I could use PPI or a compiler back-end (such as B::Xref) or an op-tree inspector. But not only would that skip the so-called "false positives" that I'm interested in (as well as the annoying real false positives), but it would also miss uses inside of simple eval $string constructs.
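To make the trade-off concrete, here is a minimal sketch of a purely textual scan. It is not the tool under discussion; the helper name find_symbol_mentions and the sample source are made up for illustration, and it assumes the exported symbols are plain function names (no sigils).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original tool): report every line of
# $source where one of the given names appears as a whole word. The scan
# is purely textual, so comments and strings match just like real calls.
sub find_symbol_mentions {
    my ($source, @symbols) = @_;
    return () unless @symbols;

    # Assumes plain function names; \b would not anchor correctly on sigils.
    my $alternation = join '|', map { quotemeta($_) } @symbols;
    my $word_re     = qr/\b($alternation)\b/;

    my @hits;
    my $line_no = 0;
    for my $line (split /\n/, $source) {
        $line_no++;
        while ($line =~ /$word_re/g) {
            push @hits, { line => $line_no, symbol => $1, text => $line };
        }
    }
    return @hits;
}

# Made-up sample: one real call, one stale comment, one eval'd string.
my $source = <<'END_SOURCE';
use POSIX;
my $n = floor(3.7);
# TODO: switch from ceil to floor
my $code = 'my $m = ceil(1.2);';
eval $code;
END_SOURCE

for my $hit (find_symbol_mentions($source, qw(floor ceil))) {
    printf "%d: %s  (%s)\n", $hit->{line}, $hit->{symbol}, $hit->{text};
}
```

The scan reports the stale comment on line 3 as well as the ceil call hidden in the string on line 4. An op-tree based approach would drop the comment, and it would also miss the call that only exists once eval $code runs.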
But I am considering using a compiler back-end to separate out the unambiguous uses that can be summarized more succinctly and keep the current output format for only the "likely false positives". B::Xref sounds almost tailor-made for this type of application.
However, after running B::Xref on a couple of example modules and scripts, I found that not only did its output contain a lot of information that is not applicable (which wasn't a surprise), but I found a shocking 0% of the desired information in the output. I also found quite a bit of information that looked simply invalid.
Now having looked at B::Xref more extensively, I suspect that I just couldn't find the relevant information amid the huge volume of uninteresting or bizarre information (but I also didn't do the deeper look on as many examples). Unfortunately, giving it the "-d" option not only eliminated the information it is documented to remove (none of which I was interested in), but reduced the output to a tiny amount of information, excluding everything applicable to this problem. This leaves me a bit wary of the soundness of the module.
So, by picking a single symbol so that I could filter out the irrelevant information, I was able to look closely, and B::Xref seemed to find all of the uses of that symbol. Unfortunately, the line numbers are only accurate to the "statement" level, which makes lining these up with the string matches I already find significantly more challenging. But if I only show the "false positives" for symbols that have no known positives, then that is probably workable.
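The "only show textual matches for symbols with no known positives" idea might look roughly like the sketch below. Everything in it is hypothetical: the two input lists are made-up stand-ins for string-match results and for symbols confirmed by a compiler back-end, and unconfirmed_hits is an illustrative name. Matching by symbol name alone sidesteps the statement-level line numbers.

```perl
use strict;
use warnings;

# Made-up textual matches (symbol plus line), e.g. from a string search.
my @text_hits = (
    { symbol => 'floor', line => 12 },
    { symbol => 'floor', line => 40 },
    { symbol => 'ceil',  line => 27 },
    { symbol => 'pause', line => 3  },
);

# Made-up list of symbols that a compiler back-end confirmed as used.
my @confirmed = qw(floor);

# Hypothetical helper: keep only textual hits whose symbol has no
# confirmed use. Line numbers are deliberately not compared.
sub unconfirmed_hits {
    my ($hits, $confirmed) = @_;
    my %is_confirmed = map { $_ => 1 } @$confirmed;
    return grep { !$is_confirmed{ $_->{symbol} } } @$hits;
}

print "confirmed uses: @confirmed\n";
for my $hit (unconfirmed_hits(\@text_hits, \@confirmed)) {
    print "possible false positive: $hit->{symbol} at line $hit->{line}\n";
}
```

The cost of this shortcut is that a stale comment mentioning floor would no longer be shown, because floor also has a confirmed use.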
I also hope to evaluate PPI in relation to this. I have other nefarious uses I hope to execute via PPI, after all.
However, the work required is significant enough, and the potential gain minor enough, that I won't be surprised if I never get around to producing any useful improvements. I only have so many files of Perl code to run this against and the major benefit is from the first run. I never had a need for this tool for my own code.
But I do plan to put this on GitHub, which will make it easy for others to cooperate in extending it however they see fit.