Does anyone know of a script (or scripts) to analyze Perl code? Desired functionality would include pointing out subs that are declared and never used, flagging subs that contain very similar code, noting 'use' statements that import something that is never actually used, etc.
I have tried Googling and searching on PerlMonks, but I can't seem to find such a script, even though this seems like something that people would have already written.
Sounds like you're looking for something like Perl::Critic? Another similar idea is lint; there probably are modules for linting Perl.
"there probably are modules for linting perl." You mean like B::Lint? :)
Have you tried B::Lint?
What do you mean by "subs that are declared and never used"? Do you mean a prototype without the actual sub, or a sub that's present but never referenced?
I suspect that autoloading would make it difficult to determine whether a used module's sub is ever actually called. Different data may also result in different paths through the module's code, so the sub may be called when $x = 1.0 but not when $x = 1.5.
The bit about "subs that contain very similar code" is likely impossible to do programmatically. First, it's non-trivial to define "similar" code: functionally identical code may look quite different, and compile to something quite different. Second, there may be very sensible reasons why sub move and sub draw differ only in that one sets $pencolor = 0; and the other sets $pencolor = -1; (once, long ago, I wrote a FORTRAN-77 library to emulate the Tektronix PLOT-10 package; it had quite a few subroutines that differed in just that way).
My suggestion for the "subs that contain similar code" is to print the (named) subs individually, get a bunch of highlighters, and apply judgement.
emc
Outside of a dog, a book is man's best friend. Inside of a dog it's too dark to read.
A sub may exist or not depending on how the code runs, too.
Eval, autoload, symbol table changes, and code filters in @INC can all (re)define a subroutine as the code runs. Don't expect any sort of tool to catch that sort of thing; they can't.
Don't expect perfection from these tools; they give a decent effort on code that's sufficiently conventional, but they can't do much with really exotic stuff.
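To make the point about run-time definitions concrete, here is a minimal sketch using only core Perl. The package name Shape and the sub names area and draw are made up for illustration. Neither sub appears as an ordinary "sub NAME" declaration, so a tool that only reads the source text would have little chance of noticing either one.

```perl
use strict;
use warnings;

# Hypothetical package, for illustration only.
package Shape;

our $AUTOLOAD;

# Catch calls to subs that do not exist yet and define them on the fly.
sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;                # strip the package qualifier
    return if $name eq 'DESTROY';     # never autoload destructors

    no strict 'refs';
    *{"Shape::$name"} = sub { return "called $name" };
    goto &{"Shape::$name"};           # re-dispatch to the newly defined sub
}

package main;

# String eval: the name of the sub is only known at run time.
my $sub_name = 'area';
eval "sub $sub_name { return 42 }; 1" or die $@;

print "area defined: ", (defined &main::area ? "yes" : "no"), "\n";
print "Shape::draw defined before call: ",
    (defined &Shape::draw ? "yes" : "no"), "\n";
print Shape::draw(), "\n";
print "Shape::draw defined after call: ",
    (defined &Shape::draw ? "yes" : "no"), "\n";
```

Shape::draw only comes into existence the first time it is called, so whether it "exists" depends on which path the program takes.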
$pencolor is passed in rather than hardcoded. Paper and highlighter? I asked the question because I don't want to waste time doing that. I feel that 'copy and paste' style code leads to code bloat and very, very fragile code.
Ideally, you're right: code like that should be refactored unless there's a good reason not to do so (like a requirement to maintain subroutine interfaces, as it was in the pencolor case I mentioned. The code was actually in Fortran-66, and I detest ENTRY statements).
It would, however, be quite difficult to have a program determine that two subs are doing the same thing, using different code, like one calculating factorials recursively and another doing so with a loop.
emc
Experience is a hard teacher because she gives the test first, the lesson afterwards.
"It would, however, be quite difficult to have a program determine that two subs are doing the same thing, using different code, like one calculating factorials recursively and another doing so with a loop."
So difficult as to be impossible in the general case, since a general solution would also solve the halting problem.
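The factorial case mentioned above can be written out as a small sketch. The sub names fact_recursive and fact_loop are invented for this example. The two bodies share almost no text, yet they compute the same values, and the loop at the end only spot-checks a handful of inputs; it proves nothing about equivalence in general.

```perl
use strict;
use warnings;

# Recursive definition.
sub fact_recursive {
    my ($n) = @_;
    return $n <= 1 ? 1 : $n * fact_recursive($n - 1);
}

# Iterative definition: same results, almost no text in common.
sub fact_loop {
    my ($n) = @_;
    my $result = 1;
    $result *= $_ for 2 .. $n;
    return $result;
}

# Agreement on a few inputs is evidence, not a proof of equivalence.
for my $n (0 .. 10) {
    die "mismatch at $n" unless fact_recursive($n) == fact_loop($n);
}
print "both subs agree for 0..10\n";
```

A textual diff of these two subs would report them as unrelated, which is why any similarity report needs a human to interpret it.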
While not being precisely what you asked for, nevertheless I think you might benefit from some of the replies to these threads:
Analyzing large Perl code base.
Becoming familiar with a too-big codebase?
And you might also wish to have a look at DoxyFilt (Doxygen for Perl).
HTH,
It just looks for (and lists) the creation of subroutines, and within each one (and in "main") it looks for (and lists) subs that are called.
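For readers who want a starting point, here is a rough sketch of that kind of scan. It is not the script described above; the sub name possibly_unused_subs and the sample code are made up, and the approach is purely textual (regexes, no real parsing). It will be fooled by strings, POD, method calls via variables, and everything mentioned earlier in the thread about eval and AUTOLOAD.

```perl
use strict;
use warnings;

# Return the names of subs that are declared in $source but whose name
# appears nowhere else in it. Purely textual, so treat results as hints.
sub possibly_unused_subs {
    my ($source) = @_;

    # Drop full-line comments so commented-out calls do not count.
    $source =~ s/^\s*#.*$//mg;

    # Collect every "sub NAME" declaration.
    my @declared;
    while ($source =~ /^\s*sub\s+(\w+)/mg) {
        push @declared, $1;
    }

    my @unused;
    for my $name (@declared) {
        # Count bare-word occurrences that are not variables ($name, @name, %name).
        my $count = () = $source =~ /(?<![\$\@%\w])\Q$name\E\b/g;
        push @unused, $name if $count <= 1;    # only the declaration itself
    }
    return @unused;
}

# Hypothetical sample input.
my $sample = <<'END_PERL';
sub used   { return helper(1) }
sub helper { return $_[0] + 1 }
sub orphan { return "never called" }
# orphan() appears only in this comment
print used(), "\n";
END_PERL

print "possibly unused: $_\n" for possibly_unused_subs($sample);
```

On the sample above this reports only orphan. Anything it flags on real code should be checked by hand before deleting.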
As for your other goals: You might be able to look at my example, or others suggested above, and work out a way to run diffs in some appropriate manner on pairs of subs, to list pairs that are mostly similar -- but the result has to be presented in some way that makes it easy to scan manually, because manual judgment will be needed here.
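One crude way to run that kind of pairwise comparison is sketched below. It assumes the sub bodies have already been extracted into a hash (the %bodies hash, the plot_to call, and the 0.5 threshold are all invented for illustration), and it scores each pair by the fraction of distinct lines they share. This is a simple Jaccard measure over lines, not a real diff.

```perl
use strict;
use warnings;

# Turn a sub body into a set of trimmed, non-empty lines.
sub line_set {
    my ($body) = @_;
    my %set;
    for my $line (split /\n/, $body) {
        $line =~ s/^\s+|\s+$//g;
        $set{$line} = 1 if length $line;
    }
    return \%set;
}

# Jaccard similarity: shared lines divided by all distinct lines (0 to 1).
sub similarity {
    my ($body_a, $body_b) = @_;
    my ($set_a, $set_b) = (line_set($body_a), line_set($body_b));
    my %union = (%$set_a, %$set_b);
    return 0 unless %union;
    my $shared = grep { exists $set_b->{$_} } keys %$set_a;
    return $shared / scalar(keys %union);
}

# Hypothetical, already-extracted sub bodies.
my %bodies = (
    move => <<'END',
    my ($x, $y) = @_;
    $pencolor = 0;
    plot_to($x, $y);
END
    draw => <<'END',
    my ($x, $y) = @_;
    $pencolor = -1;
    plot_to($x, $y);
END
    usage => <<'END',
    print "usage: plot FILE\n";
    exit 1;
END
);

# Report every pair at or above the (arbitrary) threshold.
my @names = sort keys %bodies;
for my $i (0 .. $#names - 1) {
    for my $j ($i + 1 .. $#names) {
        my ($first, $second) = @names[$i, $j];
        my $score = similarity($bodies{$first}, $bodies{$second});
        printf "%-6s %-6s %.2f\n", $first, $second, $score if $score >= 0.5;
    }
}
```

With these sample bodies only the draw/move pair is reported, with a score of 0.50. Whether such a pair should be merged is still the judgment call described above.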
As for trying to identify and remove superfluous "use Some::Module" statements, I don't think it's worth the trouble -- these things are relatively harmless. If you have some sort of test harness for a script, such that you are sure every branch in the code is being exercised, you could try commenting out one "use" statement at a time and see which ones (if any) make no difference. But why bother? (If you don't have such a test harness, I would advise against this sort of approach.)
Devel::Cover might also be useful, although it is not, strictly speaking, a static analysis tool.
--"Devel::Cover could be just as useful, although, strictly speaking, it doesn't serve as a tool for statistical analysis."
I would say that, nevertheless, Devel::Cover provides a very easy solution to what the OP is asking for, whereas statistical analysis of the coverage would be something else entirely. Also, it is extremely difficult to provide a meaning to the latter, because randomness is not easily or usefully defined for input data, unless it is functionally random, e.g. noise.
-M
Free your mind
perlmonks.org content © perlmonks.org and adrianh, Anonymous Monk, Coldstone, friedo, graff, Jedaï, lorn, Moron, planetscape, swampyankee, Tanktalus