Re: how to get rid of cut-and-paste sins?

by hossman (Prior)
on Feb 08, 2008 at 23:58 UTC (#667112)

in reply to how to get rid of cut-and-paste sins?

As noted, there has been some fairly extensive research into "Copy Paste Detection". (Side note: Alex Aiken was by far my favorite professor in college.)

The big problem with a lot of naive approaches to copy-paste detection is that whole chunks of code are rarely duplicated verbatim: frequently one version gets modified, variable names are changed, lines are inserted, etc.
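To make that concrete, here is a rough sketch of one common way around the renaming problem: compare normalized token streams instead of raw text. This is not how CPD or any particular tool does it. The tiny lexer, the `VAR`/`NUM` placeholders, and the function names `normalize` and `shared_windows` are all made up for illustration, and a real tool would need a proper tokenizer for the language.

```python
import re

# Hypothetical, deliberately tiny lexer: Perl-style sigiled variables,
# bare words/numbers, and any other single non-space character.
TOKEN_RE = re.compile(r"[$@%]\w+|\w+|\S")


def normalize(source):
    """Return the token list with variable names and integers replaced by placeholders."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if tok[0] in "$@%" and len(tok) > 1:
            tokens.append("VAR")   # any variable name looks the same
        elif tok.isdigit():
            tokens.append("NUM")   # any integer literal looks the same
        else:
            tokens.append(tok)     # keywords and punctuation are kept as-is
    return tokens


def shared_windows(a, b, k=5):
    """Return the set of k-token windows that occur in both snippets."""
    def windows(tokens):
        return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}
    return windows(normalize(a)) & windows(normalize(b))


original = 'my $total = 0; foreach my $x (@items) { $total += $x * 2; }'
renamed = 'my $sum = 0; foreach my $val (@list) { $sum += $val * 3; }'

# The raw strings differ, but the normalized token streams are identical.
print(original == renamed)
print(normalize(original) == normalize(renamed))
```

Comparing fixed-size windows of tokens rather than whole snippets is what lets a copy with a few inserted lines still show up: the windows on either side of the insertion continue to match.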

The PMD project (a Java counterpart to Perl::Critic) has a CPD subproject that has gone through several iterations and algorithms. It's implemented in Java and doesn't currently seem to support Perl, but it is free, and adding support for a new language is (in theory) really straightforward if you know some Java and can implement a simple Tokenizer interface.
