Features

Simian runs under any Java 2 (1.4 or higher) Java Virtual Machine (JVM) and any .NET 1.1 or higher environment, meaning Simian can be run on anything from Windows, macOS and Linux to z/OS.

The distribution contains everything you need to be up and running in minutes:

Aslak Hellesoy has kindly donated a Maven plugin.

Neil Bartlett has kindly donated an Eclipse plugin.

Simian fully supports the following languages:

with partial support for the following languages:

If the file is not of a supported type, it is treated as plain text. This means that you can usually run Simian on just about any type of human-readable file with good results.

Ignores whitespace, curly braces, comments, imports, includes, package declarations, etc.
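To make the idea of "ignored" lines concrete, here is a minimal sketch of the kind of filtering described above, written for Java-like source. It is an illustration only, not Simian's implementation: the function name `significant_lines` is hypothetical, and the handling is deliberately naive (only block comments that start a line are recognised, and `//` inside a string literal would be treated as a comment).

```python
def significant_lines(source):
    """Return the lines of Java-like source left after naive filtering.

    Hypothetical helper for illustration; not part of Simian.
    """
    kept = []
    in_block_comment = False
    for raw in source.splitlines():
        line = raw.strip()
        if in_block_comment:
            if "*/" not in line:
                continue  # still inside a /* ... */ comment
            in_block_comment = False
            line = line.split("*/", 1)[1].strip()
        if line.startswith("/*"):
            if "*/" not in line[2:]:
                in_block_comment = True
                continue
            line = line.split("*/", 1)[1].strip()
        line = line.split("//", 1)[0]                      # drop line comments
        line = line.replace("{", "").replace("}", "")      # drop curly braces
        line = " ".join(line.split())                      # collapse whitespace
        if not line or line.startswith(("import ", "package ")):
            continue  # blank, import or package declaration
        kept.append(line)
    return kept
```

Comparing the length of the returned list with the number of raw lines gives a rough feel for the "significant" versus "raw" line counts that appear in the report summary.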

Supports the following processing options:

| Option | Languages | Default | Possible values | Description |
| --- | --- | --- | --- | --- |
| formatter | all | none | plain, xml, emacs, vs (visual studio), yaml | Specifies the format in which processing results will be produced. |
| threshold | all | 6 | integer >= 2 | Matches will contain at least the specified number of lines. |
| language | n/a | none | java, c#, cs, csharp, c, c++, cpp, cplusplus, js, javascript, cobol, abap, rb, ruby, vb, jsp, html, xml, groovy, asm390 | Assumes all files are in the specified language. |
| defaultLanguage | n/a | none | java, c#, cs, csharp, c, c++, cpp, cplusplus, js, javascript, cobol, abap, rb, ruby, vb, jsp, html, xml, groovy, asm390 | Assumes files are in the specified language if none can be inferred. |
| failOnDuplication | all | true | boolean | Causes the checker to fail the current process if duplication is detected. |
| reportDuplicateText | all | false | boolean | Prints the duplicate text in reports. |
| ignoreBlocks | all | none | string | Ignores all lines between specified START/END markers. |
| ignoreCurlyBraces | Java, C#, C, C++, JavaScript, Ruby, Groovy | false | boolean | Curly braces are ignored. |
| ignoreIdentifiers | Java, C#, C, C++, JavaScript, COBOL, Ruby, Groovy | false | boolean | Completely ignores all identifiers. |
| ignoreIdentifierCase | Java, C#, C, C++, JavaScript, COBOL, Ruby, Groovy | true | boolean | Matches identifiers irrespective of case. E.g. MyVariableName and myvariablename would both match. |
| ignoreRegions | C# | false | boolean | Ignores lines between #region/#endregion. |
| ignoreStrings | Java, C#, C, C++, JavaScript, COBOL, Ruby, SQL, Groovy | false | boolean | "abc" and "def" would both match. |
| ignoreStringCase | Java, C#, C, C++, JavaScript, COBOL, Ruby, SQL, Groovy | true | boolean | "Hello, World" and "HELLO, WORLD" would both match. |
| ignoreNumbers | Java, C#, C, C++, JavaScript, COBOL, Ruby, SQL, Groovy | false | boolean | int x = 1; and int x = 576; would both match. |
| ignoreCharacters | Java, C#, C, C++, JavaScript, COBOL, Ruby, Groovy | false | boolean | 'A' and 'Z' would both match. |
| ignoreCharacterCase | Java, C#, C, C++, JavaScript, COBOL, Ruby, Groovy | true | boolean | 'A' and 'a' would both match. |
| ignoreLiterals | Java, C#, C, C++, JavaScript, COBOL, Ruby, SQL, Groovy | false | boolean | 'A', "one" and 27.8 would all match. |
| ignoreSubtypeNames | Java, C, Groovy | false | boolean | BufferedReader, StringReader and Reader would all match. |
| ignoreModifiers | Java, C#, C, C++, JavaScript, Groovy | true | boolean | Ignores modifiers such as public, protected, static, etc. |
| ignoreVariableNames | Java, C, Groovy | false | boolean | Completely ignores variable names (field, parameter and local). E.g. int foo = 1; and int bar = 1; would both match. |
| balanceParentheses | Java, C#, C, C++, JavaScript, COBOL, Ruby, SQL, Groovy | false | boolean | Ensures that expressions inside parentheses that are split across multiple physical lines are considered as one. |
| balanceCurlyBraces | Ruby | false | boolean | Ensures that expressions inside curly braces that are split across multiple physical lines are considered as one. |
| balanceSquareBrackets | Java, C#, C, C++, JavaScript, Ruby, Groovy | false | boolean | Ensures that expressions inside square brackets that are split across multiple physical lines are considered as one. |
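The effect of a few of these options can be sketched as a per-line normalisation step applied before lines are compared. The following is an illustrative approximation only and does not reflect how Simian is implemented: the `Options` class, the `normalize` function and the regular expressions are all assumptions made for this example, and only four of the options are modelled (with defaults taken from the table above).

```python
import re
from dataclasses import dataclass


@dataclass
class Options:
    # Hypothetical container; defaults mirror the options table.
    ignore_numbers: bool = False
    ignore_strings: bool = False
    ignore_string_case: bool = True
    ignore_identifier_case: bool = True


# Very rough tokens: double-quoted strings, numbers, identifiers.
TOKEN = re.compile(r'''
    (?P<string>"(?:\\.|[^"\\])*")
  | (?P<number>\b\d+(?:\.\d+)?\b)
  | (?P<ident>[A-Za-z_]\w*)
''', re.VERBOSE)


def normalize(line, opts):
    """Rewrite a line so that 'ignored' differences disappear."""
    def repl(match):
        text = match.group(0)
        if match.lastgroup == "string":
            if opts.ignore_strings:
                return '""'  # every string literal looks the same
            return text.lower() if opts.ignore_string_case else text
        if match.lastgroup == "number":
            return "0" if opts.ignore_numbers else text
        # Otherwise the token is an identifier or keyword.
        return text.lower() if opts.ignore_identifier_case else text

    # Collapse whitespace after substituting tokens.
    return " ".join(TOKEN.sub(repl, line).split())


opts = Options(ignore_numbers=True)
print(normalize("int x = 1;", opts) == normalize("int x = 576;", opts))  # True
```

Two lines are then considered equal when their normalised forms are equal, so `int x = 1;` and `int x = 576;` compare equal only when `ignore_numbers` is switched on.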

Recognises the following file extensions/language options:

| Language | Extensions |
| --- | --- |
| java | java |
| c sharp | cs, c#, csharp |
| c | c, h, m |
| cpp | cpp, c++, hpp, cplusplus |
| ruby | rb, ruby |
| cobol | cobol |
| abap | abap |
| xml | xml, xsl, xsd |
| jsp | jsp |
| asp | asp |
| javascript | js, javascript |
| html | html, htm |
| vb | vb, bas, cls, frm |
| lisp | lisp, lsp |
| groovy | groovy |
| text | this is the default when no appropriate language can be determined |
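The lookup implied by this table, combined with the `language` and `defaultLanguage` options, might look something like the sketch below. This is an assumption about the behaviour rather than Simian's actual logic: the function `infer_language` and the dictionary name are hypothetical, and every entry in the Extensions column is treated here as a file extension.

```python
import os

# Extension -> language, transcribed from the table above.
EXTENSION_TO_LANGUAGE = {
    "java": "java",
    "cs": "c sharp", "c#": "c sharp", "csharp": "c sharp",
    "c": "c", "h": "c", "m": "c",
    "cpp": "cpp", "c++": "cpp", "hpp": "cpp", "cplusplus": "cpp",
    "rb": "ruby", "ruby": "ruby",
    "cobol": "cobol",
    "abap": "abap",
    "xml": "xml", "xsl": "xml", "xsd": "xml",
    "jsp": "jsp",
    "asp": "asp",
    "js": "javascript", "javascript": "javascript",
    "html": "html", "htm": "html",
    "vb": "vb", "bas": "vb", "cls": "vb", "frm": "vb",
    "lisp": "lisp", "lsp": "lisp",
    "groovy": "groovy",
}


def infer_language(path, language=None, default_language=None):
    """Pick a language for a file (hypothetical helper).

    `language` forces one language for all files; `default_language`
    is used only when the extension is not recognised; otherwise the
    file is treated as plain text.
    """
    if language:
        return language
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    return EXTENSION_TO_LANGUAGE.get(ext) or default_language or "text"
```

For instance, `infer_language("src/Main.java")` yields `"java"`, while a file with an unrecognised extension falls back to `default_language` if one is given and to `"text"` otherwise.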

Sample Output

Here is an example of the standard output produced by Simian (version 2.2.23) when run against the JDK 1.5.0_13 source code:

Similarity Analyser 2.2.23 - http://www.harukizaemon.com/simian
Copyright (c) 2003-11 Simon Harris.  All rights reserved.
Simian is not free unless used solely for non-commercial or evaluation purposes.
{failOnDuplication=true, ignoreCharacterCase=true, ignoreCurlyBraces=true, ignoreIdentifierCase=true, ignoreModifiers=true, ignoreStringCase=true, threshold=6}
Found 6 duplicate lines in the following files:
 Between lines 201 and 207 in simian/build/dist/src/java/awt/image/WritableRaster.java
 Between lines 1305 and 1311 in simian/build/dist/src/java/awt/image/Raster.java
Found 6 duplicate lines in the following files:
 Between lines 920 and 926 in simian/build/dist/src/com/sun/imageio/plugins/jpeg/JFIFMarkerSegment.java
 Between lines 908 and 914 in simian/build/dist/src/com/sun/imageio/plugins/jpeg/JFIFMarkerSegment.java
Found 6 duplicate lines in the following files:
 Between lines 553 and 558 in simian/build/dist/src/java/net/URLStreamHandler.java
 Between lines 1262 and 1267 in simian/build/dist/src/java/net/URL.java
 Between lines 1245 and 1250 in simian/build/dist/src/java/net/URL.java
 Between lines 656 and 661 in simian/build/dist/src/java/net/URL.java
Found 6 duplicate lines in the following files:
 Between lines 509 and 514 in simian/build/dist/src/java/util/concurrent/ConcurrentHashMap.java
 Between lines 413 and 418 in simian/build/dist/src/java/util/concurrent/ConcurrentHashMap.java
...
Found 167 duplicate lines in the following files:
 Between lines 7172 and 7579 in simian/build/dist/src/javax/swing/JTable.java
 Between lines 1016 and 1273 in simian/build/dist/src/javax/swing/table/JTableHeader.java
Found 199 duplicate lines in the following files:
 Between lines 6380 and 6854 in simian/build/dist/src/javax/swing/JTable.java
 Between lines 7181 and 7655 in simian/build/dist/src/javax/swing/JTable.java
Found 216 duplicate lines in the following files:
 Between lines 48 and 451 in simian/build/dist/src/org/omg/CosNaming/_NamingContextStub.java
 Between lines 203 and 606 in simian/build/dist/src/org/omg/CosNaming/_NamingContextExtStub.java
Found 232 duplicate lines in the following files:
 Between lines 22 and 343 in simian/build/dist/src/com/sun/corba/se/PortableActivationIDL/_ServerManagerStub.java
 Between lines 17 and 338 in simian/build/dist/src/com/sun/corba/se/PortableActivationIDL/_ActivatorStub.java
Found 66375 duplicate lines in 5949 blocks in 1260 files
Processed a total of 390309 significant (1196065 raw) lines in 4242 files
Processing time: 9.490sec

To see the full results* for the JDK 1.5.0_13 source code, download the compressed file.

* Results may vary depending on factors such as hardware used, number of duplicate lines, etc.
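Because the plain output shown above follows a regular line pattern, it can be post-processed with a few lines of script. The sketch below is based solely on the sample output on this page; the function name `parse_report` and the returned structure are assumptions for illustration, and other formatters or versions may produce different text.

```python
import re

# Patterns derived from the sample output above.
FOUND = re.compile(r"^Found (\d+) duplicate lines in the following files:$")
BETWEEN = re.compile(r"^\s*Between lines (\d+) and (\d+) in (.+)$")


def parse_report(text):
    """Collect duplicate blocks from plain-format output (hypothetical helper)."""
    blocks = []
    for line in text.splitlines():
        found = FOUND.match(line)
        if found:
            # Start a new block; locations follow on later lines.
            blocks.append({"lines": int(found.group(1)), "locations": []})
            continue
        between = BETWEEN.match(line)
        if between and blocks:
            start, end, path = between.groups()
            blocks[-1]["locations"].append((int(start), int(end), path))
    return blocks
```

Each returned block holds the reported number of duplicate lines and a list of `(start, end, path)` tuples. The final summary line ("Found ... duplicate lines in ... blocks in ... files") does not match either pattern and is therefore skipped.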


Java and all Java-based marks are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries.

.NET and all .NET-based marks are trademarks or registered trademarks of Microsoft® in the United States and other countries.

Copyright (c) 2003-2011 Simon Harris. All rights reserved.