Technical Debt Report: November, 2012
TechDebt.org is the first collaborative & open benchmarking dashboard on Technical Debt and Software Quality. The site provides you with several metrics regarding the technical debt for a large panel of applications. Information is anonymously collected thanks to Scertify Refactoring Assessment, an open-source plugin for Sonar. Whenever an audit is performed on Sonar, the plugin anonymously collects some metrics and passes them to the site.
This is the fourth report based on TechDebt.org. If you have checked the site lately, you will have noticed that it has been completely redesigned. It not only has a much more attractive interface but also provides many additional metrics. We are going to take you through the various panels, metrics and charts it offers. We hope this will help you understand and analyze technical debt, with more figures, more details and more knowledge.

Published at DZone with permission of its author, Michael Muller.

Database evolution
The first thing to notice is a new chart that shows how the database has evolved since the beginning. As you can see, the number of lines of code keeps growing. There are now almost 32,900 projects tracked, for a total of 410 MLOC (as of November 27th).
Illustration #1 Database evolution over time
With this new version of TechDebt, you can now see the distribution of lines of code (LOC) by language. It used to be available in monthly reports like this one, but now you can see it directly on the site. As illustration 2 shows, Java is still the most represented language in the database, with 63% of all lines of code. C#, C++ and JavaScript are also well represented, with 7%, 11% and 11% of LOC respectively.

Illustration #2 LOC by language
Metrics' history
Another interesting feature of the new TechDebt.org is that, for every metric, you can see how it has evolved since last month. This way, you can see whether the overall quality of the applications tends to improve or decline. For example, the average number of blocker violations per audit (chart 4) has been increasing so far: it is now 147.74. A blocker violation can be a security flaw, a bug, a performance issue, and so on. Such violations should be dealt with as soon as they appear in the code.

Illustration #3 Average code duplication in an audit

Illustration #4 Average number of blocker violations in an audit
Code coverage by unit tests and code duplication (chart 3) have also been going in the wrong direction. Code duplication in particular has grown significantly. This is dangerous because it makes the code expensive to maintain and evolve: any modification must be repeated in every copy, and each bug must be fixed in several places.
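To make the cost of duplication concrete, here is a minimal sketch. The functions, the discount rule and its threshold are all hypothetical and are not taken from any application in the database; they only illustrate why a duplicated rule has to be fixed in several places, while a shared helper has a single point of change.

```python
# Hypothetical example: the same discount rule copied into two functions.
# Changing the rule (or fixing a bug in it) means editing both copies.
def invoice_total_duplicated(amount: float) -> float:
    return amount * 0.9 if amount > 100 else amount


def quote_total_duplicated(amount: float) -> float:
    return amount * 0.9 if amount > 100 else amount


# Refactored: the rule lives in one helper, so there is one place to change.
def apply_discount(amount: float) -> float:
    """Apply the (hypothetical) 10% discount above a threshold of 100."""
    return amount * 0.9 if amount > 100 else amount


def invoice_total(amount: float) -> float:
    return apply_discount(amount)


def quote_total(amount: float) -> float:
    return apply_discount(amount)
```

Both versions behave the same today; the difference only shows up when the rule has to change.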
On the other hand, rules compliance and complexity (chart 5) are quite stable. Complexity increased in the early days of the database, but it is now stable and we can even see a small improvement. Of course, this is a really small improvement, but that's better than nothing. We will keep an eye on this in the following weeks and hope to see a real decrease.

Illustration #5 Average rules compliance and code complexity of applications
Detailed analysis of technical debt
In the Technical Debt panel, you have access to the “Technical Debt density by Technology”. As a reminder, we define technical debt as the time needed to correct defects. This can easily be converted to dollars once you know the cost of an hour of work. The technical debt density is defined as the average technical debt in 1,000 lines of code, i.e. the time needed to correct the defects in 1KLOC.
One could be tempted to compare languages with each other to see which one has the highest technical debt density. However, this is a tricky move. Defects in code are found by code analysis tools that check a set of rules, and so far the database contains many more Java rules (908) than C++ rules (132). To see the rules for each language, you can go to the Rules panel. As a result, more defects are detected in Java than in C++. It is better to compare programs written in the same language.
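The two definitions above (density per 1,000 lines of code and conversion to dollars) can be sketched as follows. This is only an illustration: the function names, the sample application and the hourly rate of $50 are assumptions made for the example, not values used by TechDebt.org.

```python
def debt_density_hours_per_kloc(debt_hours: float, lines_of_code: int) -> float:
    """Average technical debt, in hours, per 1,000 lines of code."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return debt_hours / (lines_of_code / 1000)


def debt_cost(debt_hours: float, hourly_rate: float) -> float:
    """Convert a technical debt expressed in hours into money."""
    return debt_hours * hourly_rate


# Hypothetical application: 120 hours of debt in 20,000 lines of code,
# with an assumed cost of $50 per hour of work.
density = debt_density_hours_per_kloc(120, 20_000)
cost = debt_cost(120, 50.0)
print(f"{density} hours per KLOC, ${cost:,.2f} in total")
```

Because the density is normalized by size, it lets you compare a small and a large application written in the same language.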

Illustration #6 Technical debt density by language
According to chart 6, there are nearly 6 hours of technical debt in 1KLOC of Java code. This is huge! On the same panel, you can get more detailed information, such as the technical debt detected for each rule, or the distribution of technical debt by rule criticity (chart 7). As this chart shows, 60% of technical debt in Java is due to errors with a criticity from minor to major. That's probably why technical debt is allowed to spread across applications: individually, a minor (or even major) violation is not so dangerous, so one is tempted not to correct it. But when you look at the big picture, those mistakes accumulate and cost a lot.

Illustration #7 Technical debt by language and criticity
Fortunately, some of this technical debt can be eliminated at low cost, which brings us to the next panel: Debt write-off.
Debt write-off thanks to automatic refactoring
Debt write-off is the process of eliminating technical debt at low cost through automatic refactoring. Some of the technical debt is due to coding mistakes that can be corrected automatically using Scertify, the code analysis and refactoring solution developed by Tocea. This panel is dedicated to the analysis of the debt write-off potential.
So far, Scertify is able to refactor Java applications, so this panel is dedicated to the Java language. Illustration 8 is a screenshot of the write-off summary. As you can see, automatic refactoring can eliminate 45.2% of the applications' technical debt. This represents a saving of 775 years, or $67.41M. Quite a lot of time saved, isn't it?

Illustration #8 Debt write-off possibility
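The write-off summary figures can be cross-checked with a little arithmetic. The sketch below assumes that the 45.2% share and the 775 years refer to the same total Java debt, which the summary suggests but does not state explicitly; the variable names are ours.

```python
# Figures quoted from the write-off summary (illustration 8).
WRITE_OFF_SHARE = 0.452       # share of debt removable automatically
WRITE_OFF_YEARS = 775         # time saved, in years
WRITE_OFF_DOLLARS = 67.41e6   # money saved, in dollars

# Assumption: both figures are computed on the same total debt.
total_debt_years = WRITE_OFF_YEARS / WRITE_OFF_SHARE
remaining_debt_years = total_debt_years - WRITE_OFF_YEARS

# Cost of one year of correction work implied by the two savings figures.
dollars_per_year = WRITE_OFF_DOLLARS / WRITE_OFF_YEARS

print(f"Implied total debt: about {total_debt_years:.0f} years")
print(f"Debt left after write-off: about {remaining_debt_years:.0f} years")
print(f"Implied cost of one year of work: about ${dollars_per_year:,.0f}")
```

Under that assumption, more than half of the debt still has to be corrected by hand, which is why the remaining panels matter.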
There is another interesting chart that shows the possible write-off for each rule criticity (chart 9). As you can see, blocker violations are hard to correct automatically, because they are often due to particular cases and very specific situations. Minor and info violations are often the easiest to correct, and their correction can be automated most of the time. It is interesting to note that critical violations can be automatically refactored quite efficiently. This is a big deal, since those violations heavily impact an application's robustness, performance or maintainability.

Illustration #9 Refactoring opportunities by criticity
Finally, on the same panel you get access to all refactorable rules, sorted by criticity and number of violations. You can browse the list to get a detailed view of refactoring opportunities. The set of rules that can be refactored keeps growing as Tocea adds rules to Scertify, so keep an eye on this chart and watch for new refactoring possibilities. Also, if you have ideas for new rules that should be implemented, please contact us. We are open to suggestions!
Detailed analysis of violations
There is one panel we haven't talked about yet: the Rules panel. It is dedicated to the analysis of violations in all languages, with a high level of detail. For the language you are interested in, you can browse a list of violations sorted by number of occurrences. You can also see the distribution of violations by criticity.
As we can see on the Java chart (illustration 10), the rules with the lowest criticity have the most violations. On the other hand, blocker and critical violations represent only 2% and 6% of violations respectively. This is to be expected, since highly critical errors should be less common and treated with more care. However, that 2% represents 1.2M violations, which should not be disregarded: there is room for improvement!

Illustration #10 Java violations by criticity
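To put these percentages into absolute numbers, here is a short calculation. It relies only on the figures quoted above (2% of blockers, 6% of criticals, 1.2M blocker violations); since the percentages are rounded, the results are rough orders of magnitude rather than exact counts, and the variable names are ours.

```python
# Figures quoted from the Java chart (illustration 10).
BLOCKER_PERCENT = 2
CRITICAL_PERCENT = 6
BLOCKER_COUNT = 1_200_000

# If 2% of all Java violations are 1.2M blockers, the total follows directly.
total_violations = BLOCKER_COUNT * 100 // BLOCKER_PERCENT

# The same total gives an estimate of the number of critical violations.
critical_count = total_violations * CRITICAL_PERCENT // 100

print(f"Estimated total Java violations: {total_violations:,}")
print(f"Estimated critical violations:   {critical_count:,}")
```

Even a small share of a very large total is a lot of work, which is the point made in the paragraph above.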
If we take a look at the C# chart (illustration 11), we can see that almost all violations have a criticity of major. It seems that C# rules repositories are not finely configured. So here's a quick message to Scertify Refactoring Assessment users who audit C# applications: you should take some time to configure the criticity of your rules. This may seem like a pain, but it's worth the effort. It will allow you to quickly identify important violations and to set up efficient correction plans. Soon enough, you'll be glad to have done this configuration.

Illustration #11 C# violations by criticity
Conclusion
This fourth report is coming to an end. It has presented the new features of TechDebt.org. The site now lets you track the database's growth over time. You also have access to more detailed information about the languages present in the database. Thanks to the metrics' history, you can track how each metric changes and see how technical debt evolves. Through new figures and charts, you can get into the details of technical debt and see what can be automatically refactored thanks to the Debt write-off panel. Finally, you can browse detailed lists of violations to precisely identify the sources of technical debt.
We hope you will find valuable information in this new version.
As always, if you have any remarks or if you would like to see specific things in next month's review, please let us know: contact@tocea.com



