Software Archeology: What Is It and Why Should Java Developers Care?

The Java language is very mature and most new Java projects aren't from scratch

The process of Software Archeology can really save a significant amount of work, or in many cases, rework. So what challenges do all developers face when asked to do these kinds of projects?

  • What have I just inherited?
  • What pieces should be saved?
  • Where are the scary sections of the code?
  • What kind of development team created this?
  • Where are the performance spots I should worry about?
  • What's missing that will most likely cause me significant problems downstream in the development process?
The overall approach is broken into a six-step process. By the time a team is finished, and has reviewed what is there and what is not, this process can drastically help define the go-forward project development strategy. The six steps include:
  • Visualization: a visual representation of the application's design.
  • Design Violations: an understanding of the health of the object model.
  • Style Violations: an understanding of the state the code is currently in.
  • Business Logic Review: the ability to test the existing source.
  • Performance Review: where are the bottlenecks in the source code?
  • Documentation: does the code have adequate documentation for people to understand what they're working on?
Most developers regard these steps as YAP (Yet Another Process), but in reality many of them should already be part of a developer's daily routine, so the process shouldn't be too overwhelming. The next question is whether these tasks can be done by hand. From a purely technical point of view, the answer is yes, but in today's world of shorter timelines and elevated user expectations, the time needed to do this by hand is unacceptable.

So if it really can't be done by hand, what tools do I need to get the job done? Let's break down the process step-by-step and look at the tools that could be used to complete the task. Some advanced IDEs exist that include all of these tools and there are open source-based tools that may be able to do some parts of the job.

Visualization is the first step to understanding what kind of code the developer will be working with. It always amazes me how many developers have never looked at a visualization of the code they've written. Many times key architecture issues can be discovered just by looking at an object diagram of the system. Things like relationships between objects and levels of inheritance can be a real eye-opener. The old adage is true: a picture can be worth 1,000 lines of code. For an object-oriented language like Java, UML diagrams seem to be widely used and understood. A tool that can reverse-engineer the code into a class diagram is the first one that's needed. Later in the process it will be important to be able to reverse-engineer methods into sequence or communication diagrams for a better understanding of extremely complex classes and methods.
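As a rough illustration of what a reverse-engineering tool extracts, the sketch below uses Java reflection to list a class's inheritance depth and its relationships in a PlantUML-like arrow notation. The class name `ClassDiagramSketch` and its methods are hypothetical and written only for this example; a real UML tool works from source or bytecode and recovers far more detail.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists inheritance and field relationships via reflection. */
final class ClassDiagramSketch {

    /** Number of superclasses between the given class and the top of its hierarchy. */
    static int inheritanceDepth(Class<?> type) {
        int depth = 0;
        for (Class<?> c = type.getSuperclass(); c != null; c = c.getSuperclass()) {
            depth++;
        }
        return depth;
    }

    /** One text line per relationship: extends, implements, and object-typed fields. */
    static List<String> relationships(Class<?> type) {
        List<String> lines = new ArrayList<>();
        Class<?> parent = type.getSuperclass();
        if (parent != null && parent != Object.class) {
            lines.add(parent.getSimpleName() + " <|-- " + type.getSimpleName());
        }
        for (Class<?> iface : type.getInterfaces()) {
            lines.add(iface.getSimpleName() + " <|.. " + type.getSimpleName());
        }
        for (Field field : type.getDeclaredFields()) {
            // Skip compiler-generated, static, and primitive fields; they add noise to a diagram.
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) continue;
            if (field.getType().isPrimitive()) continue;
            lines.add(type.getSimpleName() + " --> "
                    + field.getType().getSimpleName() + " : " + field.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println("Inheritance depth of ArrayList: " + inheritanceDepth(ArrayList.class));
        relationships(ArrayList.class).forEach(System.out::println);
    }
}
```

Even this small amount of output, run across an inherited code base, tends to surface unusually deep hierarchies and classes that hold references to many others.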

Once visualization of the system is done and reviewed, the next step is reviewing the system from a design violation standpoint. This can be done by using static code metrics. Using metrics gives the developer or team a way to check the health of the object design. Basic system knowledge like lines of code (LOC) or the ever-important cyclomatic complexity (CC) can give a lot of information to the reviewer.
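To make the two metrics concrete, here is a minimal sketch that approximates them from raw source text. It is an assumption-laden simplification: `MetricsSketch` is a hypothetical name, lines of code are counted as non-blank lines, and cyclomatic complexity is estimated as the number of branching keywords and operators plus one. Because it matches text rather than parsing Java, keywords inside comments or string literals are counted too; real metrics tools work on a parsed syntax tree.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: rough, text-based metrics for a single method body. */
final class MetricsSketch {

    // Constructs that each add one more path through the code.
    private static final Pattern DECISION =
            Pattern.compile("\\b(if|for|while|case|catch)\\b|&&|\\|\\||\\?");

    /** Counts lines that contain something other than whitespace. */
    static int linesOfCode(String source) {
        int count = 0;
        for (String line : source.split("\n")) {
            if (!line.trim().isEmpty()) count++;
        }
        return count;
    }

    /** Approximates cyclomatic complexity as decision points plus one. */
    static int cyclomaticComplexity(String methodBody) {
        Matcher matcher = DECISION.matcher(methodBody);
        int decisions = 0;
        while (matcher.find()) decisions++;
        return decisions + 1;
    }

    public static void main(String[] args) {
        String body = "if (a > 0 && b > 0) {\n"
                + "  for (int i = 0; i < a; i++) { total += i; }\n"
                + "}\n"
                + "\n"
                + "return total;";
        System.out.println("LOC: " + linesOfCode(body));
        System.out.println("CC (approx.): " + cyclomaticComplexity(body));
    }
}
```

Sorting methods by the complexity estimate is one simple way to build the list of areas that deserve a closer look.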

Many developers have no idea how big or small the application they're working on is or where the most complex parts of the application are located. Using a select number of metrics, developers can pinpoint "trouble" areas; these should be marked for further review, because those are normally the areas that will need to be modified. Methods that have been marked as overly complex can be analyzed further by generating sequence diagrams. These diagrams offer a condensed graphical representation and make it much easier for developers and management to understand the task of updating or changing the methods. Another valuable metric is JUnit test coverage (JUC). Inherited code often comes with few or no JUnit tests, and that should raise major concerns about making changes to the system. The biggest concern will most likely be how to ensure that changes made to the code or the fixes implemented are correct and don't break other parts of the system. By using the information generated by the metrics tools, developers get a better understanding of what's been inherited and some of the complications around the product.

Style violations help complete the picture of the inherited code. Many developers argue that static code audits should be run first, and this is true from a new project perspective. However, when inheriting massive amounts of code, running metrics first usually gives more object health-based information. Once the health of the object design is determined and can point to various areas of the code that need significant work, the audits can further refine that knowledge.

Static code audits include all kinds of rule checks that look for code consistency, standards, and bad practices. Audit tools like ours include 200+ audits and will help in understanding the complexity of the application under review. Advanced audit tools include rules for finding things like god classes, god methods, feature envy, and shotgun surgery. These advanced audits actually use some of the metrics to give the reviewers more information. Take god methods, for example. A god method has taken on too much responsibility from an object design standpoint; it tends to be long, complex, and called from everywhere, so making changes to that one method could have a dramatic effect on the entire system. Feature envy is a different kind of problem: a method that relies more on another class's data and methods than on its own, and that should probably be refactored into the class it depends on. When estimating the amount of time to give to a particular enhancement or determining what kind of code has been inherited, this kind of low-level understanding is worth a lot.
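One way an audit can turn a metric into a warning is to count how many distinct callers each method has and flag those above a threshold. The sketch below assumes the call graph has already been extracted into a map from caller to callees; the class `FanInAuditSketch`, the method names in the sample data, and the threshold are all made up for illustration and are not taken from any particular audit tool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical audit: flags methods that are called from many places. */
final class FanInAuditSketch {

    /** For each callee, counts how many distinct callers reference it. */
    static Map<String, Integer> fanIn(Map<String, Set<String>> callGraph) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Set<String> callees : callGraph.values()) {
            for (String callee : callees) {
                counts.merge(callee, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Returns callees whose caller count meets the threshold, in alphabetical order. */
    static List<String> flagHighFanIn(Map<String, Set<String>> callGraph, int threshold) {
        List<String> flagged = new ArrayList<>();
        fanIn(callGraph).forEach((method, callers) -> {
            if (callers >= threshold) flagged.add(method);
        });
        return flagged;
    }

    public static void main(String[] args) {
        // Made-up call graph: caller -> methods it calls.
        Map<String, Set<String>> graph = new HashMap<>();
        graph.put("OrderService.place", new HashSet<>(Arrays.asList("Util.process", "Db.save")));
        graph.put("InvoiceService.bill", new HashSet<>(Arrays.asList("Util.process")));
        graph.put("ReportJob.run", new HashSet<>(Arrays.asList("Util.process", "Db.load")));
        System.out.println("Review before changing: " + flagHighFanIn(graph, 3));
    }
}
```

A method flagged this way is not necessarily badly designed, but any change to it deserves extra test coverage because so much of the system depends on it.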

Business logic review focuses on the testability of an application. By using advanced metrics, the amount of testing available can be determined in a few minutes. Inheriting a large amount of code and finding that no unit tests exist for it is going to have a dramatic effect on estimates for enhancements, or make the developers realize they probably don't have a way to verify that any changes to the system are correct. The tools needed for testing business logic should include a code coverage product and an integrated unit testing product like JUnit. Having one of the two is okay, but having both opens a lot of new testing possibilities. First, running the unit tests under a code coverage tool verifies that the code meant to be tested is actually exercised. Code coverage can also be used when you don't have the advanced audit tools discussed above, and a good code coverage tool will show all classes and methods included in the run of the test. Using an advanced audit like shotgun surgery will highlight a method that has a lot of dependencies, but using unit testing and code coverage together ensures that changes to these types of methods can be fully tested and verified. Another advantage of a code coverage tool is found in QA, which can run the product testing scripts while code coverage is turned on. This will tell them two things: whether the test script is complete and whether there's test coverage for all of the application's code. The good thing about this piece of Software Archeology is that usually it can only get better. By adding additional tests, the end result should be a better running system.
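When inherited code has no tests, a common first move is to pin down what it does today before changing anything. The sketch below shows the idea in plain Java so that it stays self-contained; in a real project these checks would normally be written as JUnit tests and run under a coverage tool. `LegacyPricing` is a hypothetical stand-in for inherited code, and the expected values were worked out from its current behavior rather than from any specification.

```java
/** Hypothetical stand-in for an inherited class that arrived without tests. */
final class LegacyPricing {

    /** Inherited logic: 10% off when 10 or more units are ordered. Prices are in cents. */
    static long total(long unitPriceCents, int quantity) {
        long gross = unitPriceCents * quantity;
        return quantity >= 10 ? gross - gross / 10 : gross;
    }
}

/** Records the current behavior so later changes can be verified against it. */
final class LegacyPricingCharacterization {

    /** Fails loudly if observed behavior drifts from what was recorded. */
    static void check(long expected, long actual, String label) {
        if (expected != actual) {
            throw new AssertionError(label + ": expected " + expected + " but was " + actual);
        }
    }

    static void run() {
        check(250, LegacyPricing.total(250, 1), "single unit, no discount");
        check(2250, LegacyPricing.total(250, 9), "just below the discount threshold");
        check(2700, LegacyPricing.total(250, 12), "discount applied");
        // Integer division truncates the discount; this pins that detail down too.
        check(1971, LegacyPricing.total(199, 11), "discount rounding");
    }

    public static void main(String[] args) {
        run();
        System.out.println("Current behavior matches the recorded values.");
    }
}
```

Once checks like these exist, a coverage report shows which branches of the inherited method they reach and which still have nothing verifying them.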

A good profiler is key to performance review. Using the tools and results from the business logic review, performance issues can be uncovered and fixed. A useful rule of thumb is that a small fraction of the code, often put at around 5%, causes most performance issues. So having a handle on where code is complex makes ongoing maintenance faster and easier.
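A profiler does this work properly, but the underlying idea can be sketched in a few lines: time sections of code, then find the few sections that account for most of the elapsed time. The class `HotSpotSketch`, the section labels, and the workloads below are hypothetical; `System.nanoTime` gives only coarse wall-clock measurements and is no substitute for a real profiler.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: records coarse timings and ranks the slowest sections. */
final class HotSpotSketch {

    /** Runs the action and adds its elapsed nanoseconds to the label's running total. */
    static void time(Map<String, Long> timings, String label, Runnable action) {
        long start = System.nanoTime();
        action.run();
        timings.merge(label, System.nanoTime() - start, Long::sum);
    }

    /** Returns the fewest labels, slowest first, whose combined time reaches the given share of the total. */
    static List<String> hotSpots(Map<String, Long> timings, double share) {
        long total = 0;
        for (long nanos : timings.values()) total += nanos;

        List<Map.Entry<String, Long>> sorted = new ArrayList<>(timings.entrySet());
        sorted.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));

        List<String> result = new ArrayList<>();
        long running = 0;
        for (Map.Entry<String, Long> entry : sorted) {
            if (running >= share * total) break;
            result.add(entry.getKey());
            running += entry.getValue();
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> timings = new LinkedHashMap<>();
        // Made-up workloads standing in for sections of an inherited application.
        time(timings, "parse", () -> "a,b,c".split(","));
        time(timings, "render", () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 200_000; i++) sb.append(i);
        });
        System.out.println("Sections covering 80% of the time: " + hotSpots(timings, 0.8));
    }
}
```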

The last step is documentation. Doing all this work is great for the developer, reviewer, or team trying to understand the system. It would be great if that work could be captured and used going forward. Having an automatic documentation generator saves time, reduces overhead, and helps ensure the documentation is up-to-date. This will make it easier for new members joining a team or for the application to be passed to another team.

The ideas around Software Archeology are fairly straightforward; this article took the approach of inheriting a large amount of code and then being responsible for that code. Other expeditions into the code could produce useful design patterns, great algorithms to reuse, or major things to avoid. We all know that software is an asset, so using Software Archeology can ensure we get the most out of that investment.

More Stories By Mike Rozlog

Mike Rozlog is with Embarcadero Technologies. In this role, he is focused on ensuring the family of Delphi developer products being created by Embarcadero meets the expectations of developers around the world. Much of his time is dedicated to discussing and explaining the technical and business aspects of Embarcadero’s products and services to analysts and other audiences worldwide. Mike was formerly with CodeGear, a developer tools group that was acquired by Embarcadero in 2008. Previously, he spent more than eight years working for Borland in a number of positions, including a primary role as Chief Technical Architect. A reputed author, Mike has been published numerous times. His latest collaboration is Mastering JBuilder from John Wiley & Sons, Inc.
