Welcome!

Java Authors: Elizabeth White, Pat Romanski, Liz McMillan, Carmen Gonzalez, Robert Reeves

Related Topics: Open Source, Java, XML, SOA & WOA, .NET, AJAX & REA

Open Source: Article

Abstractness vs Instability: A Neo4j Case Study

Discover Abstractness vs Instability graph

Robert C.Martin wrote an interesting article about a set of metrics that can be used to measure the quality of an object-oriented design in terms of the interdependence between the subsystems of that design.

Here's from the article what he said about the interdependence between modules:

What is it that makes a design rigid, fragile and difficult to reuse. It is the interdependence of the subsystems within that design. A design is rigid if it cannot be easily changed. Such rigidity is due to the fact that a single change to heavily interdependent software begins a cascade of changes in dependent modules. When the extent of that cascade of change cannot be predicted by the designers or maintainers the impact of the change cannot be estimated. This makes the cost of the change impossible to estimate. Managers, faced with such unpredictability, become reluctant to authorize changes. Thus the design becomes rigid.

And to fight the rigidity he introduce metrics like Afferent coupling, Efferent coupling, Abstractness and Instability.

Let's discover all these metrics and how they could be very useful to improve the design of applications. For that let's analyze Neo4j by JArchitect.

Neo4j is a robust transactional property graph database. Due to its graph data model, Neo4j is highly agile and blazing fast. For connected data operations, Neo4j runs a thousand times faster than relational databases.

And here's the dependency graph between all Neo4j jars

neo4j1

Neo4j contains many jars and all of them depend on neo4j-kernel, and to have more details about the weight of using each jar, The DSM (Dependency Structure Matrix) is a compact way to represent and navigate across dependencies between components.

neo4j6

As the matrix shows the neo4j kernel is heavily used by the other jars.

Afferent Coupling:
The number of types outside this project that depend on types within this project.

Let's execute the following CQLinq query to get the afferent coupling of Neo4j jars:

from p in Projects where !p.IsThirdParty select new { p,p.NbTypesUsingMe }

neo4j2

As discovered before the kernel is the more solicited by the other jars.

Efferent Coupling
The number of types outside this project used by types of this project.

from p in Projects where !p.IsThirdParty select new { p,p.NbTypesUsed }

neo4j3

The efferent coupling and afferent coupling could be applied also on packages and types. For example the efferent coupling for a particular type is the number of types it directly depends on. Types where TypeCe is very high are types that depend on too many other types. They are complex and in general have more than one responsibility.

Abstractness
The ratio of the number of internal abstract types (i.e abstract classes and interfaces) to the number of internal types. The range for this metric is 0 to 1, with A=0 indicating a completely concrete project and A=1 indicating a completely abstract project

A = Na / Nc

Where:

A = abstractness of a module
Zero is a completely concrete module. One is a completely abstract module.
Na = number of abstract classes in the module.
Nc = number of concrete classes in the module.

Let's take as example the neo4j-kernel-1.8.2 jar and search for all abstract types.

from t in Types where t.IsAbstract || t.IsInterface select new { t, t.NbLinesOfCode }

neo4j4

neo4j-kernel-1.8.2 has 1071 types so the Abstractness is equal to 233/1071 = 0.21755

To increase the abstractness of a project we have to add more abstract classes or interfaces.

Instability
The ratio of efferent coupling (Ce) to total coupling. I = Ce / (Ce + Ca). This metric is an indicator of the project's resilience to change. The range for this metric is 0 to 1, with I=0 indicating a completely stable project and I=1 indicating a completely instable project.

I = Ce/(Ce + Ca)
I represent the degree of instability associated with a project.
Ca represents the afferent coupling, or incoming dependencies, and
Ce represents the efferent coupling, or outgoing dependencies

Let's take as example a class where many other classes used it and it not use any other class. In this case this class is considered as stable for the following reasons:

- This kind of class depends upon nothing at all, so a change from a depended cannot ripple up to it and cause it to change. This characteristic is called “Independence”. Independent classes are classes which do not depend upon anything else.

- It's depended upon by many other classes. It became harder to make changes to it. And if we were to change it we would have to change all the other classes that depended upon it. Thus, there is a great deal of force preventing us from changing these classes, and enhancing their stability.

Classes that are heavily depended upon are called “Responsible”. Responsible classes tend to be stable because any change has a large impact.

Let's search for the more responsible types by executing the following CQLinq query

(from t in Types orderby t.NbTypesUsingMe descending, t.NbTypesUsed descending, t.NbBCInstructions descending select new { t, t.NbTypesUsingMe,t.NbTypesUsed})

neo4j5

The Expression class is the most popular one.

Abstractness vs Instability Graph and the zone of pain

To have more details about this graph you can refer to the Robert C.Martin article.

Here's the graph for Neo4j framework

AbstractnessVSInstability

The idea behind this graph is that the more a code element of a program is popular, the more it should be abstract. Or in other words, avoid depending too much directly on implementations, depend on abstractions instead. By popular code element I mean a project (but the idea works also for packages and types) that is massively used by other projects of the program. It is not a good idea to have concrete types very popular in your code base. This provokes some Zones of Pains in your program, where changing the implementations can potentially affect a large portion of the program. And implementations are known to evolve more often than abstractions.

The main sequence line (dotted) in the above diagram shows the how abstractness and instability should be balanced. A stable component would be positioned on the left. If you check the main sequence you can see that such a component should be very abstract to be near the desirable line - on the other hand, if its degree of abstraction is low, it is positioned in an area that is called the "zone of pain".

For example neo4j kernel has many classes depending on it, so it's positioned on the left and in this case it's preferable to be more abstract to leave the orange zone and goes to the green zone.

What's important is to avoid the zone of pain, if a jar is inside this zone, any changes to it will impact a lot of classes and it became hard to maintain or evolve this module.

More Stories By Lahlali Issam

Lahlali Issam Lead Developer at JavaDepend, a tool to manage and understand complex Java code. With JavaDepend, software quality can be measured using Code Metrics, visualized using Graphs and Treemaps, and queried using CQL language, a SQL like to query the code base.

@ThingsExpo Stories
The industrial software market has treated data with the mentality of “collect everything now, worry about how to use it later.” We now find ourselves buried in data, with the pervasive connectivity of the (Industrial) Internet of Things only piling on more numbers. There’s too much data and not enough information. In his session at @ThingsExpo, Bob Gates, Global Marketing Director, GE’s Intelligent Platforms business, to discuss how realizing the power of IoT, software developers are now focused on understanding how industrial data can create intelligence for industrial operations. Imagine ...
The Internet of Things is tied together with a thin strand that is known as time. Coincidentally, at the core of nearly all data analytics is a timestamp. When working with time series data there are a few core principles that everyone should consider, especially across datasets where time is the common boundary. In his session at Internet of @ThingsExpo, Jim Scott, Director of Enterprise Strategy & Architecture at MapR Technologies, discussed single-value, geo-spatial, and log time series data. By focusing on enterprise applications and the data center, he will use OpenTSDB as an example t...
Today’s enterprise is being driven by disruptive competitive and human capital requirements to provide enterprise application access through not only desktops, but also mobile devices. To retrofit existing programs across all these devices using traditional programming methods is very costly and time consuming – often prohibitively so. In his session at @ThingsExpo, Jesse Shiah, CEO, President, and Co-Founder of AgilePoint Inc., discussed how you can create applications that run on all mobile devices as well as laptops and desktops using a visual drag-and-drop application – and eForms-buildi...
There is no doubt that Big Data is here and getting bigger every day. Building a Big Data infrastructure today is no easy task. There are an enormous number of choices for database engines and technologies. To make things even more challenging, requirements are getting more sophisticated, and the standard paradigm of supporting historical analytics queries is often just one facet of what is needed. As Big Data growth continues, organizations are demanding real-time access to data, allowing immediate and actionable interpretation of events as they happen. Another aspect concerns how to deliver ...
Scott Jenson leads a project called The Physical Web within the Chrome team at Google. Project members are working to take the scalability and openness of the web and use it to talk to the exponentially exploding range of smart devices. Nearly every company today working on the IoT comes up with the same basic solution: use my server and you'll be fine. But if we really believe there will be trillions of these devices, that just can't scale. We need a system that is open a scalable and by using the URL as a basic building block, we open this up and get the same resilience that the web enjoys.
The 3rd International Internet of @ThingsExpo, co-located with the 16th International Cloud Expo - to be held June 9-11, 2015, at the Javits Center in New York City, NY - announces that its Call for Papers is now open. The Internet of Things (IoT) is the biggest idea since the creation of the Worldwide Web more than 20 years ago.
In their session at @ThingsExpo, Shyam Varan Nath, Principal Architect at GE, and Ibrahim Gokcen, who leads GE's advanced IoT analytics, focused on the Internet of Things / Industrial Internet and how to make it operational for business end-users. Learn about the challenges posed by machine and sensor data and how to marry it with enterprise data. They also discussed the tips and tricks to provide the Industrial Internet as an end-user consumable service using Big Data Analytics and Industrial Cloud.
Things are being built upon cloud foundations to transform organizations. This CEO Power Panel at 15th Cloud Expo, moderated by Roger Strukhoff, Cloud Expo and @ThingsExpo conference chair, addressed the big issues involving these technologies and, more important, the results they will achieve. Rodney Rogers, chairman and CEO of Virtustream; Brendan O'Brien, co-founder of Aria Systems, Bart Copeland, president and CEO of ActiveState Software; Jim Cowie, chief scientist at Dyn; Dave Wagstaff, VP and chief architect at BSQUARE Corporation; Seth Proctor, CTO of NuoDB, Inc.; and Andris Gailitis, C...
How do APIs and IoT relate? The answer is not as simple as merely adding an API on top of a dumb device, but rather about understanding the architectural patterns for implementing an IoT fabric. There are typically two or three trends: Exposing the device to a management framework Exposing that management framework to a business centric logic Exposing that business layer and data to end users. This last trend is the IoT stack, which involves a new shift in the separation of what stuff happens, where data lives and where the interface lies. For instance, it's a mix of architectural styles ...
The Industrial Internet revolution is now underway, enabled by connected machines and billions of devices that communicate and collaborate. The massive amounts of Big Data requiring real-time analysis is flooding legacy IT systems and giving way to cloud environments that can handle the unpredictable workloads. Yet many barriers remain until we can fully realize the opportunities and benefits from the convergence of machines and devices with Big Data and the cloud, including interoperability, data security and privacy.
Technology is enabling a new approach to collecting and using data. This approach, commonly referred to as the "Internet of Things" (IoT), enables businesses to use real-time data from all sorts of things including machines, devices and sensors to make better decisions, improve customer service, and lower the risk in the creation of new revenue opportunities. In his General Session at Internet of @ThingsExpo, Dave Wagstaff, Vice President and Chief Architect at BSQUARE Corporation, discuss the real benefits to focus on, how to understand the requirements of a successful solution, the flow of ...
Cloud Expo 2014 TV commercials will feature @ThingsExpo, which was launched in June, 2014 at New York City's Javits Center as the largest 'Internet of Things' event in the world.
"People are a lot more knowledgeable about APIs now. There are two types of people who work with APIs - IT people who want to use APIs for something internal and the product managers who want to do something outside APIs for people to connect to them," explained Roberto Medrano, Executive Vice President at SOA Software, in this SYS-CON.tv interview at Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Performance is the intersection of power, agility, control, and choice. If you value performance, and more specifically consistent performance, you need to look beyond simple virtualized compute. Many factors need to be considered to create a truly performant environment. In his General Session at 15th Cloud Expo, Harold Hannon, Sr. Software Architect at SoftLayer, discussed how to take advantage of a multitude of compute options and platform features to make cloud the cornerstone of your online presence.
In this Women in Technology Power Panel at 15th Cloud Expo, moderated by Anne Plese, Senior Consultant, Cloud Product Marketing at Verizon Enterprise, Esmeralda Swartz, CMO at MetraTech; Evelyn de Souza, Data Privacy and Compliance Strategy Leader at Cisco Systems; Seema Jethani, Director of Product Management at Basho Technologies; Victoria Livschitz, CEO of Qubell Inc.; Anne Hungate, Senior Director of Software Quality at DIRECTV, discussed what path they took to find their spot within the technology industry and how do they see opportunities for other women in their area of expertise.
Wearable devices have come of age. The primary applications of wearables so far have been "the Quantified Self" or the tracking of one's fitness and health status. We propose the evolution of wearables into social and emotional communication devices. Our BE(tm) sensor uses light to visualize the skin conductance response. Our sensors are very inexpensive and can be massively distributed to audiences or groups of any size, in order to gauge reactions to performances, video, or any kind of presentation. In her session at @ThingsExpo, Jocelyn Scheirer, CEO & Founder of Bionolux, will discuss ho...
DevOps Summit 2015 New York, co-located with the 16th International Cloud Expo - to be held June 9-11, 2015, at the Javits Center in New York City, NY - announces that it is now accepting Keynote Proposals. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long development cycles that produce software that is obsolete at launch. DevOps may be disruptive, but it is essential.
Almost everyone sees the potential of Internet of Things but how can businesses truly unlock that potential. The key will be in the ability to discover business insight in the midst of an ocean of Big Data generated from billions of embedded devices via Systems of Discover. Businesses will also need to ensure that they can sustain that insight by leveraging the cloud for global reach, scale and elasticity.
“With easy-to-use SDKs for Atmel’s platforms, IoT developers can now reap the benefits of realtime communication, and bypass the security pitfalls and configuration complexities that put IoT deployments at risk,” said Todd Greene, founder & CEO of PubNub. PubNub will team with Atmel at CES 2015 to launch full SDK support for Atmel’s MCU, MPU, and Wireless SoC platforms. Atmel developers now have access to PubNub’s secure Publish/Subscribe messaging with guaranteed ¼ second latencies across PubNub’s 14 global points-of-presence. PubNub delivers secure communication through firewalls, proxy ser...
We’re no longer looking to the future for the IoT wave. It’s no longer a distant dream but a reality that has arrived. It’s now time to make sure the industry is in alignment to meet the IoT growing pains – cooperate and collaborate as well as innovate. In his session at @ThingsExpo, Jim Hunter, Chief Scientist & Technology Evangelist at Greenwave Systems, will examine the key ingredients to IoT success and identify solutions to challenges the industry is facing. The deep industry expertise behind this presentation will provide attendees with a leading edge view of rapidly emerging IoT oppor...