Optimizing Database Performance in J2EE Applications

The Java 2 Platform, Enterprise Edition (J2EE), is the platform of choice for implementing scalable and reliable enterprise applications from reusable components. But Java developers building enterprise-class J2EE applications face a quandary.

The object paradigm has proven ideal for modeling a wide variety of real-world scenarios. However, finding a Java-compatible data repository optimized for such applications has become a stumbling block. While object database management systems (ODBMSs) provide the convenience of transparent persistence of Java objects, their client-centric architecture has not scaled well in enterprise environments. Relational database management systems (RDBMSs) do scale well, but they force each object to be mapped into two-dimensional relational tables. The overhead of that mapping can reduce application performance to a crawl.

This article discusses the limits of using these two types of databases with Java and suggests a better alternative for J2EE - a hybrid database that combines the best features of both. Hybrid databases share with ODBMSs the ability to map data stored in back-end databases directly into an implementation-neutral Java representation. As with relational systems, hybrid databases can scale to meet the performance requirements of an enterprise-class J2EE application.

ODBMSs: The Hidden Headache of Transparent Persistence
Over the years, finding a database that's both Java-compatible and scalable enough for enterprise-class J2EE applications has not been easy. Ideally, a Java-compatible database should store Java objects whose classes have been declared "persistent-capable" and can be manipulated seamlessly by the Java language.

That has been the promise of ODBMSs, which made their appearance in the mid-1990s as a solution designed specifically for objects and thus better suited for object development. With ODBMSs, Java developers can define persistent Java classes in the same way transient Java classes are defined in the application.

An apparent advantage of pure object databases is the implementation of transparent persistence that automates the process of mapping persistent data objects into the data repository. With transparent persistence, you don't even have to alter your existing Java classes to describe the persistent data that's permanently stored in the database (see Listing 1). That means you don't have to decide ahead of time, usually during the design phase, which objects to include and exclude from the database.

Adding a new customer order into the database is as simple as creating a new object in Java. Persistent-capable objects are transient until attached to a persistent manager or to other persistent objects.

This convenience quickly becomes a nightmare, however, when developing scalable enterprise-class applications. In a typical application, objects are highly interconnected, and it's very important to know precisely which objects have been stored in the database and which have not. Consider an e-commerce application in which products, customers, and orders are all linked together (see Figure 1). The object model naturally captures the interrelationships of real-world applications. With transparent persistence, however, you wind up loading an entire closure of objects even though you want to access only a single object (see Figure 2): the programmer asks for one customer, but loading recursively follows every reference reachable from that customer and pulls a large portion of the database into memory. Loading unneeded data into the Java VM limits concurrency and scalability.

A simple customer query, for example, could also lock pending orders and products purchased, even though this data was not requested and will remain unchanged. Such "overloading" is not a noticeable problem within a standalone environment that manipulates a small amount of data. However, in an enterprise-class, multiuser, transaction-intensive application, large portions of data get locked and instantiated, limiting concurrency and scalability.
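
The closure effect described above can be sketched in a few lines of plain Java. This is a toy in-memory model, not any ODBMS API: the Entity class, the link helper, and the loadClosure method are hypothetical names introduced only for illustration. The sketch shows how asking for one customer reaches every object connected to it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical e-commerce model: every object only knows its direct references.
class Entity {
    final String label;
    final List<Entity> refs = new ArrayList<>();

    Entity(String label) { this.label = label; }

    // Link two objects in both directions, as in a typical object model.
    static void link(Entity a, Entity b) {
        a.refs.add(b);
        b.refs.add(a);
    }
}

class ClosureDemo {
    // Returns every object reachable from root: the set that a
    // reachability-based loader would instantiate (and possibly lock)
    // when the application asks for root alone.
    static Set<Entity> loadClosure(Entity root) {
        Set<Entity> loaded = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Entity> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            Entity e = pending.pop();
            if (loaded.add(e)) {
                pending.addAll(e.refs);
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        Entity alice = new Entity("customer:alice");
        Entity bob = new Entity("customer:bob");
        Entity order1 = new Entity("order:1");
        Entity order2 = new Entity("order:2");
        Entity book = new Entity("product:book");
        Entity.link(alice, order1);
        Entity.link(order1, book);
        Entity.link(book, order2);   // the same product appears in Bob's order
        Entity.link(order2, bob);

        // Asking for one customer reaches all five objects.
        System.out.println(loadClosure(alice).size());
    }
}
```

In this small graph a shared product is enough to connect two otherwise unrelated customers, which is why the loaded set grows with the connectivity of the data rather than with what the caller asked for.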

During the pilot phase of development, performance is usually acceptable since the system is not running under heavy computational loads. But with wider deployment and more users, transaction rates can slow unacceptably as a massive amount of data - much of it unneeded - fills the pipeline. In the end, transparent persistence leads to a performance black hole, requiring substantial work to improve scalability, increase concurrency, and reduce network traffic. To gain sufficient control over which objects stay persistent and which do not, developers must bypass the ODBMS's transparent persistence mechanism and use its proprietary API instead. They must first master that API and then invest many hours in a complex, trial-and-error tuning process that has no guarantee of success.

The hard lesson, often learned at company expense, is that the ODBMS used to validate a pilot application must be replaced by a relational database when the system goes into production. That's the programming equivalent of a heart transplant, setting development schedules back by months. As we will see, relational databases bring their own set of problems in terms of overhead, and can require 25-50% more Java code.

RDBMSs: The Frustration of Object-Relational Mapping
Relational databases hinder Java developers in other ways, but RDBMSs do have two major advantages: a long, successful track record of deployment in scalable, transaction-processing systems and a standard language, SQL. While the relational model works well enough in banking applications, where the row-and-column model reflects the two-dimensional world of ledgers and spreadsheets, it has proven more limited in tracking highly interconnected information. The relationship navigation commonly used in J2EE applications requires extensive use of multitable joins. But joins are computationally intensive, and each join is computed at runtime to link information on the fly (see Listing 2). Reconstructing an order object with its line items from row-and-column tables requires two SQL queries and a good deal of mapping code. The same operation in an object database would require only one call. Moreover, relational systems must rebuild the relationships between objects each time they're accessed, which substantially degrades performance.
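
To make the reconstruction cost concrete, here is a minimal sketch that rebuilds an order from two sets of rows. It uses in-memory lists in place of real tables so that it runs without a database; the row classes, field names, and the OrderMapper.load method are assumptions made for illustration. With a real RDBMS, each of the two lookups would be a separate SQL query issued through JDBC.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical row shapes standing in for two relational tables.
class OrderRow {
    final int id;
    final String customer;
    OrderRow(int id, String customer) { this.id = id; this.customer = customer; }
}

class LineItemRow {
    final int orderId;      // foreign key back to the order table
    final String product;
    final int quantity;
    LineItemRow(int orderId, String product, int quantity) {
        this.orderId = orderId;
        this.product = product;
        this.quantity = quantity;
    }
}

// The object the application actually wants to work with.
class Order {
    final int id;
    final String customer;
    final List<String> lines = new ArrayList<>();
    Order(int id, String customer) { this.id = id; this.customer = customer; }
}

class OrderMapper {
    // Lookup 1 finds the order row; lookup 2 collects its line items by
    // foreign key. The relationship is rebuilt on every call.
    static Order load(int orderId, List<OrderRow> orders, List<LineItemRow> items) {
        Order result = null;
        for (OrderRow row : orders) {
            if (row.id == orderId) {
                result = new Order(row.id, row.customer);
                break;
            }
        }
        if (result == null) {
            return null;    // no such order
        }
        for (LineItemRow item : items) {
            if (item.orderId == orderId) {
                result.lines.add(item.quantity + " x " + item.product);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<OrderRow> orders = Arrays.asList(new OrderRow(1, "alice"));
        List<LineItemRow> items = Arrays.asList(
                new LineItemRow(1, "book", 2), new LineItemRow(1, "lamp", 1));
        Order order = load(1, orders, items);
        System.out.println(order.customer + " " + order.lines);
    }
}
```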

In today's economy, where business intelligence is key, the Java object model provides a more powerful mechanism for capturing real-world relationships and concept commonalities. In the relational model the relationships disappear and are replaced by primary keys, foreign keys, columns, and indexes, and often by intermediate tables (see Figure 3).

In response to the demands from object developers, relational vendors have extended the relational model to support objects, much the way C++ was an object extension of C. But just as C programmers did not fully embrace C++, Java programmers have remained skeptical of object extensions to what is clearly not an object-oriented environment.

The underlying model of object-relational databases remains the same: rows and columns. As a result, the simplicity of the object model vanishes because classes, inheritance, and relationships must be mapped into tables - a structure ill-suited to the task. Even a simple many-to-many relationship between two classes must be expressed using intermediate tables, with two associated indexes. Therefore, a cleanly designed Java application translated through the normalization process results in a thicket of tables that must be recombined whenever an object is called by the application. The process adds significant load, especially when executing extensive table joins.

To solve the problem of mapping objects into relational databases, a number of object-relational (OR) mapping tools have been created. While these tools do make it easier to develop Java applications that use relational databases, they don't eliminate the underlying RDBMS problems of code complexity and poor performance.

Both database technologies have limitations for Java programming. A pure object database makes sense in a standalone environment in which concurrency and network traffic are not issues. Relational databases, while accommodating transaction-processing loads, merely simulate a true object environment.

Hybrid Databases: The Best of Both Worlds
Hybrid databases represent the best of both worlds: the ability to map objects from Java directly to the database with the support of a standard query language (SQL-99) and the scalable, enterprise capabilities implemented in relational database products. Designed from the ground up as a database server for objects, hybrid databases directly map the object model of Java as well as other object programming languages. Because the database object model matches perfectly with Java, you can freely and easily define the database classes that describe real-world scenarios.

Unlike an RDBMS, a hybrid database preserves the original Java data model. For example, a single class and two subclasses represent customers, consumers, and business customers, respectively. No tables are mapped back into Java objects; no translation of any kind is needed. Unlike an ODBMS, a hybrid database enforces a layered design of the persistent classes. The operations to manipulate objects are explicit, enabling you to keep tight control over the data that's locked and instantiated in the JVM, seamlessly improving the application's scalability.

Hybrid databases eliminate the mismatch between the Java and database environments, while still maintaining the scalability of server-side processing that relational systems provide. Within the J2EE environment, you manipulate Java objects representing a proxy to the object in the database by means of object-to-object mapping. The proxy objects are pure Java classes that map to those of the database schema (see Listing 3). With a hybrid database, the code stays compact and object-based (as in Listing 1), providing the same benefit as a first-generation ODBMS. Hybrid databases don't require any of the special compilation tricks or postprocessing bytecode manipulations of ODBMSs - both of which make it hard to identify the root cause of performance degradation.
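
The idea of a proxy with explicit operations can be sketched as follows. This is not the API of any particular hybrid database: ToyStore, CustomerProxy, and their methods are hypothetical, and the store is a pair of in-memory maps with a fetch counter. The point is that creating the proxy loads nothing, and each piece of data arrives only when the code asks for it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for the database server; counts how many fetches were made.
class ToyStore {
    final Map<Integer, String> names = new HashMap<>();
    final Map<Integer, List<Integer>> orderIdsByCustomer = new HashMap<>();
    int fetches = 0;

    String fetchName(int oid) {
        fetches++;
        return names.get(oid);
    }

    List<Integer> fetchOrderIds(int customerOid) {
        fetches++;
        return orderIdsByCustomer.getOrDefault(customerOid, Collections.emptyList());
    }
}

// Hypothetical proxy: holds only an object identifier until a method asks for more.
class CustomerProxy {
    private final ToyStore store;
    private final int oid;

    CustomerProxy(ToyStore store, int oid) {
        this.store = store;
        this.oid = oid;
    }

    // Explicit attribute access: one fetch, nothing else is touched.
    String getName() { return store.fetchName(oid); }

    // Explicit navigation: related orders are fetched only on request.
    List<Integer> getOrderIds() { return store.fetchOrderIds(oid); }
}

class ProxyDemo {
    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.names.put(1, "Alice");
        store.orderIdsByCustomer.put(1, Arrays.asList(100, 101));

        CustomerProxy alice = new CustomerProxy(store, 1);
        System.out.println(store.fetches);    // nothing has been loaded yet
        System.out.println(alice.getName());  // loads the name, not the orders
        System.out.println(store.fetches);
    }
}
```

Compared with reachability-based loading, the set of instantiated objects here is exactly the set the code named, which is what keeps locking and memory use predictable.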

In a typical application, classes are highly interconnected, and the graph of instances can include large portions of the database. Therefore, controlling object-locking effectively, always a challenge in enterprise-class J2EE applications, is crucial to controlling the instantiation of Java objects in the JVM. To build scalable applications, data-intensive processing needs to take place where the data sits: on the server, not on the client. Doing so further reduces locking contention as well as network traffic and takes advantage of the faster processing speeds of many server architectures.

Like RDBMSs, hybrid databases support the SQL-99 syntax. While SQL queries are relational in their syntax, they take advantage of the object paradigm by supporting inheritance, polymorphism, and true navigation. Furthermore, the query processing takes place on the server to enforce security and achieve performance. Consider a broad query of two classes of customers: business and consumer. The query is issued from the client, executed on the server, with selected objects from each class retrieved to the client.

This approach gives developers full access to Java objects through JDBC without having to learn a proprietary API (see Listing 4). In this listing, two customer subclasses, Consumer and Business, share properties from the parent Customer class while maintaining properties of their own. A query to locate "good customers" can combine criteria - bonus miles for home consumers, a high credit line for businesses - pulling the information simultaneously from both subclasses. Unlike an RDBMS, a hybrid database returns Java objects through JDBC and natively supports inheritance.
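
The semantics of such a query over an inheritance hierarchy can be mirrored in plain Java. In a hybrid database the selection would run on the server as an SQL query; the sketch below only illustrates the logic on in-memory objects. The field names bonusMiles and creditLine, the thresholds, and the isGoodCustomer method are assumptions, not part of any product schema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical schema: field names and thresholds are illustrative assumptions.
abstract class Customer {
    final String name;   // property shared by both subclasses

    Customer(String name) { this.name = name; }

    // Each subclass supplies its own notion of a "good customer".
    abstract boolean isGoodCustomer();
}

class Consumer extends Customer {
    final int bonusMiles;

    Consumer(String name, int bonusMiles) {
        super(name);
        this.bonusMiles = bonusMiles;
    }

    @Override
    boolean isGoodCustomer() { return bonusMiles >= 10_000; }
}

class Business extends Customer {
    final double creditLine;

    Business(String name, double creditLine) {
        super(name);
        this.creditLine = creditLine;
    }

    @Override
    boolean isGoodCustomer() { return creditLine >= 50_000.0; }
}

class GoodCustomerQuery {
    // One pass over the parent class covers both subclasses, each judged
    // by its own criterion.
    static List<String> goodCustomers(List<Customer> all) {
        List<String> names = new ArrayList<>();
        for (Customer c : all) {
            if (c.isGoodCustomer()) {
                names.add(c.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Customer> all = Arrays.asList(
                new Consumer("Ann", 25_000),
                new Consumer("Ben", 500),
                new Business("Acme", 80_000.0));
        System.out.println(goodCustomers(all));
    }
}
```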

While developers still benefit from the power of expression and performance of SQL queries, these queries eliminate the object-relational mapping layer to reduce source code by 25-50% and improve application performance.

Unlike first-generation ODBMSs, hybrid databases can be accessed through JDBC and ODBC drivers, both of which support the SQL-99 language, thereby taking advantage of in-house SQL expertise. Support for ODBC and JDBC drivers also allows IT staff to use off-the-shelf database tools without having to master SQL.

First Major Optimization: Keep It Simple
Building enterprise-class J2EE applications with a hybrid database is straightforward. Here are some considerations to make the process even smoother:

  • Carefully define the object model of your persistent classes, reflecting the business model as closely as possible. That's common sense in an object environment, but is even more crucial in database applications because the way you define the model greatly impacts system performance.

    Defining the right level of granularity for your objects has a big payoff in terms of transaction rate because only the specific queried data gets locked.

  • Avoid cross-referencing persistent and transient objects. Transient objects can reference persistent objects, but not the other way around. Letting persistent objects reference transient ones makes the application much more complex to manage, since the persistent objects loaded from the database may need to be linked to transient information that's not yet available. While a callback can also be used, it unnecessarily complicates program flow and can usually be avoided with more ordered layering of the application.
  • Keep transactions as short as possible. Long transactions will unnecessarily lock data for long periods of time, making it unavailable to other business transactions.
  • In some cases, data is cached by the middleware, reducing contention, but it requires "dirty reads" (reading data without locking) from the database. A way around this is to use a versioning facility, which allows a consistent view of the database any time, even while users are modifying the current version.
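
The versioning idea in the last point can be illustrated with a small in-memory sketch. It assumes nothing about how a real database implements versions: VersionedStore and its methods are hypothetical. Each write publishes a new immutable version, and a reader keeps working against the version that was current when it started, without taking a lock.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Toy versioned store (an assumption for illustration, not a product API).
class VersionedStore {
    private final AtomicReference<Map<String, Integer>> current =
            new AtomicReference<>(Collections.<String, Integer>emptyMap());

    // A reader pins the version that is current right now; no lock is taken.
    Map<String, Integer> snapshot() {
        return current.get();
    }

    // A writer copies the current version, applies its change, and publishes
    // the result atomically. It retries if another writer committed first.
    void put(String key, int value) {
        while (true) {
            Map<String, Integer> before = current.get();
            Map<String, Integer> after = new HashMap<>(before);
            after.put(key, value);
            if (current.compareAndSet(before, Collections.unmodifiableMap(after))) {
                return;
            }
        }
    }
}

class VersioningDemo {
    public static void main(String[] args) {
        VersionedStore stock = new VersionedStore();
        stock.put("book", 10);

        Map<String, Integer> view = stock.snapshot(); // a long-running report starts
        stock.put("book", 9);                         // a sale commits meanwhile

        System.out.println(view.get("book"));             // the report still sees 10
        System.out.println(stock.snapshot().get("book")); // new readers see 9
    }
}
```

Copying the whole map on every write is only practical for a toy; the sketch is meant to show the consistency property, not an efficient storage scheme.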

Conclusion
Hybrid databases give developers a new and important option when selecting a database for their J2EE application. Until now, Java developers have really had just one viable option: an RDBMS. Despite the drawbacks of the relational model, only RDBMSs solved the performance requirements intrinsic to enterprise applications. As for ODBMSs, they haven't even begun to meet these requirements, and without that scalability an adaptable object model is irrelevant to large-scale J2EE development.

With hybrid databases, J2EE developers can demand both: a database that meets the intrinsic requirements of scalability, high transaction volumes, high-volume data transfer, and the need for fast throughput, together with an object data model that more accurately represents business processes, now and in the future.

As the number of J2EE applications grows, the limitations of RDBMSs and ODBMSs will become more and more apparent. Hybrid databases represent the missing ingredient for broader J2EE implementation, providing scalability without compromising Java's object environment.

  • More Stories By Didier Cabannes

    Didier Cabannes, chief technology officer at Fresher Information, is the chief architect of the Matisse database, a hybrid database for object developers. For the past 15 years, he has been focused on object and database technology, and developing and deploying mission-critical object-based applications in a variety of environments. He holds a master's degree in engineering and has conducted post-graduate research in computer science.
