Optimizing Database Performance in J2EE Applications

The Java 2 Platform, Enterprise Edition (J2EE), is the platform of choice for implementing scalable and reliable enterprise applications from reusable components. But Java developers building enterprise-class J2EE applications face a quandary.

The object paradigm has proven ideal for modeling a wide variety of real-world scenarios. However, finding a Java-compatible data repository optimized for such applications has become a stumbling block. While object database management systems (ODBMSs) provide the convenience of transparent persistence of Java objects, their client-centric architecture has not scaled well in enterprise environments. Relational database management systems (RDBMSs) do scale well, but map each object to a two-dimensional relational table. The increased overhead can reduce application performance to a crawl.

This article discusses the limits of using these two types of databases with Java and suggests a better alternative for J2EE - a hybrid database that combines the best features of both. Hybrid databases share with ODBMSs the ability to map data stored in back-end databases directly into an implementation-neutral Java representation. As with relational systems, hybrid databases can scale to meet the performance requirements of an enterprise-class J2EE application.

ODBMSs: The Hidden Headache of Transparent Persistence
Over the years, finding a database that's both Java-compatible and scalable enough for enterprise-class J2EE applications has not been easy. Ideally, a Java-compatible database should store Java objects whose classes have been declared "persistent-capable" and can be manipulated seamlessly by the Java language.

That has been the promise of ODBMSs, which made their appearance in the mid-1990s as a solution designed specifically for objects and thus better suited for object development. With ODBMSs, Java developers can define persistent Java classes in the same way transient Java classes are defined in the application.

An apparent advantage of pure object databases is the implementation of transparent persistence that automates the process of mapping persistent data objects into the data repository. With transparent persistence, you don't even have to alter your existing Java classes to describe the persistent data that's permanently stored in the database (see Listing 1). That means you don't have to decide ahead of time, usually during the design phase, which objects to include and exclude from the database.

Adding a new customer order into the database is as simple as creating a new object in Java. Persistent-capable objects are transient until attached to a persistent manager or to other persistent objects.

This convenience quickly becomes a nightmare, however, when developing scalable enterprise-class applications. In a typical application, objects are highly interconnected, and it's very important to know precisely which objects have been stored with the database and which have not. Consider an e-commerce application in which products, customers, and orders are all linked together (see Figure 1). The object model naturally captures the interrelationships of real-world applications. With transparent persistence, you wind up loading an entire closure of objects even though you want to access only a single object (see Figure 2). While the programmer wants to load only one customer, the closure of instances reachable from this object recursively loads a large portion of the database. Loading unneeded data in the Java VM limits concurrency and scalability.
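The closure effect described above can be made concrete with a small sketch. This is a minimal illustration in plain Java, not the API of any ODBMS: the Node class, the sampleGraph data, and the closure method are hypothetical names introduced here. It follows every reference reachable from one customer, which is roughly the set of objects a load-by-reachability policy would instantiate.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical persistent object: a name plus references to other objects.
class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();

    Node(String name) { this.name = name; }

    // Links are bidirectional, as in a typical customer/order/product model.
    void link(Node other) {
        refs.add(other);
        other.refs.add(this);
    }
}

class ClosureDemo {
    // Returns every object reachable from root (root included).
    static Set<Node> closure(Node root) {
        Set<Node> seen = new HashSet<>();   // Node uses identity equality
        Deque<Node> todo = new ArrayDeque<>();
        todo.push(root);
        while (!todo.isEmpty()) {
            Node n = todo.pop();
            if (seen.add(n)) {
                for (Node r : n.refs) {
                    todo.push(r);
                }
            }
        }
        return seen;
    }

    // Assumed sample data: two customers who bought the same product.
    static Node sampleGraph() {
        Node alice = new Node("customer:alice");
        Node bob = new Node("customer:bob");
        Node order1 = new Node("order:1");
        Node order2 = new Node("order:2");
        Node widget = new Node("product:widget");
        alice.link(order1);
        bob.link(order2);
        order1.link(widget);
        order2.link(widget);
        return alice;
    }

    public static void main(String[] args) {
        // Only alice was requested, yet all five objects are reachable.
        System.out.println(closure(sampleGraph()).size()); // prints 5
    }
}
```

Even in this tiny graph, asking for one customer pulls in the other customer through a shared product. In a real schema the same effect scales with the size of the database.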

A simple customer query, for example, could also lock pending orders and products purchased, even though this data was not requested and will remain unchanged. Such "overloading" is not a noticeable problem within a standalone environment that manipulates a small amount of data. However, in an enterprise-class, multiuser, transaction-intensive application, large portions of data get locked and instantiated, limiting concurrency and scalability.

During the pilot phase of development, performance is usually acceptable since the system is not running under heavy computational loads. But with wider deployment and more users, transaction rates can slow unacceptably as a massive amount of data - much of it unneeded - fills the pipeline. In the end, transparent persistence leads to a performance black hole, requiring substantial work to improve scalability, increase concurrency, and reduce network traffic. To gain sufficient control over which objects stay persistent and which do not, the ODBMS's transparent persistence mechanism must be bypassed and the ODBMS's proprietary API used instead. Developers must master that proprietary API and then invest many hours in a complex, trial-and-error tuning process that has no guarantee of success.

The hard lesson, often learned at company expense, is that the ODBMS used to validate a pilot application must be replaced by a relational database when the system goes into production. That's the programming equivalent of a heart transplant, setting development schedules back by months. As we will see, relational databases bring their own set of problems in terms of overhead, and can require 25-50% more Java code.

RDBMSs: The Frustration of Object-Relational Mapping
Java developers are hindered by relational databases; however, RDBMSs do have two major advantages: a long, successful track record of deployment in scalable, transaction-processing systems and a standard language, SQL. While the relational model works well enough in banking applications where the row-and-column model reflects the two-dimensional world of ledgers and spreadsheets, it has proven more limited in tracking highly interconnected information. Relationship navigation commonly used in J2EE applications requires extensive use of multitable joins. But joins are computationally intensive, and each join is computed at runtime to link information on-the-fly (see Listing 2). Reconstructing an order object with its line items from row-and-column tables requires two SQL queries and much coding. The same operation in an object database would require only one call. Moreover, relational systems require the rebuilding of relationships between objects each time they're accessed, substantially impacting performance.
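The mapping work described above can be sketched in a few lines. This is a minimal in-memory stand-in, assuming two hypothetical tables, orders and line_items: the row classes, the sample data, and the SQL shown in the comments are illustrative only, and a real application would issue the two queries through JDBC rather than scan lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Flat rows, as they might come back from the two hypothetical tables.
class OrderRow {
    final int orderId;
    final String customer;
    OrderRow(int orderId, String customer) {
        this.orderId = orderId;
        this.customer = customer;
    }
}

class LineItemRow {
    final int orderId;
    final String product;
    final int quantity;
    LineItemRow(int orderId, String product, int quantity) {
        this.orderId = orderId;
        this.product = product;
        this.quantity = quantity;
    }
}

// The object the application actually wants to work with.
class Order {
    final int id;
    final String customer;
    final List<String> items = new ArrayList<>();
    Order(int id, String customer) {
        this.id = id;
        this.customer = customer;
    }
}

class OrderMapper {
    static final List<OrderRow> ORDERS = Arrays.asList(
            new OrderRow(1, "alice"), new OrderRow(2, "bob"));
    static final List<LineItemRow> LINE_ITEMS = Arrays.asList(
            new LineItemRow(1, "widget", 2),
            new LineItemRow(1, "gadget", 1),
            new LineItemRow(2, "widget", 5));

    // Rebuilds one Order object from flat rows; returns null if not found.
    static Order load(int orderId) {
        // Stand-in for query 1: SELECT ... FROM orders WHERE order_id = ?
        Order order = null;
        for (OrderRow r : ORDERS) {
            if (r.orderId == orderId) {
                order = new Order(r.orderId, r.customer);
            }
        }
        if (order == null) {
            return null;
        }
        // Stand-in for query 2: SELECT ... FROM line_items WHERE order_id = ?
        for (LineItemRow r : LINE_ITEMS) {
            if (r.orderId == orderId) {
                order.items.add(r.quantity + " x " + r.product);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        Order o = load(1);
        System.out.println(o.customer + ": " + o.items); // prints alice: [2 x widget, 1 x gadget]
    }
}
```

The relationship between an order and its line items exists only as a shared order_id value, so the application has to rebuild it by hand on every load.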

In today's economy where business intelligence is key, the Java object model provides a more powerful mechanism for capturing real-world relationships and concept commonalities. In the relational model the relationships disappear and are replaced by primary keys, foreign keys, columns, and indexes, and often by intermediate tables (see Figure 3).

In response to the demands from object developers, relational vendors have extended the relational model to support objects, much the way C++ was an object extension of C. But just as C programmers did not fully embrace C++, Java programmers have remained skeptical of object extensions to what is clearly not an object-oriented environment.

The underlying model of object-relational databases remains the same: rows and columns. As a result, the simplicity of the object model vanishes because classes, inheritance, and relationships must be mapped into tables - a structure ill-suited to the task. Even a simple many-to-many relationship between two classes must be expressed using intermediate tables, with two associated indexes. Therefore, a cleanly designed Java application translated through the normalization process results in a thicket of tables that must be recombined whenever an object is called by the application. The process adds significant load, especially when executing extensive table joins.

To solve the problem of mapping objects into relational databases, a number of OR mapping tools have been created. While these tools do make it easier to develop Java applications that use relational databases, they don't eliminate the underlying RDBMS problems of code complexity and poor performance.

Both database technologies have limitations for Java programming. A pure object database makes sense in a standalone environment in which concurrency and network traffic are not issues. Relational databases, while accommodating transaction-processing loads, merely simulate a true object environment.

Hybrid Databases: The Best of Both Worlds
Hybrid databases represent the best of both worlds: the ability to map objects from Java directly to the database with the support of a standard query language (SQL-99) and the scalable, enterprise capabilities implemented in relational database products. Designed from the ground up as a database server for objects, hybrid databases directly map the object model of Java as well as other object programming languages. Because the database object model matches perfectly with Java, you can freely and easily define the database classes that describe real-world scenarios.

Unlike an RDBMS, a hybrid database preserves the original Java data model. For example, a single class and two subclasses represent customers, consumers, and business customers, respectively. No tables are mapped back into Java objects; no translation of any kind is needed. Unlike an ODBMS, a hybrid database enforces a layered design of the persistent classes. The operations to manipulate objects are explicit, enabling you to keep tight control over the data that's locked and instantiated in the JVM, seamlessly improving the application's scalability.

Hybrid databases eliminate the mismatch between the Java and database environments, while still maintaining the scalability of server-side processing, as relational systems do. Within the J2EE environment, you manipulate Java objects representing a proxy to the object in the database by means of object-to-object mapping. The proxy objects are pure Java classes that map to those of the database schema (see Listing 3). With a hybrid database, the code stays compact and object-based (as in Listing 1), providing the same benefit as a first-generation ODBMS. Hybrid databases don't require any of the special compilation tricks or postprocessing bytecode manipulations of ODBMSs - both of which make it hard to identify the root cause of performance degradation.

In a typical application, classes are highly interconnected, and the graph of instances can include large portions of the database. Therefore, controlling object-locking effectively, always a challenge in enterprise-class J2EE applications, is crucial to controlling the instantiation of Java objects in the JVM. To build scalable applications, data-intensive processing needs to take place where the data sits on the server, not on the client, further reducing locking contention as well as network traffic and taking advantage of the faster processing speeds of many server architectures.

Like RDBMSs, hybrid databases support the SQL-99 syntax. While SQL queries are relational in their syntax, they take advantage of the object paradigm by supporting inheritance, polymorphism, and true navigation. Furthermore, the query processing takes place on the server to enforce security and achieve performance. Consider a broad query of two classes of customers: business and consumer. The query is issued from the client and executed on the server, and only the selected objects from each class are returned to the client.

This approach gives developers full access to Java objects through JDBC without having to learn a proprietary API (see Listing 4). In this listing, two customer subclasses, Consumer and Business, share properties from the parent Customer class while maintaining properties of their own. A query to locate "good customers" can combine criteria - bonus miles for home consumers, a high credit line for businesses - pulling the information simultaneously from both subclasses. Unlike an RDBMS, a hybrid database returns Java objects through JDBC and natively supports inheritance.
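The class hierarchy and the "good customer" criteria described above can be sketched in plain Java. This is an in-memory stand-in, not the SQL-99 query or the JDBC calls a hybrid database would use: the field names, the sample data, and both thresholds are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Parent class holding the properties shared by all customers.
abstract class Customer {
    final String name;
    Customer(String name) { this.name = name; }

    // Each subclass supplies its own notion of a "good customer".
    abstract boolean isGoodCustomer();
}

class Consumer extends Customer {
    final int bonusMiles;
    Consumer(String name, int bonusMiles) {
        super(name);
        this.bonusMiles = bonusMiles;
    }
    @Override
    boolean isGoodCustomer() {
        return bonusMiles >= 50000; // assumed threshold
    }
}

class Business extends Customer {
    final long creditLine;
    Business(String name, long creditLine) {
        super(name);
        this.creditLine = creditLine;
    }
    @Override
    boolean isGoodCustomer() {
        return creditLine >= 100000L; // assumed threshold
    }
}

class GoodCustomerQuery {
    // One pass over the parent type selects matches from both subclasses.
    static List<String> goodCustomers(List<Customer> all) {
        List<String> names = new ArrayList<>();
        for (Customer c : all) {
            if (c.isGoodCustomer()) {
                names.add(c.name);
            }
        }
        return names;
    }

    static List<Customer> sample() {
        return Arrays.<Customer>asList(
                new Consumer("alice", 72000),
                new Consumer("bob", 1200),
                new Business("acme", 250000L),
                new Business("tiny", 5000L));
    }

    public static void main(String[] args) {
        System.out.println(goodCustomers(sample())); // prints [alice, acme]
    }
}
```

In a hybrid database the equivalent selection would run on the server, with only the matching objects returned to the client. The sketch shows only how inheritance lets a single query span both subclasses.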

While developers still benefit from the power of expression and performance of SQL queries, these queries eliminate the object-relational mapping layer to reduce source code by 25-50% and improve application performance.

Unlike first-generation ODBMSs, hybrid databases can be accessed through JDBC and ODBC drivers, both of which support the SQL-99 language, thereby taking advantage of in-house SQL expertise. Support for ODBC and JDBC drivers also allows IT staff to use off-the-shelf database tools without having to master SQL.

First Major Optimization: Keep It Simple
Building enterprise-class J2EE applications with a hybrid database is straightforward. Here are some considerations to make the process even smoother:

  • Carefully define the object model of your persistent classes, reflecting the business model as closely as possible. That's common sense in an object environment, but is even more crucial in database applications because the way you define the model greatly impacts system performance.

    Defining the right level of granularity for your objects has a big payoff in terms of transaction rate because only the specific queried data gets locked.

  • Avoid cross-referencing persistent and transient objects: transient objects may reference persistent ones, but not the other way around. Letting persistent objects reference transient ones makes the application much more complex to manage, since the persistent objects loaded from the database may need to be linked to transient information that's not yet available. While a callback can also be used, it unnecessarily complicates program flow and can usually be avoided with more ordered layering of the application.
  • Keep transactions as short as possible. Long transactions will unnecessarily lock data for long periods of time, making it unavailable to other business transactions.
  • In some cases, data is cached by the middleware, reducing contention, but it requires "dirty reads" (reading data without locking) from the database. A way around this is to use a versioning facility, which allows a consistent view of the database any time, even while users are modifying the current version.
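
The versioning idea in the last point can be sketched as follows. This is a minimal illustration, assuming a hypothetical VersionedStore class that is not the facility of any particular database: each commit produces a new immutable snapshot, so a reader pinned to an older version keeps a consistent view while writers move the current version forward.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical store that keeps one immutable snapshot per committed version.
class VersionedStore {
    private final List<Map<String, Integer>> versions = new ArrayList<>();

    VersionedStore() {
        // Version 0 is the empty database.
        versions.add(Collections.<String, Integer>emptyMap());
    }

    synchronized int currentVersion() {
        return versions.size() - 1;
    }

    // Copies the latest snapshot, applies the change, and returns the new version number.
    synchronized int commit(String key, int value) {
        Map<String, Integer> next = new HashMap<>(versions.get(versions.size() - 1));
        next.put(key, value);
        versions.add(Collections.unmodifiableMap(next));
        return versions.size() - 1;
    }

    // Reads from a pinned version; later commits never change the result.
    synchronized Integer read(int version, String key) {
        return versions.get(version).get(key);
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        int v1 = store.commit("stock", 10);
        int v2 = store.commit("stock", 7);
        // The reader pinned to v1 still sees 10 after the update to 7.
        System.out.println(store.read(v1, "stock") + " " + store.read(v2, "stock")); // prints 10 7
    }
}
```

Copying the whole map on every commit is only practical for a toy example. It is used here to keep the consistency property easy to see, not as a suggestion for how a database implements versioning.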

Conclusion
Hybrid databases give developers a new and important option when selecting a database for their J2EE application. Until now, Java developers have really had just one viable option: an RDBMS. Despite the drawbacks of the relational model, only RDBMSs solved the performance requirements intrinsic to enterprise applications. As for ODBMSs, they haven't even begun to meet these requirements. Without that, an adaptable object model is irrelevant to large-scale J2EE development.

With hybrid databases, J2EE developers can demand both: a database that meets the intrinsic requirements of scalability, high transaction volumes, high-volume data transfer, and the need for fast throughput, together with an object data model that more accurately represents business processes, now and in the future.

As the number of J2EE applications grows, the limitations of RDBMSs and ODBMSs will become more and more apparent. Hybrid databases represent the missing ingredient for broader J2EE implementation, providing scalability without compromising Java's object environment.

About the Author
Didier Cabannes, chief technology officer at Fresher Information, is the chief architect of the Matisse database, a hybrid database for object developers. For the past 15 years, he has been focused on object and database technology, and developing and deploying mission-critical object-based applications in a variety of environments. He holds a master's degree in engineering and has conducted post-graduate research in computer science.
