Welcome!

Java IoT Authors: Yeshim Deniz, Liz McMillan, Elizabeth White, Scott Allen, Roger Strukhoff

Related Topics: Java IoT

Java IoT: Article

Best Practices for JDBC Programming

Best Practices for JDBC Programming

As a consultant, developer and database administrator, I've often been asked to provide coding guidelines and tuning assistance for Java code that utilizes JDBC. Over time, I've been introduced to or developed standard coding practices that make JDBC code faster and less error-prone, and easier to read, understand and use. This article documents some of the more important "best practices" for using JDBC libraries to perform database access. As most of my clients are using Oracle database technologies, I've included several practices that are Oracle-specific.

For the purposes of this article the goals of best practices for JDBC programming are maintainability, portability and performance.

  • Maintainability refers to the ease with which developers can understand, debug and modify JDBC code that they didn't write.
  • Portability refers to the ease with which JDBC code can be used with multiple databases. It turns out that JDBC doesn't make database programming as platform independent as I'd like. In addition, I consider portability a noble goal even if you have no current plans to support multiple databases. Who knows how long your code will be around and what kinds of changes will have to be made to it?
  • Performance refers to optimizing the speed and/or memory needed to run JDBC code.
While I've labeled my recommendations best practices, these recommendations change as technology changes and as I discover even better coding practices. In addition, I'm always annoyed by articles that make recommendations and then don't explain the rationale for making them. I'll try not to make that mistake here.

Best Practices for JDBC Programming
The most common recommendations I make to Java programmers using JDBC are the following (discussed individually later):

  • Use host variables for literals - avoid hard-coding them (Oracle specific).
  • Always close statements, prepared statements and connections.
  • Consolidate formation of SQL statement strings.
  • Use the delegate model for database connection.
  • Use Date, Time and Timestamp objects as host variables for temporal fields (avoid using strings).
  • Limit use of column functions.
  • Always specify a column list with an select statement (avoid "select *").
  • Always specify a column list with an insert statement.
Use Host Variables for Literals in SQL Statements (Oracle Specific)
I recommend that developers use host variables in SQL statements instead of hard-coding literals in SQL strings. As a convenience, many developers embed literals in SQL statements instead. I've provided an example of embedding literals in the following code. While the performance benefits of using host variables greatly improve Oracle performance, it won't hurt performance for other database platforms that I'm aware of. Note that this example places a user ID directly in the SQL statement. (As an aside, note that this example uses the "+" operator for string concatenation. While this is convenient, using StringBuffers and the StringBuffer.append() method is a faster way to concatenate strings.)

Statement stmt;
ResultSet rst;
Connection dbconnection;
...
stmt = dbconnection.createStatement();
rst = stmt.executeQuery("select count(*) from portfolio_info where
USER_ID = " + userID);
if(rst.next()){
count = rst.getInt(1);
}

To get the benefit of Oracle's optimizations, we need to use PreparedStatements instead of statements for SQL that will be executed multiple times. Furthermore, we need to use host variables instead of literals for literals that will change between executions. In the code above the SQL statement for User id 1 will be different than for User Id 2 ("where USER_ID = 1" is different from "where USER_ID = 2"). A better way to approach this SQL statement is the following:

ResultSet rst;
PreparedStatement pstmt;
Connection dbconnection;
...
pstmt = dbconnection.prepareStatement("select count(*) from portfolio_info where USER_ID = ? "); pstmt.setDouble(1,userID);
rst = pstmt.executeQuery();
if(rst.next()){
count = rst.getInt(1);
}

In this code, because we're using host variables instead of literals, the SQL statement is identical no matter what the qualifying user ID is. Furthermore, we used a PreparedStatement instead of a statement. So that we can better understand the source of the performance benefit, let's walk through how SQL statements are processed by the Oracle optimizer. When SQL statements are executed, Oracle will execute (roughly speaking) the following steps:

  1. Look up the statement in the shared pool to see if it has already been parsed or interpreted. If yes, Oracle will go directly to step 4.
  2. Parse (or interpret) the statement.
  3. Figure out how it will get the data you want; record that information in a portion of memory called the shared pool.
  4. Get your data.
A flowchart of this decision process can be found in Figure 1.

When an Oracle user looks up a SQL statement to see if it's already been executed (step 1), he or she attempts a character-by-character match of the SQL statement. If the user finds a match, he or she can use the parse information already in the shared pool and doesn't have to do steps 2 and 3 above because the work has already been done. If you hard-code literals in your SQL statements, the probability of finding a match is very low ("where USER_ID = 1" isn't the same as "where USER_ID = 2"). This means that Oracle will have to reparse the second code example for each portfolio selected. Had the code used host variables, that statement (which would look something like "where USER_ID = :1" in the shared pool) would have been parsed once and only once.

I've experienced anywhere from a 5% to a 25% performance increase by writing SQL statements that are reusable (results vary with transaction volume, number of users, network latency and many other things). More information on this can be found in the Oracle Tuning manual. Within this manual look at the "Writing Identical SQL Statements" subheading within the "Tuning the Shared Pool" section.

While this best practice is Oracle-specific, many database platforms optimize preparing and reusing similar SQL statements. Most database platforms do this by optimizing reuse of PreparedStatement objects. Some databases, such as Cloudscape, optionally will store prepared statements in the database so they can be reused and shared by many users. Following this practice won't hurt performance with any database platform I'm aware of.

Always Close Statements, Prepared Statements and Connections
Many databases allocate resources to servicing statements, prepared statements and connections. Many database platforms continue to allocate those resources for a period of time if these objects aren't closed after use. With Oracle databases it's possible to get a "max cursors exceeded" error message when you don't close statements or prepared statements. In addition, with Oracle databases, the connections stay around on the server. This practice improves time and resources spent on maintenance to keep errors from happening.

An example can be found in Listing 1. Note that I use a "finally" block to close the PreparedStatement. I don't close the connection in the example method as it is used elsewhere in the application. Note also that I call a utility to close the PreparedStatement for me. The code for this utility can be found in Listing 2. I use a utility to do the close so I don't have to replicate the exception-catching code everywhere.

Consolidate Formation of SQL Statement Strings
As a database administrator, a substantial portion of my time is spent reading the code of others and suggesting ways to improve performance. As you might expect, looking at the SQL statements being issued is of particular interest to me. It's hard to follow SQL statements that are constructed by string manipulation scattered over several methods. Developers who maintain this kind of code must have the same problem. It greatly enhances readability if you consolidate the logic that forms the SQL statement in one place.

Listing 2 is a good example of this point. The string manipulation to form the SQL statement is located in one place, and the SQL statement logic is in a separate static block instead of within the method itself. This is done to reduce the number of times this string concatenation happens. Also note that StringBuffers are used for the string manipulation, not Strings. StringBuffers are more efficient at string concatenation than Strings are. In a project I recently completed the development team adopted this convention of consolidating SQL statements in static blocks directly above the method in which they were used. We found this practice quite readable and maintainable.

Use Delegate Model for Database Connection
I recently had the task of making the same application runnable on Oracle 8i, Cloudscape and Oracle Lite with as few modifications to existing code as possible. The development team wanted to avoid making JDBC-related classes platform-aware. In addition, the team wanted to take advantage of some platform-specific features, such as array processing and write batching in Oracle 8i, in special cases.

I was able to port the application to multiple environments largely through manipulation of one class responsible for managing our database connection. We had the foresight to create a delegate class for the java.sql.connection that manages needed connection functions and allows us to take advantage of platform-specific performance-tuning enhancements. All of our code used the delegate, not a native JDBC connection, as illustrated in Figure 2. While the specific class used for the project is proprietary, I've created another delegate, dvt.util.db.Connection, that illustrates the concept for the purposes of this article. The source for this delegate can be found in Listing 3.

Note that dvt.util.db.Connection determines that the database platform is being used. If the platform is Oracle 8i, I establish array processing by setting the default row prefetch size (available with Oracle database connections) to improve the performance of our "select" statements. I also establish write batching to improve performance of update, insert and delete statements.

Since I consolidate the platform-specific code in my connection object delegate, classes that use my connection delegate don't need to be platform specific. In case they do, however, developers can use getPlatform() to get information about the database platform being used. Furthermore, I can add support for additional database platforms (e.g., Cloudscape and Sybase) largely by changing this class. The connection delegate won't solve all portability issues, but it will solve a good percentage of them.

I recommend using a connection delegate even for projects that current supporting only one database platform. As we saw from recent Y2K efforts, you may find that your code is used for longer than you think, and used in other applications down the road.

Use Date, Time and Timestamp Objects as Host Variables for Temporal Fields (Avoid Using Strings)
For convenience, I've seen many developers use strings as host variables to represent dates, times and timestamps. I think they consider Java.sql.Date, Time and Timestamp awkward. I agree with from a coding perspective. Unfortunately, using strings as host variables for temporal fields can affect data access performance.

The following code snippet contains a SQL statement meant for an Oracle platform that uses a string variable to represent a DATE field. Without an understanding of how the database optimizers work, this appears to be an acceptable coding technique. For the small inconvenience of using a "to_char" function in the SQL statement, we avoid the Java work of converting a java.sql.Date or Timestamp into a more easily displayable data type elsewhere in the code.

Select sum(sale_price)
From order_sales
Where to_char(sale_dt,'YYYY-MM-DD') >= ?

Unfortunately, Oracle and most database optimizers can't use an index to speed up performance of the query in this snippet. Developers will have to read all rows of the order_sales table and convert the sale_dt of all rows to a string before they can do the comparison to see which rows satisfy the where clause of the query.

If we rewrite the query in the snippet to use a java.sql.Timestamp hostvariable, Oracle (and most of the common database platforms) will use an index and significantly improve performance in most cases, as follows:

Select sum(sale_price)
From order_sales
Where sale_dt >= ?

For applications that use Oracle exclusively, I recommend using java.sql.Timestamp exclusively. Oracle's DATE data type actually contains time information (hours, minutes, seconds) as well as date information. Most other database platforms would call this type of field a TIMESTAMP. Oracle has no direct counterpart for a DATE (which has year, month and day only) and TIME data type offered by other platforms.

Limit Use of Column Functions
I generally recommend that developers limit use of column functions to the select lists of select statements. Moreover, I tend to stick to aggregate functions (e.g., count, sum, average) needed for select statements that use a "group by" clause. I make this recommendation for two reasons: performance and portability. Limiting function use to select lists (and keeping it out of where clauses) means that the use of a function won't block the use of an index. In the same way that the use of the "to_char" function prohibited the database from using an index in the earlier code snippet, column functions in where clauses likely prohibit the database from using an index.

In addition, many of the operations for which developers use SQL column functions (data type conversion, value formatting, etc.) are faster in Java than if the database did them. I've had between a 5% and a 20% performance improvement in many applications by opting to avoid some column functions and implementing the logic in Java instead. Another way to look at it is that column functions aren't tunable as we don't control the source code. Implementing that logic in Java makes it code that we can tune if need be.

Moreover, using non-ANSI—standard column functions can also cause portability problems. There are large differences in which column functions are implemented by the database vendors. For instance, one of my favorite Oracle column functions, "decode", which allows you to translate one set of values into another, isn't implemented in many of the other major database platforms. In general, column function use such as the use of "decode" has the potential to become a portability issue.

Always Specify a Column List with a Select Statement (Avoid "Select *")
A common shortcut for developers is to use the "*" in select statements to avoid having to type out a column list. The line below illustrates this shortcut while the snippet immediate following illustrates the alternative where desired columns are explicitly listed.

Select * from customer

Select last_nm, first_nm, address, city, state, customer_nbr from customer

I recommend that developers explicitly list columns in select statements as illustrated above. The reason is that if the columns in any of the tables in the select are reordered or new columns are added, the results obtained with the select-asterisk shortcut will change and the class will have to be modified. For example, suppose a database administrator changes the order of the columns and puts column customer_nbr first (there are valid reasons why a DBA could reorder columns). In addition, suppose the DBA adds a column called country. The developer who used the shortcut select * from customer will have to change code. All the offset references used in processing the Resultset will change. The developer who explicitly listed all columns can be oblivious to the change because the code will still work.

Explicitly listing columns in a select statement is a best practice because it prevents the need for maintenance in some cases.

Always Specify a Column List with an Insert Statement
A common shortcut for developers is to omit the column list in insert statements to avoid having to type out a column list. By default, the column order is the same as physically defined in the table. The first snippet below illustrates this shortcut while the next one illustrates the alternative where desired columns are explicitly listed.

Insert into customer
Values ('Ashmore','Derek','3023 N. Clark','Chicago','IL', 555555)

Insert into customer
(last_nm, first_nm, address, city, state, customer_nbr)
Values (?,?,?,?,?,?)

I recommend that developers explicitly list columns in insert statements as illustrated in the second snippet above. The reason is the same as why we should explicitly list columns in select statements. If the columns in any of the tables in the select are reordered or new columns are added, the insert could generate an exception and insert in class will have to be modified. For example, suppose a DBA, as in the previous example, changes the order of the columns, puts column customer_nbr first and adds a column called country. The developer who used the first shortcut above will have to change code. The developer who explicitly listed all columns may be oblivious to the change because the code may still work. In addition, note that the version in second snippet above uses host variables so the same PreparedStatement can be used for all inserts if there are multiple inserts.

Explicitly listing columns in an insert statement is a best practice because it prevents the need for maintenance in many cases.

Recommendations for Stored Procedure Usage
Stored procedure programming languages (such as Oracle's PL/SQL) are handy and in many cases very convenient. I use them often for utility scripts and data-cleansing activities. I'm often asked about recommendations for stored procedure use in applications, but as their capabilities differ greatly among the major database platforms, I can't give platform-independent advice on the subject. I can, however, provide some thoughts on stored procedure use as it relates to portability and performance.

As these languages differ so greatly, their use within applications causes portability issues. For instance, some stored procedure languages allow procedures to return result sets, some do not. Some stored procedure languages allow temporary tables (usable within the current session only), some do not. We could find many more differences, but I think the point is clear. If portability is a concern, I recommend avoiding use of stored procedures except for database triggers.

Performance is a tougher issue because it differs radically between database vendors. Stored procedure use for some database platforms enhances performance; in others it degrades it. For Oracle platforms I advocate stored procedures within Java applications for database triggers only. For most other situations their use provides no benefit. If you want a more detailed discussion on when and how to use stored procedures, functions and packages within Oracle databases, see my article in JDJ December 1999 (Vol. 4, issue 12).

Summary
This article has discussed several ways to make JDBC code more performance-, maintenance- and portability-friendly on an individual basis. I always recommend team code reviews and documented coding standards as ways to develop more best practices and consistently apply existing practices. Furthermore, team code reviews help further the goals of best practices by improving the maintainability and general quality of code within an application.

More Stories By Derek Ashmore

Derek Ashmore is a consultant and the author of the J2EE
Architect's Handbook, available at www.dvtpress.com.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@ThingsExpo Stories
SYS-CON Events announced today that Roundee / LinearHub will exhibit at the WebRTC Summit at @ThingsExpo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. LinearHub provides Roundee Service, a smart platform for enterprise video conferencing with enhanced features such as automatic recording and transcription service. Slack users can integrate Roundee to their team via Slack’s App Directory, and '/roundee' command lets your video conference ...
24Notion is full-service global creative digital marketing, technology and lifestyle agency that combines strategic ideas with customized tactical execution. With a broad understand of the art of traditional marketing, new media, communications and social influence, 24Notion uniquely understands how to connect your brand strategy with the right consumer. 24Notion ranked #12 on Corporate Social Responsibility - Book of List.
Web Real-Time Communication APIs have quickly revolutionized what browsers are capable of. In addition to video and audio streams, we can now bi-directionally send arbitrary data over WebRTC's PeerConnection Data Channels. With the advent of Progressive Web Apps and new hardware APIs such as WebBluetooh and WebUSB, we can finally enable users to stitch together the Internet of Things directly from their browsers while communicating privately and securely in a decentralized way.
"My role is working with customers, helping them go through this digital transformation. I spend a lot of time talking to banks, big industries, manufacturers working through how they are integrating and transforming their IT platforms and moving them forward," explained William Morrish, General Manager Product Sales at Interoute, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
Just over a week ago I received a long and loud sustained applause for a presentation I delivered at this year’s Cloud Expo in Santa Clara. I was extremely pleased with the turnout and had some very good conversations with many of the attendees. Over the next few days I had many more meaningful conversations and was not only happy with the results but also learned a few new things. Here is everything I learned in those three days distilled into three short points.
A strange thing is happening along the way to the Internet of Things, namely far too many devices to work with and manage. It has become clear that we'll need much higher efficiency user experiences that can allow us to more easily and scalably work with the thousands of devices that will soon be in each of our lives. Enter the conversational interface revolution, combining bots we can literally talk with, gesture to, and even direct with our thoughts, with embedded artificial intelligence, wh...
Adobe is changing the world though digital experiences. Adobe helps customers develop and deliver high-impact experiences that differentiate brands, build loyalty, and drive revenue across every screen, including smartphones, computers, tablets and TVs. Adobe content solutions are used daily by millions of companies worldwide-from publishers and broadcasters, to enterprises, marketing agencies and household-name brands. Building on its established design leadership, Adobe enables customers not o...
Why do your mobile transformations need to happen today? Mobile is the strategy that enterprise transformation centers on to drive customer engagement. In his general session at @ThingsExpo, Roger Woods, Director, Mobile Product & Strategy – Adobe Marketing Cloud, covered key IoT and mobile trends that are forcing mobile transformation, key components of a solid mobile strategy and explored how brands are effectively driving mobile change throughout the enterprise.
In this strange new world where more and more power is drawn from business technology, companies are effectively straddling two paths on the road to innovation and transformation into digital enterprises. The first path is the heritage trail – with “legacy” technology forming the background. Here, extant technologies are transformed by core IT teams to provide more API-driven approaches. Legacy systems can restrict companies that are transitioning into digital enterprises. To truly become a lea...
What are the new priorities for the connected business? First: businesses need to think differently about the types of connections they will need to make – these span well beyond the traditional app to app into more modern forms of integration including SaaS integrations, mobile integrations, APIs, device integration and Big Data integration. It’s important these are unified together vs. doing them all piecemeal. Second, these types of connections need to be simple to design, adapt and configure...
What happens when the different parts of a vehicle become smarter than the vehicle itself? As we move toward the era of smart everything, hundreds of entities in a vehicle that communicate with each other, the vehicle and external systems create a need for identity orchestration so that all entities work as a conglomerate. Much like an orchestra without a conductor, without the ability to secure, control, and connect the link between a vehicle’s head unit, devices, and systems and to manage the ...
The Jevons Paradox suggests that when technological advances increase efficiency of a resource, it results in an overall increase in consumption. Writing on the increased use of coal as a result of technological improvements, 19th-century economist William Stanley Jevons found that these improvements led to the development of new ways to utilize coal. In his session at 19th Cloud Expo, Mark Thiele, Chief Strategy Officer for Apcera, will compare the Jevons Paradox to modern-day enterprise IT, e...
Major trends and emerging technologies – from virtual reality and IoT, to Big Data and algorithms – are helping organizations innovate in the digital era. However, to create real business value, IT must think beyond the ‘what’ of digital transformation to the ‘how’ to harness emerging trends, innovation and disruption. Architecture is the key that underpins and ties all these efforts together. In the digital age, it’s important to invest in architecture, extend the enterprise footprint to the cl...
SYS-CON Events announced today that Commvault, a global leader in enterprise data protection and information management, has been named “Bronze Sponsor” of SYS-CON's 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Commvault is a leading provider of data protection and information management solutions, helping companies worldwide activate their data to drive more value and business insight and to transform moder...
SYS-CON Events has announced today that Roger Strukhoff has been named conference chair of Cloud Expo and @ThingsExpo 2016 Silicon Valley. The 19th Cloud Expo and 6th @ThingsExpo will take place on November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. "The Internet of Things brings trillions of dollars of opportunity to developers and enterprise IT, no matter how you measure it," stated Roger Strukhoff. "More importantly, it leverages the power of devices and the Interne...
What does it look like when you have access to cloud infrastructure and platform under the same roof? Let’s talk about the different layers of Technology as a Service: who cares, what runs where, and how does it all fit together. In his session at 18th Cloud Expo, Phil Jackson, Lead Technology Evangelist at SoftLayer, an IBM company, spoke about the picture being painted by IBM Cloud and how the tools being crafted can help fill the gaps in your IT infrastructure.
Digital innovation is the next big wave of business transformation based on digital technologies of which IoT and Big Data are key components, For example: Business boundary innovation is a challenge to excavate third-party business value using IoT and BigData, like Nest Business structure innovation may propose re-building business structure from scratch, as Uber does in the taxicab industry The social model innovation is also a big challenge to the new social architecture with the design fr...
Data is an unusual currency; it is not restricted by the same transactional limitations as money or people. In fact, the more that you leverage your data across multiple business use cases, the more valuable it becomes to the organization. And the same can be said about the organization’s analytics. In his session at 19th Cloud Expo, Bill Schmarzo, CTO for the Big Data Practice at EMC, will introduce a methodology for capturing, enriching and sharing data (and analytics) across the organizati...
DevOps at Cloud Expo, taking place Nov 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 19th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long dev...
IoT offers a value of almost $4 trillion to the manufacturing industry through platforms that can improve margins, optimize operations & drive high performance work teams. By using IoT technologies as a foundation, manufacturing customers are integrating worker safety with manufacturing systems, driving deep collaboration and utilizing analytics to exponentially increased per-unit margins. However, as Benoit Lheureux, the VP for Research at Gartner points out, “IoT project implementers often ...