By Derek Ashmore
April 1, 2000 12:00 AM EST
As a consultant, developer and database administrator, I've often been asked to provide coding guidelines and tuning assistance for Java code that utilizes JDBC. Over time, I've been introduced to or developed standard coding practices that make JDBC code faster and less error-prone, and easier to read, understand and use. This article documents some of the more important "best practices" for using JDBC libraries to perform database access. As most of my clients are using Oracle database technologies, I've included several practices that are Oracle-specific.
For the purposes of this article the goals of best practices for JDBC programming are maintainability, portability and performance.
- Maintainability refers to the ease with which developers can understand, debug and modify JDBC code that they didn't write.
- Portability refers to the ease with which JDBC code can be used with multiple databases. It turns out that JDBC doesn't make database programming as platform independent as I'd like. In addition, I consider portability a noble goal even if you have no current plans to support multiple databases. Who knows how long your code will be around and what kinds of changes will have to be made to it?
- Performance refers to optimizing the speed and/or memory needed to run JDBC code.
Best Practices for JDBC Programming
The most common recommendations I make to Java programmers using JDBC are the following (discussed individually later):
- Use host variables for literals - avoid hard-coding them (Oracle specific).
- Always close statements, prepared statements and connections.
- Consolidate formation of SQL statement strings.
- Use the delegate model for database connection.
- Use Date, Time and Timestamp objects as host variables for temporal fields (avoid using strings).
- Limit use of column functions.
- Always specify a column list with a select statement (avoid "select *").
- Always specify a column list with an insert statement.
Use Host Variables for Literals (Avoid Hard-Coding Them)
I recommend that developers use host variables in SQL statements instead of hard-coding literals in SQL strings. As a convenience, many developers embed literals in SQL statements instead; the following code is an example. While host variables greatly improve performance on Oracle, they won't hurt performance on any other database platform that I'm aware of. Note that this example places a user ID directly in the SQL statement. (As an aside, note that this example uses the "+" operator for string concatenation. While this is convenient, using StringBuffers and the StringBuffer.append() method is a faster way to concatenate strings.)
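stmt = dbconnection.createStatement();
// The user ID is concatenated into the SQL text as a literal.
rst = stmt.executeQuery("select count(*) from portfolio_info where USER_ID = " + userID);
rst.next(); // position the cursor on the single result row
count = rst.getInt(1);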
To get the benefit of Oracle's optimizations, we need to use PreparedStatements instead of Statements for SQL that will be executed multiple times. Furthermore, we need to use host variables instead of literals for values that will change between executions. In the code above the SQL statement for user ID 1 will be different from the one for user ID 2 ("where USER_ID = 1" is different from "where USER_ID = 2"). A better way to approach this SQL statement is the following:
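pstmt = dbconnection.prepareStatement("select count(*) from portfolio_info where USER_ID = ?");
pstmt.setDouble(1, userID); // bind the user ID to the first host variable
rst = pstmt.executeQuery();
rst.next(); // position the cursor on the single result row
count = rst.getInt(1);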
In this code, because we're using host variables instead of literals, the SQL statement is identical no matter what the qualifying user ID is. Furthermore, we used a PreparedStatement instead of a statement. So that we can better understand the source of the performance benefit, let's walk through how SQL statements are processed by the Oracle optimizer. When SQL statements are executed, Oracle will execute (roughly speaking) the following steps:
1. Look up the statement in the shared pool to see if it has already been parsed or interpreted. If yes, Oracle will go directly to step 4.
2. Parse (or interpret) the statement.
3. Figure out how it will get the data you want; record that information in a portion of memory called the shared pool.
4. Get your data.
When Oracle looks up a SQL statement to see if it has already been parsed (step 1), it attempts a character-by-character match of the SQL statement text. If it finds a match, it can use the parse information already in the shared pool and doesn't have to do steps 2 and 3 above because the work has already been done. If you hard-code literals in your SQL statements, the probability of finding a match is very low ("where USER_ID = 1" isn't the same as "where USER_ID = 2"). This means that Oracle will have to reparse the first code example for each portfolio selected. Had the code used host variables, that statement (which would look something like "where USER_ID = :1" in the shared pool) would have been parsed once and only once.
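The effect of exact-text matching can be illustrated with a toy model. The sketch below is not how Oracle implements its shared pool; it only assumes a cache keyed on the exact SQL string, and the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of a statement cache keyed on exact SQL text (an illustration, not Oracle's implementation). */
final class SharedPoolSketch {
    private final Set<String> parsedStatements = new HashSet<String>();
    private int parseCount = 0;

    /** Models steps 1 to 3: a parse happens only when this exact text has not been seen before. */
    void execute(String sql) {
        if (parsedStatements.add(sql)) {
            parseCount++;
        }
    }

    int getParseCount() {
        return parseCount;
    }

    /** Issues the literal form of the query for user IDs 1..n and returns the number of parses. */
    static int parsesWithLiterals(int n) {
        SharedPoolSketch pool = new SharedPoolSketch();
        for (int userId = 1; userId <= n; userId++) {
            pool.execute("select count(*) from portfolio_info where USER_ID = " + userId);
        }
        return pool.getParseCount();
    }

    /** Issues the host-variable form n times; the text never changes, only the bound value would. */
    static int parsesWithHostVariable(int n) {
        SharedPoolSketch pool = new SharedPoolSketch();
        for (int userId = 1; userId <= n; userId++) {
            pool.execute("select count(*) from portfolio_info where USER_ID = :1");
        }
        return pool.getParseCount();
    }
}
```

Under this model, 100 different user IDs cost 100 parses with literals and a single parse with a host variable.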
I've experienced anywhere from a 5% to a 25% performance increase by writing SQL statements that are reusable (results vary with transaction volume, number of users, network latency and many other things). More information on this can be found in the Oracle Tuning manual. Within this manual look at the "Writing Identical SQL Statements" subheading within the "Tuning the Shared Pool" section.
While this best practice is Oracle-specific, many database platforms optimize preparing and reusing similar SQL statements. Most database platforms do this by optimizing reuse of PreparedStatement objects. Some databases, such as Cloudscape, optionally will store prepared statements in the database so they can be reused and shared by many users. Following this practice won't hurt performance with any database platform I'm aware of.
Always Close Statements, Prepared Statements and Connections
Many databases allocate resources to servicing statements, prepared statements and connections. Many database platforms continue to allocate those resources for a period of time if these objects aren't closed after use. With Oracle databases it's possible to get a "max cursors exceeded" error message when you don't close statements or prepared statements. In addition, with Oracle databases, the connections stay around on the server. Following this practice reduces the time and resources spent on maintenance to keep such errors from happening.
An example can be found in Listing 1. Note that I use a "finally" block to close the PreparedStatement. I don't close the connection in the example method as it is used elsewhere in the application. Note also that I call a utility to close the PreparedStatement for me. The code for this utility can be found in Listing 2. I use a utility to do the close so I don't have to replicate the exception-catching code everywhere.
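As a minimal sketch of the pattern described above (not the author's actual Listing 1 or Listing 2), the following shows a close utility and a method that calls it from a "finally" block. The class name JdbcCloser and the method countPortfolios are hypothetical; the query reuses the portfolio_info example from earlier in the article.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical utility that keeps the exception-catching code for close() in one place. */
final class JdbcCloser {
    private JdbcCloser() {}

    /** Closes a Statement or PreparedStatement; null is ignored and a failure is reported, not rethrown. */
    static void close(Statement stmt) {
        if (stmt == null) {
            return;
        }
        try {
            stmt.close();
        } catch (SQLException e) {
            System.err.println("Could not close statement: " + e.getMessage());
        }
    }

    /** Counts portfolios for a user. The connection is owned by the caller and is not closed here. */
    static int countPortfolios(Connection conn, long userId) throws SQLException {
        PreparedStatement pstmt = null;
        try {
            pstmt = conn.prepareStatement("select count(*) from portfolio_info where USER_ID = ?");
            pstmt.setLong(1, userId);
            ResultSet rst = pstmt.executeQuery();
            rst.next();
            return rst.getInt(1);
        } finally {
            // Runs on success and on failure; closing the statement also closes its ResultSet.
            JdbcCloser.close(pstmt);
        }
    }
}
```

On Java 7 and later, a try-with-resources statement achieves the same guarantee with less code; the explicit "finally" form is shown because it is the one the text describes.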
Consolidate Formation of SQL Statement Strings
As a database administrator, a substantial portion of my time is spent reading the code of others and suggesting ways to improve performance. As you might expect, looking at the SQL statements being issued is of particular interest to me. It's hard to follow SQL statements that are constructed by string manipulation scattered over several methods. Developers who maintain this kind of code must have the same problem. It greatly enhances readability if you consolidate the logic that forms the SQL statement in one place.
Listing 2 is a good example of this point. The string manipulation to form the SQL statement is located in one place, and the SQL statement logic is in a separate static block instead of within the method itself. This is done to reduce the number of times this string concatenation happens. Also note that StringBuffers are used for the string manipulation, not Strings. StringBuffers are more efficient at string concatenation than Strings are. In a project I recently completed the development team adopted this convention of consolidating SQL statements in static blocks directly above the method in which they were used. We found this practice quite readable and maintainable.
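To make the convention concrete, here is a minimal sketch of a SQL statement assembled once in a static block. It is not the author's listing; the class name PortfolioSql and the constant name are hypothetical, and the query is the portfolio_info example used earlier.

```java
/** Hypothetical holder that forms the SQL text in one place, once, when the class loads. */
final class PortfolioSql {
    /** Complete SQL text for counting a user's portfolios. */
    static final String COUNT_BY_USER;

    static {
        // All string manipulation for this statement lives here, so a reviewer
        // can read the whole statement without chasing it through several methods.
        StringBuffer sql = new StringBuffer();
        sql.append("select count(*) ");
        sql.append("from portfolio_info ");
        sql.append("where USER_ID = ?");
        COUNT_BY_USER = sql.toString();
    }

    private PortfolioSql() {}
}
```

A method would then call dbconnection.prepareStatement(PortfolioSql.COUNT_BY_USER) without doing any concatenation of its own. On Java 5 and later, StringBuilder is the unsynchronized counterpart of StringBuffer and could be used the same way.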
Use Delegate Model for Database Connection
I recently had the task of making the same application runnable on Oracle 8i, Cloudscape and Oracle Lite with as few modifications to existing code as possible. The development team wanted to avoid making JDBC-related classes platform-aware. In addition, the team wanted to take advantage of some platform-specific features, such as array processing and write batching in Oracle 8i, in special cases.
I was able to port the application to multiple environments largely through manipulation of one class responsible for managing our database connection. We had the foresight to create a delegate class for java.sql.Connection that manages needed connection functions and allows us to take advantage of platform-specific performance-tuning enhancements. All of our code used the delegate, not a native JDBC connection, as illustrated in Figure 2. While the specific class used for the project is proprietary, I've created another delegate, dvt.util.db.Connection, that illustrates the concept for the purposes of this article. The source for this delegate can be found in Listing 3.
Note that dvt.util.db.Connection determines which database platform is being used. If the platform is Oracle 8i, I establish array processing by setting the default row prefetch size (available with Oracle database connections) to improve the performance of our "select" statements. I also establish write batching to improve performance of update, insert and delete statements.
Since I consolidate the platform-specific code in my connection object delegate, classes that use my connection delegate don't need to be platform specific. In case they do, however, developers can use getPlatform() to get information about the database platform being used. Furthermore, I can add support for additional database platforms (e.g., Cloudscape and Sybase) largely by changing this class. The connection delegate won't solve all portability issues, but it will solve a good percentage of them.
I recommend using a connection delegate even for projects that currently support only one database platform. As we saw from recent Y2K efforts, you may find that your code is used for longer than you think, and used in other applications down the road.
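The shape of such a delegate can be sketched as follows. This is not the dvt.util.db.Connection class from Listing 3; the class name ConnectionDelegate, the platform constants and the detection rule (matching the product name reported by the driver) are all assumptions made for illustration, and the vendor-specific tuning calls are deliberately left out.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical delegate; application code talks to this class instead of java.sql.Connection. */
final class ConnectionDelegate {
    static final String ORACLE = "ORACLE";
    static final String OTHER = "OTHER";

    private final Connection connection;
    private final String platform;

    ConnectionDelegate(Connection connection) throws SQLException {
        this.connection = connection;
        // Assumption: the driver-reported product name is enough to identify the platform.
        String product = connection.getMetaData().getDatabaseProductName();
        boolean oracle = product != null && product.toUpperCase().indexOf("ORACLE") >= 0;
        this.platform = oracle ? ORACLE : OTHER;
        if (oracle) {
            // Platform-specific tuning (row prefetch, write batching) would be applied here.
            // It is omitted because it needs the vendor's driver classes on the classpath.
        }
    }

    /** Lets the rare platform-aware caller find out which database is in use. */
    String getPlatform() {
        return platform;
    }

    // The connection functions the application needs are forwarded unchanged.
    PreparedStatement prepareStatement(String sql) throws SQLException {
        return connection.prepareStatement(sql);
    }

    void commit() throws SQLException {
        connection.commit();
    }

    void rollback() throws SQLException {
        connection.rollback();
    }

    void close() throws SQLException {
        connection.close();
    }
}
```

Because only the constructor knows about vendors, supporting another platform means changing this one class rather than every class that issues SQL.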
Use Date, Time and Timestamp Objects as Host Variables for Temporal Fields (Avoid Using Strings)
For convenience, I've seen many developers use strings as host variables to represent dates, times and timestamps. I think they consider java.sql.Date, Time and Timestamp awkward. I agree with them from a coding perspective. Unfortunately, using strings as host variables for temporal fields can affect data access performance.
The following code snippet contains a SQL statement meant for an Oracle platform that uses a string variable to represent a DATE field. Without an understanding of how the database optimizers work, this appears to be an acceptable coding technique. For the small inconvenience of using a "to_char" function in the SQL statement, we avoid the Java work of converting a java.sql.Date or Timestamp into a more easily displayable data type elsewhere in the code.
Where to_char(sale_dt,'YYYY-MM-DD') >= ?
Unfortunately, Oracle and most database optimizers can't use an index to speed up performance of the query in this snippet. The database will have to read all rows of the order_sales table and convert the sale_dt of all rows to a string before it can do the comparison to see which rows satisfy the where clause of the query.
If we rewrite the query in the snippet to use a java.sql.Timestamp host variable, Oracle (and most of the common database platforms) will use an index and significantly improve performance in most cases, as follows:
Where sale_dt >= ?
For applications that use Oracle exclusively, I recommend using java.sql.Timestamp exclusively. Oracle's DATE data type actually contains time information (hours, minutes, seconds) as well as date information. Most other database platforms would call this type of field a TIMESTAMP. Oracle has no direct counterpart for a DATE (which has year, month and day only) and TIME data type offered by other platforms.
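A minimal sketch of binding a java.sql.Timestamp host variable is shown below. Only the where clause, the order_sales table and the sale_dt column come from the article; the "select count(*)" select list, the class name SalesQueries and the helper startOfDay are assumptions made so the example is complete.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;

/** Hypothetical query class that compares the date column directly instead of wrapping it in to_char(). */
final class SalesQueries {
    // Assumed select list; the column itself is left bare so an index on sale_dt stays usable.
    static final String COUNT_SINCE = "select count(*) from order_sales where sale_dt >= ?";

    /** Converts a 'yyyy-mm-dd' string (for example, user input) to midnight of that day. */
    static Timestamp startOfDay(String isoDate) {
        return Timestamp.valueOf(isoDate + " 00:00:00");
    }

    /** Counts sales on or after the given day, binding a Timestamp rather than a String. */
    static int countSalesSince(Connection conn, String isoDate) throws SQLException {
        PreparedStatement pstmt = null;
        try {
            pstmt = conn.prepareStatement(COUNT_SINCE);
            pstmt.setTimestamp(1, startOfDay(isoDate));
            ResultSet rst = pstmt.executeQuery();
            rst.next();
            return rst.getInt(1);
        } finally {
            if (pstmt != null) {
                pstmt.close();
            }
        }
    }
}
```

The string-to-timestamp conversion happens once in Java, on the single value being bound, instead of once per row inside the database.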
Limit Use of Column Functions
I generally recommend that developers limit use of column functions to the select lists of select statements. Moreover, I tend to stick to aggregate functions (e.g., count, sum, average) needed for select statements that use a "group by" clause. I make this recommendation for two reasons: performance and portability. Limiting function use to select lists (and keeping it out of where clauses) means that the use of a function won't block the use of an index. In the same way that the use of the "to_char" function prohibited the database from using an index in the earlier code snippet, column functions in where clauses likely prohibit the database from using an index.
In addition, many of the operations for which developers use SQL column functions (data type conversion, value formatting, etc.) are faster in Java than if the database did them. I've had between a 5% and a 20% performance improvement in many applications by opting to avoid some column functions and implementing the logic in Java instead. Another way to look at it is that column functions aren't tunable as we don't control the source code. Implementing that logic in Java makes it code that we can tune if need be.
Moreover, using non-ANSI-standard column functions can also cause portability problems. There are large differences in which column functions are implemented by the database vendors. For instance, one of my favorite Oracle column functions, "decode", which allows you to translate one set of values into another, isn't implemented in many of the other major database platforms. In general, column function use such as the use of "decode" has the potential to become a portability issue.
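As an illustration of moving a column function into Java, the sketch below replaces a decode-style translation with a lookup table. The status codes, labels and the column name status_cd are hypothetical and serve only to show the idea.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical Java replacement for a decode() call in a select list. */
final class StatusDecoder {
    private static final Map<String, String> LABELS = new HashMap<String, String>();

    static {
        LABELS.put("A", "Active");
        LABELS.put("I", "Inactive");
    }

    private StatusDecoder() {}

    /** Java counterpart of decode(status_cd, 'A', 'Active', 'I', 'Inactive', 'Unknown'). */
    static String decode(String code) {
        String label = LABELS.get(code);
        return label != null ? label : "Unknown";
    }
}
```

The query then selects the raw status_cd column, which is portable SQL, and the translation becomes ordinary Java code that can be profiled and tuned.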
Always Specify a Column List with a Select Statement (Avoid "Select *")
A common shortcut for developers is to use the "*" in select statements to avoid having to type out a column list. The line below illustrates this shortcut while the snippet immediately following illustrates the alternative where desired columns are explicitly listed.
Select * from customer
Select last_nm, first_nm, address, city, state, customer_nbr from customer
I recommend that developers explicitly list columns in select statements as illustrated above. The reason is that if the columns in any of the tables in the select are reordered or new columns are added, the results obtained with the select-asterisk shortcut will change and the class will have to be modified. For example, suppose a database administrator changes the order of the columns and puts column customer_nbr first (there are valid reasons why a DBA could reorder columns). In addition, suppose the DBA adds a column called country. The developer who used the shortcut select * from customer will have to change code. All the offset references used in processing the ResultSet will change. The developer who explicitly listed all columns can be oblivious to the change because the code will still work.
Explicitly listing columns in a select statement is a best practice because it prevents the need for maintenance in some cases.
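The connection between the column list and the ResultSet offsets can be sketched as follows. The class name CustomerReader and the method describe are hypothetical; the columns are the ones from the select statement above, and customer_nbr is assumed to be numeric.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical reader whose offsets are tied to its own select list, not to the table layout. */
final class CustomerReader {
    static final String SELECT_CUSTOMERS =
        "select last_nm, first_nm, address, city, state, customer_nbr from customer";

    private CustomerReader() {}

    /** Formats the current row of a ResultSet produced by SELECT_CUSTOMERS. */
    static String describe(ResultSet rst) throws SQLException {
        String lastName = rst.getString(1);   // last_nm is first in the select list
        String firstName = rst.getString(2);  // first_nm is second
        long customerNbr = rst.getLong(6);    // customer_nbr is sixth
        return customerNbr + ": " + firstName + " " + lastName;
    }
}
```

If the DBA reorders the table or adds a country column, offsets 1, 2 and 6 still refer to the same columns because the select list, not the table definition, fixes their order.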
Always Specify a Column List with an Insert Statement
A common shortcut for developers is to omit the column list in insert statements to avoid having to type out a column list. By default, the column order is the same as physically defined in the table. The first snippet below illustrates this shortcut while the next one illustrates the alternative where desired columns are explicitly listed.
Insert into customer
Values ('Ashmore','Derek','3023 N. Clark','Chicago','IL', 555555)
Insert into customer
(last_nm, first_nm, address, city, state, customer_nbr)
Values (?,?,?,?,?,?)
I recommend that developers explicitly list columns in insert statements as illustrated in the second snippet above. The reason is the same as why we should explicitly list columns in select statements. If the columns in the table being inserted into are reordered or new columns are added, the insert could generate an exception and the class will have to be modified. For example, suppose a DBA, as in the previous example, changes the order of the columns, puts column customer_nbr first and adds a column called country. The developer who used the first shortcut above will have to change code. The developer who explicitly listed all columns may be oblivious to the change because the code may still work. In addition, note that the version in the second snippet above uses host variables so the same PreparedStatement can be used for all inserts if there are multiple inserts.
Explicitly listing columns in an insert statement is a best practice because it prevents the need for maintenance in many cases.
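A minimal sketch of an insert with an explicit column list and host variables, reusing one PreparedStatement for several rows, might look like this. The class names CustomerInserter and Customer are hypothetical; the columns are those of the customer example above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Hypothetical inserter: explicit column list, host variables, one statement for all rows. */
final class CustomerInserter {
    static final String INSERT_CUSTOMER =
        "insert into customer (last_nm, first_nm, address, city, state, customer_nbr) "
        + "values (?, ?, ?, ?, ?, ?)";

    /** Simple value holder for one customer row. */
    static final class Customer {
        final String lastName, firstName, address, city, state;
        final long customerNbr;

        Customer(String lastName, String firstName, String address,
                 String city, String state, long customerNbr) {
            this.lastName = lastName;
            this.firstName = firstName;
            this.address = address;
            this.city = city;
            this.state = state;
            this.customerNbr = customerNbr;
        }
    }

    private CustomerInserter() {}

    /** Inserts every customer with the same PreparedStatement and returns the rows inserted. */
    static int insertAll(Connection conn, List<Customer> customers) throws SQLException {
        PreparedStatement pstmt = null;
        int inserted = 0;
        try {
            pstmt = conn.prepareStatement(INSERT_CUSTOMER);
            for (Customer c : customers) {
                // Host variables are bound in the order of the explicit column list.
                pstmt.setString(1, c.lastName);
                pstmt.setString(2, c.firstName);
                pstmt.setString(3, c.address);
                pstmt.setString(4, c.city);
                pstmt.setString(5, c.state);
                pstmt.setLong(6, c.customerNbr);
                inserted += pstmt.executeUpdate();
            }
            return inserted;
        } finally {
            if (pstmt != null) {
                pstmt.close();
            }
        }
    }
}
```

The statement is prepared once outside the loop, so the SQL text is identical for every row and only the bound values change.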
Recommendations for Stored Procedure Usage
Stored procedure programming languages (such as Oracle's PL/SQL) are handy and in many cases very convenient. I use them often for utility scripts and data-cleansing activities. I'm often asked about recommendations for stored procedure use in applications, but as their capabilities differ greatly among the major database platforms, I can't give platform-independent advice on the subject. I can, however, provide some thoughts on stored procedure use as it relates to portability and performance.
As these languages differ so greatly, their use within applications causes portability issues. For instance, some stored procedure languages allow procedures to return result sets, some do not. Some stored procedure languages allow temporary tables (usable within the current session only), some do not. We could find many more differences, but I think the point is clear. If portability is a concern, I recommend avoiding use of stored procedures except for database triggers.
Performance is a tougher issue because it differs radically between database vendors. Stored procedure use for some database platforms enhances performance; in others it degrades it. For Oracle platforms I advocate stored procedures within Java applications for database triggers only. For most other situations their use provides no benefit. If you want a more detailed discussion on when and how to use stored procedures, functions and packages within Oracle databases, see my article in JDJ December 1999 (Vol. 4, issue 12).
This article has discussed several ways to make JDBC code more performance-, maintenance- and portability-friendly on an individual basis. I always recommend team code reviews and documented coding standards as ways to develop more best practices and consistently apply existing practices. Furthermore, team code reviews help further the goals of best practices by improving the maintainability and general quality of code within an application.