Close Window

Print Story

JDBC 4.0: A Significant Advance on the Standard and Features Worth the Wait

Pooling is great - except it's not very tunable, it's hard to map end users back to connections in the pool, and if a connection ever becomes invalid inside the pool, expunging only that connection from the pool is nearly impossible; JDBC 4.0 addresses all these drawbacks

As a member of every previous JDBC Expert Group, it sometimes seems as if the specification process moves too slowly for the few features being added or it seems as if the new feature list is good, but "not good enough." The JDBC 4.0 specification fits neither of these perceptions. The specification includes a lot of new features - too many, in fact, to describe in this article. As this article was written a few months before the publication of the public draft of the JDBC 4.0 specification, it's possible that some features, features even more significant than those described here, might make it into the preview release. Most developers will be pleasantly surprised at the enhancement list, which includes everything from performance-tuning options to support for extended-level database features. Here we'll describe in detail some of the new features that are available in JDBC 4.0, along with the reasons why those features are important.

The JDBC 4.0 specification is now in public review and will ship as part of J2SE 6.0 sometime in 2006. The key goals of the JDBC Expert Group were to align with the most important features of the SQL 2003 specification, provide constructs that help improve developer productivity (sometimes called Ease of Development or EOD features), fine tune pooling constructs, and improve scalability. Unlike the JDBC 3.0 specification, there are some major new additions in the JDBC 4.0 spec, such as XML support. Overall, it's not just a collection of bug fixes to the JDBC 3.0 specification; JDBC 4.0 is clearly a significant advance of the standard.

XML Support
One of the most useful new features in JDBC 4.0 is support for the SQL 2003 XML data type. That is, the JDBC spec has been expanded to support XML data types in the database, Java XML bindings, and SQL/XML extensions to the SQL grammar. Although many databases support XML data types today, applications must use either JDBC driver extensions to transfer data to or from the database or use the Clob interface, which is limited in nature to a string representation of the XML.

A new SQL data type, SQLXML, is part of the specification. Applications can use getTypeInfo() to determine if their database supports a native XML data type. For example, getTypeInfo() against a SQL Server 2000 instance doesn't return a result row corresponding to the SQLXML data type, indicating that there's no native XML data type available for that particular database backend. By contrast, getTypeInfo() against a SQL Server 2005 instance returns a result row, indicating that an XML data type is available. It also returns information indicating that the native type name is "XML." From this information, applications can create tables that contain columns of the XML data type.

So applications can populate data in XML columns and retrieve data from those columns, JDBC has been expanded to include native Java bindings for XML. Initially, the Expert Group looked at adding all types of XML bindings, such as text, JAXP, DOM, JDOM, SAX, stream, and StAX. In the end, allowing so many different bindings had too many drawbacks. The Expert Group decided to allow bindings for Java strings, Java character streams, and StAX.

If an application needs to process data inside DOM or SAX, for example, it's a straightforward exercise to convert that data from a StAX stream to those representations.

To create a Java construct that can be used to process XML data, an application can create a SQLXML object off of the Connection object using the createSQLXML() method. The object that's created doesn't contain any data initially. Data can be added to the object by calling setString() or associating an XML stream using createXMLStreamWriter() with the object. The following code illustrates how an application can use these techniques to insert a row containing XML data into a table.

Similarly, applications can retrieve a string representation of the XML using the getString() method of the object or by associating an XML stream using createXMLStreamReader(). The following code illustrates how an application can SELECT a column of the SQLXML data type, create a SQLXML Java binding using getSQLXML() on the result set, and read the value into the new object using a StAX representation via createXMLStreamReader().

SQL 2003 also includes extensions to the SELECT syntax that lets you construct XML results from tabular columns. The following code shows how to create a SELECT statement that produces a result set containing two columns: a CustId result column of type integer and a CustInfo column of type SQLXML.

The SELECT statement uses the new SQL/XML extension XMLELEMENT to process multiple base columns into a single XML result column. JDBC 4.0 has also been expanded to support using database metadata methods to determine which SQL/XML constructs are supported on the connection. Applications can then execute any supported SELECT statement with SQL/XML extensions to produce SQLXML result columns that can be processed using the new XML Java bindings.

Connection and Statement Pooling Enhancements
Most deployed JDBC applications use connection pooling, statement pooling, or a combination to obtain better application performance. Pooling is great - except it's not very tunable, it's hard to map end users back to connections in the pool, and if a connection ever becomes invalid inside the pool, expunging only that connection from the pool is nearly impossible. JDBC 4.0 addresses all these drawbacks.

Currently, prepared statement pooling is very atomic - either statement pooling is on or it's off. This kind of operation doesn't fit the programming model many applications use. In common deployments, it's likely that an application will have a certain set of SQL statements that are re-executed multiple times and a few SQL statements that might only be executed once or twice during the life of the application. Unfortunately, existing statement pooling implementations give no weight to a SQL statement executed 100 times versus one that's executed only twice. Again, either a statement goes into the pool, potentially causing another statement to be removed from the pool, or there's no pool. JDBC 4.0 provides a more granular level of statement pooling by letting applications hint to the pool manager about whether a SQL statement should be pooled.

The preparedStatement interface has been expanded by the addition of two new methods: isPoolable() and setPoolable(). The isPoolable() method returns a Boolean flag that denotes whether the SQL statement identified on the preparedStatement object should be pooled (by default, a statement is poolable when it's created). Applications can specifically request that a statement not be pooled by calling setPoolable(false). Using these constructs, application designers gain more control over the performance aspects of their applications. Queries that are reused are pooled and provide optimal performance, and queries that are used infrequently don't affect the pool.

Connection pooling is a mature feature available in all J2EE application servers and is used in many standalone Java applications. One might think that a technology that's been available for so long would have all the kinks worked out. JDBC 4.0 addresses some major concerns of connection pooling and discussions are already underway on connection pooling enhancements for the JDBC specification post-4.0.

Today, when the response time of your database queries is ridiculously slow because your application server is out of CPU cycles, your database appears to be "hung," or you try to monitor the status of your applications only to see that "some JDBC connection" is using all the CPU, the facilities available to help you find the culprit aren't very good. Once a JDBC connection is established, the tracking mechanism between that physical connection and an application's use of the logical connection is lost. The connection pool manager assigns physical connections in the pool to any application that meets authentication requirements; the pool manager doesn't keep any statistics on the application requesting a connection, and the connection itself is a black box to the application. In other words, if you are using a monitoring tool and see that a JDBC connection is "bogging down the system," it's impossible to track down which JDBC application is actually invoking the driver.

To solve this problem, JDBC 4.0 has added setClientInfo() and getClientInfo() to the Connection interface. After connecting, an application can call setClientInfo() to associate client-specific information to the JDBC connection object, such as application name, site name, and department name for the JDBC connection. Monitoring tools can then retrieve this information to help pinpoint where the problem is.

Large-scale deployments often face another problem when a pool is populated with a large number of connections. How does the pool manager detect when a connection has become invalid? Today, there's no facility inside a JDBC driver to check and see if a connection is still valid. The Connection.isClosed() method is sometimes mistakenly thought to do this, but the intent of isClosed() is to check and see if a connection is open or closed, not whether the connection is still usable. If a connection pool manager decides that a connection is invalid or is suspect (through whatever proprietary means is available), the most common technique used is for the pool manager to terminate all the connections in the pool and re-initialize it. This is a very drastic approach to take and is extremely expensive in terms of performance. A new method on the Connection interface, isValid(), has been added so pool managers can specifically request the driver if a connection is still usable. If a connection is invalid, the pool manager can discard only the marred connection and not the contents of the entire pool.

SQLException Improvements
JDBC 4.0 is meant to make it easier for developers to write JDBC applications. There are too many changes to the specification to describe all of the "ease of development" features here; however, one of the features we'll talk about is handling SQLExceptions. Applications can call getSQLState() when a SQLException happens to get the details about the cause of the error. The problem developers face is that there are many different SQLStates that can be returned. Programmatically figuring out what higher-level reason caused the error is straightforward, but time-consuming, error-prone, and monotonous.

JDBC 4.0 expands the java.sql.package's exception hierarchy by providing two distinct subclasses that indicate whether exceptions are transient (and might succeed if retried) or aren't transient (and won't succeed if retried). These subclasses are: SQLNonTransientException and SQLTransientException. SQLNonTransientExceptions are subclassed further into five distinct cases: SQLSyntaxErrorException, SQLInvalidAuthorizationSpecException, SQLIntegrityConstraintViolationException, SQLDataException, and SQLNonTransientConnectionException. SQLTransientExceptions are subclassed into three distinct cases: SQLTimeoutException, SQLTransactionRollbackException, and SQLTransientConnectionException.

The idea behind this change in the specification is that applications might only be concerned with whether this error is "expected" or not. If it's not expected, the operation can just be retried and may well succeed. In this case, there's no checking 30 different SQLStates to see if the statement should be re-executed. An application just checks to see if the SQLException was a SQLTransientException and can then be retried.

If an application needed to determine whether the error was a "programming error," such as an invalid data conversion, it could check the SQLException to see if it was an SQLDataException. Programming a single check is much easier than checking 12 different SQLStates.

Other New Features
We can only touch on a few of the many enhancements for JDBC 4.0 here. It includes support for a new ROWID data type, bindings for the National Character Set, improved management of Clob and Blob objects, an improved mechanism for installing and recognizing JDBC drivers on a system, new annotations and interfaces, and extensive JDBC specification clarifications. Please take a look at the JDBC 4.0 (JSR-221) details on jcp.org and provide feedback that's relevant to your JDBC use.

© 2008 SYS-CON Media Inc.