Cross-Database Portability with JDBC

Java programs can use the JDBC API to access relational databases, thereby cleanly separating the database system from the application. This approach holds the promise of cross-database portability, i.e., "write once, run on any database." In practice, several stumbling blocks stand in the way of fulfilling this promise.

This article shows examples of these stumbling blocks so you can see the common pattern of development. Drawing on this pattern, I'll develop a roadmap you can use to write applications that can work around these stumbling blocks. Finally, I'll demonstrate how the roadmap can be applied to build a portable application.

Breakdown of Portability
Three general situations in which the portability of code breaks down are using SQL escape syntax, translating external data types to database-specific types, and executing positioned updates on the rows of a table. These examples arise out of my experiments with writing JDBC-based applications against an organization schema with two tables, EMP and DEPT. The schema was implemented in two different database systems, MS Access 97 and Oracle8 Personal Edition, both running on a Windows NT 4.0 workstation. The schemas for the respective database systems are shown in Figure 1. Note that the structures of the EMP and DEPT tables are the same in both databases, but the respective native data types of the columns differ, as shown in the box.

To drive the Access database, I used a Windows-supplied Microsoft Access driver and the JDBC-ODBC bridge from Sun (which came with JDK 1.1) as the JDBC driver of choice. For Oracle I used the JDBC OCI8 driver that supported JDBC 1.22.

Situation 1: SQL Escape Syntax
The JDBC API is implemented in the java.sql package. A java.sql.Statement instance is used to execute simple SQL statements.

JDBC specifies a so-called SQL escape syntax that may be used in Statement instances for various tasks such as pattern matching of strings, executing database-specific scalar functions, manipulating date and time values, calling stored procedures and executing outer joins. For each of these tasks JDBC specifies a syntax that's used by the developer, and the JDBC driver is responsible for translating this into database-specific code. Following is an outer join example that doesn't port.

  • Outer join: Consider the data in the EMP and DEPT tables shown in Figure 2 (only the relevant columns are shown).
The following join operation would produce the result shown in Figure 3:

select dname, ename from dept d, emp e
where d.deptno = e.deptno

Note that the department OPERATIONS (DEPTNO = 40) doesn't appear in the resulting table because none of the entries in the EMP table has a DEPTNO value of 40. But what if you want to see every department represented in the result, even if it doesn't have any entries in EMP? You need to execute an outer join.

An outer join preserves unmatched rows in either the left table (left outer join) or the right table (right outer join). JDBC prescribes a SQL escape syntax for outer joins that looks like this:

{ oj outer-join}

where the keyword oj stands for outer join, and outer-join is of the form:

table left outer join { table | outer-join} on search-condition

If you want to preserve all the unmatched departments in the example, you can issue a left outer join by specifying the tables in order DEPT followed by EMP:

select dname, ename from
{ oj dept d left outer join emp e on d.deptno = e.deptno }

This works perfectly well with Access, but not with Oracle. The OCI8 driver doesn't support the oj escape syntax. Instead, you need to use the following equivalent SQL statement:

select dname, ename from dept d, emp e
where d.deptno = e.deptno (+)

in which the (+) at the end of the statement stands for outer join.

The result of this outer join is the table in Figure 4.
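
To keep the application itself portable, the choice between the two forms can be isolated in one place. The following sketch (a hypothetical helper, not from the original listings) builds the appropriate SQL text depending on whether the driver handles the oj escape; how that flag is obtained, e.g., from configuration or DatabaseMetaData queries, is left to the caller:

```java
// Hypothetical helper: isolate the choice between JDBC escape syntax
// and Oracle's proprietary (+) notation for the DEPT/EMP left outer join.
public class OuterJoinSql {
    public static String deptEmpLeftOuterJoin(boolean driverSupportsOjEscape) {
        if (driverSupportsOjEscape) {
            // JDBC SQL escape syntax, translated by the driver
            return "select dname, ename from "
                 + "{ oj dept d left outer join emp e on d.deptno = e.deptno }";
        }
        // Oracle-specific form: (+) marks the side that may have no match
        return "select dname, ename from dept d, emp e "
             + "where d.deptno = e.deptno (+)";
    }
}
```

The rest of the application then executes whichever string this helper returns, so the database-specific knowledge never leaks beyond it.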

Situation 2: DataType Translation
A large part of the work in making JDBC applications portable involves matching external data types to native database types (and vice versa). This matching is done in two steps: (1) the external type is matched to a JDBC SQL type (defined as a constant in the java.sql.Types class), and (2) the JDBC SQL type is translated by the JDBC driver to the native database type.

The next example shows how the table of a database can be populated. A naïve approach hard-codes datatype-specific information in the program, thereby making it nonportable. Following this I'll show a portable alternative that illustrates another stumbling block.

  • Example: Inserting data using PreparedStatement: A PreparedStatement instance is used to prepare and execute precompiled SQL statements. Suppose you want to insert rows of data in the table EMP. The pattern of the SQL statement to insert a row is always the same; it's only the values of the columns that change with every new row. To start with, you construct a PreparedStatement instance as follows:

    PreparedStatement ps = conn.prepareStatement("insert into emp " +
    "(empno, ename, job, mgr, hiredate, sal, comm, deptno) " +
    "values (?, ?, ?, ?, ?, ?, ?, ?)");

The variable conn refers to the database connection. The PreparedStatement instance, ps, is parametrized with the values of the columns, indicated by the question marks. Each parameter is matched with a value before executing the prepared statement for some row. The following code segment shows how the empno column value is filled:

String next = st.nextToken();
if (next.equals("null")) ps.setNull(1, Types.SMALLINT);
else ps.setShort(1, (short)Integer.parseInt(next));

The first line reads an input token (using a StringTokenizer instance st to parse each input line) as a String object. The second line checks whether this token spells null, which indicates that the input does not have a value for this column. In this case a special setNull method is invoked on ps. The first argument to this method indicates the parameter position in the prepared statement (empno is the first); the second argument indicates the JDBC SQL type of the intended value. As mentioned earlier, the class java.sql.Types defines a set of constants corresponding to various SQL types, and the constant java.sql.Types.SMALLINT stands for the SQL type SMALLINT.

For every column in the EMP table, these three lines of code need to be implemented with appropriate modification in data types. There are a couple of serious drawbacks to this approach. One is that the program becomes ungainly and hard to maintain. Another is that all data types are exposed to the developer, so if a column type changes, or this program is ported to a different database, this code must be rewritten.

A better alternative postpones the datatype translation to runtime, thereby making the code portable. This alternative makes use of the PreparedStatement method setObject:

ps.setObject(1, next);

used instead of

ps.setShort(1, (short)Integer.parseInt(next));

The second argument, next, is a String instance that's automatically translated by the JDBC driver to the required database type, which in this case is a small integer. There is no coding of any datatype information. This statement can be set in a loop, using the loop index to control the first argument, which is the position of the parameter in the PreparedStatement instance.
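
The parsing and the setObject loop can be sketched as follows. The class and method names here are illustrative (not from the original listings), and the sketch is written against a modern JDK; the columnTypes array holds java.sql.Types constants for the columns, discovered at runtime as described next:

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class RowLoader {
    // Parse one whitespace-separated input line; the literal token "null"
    // becomes a Java null so the caller can call setNull instead of setObject.
    public static List<String> parseLine(String line) {
        List<String> values = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line);
        while (st.hasMoreTokens()) {
            String next = st.nextToken();
            values.add("null".equals(next) ? null : next);
        }
        return values;
    }

    // Bind all column values in one loop; columnTypes[i] is the
    // java.sql.Types constant for column i+1.
    public static void bind(PreparedStatement ps, List<String> values,
                            int[] columnTypes) throws SQLException {
        for (int i = 0; i < values.size(); i++) {
            String v = values.get(i);
            if (v == null) ps.setNull(i + 1, columnTypes[i]);
            else ps.setObject(i + 1, v);  // driver translates the String
        }
    }
}
```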

If one or more of the column values is null, a little more work is required. The statement would then be written as:

ps.setNull(1, sqlType);

To maintain the datatype independence of the code to make it portable, the SQL type of the column value isn't coded directly here. Instead, it's discovered at runtime using the java.sql.DatabaseMetaData interface, and then plugged into the above statement.

The DatabaseMetaData interface provides metadata information for a database. Metadata is data that describes data. For instance, the EMP table contains employee information. This is data. Metadata would contain information on things like the number of columns in the table, the data types of these columns, whether a column can have null values, and so on. This is data about data.

The DatabaseMetaData interface provides methods that can be called to find out various metadata information about a database.

For our example above, we need to find the data types of the columns of EMP so we can plug that information into the PreparedStatement setNull method call.
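
A minimal sketch of that discovery, using the standard DatabaseMetaData getColumns call, might look like this (the class name and the convenience typeName method are illustrative):

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

public class ColumnTypes {
    // Returns column name -> java.sql.Types constant, in ordinal order,
    // for every column of the given table.
    public static Map<String, Integer> discover(Connection conn, String table)
            throws SQLException {
        Map<String, Integer> types = new LinkedHashMap<>();
        DatabaseMetaData md = conn.getMetaData();
        try (ResultSet rs = md.getColumns(null, null, table, "%")) {
            while (rs.next()) {
                types.put(rs.getString("COLUMN_NAME"),
                          rs.getInt("DATA_TYPE"));  // a java.sql.Types value
            }
        }
        return types;
    }

    // Convenience: readable names for a few common Types constants.
    public static String typeName(int sqlType) {
        switch (sqlType) {
            case java.sql.Types.SMALLINT: return "SMALLINT";
            case java.sql.Types.VARCHAR:  return "VARCHAR";
            case java.sql.Types.DATE:     return "DATE";
            case java.sql.Types.NUMERIC:  return "NUMERIC";
            default:                      return "OTHER(" + sqlType + ")";
        }
    }
}
```

The values returned by discover can then be fed straight into the setNull call, keeping all type knowledge out of the source code.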

This alternative works with the MS Access database, but not with the Oracle database: the OCI8 driver refuses to pass the setObject invocation because it's unable to translate the String external type to the required database type.

Situation 3: Positioned Update
When a SQL query is executed in JDBC, it returns a result set that represents the resulting table, consisting of a sequence of rows. A cursor is used to traverse the rows of a result set. The term positioned update refers to updating a database row referred to by the current position of the result set cursor. The following steps need to be taken to effect positioned updates from a result set:

  1. Execute a SELECT FOR UPDATE statement. At the very least this will lock the rows of the table in the result set against other concurrent transactions. For this step to work, the database must support SELECT FOR UPDATE.
  2. Get the cursor name of the result set. This will be used to reference the current row at the time of the update.
  3. Construct a prepared statement to update a row, using the UPDATE ... WHERE CURRENT OF <cursor name> form, with input parameters for the updated columns as well as for the cursor. For this to work, the database must support positioned update.
  4. Traverse the result set, and for every row to be updated execute the prepared statement after setting all the input parameter values. The cursor name would refer to the row currently being referenced in the result set.
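
Condensed into code, the four steps might look like the following sketch (a hypothetical class using the running example; it assumes the driver actually reports support for SELECT FOR UPDATE, positioned updates, and cursor names):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class PositionedUpdate {
    // Build the positioned-update statement for a given cursor name.
    public static String updateSql(String cursorName) {
        return "update emp set ename = ? where current of " + cursorName;
    }

    // Lowercase every employee name via a positioned update.
    public static void lowercaseNames(Connection conn) throws SQLException {
        try (Statement sel = conn.createStatement()) {
            ResultSet rs = sel.executeQuery(
                "select ename from emp for update");             // step 1
            String cursor = rs.getCursorName();                  // step 2
            try (PreparedStatement upd =
                     conn.prepareStatement(updateSql(cursor))) { // step 3
                while (rs.next()) {                              // step 4
                    upd.setString(1, rs.getString(1).toLowerCase());
                    upd.executeUpdate();
                }
            }
        }
    }
}
```
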
Now I'll show a pure JDBC code template using the above steps, written for full portability. This time around, the stumbling block to portability is even more severe.

  • Example: JDBC template and Oracle specifics: Whether a database supports the required SELECT FOR UPDATE and UPDATE...WHERE statements described above can be discovered by using the DatabaseMetaData interface. Assuming a database does in fact support the required functionality, the code in Listing 1 serves as a template for updating all employee names in the EMP table to lower case.
MS Access doesn't support the required database functionality for positioned update. Oracle supports it, but the OCI8 driver doesn't implement the JDBC specification of cursor name. Instead, a completely different solution is adopted. The driver provides a ROWID, which is equivalent to the cursor name. A ROWID is added as a pseudocolumn to a query:

select name, rowid from emp

It may be retrieved using the ResultSet getString method:

String rowid = rset.getString(2);

It may be set as a parameter using the PreparedStatement setString method:

ps.setString(2, rowid);

The JDBC-compliant template code shown in Listing 1 can be reworked for Oracle, using rowid, at the cost of giving up portability.
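
Put together, the Oracle rework might look like this sketch (illustrative class, not Listing 1 itself; note the rowid pseudocolumn taking the place of the cursor name):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class RowidUpdate {
    // Oracle-specific: the ROWID pseudocolumn identifies the row to update.
    public static String updateSql() {
        return "update emp set ename = ? where rowid = ?";
    }

    public static void lowercaseNames(Connection conn) throws SQLException {
        try (Statement sel = conn.createStatement();
             PreparedStatement upd = conn.prepareStatement(updateSql())) {
            ResultSet rs = sel.executeQuery("select ename, rowid from emp");
            while (rs.next()) {
                upd.setString(1, rs.getString(1).toLowerCase());
                upd.setString(2, rs.getString(2));  // the row's ROWID
                upd.executeUpdate();
            }
        }
    }
}
```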

Roadmap for Workarounds
The examples given here point to a common factor that contributes to the stumbling blocks: the JDBC driver. A JDBC driver implements the JDBC specification for a specific database system. For any given database system there is generally a wide choice of drivers available, including those from the database system vendor as well as third-party vendors. These drivers may differ in various respects, especially in the degree to which they implement the JDBC specification. Choosing an inappropriate driver can force the developer to write database-specific code in the application, thereby giving up portability.

A related common issue that is apparent from the examples is that even if the driver is appropriate for the task at hand, some database-specific information may have to be factored into the application. A naïve approach to this, which is to hard-code the required information into the application, makes the application nonportable. However, it's often possible to discover this information at runtime instead by using the DatabaseMetaData interface, thereby maintaining the portability of the application.

These observations suggest a two-point roadmap to work around the stumbling blocks:

  • Point 1: Implement a "back-end" class that can load any given driver at runtime from a list of candidate drivers. This can be used not only to choose among a set of drivers for a single database, but among drivers for different databases if the application is ported to a different database. Figure 5 illustrates this point.

  • Point 2: Separate metadata discovery from other code by implementing one or more classes that can serve to discover metadata at runtime. Depending on what kind of metadata needs to be discovered, you could design a suite of discovery classes that could be loaded ("plugged in") at runtime as required. Figure 6 illustrates this point.
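
The core of Point 1 might be sketched as follows; the DriverLoader name echoes the class described in the next section, but this body is illustrative, trying each candidate driver class name in turn until one loads:

```java
import java.util.List;

// Sketch of the "back-end" driver loader: try each candidate JDBC driver
// class in turn and report which one was loaded (null if none was found).
public class DriverLoader {
    public static String loadFirst(List<String> candidateClassNames) {
        for (String className : candidateClassNames) {
            try {
                Class.forName(className);  // registers the JDBC driver
                return className;
            } catch (ClassNotFoundException e) {
                // not on the classpath; try the next candidate
            }
        }
        return null;
    }
}
```

A candidate list such as ["oracle.jdbc.driver.OracleDriver", "sun.jdbc.odbc.JdbcOdbcDriver"] then lets the same application fall back from one driver, or one database, to another at runtime.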
Implementing the Roadmap
I'll now use the roadmap to sketch the process of building an application that populates the EMP table with data from a text file. Recall that a row of data may be inserted into the table by using a PreparedStatement instance, which in turn requires that the type of each column of EMP be discovered at runtime using the DatabaseMetaData interface.

Following Point 1 of the roadmap, the DriverLoader class is implemented, which loads either the OCI8 driver for Oracle8 or the JDBC-ODBC bridge for MS Access, as required at runtime. Other drivers for the existing databases, as well as drivers for other databases, may be added as needed.

Following Point 2 of the roadmap, the application is divided into three classes. One of these, TableColumns, implements the metadata discovery process. In this case it's the discovery of column types for the table EMP. Another class, TableMediator, uses this metadata and interacts with a third class, TableBuilder, that reads data from the input text file and sends it down the chain of classes to the database.

This chain of classes is illustrated in Figure 7.

The JDBC driver is the most critical piece in any Java database application. The driver must be picked with care, taking into account the architecture of the application, the extent to which the driver implements the JDBC specification, and the performance of the driver for various connection and database access operations.

I picked this specific set of three stumbling blocks for illustration simply because they provide a window into very different ways in which JDBC may be used in a database application. These examples and the workarounds demonstrated point to a general way of structuring a database application for cross-database portability. Specific refinements to this general approach can be adopted based on special requirements of the applications and the architecture; there's a lot of room for maneuvering within the proposed roadmap.

More Stories By Sesh Venugopal

Sesh Venugopal holds a Ph.D. in computer science from Rutgers University. He runs his own IT and education consulting company, Intecus, Inc. (www.intecus.com), specializing in Web-based systems using the Java platform. Sesh is the author of a textbook, Data Structures: An Object-Oriented Approach with Java, also online at www.intecus.com/bookpage.html.
