By Tim Middleton
February 21, 2008 12:00 PM EST
Keeping data in the mid-tier ensures fast access, but poses a number of challenges in scalability and manageability. These include:
a) How to keep a consistent view of the data for all members
b) How to ensure that transactions aren't lost if they're not immediately written to back-end data stores
c) How to manage the cluster membership, data partitioning, and workload distribution of the servers that form this kind of architecture
Mid-tier caching or "data grid" solutions bring the data closer to applications, thereby removing load from the database. This lets applications scale linearly through the addition of commodity-based hardware, ensuring that the architecture can satisfy ever-increasing transaction throughput and availability demands.
This article is intended for application developers and software architects who would like to understand what a data grid is, what types of applications and industries can benefit from the technology, how it works under the covers, and what to consider when deploying such a solution.
Mid-tier Object Caching and Data Grid Solutions
Keeping data cached in object form in a mid-tier data grid minimizes the overhead of translating between the relational world of the back-end data sources and the object form required by the mid-tier. It also ensures minimum latency when accessing and updating data from applications.
Solutions such as Oracle Coherence, GigaSpaces XAP, and IBM ObjectGrid solve these issues in slightly different ways, but each generally relies on its own clustering technology. Although caching is often the initial reason for adopting these kinds of technologies, their capabilities go far beyond simple data caching. The enhanced functionality provided by these solutions qualifies them as data grids.
Applications that require extremely fast and reliable access to data, massive parallelization of processing, predictable scalability, and extreme event processing capabilities can benefit greatly from data grid solutions. Industries that use these solutions include financial services (for online stock trading), airline and accommodation (for online search aggregation sites), telecommunications, and online gaming.
Data grid architecture typically comprises many commodity-based, multi-core machines with multiple Java Virtual Machines (JVMs) running on each machine. High-speed switched networks connect these machines together, and clients connect into the data grid to do data processing and manipulation. Machines can be added to the grid as required, without bringing down or reconfiguring each grid server or client.
Data grid solutions provide a reliable, performant, scalable data tier that fits nicely with existing clustering solutions for objects such as HttpSession and EJB stateful session beans. In existing application server implementations, the client state represented in these artifacts is usually serialized and written to a back-end replication channel, which could be the data grid rather than the database — even though the data grid is "backed up" by the database.
Under the Covers
Most caching solutions implement the java.util.Map interface, which provides "key value" pair storage maps for objects. This enables easy replacement of custom-built caching solutions, and most data grid implementations can be used as a "drop-in" replacement. Solutions such as Oracle Coherence and others provide many extensions to this interface.
Some of these extensions include querying and aggregating data seamlessly across the data grid, grid-style data processing (sending the processing to the data), data locking and transactions, real-time events, and many more. Table 1 outlines some of these capabilities.
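As a minimal sketch of the "drop-in" idea, the following plain Java example codes a lookup service against java.util.Map only. The PriceLookup class, its loadFromBackEnd method, and the made-up price value are hypothetical and are not part of any product's API. Under that assumption, a data grid's Map implementation could be passed to the constructor in place of the HashMap used here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical service that depends only on java.util.Map, not on a concrete cache.
class PriceLookup {
    private final Map<String, Double> cache;

    PriceLookup(Map<String, Double> cache) {
        this.cache = cache;
    }

    // Returns the cached price, loading and caching it on a miss.
    double priceOf(String symbol) {
        Double cached = cache.get(symbol);
        if (cached == null) {
            cached = loadFromBackEnd(symbol);
            cache.put(symbol, cached);
        }
        return cached;
    }

    // Stand-in for a database read; the value is invented for illustration.
    private double loadFromBackEnd(String symbol) {
        return symbol.length() * 10.0;
    }
}

class MapDropInDemo {
    public static void main(String[] args) {
        // A home-grown cache backed by a HashMap...
        PriceLookup local = new PriceLookup(new HashMap<>());
        // ...and a different Map implementation, with no change to PriceLookup.
        PriceLookup swapped = new PriceLookup(new ConcurrentHashMap<>());
        System.out.println(local.priceOf("ORCL") + " " + swapped.priceOf("ORCL"));
    }
}
```

Because PriceLookup never names a concrete Map class, swapping the cache is a one-line change at the point where the service is constructed.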
Typical use of data in these solutions involves accessing and updating it via the Map interface. For example, to create a new Customer object and put it in the cache, you do the following:
NamedCache customers = CacheFactory.getCache(Customer.CACHENAME);
Customer cust = new Customer(1, "Customer Name", "Address", "City", "State", "PostCode", 0.0);
customers.put(cust.getKey(), cust);
To access and then update the information in the cache:
Customer myCust = (Customer)customers.get(key);
myCust.setBalance(1234.00);
customers.put(key, myCust);
In this example we used a single Customer object in the data grid. Taking advantage of the distributed and parallel nature of the data grid lets us efficiently load information from the back-end data stores, making the data available to applications in a timely fashion.
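One point worth noting about the get-then-put update shown earlier: it is two separate operations, so two clients updating the same entry at the same time can overwrite each other's change. The locking and "send the processing to the data" extensions mentioned above exist to address this. The sketch below is only a single-JVM analogy using ConcurrentHashMap from the standard library, not a data grid API; the AtomicUpdateSketch class and its credit method are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class AtomicUpdateSketch {
    // Adds `amount` to the balance stored under `key` in one atomic step,
    // instead of a separate get() followed by put().
    static double credit(ConcurrentMap<Integer, Double> balances, int key, double amount) {
        // merge() stores `amount` if the key is absent, otherwise old + amount.
        return balances.merge(key, amount, Double::sum);
    }

    public static void main(String[] args) throws InterruptedException {
        ConcurrentMap<Integer, Double> balances = new ConcurrentHashMap<>();
        balances.put(1, 0.0);

        // Two threads credit the same entry concurrently.
        Runnable work = () -> {
            for (int i = 0; i < 1000; i++) {
                credit(balances, 1, 1.0);
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();

        // No update is lost because each credit is applied atomically.
        System.out.println(balances.get(1));
    }
}
```

In a real data grid the equivalent operation would run on the server that owns the entry, which also avoids moving the object across the network twice.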
In the mid-tier we work with data in object form: plain old Java, .NET, and C++ objects (POJOs, PONOs, and POCOs). Typical usage scenarios require access to this cached data as well as grid-style processing. It's also important to be able to access this data seamlessly across platforms and languages such as .NET (C#), C++, and Java.
Real-time event processing is particularly useful in the grid because it lets developers design and implement extremely fast event propagation across different applications. It also lets a client, say a .NET application on a workstation, monitor data as it changes.
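The event model can be illustrated with a small in-process sketch. The EntryListener interface and ObservableCache class below are hypothetical and exist only for this example; real products define their own listener interfaces and deliver events across the network rather than within one JVM.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical listener type, invoked whenever an entry is inserted or updated.
interface EntryListener<K, V> {
    void entryUpdated(K key, V oldValue, V newValue);
}

// A map-like cache that notifies registered listeners on every put.
class ObservableCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final List<EntryListener<K, V>> listeners = new CopyOnWriteArrayList<>();

    void addListener(EntryListener<K, V> listener) {
        listeners.add(listener);
    }

    // Stores the value, then tells each listener the old and new values.
    V put(K key, V value) {
        V old = entries.put(key, value);
        for (EntryListener<K, V> listener : listeners) {
            listener.entryUpdated(key, old, value);
        }
        return old;
    }

    V get(K key) {
        return entries.get(key);
    }
}

class EventDemo {
    public static void main(String[] args) {
        ObservableCache<String, Double> prices = new ObservableCache<>();
        // A "monitoring client" that reacts to each price change.
        prices.addListener((key, oldValue, newValue) ->
                System.out.println(key + " changed from " + oldValue + " to " + newValue));
        prices.put("ORCL", 20.5);
        prices.put("ORCL", 21.0);
    }
}
```

The listener is registered once and then receives every change, which is the push-style alternative to polling the cache for updates.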
Most data grids manage data in one of two ways: federation or clustering. Federation is the more traditional approach that requires that the entire application be partitioned top to bottom (that is, a transaction can only work with data in a single partition). Because few applications can be fully partitioned, an explicit message (for example, via a JMS API) is usually required to stitch the partitions manually into a cohesive whole. This messaging is typically asynchronous, so transactions will appear on different partitions at different points in time depending on the degree of system load.
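The partitioning constraint in a federated design can be sketched in a few lines. The FederatedStore class below is a hypothetical in-memory model, assuming integer keys and a simple modulo rule for assigning keys to partitions; real systems choose their own partitioning schemes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a federated store: each key belongs to exactly one partition.
class FederatedStore {
    private final List<Map<Integer, String>> partitions = new ArrayList<>();

    FederatedStore(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    // Maps a key to the single partition that owns it (floorMod handles negative keys).
    int partitionFor(int key) {
        return Math.floorMod(key, partitions.size());
    }

    void put(int key, String value) {
        partitions.get(partitionFor(key)).put(key, value);
    }

    String get(int key) {
        return partitions.get(partitionFor(key)).get(key);
    }

    // True only if every key lives in the same partition, which is the
    // condition for a transaction to proceed without cross-partition messaging.
    boolean inSinglePartition(int... keys) {
        for (int key : keys) {
            if (partitionFor(key) != partitionFor(keys[0])) {
                return false;
            }
        }
        return true;
    }
}
```

With four partitions, keys 1, 5, and 9 all map to partition 1 and can be updated together, whereas keys 1 and 2 land in different partitions and would need the explicit messaging described above.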
Clustering uses a cluster management protocol to ensure that data integrity is maintained, so it can provide a consistent view of the data across multiple servers. This eliminates the need to stitch together multi-partition operations and provides a "flat" view of the data grid. Clustering also increases data resiliency because there's no dependency on client-side timeouts. This means that data can be failed over as soon as the underlying server fails, allowing new backups to be made immediately in case there's a subsequent failure.
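The failover behavior described here (promote the backup, then immediately create a new one) can be modeled with a small in-process sketch. The ClusterSketch class, its node names, and its placement rules are all assumptions made for illustration; it does not reproduce any product's cluster protocol, and a real cluster would detect the failure itself rather than being told via failNode.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical single-JVM model of primary/backup ownership in a cluster.
class ClusterSketch {
    private final List<String> liveNodes = new ArrayList<>();
    private final Map<String, Map<String, String>> storage = new HashMap<>(); // node -> copies it holds
    private final Map<String, String> primaryOf = new HashMap<>();            // key -> primary node
    private final Map<String, String> backupOf = new HashMap<>();             // key -> backup node

    ClusterSketch(String... nodes) {
        for (String node : nodes) {
            liveNodes.add(node);
            storage.put(node, new HashMap<>());
        }
    }

    // Writes the value to the key's primary and, if one exists, its backup.
    // Assumes at least one live node remains.
    void put(String key, String value) {
        String primary = primaryOf.computeIfAbsent(key,
                k -> liveNodes.get(Math.floorMod(k.hashCode(), liveNodes.size())));
        storage.get(primary).put(key, value);
        String backup = backupOf.get(key);
        if (backup == null) {
            backup = pickBackup(primary);
            if (backup != null) backupOf.put(key, backup);
        }
        if (backup != null) storage.get(backup).put(key, value);
    }

    String get(String key) {
        String primary = primaryOf.get(key);
        return primary == null ? null : storage.get(primary).get(key);
    }

    // Simulates a server failure: promote backups, then create replacement backups.
    void failNode(String failed) {
        liveNodes.remove(failed);
        storage.remove(failed);
        for (String key : new ArrayList<>(primaryOf.keySet())) {
            if (failed.equals(primaryOf.get(key))) {
                String promoted = backupOf.remove(key);
                if (promoted == null) {        // no surviving copy of this entry
                    primaryOf.remove(key);
                    continue;
                }
                primaryOf.put(key, promoted);  // backup becomes the new primary
            } else if (failed.equals(backupOf.get(key))) {
                backupOf.remove(key);          // the backup copy was lost
            }
            if (!backupOf.containsKey(key)) {  // restore redundancy right away
                String primary = primaryOf.get(key);
                String backup = pickBackup(primary);
                if (backup != null) {
                    storage.get(backup).put(key, storage.get(primary).get(key));
                    backupOf.put(key, backup);
                }
            }
        }
    }

    // Picks any live node other than the primary, or null if there is none.
    private String pickBackup(String primary) {
        for (String node : liveNodes) {
            if (!node.equals(primary)) return node;
        }
        return null;
    }

    String primaryNode(String key) { return primaryOf.get(key); }
    String backupNode(String key) { return backupOf.get(key); }
}
```

In this model an entry survives two successive failures in a three-node cluster, because the replacement backup is created as part of handling the first failure rather than waiting for a client to notice.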
Copyright © 2008 SYS-CON Media, Inc. — All Rights Reserved.
More Stories By Tim Middleton
Tim Middleton is a solution architect with Oracle in Perth, Western Australia. He has over 17 years of experience in the IT industry. During this time he has been involved in the design and implementation of many large and leading-edge technology projects within the government and private sectors. His focus is on providing middleware solutions around SOA, with an emphasis on architectures that are highly available, scalable and reliable. Tim also has extensive development experience with J2EE and application server-based solutions, as well as many years experience as a DBA.