Welcome!

Java Authors: Pete Pickerill, AppDynamics Blog, Elizabeth White, Sematext Blog , Charles Jolley

Related Topics: Java

Java: Article

Non-Stop EJB Services

Deploy New Releases At Your Leisure

Service-oriented architectures (SOA) provide numerous benefits: reuse of business logic by many clients, location transparency of business logic, simplified unit testing, better scalability through distributed and load-balanced processing, and the composition of new services from existing services. Enterprise JavaBeans are a favorite platform on which to base service-oriented architectures because of their enterprise-class features.

As many new SOA applications are now developed on the J2EE platform, a problem arises: how to maintain 100% availability while deploying maintenance fixes and new versions of the services. Most application server vendors do not recommend hot deployment of applications in production; problems may occur with unloading classes, class loaders, and resources being used by existing deployments. Instead, the vendors recommend restarting the server or cluster of servers after a redeployment; however, the total time to redeploy, test, and restart a cluster of servers can be substantial. This downtime is unacceptable for many production sites due to loss of revenue and customer goodwill, and the mission-critical nature of the services.

A solution to this problem is to provide a mechanism for dynamically switching clients from a cluster of application servers running the old version to another cluster of application servers running the new version. We refer to this as dynamic cluster switching. This can be accomplished by some enhancements to commonly used J2EE patterns in conjunction with JMS-based messaging. The result is that most deployments of new releases can be made without interruption of services to the client.

Why bother with non-stop EJB services? We have all experienced the issues associated with EJB application upgrades and deployments, such as unexpected outages due to limited testing, poor fall-back strategies, and planned downtime for maintenance in the wee hours. For businesses selling commodity goods and services on the Web, downtime directly translates to lost revenue when customers can easily surf to other sites to buy the same product. With non-stop EJB services, you can reduce if not eliminate downtime while seamlessly rolling out new versions of your services. Imagine redeploying and upgrading your EJBs without impacting your clients and their Web sites, Web services, consumers, and business partners. If there are issues with the new software, fallback is low-risk and easy to accomplish. All of this occurs during peak traffic periods when using non-stop EJB services. This article describes how this was accomplished on a large consumer Web site handling over 15,000 concurrent sessions during peak times.

Dynamic Cluster Switching
This solution uses JMS messaging to control a plug-in used by clients of the EJB services. When it's time to release a new version of software, an alternate cluster of servers is deployed with the new code on the same hardware platform as the existing servers. A console or command-line program publishes a "cluster switch" message to the client plug-ins that subscribe to a JMS admin topic. The client plug-ins then start to open connections to the new cluster and allow connections to the old cluster to "die off" as sessions or connections are released. In a short time, all the client plug-ins are seamlessly connected to the new cluster. While this approach sounds simple at a high level, the implementation needs the confluence of many design patterns to be successful in practice.

A basic assumption to this solution is that the EJB services are deployed as their own J2EE application, independent of any Web components or J2EE application clients. In environments requiring performance, flexibility, scalability, and reliability, this is likely to be the case anyway.

Implementation
The implementation of the solution uses several common design patterns and enhancements in combination with JMS messaging. The particular patterns used are Service Locator, Business Delegate, Publish/Subscribe Messaging, and Observer. Figure 1 provides a graphical depiction of how the various components and message flows work together to perform the cluster switch.

 

Business Delegates are the client's proxy to the services. They use a Service Locator to obtain an EJBHome object and subsequently create a remote reference to an EJB. To be able to create Business Delegates that point to a different cluster of servers, the Service Locator needs to change its provider URL where it looks up EJBHome objects. This can be accomplished by having the Service Locator receive an update configuration message on a JMS topic that contains the new provider URL.

Clients using existing Business Delegates are not affected and their existing remote references to EJBs continue to operate. As the sessions for these clients expire, the remote references are released and their Business Delegates are garbage collected. New Business Delegates that were created after the Service Locator received the update configuration message are in effect pointing to the cluster identified by the updated provider URL. This is because their EJB handle was created from EJBHome objects looked up at the updated URL.

A JMS subscriber receives update configuration messages and passes them on to a MultiCaster. The MultiCaster becomes the sole point in the client VM for receiving these messages and distributing them to interested components. When the client code first loads, the JMS subscriber is initialized and components, such as the Service Locator, register with the MultiCaster for the type of messages they wish to receive.

A simple command-line program can be used to generate the JMS message that initiates the cluster switch, or this functionality could be part of a more comprehensive management and monitoring console application. The publish-subscribe paradigm is important here because any number of clients can be dynamically reconfigured through their connection to a JMS topic. This approach supports the management of a dynamic and ever-changing set of clients connected to the EJB servers.

Figure 2 is a class diagram of implementations of the various components and patterns. The source code for this article can be downloaded from www.sys-con.com/java/sourcec.cfm. The code should be considered fragments, intended only to illustrate the points in this article since it's missing important features such as logging, exception handling, and configurability. The more important classes will now be discussed in detail.

 

Service Locator
The Service Locator pattern, as described in Core J2EE Patterns, abstracts all JNDI usage, hides the details of initial context creation as well as EJBHome lookup, and caches EJBHomes for performance reasons. The Service Locator is usually made a singleton so that all clients can access the same EJBHome cache.

For the Service Locator to receive update configuration messages, it must register with the MultiCaster when first loaded. When a message is received, the Service Locator replaces its local copy of the provider URL and the initial context factory class with those obtained from the message. Subsequently, it invalidates its current cache of EJBHome objects. Then, the next time a Business Delegate asks for the EJBHome, it won't be found in the cache and will be looked up at the new provider URL. Once looked up, the new EJBHome object will be placed in the cache.

The implementation of the Service Locator provided in the source code is named ClientServiceLocator. As the name indicates, there may be other Service Locators in an application for use in other layers of the architecture (e.g., Services, Foundation, etc.).

Business Delegate (BD)
The Business Delegate pattern hides the details of connecting to and using an EJB. Typically each business method in an EJB has a corresponding method in the Business Delegate that delegates client invocations to the EJB. The Business Delegate catches all the exceptions that can result from communicating with an EJB and turns them into application-specific exceptions. It allows clients to use the services as if they were local, and is thus a client-side proxy for a service. Business delegates can also be used to cache frequently requested data and provide other similar performance improvements to the services.

In addition to the normal responsibilities ascribed to the Business Delegate, the following additional responsibilities are required to support continuous availability of services:
1.  The BD must automatically perform a client/server version compatibility check. The first time a remote reference is retrieved by a business delegate, the client version must be compared to the server version to ensure compatibility. If incompatible, the business delegate must return a specific exception on compatibility mismatch that can be caught by a client. The exception should be logged by the client in the form of an informative error message. This provides a quick indication to support personnel that the client view JAR file is out of date. Without this check, a serialization error will result if the client and server classes are incompatible, and the source of the error will not be obvious to support personnel.
2.  The BD provides a create() and release() method for use by the client. Typically the Business Delegate Factory invokes the create method so the client doesn't need to. The client should always call the release method, however, when finished with a Business Delegate. For Web component clients (servlets and JSP pages), assuming the BD has been placed in the session, this can be accomplished by catching HTTP session timeouts with the HTTPSessionBindingListener interface. The release method not only invokes remove() on the Business Delegate's EJB remote reference, but a BusinessDelegateReleasedMsg is sent to the MultiCaster. The MultiCaster in turn notifies objects that have registered to receive this event, notably the Business Delegate Factories. The use of this event by the Business Delegate Factory is described in the next section.

The above responsibilities are implemented in the BusinessDelegate base class and should be extended by each Business Delegate in an application. All the business methods of each Business Delegate subclass typically invoke the inherited getService method to obtain the remote reference. Rather than store a remote reference to an EJB, which is not guaranteed to be serializable by the EJB specification, BusinessDelegate stores the EJB Handle. getService() reconstitutes the remote reference from the EJB Handle on each invocation in case the Business Delegate has been serialized to another server in the cluster between invocations.

Business Delegate Factory
A Business Delegate Factory is used primarily because it provides the flexibility to hand out other implementations of the Business Delegates depending on the type of client. It also enables a total count to be kept of the number of Business Delegates of each type that have been handed out, as well as a running count of the current number of outstanding Business Delegates.

A subclass of BusinessDelegateFactory should be created for each Business Delegate in an application and a singleton should be created for it. The singleton should register with the MultiCaster to receive Business Delegate release messages for the corresponding Business Delegate type. The management of the counters and the reporting of the counts is all inherited from the BusinessDelegateFactory base class. The specific mechanism for reporting the counts is outside the scope of this article but could be reported by a JMX agent or published to a JMS topic.

MultiCaster
The MultiCaster is the central player in the implementation of the Observer pattern. Observers register with the MultiCaster, providing a filter implementation. When the MultiCaster is notified of an event, it applies all filters to it and notifies observers (subscribers) who have matching filters for the event.

The role of the MultiCaster is to deliver Business Delegate-released notifications to each subclass of BusinessDelegateFactory, as well as deliver update configuration messages to the Service Locator that was received on a JMS topic.

To receive notifications that a Business Delegate has been released, each subclass of BusinessDelegateFactory adds itself as an observer to the MultiCaster with a filter type of BusinessDelegateReleasedFilter. This filter type checks to see that the published object is of type BusinessDelegateReleasedMsg, and that the BD name in the message is the same as that with which the filter was constructed. This causes each BusinessDelegateFactory to receive release notifications only for the type of Business Delegates it creates.

To receive update configuration messages, the Service Locator adds itself as an observer to the MultiCaster with a filter type of UpdateServiceLocatorFilter. This filter type checks to see that the published object is of type ConfigureServiceLocatorMsg.

Two Levels of Client Redirection
The solution presented in this article redirects new clients of the services to the new version of the services. Existing clients using the old version are left to slowly bleed off as their sessions expire. A modification to the solution could be made to immediately switch all existing clients of the services to the new version as well. This would mean that every Business Delegate registering with the MultiCaster would receive Service Locator reconfigured messages, which the Service Locator would have to publish after reconfiguration was complete. This enhancement would also involve the additional complication of managing access to BD instances by multiple threads since the client thread using the BD would be distinct from the thread used by the MultiCaster to deliver event notifications to the BD.

Procedure for Cluster Switch
Now that the architecture of the solution that enables an application for dynamic cluster switching has been presented, we'll discuss the procedure for actually performing a switch. While the procedure might seem obvious, experience has shown the obvious approach is not necessarily the best.

Recall that one of the assumptions stated at the beginning of this article is that clients of the services are running in separate containers from the services. This means that those clients will be using a client view JAR file that has all the classes necessary to be a client of the services. Included in that client view JAR file are configuration resources that point the Business Delegates to a specific application server cluster (subsequently called the "primary" cluster). Assume the new version of the services is deployed to the "alternate" cluster and clients are switched there. It's not unreasonable to assume that at some point, days or weeks later, the client environment (such as a Web container) may need to be restarted. In that case, the clients will get their configuration from their existing client view JAR file, which is pointing to the primary cluster. But the latest services are running on the alternate cluster.

The procedure we've been using in production to solve this problem is as follows:

  1. Boot the alternate cluster.
  2. Deploy the old services to the alternate cluster.
  3. Run regression tests to verify the services are functioning as expected on the alternate cluster.
  4. . Issue a cluster switch to clients to point them to the alternate cluster.
  5. Enable trace-level logging in the old services in the primary cluster to ascertain when existing sessions have bled off the primary cluster. An admin console that is able to query and display the outstanding BD counts from the Business Delegate Factories can also be used as a cross check.
  6. Remove the old services from the primary cluster and deploy the new ones to it.
  7. Run regression tests against the new services on the primarycluster.
  8. Issue a cluster switch to clients to point them at the primary cluster.
  9. Monitor old services on the alternate cluster to determine when incoming traffic has stopped.
  10. Shut down the alternate cluster.
In summary, two switches are performed. New clients are first switched to the old code on the alternate cluster, and then subsequently new clients are switched to the new code on the primary cluster. With an HTTP session timeout of 15 minutes on an e-commerce-related site, the authors have found that letting the traffic bleed off after both cluster switches generally takes a total of three hours. Obviously this number may vary greatly depending on the nature of the services. Three hours is thus the total time that both application server clusters must be active, potentially straining resources such as memory, CPU, and connection pools if both clusters are run in a single hardware environment.

Service Compatibility
A caveat to dynamic cluster switching is that if a change in the public API of the services would cause a serialization or marshaling error between clients using old classes and the new services, the switch cannot be performed. Clients will have to shut down to upgrade their client view JAR files to the new version.

Minimizing the frequency of incompatible builds requires careful attention to application and object versioning. The Java Object Serialization Specification describes exactly what changes to a class make it incompatible with previous versions with regards to serialization. A technique that maximizes long term compatibility of class versions is to manually control their Stream Unique Identifier (SUID).

It's also recommended that a compatibility version number be added to the overall version number for the application. The version number must be made available to clients through the service API so that the BusinessDelegate base class can automatically retrieve it the first time a Business Delegate of each type is used. At that point, the version number in the client view JAR file is compared with the value returned from the service, and a difference in the compatibility number causes an exception to be thrown to the client. This mechanism can be seen in the BusinessDelegate code fragment in the source code.

Conclusion
This solution enables you to deploy new releases into production at leisure. A full regression test can be run on the newly deployed services before putting them into production. Care can be taken to assure that the deployment is perfect since there is no time pressure due to a production outage.

We have used the solution presented here to push a half-dozen new releases into production over the past six months at one of the top revenue-generating Web sites. At this particular site, 75% of the new releases of the services have been compatible builds for which this technique was successfully applied.

References

  • Alur, D., Crupi, J., and Malks, D. (2001). Core J2EE Patterns: Best Practices and Design Strategies. Prentice Hall PTR.

  • More Stories By Joe Bradley

    Joe Bradley has worked as a Senior Java Architect with Sun Software Services for the past 6 years. During his 18 year career he has focused primarily on architecture and development of distributed enterprise applications as well as scientific modeling and simulation applications.

    More Stories By David Raal

    David Raal is a software architect with experience in designing and building complex multitier distributed systems using Java, J2EE, CORBA, and C++. Recently, David has focused on creating e-commerce systems in the manufacturing, telecommunications, hospitality, and retail industries on the J2EE platform.

    Comments (1) View Comments

    Share your thoughts on this story.

    Add your comment
    You must be signed in to add a comment. Sign-in | Register

    In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


    Most Recent Comments
    John Jaster 12/26/03 05:34:50 PM EST

    with including sample code this article is pretty useless.

    @ThingsExpo Stories
    SYS-CON Events announced today that Windstream, a leading provider of advanced network and cloud communications, has been named “Silver Sponsor” of SYS-CON's 16th International Cloud Expo®, which will take place on June 9–11, 2015, at the Javits Center in New York, NY. Windstream (Nasdaq: WIN), a FORTUNE 500 and S&P 500 company, is a leading provider of advanced network communications, including cloud computing and managed services, to businesses nationwide. The company also offers broadband, phone and digital TV services to consumers primarily in rural areas.
    The BPM world is going through some evolution or changes where traditional business process management solutions really have nowhere to go in terms of development of the road map. In this demo at 15th Cloud Expo, Kyle Hansen, Director of Professional Services at AgilePoint, shows AgilePoint’s unique approach to dealing with this market circumstance by developing a rapid application composition or development framework.
    The Internet of Things is not new. Historically, smart businesses have used its basic concept of leveraging data to drive better decision making and have capitalized on those insights to realize additional revenue opportunities. So, what has changed to make the Internet of Things one of the hottest topics in tech? In his session at @ThingsExpo, Chris Gray, Director, Embedded and Internet of Things, discussed the underlying factors that are driving the economics of intelligent systems. Discover how hardware commoditization, the ubiquitous nature of connectivity, and the emergence of Big Data a...
    "BSQUARE is in the business of selling software solutions for smart connected devices. It's obvious that IoT has moved from being a technology to being a fundamental part of business, and in the last 18 months people have said let's figure out how to do it and let's put some focus on it, " explained Dave Wagstaff, VP & Chief Architect, at BSQUARE Corporation, in this SYS-CON.tv interview at @ThingsExpo, held Nov 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
    The major cloud platforms defy a simple, side-by-side analysis. Each of the major IaaS public-cloud platforms offers their own unique strengths and functionality. Options for on-site private cloud are diverse as well, and must be designed and deployed while taking existing legacy architecture and infrastructure into account. Then the reality is that most enterprises are embarking on a hybrid cloud strategy and programs. In this Power Panel at 15th Cloud Expo (http://www.CloudComputingExpo.com), moderated by Ashar Baig, Research Director, Cloud, at Gigaom Research, Nate Gordon, Director of T...
    SYS-CON Events announced today that IDenticard will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. IDenticard™ is the security division of Brady Corp (NYSE: BRC), a $1.5 billion manufacturer of identification products. We have small-company values with the strength and stability of a major corporation. IDenticard offers local sales, support and service to our customers across the United States and Canada. Our partner network encompasses some 300 of the world's leading systems integrators and security s...

    ARMONK, N.Y., Nov. 20, 2014 /PRNewswire/ --  IBM (NYSE: IBM) today announced that it is bringing a greater level of control, security and flexibility to cloud-based application development and delivery with a single-tenant version of Bluemix, IBM's platform-as-a-service. The new platform enables developers to build ap...

    “In the past year we've seen a lot of stabilization of WebRTC. You can now use it in production with a far greater degree of certainty. A lot of the real developments in the past year have been in things like the data channel, which will enable a whole new type of application," explained Peter Dunkley, Technical Director at Acision, in this SYS-CON.tv interview at @ThingsExpo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
    DevOps Summit 2015 New York, co-located with the 16th International Cloud Expo - to be held June 9-11, 2015, at the Javits Center in New York City, NY - announces that it is now accepting Keynote Proposals. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long development cycles that produce software that is obsolete at launch. DevOps may be disruptive, but it is essential.
    "People are a lot more knowledgeable about APIs now. There are two types of people who work with APIs - IT people who want to use APIs for something internal and the product managers who want to do something outside APIs for people to connect to them," explained Roberto Medrano, Executive Vice President at SOA Software, in this SYS-CON.tv interview at Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
    Nigeria has the largest economy in Africa, at more than US$500 billion, and ranks 23rd in the world. A recent re-evaluation of Nigeria's true economic size doubled the previous estimate, and brought it well ahead of South Africa, which is a member (unlike Nigeria) of the G20 club for political as well as economic reasons. Nigeria's economy can be said to be quite diverse from one point of view, but heavily dependent on oil and gas at the same time. Oil and natural gas account for about 15% of Nigera's overall economy, but traditionally represent more than 90% of the country's exports and as...
    The Internet of Things is a misnomer. That implies that everything is on the Internet, and that simply should not be - especially for things that are blurring the line between medical devices that stimulate like a pacemaker and quantified self-sensors like a pedometer or pulse tracker. The mesh of things that we manage must be segmented into zones of trust for sensing data, transmitting data, receiving command and control administrative changes, and peer-to-peer mesh messaging. In his session at @ThingsExpo, Ryan Bagnulo, Solution Architect / Software Engineer at SOA Software, focused on desi...
    "At our booth we are showing how to provide trust in the Internet of Things. Trust is where everything starts to become secure and trustworthy. Now with the scaling of the Internet of Things it becomes an interesting question – I've heard numbers from 200 billion devices next year up to a trillion in the next 10 to 15 years," explained Johannes Lintzen, Vice President of Sales at Utimaco, in this SYS-CON.tv interview at @ThingsExpo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
    "For over 25 years we have been working with a lot of enterprise customers and we have seen how companies create applications. And now that we have moved to cloud computing, mobile, social and the Internet of Things, we see that the market needs a new way of creating applications," stated Jesse Shiah, CEO, President and Co-Founder of AgilePoint Inc., in this SYS-CON.tv interview at 15th Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
    SYS-CON Events announced today that Gridstore™, the leader in hyper-converged infrastructure purpose-built to optimize Microsoft workloads, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. Gridstore™ is the leader in hyper-converged infrastructure purpose-built for Microsoft workloads and designed to accelerate applications in virtualized environments. Gridstore’s hyper-converged infrastructure is the industry’s first all flash version of HyperConverged Appliances that include both compute and storag...
    Today’s enterprise is being driven by disruptive competitive and human capital requirements to provide enterprise application access through not only desktops, but also mobile devices. To retrofit existing programs across all these devices using traditional programming methods is very costly and time consuming – often prohibitively so. In his session at @ThingsExpo, Jesse Shiah, CEO, President, and Co-Founder of AgilePoint Inc., discussed how you can create applications that run on all mobile devices as well as laptops and desktops using a visual drag-and-drop application – and eForms-buildi...
    We certainly live in interesting technological times. And no more interesting than the current competing IoT standards for connectivity. Various standards bodies, approaches, and ecosystems are vying for mindshare and positioning for a competitive edge. It is clear that when the dust settles, we will have new protocols, evolved protocols, that will change the way we interact with devices and infrastructure. We will also have evolved web protocols, like HTTP/2, that will be changing the very core of our infrastructures. At the same time, we have old approaches made new again like micro-services...
    Code Halos - aka "digital fingerprints" - are the key organizing principle to understand a) how dumb things become smart and b) how to monetize this dynamic. In his session at @ThingsExpo, Robert Brown, AVP, Center for the Future of Work at Cognizant Technology Solutions, outlined research, analysis and recommendations from his recently published book on this phenomena on the way leading edge organizations like GE and Disney are unlocking the Internet of Things opportunity and what steps your organization should be taking to position itself for the next platform of digital competition.
    The 3rd International Internet of @ThingsExpo, co-located with the 16th International Cloud Expo - to be held June 9-11, 2015, at the Javits Center in New York City, NY - announces that its Call for Papers is now open. The Internet of Things (IoT) is the biggest idea since the creation of the Worldwide Web more than 20 years ago.
    As the Internet of Things unfolds, mobile and wearable devices are blurring the line between physical and digital, integrating ever more closely with our interests, our routines, our daily lives. Contextual computing and smart, sensor-equipped spaces bring the potential to walk through a world that recognizes us and responds accordingly. We become continuous transmitters and receivers of data. In his session at @ThingsExpo, Andrew Bolwell, Director of Innovation for HP's Printing and Personal Systems Group, discussed how key attributes of mobile technology – touch input, sensors, social, and ...