
Monitoring and Diagnostics of CORBA Systems

Developing distributed applications, in contrast to developing traditional single-process applications, requires a completely different level of monitoring and diagnostic support. In this article I'll discuss how to monitor and diagnose distributed applications based on the CORBA standard.

Perils of Distributed Applications
As organizations enable key aspects of their businesses for e-business, multitiered distributed applications are becoming increasingly ubiquitous. Diverse by nature, e-business systems require middleware to integrate middle-tier components into a cohesive computing environment. And as object-oriented programming has entered mainstream application development, CORBA has emerged as the standard middleware solution for integrating distributed e-business components that are implemented in disparate languages: Java, C++ and others.

Distributed applications require a completely different level of monitoring and diagnostic support than traditional single-process applications. While the factors causing unexpected behavior and failures in a single process might be simple and easy to anticipate, a distributed system can suffer from any one or more of a whole range of bugs. Let's briefly review seven of the most common problems:

  • Performance bottlenecks can appear in a distributed application when a complex operation is performed at a time-critical point, and can substantially slow down your application's overall performance.
  • Network resource limitations can cause a distributed system to fail when the size of the system is ramped up. Scalability problems may not occur within your test configurations, but can appear later during deployment in the form of limited connections or insufficient bandwidth.

  • Network failures can often partially afflict a complex network. As the application developer, it behooves you to detect and circumvent each point of failure.
  • Race conditions can occur if parallel working modules of a distributed application aren't properly synchronized to prevent different modules from producing contradictory results. Synchronization errors are difficult to detect because they tend to be sporadic and aren't easily reproduced.
  • Deadlocks can appear when the synchronization protocol between modules prevents each from completing its task. Like a race condition, a deadlock often appears only in a special situation and can be difficult to locate.
  • Design errors in control flow can occur very easily. The control flow in a distributed application is usually much more complex than in a single-process application, leading to a wide variety of design errors. Unlikely events such as exceptions and failures within multiple modules can be especially difficult to handle.
  • Timeout failures can occur owing to delays and bottlenecks in the network that cause distributed parts of an application to time out and produce a failure. Such a failure may propagate through the rest of the application if you don't handle it properly.
Over and above these various problems, in a distributed system diagnosis is more complicated than debugging a conventional single-process application. To set up and step through a test case can be very time-consuming when using code debuggers for the distributed modules of a system. And correlating message entry and exit points among numerous processes, each with its own code debugger, can quickly become impractical. Since you're not testing the application in real execution time, time-critical failures such as bottlenecks, race conditions, deadlocks and timeouts can't be detected. Conventional debugging rarely detects scalability problems either.

Monitoring Messages Between CORBA Objects
One good and effective way of diagnosing distributed systems is through monitoring communication between the various distributed components. The objective of this article is to demystify the CORBA communication bus by showing you how to capture the details of messages passed between CORBA objects. Such monitoring lets you observe and record method invocations and exceptions selectively, helping you avoid or eliminate bottlenecks, race conditions and other potential failures that might otherwise impede the performance of your application.

Let's look at what goals you should bear in mind as you monitor these messages.

  • Distributed debugging: You need to monitor messages during the development and test phases of a project. That way, you'll uncover problems before an application is deployed. When the application goes live, communication details should be logged to enable performance analysis and to make it possible to troubleshoot unexpected failures quickly. You should be able to activate monitoring on the fly without having to stop and restart the application.
  • Application-level communication details: You should observe request-reply details as they occur at the application level. For example, "The buy method of the stock_exchange object was called using a stock symbol of SEGU and a share amount of 1000." Monitoring at the application level requires an understanding of all CORBA data types and of complex user-defined types. Details captured about each message should include request ID, interface name of the target object, method being invoked, parameter values, timing data, process IDs, host IDs and any thrown exceptions.
  • Dynamic activation: Make sure that application objects are completely unaware of any active monitoring. You should be able to dynamically turn monitoring on or off and specify which communication details to observe while your application is running.

  • Filter criteria: Ensure that it's easy to filter traffic and thus monitor only those interfaces, methods and parameters that you're interested in. Make sure too that it's possible to stipulate how many times a particular method will be observed and at which communication entry and exit points.
  • Timing analysis: Use message timestamps and timing data to help identify server latency, message travel time and client wait time: information that's extremely helpful for diagnosing and resolving timing-related problems.
  • Data recording: Record monitored communication to enable logging message activity or analyzing results. It should be possible to parse and sort the recorded data.
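To make the goals above concrete, the record captured for each monitored exchange might look like the following plain-Java sketch. The class and field names here are hypothetical, chosen for illustration; no standard schema for such records is implied.

```java
import java.util.List;

// Hypothetical record of one monitored CORBA request/reply exchange.
// Field names are illustrative, not taken from any CORBA specification.
class CapturedMessage {
    long requestId;            // sequence number of the request
    String interfaceName;      // e.g. "stock_exchange"
    String operation;          // e.g. "buy"
    List<String> parameters;   // stringified parameter values, e.g. "SEGU", "1000"
    String clientHost, serverHost;
    long clientPid, serverPid; // system process IDs
    long sendRequest, receiveReply; // client-side timestamps in nanoseconds
    String raisedException;    // null if the call succeeded

    // Total client-observed time: server time plus travel time.
    long roundTripNanos() { return receiveReply - sendRequest; }
}
```

A record like this captures everything needed for both distributed debugging and later timing analysis of the logged data.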

How CORBA Communication Works
CORBA can be conceptualized as a communication bus for distributed objects. In a CORBA system the "client/server" terminology applies within the context of a specific request. In other words, if object A invokes a method on object B, A is the client and B is the server; if B then calls A, the roles are reversed.

The Object Request Broker (ORB) is the mediator, responsible for brokering interactions between objects. Its job is to provide object location and access transparency by enabling client invocations of methods on server objects (see Figure 1). If the server interface is known at build time, a client can connect (or bind) to a server object statically. If the interface is unknown, the client can use dynamic binding to ascertain a server's interface and construct a call to the corresponding object.

Exported server interfaces are specified in the CORBA standard interface definition language (IDL). You don't write server implementations in IDL: an interface description is mapped instead, using an IDL compiler, to native bindings in languages such as Java or C++. This allows each programmer to write source code independently in whichever language may be the most appropriate. A Java program, for example, can access a server object implemented in C++: the Java programmer merely invokes methods on the server as though they're local Java method calls. Figures 2 and 3 illustrate, respectively, an IDL description for a CORBA server and a Java client that calls a corresponding object implementation.

In Figure 2 Account is an interface that corresponds to a class implemented in a server. IDL attributes define the properties of a class (e.g., balance). The IDL compiler maps attributes to "get" and possibly "set" methods. Operations define the methods to be implemented by the server (e.g., make_deposit and make_withdrawal). Their parameters must be explicitly identified in the interface description as in, out or inout. Many other features are supported by IDL, such as inheritance for specifying derived interfaces, modules for establishing naming scopes and exceptions that are supported by an interface or raised by operations.
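As a rough illustration of the mapping just described, the Java binding an IDL compiler generates for the Account interface might have the following shape. This is a simplified plain-Java sketch: real generated bindings also extend org.omg.CORBA.Object and come with stub and helper classes, and the trivial local implementation below merely stands in for a remote servant.

```java
// Approximate shape of the Java binding for the Account IDL interface.
// Real IDL-generated code extends org.omg.CORBA.Object; omitted here.
interface Account {
    float balance();                 // read-only IDL attribute maps to a "get" method
    void make_deposit(float amount); // IDL operation with an "in" parameter
    void make_withdrawal(float amount);
}

// Trivial local implementation, standing in for the remote object implementation.
class AccountImpl implements Account {
    private float balance;
    public float balance() { return balance; }
    public void make_deposit(float amount) { balance += amount; }
    public void make_withdrawal(float amount) { balance -= amount; }
}
```

From the client's point of view, calls on the generated binding look exactly like local Java method calls, which is what makes the location transparency described above possible.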

The IDL compiler generates a skeleton that's linked to the server program and provides static interfaces to call methods of an object implementation. The skeleton unmarshals method invocations and parameters that arrive from a client via the ORB. The IDL compiler also generates a client stub that's linked to programs that will statically invoke server methods through the associated interface. The client stub maps a CORBA server object to a native object in the client's language (see Figure 3). The stub acts as a proxy for remote server objects by marshaling method invocations and parameters to be transmitted via the ORB. CORBA also supplies the dynamic invocation interface (DII) for client programs to discover server interfaces and construct method calls at runtime. The DII requires the use of the CORBA interface repository, which contains compiled IDL descriptions that can be interrogated programmatically (see Figure 4).

The CORBA standard guarantees interoperability between applications built using different vendors' ORBs. The Internet InterORB Protocol (IIOP) defines standard message formats, a common data representation for mapping IDL data types to flat messages and a format for an interoperable object reference (IOR) over TCP/IP networks. In other words, IIOP is the CORBA wire-level protocol.

CORBA communication typically consists of a request message and a reply message. Most ORBs implement interceptors that permit these IIOP messages to be traced at the four points shown in Figure 5: SendRequest, ReceiveRequest, SendReply and ReceiveReply.

An Architecture for Monitoring Communication
Now that we've discussed the goals of monitoring messages between distributed components and reviewed how CORBA communication works, let's look at an architecture for monitoring in a CORBA environment.

Intercepting and interpreting IIOP messages can be achieved using four types of architectural components: Probe, Profile, Collector and Observer. Each monitored CORBA process, whether acting as a client, server or both, contains a Probe object that captures messages based on the filter criteria as specified by an active Profile. A Profile can be created, updated or uploaded to a Probe at any time. The intercepted data is recorded by the Probe, read by a Collector and transmitted to an Observer. The Observer is the primary collection point for aggregating data from multiple Collectors, and the data it records can then be viewed and analyzed (see Figure 6).
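The flow through the four components can be sketched in a few lines of plain Java. The class and method names below are hypothetical, not taken from any product or standard, and messages are reduced to strings for brevity; the essential point is that the Probe only buffers, while the Collector moves data onward to the Observer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch of the Probe -> Collector -> Observer pipeline.
class Probe {
    private final Queue<String> buffer = new ArrayDeque<>();
    // Called from an interceptor; must be cheap so the Probe
    // never becomes a bottleneck that blocks message traffic.
    void record(String message) { buffer.add(message); }
    String poll() { return buffer.poll(); } // null when empty
}

class Observer {
    final List<String> log = new ArrayList<>(); // the global record
    void receive(String message) { log.add(message); }
}

class Collector {
    // Drains the local Probe's buffer and forwards everything
    // to the primary collection point.
    void drain(Probe probe, Observer observer) {
        for (String m; (m = probe.poll()) != null; ) observer.receive(m);
    }
}
```

In a real deployment the Collector would transmit over the network to a remote Observer; the in-process hand-off here simply makes the division of labor visible.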

Given the absence of standardization in the area of CORBA monitoring and diagnostics, the design and implementation of a monitoring architecture will vary depending upon who's creating it: the application developer, the ORB vendor or (preferably) an independent tool vendor. The issue of standardization will be discussed later. First, let's explore each of our architectural components in detail.

Most ORBs provide interceptors that allow the creation of a Probe object, which captures and records IIOP messages, within each monitored CORBA process. Only one Probe object is necessary per process, regardless of the number of business objects created. The business objects are completely unaware of the Probe, which means the application's business logic doesn't take the Probe into account at all, apart from the code that creates an instance of the Probe object. (Such instrumentation code would typically be placed in the main routine after initializing the ORB, outside the actual business objects.)

A Profile specifies filter criteria used by a Probe while collecting messages. It's a dynamically configurable filter that can be uploaded to a Probe residing within a running program (see Figure 7) and it serves three purposes:

  1. It scopes the traffic being observed to include only the interfaces, methods and parameters of interest.
  2. It indicates how many times a particular method should be observed.
  3. It specifies any or all of the four possible communication points to capture data, i.e., SendRequest, ReceiveRequest, SendReply and ReceiveReply.

A Profile might specify, for example, "Monitor up to 20 invocations of the method make_deposit at the ReceiveRequest and SendReply points." The CORBA interface repository is useful for a Profile editor or similar tool to determine details about available object interfaces. This allows you to create or modify a Profile, which you can then upload to a Probe within a running process.
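A Profile of the kind just described can be sketched as a small filter class. The names below are hypothetical, and for simplicity this sketch caps the number of captured events rather than whole invocations; a real Profile would correlate the communication points belonging to one request before counting it.

```java
import java.util.EnumSet;
import java.util.Set;

// The four possible communication points a Profile can select.
enum Point { SEND_REQUEST, RECEIVE_REQUEST, SEND_REPLY, RECEIVE_REPLY }

// Illustrative Profile: filters by interface, method and communication
// point, and caps how many events are captured. Names are hypothetical.
class Profile {
    private final String interfaceName, method;
    private final Set<Point> points;
    private int remaining; // how many more events to capture

    Profile(String interfaceName, String method, Set<Point> points, int maxCount) {
        this.interfaceName = interfaceName;
        this.method = method;
        this.points = points;
        this.remaining = maxCount;
    }

    // Decide whether the Probe should capture this event.
    boolean matches(String iface, String op, Point point) {
        if (remaining <= 0) return false;
        if (!interfaceName.equals(iface) || !method.equals(op)
                || !points.contains(point)) return false;
        remaining--;
        return true;
    }
}
```

The example Profile from the text ("up to 20 invocations of make_deposit at ReceiveRequest and SendReply") would then be constructed with those two points and a count of 20, and uploaded to the Probe of the running process.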

A Collector serves as the registration point in the monitoring architecture for the local application programs. A Probe writes intercepted IIOP messages, using the fastest possible mechanism and format, for subsequent retrieval by the Collector, so that the Probe doesn't become a bottleneck that blocks message traffic. As the data is written, the Collector reads the messages and transmits them to a primary collection point (see Figure 8). A Collector may also transmit additional relevant data about monitored processes to the primary collection point.

The Observer is the primary registration and collection point for all Collectors across the distributed environment. As data is transmitted to the Observer, it's written to a global database viewable in real time to permit the analysis of system performance (see Figure 9).

The information about each message captured includes:

  • Sequence number of the request
  • Name of the interface containing the operation or attribute
  • Name of the operation or attribute in the request
  • Amount of time the server took to process the request
  • Amount of time spent by the request on the wire
  • Total time between the request being sent and reply being received (server time + travel time)
  • Parameters in the request at each of the communication points
  • System process IDs and names of the hosts for the server and the caller
  • Timestamps at each of the communication points
  • Details about any thrown exception including the communication point where the exception was raised
Figure 10 illustrates a simple captured message.
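The timing figures in the list above can be derived from the four timestamps. Assuming t1 through t4 are taken at the SendRequest, ReceiveRequest, SendReply and ReceiveReply points respectively, the arithmetic is as in the sketch below. Note that this sketch deliberately ignores clock skew between client and server hosts, which a real monitoring tool would have to compensate for.

```java
// Derive server time, travel time and total time from the timestamps
// taken at the four interception points (t1=SendRequest, t2=ReceiveRequest,
// t3=SendReply, t4=ReceiveReply). Clock skew between hosts is ignored here.
class Timing {
    final long t1, t2, t3, t4; // timestamps, e.g. in milliseconds

    Timing(long t1, long t2, long t3, long t4) {
        this.t1 = t1; this.t2 = t2; this.t3 = t3; this.t4 = t4;
    }

    long serverTime() { return t3 - t2; }               // processing the request
    long travelTime() { return (t4 - t1) - (t3 - t2); } // on the wire, both ways
    long totalTime()  { return t4 - t1; }               // server time + travel time
}
```

For example, timestamps of 0, 40, 140 and 200 ms break a 200 ms round trip into 100 ms of server time and 100 ms of travel time, which is exactly the decomposition the captured message details make possible.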

Summary: What You Can Do
Gaining insight into distributed system behavior can be complicated, but by performing application-level monitoring during a project's development and test phases you can uncover problems prior to deployment and thereby improve your application's reliability. Then, when it subsequently goes live, you can capture communication details to allow performance analysis and quick troubleshooting of unexpected failures. Monitoring and diagnostics in your CORBA system can be achieved using a commercially available tool such as Segue Software's SilkObserver or by building custom instrumentation into your application. Either way, differing techniques may be applied depending on which ORBs you're using and what the specific monitoring objectives are for your system.

The OMG's Test Special Interest Group is in the process of standardizing distributed instrumentation and control for CORBA systems. (A Request for Proposals is being drafted as of this writing and will be issued this summer.) The group is defining a common set of instrumentation capabilities for use in the management, debugging and profiling of multivendor CORBA-based systems. These capabilities will likely rely on the OMG's anticipated standard for portable interceptors. The Test SIG's efforts should yield an interface specification for controlling object execution, returning state information and providing other useful functions that today are performed inconsistently, if at all, across CORBA implementations. When complete, the OMG's efforts in this area should result in consistent mechanisms for distributed monitoring and diagnostics while using the ORB of any vendor. You can check out the progress of the Test SIG on the OMG Web site (www.omg.org) or by sending an e-mail to [email protected].

More Stories By Todd Scallan

Todd Scallan is the vice president of product and engineering at Axcient, where he is responsible for leading the development team and driving product for the Axcient platform. He has over 25 years of experience in a variety of senior-level product management, engineering and business development roles at companies including Interwoven, Segue Software and Black & White Software (acquired by Segue Software). Todd holds an MS in computing engineering, a BS in electrical engineering and has published numerous articles and papers on a range of computing topics.

