Using Space-Based Programming for Loosely Coupled Distributed Systems

One of the problems of highly distributed systems is figuring out how systems discover each other. After all, the whole point of having systems distributed is to allow flexible and perhaps even dynamic configurations to maximize system performance and availability. How do these distributed components of one system or multiple systems discover each other? And once they're discovered, how do we retain enough flexibility, such as rediscovery, to keep them operating fail-safe?

Space-based programming may provide us with a good answer to these questions and more. In this article I'll describe what a space is and how it can be used to mitigate some of the issues mentioned above. I'll also describe a technique for turning an ordinary message queue into a space.

What Is a Space?
Conventional distributed tools rely on passing messages between processes (asynchronous communication) or invoking methods on remote objects (synchronous communication). A space is an extension of the asynchronous communication model in which processes don't pass messages directly to one another. In fact, the processes are totally unaware of each other.

In Figure 1, Process 1 places a message into the space. Process 2, which has been waiting for this type of message, takes the message out of the space and processes it. Based on the results, it places another message into the space. Process 3, which has been waiting for this type of message, takes the message out of the space.

Following are highlights of the preceding discussion:

  1. The space may contain different types of messages. In fact, I used the term message for clarity. These messages are actually just "things" (the message may be an object, an XML document or anything else that the space allows to be put in it). In Figure 1 the different shapes in the space illustrate the different types of messages.
  2. The three processes involved have no knowledge of one another. All they know is that they put a message in a space and get a message out of the space.
  3. As in the message-passing scenario, we aren't limited to two processes communicating asynchronously, but rather any number of processes communicating via a common space. This allows the creation of loosely coupled systems that can be highly distributed and extremely flexible, and can provide high availability and dynamic load balancing.

Let's look at a more specific example this time. A common encryption method is the use of "one-way" functions, which take an input and, like any other function, generate an output. The distinguishing feature of such functions is that it's extremely difficult to compute the input that was given to the function to get the output (i.e., to compute the inverse of the function); hence, the term one-way function. Instead of trying to figure out the inverse of the function to get the input required for the given output, an easier way may be to take all possible inputs and compute the output for each one. When we get an output that matches the one we have, we've found the right "input." But this can be extremely time consuming given the vast number of possible inputs.

Assume that passwords can't be more than four characters in length and only alphanumeric ASCII characters are used. This gives us 14,776,336 possible four-character passwords (62^4), plus a comparatively small number of shorter ones. Using the brute force technique to break the password, assume that the main program breaks the input set into 16 pieces and puts each piece, along with the encrypted password, in the space. The password-breaking programs watch the space for such pieces and each available program immediately grabs a piece and starts working. The programs continue until no more such pieces are available or until the password has been broken. If the password is broken, the breaking program puts the solution in the space, which is picked up by the main program.
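The arithmetic behind this split can be checked with a short sketch. It assumes exactly four-character passwords, a particular ordering of the 62 alphanumeric characters, and contiguous index ranges as the "pieces"; the class and method names are illustrative and don't come from any real implementation.

```java
import java.util.ArrayList;
import java.util.List;

class KeySpace {
    // Assumed ordering of the 62 alphanumeric ASCII characters.
    static final String ALPHABET =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    static final int LENGTH = 4;

    /** Total number of four-character candidates: 62^4. */
    static long size() {
        long n = 1;
        for (int i = 0; i < LENGTH; i++) n *= ALPHABET.length();
        return n;
    }

    /** Maps an index in [0, size()) to its four-character candidate. */
    static String candidate(long index) {
        char[] out = new char[LENGTH];
        for (int i = LENGTH - 1; i >= 0; i--) {
            out[i] = ALPHABET.charAt((int) (index % ALPHABET.length()));
            index /= ALPHABET.length();
        }
        return new String(out);
    }

    /** Splits [0, size()) into contiguous half-open ranges {start, end}. */
    static List<long[]> split(int pieces) {
        long total = size();
        List<long[]> ranges = new ArrayList<>();
        for (int i = 0; i < pieces; i++) {
            ranges.add(new long[] { total * i / pieces, total * (i + 1) / pieces });
        }
        return ranges;
    }

    public static void main(String[] args) {
        System.out.println("candidates: " + size());
        long[] first = split(16).get(0);
        System.out.println("first piece: " + candidate(first[0])
                + " .. " + candidate(first[1] - 1));
    }
}
```

With 16 pieces, each range holds 923,521 candidates (31^4), so a piece can be described by two numbers instead of a list of passwords.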

The main program then proceeds to pick up the remaining pieces from the space, since it has already found the solution it needs and nobody has to work on them. The program never knew how many password-breaking programs were available, nor did it know where they were located. The password-breaking programs had no knowledge about one another or about the main program. If there were 16 password-breaking programs available, and each one was on a separate machine, we would've had 16 machines working on breaking the password simultaneously!

No change to any configuration of the system is required to add new password-breaking programs. This is why spaces are so good for fault tolerance, load balancing and scalability.
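The password example can be sketched inside a single JVM. This is a rough illustration under several assumptions: two in-memory queues stand in for the space (one per type of "thing"), threads stand in for the password-breaking programs, SHA-256 stands in for the unnamed one-way function, and passwords are shortened to three characters so the demo finishes quickly. All names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class PasswordSearch {
    static final String ALPHABET =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    static final int LENGTH = 3; // shortened from four so the demo finishes quickly

    /** One piece of work: a half-open index range plus the digest to match. */
    static final class Piece {
        final long start, end;
        final byte[] target;
        Piece(long start, long end, byte[] target) {
            this.start = start; this.end = end; this.target = target;
        }
    }

    /** Stand-in one-way function (SHA-256); the article does not name one. */
    static byte[] oneWay(String input) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Maps an index to its LENGTH-character candidate password. */
    static String candidate(long index) {
        char[] out = new char[LENGTH];
        for (int i = LENGTH - 1; i >= 0; i--) {
            out[i] = ALPHABET.charAt((int) (index % ALPHABET.length()));
            index /= ALPHABET.length();
        }
        return new String(out);
    }

    /** Worker: keep taking pieces until none are left or a match is found. */
    static void work(BlockingQueue<Piece> space, BlockingQueue<String> solutions) {
        Piece piece;
        while ((piece = space.poll()) != null) {
            for (long i = piece.start; i < piece.end; i++) {
                String guess = candidate(i);
                if (Arrays.equals(oneWay(guess), piece.target)) {
                    solutions.add(guess);
                    return;
                }
            }
        }
    }

    /** Main program: write the pieces, then wait for a solution (null if none). */
    static String crack(byte[] target, int pieceCount, int workerCount) {
        BlockingQueue<Piece> space = new LinkedBlockingQueue<>();
        BlockingQueue<String> solutions = new LinkedBlockingQueue<>();
        long total = 1;
        for (int i = 0; i < LENGTH; i++) total *= ALPHABET.length();
        for (int i = 0; i < pieceCount; i++) {
            space.add(new Piece(total * i / pieceCount,
                                total * (i + 1) / pieceCount, target));
        }
        Thread[] workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++) {
            workers[i] = new Thread(() -> work(space, solutions));
            workers[i].start();
        }
        try {
            String solution = null;
            while (solution == null) {
                solution = solutions.poll(50, TimeUnit.MILLISECONDS);
                if (solution == null
                        && Arrays.stream(workers).noneMatch(Thread::isAlive)) {
                    solution = solutions.poll(); // final check after all workers exit
                    break;
                }
            }
            space.clear(); // remove the pieces nobody needs to work on any more
            for (Thread t : workers) t.join();
            return solution;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(crack(oneWay("aZ9"), 16, 4));
    }
}
```

Changing the worker count requires no change to the main program or to the other workers, which is the property the example is meant to show. A real space would add the network access and persistence that this sketch leaves out.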

As you can see, spaces provide an extremely powerful concept/mechanism to decouple cooperating or dependent systems. The concept of a space isn't new, however. Tuple spaces were first described in 1982 in the context of a programming language called Linda. Linda consisted of tuples, which were collections of data grouped together, and the tuple space, which was the shared blackboard from which applications could place and retrieve tuples. The concept never gained much popularity outside of academia, however. Today spaces may be an elegant solution to many of the traditional distributed computing dilemmas. In recognition of this fact, JavaSoft has created its own implementation of the space concept, JavaSpaces, and IBM has created TSpaces, which is much more functional and complex than JavaSpaces. (We won't discuss IBM's TSpaces in this article.)

We're now in a position to describe some of the key characteristics of a space:

  • Spaces provide shared access: A space provides a network-accessible "shared memory" that can be accessed concurrently by many remote or local processes. The space handles all issues regarding concurrent access, allowing the processes to focus on the task at hand. At the very least, spaces provide processes with the ability to place and retrieve "things." Some spaces also provide the ability to read/peek at things (i.e., to get the thing without actually removing it from the space, thus allowing other processes to access it as well).
  • Spaces are persistent: A space provides reliable storage for processes to place "things." These "things" may outlive the processes that created them. Persistence also allows dependent/cooperating processes to work together even when they have nonoverlapping life cycles, and boosts the fault tolerance and high-availability capability of distributed systems.
  • Spaces are associative: Associative lookup allows processes to "find" the "things" they're interested in. As many processes may be using/sharing the same space, many different "things" may be in the space. It's important for processes to be able to get the "things" they require without having to filter out the "noise" themselves. This is possible because spaces allow processes to define filters/templates that instruct/direct the space to "find" the right "things" for that process.

These are just a few key characteristics of spaces. Many commercial space implementations, such as the ones from JavaSoft and IBM, have additional characteristics such as the ability to perform "transacted" operations on the space.
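To make shared access and associative lookup concrete, here is a minimal in-memory sketch. It assumes that a template can be modeled as a java.util.function.Predicate, and it omits persistence and network access entirely. The class name TinySpace and its methods are illustrative and don't correspond to JavaSpaces, TSpaces or any other product.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

class TinySpace {
    private final List<Object> things = new ArrayList<>();

    /** Places a "thing" in the space. */
    synchronized void write(Object thing) {
        things.add(thing);
    }

    /** Returns the first matching thing without removing it (a "peek"). */
    synchronized Optional<Object> read(Predicate<Object> template) {
        return things.stream().filter(template).findFirst();
    }

    /** Removes and returns the first matching thing, if there is one. */
    synchronized Optional<Object> take(Predicate<Object> template) {
        for (Iterator<Object> it = things.iterator(); it.hasNext();) {
            Object thing = it.next();
            if (template.test(thing)) {
                it.remove();
                return Optional.of(thing);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        TinySpace space = new TinySpace();
        space.write("an XML document, say");
        space.write(42);
        // Only Integer "things" match this template; the String is left alone.
        System.out.println(space.take(t -> t instanceof Integer));
    }
}
```

The synchronized methods mean callers never coordinate with one another, only with the space.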

JavaSoft's Implementation: JavaSpaces
JavaSpaces technology, a new realization of the tuple spaces concept described above, is an implementation that's available free from JavaSoft. JavaSpaces is built on top of another complex technology, Jini, a Java-based technology that allows any device to become network aware. Jini provides a complex yet elegant programming model that realizes the Jini team's vision of "network anything, anytime, anywhere."

The goal of JavaSpaces is to provide what might be thought of as a file system for objects. Like other JavaSoft APIs, JavaSpaces provides a simple yet powerful set of features to developers. As I see it, however, JavaSpaces has four drawbacks:

  1. The implementation of JavaSpaces is complex to install.
  2. The fact that it builds on top of Jini makes it a little too heavy, especially if there are no plans to use Jini elsewhere in the project.
  3. JavaSpaces relies on Java RMI, the suitability of which for highly scalable commercial applications is a topic of debate among many software gurus.
  4. JavaSpaces works only with serializable Java objects.

Creating Your Own Space Implementation
Even though commercial implementations of spaces are available in the market, there are several reasons to create your own. If you work in a start-up company, budget constraints may be a big reason. Also, the functionality offered by a commercial implementation may be too much for the job at hand. Not only may this result in a steeper learning curve, it may even slow down your application due to the sheer size of the memory footprint. Finally, it's always fun to create your own implementation.

At Online Insight we decided to create our own implementation. The primary reasons for our decision were our limited set of requirements and the extremely lightweight implementation we required to achieve our scalability and performance goals.

Our requirements can be summarized as follows:

  1. The space must support shared access.
  2. The space must be persistent.
  3. The space must provide the ability to specify a filtering template.
  4. The space must allow one "thing" to be accessed by only one process/application at a time (i.e., we don't support the "read" operation).
  5. The space must perform and scale well under load.
  6. The space must be accessible to other CORBA objects.
  7. The space must not impose a limitation on what you can put in it (unlike JavaSpaces, for example).
  8. The space must not impose size limitations on what you can put in it (the underlying hardware, however, may impose a limitation).

Note that the first three requirements are also the key characteristics of a space described earlier.

Java Message Service
At the time we were evaluating message queue–type software – specifically, Java Message Service (JMS) implementations – we realized that we could build our space facility on top of one of these queues.

JMS is an API for accessing enterprise-messaging systems from Java programs. It defines a common set of enterprise-messaging concepts and facilities, and attempts to minimize the set of concepts a Java language programmer must learn in order to use enterprise-messaging products such as IBM MQSeries. JMS also strives to maximize the portability of messaging applications. It doesn't, however, address load balancing/fault tolerance, error notification, administration of the message queue or security issues. These are all message queue vendor–specific and outside the domain of JMS.

By using message queues that expose a JMS interface, we allow ourselves the flexibility to switch vendors of message queues if we discover that the selected one doesn't meet our scalability requirements. This separation of implementation from interface is an important design pattern (see the Bridge design pattern in Design Patterns by Gamma et al., published by Addison-Wesley). Since each JMS implementation has its own unique way of getting the initial connection factory, we defined a Java interface with one method, "getConnectionFactory", which returns the initial connection factory.

Each space is configured through a properties file. One property in this file is the fully qualified name of the class that implements this interface. There is one such class for each JMS implementation supported by the space. For example, we created one class for Sun's Java Message Queue and one for Progress Software's SonicMQ. As a result, changing the underlying message queue used by the space is simply a matter of changing the name of the Java class in the properties file for the space. Therefore, if one vendor's message queue doesn't live up to our expectations, we can quickly switch to another.
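The pluggable-factory idea can be sketched as follows. This is an assumption-laden illustration rather than our production code: the interface name SpaceFactory and the class DemoSpaceFactoryImpl are hypothetical, and the method returns Object so the sketch compiles without a JMS library on the classpath (a real version would return javax.jms.ConnectionFactory).

```java
import java.util.Properties;

/** Hypothetical stand-in for the one-method interface described above. */
interface SpaceFactory {
    // A real version would return javax.jms.ConnectionFactory.
    Object getConnectionFactory();
}

/** Illustrative vendor adapter; a real one would call the vendor's own API. */
class DemoSpaceFactoryImpl implements SpaceFactory {
    public Object getConnectionFactory() {
        return "demo-connection-factory";
    }
}

class SpaceFactoryLoader {
    /** Instantiates the class named by the SpaceFactory property. */
    static SpaceFactory load(Properties props) {
        String className = props.getProperty("SpaceFactory");
        if (className == null) {
            throw new IllegalArgumentException("SpaceFactory property is missing");
        }
        try {
            return (SpaceFactory) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("SpaceFactory", DemoSpaceFactoryImpl.class.getName());
        System.out.println(load(props).getConnectionFactory());
    }
}
```

Because the space only ever sees the interface, supporting another vendor means writing one more small class and editing one line of configuration.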

The space implementation itself is a CORBA object that has the following interface:

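interface Space
{
    // Store a stream of bytes with no filtering properties attached.
    void write(in ByteStream blob) raises (SpaceException);

    // Remove and return a stream of bytes; no filtering is performed.
    ByteStream take() raises (SpaceException);

    // Store a stream of bytes together with its filters (name-value pairs).
    void write_filter(in ByteStream blob, in FilterSeq f)
        raises (SpaceException);

    // Remove and return a stream of bytes that matches the given filters.
    ByteStream take_filter(in FilterSeq f) raises (SpaceException);

    // As take_filter, but the filter is given as a string expression.
    ByteStream take_filter_as_string(in string f)
        raises (SpaceException);

    // Shut the space down cleanly.
    void shutdown();
};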

The type ByteStream simply evaluates to a stream of bytes. Hence, anything that can be represented as a stream of bytes, such as a CORBA object IOR, a serialized Java object or an XML document, can be stored in the space and retrieved.

Each space instance has three properties: a name, a property that indicates if this instance of the space is persistent and a property that indicates if this instance of the space allows filters. The reason there are properties to turn the persistence and filtering off is purely for performance.

Not all spaces in our application domain are required to be persistent, in which case persistence is a performance bottleneck because it involves writing out to a database or similar storage mechanism. Similarly, if filtering isn't required, it's a performance bottleneck. As mentioned above, each space is configured through a properties file, which has properties indicating the space name, the persistence status (on/off) and the filtering status (on/off) of the space.

An example of the properties file used in configuring the space is shown below:

SpaceName=MySpace
AllowFilter=true
Persistent=true

# The factory to use to get the initial Connection Factory
SpaceFactory=SonicMQSpaceFactoryImpl

The "SpaceName" property is the name of the space, "AllowFilter" is a boolean property where true means the space turns filter support on and "Persistent" is a boolean property where true means the space turns persistence on. "SpaceFactory" is set to the fully qualified name of the class that allows us to get the initial connection factory from the message queue. In the foregoing example, this property is set to a class that works with the SonicMQ implementation.
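Reading such a file takes only a few lines with java.util.Properties. The sketch below is one possible way to do it; the SpaceConfig class is hypothetical, and treating a missing boolean property as false is my assumption rather than documented behavior of the space.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class SpaceConfig {
    final String name;
    final boolean allowFilter;
    final boolean persistent;
    final String factoryClass;

    private SpaceConfig(Properties p) {
        name = p.getProperty("SpaceName");
        // Assumption: a missing boolean property means the feature is off.
        allowFilter = Boolean.parseBoolean(p.getProperty("AllowFilter", "false"));
        persistent = Boolean.parseBoolean(p.getProperty("Persistent", "false"));
        factoryClass = p.getProperty("SpaceFactory");
    }

    /** Parses the text of a space properties file. */
    static SpaceConfig parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text)); // lines starting with '#' are comments
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new SpaceConfig(p);
    }

    public static void main(String[] args) {
        SpaceConfig c = parse(String.join("\n",
                "SpaceName=MySpace",
                "AllowFilter=true",
                "Persistent=true",
                "# The factory to use to get the initial Connection Factory",
                "SpaceFactory=SonicMQSpaceFactoryImpl"));
        System.out.println(c.name + " filter=" + c.allowFilter
                + " persistent=" + c.persistent + " factory=" + c.factoryClass);
    }
}
```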

During start-up each space installs itself in the CORBA Name Service using its name property as the binding name and in the CORBA Trader Service with the name, persistence and filter properties. Thus interested applications/processes can find a space by using a well-known name from the CORBA Name Service or the space properties from the CORBA Trader Service. For example, an application that wants filtering but isn't interested in persistence can indicate these requirements to the CORBA Trader Service, which will then provide the application with a list of CORBA space references that match these requirements. The application may then choose one from that list based on some further screening.

Our implementation of the space gains all its persistence and filtering capabilities from the underlying messaging queue provider. Our space is the only client of the message queue. In our implementation the only purpose the message queue serves is as a high-quality storage/retrieval mechanism that also provides filtering capabilities. We aren't relying on the queuing facilities per se.

Each method of the CORBA interface is detailed below:

  • write: This method is called by an application when it wants to put a stream of bytes into the space and doesn't want to attach filtering properties to the stream.
  • write_filter: This method is used by an application when it wants to put a stream of bytes into the space and wants to attach filtering properties to the stream. The type FilterSeq evaluates to an array of filters that are attached to that bytestream. A filter is a name-value pair. Hence, a FilterSeq is an array of name-value pairs.
  • take: This method is called by an application when it wants to retrieve a stream of bytes from the space. No filtering is performed since none is specified.
  • take_filter: This method is called by an application when it wants to retrieve a stream of bytes from the space. However, in this case a FilterSeq is provided. For a match to occur, the bytestream must have a subset of the filters provided in the method call, and the value of each filter attached to the bytestream must match the value for the corresponding filter in the method call.
  • take_filter_as_string: This method is called by an application when it wants to retrieve a stream of bytes from the space. In this case a string that specifies the exact filter is provided. For a match to occur, the filter properties attached to the bytestream must satisfy the filter string provided in the method call. This method is used when the filtering conditions can't be specified as a FilterSeq.
  • shutdown: This method is called to shut down the space. The shutdown is clean, which means the registration with the Name Service and the Trader Service is removed.

The space implements all methods in the interface as synchronized. Furthermore, the take implementations are nonblocking; that is, if there's nothing to take, the method returns with nothing.
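The take_filter matching rule and the nonblocking behavior can be sketched in memory. This is only an illustration: our space delegates storage and filtering to the message queue, whereas the sketch keeps everything in a list. A Map stands in for FilterSeq, null stands in for "nothing," and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class FilteredSpace {
    /** A stored bytestream plus the filters attached to it. */
    private static final class Stored {
        final byte[] blob;
        final Map<String, String> filters;
        Stored(byte[] blob, Map<String, String> filters) {
            this.blob = blob.clone();
            this.filters = new HashMap<>(filters);
        }
    }

    private final List<Stored> entries = new ArrayList<>();

    synchronized void write(byte[] blob) {
        entries.add(new Stored(blob, new HashMap<>()));
    }

    synchronized void writeFilter(byte[] blob, Map<String, String> filters) {
        entries.add(new Stored(blob, filters));
    }

    /** Nonblocking: returns null when the space is empty. */
    synchronized byte[] take() {
        return entries.isEmpty() ? null : entries.remove(0).blob;
    }

    /** Nonblocking: returns null when nothing matches. */
    synchronized byte[] takeFilter(Map<String, String> wanted) {
        for (Iterator<Stored> it = entries.iterator(); it.hasNext();) {
            Stored s = it.next();
            if (matches(s.filters, wanted)) {
                it.remove();
                return s.blob;
            }
        }
        return null;
    }

    /** True when every attached filter appears in `wanted` with the same value. */
    static boolean matches(Map<String, String> attached, Map<String, String> wanted) {
        for (Map.Entry<String, String> f : attached.entrySet()) {
            if (!f.getValue().equals(wanted.get(f.getKey()))) {
                return false;
            }
        }
        return true;
    }
}
```

Read literally, the subset rule means a bytestream written with no filters matches every take_filter call, since the empty set is a subset of any FilterSeq.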

Conclusion
Distributed applications can be notoriously difficult to design, build and debug. The distributed environment introduces many complexities that aren't present when writing stand-alone applications. Some of these challenges are network latency, synchronization and concurrency, and partial failure.

Space-based programming, although not a silver bullet, is an excellent concept that can lead to an elegant solution to these problems. It takes us one step closer to achieving our goals in a distributed system, namely those of scalability, high availability, loose coupling and performance. It also helps us face the challenges mentioned above. Best of all, you don't have to buy an expensive implementation to get started with this excellent concept. It's fairly easy to create a homegrown implementation that satisfies your requirements...and it's fun, too!

Resources

  1. Linda Group: www.cs.yale.edu/HTML/YALE/CS/Linda/linda.html
  2. JavaSpaces homepage: www.javasoft.com/products/javaspaces/
  3. IBM, TSpaces: www.almaden.ibm.com/cs/TSpaces/
  4. Carriero, N.J. (1987). "Implementation of Tuple Space Machines," PhD thesis, Yale University, Department of Computer Science.
  5. Segall, E.J. (1993). "Tuple Space Operations: Multiple-Key Search, Online Matching and Wait-Free Synchronization," PhD thesis, Rutgers University, Department of Computer Science.
  6. Gul, A., et al. "ActorSpaces: An Open Distributed Programming Paradigm," University of Illinois at Urbana-Champaign, ULIUENG-92-1846.

More Stories By Tarak Modi

Tarak Modi, a certified Java programmer, is a lead systems architect at Online Insight where he's responsible for setting, directing, and implementing the vision and strategy of the company's product line from a technical and architectural perspective. Tarak has worked with Java, C++, and technologies such as EJB, Corba, and DCOM, and holds a BS in EE, an MS in computer engineering, and an MBA with a concentration in IS.
