By Tarak Modi
October 1, 2000 12:00 AM EDT
One of the problems of highly distributed systems is figuring out how systems discover each other. After all, the whole point of having systems distributed is to allow flexible and perhaps even dynamic configurations to maximize system performance and availability. How do these distributed components of one system or multiple systems discover each other? And once they're discovered, how do we allow enough flexibility, such as rediscovery, to allow their fail-safe operation?
Space-based programming may provide us with a good answer to these questions and more. In this article I'll describe what a space is and how it can be used to mitigate some of the issues mentioned above. I'll also describe a technique for converting an ordinary message queue into a space.
What Is a Space?
Conventional distributed tools rely on passing messages between processes (asynchronous communication) or invoking methods on remote objects (synchronous communication). A space is an extension of the asynchronous communication model in which two processes are not passing messages to one another. In fact, the processes are totally unaware of each other.
In Figure 1, Process 1 places a message into the space. Process 2, which has been waiting for this type of message, takes the message out of the space and processes it. Based on the results, it places another message into the space. Process 3, which has been waiting for this type of message, takes the message out of the space.
Following are highlights of the preceding discussion:
- The space may contain different types of messages. In fact, I used the term message for clarity. These messages are actually just "things" (the message may be an object, an XML document or anything else that the space allows to be put in it). In Figure 1 the different shapes in the space illustrate the different types of messages.
- The three processes involved have no knowledge of one another. All they know is that they put a message in a space and get a message out of the space.
- As in the message-passing scenario, we aren't limited to two processes communicating asynchronously, but rather any number of processes communicating via a common space. This allows the creation of loosely coupled systems that can be highly distributed and extremely flexible, and can provide high availability and dynamic load balancing.
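The flow in Figure 1 can be sketched in a few lines of Java. This is only an illustrative in-memory toy, not any product's API: the class names `SimpleSpace` and `SimpleSpaceDemo` are hypothetical, threads stand in for the three processes, and the Java class of a "thing" stands in for its message type.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical in-memory space: the threads share it but never reference each other.
class SimpleSpace {
    private final List<Object> things = new ArrayList<>();

    // Place any kind of "thing" into the space and wake up waiting takers.
    public synchronized void write(Object thing) {
        things.add(thing);
        notifyAll();
    }

    // Remove and return the first thing of the requested type,
    // blocking until one becomes available.
    public synchronized <T> T take(Class<T> type) throws InterruptedException {
        while (true) {
            Iterator<Object> it = things.iterator();
            while (it.hasNext()) {
                Object candidate = it.next();
                if (type.isInstance(candidate)) {
                    it.remove();
                    return type.cast(candidate);
                }
            }
            wait(); // nothing suitable yet; sleep until the next write
        }
    }
}

class SimpleSpaceDemo {
    // Runs the three-process flow and returns what "Process 3" receives.
    static String run() {
        SimpleSpace space = new SimpleSpace();
        StringBuilder received = new StringBuilder();

        // Process 2: waits for an Integer, processes it, writes a String back.
        Thread process2 = new Thread(() -> {
            try {
                int n = space.take(Integer.class);
                space.write("squared=" + (n * n));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // Process 3: waits for a String.
        Thread process3 = new Thread(() -> {
            try {
                received.append(space.take(String.class));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        process2.start();
        process3.start();
        space.write(7); // Process 1: places the first message into the space
        try {
            process2.join();
            process3.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

None of the three participants holds a reference to another; each one knows only the space and the type of thing it is waiting for.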
Consider password breaking as an example. Assume that passwords are exactly four characters long and that only alphanumeric ASCII characters are used. This gives us 62^4 = 14,776,336 possible passwords. Using the brute force technique to break the password, assume that the main program breaks the input set into 16 pieces and puts each piece along with the encrypted password in the space. The password-breaking programs watch the space for such pieces and each available program immediately grabs a piece and starts working. The programs continue until no more such pieces are available or until the password has been broken. If the password is broken, the breaking program puts the solution in the space, which is picked up by the main program.
The main program then proceeds to pick up the remaining pieces, since it has already found the solution it needs. The program never knew how many password-breaking programs were available, nor did it know where they were located. The password-breaking programs had no knowledge about one another or about the main program. If there were 16 password-breaking programs available, and each one was on a separate machine, we would've had 16 machines working on breaking the password simultaneously!
No change to any configuration of the system is required to add new password-breaking programs. This is why spaces are so good for fault tolerance, load balancing and scalability.
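The password example can be sketched as follows. Several things here are assumptions made for illustration: the class name `PasswordBreaker`, the use of SHA-256 as a stand-in for whatever encryption the real system uses, a `BlockingQueue` playing the role of the space, and an `AtomicReference` standing in for the "solution" entry that a real breaking program would write back into the space.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

class PasswordBreaker {
    static final String ALPHABET =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    static final int LENGTH = 4;
    static final long TOTAL = 62L * 62 * 62 * 62; // 62^4 = 14,776,336 candidates

    // Decode a candidate index into a four-character password (base 62).
    static String candidate(long index) {
        char[] chars = new char[LENGTH];
        for (int i = LENGTH - 1; i >= 0; i--) {
            chars[i] = ALPHABET.charAt((int) (index % 62));
            index /= 62;
        }
        return new String(chars);
    }

    // Stand-in for the real encryption routine (an assumption of this sketch).
    static byte[] encrypt(String password) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(password.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    static String crack(byte[] encrypted, int pieces, int workers) {
        // Main program: split the input set into index ranges [start, end)
        // and put each piece into the "space".
        BlockingQueue<long[]> space = new LinkedBlockingQueue<>();
        long size = TOTAL / pieces;
        for (int p = 0; p < pieces; p++) {
            long start = p * size;
            long end = (p == pieces - 1) ? TOTAL : start + size;
            space.add(new long[] {start, end});
        }

        AtomicReference<String> solution = new AtomicReference<>();
        Thread[] pool = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            // Each breaker grabs pieces until none are left or a solution exists.
            pool[w] = new Thread(() -> {
                long[] piece;
                while (solution.get() == null && (piece = space.poll()) != null) {
                    for (long i = piece[0]; i < piece[1] && solution.get() == null; i++) {
                        String guess = candidate(i);
                        if (Arrays.equals(encrypt(guess), encrypted)) {
                            solution.set(guess);
                        }
                    }
                }
            });
            pool[w].start();
        }
        for (Thread t : pool) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        space.clear(); // main program picks up any pieces that were never needed
        return solution.get();
    }
}
```

Changing the `workers` argument is the only thing needed to add breakers; the main program's partitioning logic does not change, which mirrors the point made above about adding password-breaking programs without reconfiguration.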
As you can see, spaces provide an extremely powerful concept/mechanism to decouple cooperating or dependent systems. The concept of a space isn't new, however. Tuple spaces were first described in 1982 in the context of a programming language called Linda. Linda consisted of tuples, which were collections of data grouped together, and the tuple space, which was the shared blackboard from which applications could place and retrieve tuples. The concept never gained much popularity outside of academia, however. Today spaces may be an elegant solution to many of the traditional distributed computing dilemmas. In recognition of this fact, JavaSoft has created its own implementation of the space concept, JavaSpaces, and IBM has created TSpaces, which is much more functional and complex than JavaSpaces. (We won't discuss IBM's TSpaces in this article.)
We're now in a position to describe some of the key characteristics of a space:
- Spaces provide shared access: A space provides a network-accessible "shared memory" that can be accessed by many remote or local processes concurrently. The space handles all issues regarding concurrent access, allowing the processes to focus on the task at hand. At the very least, spaces provide processes with the ability to place and retrieve "things." Some spaces also provide the ability to read/peek at things (i.e., to get the thing without actually removing it from the space, thus allowing other processes to access it as well).
- Spaces are persistent: A space provides reliable storage for processes to place "things." These "things" may outlive the processes that created them. It also allows the dependent/cooperating processes to work together even when they have nonoverlapping life cycles, and boosts the fault tolerance and high-availability capability of distributed systems.
- Spaces are associative: Associative lookup allows processes to "find" the "things" they're interested in. As many processes may be using/sharing the same space, many different "things" may be in the space. It's important for processes to be able to get the "things" they require without having to filter out the "noise" themselves. This is possible because spaces allow processes to define filters/templates that instruct/direct the space to "find" the right "things" for that process.
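The difference between reading and taking, and the idea of a template, can be sketched as below. This assumes one common matching convention (used, for example, by JavaSpaces, where unset template fields act as wildcards): an entry matches when it contains every name/value pair in the template. The class `TemplateSpace` is hypothetical, entries are modeled as simple string maps, and the lookups return null instead of blocking to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class TemplateSpace {
    private final List<Map<String, String>> entries = new ArrayList<>();

    public synchronized void write(Map<String, String> entry) {
        entries.add(new HashMap<>(entry));
    }

    // An entry matches when it contains every name/value pair of the template;
    // fields the template leaves out act as wildcards.
    private static boolean matches(Map<String, String> entry, Map<String, String> template) {
        return entry.entrySet().containsAll(template.entrySet());
    }

    // read: return a copy of the first match and leave it in the space.
    public synchronized Map<String, String> read(Map<String, String> template) {
        for (Map<String, String> entry : entries) {
            if (matches(entry, template)) {
                return new HashMap<>(entry);
            }
        }
        return null; // no match
    }

    // take: remove the first match from the space and return it.
    public synchronized Map<String, String> take(Map<String, String> template) {
        Iterator<Map<String, String>> it = entries.iterator();
        while (it.hasNext()) {
            Map<String, String> entry = it.next();
            if (matches(entry, template)) {
                it.remove();
                return entry;
            }
        }
        return null; // no match
    }
}
```

A process interested only in invoices would pass a template such as `Map.of("type", "invoice")` and never see the other things in the space.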
JavaSoft's Implementation: JavaSpaces
JavaSpaces technology, a new realization of the tuple spaces concept described above, is an implementation that's available free from JavaSoft. JavaSpaces is built on top of another complex technology, Jini, a Java-based technology that allows any device to become network aware. Jini provides a complex yet elegant programming model that realizes the Jini team's vision of "network anything, anytime, anywhere."
The goal of JavaSpaces is to provide what might be thought of as a file system for objects. Like other JavaSoft APIs, JavaSpaces provides a simple yet powerful set of features to developers. As I see it, however, JavaSpaces has four drawbacks:
- The implementation of JavaSpaces is complex to install.
- The fact that it builds on top of Jini makes it a little too heavy, especially if there are no plans to use Jini elsewhere in the project.
- JavaSpaces relies on Java RMI, the suitability of which for highly scalable commercial applications is a topic of debate among many software gurus.
- JavaSpaces works only with serializable Java objects.
Even though commercial implementations of spaces are available in the market, there are several reasons to create your own. If you work in a start-up company, budget constraints may be a big reason. Also, the functionality offered by a commercial implementation may be too much for the job at hand. Not only may this result in a larger learning curve, it may even slow down your application due to the sheer size of the memory footprint. Finally, it's always fun to create your own implementation.
At Online Insight we decided to create our own implementation. The primary reasons for our decision were our limited set of requirements and the extremely lightweight implementation we required to achieve our scalability and performance goals.
Our requirements can be summarized as follows:
- The space must support shared access.
- The space must be persistent.
- The space must provide the ability to specify a filtering template.
- The space must allow one "thing" to be accessed by only one process/application at a time (i.e., we don't support the "read" operation).
- The space must perform and scale well under load.
- The space must be accessible to other CORBA objects.
- The space must not impose a limitation on what you can put in it (unlike JavaSpaces, for example).
- The space must not impose size limitations on what you can put in it (the underlying hardware, however, may impose a limitation).
Java Message Service
While evaluating message queue-type software, specifically Java Message Service (JMS) implementations, we realized that we could build our space facility on top of one of these queues.
JMS is an API for accessing enterprise-messaging systems from Java programs. It defines a common set of enterprise-messaging concepts and facilities, and attempts to minimize the set of concepts a Java language programmer must learn to use enterprise-messaging products such as IBM MQSeries. JMS also strives to maximize the portability of messaging applications. It doesn't, however, address load balancing/fault tolerance, error notification, administration of the message queue or security issues. These are all message queue vendor-specific and outside the domain of JMS.
By using message queues that expose a JMS interface, we allow ourselves the flexibility to switch vendors of message queues if we discover that the selected one doesn't meet our scalability requirements. This separation of implementation from interface is an important design pattern (see the Bridge design pattern in Design Patterns by Gamma et al., published by Addison-Wesley). Since each JMS implementation has its own unique way of getting the initial connection factory, we defined a Java interface with one method, "getConnectionFactory", which returns the initial connection factory.
Each space is configured through a properties file. One property in this file is the fully qualified name of the class that implements this interface. There is one such class for each JMS implementation supported by the space. For example, we created one class for Sun's Java Message Queue and one for Progress Software's SonicMQ. By doing this, changing the underlying message queue used by the space is simply a matter of changing the name of the Java class in the properties file for the space. Therefore, if one vendor's message queue doesn't live up to our expectations, we can quickly switch to another.
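The pluggable-vendor arrangement described above might look roughly like the following. Only the method name `getConnectionFactory` and the `SpaceFactory` property name come from this article; the interface name `ConnectionFactoryProvider`, the `DemoQueueProvider` class and the loader are hypothetical. The return type is `Object` purely to keep the sketch free of a JMS dependency; a real version would return the JMS connection factory type.

```java
import java.util.Properties;

// Hypothetical name for the one-method interface described in the text.
interface ConnectionFactoryProvider {
    // In a real space this would return a JMS connection factory.
    Object getConnectionFactory();
}

// Illustrative vendor adapter; a real one would hold the vendor-specific
// code for obtaining the initial connection factory.
class DemoQueueProvider implements ConnectionFactoryProvider {
    @Override
    public Object getConnectionFactory() {
        return "demo-connection-factory";
    }
}

class ProviderLoader {
    // Instantiate the provider whose class name is given by the
    // "SpaceFactory" property of the space's configuration.
    static ConnectionFactoryProvider load(Properties config) {
        String className = config.getProperty("SpaceFactory");
        try {
            return (ConnectionFactoryProvider)
                    Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load provider " + className, e);
        }
    }
}
```

Because the space only ever sees the interface, switching message queue vendors reduces to editing one property value, as long as an adapter class for the new vendor is on the classpath.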
The space implementation itself is a CORBA object that has the following interface:
void write(in ByteStream blob) raises (SpaceException);
ByteStream take() raises (SpaceException);
void write_filter(in ByteStream blob, in FilterSeq f) raises (SpaceException);
ByteStream take_filter(in FilterSeq f) raises (SpaceException);
ByteStream take_filter_as_string(in string f) raises (SpaceException);
The type ByteStream simply evaluates to a stream of bytes. Hence, anything that can be represented as a stream of bytes, such as a CORBA object IOR, a serialized Java object or an XML document, can be stored in the space and retrieved.
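To illustrate what "anything that can be represented as a stream of bytes" means for a Java client, here is a small sketch that turns a serializable object or an XML string into a byte array and back. The helper class `ByteStreamCodec` is hypothetical and uses only standard Java serialization and UTF-8 encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ByteStreamCodec {
    // Serialize a Java object into bytes suitable for writing to the space.
    static byte[] fromObject(Serializable obj) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Rebuild the object from bytes taken out of the space.
    // Only deserialize data from sources you trust.
    static Object toObject(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Text such as an XML document or a stringified IOR is just encoded bytes.
    static byte[] fromText(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    static String toText(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }
}
```

The space itself never needs to know which of these encodings was used; the writer and the taker only have to agree on it.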
Each space instance has three properties: a name, a property that indicates if this instance of the space is persistent and a property that indicates if this instance of the space allows filters. Persistence and filtering can be turned off purely for performance reasons.
Not all spaces in our application domain are required to be persistent, in which case persistence is a performance bottleneck because it involves writing out to a database or similar storage mechanism. Similarly, if filtering isn't required, it's a performance bottleneck. As mentioned above, each space is configured through a properties file, which has the property indicating the space name, the persistence status (on/off) and the filtering status (on/off) of the space.
An example of the properties file used in configuring the space is shown below:
SpaceName=MySpace
# The factory to use to get the initial Connection Factory
The "SpaceName" property is the name of the space, "AllowFilter" is a boolean property where true means the space turns filter support on, and "Persistent" is a boolean property where true means the space turns persistence on. "SpaceFactory" is set to the fully qualified name of the class that allows us to get the initial connection factory from the message queue. In the foregoing example, this property is set to a class that works with the SonicMQ implementation.
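As a sketch of how such a file could be read, the following uses `java.util.Properties` with the four property names described above. The property values in `EXAMPLE` (other than `MySpace`), the class name `com.example.space.SonicMQFactory`, the `SpaceConfig` class itself and the choice to default missing boolean properties to false are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class SpaceConfig {
    // Illustrative file contents; values are hypothetical.
    static final String EXAMPLE = String.join("\n",
        "SpaceName=MySpace",
        "AllowFilter=true",
        "Persistent=false",
        "# The factory to use to get the initial Connection Factory",
        "SpaceFactory=com.example.space.SonicMQFactory");

    final String name;
    final boolean allowFilter;
    final boolean persistent;
    final String factoryClass;

    SpaceConfig(Properties p) {
        name = p.getProperty("SpaceName");
        // Assumption: a missing boolean property means the feature is off.
        allowFilter = Boolean.parseBoolean(p.getProperty("AllowFilter", "false"));
        persistent = Boolean.parseBoolean(p.getProperty("Persistent", "false"));
        factoryClass = p.getProperty("SpaceFactory");
    }

    // Parse properties-file text into a configuration object.
    static SpaceConfig parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return new SpaceConfig(p);
    }
}
```

In a deployment the text would come from a file on disk rather than a string constant, for example via `Properties.load` on a `FileReader`.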
During start-up each space installs itself in the CORBA Name Service using its name property as the binding name and in the CORBA Trader Service with the name, persistence and filter properties. Thus interested applications/processes can find a space by using a well-known name from the CORBA Name Service or the space properties from the CORBA Trader Service. For example, an application that wants filtering but isn't interested in persistence can indicate these requirements to the CORBA Trader Service, which will then provide the application with a list of CORBA space references that match these requirements. The application may then choose one from that list based on some further screening.
Our implementation of the space gains all its persistence and filtering capabilities from the underlying messaging queue provider. Our space is the only client of the message queue. In our implementation the only purpose the message queue serves is as a high-quality storage/retrieval mechanism that also provides filtering capabilities. We aren't relying on the queuing facilities per se.
Each method of the CORBA interface is detailed below:
- write: This method is called by an application when it wants to put a stream of bytes into the space and doesn't want to attach filtering properties to the stream.
- write_filter: This method is used by an application when it wants to put a stream of bytes into the space and wants to attach filtering properties to the stream. The type FilterSeq evaluates to an array of filters that are attached to that bytestream. A filter is a name-value pair. Hence, a FilterSeq is an array of name-value pairs.
- take: This method is called by an application when it wants to retrieve a stream of bytes from the space. No filtering is performed since none is specified.
- take_filter: This method is called by an application when it wants to retrieve a stream of bytes from the space. However, in this case a FilterSeq is provided. For a match to occur, the bytestream must have a subset of the filters provided in the method call, and the value of each filter attached to the bytestream must match the value for the corresponding filter in the method call.
- take_filter_as_string: This method is called by an application when it wants to retrieve a stream of bytes from the space. In this case a string that specifies the exact filter is provided. For a match to occur, the filter properties attached to the bytestream must satisfy the filter string provided in the method call. This method is used when the filtering conditions can't be specified as a FilterSeq.
- shutdown: This method is called to shut down the space. The shutdown is clean, which means the registration with the Name Service and the Trader Service is removed.
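The matching rule stated for take_filter can be written down as a small predicate. This is a literal reading of that description, not the actual implementation: the class `FilterMatcher` is hypothetical, and a FilterSeq is modeled as a map from filter name to value.

```java
import java.util.Map;

class FilterMatcher {
    // attached:  filters stored with the bytestream by write_filter
    // requested: filters passed by the caller of take_filter
    // The bytestream matches when every attached filter also appears in the
    // request with the same value, i.e. attached is a subset of requested.
    static boolean matches(Map<String, String> attached, Map<String, String> requested) {
        for (Map.Entry<String, String> filter : attached.entrySet()) {
            String wanted = requested.get(filter.getKey());
            if (wanted == null || !wanted.equals(filter.getValue())) {
                return false;
            }
        }
        return true;
    }
}
```

Read this way, a bytestream written with no filters at all would match any take_filter call, since the empty set is a subset of every request. Conditions that don't fit this name-value form are what take_filter_as_string is for.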
Distributed applications can be notoriously difficult to design, build and debug. The distributed environment introduces many complexities that aren't present when writing stand-alone applications. Some of these challenges are network latency, synchronization and concurrency, and partial failure.
Space-based programming, although not a silver bullet, is an excellent concept that can lead to an elegant solution to these problems. It takes us one step closer to achieving our goals in a distributed system, namely those of scalability, high availability, loose coupling and performance. It also helps us face the challenges mentioned above. Best of all, you don't have to buy an expensive implementation to get started with this excellent concept. It's fairly easy to create a homegrown implementation that satisfies your requirements...and it's fun, too!
References
- Linda Group: www.cs.yale.edu/HTML/YALE/CS/Linda/linda.html
- JavaSpaces homepage: www.javasoft.com/products/javaspaces/
- IBM, TSpaces: www.almaden.ibm.com/cs/TSpaces/
- Carriero, N.J. (1987). "Implementation of Tuple Space Machines," PhD thesis, Yale University, Department of Computer Science.
- Segall, E.J. (1993). "Tuple Space Operations: Multiple-Key Search, Online Matching and Wait-Free Synchronization," PhD thesis, Rutgers University, Department of Computer Science.
- Gul, A., et al. "ActorSpaces: An Open Distributed Programming Paradigm," University of Illinois at Urbana-Champaign, ULIUENG-92-1846.