Distributed Garbage Collection

As any ex-C++ software developer will attest, the Java garbage collector greatly simplifies the task of cleaning up after your objects. With distributed software applications, the garbage collector faces many new challenges since objects may be used by applications running across the Internet. This article looks at some common solutions to garbage collection in CORBA, RMI and DCOM. Finally, the distributed garbage collector in RMI is implemented on top of CORBA.

Introduction
Back in the old days of software development, programmers had to carefully keep track of all the memory used in a program and clean up each of the unused bits. Failure to properly care for memory could lead to memory actually getting lost somewhere in the ether. On some of the older operating systems, lost memory could in some circumstances be recovered only by rebooting the computer. As a result, software developers had to be intimately familiar with the size of every piece of data used in their applications. Hours were spent tracing through code to determine when data was no longer required and even more hours were spent writing procedures to properly remove the data. A whole niche in the software industry was built on the marketing of development tools to detect lost memory.

Garbage-collected languages such as Java have improved this situation dramatically. No longer do we have to worry about where our memory goes when we are done with it. The garbage collector will find it and clean it up for us. And the Java garbage collector is pretty good at its job. It runs as a low-priority thread, so you will probably never notice it cleaning up your mess.

However, distributed software has a whole new set of unique challenges. Objects which previously were used only within one program on a single computer can now be used by many different programs running on many different computers. Now that Netscape has built-in CORBA support, it won't be long before you might want to access your objects across the Web.

The garbage collector's job of finding out when an object is still in use just became a whole lot more difficult. The garbage collector used to easily determine which objects were still in use by literally looking at each object in a program, marking those which were still in use and removing the leftovers. But with distributed objects, the whole Internet could be using your objects if you let them. The garbage collector can't very well look at every object on the entire Web to determine which are still in use.
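The mark-and-remove cycle described above can be sketched in a few lines. This is a toy model for illustration only, not how any Java virtual machine is implemented; the names Node and ToyHeap are invented here.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy object: it only holds references to other objects.
class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();
    Node(String name) { this.name = name; }
}

// Hypothetical toy heap illustrating mark-and-sweep within one program.
class ToyHeap {
    final List<Node> objects = new ArrayList<>(); // everything ever allocated
    final List<Node> roots = new ArrayList<>();   // references the program holds directly

    Node allocate(String name) {
        Node n = new Node(name);
        objects.add(n);
        return n;
    }

    // Mark phase: follow references outward from a node, visiting each node once.
    private void mark(Node n, Set<Node> marked) {
        if (marked.add(n)) {
            for (Node child : n.refs) {
                mark(child, marked);
            }
        }
    }

    // Sweep phase: drop every object that was not marked.
    // Returns the number of objects removed.
    int collect() {
        Set<Node> marked = new HashSet<>();
        for (Node root : roots) {
            mark(root, marked);
        }
        int before = objects.size();
        objects.retainAll(marked);
        return before - objects.size();
    }
}
```

The sketch works only because the collector can see every object and every reference. Once references live in other programs on other machines, the mark phase has nothing to walk.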

Now add to the equation the realities of the modern Internet. Network links fail all the time. Corrupt packet routing tables bring down whole branches of the Internet temporarily. Machines occasionally crash, both clients and servers. Finally, how many of us have suffered through the occasional 50-bit-per-second connection to read our e-mail?

When a link fails or computer crashes, the distributed garbage collector must be smart enough to do the right thing, whatever that thing may be. Consider if a distributed object is running on a Web server hosted by ACME Web Service, Inc. and the distributed object is currently in use by your Web browser. First, let's say your computer suddenly crashes. In this case, you might want the distributed object to be cleaned up immediately. After all, you probably won't be able to log back into the Internet and just start over where you left off (unless the software was written by a particularly talented developer). But now, let's say that your Internet connection temporarily drops off. This happens often for periods of just a few seconds and you don't even notice it. In this situation, you don't want the garbage collector to go after your object; you'll be back in just a few moments. The distributed garbage collector has to walk a fine line to satisfy everyone.

The CORBA Approach
CORBA uses a combination of reference counting and Internet connection management in order to perform distributed garbage collection. Once a server object has been instantiated, the reference count to it is implicitly incremented whenever a new reference to it is created and implicitly decremented whenever a reference is destroyed. When the reference count reaches zero, the instance of the server object is cleaned up. This is enough for most situations and provides an effective means of distributed garbage collection even in languages such as C++ which don't normally have a garbage collector.

In addition to the implicit rules for reference counting, explicit operations called duplicate and release are provided for adding and removing references. These operations are most useful when manipulating object references through pointers. When a pointer to a reference is copied, the duplicate procedure should be called in order to indicate that a new reference has been created. When a pointer to a reference is destroyed, the release procedure should be called for the opposite reason.
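The counting rule behind duplicate and release can be modeled roughly as follows. This is a minimal sketch, not an ORB API: the class CountedServant and its methods are hypothetical, and a real ORB does this bookkeeping inside its runtime.

```java
// Hypothetical server-side object that is cleaned up when its count reaches zero.
class CountedServant {
    private int refCount = 1;          // the reference handed out at creation
    private boolean destroyed = false;

    // Called when a reference is copied.
    synchronized void duplicate() {
        if (destroyed) throw new IllegalStateException("already destroyed");
        refCount++;
    }

    // Called when a reference is dropped; the object is cleaned up at zero.
    synchronized void release() {
        if (destroyed) throw new IllegalStateException("already destroyed");
        if (--refCount == 0) {
            destroyed = true;
        }
    }

    synchronized int refCount() { return refCount; }
    synchronized boolean isDestroyed() { return destroyed; }
}
```

Every duplicate must eventually be matched by a release. A missed release keeps the object alive indefinitely, and an extra release destroys it while it is still in use.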

CORBA also carefully manages Internet network connections. When a client is disconnected, either due to a client machine crash or due to a complete network failure, any references held by the client machine are immediately released. This mechanism of detecting a client failure behaves correctly even when a temporary network slowdown causes the server to lose touch with the client. As long as the network connection remains active, the references will not be released and the object will not be garbage-collected.
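One way to picture the connection-based cleanup is to count references per client and drop a client's whole share when its connection goes away. The sketch below assumes string client identifiers and an onDisconnect callback; both are invented for illustration and are not part of CORBA.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical bookkeeping for one server object: references grouped by client.
class ConnectionTrackedObject {
    private final Map<String, Integer> refsByClient = new HashMap<>();

    void addRef(String clientId) {
        refsByClient.merge(clientId, 1, Integer::sum);
    }

    void release(String clientId) {
        // Returning null from the remapping function removes the client's entry.
        refsByClient.computeIfPresent(clientId, (id, n) -> n > 1 ? n - 1 : null);
    }

    // The connection to this client was lost: release everything it held.
    void onDisconnect(String clientId) {
        refsByClient.remove(clientId);
    }

    int totalRefs() {
        return refsByClient.values().stream().mapToInt(Integer::intValue).sum();
    }

    boolean isCollectable() {
        return refsByClient.isEmpty();
    }
}
```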

For the truly adventurous, additional mechanisms are provided to sever communication with an object and immediately cause garbage collection. The deactivate_obj call is an example of such a mechanism.

The RMI Approach
RMI uses a fairly straightforward mechanism for garbage collection. Any program which has a reference to an object must obtain a "lease" for the object. The lease, which is literally represented by a Lease object, entitles the program to use the object for a certain period of time, basically the same idea as leasing office equipment.

If the object continues to be used for an extended period of time, the lease must be renewed before it expires. The renewed lease again entitles the holder to use the object for a certain period of time. If the object is no longer in use, the lease is simply allowed to expire or can be explicitly terminated by the holder at any time. When all the leases have expired or terminated, the object can be garbage-collected.

This design easily solves most of the problems faced by a distributed garbage collector. When an object is no longer in use anywhere on the Internet, no leases are renewed so the object will eventually be garbage-collected. If the computer with an RMI program running on it suddenly crashes, the leases it held for any distributed objects will simply expire over time, and those objects can then be garbage-collected.
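The lease rules above reduce to a table of expiry times per holder. The following is a simplified model under that assumption, not RMI's internal implementation; LeasedObject and its methods are hypothetical, and the current time is passed in explicitly so the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of one remote object and the leases held on it.
class LeasedObject {
    // holder id -> time (in ms) at which that holder's lease expires
    private final Map<String, Long> expiryByHolder = new HashMap<>();

    // Grant a new lease or renew an existing one.
    void grant(String holder, long nowMillis, long durationMillis) {
        expiryByHolder.put(holder, nowMillis + durationMillis);
    }

    // The holder gives up its lease before it expires.
    void terminate(String holder) {
        expiryByHolder.remove(holder);
    }

    // The object may be collected once every lease has expired or been terminated.
    boolean isCollectable(long nowMillis) {
        expiryByHolder.values().removeIf(expiry -> expiry <= nowMillis);
        return expiryByHolder.isEmpty();
    }
}
```

A crashed client never calls terminate, but it also never calls grant again, so its entry ages out without any special handling.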

The Lease object is obtained and renewed using the DGC interface (DGC presumably stands for Distributed Garbage Collector) which is provided by RMI. The main operations on the DGC interface, shown in Listing 1, are dirty and clean. Dirty is for obtaining a Lease object and clean is for terminating Lease objects. However, don't worry too much about learning the details. The software developer should never need to use the DGC interface since it is all taken care of by RMI itself.

The dirty method on the normal DGC interface accepts an array of ObjIDs, a sequence number and a Lease. The array of ObjIDs holds the object identification numbers for those objects whose lease requires renewal. The sequenceNum is used for nothing more than to detect late or out-of-order calls: it increases with each request a client sends, so the server can ignore a request that arrives after a newer one. The Lease is just used as a data container to hold a unique identification number for the client making the request and the desired length of the lease. Keep in mind that since the DGC interface is hidden under the covers, RMI itself is choosing the "desired" length of the lease. The software developer has no part in this decision.
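For reference, the classes involved live in the java.rmi.dgc package. The snippet below only builds the Lease container that a dirty call would carry; it does not contact a remote collector, and the ten-minute duration is an arbitrary value chosen for illustration. LeaseDemo and requestFor are invented names.

```java
import java.rmi.dgc.Lease;
import java.rmi.dgc.VMID;

class LeaseDemo {
    // Method signatures of java.rmi.dgc.DGC, shown for reference only:
    //   Lease dirty(ObjID[] ids, long sequenceNum, Lease lease) throws RemoteException;
    //   void clean(ObjID[] ids, long sequenceNum, VMID vmid, boolean strong) throws RemoteException;

    // Build the container passed to dirty: who is asking, and for how long.
    static Lease requestFor(VMID client, long durationMillis) {
        return new Lease(client, durationMillis);
    }

    public static void main(String[] args) {
        VMID me = new VMID();                            // identifies this virtual machine
        Lease request = requestFor(me, 10 * 60 * 1000L); // arbitrary duration
        System.out.println("client " + request.getVMID()
                + " asks for " + request.getValue() + " ms");
    }
}
```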

One thing you should keep in mind: some network overhead is incurred every time an object renews its lease. A remote request must be sent across the network to the object's host. Thus, if you would like to deploy a system with several hundred clients or several thousand distributed objects, this overhead might become quite considerable. Currently, no means are provided for configuring the leasing period for objects and thus the time between requests to renew a lease, so keep this limitation in mind when architecting your system.
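A back-of-the-envelope estimate makes the overhead concrete. The helper below assumes the worst case of one renewal request per lease per interval; the figures in the usage note are made up for illustration and are not RMI defaults.

```java
// Hypothetical helper for estimating lease-renewal traffic.
class RenewalLoad {
    // Renewal requests per second arriving at the server, assuming every
    // reference is renewed once per interval with its own request.
    static double renewalsPerSecond(int clients, int refsPerClient,
                                    double renewIntervalSeconds) {
        return (double) clients * refsPerClient / renewIntervalSeconds;
    }
}
```

For example, 300 clients each holding 20 references and renewing every 300 seconds would generate 6,000 / 300 = 20 renewal requests per second, before any real work is done.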

In addition, any distributed architecture which relies on mechanisms like leases is subject to problems when network failures or even slowdowns occur. For example, if you are in the unfortunate situation of having your Internet connection hang just long enough to cause your leases to expire, all of your distributed objects will suddenly be garbage collected even though you are still using them.

The DCOM Approach
DCOM, which stands for Distributed COM, is Microsoft's foray into the world of distributed computing. Since DCOM is supported by the world's second largest software vendor, it deserves at least a brief mention here even though it's unclear how well it will be supported for use with Java. DCOM is totally unlike any other distributed object technology when it comes to garbage collection. First, DCOM differentiates between interfaces and objects. Each has its own type of garbage collection support.

Garbage collection of interfaces is handled through a manual reference counting mechanism. RemAddRef and RemRelease respectively add and release references to remote objects. Both of these calls are sent across the network to the remote system, incurring some network overhead whenever additional references are made. Under the covers, DCOM tries to reduce this overhead by "multiplexing references". This means that a single reference can actually stand for many references within a single program. In addition, programs may optionally request "private references", which are references associated with a particular client identification. Normally, DCOM allows one client to issue more releases than the number of references it currently owns. Private references are a way of preventing this from occurring.
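The idea of multiplexing references can be modeled as a local count in front of a single remote reference. This is a sketch of the concept only, not DCOM's API: MultiplexedProxy is a hypothetical class, and the remoteCalls counter stands in for the network round trips that RemAddRef and RemRelease would cost.

```java
// Hypothetical client-side proxy: many local references share one remote reference.
class MultiplexedProxy {
    private int localRefs = 0;
    private int remoteCalls = 0;   // stand-in for network round trips

    void addRef() {
        if (localRefs++ == 0) {
            remoteCalls++;         // first local reference: would send RemAddRef
        }
    }

    void release() {
        if (localRefs == 0) throw new IllegalStateException("no references held");
        if (--localRefs == 0) {
            remoteCalls++;         // last local reference: would send RemRelease
        }
    }

    int localRefs() { return localRefs; }
    int remoteCalls() { return remoteCalls; }
}
```

Three local references taken and dropped in turn cost two remote calls in this model instead of six.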

An entirely different mechanism is used for objects. So-called keepalive messages are sent periodically to objects as a way of pinging the objects to let them know they are still needed. These keepalive messages are similar in some ways to RMI Leases and have the same weaknesses. A temporary network failure may result in the garbage collection of objects which are still in use simply because keepalive messages were not received in time.

To add a few more variables to the equation, COM implementations may defer the release of references to an interface for an indefinite period of time. The DCOM specification recommends that the remote release of all interfaces be deferred until all local references to all interfaces on an object are released. It's not clear what sort of logic is required by the user, if any, to match up respective interfaces with their objects in order for garbage collection to work as advertised. To top it off, garbage collection of the interfaces is actually left as optional; some COM implementations may never perform it. Confused? Maybe that's what Microsoft intended.

Merging the Approaches
The RMI distributed garbage collector is rather simple and easy to implement using CORBA. First, the DGC interface is hidden from the developer. No means of directly invoking the DGC interface is available. Thus, I feel justified in redesigning the interface slightly in order to simplify the task of implementing it.

In the dirty method, I would like to just pass an object identification number for the object which is to be leased and the identification number for the client requesting the lease. I'll simply return a number indicating the length of time for which the lease was granted rather than a whole object. This method may not be as type safe as returning an object; however, the interface will only be used by our own stub code. That means type safety is not as important as efficiency. I won't return or send a Lease object or allow the desired length of the lease to be configured since this interface is not exposed for the software developer anyway. The sequenceNum, which RMI uses only to detect late or out-of-order requests, can simply be deleted on the assumption that CORBA itself delivers each client's requests reliably and in order.

The sequenceNum on the clean method can be deleted for the same reasons. I also broke up the arrays of object identification numbers into a single identification number per method invocation. Although renewing leases in groups may prove useful later, manipulating the lease of one object at a time seems like the most natural way of handling a lease. The modified DGC interface is shown in Listing 2.

To implement the DGC interface, I added a class called DGCImpl whose main responsibilities are to keep track of the leases and periodically clean up those objects which no longer have any outstanding leases. This was accomplished by making DGCImpl implement Runnable so that it would have its own thread to periodically check its leases. When an object no longer has any outstanding leases, the CORBA deactivate_obj is called to immediately remove the object and allow it to be garbage-collected. The full implementation of this is too long to reproduce here due to space considerations but is available for download at my Web site, mentioned at the end of this article.
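To give a feel for the design, here is a compact sketch of what such an interface and implementation might look like. It is not the author's code: the names SimpleDGC and SketchDGCImpl, the integer identifiers, the one-second check interval, and the callback standing in for the ORB's deactivate_obj call are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.function.IntConsumer;

// Assumed shape of the simplified interface: one object and one client per call.
interface SimpleDGC {
    long dirty(int objectId, int clientId);  // returns the granted lease length in ms
    void clean(int objectId, int clientId);  // terminates the client's lease
}

class SketchDGCImpl implements SimpleDGC, Runnable {
    private final long leaseMillis;
    private final IntConsumer deactivate;    // stand-in for calling deactivate_obj
    // objectId -> (clientId -> time in ms at which that client's lease expires)
    private final Map<Integer, Map<Integer, Long>> leases = new HashMap<>();

    SketchDGCImpl(long leaseMillis, IntConsumer deactivate) {
        this.leaseMillis = leaseMillis;
        this.deactivate = deactivate;
    }

    @Override
    public synchronized long dirty(int objectId, int clientId) {
        leases.computeIfAbsent(objectId, k -> new HashMap<>())
              .put(clientId, System.currentTimeMillis() + leaseMillis);
        return leaseMillis;
    }

    @Override
    public synchronized void clean(int objectId, int clientId) {
        Map<Integer, Long> holders = leases.get(objectId);
        if (holders != null) {
            holders.remove(clientId);
        }
    }

    // One pass: expire old leases, then deactivate objects with none left.
    synchronized void sweep(long nowMillis) {
        Iterator<Map.Entry<Integer, Map<Integer, Long>>> it = leases.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Integer, Map<Integer, Long>> entry = it.next();
            entry.getValue().values().removeIf(expiry -> expiry <= nowMillis);
            if (entry.getValue().isEmpty()) {
                int objectId = entry.getKey();
                it.remove();
                deactivate.accept(objectId);
            }
        }
    }

    // Body of the checking thread: sweep once per second until interrupted.
    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                sweep(System.currentTimeMillis());
                Thread.sleep(1000);
            }
        } catch (InterruptedException stop) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Starting it is a matter of handing the instance to a thread, for example new Thread(dgc).start(). Keeping sweep separate from run lets the expiry logic be exercised without waiting on a timer.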

Two numbers are passed into the DGC interface, the object identification and client identification number. In RMI, these identification numbers are generated by the ObjID and VMID classes respectively. For implementing the DGC on CORBA, I continue this tradition but simplify it slightly by extracting the integer contained in both objects.

The usage of this DGC interface with CORBA is identical to its usage with RMI. When a new reference is created, a lease for the object should be obtained in order to prevent the object from being garbage collected. By performing this action in the client stubs, this can be entirely hidden from the software developer so they don't need to worry about it.

Summary
Garbage collection has greatly improved the way in which we write software, but garbage collection in distributed applications has many difficult problems to solve. We've looked briefly at how the three main distributed object systems tackle this problem and demonstrated that two of them aren't quite as different as you might expect at first glance.

Where To Go From Here
RMI and Java can be found at http://www.javasoft.com
CORBA standards can be found at http://www.omg.org
Visigenic, the makers of VisiBroker for Java, can be found at http://www.visigenic.com
More information on distributed GC may be found at http://www-sor.inria.fr

More Stories By Jeff Nelson

Jeff Nelson is a distributed systems architect with DiaLogos Incorporated, experts in CORBA and Java Technologies (http://dialogosweb.com) and active participants in the Object Management Group. He has 8 years of experience in distributed computing and object technology. Jeff can be found on the Web at http://www.distributedobjects.com/
