Zip Objects, Zap Wait Time

As the capabilities of our distributed applications increased, so did our consumption of bandwidth. In 1998, our server sent objects no larger than 50K to a group of users on a local network. By 2002, we were passing an average of 500K per object, with some as large as 1.5MB.

More important, our user base grew from 50 to over 1,500, with some users located across the country from the server. Add in a group of users roaming on modem connections, and the full scale of our bandwidth issues becomes clear. We were presented with a problem faced by many developers of distributed systems: reduce bandwidth usage and client wait time without removing any functionality. This article shares our solution to this problem, providing you with the simple code that helped us eliminate over 80% of our network traffic.

Evaluating bandwidth is quite simple. The developer has two options: get more of it or use less of it. Given the magnitude and expense of expanding bandwidth on a nationally distributed application, it was clear we had to find ways to reduce the amount of bandwidth required by our systems. It's important to note the wording: reduce the bandwidth usage, not the amount of data passed over the network. To preserve the functionality of the systems, we needed all the data being passed over the line. In the end, there was one conclusion: the data needed to be compressed.

As I researched compression in Java, I was looking for a way to pass in an object and receive a compressed object back. I found that there are a number of ways to compress sockets or build zip files on the disk, but not the object-level solution I was seeking. We needed an API that could be selectively implemented and used for the largest data objects and most critical applications without impacting other parts of the system. We also wanted the ability to compress an object one time, and use that same object for multiple downloads to client machines, essentially caching a compressed object.

During this research, I found an article on compression on the Java developer's Web site that laid out all the pieces to our solution (see the Resources section). Using just a few of the classes in the java.io and java.util.zip packages, we were able to build an API to compress any serializable Java object. Being the kind of developer who prefers simplicity, I was excited at the ease of use and performance of the underlying Java classes as well as the API we built. We were able to develop and integrate our solution in just under two days, resulting in more than an 80% reduction in network traffic and astounding improvements in client wait times.

A Compression Factory for Serialized Objects
The Java compression functions are located in the java.util.zip package, where the Deflater class compresses byte arrays and the Inflater class decompresses byte arrays. As you may have noted, both of these classes perform compression routines on byte arrays. Therefore, to compress an object, the first step is to translate it into a representation of bytes, which begins with the Serializable interface.

When an object implements the Serializable interface, it can be represented as a stream of bytes. This byte stream can be written using the ObjectOutputStream.writeObject() method and reconstituted using the ObjectInputStream.readObject() method, allowing for a simple translation of a byte stream to and from an object. This ability to serialize an object, capturing the resulting byte stream into a byte array, provides a usable input for the compression methods available in the java.util.zip classes.

Using this approach, we will accept a serializable object, write the object into a byte array, and then compress the array. The array of compressed bytes, along with a few other key variables, will be stored in a new object, cZipObject, which is shown in its entirety in Listing 1. The cZipObject will encapsulate the compressed version of the input object. The cZipObject can then be serialized to transfer across the network. On the receiving end, the byte array will be extracted from the cZipObject, decompressed, input to a byte stream, and then reconstituted into an object. This process is not truly compressing the object, but compressing the serialized representation of the object and its data.

To easily integrate these compression routines on both the server and client side, we'll create a cZipFactory class that will contain all the methods for compressing and decompressing objects. We'll create a number of methods along the way that can be of direct use, such as a byte compression method. By encapsulating both the compress and decompress functions into a single class, we can add the functionality to both the client and server by creating a single object. This will allow us to compress objects sent from the server to the client as well as from the client back up to the server.

The first step is to convert the Serializable object into a byte array. This can be achieved by using the ObjectOutputStream with an underlying ByteArrayOutputStream from the java.io package. First, we'll create a new ByteArrayOutputStream that will capture the byte stream when the object is written. We'll then create a new ObjectOutputStream, write the serializable object, and then extract a byte array from the ByteArrayOutputStream.

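byte[] DataArray = null;
try {
    ByteArrayOutputStream byteOut = new ByteArrayOutputStream();
    ObjectOutputStream objOut = new ObjectOutputStream(byteOut);
    objOut.writeObject(inObject); // inObject is the Serializable object to compress
    objOut.close();               // flush any buffered bytes into byteOut
    DataArray = byteOut.toByteArray();
} catch (Exception e) {
    // handle or report the serialization failure here
}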

With this code, we now have the ability to translate any object that implements the Serializable interface into a byte array capable of compression. The resulting byte array contains the details of the object as well as the object's data. The array contains the essential structural and data attributes to replicate the object and all its content. The next step is to compress the data contained in the byte array, thereby compressing the serialized representation of the object.

There are a few simple steps to compressing byte arrays using the Deflater class from the java.util.zip package. First, we'll create a new array for the compressed bytes. Without a method to accurately predict or estimate the size of the byte array resulting from compression, it's advisable to create an array of equal size to the noncompressed bytes and then shrink the array once the compression is complete and the true size can be determined.

The next step is to create a new instance of the Deflater class, passing in the desired compression level in the constructor. There are a few options for compression level, each with benefits and drawbacks. The best compression option provides the greatest reduction in byte size at the expense of increased processing time. The best speed option provides a good compression level, usually 80% or better, in the shortest possible time. I usually opt for best compression, finding the extra milliseconds in processing time worth the decreased object size. For more information on the available compression levels, refer to the JavaDocs for java.util.zip.Deflater.
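To see the trade-off on your own data, a small comparison along the following lines may help. It is only a sketch: the class name LevelComparison and the sample payload are invented for illustration, and only Deflater and its level constants come from java.util.zip. No particular sizes are promised, since they depend on the data.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical helper for illustration; not one of the article's listings.
class LevelComparison {

    // Deflates the input at the given level and returns the compressed size in bytes.
    static int compressedSize(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] chunk = new byte[1024];
            int total = 0;
            while (!deflater.finished()) {
                // The compressed bytes are discarded; only the count matters here.
                total += deflater.deflate(chunk);
            }
            return total;
        } finally {
            deflater.end(); // release the native zlib resources
        }
    }

    public static void main(String[] args) {
        // Made-up sample payload resembling a list of client names.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("Client # ").append(i).append('\n');
        }
        byte[] sample = sb.toString().getBytes(StandardCharsets.UTF_8);

        System.out.println("original bytes:   " + sample.length);
        System.out.println("BEST_SPEED:       " + compressedSize(sample, Deflater.BEST_SPEED));
        System.out.println("BEST_COMPRESSION: " + compressedSize(sample, Deflater.BEST_COMPRESSION));
    }
}
```

Timing each call with System.nanoTime() around compressedSize would show the processing cost side of the same trade-off.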

Once the Deflater object has been created, call the setInput(byte[]) method providing the byte array we extracted from the object serialization. Invoke the finish() method to inform the Deflater class that all inputs have been defined. Next, call the deflate(byte[]) method, providing the byte array to house the compressed data. When this method completes its execution, the data has been compressed and populated in the output byte array. The getTotalOut() method in the Deflater class will return the total number of bytes that were written in the output byte array. Using the new array size, we'll create a byte array to the exact size of the compressed output. We'll then use the System.arraycopy function to copy the bytes from the temporary array into the exact size array.
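The steps above might be sketched as follows. This is a reconstruction under stated assumptions, not the article's Listing 2: the class name CompressSketch and the level parameter are hypothetical. It also departs from the description in two small ways. It calls deflate in a loop and grows the temporary array if needed, because very small or incompressible input can compress to more bytes than it started with, and it keeps a running count of bytes written rather than calling getTotalOut().

```java
import java.util.Arrays;
import java.util.zip.Deflater;

// Illustrative reconstruction; the article's own CompressBytes may differ.
class CompressSketch {

    static byte[] compressBytes(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish(); // tell the Deflater that no more input will follow

            // Temporary array sized like the input, with a small floor for tiny inputs.
            byte[] temp = new byte[Math.max(input.length, 64)];
            int written = 0;
            while (!deflater.finished()) {
                if (written == temp.length) {
                    // Output did not fit (assumed rare): double the temporary array.
                    temp = Arrays.copyOf(temp, temp.length * 2);
                }
                written += deflater.deflate(temp, written, temp.length - written);
            }

            // Copy into an array of exactly the compressed size.
            byte[] exact = new byte[written];
            System.arraycopy(temp, 0, exact, 0, written);
            return exact;
        } finally {
            deflater.end(); // release the native zlib resources
        }
    }
}
```

A call such as CompressSketch.compressBytes(DataArray, Deflater.BEST_COMPRESSION) would then play the role of the single-method invocation described in the text.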

For ease of use, we'll encapsulate these steps into a single method named CompressBytes in the cZipFactory object (see Listing 2). Now, when we need to compress a byte array, we can invoke a single method:

byte[] bytesCompress = ZipFactory.CompressBytes(DataArray);

There are two key pieces of data required to quickly and accurately decompress the object: the byte array containing the compressed data and the original size of the serialized byte array. When the byte array is decompressed, it will be written into another byte array. Knowing the size of the decompressed array will not only make the decompression more efficient, it will also ensure accuracy. To save the byte array and original size easily, we will encapsulate them in a new instance of the cZipObject class.

cZipObject cZipObj = new cZipObject();
cZipObj.setData(bytesCompress, iOrigSize);
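Listing 1 is not reproduced here, so the following is only a guess at what such a holder class might look like. The name ZipObjectSketch and both getters are assumptions; the only member the text confirms is a setData method taking the compressed bytes and the original size.

```java
import java.io.Serializable;

// Hypothetical stand-in for the article's cZipObject; only setData(byte[], int)
// is taken from the text, everything else is assumed for illustration.
class ZipObjectSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private byte[] data;   // compressed bytes of the serialized object
    private int origSize;  // length of the serialized bytes before compression

    public void setData(byte[] compressed, int originalSize) {
        this.data = compressed;
        this.origSize = originalSize;
    }

    public byte[] getData() {
        return data;
    }

    public int getOrigSize() {
        return origSize;
    }
}
```

Because the holder is itself Serializable, it can be written to the network like any other object, which is what makes caching one compressed instance for many downloads possible.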

By combining all these steps, we can now create a method that accepts any Serializable object and returns a cZipObject. This is the Compress method in the cZipFactory class, shown in its entirety in Listing 3. Using the new method in cZipFactory greatly simplifies the integration of object compression functions. First, we create an instance of the cZipFactory class, providing the desired compression level during object creation.

cZipFactory ZipFactory = new

Using the new cZipFactory class, we can compress a serializable object using a single line of code:

cZipObject newZObject = ZipFactory.Compress(inObject);

When the client or receiving machine obtains the cZipObject, it needs to be decompressed and reconstituted into an object. To achieve this, we'll create another method in cZipFactory to handle the Decompress operation. This method will extract the byte array from the provided cZipObject, decompress the array, and then translate the bytes into an object. The Decompress method in cZipFactory will return a Serializable object, which can be cast into the original type of object.

Using the java.util.zip.Inflater class, we can easily decompress the byte array in a few lines of code. Given the compressed byte array and the original size of the byte array, the Inflater class can be used to decompress the byte array. As this function could be useful in a variety of situations, we'll create a method in the cZipFactory class named DecompressBytes. The method will accept a byte array containing the compressed bytes and a primitive integer for the size of the decompressed array. At this point, it's very important that we know the original size of the byte array (see Listing 4). Without this information, it wouldn't be possible to accurately predict the total size of the decompressed bytes without extracting the data in a loop. Knowing the original size of the byte array makes the decompression code easier and more efficient.
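A decompression method along these lines might look like the sketch below. It is not the article's Listing 4: the class name DecompressSketch is hypothetical, and wrapping the checked DataFormatException in an unchecked exception is a choice made here to keep the example short.

```java
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

// Illustrative reconstruction; the article's own DecompressBytes may differ.
class DecompressSketch {

    static byte[] decompressBytes(byte[] compressed, int originalSize) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);

            // Knowing the original size lets us allocate the result array once.
            byte[] result = new byte[originalSize];
            int read = 0;
            while (read < originalSize && !inflater.finished()) {
                int n = inflater.inflate(result, read, originalSize - read);
                if (n == 0) {
                    break; // no progress: input is truncated or needs a dictionary
                }
                read += n;
            }
            if (read != originalSize) {
                throw new IllegalArgumentException(
                        "expected " + originalSize + " bytes but inflated " + read);
            }
            return result;
        } catch (DataFormatException e) {
            // Checked exception wrapped for brevity in this sketch.
            throw new IllegalArgumentException("input is not valid compressed data", e);
        } finally {
            inflater.end(); // release the native zlib resources
        }
    }
}
```

The size check at the end is what the text means by accuracy: if the stored original size and the inflated byte count disagree, something was corrupted in transit.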

With the ability to decompress a byte array in place, we then move to the process of converting the bytes back into a usable object using an instance of ObjectInputStream. First, we'll create a ByteArrayInputStream using the decompressed byte array. Using the byte stream, we'll construct a new ObjectInputStream to reconstitute the object. By invoking the readObject method, the ObjectInputStream will translate the byte stream into a usable object. To simplify our coding, we'll place this code in a method named ConvertByteToObject in the cZipFactory class (see Listing 5).
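The conversion back to an object might be sketched as follows. This is not the article's Listing 5: the class name ObjectBytesSketch and both method names are hypothetical, and the toBytes counterpart is included only so the example can round-trip on its own. Checked exceptions are wrapped in an unchecked one for brevity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative reconstruction; the article's ConvertByteToObject may differ.
class ObjectBytesSketch {

    // Serializes an object into a byte array (the reverse of fromBytes).
    static byte[] toBytes(Serializable obj) {
        ByteArrayOutputStream byteOut = new ByteArrayOutputStream();
        try (ObjectOutputStream objOut = new ObjectOutputStream(byteOut)) {
            objOut.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException("could not serialize object", e);
        }
        return byteOut.toByteArray();
    }

    // Rebuilds an object from its serialized (already decompressed) bytes.
    static Serializable fromBytes(byte[] bytes) {
        try (ObjectInputStream objIn =
                     new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Serializable) objIn.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not rebuild object", e);
        }
    }
}
```

The class of the original object must be available on the receiving side's classpath; otherwise readObject fails with ClassNotFoundException, which this sketch surfaces as an IllegalStateException.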

The final step is to create a Decompress method in the cZipFactory class that will accept a cZipObject and return a Serializable object. The completed Decompress method is shown in the cZipFactory class in Listing 3.

Using the cZipFactory class, we can now decompress a cZipObject using a single line of code:

Serializable retObject = ZipFactory.Decompress(newZObject);

The Serializable object can then be cast to its original type, either in a separate statement or in the same line of code as the call to Decompress:

Vector vClientList = (Vector)ZipFactory.Decompress(newZObject);

In the end, the cZipFactory provides easy-to-use methods that translate serializable objects to and from compressed representations of objects. The entire compression API can be quickly implemented in just a few lines of code. Another important feature is the ability to use the function selectively rather than making a system-wide change, such as compressing a socket. The resulting cZipObject can be extended or expanded to meet the requirements of an application or can be treated like any other Java object. This also allows for the reuse of a cZipObject, allowing the developer to cache a compressed object and avoid compressing the same data repeatedly.

A Simple Client List Example
Now that we've built the classes to compress Serializable objects, we'll work through an example using the new objects. To begin, let's create a vector of client names. For our example, we'll create a vector with generic content, but you could imagine this list of clients being derived from a database call, an XML document, or some other data source.

Vector vClients = new Vector(1000);
for (int i = 0; i < 1000; i++)
vClients.add("Client # " + i);

The resulting vector, vClients, contains 1,000 entries and is 14,046 bytes when serialized. If the client machine connects using a 28.8 modem, it will retrieve this vector at approximately 3.33 KB/s. At this throughput rate, it'll take the client machine approximately 4,200 milliseconds to download this list of 1,000 clients. To add compression, we'd add this line of code on the server:

//Using a pre-existing cZipFactory class instance
cZipObject zoClients = ZipFactory.Compress(vClients);

On the client machine, we add this line of code to decompress the cZipObject:

//Using a pre-existing cZipFactory class instance
Vector vClients = (Vector)ZipFactory.Decompress(zoClients);

Using this example, the Compress method executes in approximately 40 milliseconds. We would then transmit the zoClients object to the client machine, which when serialized is 2,296 bytes. At 28.8 modem speed, the cZipObject instance is downloaded to the client in approximately 690 milliseconds. The client then decompresses the cZipObject, casting the contents into a vector. The decompression operation on the client takes an additional 30 milliseconds. The total time using compression was 40 + 690 + 30 = 760 milliseconds. When compared to the original download time of 4,200 milliseconds, the compression technique saved 3,440 milliseconds of client wait time and reduced the total object size by 11,750 bytes, resulting in 83.6% less bandwidth consumption. This is more than five times faster and is achieved with a few simple lines of code on the server and client.

Listing 6 provides a simple testing class that was used for this example and the benchmarks quoted in this article. By using this simple testing class, you can see that when applied to larger data structures, the compression functions make a more profound impact on bandwidth reduction and client wait times.

Expense of Compression
There are two primary expenses to this compression technique: increased memory usage and CPU cycles. This approach compresses the serialized representation of an object, which requires that the object be serialized into an array that's then compressed and included in another serializable object. In addition to the increase in memory usage, there will be an increase in CPU utilization. The compression routines consist of arithmetic operations, which will result in increased CPU usage during deflation and inflation processing. For larger installations of these compression routines, it would be reasonable to expect notable increases in server CPU usage, which would need to be analyzed in terms of frequency and the size of the objects being compressed. As a benchmark, in one installation the server processed approximately 10,000 compressions an hour on objects ranging from 10K to 350K. The addition of compression functions resulted in approximately a 3% increase in CPU usage.

Another important factor to remember is that the client machines will also have increases in memory usage and CPU utilization to decompress the objects, or compress objects being sent to the server. The speed of these decompression routines will depend on the client machine hardware.

If you are writing distributed Java applications, whether they're EJB systems or custom RMI solutions, the introduction of compression routines can provide tremendous improvements to the response time and bandwidth consumption of your programs. One of the primary advantages to the approach presented here is its simplicity, allowing the developer to continually work with objects and avoid the compression functions. Using the cZipFactory also allows the developer to avoid socket-level operations or the creation of disk files, retaining the structure of existing programs and making it possible to selectively implement the functions. Another benefit of the cZipFactory is the use of standard Java libraries, making the compression function available in both J2SE and J2EE applications.

For our applications, the performance of the compression routines has been excellent, with minimal server impact and network usage down by 85%. Today, of the approximately 3,000 client machines using the compression classes, there have been no reports of problems with CPU utilization or memory usage. Overall, the introduction of compression was the single largest performance improvement made in our five-year development effort.


Resources

  • "Compressing and Decompressing Data Using Java APIs": http://developer.java.sun.com/developer/technicalArticles/Programming/compression/
  • Object Serialization in Java: http://java.sun.com/j2se/1.4.2/docs/guide/serialization/
  • Java Documentation for java.util.zip package: http://java.sun.com/j2se/1.4.2/docs/api/java/util/zip/package-summary.html
  • Java Documentation for java.io package: http://java.sun.com/j2se/1.4.2/docs/api/java/io/package-summary.html

    Calculating the Benefits of Compression

    There are a number of benefits to using serialized object compression, most notably the reduction in the size of the serialized output. The performance gain is directly related to the average object size, the bandwidth of the client connections, and the CPU processing power of the server and client machines. When determining whether to implement a compression function, these factors should be projected in order to ensure a positive gain. Consider this simple equation to determine if compression routines would be beneficial:

    [(Object Size bytes) × 8] ÷ [Line Speed kbs] = Avg. Download Time (ms)

    [10000 × 8] ÷ 128 = 625 ms

    Now, reduce the average object size by 80% and recalculate the download time; this time add an additional 100 milliseconds for processing time.

    [(Object Size bytes × 8 × 0.2)] ÷ [Line Speed kbs] + 100 = Compress Download Time (ms)

    {[(10000 × 8) × 0.2] ÷ 128 } + 100 = 225 ms

    In the chart in Figure 1 we see how the slight increase in processing time required for compression can create tremendous gains in download time.

    Regardless of the bandwidth from the server to the client, compression routines will have a definitive impact on network usage (see Figure 2).

    It's important to remember that at some point the law of diminishing returns becomes prevalent. For example, if the average size of the object before compression is 5,000 bytes, then compression could reduce this to as little as 1,000 bytes. The total expense of this compression would be about 100 milliseconds. If the client machines were on 28.8 modems, the compression would have a positive impact, reducing client wait time by about 1,100 milliseconds. However, if the client machines were on 512K connections, downloading the original 5,000 bytes would only take about 90 milliseconds. Even though the 1,000 bytes would take 17 milliseconds, we have now added additional processing time for the compress and decompress operations, potentially creating a negative return, and not significantly impacting download time.

    The chart in Figure 3 helps to illustrate how the benefits of compression on client wait time can be quickly reduced in higher bandwidth environments. It's important to note that while client wait time may not be significantly reduced by compression, network traffic will always be reduced. Even though the end user may not notice improvements, the network will always benefit from the reduction in throughput.
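The two sidebar equations can be turned into a small calculator for trying other object sizes and line speeds. This is only a sketch: the class and method names are invented for illustration, and the 80% reduction and 100 ms processing cost are the sidebar's example figures, passed in as parameters rather than measured.

```java
// Hypothetical calculator for the sidebar's two equations.
class CompressionEstimate {

    // (bytes * 8 bits) / (kilobits per second) yields milliseconds.
    static double downloadMillis(double objectBytes, double lineSpeedKbps) {
        return objectBytes * 8.0 / lineSpeedKbps;
    }

    // remainingFraction is the share of the size left after compression
    // (0.2 for an 80% reduction); processingMillis covers compress + decompress.
    static double compressedMillis(double objectBytes, double lineSpeedKbps,
                                   double remainingFraction, double processingMillis) {
        return downloadMillis(objectBytes * remainingFraction, lineSpeedKbps)
                + processingMillis;
    }

    public static void main(String[] args) {
        double plain = downloadMillis(10000, 128);
        double packed = compressedMillis(10000, 128, 0.2, 100);
        System.out.println(String.format("uncompressed: %.0f ms", plain));
        System.out.println(String.format("compressed:   %.0f ms", packed));
        System.out.println("compression pays off: " + (packed < plain));
    }
}
```

Compression is worthwhile on wait time alone only when the compressed estimate comes out lower than the uncompressed one; at high line speeds the fixed processing cost dominates, which is the diminishing-returns effect described above.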

  • More Stories By Robert Beckett

    Robert Beckett is the Chief Architect for The Software Development Cooperative. Robert is currently working on two Java products: an extensive API set for Java developers www.thesdc.com\basesys\ and a high-performance Java RMI server www.thesdc.com\symtier\.


    Most Recent Comments
    Robert Beckett 11/08/03 04:02:05 PM EST

    I would like to thank all of the developers for their supportive, positive, and constructive feedback. A number of developers have been using the zip classes from the article and are having great success; however, a few errors have been flushed out, namely with trying to zip small or empty objects. There are also some of us working on enhancing the cZipFactory to use streams, such as those from the java.util.zip package. We have created a web page with these corrections and ongoing updates. If anyone is having problems or has update suggestions, please check out the page at http://www.thesdc.com/basesys/zipzap_updates.html or e-mail me at [email protected]. Again, thanks for the kind words and great coding contributions!

    10/24/03 03:34:05 PM EDT

    I've been disappointed of late with the JDJ content (less technical, more fluff). However this article brings back memories of past technical, how-to JDJ articles that were the norm rather than the exception. Kudos to Robert Beckett for his excellent article and utility. Let's hope this article starts a trend at JDJ!
