Encoded Streams

Two basic types of data - text and binary - are used in applications to create files such as documents, images, video, text and executables. Certain applications, however, may need to alter a file to make it available to other applications; for example, e-mail requires text and binary data to be encoded before it's sent.

This article discusses a technique used to read and write encoded data using Java I/O streams. We'll define encoding and cover some of its history, examine two I/O stream classes and an interface, then finish by applying this technique to both a text and a binary file using the Base64 encoding scheme. With this technique you can provide encoding in your applications as well as encoded user information for authenticating against HTTP servers. This technique is provided using a standard, familiar group of Java classes: the I/O streams.

What Is Encoding?
Encoding manipulates and reorganizes bytes so they can be understood by other applications (see Figure 1). This is done primarily for Internet e-mail systems, but it's also used elsewhere, such as HTTP basic authentication, which requires the user ID and password to be encoded using Base64. Encoding has been around a while, though you may never have noticed it. For example, your e-mail and attachments could be encoded before being sent and decoded when received. E-mails specify encoded content by using the Content-Transfer-Encoding header. This header field can have the following values:

  • 7Bit
  • Quoted-Printable
  • Base64
  • 8Bit
  • Binary
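
The basic authentication case mentioned above can be sketched in a few lines. This is a minimal illustration, not code from the article: it uses java.util.Base64 (added in Java 8, after this article was written), the class and method names are hypothetical, and the use of UTF-8 for the credentials is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class, not part of the article's listings.
class BasicAuthSketch {
    // Builds the value of an HTTP "Authorization" header for basic authentication:
    // the word "Basic", a space, then Base64("user:password").
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        // Assumption: credentials are converted to bytes as UTF-8.
        byte[] raw = credentials.getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        // Illustrative credentials only.
        System.out.println(basicAuthHeader("Aladdin", "open sesame"));
        // prints "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="
    }
}
```

Note that Base64 here only makes the credentials safe to carry in a header; it does not hide them.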

One side effect of encoding is a possible increase in the size of your data. It all depends on the encoding scheme you're using.

Now that we have some basics, let's look at the EncodedInputStream and EncodedOutputStream classes, which are used to read and write encoded data.

EncodedInputStream
The EncodedInputStream takes encoded data and gives it back as a byte array. You can then convert this data to any form you wish, such as text (see Listing 1). Its constructor takes two arguments: an InputStream and an EncodingScheme. The InputStream source could be a FileInputStream or even a socket.

Base64EncodingScheme scheme = new Base64EncodingScheme();
EncodedInputStream eIn = new EncodedInputStream(new FileInputStream("encoded.txt"), scheme);
byte[] data = eIn.readEncoded();
This class overrides the read method and adds a method called readEncoded, which reads encoded data and returns it decoded as a byte array. The read method has been overridden to always return -1. This was done because the read method returns a single byte, whereas decoding usually needs to work with more than one byte at a time.
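
Since the listings are not reproduced here, the following is only a sketch of how such a class might look, based on the description above. The method signatures of EncodingScheme are assumed, and JdkBase64Scheme is a hypothetical stand-in that delegates to java.util.Base64 rather than the article's own Base64EncodingScheme.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class EncodedInputSketch {
    // Assumed shape of the interface: the article only says it has encode and decode.
    interface EncodingScheme {
        byte[] encode(byte[] raw);
        byte[] decode(byte[] encoded);
    }

    // Hypothetical stand-in scheme backed by the JDK's MIME Base64 codec.
    static class JdkBase64Scheme implements EncodingScheme {
        public byte[] encode(byte[] raw) { return Base64.getMimeEncoder().encode(raw); }
        public byte[] decode(byte[] encoded) { return Base64.getMimeDecoder().decode(encoded); }
    }

    static class EncodedInputStream extends FilterInputStream {
        private final EncodingScheme scheme;

        EncodedInputStream(InputStream in, EncodingScheme scheme) {
            super(in);
            this.scheme = scheme;
        }

        // A single byte cannot be decoded on its own, so always report end of stream.
        @Override
        public int read() {
            return -1;
        }

        // Buffers everything from the underlying stream, then decodes it in one pass.
        public byte[] readEncoded() throws IOException {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return scheme.decode(buffer.toByteArray());
        }
    }

    // Convenience wrapper so the sketch can be exercised without a file.
    static String decodeToText(String encodedText) {
        InputStream source = new ByteArrayInputStream(encodedText.getBytes(StandardCharsets.US_ASCII));
        try (EncodedInputStream eIn = new EncodedInputStream(source, new JdkBase64Scheme())) {
            return new String(eIn.readEncoded(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeToText("SGVsbG8=")); // prints "Hello"
    }
}
```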

EncodedOutputStream
The EncodedOutputStream writes out data using whatever encoding scheme you specify (see Listing 2). Its constructor takes two arguments: an OutputStream and an EncodingScheme. The OutputStream can be almost any kind of stream, such as a FileOutputStream or a socket.

Base64EncodingScheme scheme = new Base64EncodingScheme();
EncodedOutputStream eOut = new EncodedOutputStream(new FileOutputStream("encoded.txt"),scheme);
eOut.write("This is unencoded data".getBytes());
This class will buffer output as it's written to the class, encode the data, then write it out to the actual OutputStream specified in the constructor. Use it as you would any other I/O stream - just write either an integer or a byte array and the data will be encoded using the scheme you passed into the constructor.
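
One way the buffering behavior described above could be implemented is sketched below. It is an assumption-laden reconstruction rather than the article's Listing 2: the EncodingScheme signatures are guessed, JdkBase64Scheme is a hypothetical stand-in backed by java.util.Base64, and the choice to encode only when the stream is closed is a design assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class EncodedOutputSketch {
    // Assumed shape of the interface: the article only says it has encode and decode.
    interface EncodingScheme {
        byte[] encode(byte[] raw);
        byte[] decode(byte[] encoded);
    }

    // Hypothetical stand-in scheme backed by the JDK's MIME Base64 codec.
    static class JdkBase64Scheme implements EncodingScheme {
        public byte[] encode(byte[] raw) { return Base64.getMimeEncoder().encode(raw); }
        public byte[] decode(byte[] encoded) { return Base64.getMimeDecoder().decode(encoded); }
    }

    static class EncodedOutputStream extends FilterOutputStream {
        private final EncodingScheme scheme;
        private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
        private boolean closed;

        EncodedOutputStream(OutputStream out, EncodingScheme scheme) {
            super(out);
            this.scheme = scheme;
        }

        // Both write variants only collect raw bytes; nothing reaches the target yet.
        @Override
        public void write(int b) {
            pending.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            pending.write(b, off, len);
        }

        // Assumption: encode once, on close, so padding can only appear at the very end.
        @Override
        public void close() throws IOException {
            if (closed) {
                return;
            }
            closed = true;
            out.write(scheme.encode(pending.toByteArray()));
            super.close(); // flushes and closes the underlying stream
        }
    }

    // Convenience wrapper so the sketch can be exercised without a file.
    static String encodeText(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (EncodedOutputStream eOut = new EncodedOutputStream(sink, new JdkBase64Scheme())) {
            eOut.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(encodeText("This is unencoded data"));
    }
}
```

Because nothing is written until close, a caller of this sketch has to close the stream (or use try-with-resources) to see any output.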

EncodingScheme
Let's look at the EncodingScheme interface. Its implementations provide different encodings, such as the Base64 scheme used in this article (see Listing 3). Its two methods are encode and decode. The EncodedInputStream and EncodedOutputStream delegate to this interface when reading and writing the data. Rather than tying users of the streams to one encoding, developers can plug in different encoding schemes (Quoted-Printable, 7Bit and Base64) and use familiar methods to read and write data without requiring significant changes to their code.
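
To illustrate the plug-in idea, here is a sketch with two interchangeable schemes behind one interface. The method signatures are assumed (the article names only encode and decode), and both implementations are hypothetical: one delegates to java.util.Base64, the other is a deliberately simple hexadecimal scheme that is not mentioned in the article.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class EncodingSchemeSketch {
    // Assumed signatures for the two methods the article names.
    interface EncodingScheme {
        byte[] encode(byte[] raw);
        byte[] decode(byte[] encoded);
    }

    // Hypothetical Base64 scheme backed by the JDK codec.
    static class JdkBase64Scheme implements EncodingScheme {
        public byte[] encode(byte[] raw) { return Base64.getEncoder().encode(raw); }
        public byte[] decode(byte[] encoded) { return Base64.getDecoder().decode(encoded); }
    }

    // Hypothetical second scheme: each byte becomes two lowercase hex characters.
    static class HexScheme implements EncodingScheme {
        private static final char[] DIGITS = "0123456789abcdef".toCharArray();

        public byte[] encode(byte[] raw) {
            byte[] out = new byte[raw.length * 2];
            for (int i = 0; i < raw.length; i++) {
                out[2 * i] = (byte) DIGITS[(raw[i] >> 4) & 0x0F];
                out[2 * i + 1] = (byte) DIGITS[raw[i] & 0x0F];
            }
            return out;
        }

        public byte[] decode(byte[] encoded) {
            byte[] out = new byte[encoded.length / 2];
            for (int i = 0; i < out.length; i++) {
                int hi = Character.digit(encoded[2 * i], 16);
                int lo = Character.digit(encoded[2 * i + 1], 16);
                out[i] = (byte) ((hi << 4) | lo);
            }
            return out;
        }
    }

    // Calling code depends only on the interface, so schemes can be swapped freely.
    static String encodeToText(EncodingScheme scheme, String text) {
        return new String(scheme.encode(text.getBytes(StandardCharsets.UTF_8)), StandardCharsets.US_ASCII);
    }

    static String roundTrip(EncodingScheme scheme, String text) {
        byte[] encoded = scheme.encode(text.getBytes(StandardCharsets.UTF_8));
        return new String(scheme.decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(encodeToText(new JdkBase64Scheme(), "Hi")); // prints "SGk="
        System.out.println(encodeToText(new HexScheme(), "Hi"));       // prints "4869"
    }
}
```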

Base64 Encoding Scheme
Before moving to our sample application, we need to implement an encoding scheme; I'll show the Base64 encoding scheme. This scheme reorganizes three 8-bit chunks (24 bits) into four 6-bit chunks (see Figure 2). Each 6-bit value is then represented by one character from a 64-character subset of NVT ASCII. The "=" sign is used to pad the output when the input length isn't a multiple of 3 bytes. You must also break the encoded data into lines of no more than 76 characters each. A more formal explanation is available in RFC 2045. As noted previously, encoding increases the size of your data. Base64 increases the size by approximately one-third, because every 3 input bytes become 4 output characters.
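
The regrouping of three bytes into four 6-bit values can be shown on a single group. The sketch below uses hypothetical names and the alphabet defined in RFC 2045; it encodes the three bytes of "Man" and also computes the encoded length (ignoring line breaks) to make the one-third growth concrete.

```java
class Base64GroupSketch {
    // The 64-character Base64 alphabet from RFC 2045.
    static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Packs three bytes into 24 bits, then slices the 24 bits into four 6-bit indexes.
    static String encodeGroup(int b0, int b1, int b2) {
        int bits = ((b0 & 0xFF) << 16) | ((b1 & 0xFF) << 8) | (b2 & 0xFF);
        char[] out = new char[4];
        out[0] = ALPHABET.charAt((bits >> 18) & 0x3F);
        out[1] = ALPHABET.charAt((bits >> 12) & 0x3F);
        out[2] = ALPHABET.charAt((bits >> 6) & 0x3F);
        out[3] = ALPHABET.charAt(bits & 0x3F);
        return new String(out);
    }

    // Encoded length without line breaks: 4 characters per started group of 3 bytes.
    static int encodedLength(int rawLength) {
        return 4 * ((rawLength + 2) / 3);
    }

    public static void main(String[] args) {
        // 'M' 'a' 'n' = 01001101 01100001 01101110
        //             = 010011 010110 000101 101110 = 19, 22, 5, 46
        System.out.println(encodeGroup('M', 'a', 'n')); // prints "TWFu"
        System.out.println(encodedLength(300));         // prints "400"
    }
}
```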

The basic flow of the encode method is to work with 3-byte chunks at all times. After each iteration of the loop, 4 bytes are written out to the buffer. When you reach the end of your data and fewer than 3 bytes remain, the last group is padded with the "=" character, and the encoded byte array is returned. The decode method operates almost the same way, except that it works with 4-byte chunks instead of 3 and ignores the padding character (see Listing 4).
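
The flow just described might look like the following. This is a sketch written from the description, not the article's Listing 4; the class and helper names are hypothetical, and the decoder is lenient by assumption (it skips padding, line breaks and any other byte outside the alphabet).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class Base64SchemeSketch {
    private static final byte[] ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
                    .getBytes(StandardCharsets.US_ASCII);
    private static final int LINE_LENGTH = 76; // maximum encoded line length per RFC 2045

    static byte[] encode(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int lineChars = 0;
        for (int i = 0; i < raw.length; i += 3) {
            int remaining = Math.min(3, raw.length - i);
            // Pack up to three bytes into 24 bits; missing bytes count as zero.
            int bits = (raw[i] & 0xFF) << 16;
            if (remaining > 1) bits |= (raw[i + 1] & 0xFF) << 8;
            if (remaining > 2) bits |= raw[i + 2] & 0xFF;
            // Start a new line before a group that would exceed 76 characters.
            if (lineChars == LINE_LENGTH) {
                out.write('\r');
                out.write('\n');
                lineChars = 0;
            }
            // Emit four characters, using '=' where the input ran out.
            out.write(ALPHABET[(bits >> 18) & 0x3F]);
            out.write(ALPHABET[(bits >> 12) & 0x3F]);
            out.write(remaining > 1 ? ALPHABET[(bits >> 6) & 0x3F] : '=');
            out.write(remaining > 2 ? ALPHABET[bits & 0x3F] : '=');
            lineChars += 4;
        }
        return out.toByteArray();
    }

    static byte[] decode(byte[] encoded) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int bits = 0;
        int count = 0; // 6-bit values collected for the current group
        for (byte b : encoded) {
            int value = indexOf(b);
            if (value < 0) continue; // skips '=', CR, LF and other non-alphabet bytes
            bits = (bits << 6) | value;
            if (++count == 4) {
                out.write(bits >> 16); // write(int) keeps only the low 8 bits
                out.write(bits >> 8);
                out.write(bits);
                bits = 0;
                count = 0;
            }
        }
        if (count == 3) {        // 18 bits left: two complete bytes
            out.write(bits >> 10);
            out.write(bits >> 2);
        } else if (count == 2) { // 12 bits left: one complete byte
            out.write(bits >> 4);
        }
        return out.toByteArray();
    }

    private static int indexOf(byte b) {
        for (int i = 0; i < ALPHABET.length; i++) {
            if (ALPHABET[i] == b) return i;
        }
        return -1;
    }

    static String encodeToString(String text) {
        return new String(encode(text.getBytes(StandardCharsets.UTF_8)), StandardCharsets.US_ASCII);
    }

    static String decodeToString(String encoded) {
        return new String(decode(encoded.getBytes(StandardCharsets.US_ASCII)), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(encodeToString("Man"));  // prints "TWFu"
        System.out.println(encodeToString("Ma"));   // prints "TWE="
        System.out.println(encodeToString("M"));    // prints "TQ=="
        System.out.println(decodeToString("TWE=")); // prints "Ma"
    }
}
```

A linear search in indexOf keeps the sketch short; a 256-entry lookup table would be the usual choice for real workloads.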

Sample Application
Let's put our encoding scheme to use. Our first example encodes a Java source file, then decodes it (see Listing 5). Compile EncodingSample and then run it, specifying HelloWorld.java as the argument (see Listing 6). Once it's finished running, look at the contents of the encoded.txt file to see what the file looks like in its encoded state.

Now take the HelloWorld Java class file, encode it and then decode it. If you haven't already done so, compile the HelloWorld.java file and then run EncodingSample, specifying HelloWorld.class as the argument. Then look at the encoded.txt file to see what the file looks like encoded. To prove the file was successfully decoded, type "java HelloWorld" - you should see "HelloWorld" printed out.
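
The encode-then-decode round trip can also be checked programmatically. The sketch below is not the article's EncodingSample: it is a hypothetical stand-in that uses java.util.Base64 and a temporary file in place of encoded.txt, and compares the decoded bytes with the originals.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Base64;

class RoundTripSketch {
    // Encodes the bytes to a temporary file, reads the file back and decodes it.
    static byte[] roundTrip(byte[] original) {
        try {
            Path encodedFile = Files.createTempFile("encoded", ".txt");
            try {
                Files.write(encodedFile, Base64.getMimeEncoder().encode(original));
                byte[] encoded = Files.readAllBytes(encodedFile);
                return Base64.getMimeDecoder().decode(encoded);
            } finally {
                Files.deleteIfExists(encodedFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Binary sample: the four magic bytes that open every Java class file.
        byte[] classFileMagic = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        System.out.println(Arrays.equals(classFileMagic, roundTrip(classFileMagic))); // prints "true"
    }
}
```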

Enhancements
While EncodedInputStream and EncodedOutputStream allow you to easily read and write encoded data, some enhancements can be made. Buffering large datasets makes it easy to decode all at once but may cause intermittent OutOfMemoryErrors. Alternatively, data can be encoded and decoded in chunks rather than all at once. Due to time constraints I was unable to implement this feature.
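
As a rough idea of what chunked encoding could look like, the sketch below encodes a stream 57 bytes at a time. The block size is chosen because it is a multiple of 3, so padding can only appear in the final block, and because 57 raw bytes produce exactly one 76-character line. The names are hypothetical and the per-block encoding is delegated to java.util.Base64; this is one possible approach, not the author's intended design.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Base64;

class ChunkedEncodeSketch {
    private static final int CHUNK = 57; // 57 raw bytes -> 76 encoded characters
    private static final byte[] CRLF = {'\r', '\n'};

    // Encodes the input block by block, so only one small block is held in memory.
    static void encode(InputStream in, OutputStream out) throws IOException {
        Base64.Encoder encoder = Base64.getEncoder();
        byte[] chunk = new byte[CHUNK];
        boolean first = true;
        int n;
        while ((n = fill(in, chunk)) > 0) {
            if (!first) {
                out.write(CRLF); // separate lines, with no trailing line break
            }
            out.write(encoder.encode(Arrays.copyOf(chunk, n)));
            first = false;
        }
    }

    // Reads until the buffer is full or the stream ends, so only the last block can be short.
    private static int fill(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) break;
            total += n;
        }
        return total;
    }

    // Convenience wrapper for in-memory data.
    static byte[] encodeAll(byte[] data) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            encode(new ByteArrayInputStream(data), sink);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = new byte[200];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        // The chunked result matches encoding the whole array at once.
        System.out.println(Arrays.equals(encodeAll(data), Base64.getMimeEncoder().encode(data))); // prints "true"
    }
}
```

On Java 8 and later, Base64.Encoder.wrap(OutputStream) offers a ready-made streaming encoder along the same lines.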

Summary
It's easy to provide an extensible means to read and write encoded data using ordinary Java I/O streams. You can also provide your own EncodingScheme implementations and plug them into your code without changes. For all you sun.misc.BASE64Encoder users, you now have a documented way to use Base64 encoding. Good Luck!

More Stories By Mike Jasnowski

Mike Jasnowski is a senior software engineer on the BEA WebLogic Server Administration Console team. He has been involved in development for almost 20 years and in many industries. Mike is a contributing author to several books and author of JMX Programming (Wiley).
