Java & Stream Ciphers


In the 1990s, I worked extensively with the Winsock 2 interface and encryption when it first came out from Microsoft in Beta form; it was exciting in those days of networking because it allowed you to easily encrypt data through the networks.

When Java sockets came out, the encryption could be easily managed through a stream of data. After getting data in a socket stream and encrypting the stream as it passed through the Internet, I was hooked on Java. While C++ was prevalent, it didn't seem to have streaming algorithms ingrained in the language as well as Java did. Java created a secure programming language that separated itself from the operating systems and network internals, while the streams created a layer to process a continuous wave of data that could be encrypted and further the evolution of programming techniques.

In this article I discuss streams from the cipher perspective and provide an example of how to design and build a stream algorithm so you can practice proper techniques rather than rely on the technology to do it for you (an Ant script to deploy is included with the source code, which can be downloaded from www.sys-con.com/java/sourcec.cfm). The basic terminology to remember throughout the article is that a cipher is an encryption/decryption algorithm, and a stream is data processed a piece (either bit or byte) at a time. Knowing only those terms, you can build on the rest.

Theory
If you understand a lot about pattern matching and encryption techniques, and are familiar with Applied Cryptography by Bruce Schneier, this section may seem too simple. For the rest of you, I'd like to start with the basics of streams and encryption (I've also written a book, Java Security Solutions, that provides good information on the topic).

As I mentioned earlier, a stream is data that's processed one bit or, more likely, one byte at a time. It's worth noting that most algorithms will only work on a byte as a whole, not at a bit level. A stream cipher is both a decryption and encryption algorithm for streams. Encryption is used to change readable text - plaintext - into a nonreadable form - ciphertext. Decryption does the reverse. While most ciphers follow various block cipher modes based on the original DES, a true stream cipher (like RC4) comes in handy with unknown block sizes.

For those who are unfamiliar with ciphers and keys, a cipher is the engine that decrypts and encrypts data. The key is the extra data needed for the engine to complete the task. In the early days of cryptography, only a handful of people knew the algorithms and only they could encrypt and decrypt data. As time progressed, most algorithms became published specifications that anyone could access, and the key became the missing piece to ensure that not everyone could encrypt and decrypt the data. The key is a very important element that must be protected at all costs, especially if the key is symmetric. A symmetric or secret key is one in which the same key is used for encryption and decryption. Anyone who has access to the key can easily decrypt a message. Finding the algorithm that was used to encrypt is not complex, because like a virus, an encrypted message may also contain signatures that can describe its originating algorithm. The size of the key will determine how easy or difficult it is to break an encryption simply because a smaller size key has fewer possible choices, while a larger key has more.

To understand stream ciphers, start with the theory of a key stream, sometimes called a running key. A key stream, in theory, is a continuous key that's constantly and randomly generated to produce the ciphertext. In other words, each key byte generated is XORed with a plaintext byte to produce a ciphertext byte, for the full length of the plaintext. In theory, if the key stream is truly random, at least as long as the plaintext, and never reused, the encryption cannot be broken. The larger the key, the more secure the ciphertext, because there are more possible keys to try. The more random the key, the harder it is to break, because any pattern becomes harder to reverse.

The algebraic notation for the previous discussion can be represented as c_i = p_i ⊕ k_i, where ⊕ denotes XOR. The symbol c_i stands for the ciphertext at index i, p_i stands for the plaintext data at index i, and k_i stands for the key at index i. As we'll see in the RC4 algorithm, XOR is great because the same algorithm can be used in reverse. That is, p_i = c_i ⊕ k_i, which means I can find the plaintext by XORing the ciphertext and the original key.
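The relationship between the two formulas can be sketched in a few lines of Java. This is only an illustration: the class name XorKeyStreamDemo and the fixed key stream bytes are assumptions made up for the example, and a real key stream would have to be unpredictable.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class XorKeyStreamDemo {
    // Applies c_i = p_i XOR k_i. Because XOR is its own inverse, the same
    // method also recovers p_i from c_i when given the same key stream.
    static byte[] xor(byte[] data, byte[] keyStream) {
        if (keyStream.length < data.length) {
            throw new IllegalArgumentException("key stream shorter than data");
        }
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ keyStream[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] plaintext = "stream".getBytes(StandardCharsets.US_ASCII);
        // Hypothetical fixed key stream, for illustration only.
        byte[] keyStream = {0x3A, 0x5C, 0x01, 0x7F, 0x10, 0x22};
        byte[] ciphertext = xor(plaintext, keyStream);
        byte[] recovered = xor(ciphertext, keyStream);
        System.out.println(Arrays.equals(plaintext, recovered)); // prints "true"
    }
}
```

Calling xor twice with the same key stream returns the original bytes, which is why a stream cipher built on XOR needs no separate decryption routine.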

One of the practices that evolved from the running-key theory is using a value derived from one ciphertext byte as the key for the next plaintext byte. Another evolution of the running key is the idea of using the key to generate a larger set of keys by hashing initialization data to generate an S-box (a substitution box). S-boxes are discussed later, but can be described as creating a vector from a key to manipulate the data in an algorithm.

When all is said and done, we need to have a key (say, 40 bits in some cases) with as few patterns as possible, and the key needs to be kept secure. If you remember anything from this article, remember that safeguarding the key must be the highest priority, since it controls access just like the keys to your car or house. The other point to remember about keys is that size does matter.

Practical
In the previous section, I stressed the need for a key and for its randomness. The designers of Java recognized this need and provided the java.security.SecureRandom class, a cryptographically stronger random number generator than most libraries offered, which can also be reseeded. The idea is to produce keys that no one could guess. Many random generators are not truly random, and some keys could be guessed by knowing the generator. A random generator uses a seed to give it a different starting point for generating numbers. Many use the time of day as a seed. Others may even randomize the seed, but if there is a flaw in the randomization, the seed won't be any better. Reseeding a SecureRandom instance does not reinitialize the generator; the new seed supplements the existing seed rather than replacing it. For this reason alone, some may consider using Java for their encryption needs.
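As a rough sketch of the class in use, the following generates key material and then reseeds the generator. The class name SecureRandomKeyDemo and the helper randomKey are hypothetical names chosen for this example; only SecureRandom, nextBytes, and setSeed come from the JDK.

```java
import java.security.SecureRandom;

class SecureRandomKeyDemo {
    // Returns lengthBytes of key material from the platform's default SecureRandom.
    static byte[] randomKey(int lengthBytes) {
        SecureRandom random = new SecureRandom(); // seeds itself on first use
        byte[] key = new byte[lengthBytes];
        random.nextBytes(key);
        return key;
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();
        byte[] first = new byte[16];
        random.nextBytes(first);           // first use triggers self-seeding
        random.setSeed(System.nanoTime()); // supplements the existing seed, does not replace it
        byte[] second = new byte[16];
        random.nextBytes(second);
        System.out.println(first.length + " and " + second.length + " random bytes generated");
    }
}
```

The time-based value passed to setSeed is deliberately weak; because it is mixed into the existing seed instead of replacing it, it cannot make the output easier to predict.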

Other utilities that play a big role in secure programming in Java are the KeyStore, the jarsigner, and the security manager. While this article is too brief to describe these utilities in detail, you need to know that Java provides a utility called keytool to store keys in a secure keystore, a jarsigner utility to sign a JAR file so that any later modification can be detected, and a security manager that can control which resources can be accessed at runtime using a security policy. These utilities can control access to vital resources and data. I'd like to note that these resources come out of the box in Java, as do many encryption algorithms, and, again, this is a benefit of using Java.

The de facto stream algorithm is RC4. RC4 stands for Ron's code number 4, Ron being Ron Rivest, the "R" in RSA. Since it's the de facto stream algorithm, I'll use it as an example to design, build, and deploy a stream algorithm. The reason you should understand cipher algorithms and their uses is not just to know how to use them, but to understand when to use them, their vulnerabilities, strengths, and how to develop your own algorithms.

After spending many years as a consultant, I've heard programmers proclaim, "I just need to know how to use it, not what it does; there are builders of the JCEs who are concerned about those tasks." Yet I have gotten a lot of consulting work reworking organizations' code, usually because someone didn't understand the algorithm correctly. Just as an e-commerce programmer may understand the internals of JSPs and EJBs, a security programmer needs to know the internals of RC4, RSA, and other algorithms. From the IT security officer's point of view, programmers should be able to give the reasoning, strengths, weaknesses, and history behind the algorithms that they're using. Not understanding a cipher algorithm in enough detail could make misusing the algorithm worse than not having an algorithm at all.

RC4
RC4 is the de facto stream algorithm that was made public by an anonymous cypherpunk contributor in 1994. The knowledge of the algorithm was made public, but I believe its commercial use is still licensed through RSA. I'll leave the reader to contact RSA before using it commercially. Examples of RC4's commercial uses include Oracle and wireless technologies. I'd imagine that's because of the small and large sets of data needed in these products.

Unlike a stream cipher, a block cipher processes data in fixed-size chunks called blocks, usually 64 bits (8 bytes) for DES-style ciphers. If the final block is shorter than the block size, the algorithm pads it. For data that may be only a few bytes, this may seem like a lot of overhead. For I/O-heavy data that arrives slowly, splitting it into blocks may seem to take up a lot of time. The solution for many may be to use a stream cipher, which handles data of any size and is quick to process. One place where a stream cipher may be a detriment is document files, where you wouldn't want the plaintext and ciphertext lengths to match. I tend to use stream ciphers with stream I/O, especially Java sockets, when speed is important. Some users of RC4 state that RC4 is 10 times faster than DES. When using RC4, pay careful attention to the keys. If the same key is used over and over again, it could be compromised by constant observation and, if the key is not adequately randomized, it could be weak.
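The risk of reusing a key can be made concrete. If two messages are XORed with the same key stream, XORing the two ciphertexts cancels the key stream and leaves the XOR of the two plaintexts, which an observer can then attack without ever knowing the key. The sketch below checks that identity; the class name, helper names, and sample messages are all assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

class KeyStreamReuseDemo {
    // XORs two byte arrays over their common length.
    static byte[] xor(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        byte[] out = new byte[n];
        for (int i = 0; i < n; i++) {
            out[i] = (byte) (a[i] ^ b[i]);
        }
        return out;
    }

    // True when c1 XOR c2 equals p1 XOR p2, i.e. the shared key stream has cancelled out.
    static boolean keyStreamCancels(byte[] p1, byte[] p2, byte[] keyStream) {
        byte[] c1 = xor(p1, keyStream);
        byte[] c2 = xor(p2, keyStream);
        return Arrays.equals(xor(c1, c2), xor(p1, p2));
    }

    public static void main(String[] args) {
        byte[] p1 = "attack at dawn".getBytes(StandardCharsets.US_ASCII);
        byte[] p2 = "retreat at six".getBytes(StandardCharsets.US_ASCII);
        byte[] keyStream = new byte[p1.length];
        new SecureRandom().nextBytes(keyStream); // random, but used twice
        System.out.println(keyStreamCancels(p1, p2, keyStream)); // prints "true"
    }
}
```

Even a perfectly random key stream gives this result when it is used for two messages, which is one reason to generate a fresh key per session.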

When using a cipher in Java, understanding the cipher itself, like RC4, is only a piece of the puzzle. An understanding of the Java Cryptography Extension (JCE) and service providers becomes paramount when using any cipher. It's crucial to understand how a provider is located through the provider chain, how to access the provider, and how the algorithm is used in KeyGenerator and in CipherSpi. Understanding these concepts is important because programmers may be using a service provider without understanding the origin of the cipher they're using. In other words, programmers need to understand how to prevent Trojans and backdoors by understanding the origins of what their code is using.

All the key generators and ciphers in Java are built using the Service Provider Interface (SPI) layer. The idea of an SPI layer is to provide vendors with the ability to create their own algorithms with the use of a common interface. Because Sun supplies this interface, others can commercially produce extensions that work within the 1.4.1 SDK without having to be built with the SDK. This article provides the code necessary to create a provider (code can be downloaded from www.sys-con.com/java/sourcec.cfm). All providers are registered with Sun to ensure that Sun knows who is integrating and interfacing into the 1.4.1 SDK.

Provider
The first step in using a provider is to get the SPI loaded in the provider chain. One way to load a provider is to place the provider in the %JAVA_HOME%\jre\lib\security\java.security file as a line item like security.provider.1=com.richware.cipher.RichProvider. Another way is to load it at runtime in code, as shown in the com.richware.cipher.TestRC4Cipher class using the code Security.insertProviderAt(new RichProvider(), 1);. Either way will give a provider interface that's defined by its class and position in the provider chain. Where the provider falls in the chain is also important. There could be three RC4 cipher algorithms defined with the alias RC4, but the executing programs will pick up the first alias in the chain. I defined my provider as the first, so if there are other RC4 algorithms with the same alias, mine will be executed, not the others. If hackers place their provider in the chain while code is executing, it could give them a doorway to your data.
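To see the effect of position in the chain, a minimal sketch can install a provider at position 1 and list the chain afterward. DemoProvider is a hypothetical stand-in for a class like RichProvider and registers no algorithms; the (name, double version, info) constructor is used because it exists on old and new JDKs alike, although recent JDKs mark it deprecated in favor of a String version.

```java
import java.security.Provider;
import java.security.Security;

class ProviderOrderDemo {
    // Hypothetical provider used only to show chain ordering; it offers no algorithms.
    static final class DemoProvider extends Provider {
        DemoProvider() {
            super("DemoProvider", 1.0, "Illustrative provider with no algorithms");
        }
    }

    // Inserts the provider at position 1 (positions are 1-based) and returns
    // the name of whichever provider is now first in the chain.
    static String installFirst() {
        Security.insertProviderAt(new DemoProvider(), 1);
        return Security.getProviders()[0].getName();
    }

    public static void main(String[] args) {
        System.out.println("First provider: " + installFirst());
        Provider[] chain = Security.getProviders();
        for (int i = 0; i < chain.length; i++) {
            System.out.println((i + 1) + ". " + chain[i].getName());
        }
    }
}
```

Listing the chain this way is also a quick check that nothing unexpected has been placed ahead of the provider you intend to use.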

Using the example class com.richware.cipher.RichProvider, the provider class is simple and there are a few things to remember. I declared the class as a final class so as not to allow the class to be extended; the class is extended from the java.security.Provider class shipped with the Java 1.4.1 SDK. The RichProvider registers information about itself to describe its origins like its name, version, and info. If a programmer is executing providers and possibly one that they downloaded, this is important information since it allows you to discover the origin of the provider. A lot of security goes into the provider interface because the valuable data of an organization that they encrypt could be sent through the Internet through a rogue provider.

Another piece of the provider is that it associates aliases to classes, usually both a KeyGenerator and a Cipher alias depending on the algorithm. However, this is totally dependent on matching a corresponding key type to the algorithm. For example, I used the following code:

    // Associate the RC4 aliases with the provider's implementation classes.
    AccessController.doPrivileged(
        new PrivilegedAction() {
            public Object run() {
                put( "Cipher.RC4",
                     "com.richware.cipher.RichRC4Cipher" );
                put( "KeyGenerator.RC4",
                     "com.richware.cipher.RichRC4KeyGenerator" );
                return null;
            }
        } );

This code simply means that when I pass RC4 in a KeyGenerator, it calls my KeyGenerator service provider code com.richware.cipher.RichRC4KeyGenerator. It will likewise call the provider's corresponding code when I pass RC4 in a Cipher instance. The code fragment executes this code in a block as a privileged action, which gets the JVM's Security Manager involved. All provider code must be signed in a Java Archive (JAR) file with a certificate from Sun so that security providers can possibly be tracked. In my example, when the richprovider.jar gets loaded, it has to be authenticated with a trusted certificate. You have to use the keytool to get the trusted certificate and the jarsigner utility to sign it in the provider's JAR file. Take a look at the sidebar "How to Get a Service Provider Certificate" for a set of simplified steps.
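The alias-to-class association can be inspected without loading the implementation classes, because a Provider is at heart a property table. The sketch below is an assumption-laden illustration: MappingProvider and implementationFor are hypothetical names, the doPrivileged wrapper is omitted for brevity, and the two com.richware class names are simply the strings from the article, not classes that need to be on the classpath.

```java
import java.security.Provider;

class AliasMappingDemo {
    // Hypothetical provider that only records which class implements each alias.
    static final class MappingProvider extends Provider {
        MappingProvider() {
            super("MappingDemo", 1.0, "Maps RC4 aliases to implementation class names");
            put("Cipher.RC4", "com.richware.cipher.RichRC4Cipher");
            put("KeyGenerator.RC4", "com.richware.cipher.RichRC4KeyGenerator");
        }
    }

    // Looks up the class name registered for a service type and algorithm alias,
    // or returns null when the provider has no such entry.
    static String implementationFor(String type, String algorithm) {
        return new MappingProvider().getProperty(type + "." + algorithm);
    }

    public static void main(String[] args) {
        System.out.println(implementationFor("Cipher", "RC4"));
        System.out.println(implementationFor("KeyGenerator", "RC4"));
    }
}
```

The same getProperty lookup works on any installed provider, so it is one way to find out which class stands behind an alias before trusting it.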

Looking at the steps in the sidebar, it's obvious that there are a lot of security provisions and traceability for using Java JCE providers. It was a lot different in the C++ days. In those days we just added a Dynamic Link Library (DLL) to the System32 path in Windows or a library in Unix. However, not to be preachy about the robustness of Java, you can examine the origin and execution of the JAR file. For instance, when in doubt about a JAR file, just move the JAR and isolate the execution of it to another system to examine. A trace of the JVM tracing into the JAR could be done to see if the JAR is Trojaned, but that is another discussion.

When using some of the more native libraries, it becomes more difficult to trace for Trojans through the libraries because it requires an understanding of the operating system that the native calls are integrated within. Some security consulting involves isolating the signatures for Trojans on libraries stored in the computer. These techniques help in host-based intrusion detection.

KeyGeneratorSpi
The purpose of the provider is to securely associate the correct KeyGenerator and Cipher code to the algorithm being called by the application. Most KeyGenerator code is responsible for generating a random key. Most of the differences rely on the key size. The RC4 algorithm allows for a key size of 1-256 bytes to match up to the size of the S-box. Now there are export restrictions, so when exporting to other countries, it may be necessary to limit the key size. You must contact the Department of Commerce for current export restrictions.

The code checks the key size in bits and, if it's the wrong size, it will throw an exception, otherwise it will generate a key based on the size. Most of the work is ensuring that the correct key size, given in bits and generated in bytes, is returned as a SecretKeySpec class.

Notice in Listing 1 that the SecureRandom class is used to create the random number key. The SecretKeySpec class is returned because the key is a secret key. One of the features of the code is that if you're not happy with Java's SecureRandom class and feel that you can build a better one, you can extend it and pass it in the KeyGenerator class to use it instead. A fragment of the RichRC4KeyGenerator code is provided in Listing 1.
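Since Listing 1 is not reproduced here, the following is a minimal sketch of the check-then-generate logic described above, not the RichRC4KeyGenerator code itself. The class name Rc4KeySketch, the method generateKey, and the choice of InvalidParameterException are assumptions; the size limits follow the 1-256 byte range stated in the text.

```java
import java.security.InvalidParameterException;
import java.security.SecureRandom;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

class Rc4KeySketch {
    static final int MIN_BITS = 8;    // 1 byte
    static final int MAX_BITS = 2048; // 256 bytes, the size of the S-box

    // Validates the requested size in bits, then returns that many random
    // bits (as whole bytes) wrapped in a SecretKeySpec.
    static SecretKey generateKey(int keySizeBits, SecureRandom random) {
        if (keySizeBits < MIN_BITS || keySizeBits > MAX_BITS || keySizeBits % 8 != 0) {
            throw new InvalidParameterException(
                "Key size must be a multiple of 8 between 8 and 2048 bits: " + keySizeBits);
        }
        byte[] keyBytes = new byte[keySizeBits / 8];
        random.nextBytes(keyBytes);
        return new SecretKeySpec(keyBytes, "RC4");
    }

    public static void main(String[] args) {
        SecretKey key = generateKey(128, new SecureRandom());
        System.out.println(key.getAlgorithm() + " key of " + key.getEncoded().length + " bytes");
    }
}
```

Passing the SecureRandom in as a parameter mirrors the point made above: a caller who prefers a different generator can supply a subclass without touching the key generation code.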

CipherSpi
The CipherSpi is also started from the provider when it associates the Cipher.RC4 with the com.richware.cipher.RichRC4Cipher class. The key was generated in the previous section; now it's time to encrypt or decrypt the message. The CipherSpi normally takes in the mode of operation for the cipher engine, like the operation to encrypt versus decrypt. In the RC4 algorithm, there's no difference between encryption and decryption except for the input data. When the RC4 engine is initialized, it will first build its S-box. Many algorithms have multiple S-boxes, but RC4 has a single S-box, an array of 256 entries indexed from 0 to 255. Listing 2 provides an example.

The building of an S-box involves manipulating the key from an initialized S-box to produce a new substitution box to be used in the RC4 algorithm. Simply put, the S-box is built from the key and some initialization code, and acts like a new key that cannot be deciphered. The idea is to build an S-box to swap data from a known index into an unknown index to avoid Gaussian elimination in trying to reverse the algorithm. This is accomplished by using the random key to define the position of the next swap with the index2 variable. The counter variable, along with the index1 variable, ensures that all the S-box bytes are swapped at least once during calculations. The idea is simply to try to avoid any pattern and factoring of the S-box while having the same key produce the same S-box.
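Listing 2 is not reproduced here, so the following sketch shows the widely published RC4 key-scheduling step instead. The variable names counter, index1, and index2 are chosen to echo the description above, but how they map onto the author's code is an assumption, as are the class and method names.

```java
import java.nio.charset.StandardCharsets;

class Rc4KeySchedule {
    // Builds the 256-entry S-box from a key of 1 to 256 bytes.
    static int[] buildSBox(byte[] key) {
        if (key.length < 1 || key.length > 256) {
            throw new IllegalArgumentException("key must be 1 to 256 bytes");
        }
        int[] sBox = new int[256];
        for (int counter = 0; counter < 256; counter++) {
            sBox[counter] = counter; // initialized S-box: 0, 1, ..., 255
        }
        int index1 = 0; // position in the key, wraps around
        int index2 = 0; // key-dependent position of the next swap
        for (int counter = 0; counter < 256; counter++) {
            index2 = (index2 + sBox[counter] + (key[index1] & 0xFF)) & 0xFF;
            int tmp = sBox[counter];
            sBox[counter] = sBox[index2];
            sBox[index2] = tmp;
            index1 = (index1 + 1) % key.length;
        }
        return sBox;
    }

    public static void main(String[] args) {
        int[] sBox = buildSBox("Key".getBytes(StandardCharsets.US_ASCII));
        System.out.println("S-box entries: " + sBox.length);
    }
}
```

Because every step is a swap, the result is always a permutation of 0 to 255, and the same key always yields the same S-box.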

After the S-box is built from the key, the RC4 algorithm can be used to encrypt or decrypt the data. Listing 3 demonstrates the RC4 cipher.

From the code, you can see that each output byte of the RC4 algorithm is the input byte XORed with the S-box value at the xorIndex. First, an x and a y index are selected; y is derived from the S-box value at the x index. The two S-box entries at x and y are swapped. Then xorIndex is computed from the two S-box entries at the x and y indexes, and the S-box value at xorIndex is XORed with the input byte. Again, the idea is to keep swapping the S-box entries and moving the indexes to make the value and location difficult to reproduce by factoring. Making pattern finding and reverse factoring difficult becomes the guideline for developing ciphers.
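The steps just described can be sketched as a small, self-contained class. This follows the commonly published description of RC4 and is not the RichRC4Cipher code from Listing 3; the class name Rc4Sketch and its method names are assumptions, while x, y, and xorIndex echo the names used above. The main method checks the result against a widely circulated RC4 test vector (key "Key", plaintext "Plaintext").

```java
import java.nio.charset.StandardCharsets;

class Rc4Sketch {
    private final int[] s = new int[256];
    private int x;
    private int y;

    // Builds the S-box from the key (the key-scheduling step).
    Rc4Sketch(byte[] key) {
        for (int i = 0; i < 256; i++) {
            s[i] = i;
        }
        int j = 0;
        for (int i = 0; i < 256; i++) {
            j = (j + s[i] + (key[i % key.length] & 0xFF)) & 0xFF;
            swap(i, j);
        }
    }

    private void swap(int a, int b) {
        int t = s[a];
        s[a] = s[b];
        s[b] = t;
    }

    // Encrypts or decrypts: the operation is identical in both directions.
    byte[] process(byte[] input) {
        byte[] output = new byte[input.length];
        for (int n = 0; n < input.length; n++) {
            x = (x + 1) & 0xFF;
            y = (y + s[x]) & 0xFF;
            swap(x, y);
            int xorIndex = (s[x] + s[y]) & 0xFF;
            output[n] = (byte) (input[n] ^ s[xorIndex]);
        }
        return output;
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] key = "Key".getBytes(StandardCharsets.US_ASCII);
        byte[] plaintext = "Plaintext".getBytes(StandardCharsets.US_ASCII);
        byte[] ciphertext = new Rc4Sketch(key).process(plaintext);
        System.out.println(toHex(ciphertext)); // prints "BBF316E8D940AF0AD3"
        byte[] recovered = new Rc4Sketch(key).process(ciphertext);
        System.out.println(new String(recovered, StandardCharsets.US_ASCII)); // prints "Plaintext"
    }
}
```

Note that decryption needs a fresh instance built from the same key, because x, y, and the S-box all change as bytes are processed.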

Testing the Program
The Test program is straightforward. The com.richware.cipher.TestRC4Cipher class will load up the RichProvider class and generate a key of a SecretKeySpec class that is 128 bits with the following:

kg.init(128);
Key secretKey = kg.generateKey();

The program encrypts the message "This is a test, hackers beware," which is 31 bytes, to a 31-byte encrypted message that, when Base64-encoded, looks similar to "cUi8DZfy+IQti6xl4Z4FhzRZl2mY2Pa7RmZygnVXnA==", depending on the key. After encrypting, I decrypt and compare the output to the original message to see if anything changes. Listing 4 provides the code fragment that accomplishes this.
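Listing 4 is not reproduced here, but the same encrypt, decrypt, and compare cycle can be sketched against the standard JCE API. This sketch assumes that one of the installed providers answers to the name "RC4" (the JDK's bundled provider has traditionally offered it as an alias for ARCFOUR); with RichProvider installed first, the same calls would reach the article's classes instead. The class name, helper names, and the exact message punctuation are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

class Rc4RoundTripDemo {
    // Generates a 128-bit key from whichever provider supplies KeyGenerator "RC4".
    static SecretKey newKey() {
        try {
            KeyGenerator kg = KeyGenerator.getInstance("RC4");
            kg.init(128);
            return kg.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("No RC4 key generator available", e);
        }
    }

    // Runs the cipher in the given mode (Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE).
    static byte[] apply(int mode, SecretKey key, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("RC4");
            cipher.init(mode, key);
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RC4 cipher failed or is unavailable", e);
        }
    }

    public static void main(String[] args) {
        SecretKey secretKey = newKey();
        byte[] message = "This is a test, hackers beware.".getBytes(StandardCharsets.US_ASCII);
        byte[] encrypted = apply(Cipher.ENCRYPT_MODE, secretKey, message);
        System.out.println(Base64.getEncoder().encodeToString(encrypted)); // varies with the key
        byte[] decrypted = apply(Cipher.DECRYPT_MODE, secretKey, encrypted);
        System.out.println(Arrays.equals(message, decrypted)); // prints "true"
    }
}
```

A 31-byte message always encodes to 44 Base64 characters, which matches the length of the sample string quoted above.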

Included in the source code is a "-v" option that puts the test code and provider in a verbose mode for readers who might want to trace the provider calls and S-box information. A point to note in Listing 4 is that the same secret key used for encryption is used for decryption. The richprovider.jar won't work unless it is signed with the certificate from Sun. Without that signature, a "java.security.NoSuchProviderException: JCE cannot authenticate the provider RichWare" exception should appear.

Summary
This article introduces a stream algorithm to help you understand the proper techniques to produce better and more robust algorithms in the future. Many standard cipher algorithms have been around for decades with little to add to them. You would think that it was due to the fact that they have not been broken, but some algorithms like DES have been broken many times. Understanding the strengths, weaknesses, and how the algorithms work may help us produce more secure data for our corporations and ourselves.

References

  • Schneier, B. (1996). Applied Cryptography: Protocols, Algorithms, and Source Code in C, Second Edition. John Wiley & Sons.
  • Helton, R., and Helton, J. (2002). Java Security Solutions. John Wiley & Sons.

    SIDEBAR
    Key Patterns

    To make the key pattern simple, if my key was all 0s and was XORed with the plaintext, then the ciphertext would be the same as the plaintext. The cipher is useless. Now if the key contained only 1s and was XORed with the plaintext, every bit would simply be flipped, so the relationship would be easy to see. If there was more of a mix of 1s and 0s throughout, it would become more difficult to see any relationship between the plaintext and ciphertext.

    SIDEBAR
    How to Get a Service Provider Certificate

    Here are simplified steps for getting a service provider certificate from Sun; of course, the Java 1.4.1 SDK documents go into a lot more detail. You'll need time to get this certificate from Sun. The following steps take you through the process:
    1.  Get a code-signing certificate:
        • Use the keytool utility to generate a DSA key pair.
        • Use the keytool utility to generate a certificate-signing request (CSR).
        • Send the CSR and other information to the JCE Code Signing Certification Authority at Sun.
        • After receiving the certificate from Sun, import the certificate using the keytool utility.
    2.  JAR the provider class (i.e., RichProvider) code using the JAR utility.
    3.  Sign the JAR (i.e., richprovider.jar) with the certificate from Sun using the jarsigner utility.
    4.  Ensure that the JAR (i.e., richprovider.jar) is in the %JAVA_HOME%\jre\lib\ext path.
        • You may install the provider in a different classpath location, but it may require setting up a separate java.policy file. The default java.policy file is already set for the above directory.

    By Rich Helton

    Rich Helton has worked on computer systems since 1982 and entered the
    private sector in 1990 as a lead computer architect with OmniPoint Data
    Corporation, the inventors of PCS. Since then, he started consulting as a
    Security Architect providing many organizations with secure Enterprise and
    Network solutions. He is the co-author of Java Security Solutions and the
    BEA WebLogic Server Bible.

    He has lectured extensively on security. Rich has many degrees and many
    certifications on the subject.