Parsing Command Line Arguments with Java

One of Java's great appeals is that the language provides out-of-the-box GUI development capabilities. Still, a lot of us use Java to write command line tools. Such tools are well suited to automating batch and offline processes. This article presents a framework that jump-starts the development of such tools.

Command line tools are usually invoked from a shell (e.g., DOS prompt, sh, ksh, etc.) and perform a certain task. The task can be customized based on the command line arguments. For instance:

telnet foo.bar.com

attempts to open a telnet connection to host foo.bar.com. It uses the default telnet port. The next example:

telnet -p 3434 foo.bar.com

attempts a similar connection using port 3434.

Command line tools can be as simple or as complicated as the developer desires. An example of a simple command line tool is the echo command found in most shells. On the other hand, the Java compiler and the Java Virtual Machine (JVM) are complex command line tools.

Java presents the command line arguments in an array of strings. This is already a huge improvement over C and C++, in which the arguments are presented as an array of C strings, i.e., an array of pointers to arrays of characters. Yet it falls short of the developer's desire to get the arguments parsed and ready to use.
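To make that gap concrete, here is a minimal sketch of the hand-written loop a developer typically ends up with for the telnet example above. The class name TelnetArgs and its fields are hypothetical and are not part of the framework; port 23 is the standard telnet default.

```java
// Hand-written parsing for a telnet-like tool: telnet [-p port] host.
// TelnetArgs and its fields are hypothetical names used only for illustration.
class TelnetArgs {
    int port = 23;   // default telnet port
    String host;     // required positional argument

    static TelnetArgs parse(String[] args) {
        TelnetArgs result = new TelnetArgs();
        int i = 0;
        while (i < args.length) {
            String arg = args[i];
            switch (arg) {
                case "-p":
                    if (i + 1 >= args.length) {
                        throw new IllegalArgumentException("-p requires a port number");
                    }
                    try {
                        result.port = Integer.parseInt(args[i + 1]);
                    } catch (NumberFormatException e) {
                        throw new IllegalArgumentException("invalid port: " + args[i + 1]);
                    }
                    i += 2;   // consume the switch and its value
                    break;
                default:
                    if (arg.startsWith("-")) {
                        throw new IllegalArgumentException("unknown switch: " + arg);
                    }
                    if (result.host != null) {
                        throw new IllegalArgumentException("unexpected argument: " + arg);
                    }
                    result.host = arg;
                    i++;
                    break;
            }
        }
        if (result.host == null) {
            throw new IllegalArgumentException("usage: telnet [-p port] host");
        }
        return result;
    }
}
```

Every new switch adds another case, another bounds check and another line to the usage string, and that is the bookkeeping a declarative framework aims to remove.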

Since I published my 1997 C++ command line parsing framework (see Reference), many readers have e-mailed me with requests and suggestions. The top two requests have been for a Java implementation and an improvement to handle arrays of arguments. In this article I present a total rewrite of the framework with improvements for Java programmers.

Using the Framework
Before moving to the implementation I'll demonstrate how to use the framework to write a command line utility. Let's say you want to write a utility called "mycat" - like the UNIX cat - which takes a number of files and concatenates them together into a larger file. A -v option turns verbose output on and off. A -l option allows the insertion of extra empty lines between the files. The command would look like:

mycat [-l <lines>] [-v] file1 file2 ...

In your Main class you need to add a Token object for each argument. In this example we have three Tokens: the number of lines, the verbosity mode and the input files. In addition, you need to add an ApplicationSettings object. This object is used to contain all the arguments.

The source code for these settings is shown in Listing 1. I first declare the sm_main variable and then the three Token variables: sm_verbose, sm_files, sm_lines. The arguments in the constructor of each token object fully describe the expected usage of the Token:

  • Is it a switch or an argument?
  • What is the switch's name (e.g., -v)?
  • What is its type (integer, string, etc.)?
  • Can it appear multiple times (e.g., -l 1 -l 2)?
  • Is it a required argument?
  • If not, when the argument is missing:
    1. Is there an environment variable to provide the value?
    2. Is there a default value?

    A static initializer adds the Token variables to the ApplicationSettings variable. By the time the main() function of your application is reached, the ApplicationSettings object knows everything about the syntax of your command line utility.

    Listing 2 shows the main program of my example. The first line after the try statement calls the parseArgs() method of the ApplicationSettings object. The actual command line arguments are passed to this method. When the syntax is incorrect, a usage message is printed and an exception is thrown. Otherwise, the Token objects are set to contain appropriate values. For instance, when the -v option is present, the sm_verbose object will be set. Later, when its getValue() method is called, it will return true.

    In a similar fashion, if two files are passed as arguments, let's say foo.cc and bar.cc, the sm_files Token will be set appropriately. Its getValue(0) method will return foo.cc, its getValue(1) method will return bar.cc.

    Now compile the example with your favorite development environment and run the resulting code without passing any arguments. You should get the usage message in Listing 3. But wait a minute: you never wrote code to print usage messages; what's going on here? It's very simple. The framework uses the same code that defines the expected Tokens to generate usage statements. Kiss the ugly, always-out-of-date, static String statements that describe the usage of the utility goodbye.

    Now let's run the program again with some decent arguments. Let's say we run it with arguments "-v foo.cc bar.cc". The program prints the arguments correctly. Though we didn't pass any value for the -l switch, the Token returns 0. This is the expected behavior because the default value of the sm_lines Token is indeed 0.
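The listings are not reproduced here, so the following is only a rough sketch of the shape such a program might take. MiniToken and MiniSettings are hypothetical stand-ins with invented constructor signatures, not the framework's Token and ApplicationSettings classes. Only the names sm_main, sm_verbose, sm_lines, sm_files, parseArgs() and getValue() come from the text. The stand-ins also skip the required-argument check and the usage message.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a framework token. A null name marks the positional arguments.
// MiniToken and its constructor are assumptions, not the article's Token API.
class MiniToken {
    final String name;           // e.g. "-v", or null for positional arguments
    final boolean takesValue;    // true for "-l 2", false for an on/off switch
    final Object defaultValue;   // returned by getValue() when nothing was parsed
    final List<Object> values = new ArrayList<>();

    MiniToken(String name, boolean takesValue, Object defaultValue) {
        this.name = name;
        this.takesValue = takesValue;
        this.defaultValue = defaultValue;
    }

    Object getValue(int index) { return values.get(index); }

    Object getValue() { return values.isEmpty() ? defaultValue : values.get(0); }
}

// Stand-in for ApplicationSettings: owns the tokens and fills them from args.
class MiniSettings {
    private final List<MiniToken> tokens = new ArrayList<>();

    void addToken(MiniToken token) { tokens.add(token); }

    void parseArgs(String[] args) {
        for (int i = 0; i < args.length; i++) {
            MiniToken token = find(args[i]);
            if (token == null) {
                throw new IllegalArgumentException("unexpected argument: " + args[i]);
            }
            if (token.name == null) {
                token.values.add(args[i]);                    // positional argument
            } else if (!token.takesValue) {
                token.values.add(Boolean.TRUE);               // on/off switch
            } else if (i + 1 < args.length) {
                token.values.add(Integer.valueOf(args[++i])); // integer-valued switch
            } else {
                throw new IllegalArgumentException(token.name + " requires a value");
            }
        }
    }

    private MiniToken find(String arg) {
        for (MiniToken t : tokens) {
            if (arg.equals(t.name)) return t;                 // named switch
        }
        if (arg.startsWith("-")) return null;                 // unknown switch
        for (MiniToken t : tokens) {
            if (t.name == null) return t;                     // positional arguments
        }
        return null;
    }
}

class MyCat {
    static final MiniSettings sm_main = new MiniSettings();
    static final MiniToken sm_verbose = new MiniToken("-v", false, Boolean.FALSE);
    static final MiniToken sm_lines = new MiniToken("-l", true, Integer.valueOf(0));
    static final MiniToken sm_files = new MiniToken(null, false, null);

    // Register every token before main() runs.
    static {
        sm_main.addToken(sm_verbose);
        sm_main.addToken(sm_lines);
        sm_main.addToken(sm_files);
    }

    public static void main(String[] args) {
        sm_main.parseArgs(args);
        System.out.println("verbose = " + sm_verbose.getValue());
        System.out.println("lines   = " + sm_lines.getValue());
        System.out.println("files   = " + sm_files.values);
    }
}
```

Run with "-v foo.cc bar.cc", the sketch reports the verbose switch as set, falls back to the default of 0 for -l, and collects both file names, mirroring the behavior described above.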

    Why Use the Framework?
    By now some of the advantages of the framework should be obvious to you. The error-prone while and switch statements that usually parse the arguments have been replaced by a few very readable statements.

    These statements:

  • Document the usage of the command line utility
  • Encapsulate the settings so they can be used by the rest of the program
  • Automatically generate usage messages when the user enters incorrect syntax:
    1. Missing arguments
    2. Unexpected arguments
    3. Wrong types of arguments

    The stated advantages speed up the original development of any command line utility. They allow the developer to jump to the real code as soon as possible. At the same time, they provide immediate access to the command line settings and usage messages.

    Where the framework really shines is in the area where most of a developer's time is spent: software maintenance. If a command line utility is successful, users will ask for changes and improvements. Many of them will translate to more command line options or change the syntax of existing ones. The framework makes adding and modifying options trivial and safe. Compile-time messages will save the developer from runtime embarrassment.

    Finally, the framework is extensible. One can define new types of switches that accommodate new data types or anything else a developer desires.

    At this point you can go ahead, download the code and start using it in your own applications. The next few sections discuss the design of the framework.

    The StringArrayIterator Class
    The StringArrayIterator class is a utility class (see Listing 4). It encapsulates an array of strings and a position inside the array. The get() method returns the String at the current position. The moveNext() operation on the array allows the programmer to advance the current position to the next string. The EOF() operation determines when the end of the array has been reached.

    The ApplicationSettings object contains a StringArrayIterator object. It gets initialized from the command line arguments.
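A class with this interface could look roughly like the sketch below. Only get(), moveNext() and EOF() are named in the text. The field names, the constructor and the behavior at the end of the array are assumptions, so this is not necessarily what Listing 4 contains.

```java
// A sketch of the iterator described above. The constructor, the field names
// and the end-of-array behavior are assumptions.
class StringArrayIterator {
    private final String[] m_strings;
    private int m_position = 0;

    StringArrayIterator(String[] strings) {
        m_strings = strings;
    }

    /** Returns the string at the current position. */
    String get() {
        if (EOF()) {
            throw new IllegalStateException("past the end of the arguments");
        }
        return m_strings[m_position];
    }

    /** Advances to the next string; does nothing once the end is reached. */
    void moveNext() {
        if (!EOF()) {
            m_position++;
        }
    }

    /** True when every string has been consumed. */
    boolean EOF() {
        return m_position >= m_strings.length;
    }
}
```

A caller would typically loop with `while (!it.EOF()) { ... it.get() ...; it.moveNext(); }`, which keeps the position bookkeeping out of the parsing code.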

    The Token Class
    The Token class, shown in Listing 5, is an abstract class. Each Token object contains a description of an argument or a switch. After a successful parsing it also contains the value or values that were provided for the argument in the command line.

    During the parsing phase, the most important methods of the Token class are the parseSwitch() and parseArgument() methods. Both of them take the StringArrayIterator object with the command line arguments as input. If the current command line argument is recognized, three things occur: it's parsed, the pointer of the StringArrayIterator object is moved and a value of true is returned. If it's not recognized, a value of false is returned.

    The values that correspond to this switch or argument are stored in a Vector of objects. Each subclass determines the class of the stored objects. For instance, the StringToken subclass stores String objects, and the IntegerToken subclass stores Integer objects.

    While the program is running, the values are accessible using the getValue(int) and getValue() operations.
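The contract described in this section can be sketched as follows. TokenSketch and ArgCursor are hypothetical names, an ArrayList stands in for the Vector, and the sketch handles only switches that carry a value. It illustrates the parse-and-advance protocol and should not be read as a copy of Listing 5.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal cursor standing in for the StringArrayIterator (hypothetical).
class ArgCursor {
    private final String[] args;
    private int pos = 0;

    ArgCursor(String[] args) { this.args = args; }

    String get() { return args[pos]; }
    void moveNext() { pos++; }
    boolean EOF() { return pos >= args.length; }
}

// Sketch of an abstract token: a named switch, or a plain argument when
// the switch name is null. Names and signatures are assumptions.
abstract class TokenSketch {
    private final String m_switchName;
    private final List<Object> m_values = new ArrayList<>();

    TokenSketch(String switchName) { m_switchName = switchName; }

    /** Subclasses convert the raw text into their own value type. */
    protected abstract Object toObject(String text);

    /** Consumes "switch value" at the cursor; returns false if not recognized. */
    boolean parseSwitch(ArgCursor it) {
        if (m_switchName == null || it.EOF() || !m_switchName.equals(it.get())) {
            return false;
        }
        it.moveNext();                              // skip the switch itself
        if (it.EOF()) {
            throw new IllegalArgumentException(m_switchName + " needs a value");
        }
        m_values.add(toObject(it.get()));
        it.moveNext();                              // skip the value
        return true;
    }

    /** Consumes one plain argument at the cursor; returns false if not recognized. */
    boolean parseArgument(ArgCursor it) {
        if (m_switchName != null || it.EOF() || it.get().startsWith("-")) {
            return false;
        }
        m_values.add(toObject(it.get()));
        it.moveNext();
        return true;
    }

    Object getValue(int index) { return m_values.get(index); }

    /** First parsed value, or null when nothing was parsed. */
    Object getValue() { return m_values.isEmpty() ? null : m_values.get(0); }
}
```

Returning false without moving the cursor is what lets the caller offer the same argument to the next token.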

    Token Subclasses
    A Token subclass encapsulates arguments of a specific type. For example, there's a StringToken, an IntegerToken, etc. Since most of its methods have a generic implementation, each Token subclass has very few methods to implement.

    Listing 6 presents the implementation for the class StringToken. A few more subclasses are provided in the downloaded code. You can extend the framework by implementing more subclasses.
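To illustrate why a subclass needs so little code, here is a sketch in which the base class keeps all the generic storage logic and each subclass supplies only a conversion and a type name. TypedToken, StringTokenSketch, IntegerTokenSketch and their method names are hypothetical. The real StringToken in Listing 6 may be organized differently.

```java
import java.util.ArrayList;
import java.util.List;

// Generic part: storage and access, shared by every token type (hypothetical).
abstract class TypedToken {
    private final List<Object> values = new ArrayList<>();

    /** Converts raw command line text to this token's value type. */
    protected abstract Object convert(String text);

    /** Type name that a usage message could display. */
    protected abstract String typeName();

    final void addValue(String text) { values.add(convert(text)); }

    final Object getValue(int index) { return values.get(index); }

    final int count() { return values.size(); }
}

// Strings need no conversion at all.
class StringTokenSketch extends TypedToken {
    protected Object convert(String text) { return text; }

    protected String typeName() { return "string"; }
}

// Integers are parsed; malformed text raises NumberFormatException.
class IntegerTokenSketch extends TypedToken {
    protected Object convert(String text) { return Integer.valueOf(text); }

    protected String typeName() { return "integer"; }
}
```

Under this layout, supporting a new data type means writing one more small subclass with its own convert() and typeName().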

    The ApplicationSettings Object
    The ApplicationSettings object puts everything I've discussed so far together (see Listing 7). It contains all the Token objects and initiates the parsing algorithm. The user triggers the parsing by calling the parseArgs() method.

    The command line arguments are assigned to the StringArrayIterator member of the class. Then, for every command line argument, each Token object is called and asked to parse it as either an option or an argument.

    If no Token object can parse the argument, a usage message is printed by iterating through the Tokens and calling their printUsage() and printUsageExtended() methods. Both methods take an OutputStream as an argument. They print their output to this stream.
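The parsing loop and the generated usage message might be sketched like this. UsageToken, FlagToken and SettingsSketch are hypothetical simplifications: the interface folds switch and argument parsing into one tryParse() method, and only an on/off flag is implemented. The names printUsage() and printUsageExtended() and the use of an OutputStream follow the text. The output format is an assumption.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal token contract for this sketch.
interface UsageToken {
    /** Returns how many arguments were consumed at args[pos]; 0 if not recognized. */
    int tryParse(String[] args, int pos);
    void printUsage(OutputStream out);           // short form, e.g. "[-v] "
    void printUsageExtended(OutputStream out);   // one descriptive line
}

class FlagToken implements UsageToken {
    private final String name;
    private final String help;
    boolean set;

    FlagToken(String name, String help) { this.name = name; this.help = help; }

    public int tryParse(String[] args, int pos) {
        if (!name.equals(args[pos])) return 0;
        set = true;
        return 1;
    }

    public void printUsage(OutputStream out) {
        PrintStream ps = new PrintStream(out);
        ps.print("[" + name + "] ");
        ps.flush();
    }

    public void printUsageExtended(OutputStream out) {
        PrintStream ps = new PrintStream(out);
        ps.println("  " + name + "  " + help);
        ps.flush();
    }
}

class SettingsSketch {
    private final String program;
    private final List<UsageToken> tokens = new ArrayList<>();

    SettingsSketch(String program) { this.program = program; }

    void addToken(UsageToken token) { tokens.add(token); }

    /** Offers each argument to every token; prints usage and throws if none accepts it. */
    void parseArgs(String[] args, OutputStream err) {
        int pos = 0;
        while (pos < args.length) {
            int consumed = 0;
            for (UsageToken token : tokens) {
                consumed = token.tryParse(args, pos);
                if (consumed > 0) break;
            }
            if (consumed == 0) {
                printUsage(err);
                throw new IllegalArgumentException("unexpected argument: " + args[pos]);
            }
            pos += consumed;
        }
    }

    /** Builds the usage text from the same tokens that drive the parsing. */
    void printUsage(OutputStream out) {
        PrintStream ps = new PrintStream(out);
        ps.print("usage: " + program + " ");
        ps.flush();
        for (UsageToken token : tokens) token.printUsage(out);
        ps.println();
        ps.flush();
        for (UsageToken token : tokens) token.printUsageExtended(out);
    }
}
```

Because the usage text is produced by the tokens themselves, adding a switch updates the help output without touching any message strings.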

    Pure Java and Impurities
    Almost all the code is pure Java. Since pure Java doesn't provide support for environment variables and assertions, I had to use the functions provided in my environment, the Win32 Virtual Machine.

    These few lines of code are carefully isolated in the util.java file shown in Listing 8. In a pure Java environment you can comment out three lines of code from this file. You then lose assertions and the initialization of arguments from environment variables, but everything else works as advertised.
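On later Java versions the platform-specific calls may no longer be needed: the assert statement was added in Java 1.4, and System.getenv(String) is usable again from Java 5 onward. Under that assumption, the fallback order described earlier (command line value, then environment variable, then default) could be sketched as below. EnvFallback and its method names are hypothetical and are not taken from util.java.

```java
// Sketch of the fallback order for a missing argument, assuming a Java
// version where System.getenv(String) is available. Names are hypothetical.
class EnvFallback {
    /** Command line value first, then the environment value, then the default. */
    static String resolve(String commandLineValue, String envValue, String defaultValue) {
        if (commandLineValue != null) {
            return commandLineValue;
        }
        if (envValue != null) {
            return envValue;
        }
        return defaultValue;
    }

    /** Same as resolve(), reading the environment variable by name. */
    static String resolveFromEnvironment(String commandLineValue, String envName,
                                         String defaultValue) {
        return resolve(commandLineValue, System.getenv(envName), defaultValue);
    }
}
```

Keeping the lookup separate from the decision makes the precedence rule testable without touching the real environment.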

    Limitations
    The framework doesn't provide support for complex scenarios. For instance, there's no support for switches that depend on each other. You can't dictate that the -t option can appear if and only if the -p option appears. You'd have to implement such checks yourself after the arguments were parsed.

    Conclusion
    In this article I presented an extensible Java framework. The framework simplifies the development and maintenance of code that parses the arguments of command line utilities and tools.

    The framework doesn't provide support for complex scenarios. Still, my experience is that the framework covers most common cases. I expect that it will be as useful for Java development as it has been for C++.

    Reference
    P. Kougiouris (1997). "Yet Another Command-Line Parser." C/C++ Users Journal, Vol. 15, No. 4, April.

    About the Author

    Panos Kougiouris has ten years' experience in software development for high-tech companies. For the past three years he has been at Healtheon, a Silicon Valley startup, and he has held technical positions with Oracle and Sun Microsystems. Panos holds computer science degrees from the University of Illinois at Urbana-Champaign and the University of Patra, Greece.

    Most Recent Comments
    AST 12/04/04 10:52:06 AM EST

    Hi. Just wanted to point out another package for solving this problem. It supports popt-style autohelp as well as POSIX options, joined options (-Wall -Dfoo=bar), repeated options and of course GNU-style (--some-long-option) options.

    Where the library really differs is that it leverages the GoF Command Pattern to make the options "active" in a similar manner to the Swing Action objects. Another feature is the ability to specify which sets of options must be present or cannot be present without requiring coding this logic yourself. The parser does the work for you.

    An article discussing how this can be done at http://te-code.sourceforge.net/article-20041121-cli.html .

    Ok, and yes, I'm a bit biased because I wrote the library... ;)

    Hope this helps,

    ast
