Corba & XML

Every now and then the computer industry gets swept up in a wave of enthusiasm for some new silver bullet that's apparently going to solve everyone's problems overnight. Actually, these days the wild surges of millennial euphoria seem to come at annual intervals. Usually the technology in question is actually a step forward, able to solve real problems better or faster than was possible before. However, as word spreads about the power of the new technique, some people will inevitably try to apply it to the wrong problems.

It's a bit like the enthusiasm for microwave ovens when they first became cheap enough for anyone to buy: one could buy microwave cookbooks explaining how to use them to cook everything from a complete Christmas dinner to a soufflé. Fortunately, after a while sanity returned, and people now use microwaves for what they're best at, and have gone back to making toast in the toaster and roasting the turkey in the oven, just as they always did, because those are the best tools for the job.

The same is true in the computer business, and as with cooking gadgets, it's important to get the balance right. Pointing out that you shouldn't try to make soup in your breadmaker doesn't in any way diminish the fact that it's very, very good at making bread. In the same way, this article aims to put the current enthusiasm for XML in perspective without in any way detracting from or criticizing XML, which is an excellent tool for the job for which it was designed. However, the question "Will XML replace middleware?" is being asked so often at the moment that it seems appropriate to pen a few words on what applications XML is (and is not) suited for, and in particular why it isn't going to replace middleware solutions like CORBA (or vice versa, for that matter). To do this properly, we have to start with a little history. So, are you sitting comfortably? Then we'll begin.

A Little History
XML (eXtensible Markup Language) is a simplified subset of an earlier markup language standard called SGML (Standard Generalized Markup Language). It was devised by a committee of the World Wide Web Consortium (W3C) in response to the need for a generalization of HTML, the HyperText Markup Language used to format Web pages.

SGML was conceived as a successor to document-markup languages like TeX, troff and nroff. These languages add formatting directives to plain text to tell typesetters, laser printers and other high-quality output devices how to format the text in various fonts of different sizes and styles. When they first appeared in the 1960s, markup languages were designed to be written by hand; one would use a text editor to create a plain text document, adding in the occasional markup directive to indicate that some piece of text should be printed in bold or centered or whatever. Of course, it was important to make sure there was no confusion between the content and the markup directives, so each family of markup languages had a set of conventions for separating them. For instance, in nroff and troff the directives are on lines beginning with a full stop (or period), while TeX begins directives with a "\" character.
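As an illustration, a hypothetical fragment in the troff style might look like this, with each formatting directive on its own line beginning with a full stop and ordinary content on the lines between:

```troff
.\" A hypothetical troff fragment: directives live on lines
.\" beginning with a full stop, content on ordinary lines.
.ce          \" center the next output line
.ft B        \" switch to bold
A Heading
.ft R        \" back to roman
This is ordinary body text, filled and justified by the formatter.
```

In TeX the same separation is achieved by prefixing each directive with a backslash, as in \centerline{\bf A Heading}.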

As the use of markup languages became widespread, macros were added as a convenience feature. If headings in your document are to be displayed in centered bold 14 point Helvetica, it would soon get tedious to write four directives to change font, size, weight and justification for each heading. With a macro facility one can define a single command to do all this. Better yet, if you later decide your headings should be in Zapf Chancery instead, changing the definition of the "heading" macro automatically does the job everywhere you've used the macro.
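In plain TeX, for instance, such a macro might be sketched as follows (the name \heading is invented for illustration, not part of any standard macro package):

```tex
% Define one command that bundles the centering and font changes
% needed for a heading.
\def\heading#1{\centerline{\bf #1}}

% Every heading in the document then uses the macro:
\heading{A Little History}

% To restyle all headings later, only the definition changes:
% \def\heading#1{\centerline{\it #1}}
```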

Structure vs Presentation
Pretty soon authors creating complex documents found themselves maintaining large libraries of macro definitions and never using raw formatting directives in the documents at all. UNIX man pages are a good example: they're defined using the "man" macros for the nroff text formatter, making it easy to create manual pages with a consistent appearance.

During the '70s and '80s it became clear that the best way to use markup was by formalizing this approach: create a set of directives for describing the structure of the document as sections, subsections, bulleted items and so on, then separately define how to format those structural elements on paper. By keeping these two kinds of definitions (of structure and presentation) separate, altering the formatting of the documents or even reusing the content in new documents could be a completely mechanical process. Furthermore, automatic tools can process the documents to do jobs like building a contents page by listing all the headings. If your job is maintaining the many tons of paper documentation for (say) a commercial airliner, representing the logical structure of the document in this way is no small advantage since it allows the same source documents to be used to deliver information in a number of different formats. Again, UNIX man pages are a good example; when the manuals are printed on a high-resolution printer, using the same source text with a different library of (troff) macro definitions automatically creates book-quality manual pages rather than the screen-formatted pages generated from the same sources by nroff.

SGML was designed by ISO (the International Organization for Standardization) as a new standardized markup language that enshrined this separation of structure and presentation.

To apply SGML one creates a Document Type Definition (DTD) that defines the set of valid tags for the documents being created, and uses DSSSL (the ISO-standardized Document Style Semantics and Specification Language that accompanies SGML) to define how to display text labeled with those tags. Between them the DTD and DSSSL definitions fill the same role as the macro library in older markup languages.
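For instance, a minimal DTD for a simple report format might look like the following hypothetical fragment; note that it says nothing about appearance, only which elements exist and how they nest:

```xml
<!-- A hypothetical SGML/XML DTD fragment: a report is a title
     followed by one or more sections, each holding a title and
     at least one paragraph. -->
<!ELEMENT report  (title, section+)>
<!ELEMENT section (title, para+)>
<!ELEMENT title   (#PCDATA)>
<!ELEMENT para    (#PCDATA)>
```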

SGML has achieved limited success in large organizations that maintain very large documentation sets, but the SGML standard alone is over 500 pages, and the accompanying DSSSL (rhymes with "whistle") standard is also rather large and uses a syntax based on the Scheme programming language, which some people find hard to learn. Many users lack the will or resources to climb the SGML learning curve.

Meanwhile, at CERN in Geneva, Tim Berners-Lee was creating a simple SGML DTD to define a few document structure tags like "heading" and "numbered list" for defining the structure of documentation to be shared between nuclear physicists over computer networks. This simple application of SGML, called HTML, didn't have any accompanying way of defining the appearance of documents; that was provided by settings in the Web browser used to display the HTML document. The original HTML specification was simply a conforming SGML DTD describing the syntax of HTML documents, with the added wrinkle that one of the tags defined a way to hyperlink to another HTML document.

HTML, of course, has been much more widely used than SGML, but as its use spread, two problems became apparent. The first was that HTML defined only the structure of Web page elements, with no associated way of specifying their presentation, so the Web page designers had no way of controlling exactly how their creations looked.

As Web pages became more sophisticated, with more graphic content, this became a serious problem, and ad hoc extensions were added to HTML to allow direct control of presentation by specifying fonts, font sizes, text colors and so on (which, of course, completely violates the original SGML design principles). At the same time, because HTML had a single, fixed DTD, document designers had no way to create new structure tags to represent document structure in particular HTML applications. With neither an extension mechanism (like macros) nor a way of defining and controlling presentation, the original HTML fell neatly between two stools, and short-term product development pressures inevitably pushed it toward being a presentation markup language that gives the Web page designer detailed control over how a document appears, rather than representing its logical structure. While this deals effectively with the primary purpose of Web pages, which is to be viewed by people using Web browsers, the growing size and ubiquity of the Web created increasing demand for Web pages that can be manipulated by Web-scanning "robots" such as the search engines that "read" and catalog millions of Web pages daily. It became clear that the lack of structured encoding threatened to slow the development of the Web.

Enter XML
One solution to the problem of HTML's lack of structure would simply have been to step up one level and use SGML and DSSSL directly on the Web. However, the complexity of the ISO standards militated against this; something simpler was needed. In mid-1996 Jon Bosak, an influential member of the SGML community, persuaded W3C to set up an SGML Editorial Review Board and Working Group to define a simplified, extensible subset of SGML designed for the Web. The final XML 1.0 specification was published by W3C in February 1998, and will be complemented by two further specifications currently being prepared: XLL (the eXtensible Linking Language, for defining how XML documents are linked together) and XSL (the eXtensible Style Language, for defining how XML markup is formatted for display).

What Should XML Be Used For?
XML is being enthusiastically embraced in many application domains because a lot of applications need to store data intended for human use that will also be useful to manipulate by machine. One example might be storing and displaying mailing list information. Defining and using an XML DTD for storing address data makes it comparatively easy to write applications to (say) generate address labels without inadvertently printing the phone number in the postcode field. There are a large number of initiatives to replace home-grown markup formats with applications of XML: examples include the Bioinformatic Sequence Markup Language (BSML), the Weather Observation Markup Format (OMF), the Extensible Log Format (XLF, a markup format for logging information generated by Web servers) and others for legal documents, real estate information and many more domains. In each case the working group simply needs to define a DTD specifying the tags and how they can legally be combined. These DTDs can then be used with XML parsers and other XML tools to rapidly create applications to process and display the stored information in whatever way is required. Of course, there are still standardization issues to be addressed, such as who controls the libraries of tag definitions, how to manage version control in those libraries, and how to manage using multiple libraries simultaneously (especially when tag names collide). Nevertheless, using XML for these applications is a lot simpler than creating a completely new markup language from scratch every time, with far more scope for reusing the work of others.
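A hypothetical DTD and document for the mailing-list example might look like this; the element names and the data are invented for illustration:

```xml
<!-- The DTD fixes the fields and their order, so a label-printing
     application cannot mistake the phone number for the postcode. -->
<!DOCTYPE entry [
  <!ELEMENT entry (name, street, city, postcode, phone?)>
  <!ELEMENT name     (#PCDATA)>
  <!ELEMENT street   (#PCDATA)>
  <!ELEMENT city     (#PCDATA)>
  <!ELEMENT postcode (#PCDATA)>
  <!ELEMENT phone    (#PCDATA)>
]>
<entry>
  <name>J. Smith</name>
  <street>1 High Street</street>
  <city>Cambridge</city>
  <postcode>CB2 1TN</postcode>
  <phone>01223 000000</phone>
</entry>
```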

One important point to note is that nowhere in an XML DTD is there a way of specifying what an XML tag "means," just where it can be positioned in relation to other tags, and (using XSL) how to format it on a display.

Creators of XML DTDs naturally choose short descriptive names for their tags just as PC users usually choose short descriptive names for their files, so it's very appealing to think that XML files are "self-describing," because to an English speaker it's intuitive that an <address> tag labels an address or a <date-of-birth> tag labels a person's birthday. However, this is just the intuitive "meaning" we assign to the terms by assuming that the creator of the DTD used these words in the way we would expect; if the creator of the DTD had instead specified his tags in a foreign language or using some private code, we'd be none the wiser. XML files are in fact just as "self-describing" as a C program or a database schema.

What Shouldn't XML Be Used For?
The common thread in XML applications is that the document content is intended to be read by people. Because XML is intended for marking up human-readable, textual data, it is by the same token a rather inefficient way of storing information that only needs to be machine-readable. The embedded XML tags provide a way to extract or format particular parts of the content, but the content itself won't usually be interpreted by computers, only by the ultimate human user, which is why it makes sense to store it in human-readable form. Of course, it's perfectly possible to write parsers to read in (say) formatted floating-point numbers from an XML file so they can be processed, but it's relatively time-consuming, and the XML file is considerably larger than one written in native floating-point format.
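A small sketch in Python (not part of the original article's toolset; the XML layout and element names are invented) makes the overhead concrete, storing the same three numbers as tagged text and as native IEEE-754 doubles:

```python
import struct
import xml.etree.ElementTree as ET

# Three readings to exchange between programs.
values = [3.14159265358979, 2.71828182845905, 1.41421356237310]

# XML encoding: human-readable, but every number is carried inside
# tags as decimal text.
root = ET.Element("readings")
for v in values:
    ET.SubElement(root, "value").text = repr(v)
xml_bytes = ET.tostring(root)

# Native encoding: three little-endian doubles, 8 bytes each.
binary_bytes = struct.pack("<3d", *values)

# Reading the XML back requires parsing the text into floats again.
decoded = [float(e.text) for e in ET.fromstring(xml_bytes)]

print(len(xml_bytes), len(binary_bytes))  # the XML is several times larger
print(decoded == values)
```

The text form must be parsed back into numbers before it can be used in arithmetic, whereas the binary form can be unmarshalled directly, which is essentially what a middleware wire format does.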

When the requirement is to exchange data between cooperating computer applications, there are other, more efficient ways of defining and storing the data. Traditionally these definitions of data formats for machine communication are called Interface Definition Languages (IDLs) because they're used for defining the interfaces between cooperating computer applications. In contrast to markup, which is used for the long-term storage of human-readable data, IDLs define the smaller packets of transient, machine-readable data that is exchanged between the components of a distributed application when some particular event occurs.
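A hypothetical CORBA IDL fragment shows the flavor: it defines only the typed data and operations exchanged between client and server, with no markup and no presentation (the Trading module and all its names are invented for illustration):

```idl
// The struct and the operation signature define exactly what data
// crosses the wire when a client asks a remote server for a quote.
module Trading {
  struct Quote {
    string symbol;
    double price;
    long   volume;
  };

  interface QuoteService {
    Quote get_quote(in string symbol);
  };
};
```

An IDL compiler turns such a definition into client stubs and server skeletons in the implementation language, so the applications never see the wire encoding at all.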

IDLs are the most visible components of a class of software known as "middleware": software that is neither part of an operating system nor an application, but is used to link the various parts of a distributed application spread across geographically separated computers. By their very nature, successful middleware solutions blend into the background, making few impositions on the users, designers and programmers of a distributed system. Today's most widely used middleware packages all implement the CORBA (Common Object Request Broker Architecture) specification, published by the OMG (Object Management Group).

Although IDL is the most visible aspect of middleware, there's much more to it than that: middleware solutions like CORBA also provide security to authenticate users and control access to resources, error handling to deal gracefully with the failures inevitable in a distributed computing system, and a host of other support functions to keep computer networks running smoothly. In these sorts of distributed computing applications the data are transient, transferred between computers, often not permanently stored anywhere and probably never seen by human eyes. Using XML as the data encoding in such applications is less efficient than the compact, native machine representations used to marshal data in (for instance) the IIOP wire format used by CORBA implementations. Of course, if the requirement is to store data for the long term and extract human-readable summaries and reports, then XML would be the more appropriate medium, but for the data exchanges that tie together the components of a distributed system, using XML would be expensive and pointless.

XML and middleware are complementary technologies. XML is intended for the storage and manipulation of text making up human-readable documents like Web pages, while middleware solutions like CORBA tie together cooperating computer applications exchanging transient data that will probably never be directly read by anyone. Neither of these technologies will replace the other. Instead, they will increasingly be used together, not least in the specifications published by OMG, the body responsible for the CORBA specification.

More Stories By Andrew Watson

Andrew Watson, OMG's VP and technical director, also chairs OMG's Architecture Board, which oversees the technical consistency of all OMG's specifications. Previously he chaired OMG's ORB task force, which was responsible for the development and deployment of the CORBA 2 specification. Before that he spent six years with the ANSA core team in Cambridge (UK) researching distributed object architectures, specializing in distributed object type systems.
