Welcome!

Java IoT Authors: Kalyan Ramanathan, Liz McMillan, Elizabeth White, Sematext Blog, Pat Romanski

Related Topics: @CloudExpo

@CloudExpo: Blog Post

Cousins of Cobol in Big Data Analytics

How DFSORT, REXX Support Big Data Analytics

In this  article  I would  like to look at a few tools which are overlooked when it comes to Big Data analytics. Organizations that  have  already  heavy investment  on Mainframe  and  would like to continue  with the utilization of Mainframe can consider these  tools for further  expanding their Big Data Analytics reach.

DFSORT-  Sorting & Merging Large Data Sets :

  • Much before RDBMS have taken their place, Cobol programs have 2 major file manipulation operations namely:
  • SORT operation accepts un-sequenced input and produces output in specified sequence
  • The Merge operation compares records from two or more files and combines them in order
  • DFSORT adds the ability to do faster and easier sorting, merging, copying, reporting and analysis of your business information, as well as versatile data handling at the record, fixed position/length or variable position/length field, and bit level.
  • DFSORT is designed to optimize the efficiency and speed with which operations are completed through synergy with processor, device, and system features
  • A Cobol program will typically act as a intermediary in handling the FILE inputs and passing them to DFSORT
  • After all the input records have been passed to DFSORT, the sorting operation is executed. This operation arranges the entire set of records in the sequence specified by keys.
  • Much like a SORT , MERGE statement is also called from a COBOL job
  • The MERGE statement execution begins the MERGE processing. This operation compares keys with the records of the input files, and passes the sequenced records to create a MERGED output file
  • As per the documentation from the vendor , there is no maximum number of keys which can support the needs for Big Data Analytics processing
  • Some of the advanced options of DFSORT also facilitates parallel sort processing which goes well with needs of Big Data Analytics
  • With the work loads of Big Data Analytical jobs can span multiple physical and virtual servers including mainframe, it is good to see that DFSORT has the option to sort records either in EBCDIC or ASCII or another collating sequence. This can result in uniformity of massively parallel sorting jobs if they run on heterogeneous systems
  • The Job Control Language (JCL), which gives Hadoop like management of large file processing jobs in Mainframe have good features to specify multiple input and output file options for SORT and MERGE jobs
  • As evident this article does not aim as a tutorial for DFSORT and various performance features can be looked from Mainframe manuals or can ask Mainframe Gurus in your organization.

REXX :

  • REXX (Restructured eXtended eXecutor) is another programming language that is used in the same eco system of Cobol and DFSORT and can considerably contribute to the Big Data Analytical needs of the enterprises
  • REXX has advantages in string manipulation, Dynamic data typing, Storage Management and is generally considered to be very reliable and robust
  • One of the most important strengths of REXX that is of relevance to Bigdata Analytics is its ‘'character string" handling ability.
  • There are some useful string manipulation functions like COPIES (), WORDS(), STRIP(), TRANSLATE(), which can go a long way in the Map Reduce functionality needs of typical big data analytical jobs
  • PARSE instruction is also used frequently in REXX programs. It is able to take strings from a number of sources and break them apart into constituent parts using a fairly natural notation
  • Probably PARSE could be one of the highly useful feature of REXX in its positioning as a Big Data Analytical tool
  • The REXX parse statement divides a source string into constituent parts and assigns these to symbols as directed by the governing parsing template
  • REXX, DFSORT and Cobol programs can be inter operable such that we could call a REXX program from Cobol , and all these can be tied together with JCL
  • Again this note is meant as a tutorial for REXX and lot of good documentation is available on utilizing the String manipulation features of REXX.

Summary : There is  a strong  need for enterprises  to  adopt Big Data  Analytics  and start mining the  huge sets  of  unstructured data which has been ignored so far to arrive at meaningful business decisions.  While  newer  frameworks like Hadoop  or  the new breed of  analytical databases are going to satisfy  this need,  however   enterprises  should not be spending their time on picking up the tools and languages when it comes to Big Data Analytics.

If there is a significant  investment  and organization direction is to use the legacy  platforms like Cobol, JCL, REXX, DFSORT  it is only prudent  to utilize best  of their capabilities  in arriving  at options for Big Data Analytics.

We are seeing   that  Big Data Analytics  is mainly dependent on Map / Reduce algorithms,  these  functions are aimed  at  crunching  large data sets, like reading the input files  and  create key/value pair   and map functions take these  key/value pairs  and generates  another  key/value pair.  Further Reducer function  also depends on  sorted  key/value pairs  and iterate them and reduce the output further.

If we look at the way this logic works,  there is a  heavy need for sorting, merging, string  manipulation and parsing all the way. Hence  the tools mentioned  above like DFSORT,  REXX  along with Cobol  will likely to satisfy  the Big Data needs  of large enterprises  if  they  have already invested  on Mainframe compute capacity.

 

More Stories By Srinivasan Sundara Rajan

Highly passionate about utilizing Digital Technologies to enable next generation enterprise. Believes in enterprise transformation through the Natives (Cloud Native & Mobile Native).

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@ThingsExpo Stories
Today we can collect lots and lots of performance data. We build beautiful dashboards and even have fancy query languages to access and transform the data. Still performance data is a secret language only a couple of people understand. The more business becomes digital the more stakeholders are interested in this data including how it relates to business. Some of these people have never used a monitoring tool before. They have a question on their mind like “How is my application doing” but no id...
@GonzalezCarmen has been ranked the Number One Influencer and @ThingsExpo has been named the Number One Brand in the “M2M 2016: Top 100 Influencers and Brands” by Onalytica. Onalytica analyzed tweets over the last 6 months mentioning the keywords M2M OR “Machine to Machine.” They then identified the top 100 most influential brands and individuals leading the discussion on Twitter.
As data explodes in quantity, importance and from new sources, the need for managing and protecting data residing across physical, virtual, and cloud environments grow with it. Managing data includes protecting it, indexing and classifying it for true, long-term management, compliance and E-Discovery. Commvault can ensure this with a single pane of glass solution – whether in a private cloud, a Service Provider delivered public cloud or a hybrid cloud environment – across the heterogeneous enter...
"IoT is going to be a huge industry with a lot of value for end users, for industries, for consumers, for manufacturers. How can we use cloud to effectively manage IoT applications," stated Ian Khan, Innovation & Marketing Manager at Solgeniakhela, in this SYS-CON.tv interview at @ThingsExpo, held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA.
"We're a cybersecurity firm that specializes in engineering security solutions both at the software and hardware level. Security cannot be an after-the-fact afterthought, which is what it's become," stated Richard Blech, Chief Executive Officer at Secure Channels, in this SYS-CON.tv interview at @ThingsExpo, held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA.
Information technology is an industry that has always experienced change, and the dramatic change sweeping across the industry today could not be truthfully described as the first time we've seen such widespread change impacting customer investments. However, the rate of the change, and the potential outcomes from today's digital transformation has the distinct potential to separate the industry into two camps: Organizations that see the change coming, embrace it, and successful leverage it; and...
Data is the fuel that drives the machine learning algorithmic engines and ultimately provides the business value. In his session at Cloud Expo, Ed Featherston, a director and senior enterprise architect at Collaborative Consulting, discussed the key considerations around quality, volume, timeliness, and pedigree that must be dealt with in order to properly fuel that engine.
We are always online. We access our data, our finances, work, and various services on the Internet. But we live in a congested world of information in which the roads were built two decades ago. The quest for better, faster Internet routing has been around for a decade, but nobody solved this problem. We’ve seen band-aid approaches like CDNs that attack a niche's slice of static content part of the Internet, but that’s it. It does not address the dynamic services-based Internet of today. It does...
Internet of @ThingsExpo, taking place June 6-8, 2017 at the Javits Center in New York City, New York, is co-located with the 20th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. @ThingsExpo New York Call for Papers is now open.
What happens when the different parts of a vehicle become smarter than the vehicle itself? As we move toward the era of smart everything, hundreds of entities in a vehicle that communicate with each other, the vehicle and external systems create a need for identity orchestration so that all entities work as a conglomerate. Much like an orchestra without a conductor, without the ability to secure, control, and connect the link between a vehicle’s head unit, devices, and systems and to manage the ...
20th Cloud Expo, taking place June 6-8, 2017, at the Javits Center in New York City, NY, will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy.
Everyone knows that truly innovative companies learn as they go along, pushing boundaries in response to market changes and demands. What's more of a mystery is how to balance innovation on a fresh platform built from scratch with the legacy tech stack, product suite and customers that continue to serve as the business' foundation. In his General Session at 19th Cloud Expo, Michael Chambliss, Head of Engineering at ReadyTalk, discussed why and how ReadyTalk diverted from healthy revenue and mor...
The 20th International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held June 6-8, 2017, at the Javits Center in New York City, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Containers, Microservices and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportunity. Submit your speaking proposal ...
You have great SaaS business app ideas. You want to turn your idea quickly into a functional and engaging proof of concept. You need to be able to modify it to meet customers' needs, and you need to deliver a complete and secure SaaS application. How could you achieve all the above and yet avoid unforeseen IT requirements that add unnecessary cost and complexity? You also want your app to be responsive in any device at any time. In his session at 19th Cloud Expo, Mark Allen, General Manager of...
The Internet of Things (IoT) promises to simplify and streamline our lives by automating routine tasks that distract us from our goals. This promise is based on the ubiquitous deployment of smart, connected devices that link everything from industrial control systems to automobiles to refrigerators. Unfortunately, comparatively few of the devices currently deployed have been developed with an eye toward security, and as the DDoS attacks of late October 2016 have demonstrated, this oversight can ...
"ReadyTalk is an audio and web video conferencing provider. We've really come to embrace WebRTC as the platform for our future of technology," explained Dan Cunningham, CTO of ReadyTalk, in this SYS-CON.tv interview at WebRTC Summit at 19th Cloud Expo, held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA.
Bert Loomis was a visionary. This general session will highlight how Bert Loomis and people like him inspire us to build great things with small inventions. In their general session at 19th Cloud Expo, Harold Hannon, Architect at IBM Bluemix, and Michael O'Neill, Strategic Business Development at Nvidia, discussed the accelerating pace of AI development and how IBM Cloud and NVIDIA are partnering to bring AI capabilities to "every day," on-demand. They also reviewed two "free infrastructure" pr...
Major trends and emerging technologies – from virtual reality and IoT, to Big Data and algorithms – are helping organizations innovate in the digital era. However, to create real business value, IT must think beyond the ‘what’ of digital transformation to the ‘how’ to harness emerging trends, innovation and disruption. Architecture is the key that underpins and ties all these efforts together. In the digital age, it’s important to invest in architecture, extend the enterprise footprint to the cl...
"Dice has been around for the last 20 years. We have been helping tech professionals find new jobs and career opportunities," explained Manish Dixit, VP of Product and Engineering at Dice, in this SYS-CON.tv interview at 19th Cloud Expo, held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA.
More and more brands have jumped on the IoT bandwagon. We have an excess of wearables – activity trackers, smartwatches, smart glasses and sneakers, and more that track seemingly endless datapoints. However, most consumers have no idea what “IoT” means. Creating more wearables that track data shouldn't be the aim of brands; delivering meaningful, tangible relevance to their users should be. We're in a period in which the IoT pendulum is still swinging. Initially, it swung toward "smart for smar...