
An Introduction To Workflow and Workflow Management Systems

The Java 2 Platform, Enterprise Edition (J2EE), especially its Enterprise JavaBeans technology, provides an industry standard for the development of distributed enterprise applications. EJB helps solve a major problem: providing distributed access to persistent data. But it doesn't solve a related problem: modeling the business processes that applications use to access and manipulate that data.

Workflow helps solve the problem of modeling and implementing business processes within enterprise applications. It's just as important a part of enterprise computing as data persistence and distribution. EJB models behavior at the object level and limited interactions with any one client. Workflow models behavior across objects, applications and even systems, coordinating multiple clients while externalizing the processes from the code so they're easier to understand, change and manage.

In this article I'll provide an introduction to workflow concepts and how they relate to developing J2EE-style systems. I'll adhere as much as possible to the concepts described by the Workflow Management Coalition (WfMC) and use terms defined in their glossary. However, because workflow standards are still being developed and many workflow concepts are as much opinion as fact, I'll also describe workflow in terms of the Java-based workflow automation software I use - the Verve process engine. In describing workflow I'll explain how it relates to other major parts of your system, the components of workflow and two major styles for using workflow. With this information you'll be prepared to evaluate how you should use workflow as part of your J2EE systems.

What Is Workflow?
The WfMC describes workflow as the automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action according to a set of procedural rules. As a simple example, let's think about the basics of processing an insurance claim. Here's the gist of how the claims process works:

  1. Fill out the claim form.
  2. Approve or deny the claim.
  3. If approved, send the insured a check.
  4. If denied, send the insured a rejection letter.
This simple example could easily be enhanced to handle improperly filled-out forms, fraudulent claims and so forth. The activity diagram for this workflow is shown in Figure 1.

This workflow is a process - in this case one for processing an insurance claim. It contains four activities, each a task to be performed, each potentially performed by a different person. The workflow manages a set of information passed between the tasks: namely, the claim form. Besides the activities, the workflow also contains a decision point that splits the workflow into two alternative branches - approved and denied - and decides which branch to follow. Notice that it specifies what order the activities should be performed in, but not what work should be done during each activity. Workflow is more concerned with linking the activities together than with what work any particular activity does.
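The process structure described above can be sketched as a simple data structure. The types below are hypothetical, invented for illustration - a real WfMS such as the Verve process engine supplies its own process-definition model:

```java
import java.util.List;

// Illustrative sketch of the claim process as a data structure.
// All types here are hypothetical; a real WfMS supplies its own model.
public class ClaimProcessSketch {
    // An activity is a named unit of work assigned to an organizational role.
    record Activity(String name, String role) {}

    static final List<Activity> ACTIVITIES = List.of(
        new Activity("Fill out claim form", "Insured"),
        new Activity("Approve or deny claim", "Adjuster"),
        new Activity("Send check", "ClaimsClerk"),
        new Activity("Send rejection letter", "ClaimsClerk"));

    // The decision point routes on the claim's approved flag; the workflow
    // links activities together but says nothing about how their work is done.
    static String branchFor(boolean approved) {
        return approved ? "Send check" : "Send rejection letter";
    }

    public static void main(String[] args) {
        System.out.println(ACTIVITIES.size() + " activities; approved claims go to: "
            + branchFor(true));
    }
}
```

Note that the definition names the activities and the routing rule but contains none of the work itself - that separation is the point of workflow.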

The first function we've performed here is to model the business process. That may seem like a trivial accomplishment for this simple example, but the requirements-gathering and modeling can be the most difficult part of a workflow effort. The second function is a question of how to enact the workflow. Before computers, workflows were enacted by passing a file folder of papers from one person's desk to another. Today we use computer systems to store the documents electronically, but we still need a way to manage the process of passing the document from one person's computer to the next.

Need for Tool Support
The WfMC defines a workflow management system (WfMS) as a system that defines, creates and manages the execution of workflows through the use of software, running on one or more workflow engines, and is then able to interpret the process definition, interact with workflow participants and, where required, invoke the use of IT tools and applications. A WfMS manages a workflow in two ways:

  1. Process definition
  2. Process enactment
Process definition is the act of "coding" the workflow, defining it in such a way as to describe what it will do when it runs. Depending on the WfMS, defining can be implemented through a declarative coding model or through a visual programming model (which nonprogrammers find easier to use). Either way, the resulting workflow model is a data structure that can be stored in XML or any other data format of choice. Process enactment is the act of running a process definition, much the way bytecodes are run by the virtual machine. I'll discuss the details of enactment shortly.

Workflow is best handled by embedding a separate WfMS tool within your application. The question of why your application needs a separate workflow tool is similar to asking why your application needs a separate database management system. Back when applications ran on mainframe computers and didn't share data, each application contained its own code to manage its data. But with the need to share data between applications, distribute it across networks and manage overhead issues like concurrency and security, DBMSs evolved. They help alleviate the need for applications to manage their own data. Similarly, workflow management systems help alleviate the need for applications to manage their business processes. The application can then delegate its business processes to the workflow engine, allowing the application to focus on using the business processes rather than on implementing them.

Role in Application
A workflow management system is neither a database management system nor an application server, although the three are frequently used together. Figure 2 shows how a WfMS fits into a typical system architecture.

The application server manages running applications and provides clients access to those applications through well-defined APIs. The server doesn't define the application, but it does store, execute and provide access to it. The problem with an application server is that it doesn't know how to coordinate a workflow. It allows a single client to access an application and coordinates several clients accessing an application, each within its own session. Workflow cuts across these sessions, specifying a series of such sessions - requiring that when one session ends, others must be scheduled to begin - and performing work that occurs outside the context of any client session.

EJB wasn't designed to provide workflow functionality. Entity beans are persistent and transactional, but they manage only domain object behavior, not session state or process. Session beans manage small bits of process, but only for a single client/server session (usually in a single thread), and provide poor support for transactions and persistence. If the application crashes or is shut down, entity beans preserve their state, but session beans don't remember what they were doing when they stopped. A session bean isn't designed to coordinate a series of transactions that guides multiple clients through a lengthy process during which the system could crash and restart.

A DBMS provides transactions and persistence (which supports the container's entity bean implementation), but it doesn't have a good way to tie together a common series of transactions as a workflow. Applications frequently need to perform a series of transactions, one after another, to implement a process. Because the DBMS doesn't help manage this series of transactions, the responsibility falls on the application. But because the application's session management offers poor support for persistence and transactions, the application is ill suited to remembering which transactions have been completed so far and which still need to be run. Furthermore, application code that manages such transactions tends to be difficult to understand and maintain, so the processes it implements become buried and lost.

This is where a workflow management system comes in. It should be persistent and transactional (often by being implemented on top of a database management system, or perhaps as an EJB application), simplify modeling processes separately from the code that implements them, and make certain that when each transaction in a process completes, the next transaction begins. This frees the application from these concerns and allows it to concentrate on modeling the domain that the DBMS stores and that the WfMS manipulates.

Workflow Enactment
Previously I mentioned that a process is defined, then enacted. Enactment is a little more complicated than simply running the workflow. Each separate enactment is represented by a work item. The WfMC defines a work item as a representation of the work to be processed (by a workflow participant) in the context of an activity within a process instance. When a process is enacted, the WfMS creates a work item to represent that particular enactment of that particular process. In this way, when a process is run multiple times (by multiple users or repeatedly by a single user), each separate run is represented by a work item.

When a WfMS enacts a process, it enacts each of the process's activities in the order defined by the process. Just as enacting a process creates a work item to represent that enactment, enacting an activity creates a work item to represent the execution of that activity. If a particular activity is enacted several times, such as in a process loop, each enactment creates a separate work item.
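A rough sketch of this enactment behavior, using hypothetical types rather than any real WfMS API:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of enactment: each run of an activity is represented
// by its own work item. Types and names here are illustrative only.
public class EnactmentSketch {
    record WorkItem(int id, String activity) {}

    private final List<WorkItem> workItems = new ArrayList<>();
    private int nextId = 0;

    // Enacting an activity creates a new work item, even for repeat
    // enactments of the same activity (e.g., inside a process loop).
    WorkItem enact(String activity) {
        WorkItem item = new WorkItem(++nextId, activity);
        workItems.add(item);
        return item;
    }

    List<WorkItem> workItems() { return List.copyOf(workItems); }

    public static void main(String[] args) {
        EnactmentSketch engine = new EnactmentSketch();
        engine.enact("Approve or deny claim");
        engine.enact("Approve or deny claim"); // second enactment, distinct work item
        System.out.println(engine.workItems().size());
    }
}
```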

An activity can be automated or manual. The work item for an automated activity is managed automatically by the workflow management system. For example, when a process work item is created, the WfMS automatically manages the work item by enacting the process's activities. The work item for a manual activity must be managed by an entity external to the workflow management system. Such an entity is usually a person - a user of the application - which the WfMC calls a participant (something I'll discuss later under Organizational Knowledge).

Manual work items are queued up on worklists. Each participant and organizational role (discussed below) has its own worklist. The items on a particular worklist represent the work that is available for the worklist owner to perform. If a worklist becomes too large, this indicates that the workflows are producing work requests faster than the worklist owner can perform them. A worklist is associated with an owner, not a workflow, so a single worklist often gets work items added to it by several different workflows.
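The ownership rule can be sketched as follows; the types and method names are illustrative, not taken from any real worklist API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: worklists are keyed by owner (a participant or role), not by
// workflow, so different workflows queue items onto the same list.
public class WorklistSketch {
    private final Map<String, List<String>> worklists = new HashMap<>();

    void queue(String owner, String workItem) {
        worklists.computeIfAbsent(owner, k -> new ArrayList<>()).add(workItem);
    }

    List<String> worklistFor(String owner) {
        return worklists.getOrDefault(owner, List.of());
    }

    public static void main(String[] args) {
        WorklistSketch wfms = new WorklistSketch();
        // Two different workflows queue work for the same role.
        wfms.queue("Adjuster", "ClaimWorkflow: approve or deny claim 42");
        wfms.queue("Adjuster", "FraudReviewWorkflow: review report 7");
        System.out.println(wfms.worklistFor("Adjuster").size());
    }
}
```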

Workflow Components
Now that we know what a workflow is, how does it fit into a WfMS? And how does it interface with the rest of the application? Workflow consists of three parts that work together:

  1. Organizational knowledge
  2. Domain knowledge
  3. Process knowledge
The relationship of these three parts is shown in Figure 3.

Organizational Knowledge
Organizational knowledge is the set of users of the system and the groups they're in. Why is this important to the workflow? Because the activities of a workflow are performed by people, the workflow management system needs to know which people are allowed to perform which activities. Each manual activity is assigned to an organizational role - a description of the person within the organization who should perform this work. The set of people within a role is defined by users' permissions, their interests and responsibilities within the organization, and the intent of the workflow developer. If an activity should be performed by a particular person, the role will describe just that person. The profiles we might define for the insurance claim example are shown in Table 1.

Each workflow user is represented as a participant - someone (or something) capable of performing work defined by an activity. Which activities a participant can perform depends on which roles the participant is a member of. If the participant is a member of a particular role, and an activity is assigned to that role, then - when that activity is enacted and a work item is created - the participant is allowed to perform that work item. If the participant weren't a member of that role, he or she wouldn't be allowed to perform the work item.

How does the WfMS know who the participants and roles are, and how they fit together? The WfMS accesses this organizational knowledge through an organizational model adapter. The model contains the organization entities (participants and roles) and their relationships. It gets its data from an external database, usually the databases that the enterprise already uses to model its employees and other users, such as LDAP.
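A minimal sketch of such an adapter, assuming a hypothetical interface and an in-memory stand-in for the enterprise directory:

```java
import java.util.Map;
import java.util.Set;

// Sketch of an organizational model adapter. The WfMS asks the adapter which
// roles a participant fulfills; a real adapter would read an external store
// such as LDAP. The interface and data here are hypothetical.
public class OrgModelSketch {
    interface OrganizationalModelAdapter {
        Set<String> rolesFor(String participant);
    }

    // In-memory stand-in for the enterprise directory.
    static final OrganizationalModelAdapter ADAPTER = participant ->
        Map.of("pat",   Set.of("Adjuster"),
               "chris", Set.of("ClaimsClerk"))
           .getOrDefault(participant, Set.of());

    // A participant may perform a work item only if one of their roles
    // matches the role the activity was assigned to.
    static boolean mayPerform(String participant, String requiredRole) {
        return ADAPTER.rolesFor(participant).contains(requiredRole);
    }

    public static void main(String[] args) {
        System.out.println(mayPerform("pat", "Adjuster"));
    }
}
```

Because the adapter is an interface, the same workflow definitions keep working when the organization's directory changes - only the adapter implementation is swapped.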

Domain Knowledge
Domain knowledge is the business domain that the application models. In our example the domain is insurance - specifically, claims processing. The WfMC distinguishes between application data - domain data that the workflow management system never uses - and workflow relevant data (WfRD) - domain data that the WfMS must be able to access. Most domain data is application data, but what we're interested in here is (as its name implies) workflow-relevant data.

A workflow is surprisingly unaware of and uninterested in most of what's going on in the domain. This is because a particular activity isn't much concerned with what work it represents, only that whatever work it represents is done when it needs to be done. The WfMS tells the application to "do this work now" and the application does it; it's up to the application to decide what it means exactly to do the work. So while the application typically needs lots of domain data to perform its work, the workflows tend not to be interested. However, workflows are interested in some domain data, especially to help make decisions within the workflow.

Using our insurance example, after approving or denying the claim, the workflow then needs to decide whether to send a check or a rejection letter. How does it decide? The approval activity should have modified the claim object to set an approved flag. (It may also set an "evaluated by" field so we know which adjuster approved or denied the claim, but the workflow isn't interested in that.) The workflow will look at this flag on the claim to determine which activity to perform next. When the workflow accesses the claim as workflow-relevant data, it will typically ignore the multitude of fields and relationships having to do with who submitted the claim, the specifics of the claim and so forth. The WfRD will look at the claim as nothing more than a big approval flag container. The workflow accesses its workflow-relevant data through a workflow-relevant data adapter. The adapter gathers the data from within the domain and presents it to the workflow in a simple way that's just what the workflow needs. A single workflow may use several different adapters to access different sets of data, and multiple workflows that want the same data presented in the same way can share the same adapter code (although they probably won't be able to share the same adapter instances).
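A sketch of this narrowing, with all names hypothetical:

```java
// Sketch of a workflow-relevant data adapter: the application holds the full
// claim, while the workflow sees it as little more than an approval flag.
public class WfrdAdapterSketch {
    // Full domain object - mostly application data the workflow never touches.
    record Claim(String insured, double amount, boolean approved, String evaluatedBy) {}

    // The adapter narrows the claim to exactly what the decision point needs.
    interface ClaimApprovalAdapter {
        boolean isApproved(Claim claim);
    }

    static final ClaimApprovalAdapter ADAPTER = Claim::approved;

    static String nextActivity(Claim claim) {
        return ADAPTER.isApproved(claim) ? "Send check" : "Send rejection letter";
    }

    public static void main(String[] args) {
        Claim denied = new Claim("J. Doe", 1200.00, false, "pat");
        System.out.println(nextActivity(denied));
    }
}
```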

Process Knowledge
Process knowledge is the set of process definitions. These are the workflows that the system knows how to run. What's really interesting here isn't so much what process knowledge is, but what it isn't. Our insurance workflow example revolves around the insurance claim. The claim is created by the first activity, and is then used by all the subsequent activities. It's difficult to imagine an activity for this workflow that wouldn't somehow use the claim to do its work. Yet the workflow doesn't contain the claim. The claim object is domain knowledge, not process knowledge. The workflow does contain a reference to the claim, a key or handle that uniquely identifies the claim within the application. This way, for example, when an adjuster gets a work item to approve a claim, the application knows which claim to present to him or her for approval.

Similarly, when the workflow assigns the approval activity to the adjuster role, the workflow has no idea what the adjuster role really means or who within the organization is allowed to perform adjuster tasks. When the application asks the WfMS what work is available for a particular user, the WfMS runs that user's participant through the organizational model adapter(s) to determine what roles he or she fulfills. It then finds the worklists for those roles and tells the application that the work items on those lists are available for that user.

This separation of responsibilities can be confusing at first, but it's ultimately clean and powerful, and one of the strongest advantages of using a workflow management system. Although the WfMS has its fingers into lots of organizational and domain knowledge, it really separates the workflows from that knowledge. Then the workflows can focus on what work needs to be done and what sorts of people will do it, but workflows don't focus on the specific people who will do it or how they'll do it.

This separation allows the workflow designer to work fairly independently of the application designer and the LDAP administrator. It focuses the workflow designer on the work to be done and away from how it will be done. It allows enterprises to make major changes to their business processes while minimizing the impact on the applications that enable those processes.

Workflow Styles
In my work with designing workflows I've discovered two distinct approaches to using workflow:

  1. User-centric
  2. Automation-centric
A particular workflow can use either approach, or a combination of both. In practice, a workflow may be 100% user-centric, but it's rarely 100% automation-centric. Even the most automation-centric architecture still needs some small portion that's user-centric - as a last defense for error handling, if nothing else. In other words, it's difficult and not very desirable to automate everything.

User-Centric
This is the classic workflow approach. A person performs the work for each manual activity. The person sees the work item on a worklist and performs the work described by that work item. The person will typically interact with the application and the WfMS through a GUI (either in an AWT/Swing-style native window or through an HTML Web browser). These systems are relatively simple to build. The application developers have to implement the relevant GUIs for the users and add the code that lets the GUIs interact not only with the domain but also with the WfMS through its API. These GUIs then become a client part of the application.

Automation-Centric
This is the approach for automating workflow so that large amounts of work can be performed with a minimum of human intervention. The work for a manual activity is performed (whenever possible) by an automated system that's external to the WfMS and probably external to the application as well. The external systems interact with the WfMS through an API. For example, in our insurance claim example the "send check" and "send reject letter" activities could probably be automated by systems that print and mail the documents.

Systems that automate activities tend to be more complex than user-centric ones because of the difficulty in interfacing the WfMS to the external systems that do the work. The WfMS queues the work on worklists, just as it would for people. But whereas people have GUIs that give them access to those worklists, a typical out-of-the-box back-end system has no idea how to interface to worklists. It can use the same WfMS APIs that the GUIs use, but somebody has to write the code to tie together the WfMS APIs with the back-end system APIs.

This work can be simplified somewhat by using a messaging system, such as one that implements the Java Message Service API. This way, WfMS work requests can be queued as messages and the messaging system then has to worry about getting the back-end system to perform the messages. This also allows other systems besides the WfMS to make requests of the back-end systems in an asynchronous, persistent, transactional way.
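A sketch of the bridging idea follows; a real bridge would use the JMS API against a message broker, so in-memory deques stand in for both systems here:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of bridging WfMS work requests to a message queue. A real bridge
// would use JMS and must move each request transactionally so none are lost
// or duplicated; in-memory deques stand in for both systems here.
public class WorklistBridgeSketch {
    final Deque<String> worklist = new ArrayDeque<>();     // WfMS side
    final Deque<String> messageQueue = new ArrayDeque<>(); // messaging side

    // Move one work request from the worklist onto the message queue.
    // In production this hand-off is where a two-phase (distributed)
    // transaction between the WfMS and the messaging system belongs.
    void bridgeOne() {
        String request = worklist.poll();
        if (request != null) {
            messageQueue.add(request);
        }
    }

    public static void main(String[] args) {
        WorklistBridgeSketch bridge = new WorklistBridgeSketch();
        bridge.worklist.add("send-check:claim-42");
        bridge.bridgeOne();
        System.out.println(bridge.messageQueue.peek());
    }
}
```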

The issue is then interfacing the WfMS API to a messaging API. As this interface code moves work requests between the worklist queues and the messaging queues, it has to avoid losing or duplicating any of the requests. Ideally, moving the requests should be performed transactionally, which requires a two-phase (distributed) transaction between the WfMS and the messaging system. Many such systems (both workflow and messaging) don't support external distributed transactions at this time. Likewise, a work request will often produce results data that needs to be stored in the database before the request is considered complete. This involves a three-way distributed transaction between the messaging, database and workflow systems.

We've now seen the importance of workflow, the basic workflow concepts, and how workflow relates to the rest of our application. Workflow models business processes, something that applications can't do well and that databases can't do at all. A workflow management system separates the business processes from the applications that hook into them and manages the execution of those processes. It separates the process from the organization that performs the work and the domain in which the work is performed. Finally, a workflow management system prepares work to be performed by human users or automated systems, or a combination of both.

Much as database management systems did 10 or 20 years ago, workflow management systems have come into their own. They are rapidly becoming an indispensable part of an enterprise application architecture.

More Stories By Bobby Woolf

Bobby Woolf is a senior architect at GemStone Systems (www.gemstone.com), a Brokat company, and a member of their Professional Services division. He specializes in developing application architectures using various J2EE technologies and embeddable tools.

