An Introduction To Workflow and Workflow Management Systems

The Java 2 Platform, Enterprise Edition (J2EE), especially its Enterprise JavaBeans technology, provides an industry standard for the development of distributed enterprise applications. EJB helps solve a major problem: providing distributed access to persistent data. But it doesn't solve a related problem: modeling the business processes that applications use to access and manipulate that data.

Workflow helps solve the problem of modeling and implementing business processes within enterprise applications. It's just as important a part of enterprise computing as data persistence and distribution. EJB models behavior at the object level and limited interactions with any one client. Workflow models behavior across objects, applications and even systems, coordinating multiple clients while externalizing the processes from the code so they're easier to understand, change and manage.

In this article I'll provide an introduction to workflow concepts and how they relate to developing J2EE-style systems. I'll adhere as much as possible to the concepts described by the Workflow Management Coalition (WfMC) and use terms defined in their glossary. However, because workflow standards are still being developed and its concepts can be as much opinion as fact, I'll also describe workflow in terms of the Java-based workflow automation software I use - the Verve process engine. In describing workflow I'll explain how it relates to other major parts of your system, the components of workflow and two major styles for using workflow. With this information you'll be prepared to evaluate how you should use workflow as part of your J2EE systems.

What Is Workflow?
The WfMC describes workflow as the automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action according to a set of procedural rules. As a simple example, let's think about the basics of processing an insurance claim. Here's the gist of how the claims process works:

  1. Fill out the claim form.
  2. Approve or deny the claim.
  3. If approved, send the insured a check.
  4. If denied, send the insured a rejection letter.
This simple example could easily be enhanced to handle improperly filled-out forms, fraudulent claims and so forth. The activity diagram for this workflow is shown in Figure 1.
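To make the branch concrete, here's a minimal sketch in plain Java of the decision logic above. The class and method names are illustrative only; a real system would externalize this flow into a workflow engine rather than hard-coding it, which is exactly the point of this article.

```java
// Minimal sketch of the claims process above. Names are illustrative.
public class ClaimsProcess {
    // Step 1: the insured fills out the claim form.
    // Step 2: an adjuster approves or denies it.
    // Steps 3/4: the outcome decides which document goes out.
    static String process(boolean approved) {
        return approved ? "send check" : "send rejection letter";
    }

    public static void main(String[] args) {
        System.out.println(process(true));   // prints "send check"
        System.out.println(process(false));  // prints "send rejection letter"
    }
}
```

Hard-coding the flow like this buries the business process in application code; the rest of the article shows how a WfMS pulls it back out.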

This workflow is a process - in this case one for processing an insurance claim. It contains four activities, each a task to be performed, each potentially performed by a different person. The workflow manages a set of information passed between the tasks: namely, the claim form. Besides the activities, the workflow also contains a decision point that splits the workflow into two parallel branches - approved and denied - and decides which branch to follow. Notice that it specifies what order the activities should be performed in, but not what work should be done during each activity. Workflow is more concerned with linking the activities together than with what work any particular activity does.

The first function we've performed here is to model the business process. That may seem like a trivial accomplishment for this simple example, but the requirements-gathering and modeling can be the most difficult part of a workflow effort. The second function is a question of how to enact the workflow. Before computers, workflows were enacted by passing a file folder of papers from one person's desk to another. Today we use computer systems to store the documents electronically, but we still need a way to manage the process of passing the document from one person's computer to the next.

Need for Tool Support
The WfMC defines a workflow management system (WfMS) as a system that defines, creates and manages the execution of workflows through the use of software, running on one or more workflow engines, and is then able to interpret the process definition, interact with workflow participants and, where required, invoke the use of IT tools and applications. A WfMS manages a workflow in two ways:

  1. Process definition
  2. Process enactment
Process definition is the act of "coding" the workflow, defining it in such a way as to describe what it will do when it runs. Depending on the WfMS, a process can be defined through a declarative coding model or through a visual programming model (which nonprogrammers find easier to use). Either way, the resulting workflow model is a data structure that can be stored in XML or any other data format of choice. Process enactment is the act of running a process definition, much the way bytecodes are run by the virtual machine. I'll discuss the details of enactment shortly.
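Since the definition is just data, it can be serialized. As a hedged illustration, the claims process might be stored in XML roughly like this; the element and attribute names here are invented for the example, and real engines and the WfMC's interchange formats each have their own schemas.

```xml
<process id="insurance-claim">
  <activity id="fill-claim-form"       role="insured"/>
  <activity id="evaluate-claim"        role="adjuster"/>
  <decision id="approved" data="claim.approved">
    <branch value="true"  next="send-check"/>
    <branch value="false" next="send-rejection-letter"/>
  </decision>
  <activity id="send-check"            role="payment-clerk"/>
  <activity id="send-rejection-letter" role="correspondence-clerk"/>
  <transition from="fill-claim-form" to="evaluate-claim"/>
  <transition from="evaluate-claim"  to="approved"/>
</process>
```

Note that the definition says who does each step and in what order, but nothing about how the work is performed.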

Workflow is best handled by embedding a separate WfMS tool within your application. The question of why your application needs a separate workflow tool is similar to asking why your application needs a separate database management system. Back when applications ran on mainframe computers and didn't share data, each application contained its own code to manage its data. But with the need to share data between applications, distribute it across networks and manage overhead issues like concurrency and security, DBMSs evolved. They help alleviate the need for applications to manage their own data. Similarly, workflow management systems help alleviate the need for applications to manage their business processes. The application can then delegate its business processes to the workflow engine, allowing the application to focus on using the business processes rather than on implementing them.

Role in Application
A workflow management system is neither a database management system nor an application server, although the three are frequently used together. Figure 2 shows how a WfMS fits into a typical system architecture.

The application server manages running applications and provides clients access to those applications through understood APIs. The server doesn't define the application, but it does store, execute and provide access to it. The problem with an application server is that it doesn't know how to coordinate a workflow. It allows a single client to access an application and coordinates several clients accessing an application, each within its own session. Workflow cuts across these sessions, specifying a series of such sessions - requiring that when one session ends, others must be scheduled to begin, and performing work that occurs outside the context of any client sessions.

EJB wasn't designed to provide workflow functionality. Entity beans are persistent and transactional, but they don't manage the session state or process, only domain object behavior. Session beans manage small bits of process, but only for a single client/server session (usually in a single thread), and provide poor support for transactions and persistence. If the application crashes or is shut down, entity beans preserve the state, but the session beans don't remember what they were doing when they stopped. A session bean isn't designed to coordinate a series of transactions coordinating multiple clients through a lengthy process during which the system could crash and restart.

A DBMS provides transactions and persistence (which supports the container's entity bean implementation), but it doesn't have a good way to tie together a common series of transactions as a workflow. Applications frequently need to perform a series of transactions, one after another, to implement a process. Because the DBMS doesn't help manage this series of transactions, the responsibility falls on the application. But because the application session management has poor persistence and transactions, it's not well suited for remembering which transactions have been completed so far and which still need to be run. Furthermore, application code that manages such transactions tends to be difficult to understand and maintain, so the processes they implement become buried and lost.

This is where a workflow management system comes in. It should be persistent and transactional (often by being implemented on top of a database management system, or perhaps as an EJB application), simplify modeling processes separately from the code that implements them, and make certain that when each transaction in a process completes, the next transaction begins. This frees the application from these concerns and allows it to concentrate on modeling the domain that the DBMS stores and that the WfMS manipulates.

Workflow Enactment
Previously I mentioned that a process is defined, then enacted. Enactment is a little more complicated than simply running the workflow. Each separate enactment is represented by a work item. The WfMC defines a work item as a representation of the work to be processed (by a workflow participant) in the context of an activity within a process instance. When a process is enacted, the WfMS creates a work item to represent that particular enactment of that particular process. In this way, when a process is run multiple times (by multiple users or repeatedly by a single user), each separate run is represented by a work item.

When a WfMS enacts a process, it enacts each of the process's activities in the order defined by the process. Just as enacting a process creates a work item to represent that enactment, enacting an activity creates a work item to represent the execution of that activity. If a particular activity is enacted several times, such as in a process loop, each enactment creates a separate work item.

An activity can be automated or manual. The work item for an automated activity is managed automatically by the workflow management system. For example, when a process work item is created, the WfMS automatically manages the work item by enacting the process's activities. The work item for a manual activity must be managed by an entity external to the workflow management system. Such an entity is usually a person - a user of the application - which the WfMC calls a participant (something I'll discuss later under Organizational Knowledge).

Manual work items are queued up on worklists. Each participant and organizational role (discussed below) has its own worklist. The items on a particular worklist represent the work that is available for the worklist owner to perform. If a worklist becomes too large, this indicates that the workflows are producing work requests faster than the worklist owner can perform them. A worklist is associated with an owner, not a workflow, so a single worklist often gets work items added to it by several different workflows.
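The work-item and worklist mechanics above can be sketched in a few lines of Java. This is a toy model, not any engine's actual API: enacting a manual activity creates a work item and queues it on the worklist owned by the activity's role, and several workflows may feed the same list.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Illustrative sketch (names are hypothetical, not a real WfMS API).
public class Worklists {
    // One enactment of one activity within one process instance.
    record WorkItem(String activity, long processInstance) {}

    // Each worklist is owned by a participant or role, not by a workflow.
    private final Map<String, Queue<WorkItem>> byOwner = new HashMap<>();

    // Called by the engine when a manual activity is enacted.
    void enactManual(String activity, String role, long instanceId) {
        byOwner.computeIfAbsent(role, r -> new ArrayDeque<>())
               .add(new WorkItem(activity, instanceId));
    }

    // How much work is waiting for this owner.
    int pending(String owner) {
        Queue<WorkItem> q = byOwner.get(owner);
        return q == null ? 0 : q.size();
    }

    public static void main(String[] args) {
        Worklists wl = new Worklists();
        // Two different process instances feed the same adjuster worklist.
        wl.enactManual("evaluate-claim", "adjuster", 1);
        wl.enactManual("evaluate-claim", "adjuster", 2);
        System.out.println(wl.pending("adjuster")); // prints 2
    }
}
```

A growing `pending` count is the "worklist becoming too large" signal described above.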

Workflow Components
Now that we know what a workflow is, how does it fit into a WfMS? And how does it interface with the rest of the application? Workflow consists of three parts that work together:

  1. Organizational knowledge
  2. Domain knowledge
  3. Process knowledge
The relationship of these three parts is shown in Figure 3.

Organizational Knowledge
Organizational knowledge is the set of users of the system and the groups they're in. Why is this important to workflow? Because the activities of a workflow are performed by people, the workflow management system needs to know which people are allowed to perform which activities. Each manual activity is assigned to an organizational role - a description of the person within the organization who should perform this work. The set of people within a role is determined by the users' permissions, their interests and responsibilities within the organization, and the intent of the workflow developer. If an activity should be performed by a particular person, the role will describe just that person. The profiles we might define for the insurance claim example are shown in Table 1.

Each workflow user is represented as a participant - someone (or something) capable of performing work defined by an activity. Which activities a participant can perform depends on which roles the participant is a member of. If the participant is a member of a particular role, and an activity is assigned to that role, then - when that activity is enacted and a work item is created - the participant is allowed to perform that work item. If the participant weren't a member of that role, he or she wouldn't be allowed to perform the work item.

How does the WfMS know who the participants and roles are, and how they fit together? The WfMS accesses this organizational knowledge through an organizational model adapter. The model contains the organizational entities (participants and roles) and their relationships. It gets its data from an external source, usually one the enterprise already uses to model its employees and other users, such as an LDAP directory.
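An organizational model adapter can be sketched as follows. Everything here is hypothetical - the interface, the names, and the hard-coded role data standing in for an LDAP lookup - but it shows the two questions the WfMS asks: which roles does this participant fulfill, and which queued work items belong to those roles?

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of an organizational model adapter (hypothetical interface).
public class OrgModel {
    // In a real system this map would be backed by LDAP or an HR database.
    private final Map<String, Set<String>> rolesOf = Map.of(
        "alice", Set.of("adjuster"),
        "bob",   Set.of("payment-clerk"));

    // Which roles does this participant fulfill?
    Set<String> rolesFor(String participant) {
        return rolesOf.getOrDefault(participant, Set.of());
    }

    // Work items whose role matches one of the participant's roles.
    List<String> availableWork(String participant,
                               Map<String, String> itemToRole) {
        Set<String> roles = rolesFor(participant);
        return itemToRole.entrySet().stream()
            .filter(e -> roles.contains(e.getValue()))
            .map(Map.Entry::getKey)
            .toList();
    }
}
```

Given a work item "approve claim #7" assigned to the adjuster role, `availableWork` would offer it to alice but not to bob.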

Domain Knowledge
Domain knowledge is the business domain that the application models. In our example the domain is insurance - specifically, claims processing. The WfMC distinguishes between application data - domain data that the workflow management system never uses - and workflow relevant data (WfRD) - domain data that the WfMS must be able to access. Most domain data is application data, but what we're interested in here is (as its name implies) workflow-relevant data.

A workflow is surprisingly unaware of and uninterested in most of what's going on in the domain. This is because a particular activity isn't much concerned with what work it represents, only that whatever work it represents is done when it needs to be done. The WfMS tells the application to "do this work now" and the application does it; it's up to the application to decide what it means exactly to do the work. So while the application typically needs lots of domain data to perform its work, the workflows tend not to be interested. However, workflows are interested in some domain data, especially to help make decisions within the workflow.

Using our insurance example, after approving or denying the claim, the workflow needs to decide whether to send a check or a rejection letter. How does it decide? The approval activity should have modified the claim object to set an approved flag. (It may also set an "evaluated by" field so we know which adjuster approved or denied the claim, but the workflow isn't interested in that.) The workflow looks at this flag on the claim to determine which activity to perform next. When the workflow accesses the claim as workflow-relevant data, it typically ignores the multitude of fields and relationships having to do with who submitted the claim, the specifics of the claim and so forth. As WfRD, the claim is nothing more than a container for an approval flag.

The workflow accesses its workflow-relevant data through a workflow-relevant data adapter. The adapter gathers the data from within the domain and presents it to the workflow in a simple form that contains just what the workflow needs. A single workflow may use several different adapters to access different sets of data, and multiple workflows that want the same data presented in the same way can share the same adapter code (although they probably won't be able to share the same adapter instances).
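A workflow-relevant data adapter for the claim might look like the sketch below. The names are invented for illustration: the domain Claim carries many fields, but the adapter exposes only the one thing the workflow's decision point asks about.

```java
// Sketch of a workflow-relevant data adapter (hypothetical names).
public class ClaimAdapter {
    // Domain object: rich, but mostly invisible to the workflow.
    record Claim(String insured, double amount,
                 String evaluatedBy, boolean approved) {}

    private final Claim claim;

    ClaimAdapter(Claim claim) { this.claim = claim; }

    // The only question the workflow's decision point asks.
    boolean approved() { return claim.approved(); }

    // The decision point picks the next activity from the flag alone.
    String nextActivity() {
        return approved() ? "send-check" : "send-rejection-letter";
    }
}
```

The insured's name, the amount, and the evaluator stay hidden inside the domain; the workflow never sees them.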

Process Knowledge
Process knowledge is the set of process definitions. These are the workflows that the system knows how to run. What's really interesting here isn't so much what process knowledge is, but what it isn't. Our insurance workflow example revolves around the insurance claim. The claim is created by the first activity, and is then used by all the subsequent activities. It's difficult to imagine an activity for this workflow that wouldn't somehow use the claim to do its work. Yet the workflow doesn't contain the claim. The claim object is domain knowledge, not process knowledge. The workflow does contain a reference to the claim, a key or handle that uniquely identifies the claim within the application. This way, for example, when an adjuster gets a work item to approve a claim, the application knows which claim to present to him or her for approval.

Similarly, when the workflow assigns the approval activity to the adjuster role, the workflow has no idea what the adjuster role really means or who within the organization is allowed to perform adjuster tasks. When the application asks the WfMS what work is available for a particular user, the WfMS runs that user's participant through the organizational model adapter(s) to determine what roles he or she fulfills. It then finds the worklists for those roles and tells the application that the work items on those lists are available for that user.

This separation of responsibilities can be confusing at first, but it's ultimately clean and powerful, and one of the strongest advantages of using a workflow management system. Although the WfMS has its fingers into lots of organizational and domain knowledge, it really separates the workflows from that knowledge. Then the workflows can focus on what work needs to be done and what sorts of people will do it, but workflows don't focus on the specific people who will do it or how they'll do it.

This separation allows the workflow designer to work fairly independently of the application designer and the LDAP administrator. It focuses the workflow designer on the work to be done and away from how it will be done. It allows enterprises to make major changes to their business processes while minimizing the impact on the applications that enable those processes.

Workflow Styles
In my work with designing workflows I've discovered two distinct approaches to using workflow:

  1. User-centric
  2. Automation-centric
A particular workflow can use either approach, or a combination of both. In practice, a workflow may be 100% user-centric, but it's rarely 100% automation-centric. Even the most automation-centric architecture still needs some small portion that's user-centric - as a last defense for error handling, if nothing else. In other words, it's difficult and not very desirable to automate everything.

User-centric
This is the classic workflow approach. A person performs the work for each manual activity. The person sees the work item on a worklist and performs the work described by that work item. The person will typically interact with the application and the WfMS through a GUI (either in an AWT/Swing-style native window or through an HTML Web browser). These systems are relatively simple to build. The application developers have to implement the relevant GUIs for the users and add the code that lets the GUIs interact not only with the domain but also with the WfMS through its API. These GUIs then become a client part of the application.

Automation-centric
This is the approach for automating workflow so that large amounts of work can be performed with a minimum of human intervention. The work for a manual activity is performed (whenever possible) by an automated system that's external to the WfMS and probably external to the application as well. The external systems interact with the WfMS through an API. For example, in our insurance claim example the "send check" and "send reject letter" activities could probably be automated by systems that print and mail the documents.

Systems that automate activities tend to be more complex than user-centric ones because of the difficulty in interfacing the WfMS to the external systems that do the work. The WfMS queues the work on worklists, just as it would for people. But whereas people have GUIs that give them access to those worklists, a typical out-of-the-box back-end system has no idea how to interface to worklists. They can use the same WfMS APIs that the GUIs use, but somebody has to write the code to tie together the WfMS APIs with the back-end system APIs.

This work can be simplified somewhat by using a messaging system, such as one that implements the Java Message Service API. This way, WfMS work requests can be queued as messages and the messaging system then has to worry about getting the back-end system to perform the messages. This also allows other systems besides the WfMS to make requests of the back-end systems in an asynchronous, persistent, transactional way.
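The worklist-to-queue hand-off can be sketched with an in-memory queue standing in for a JMS destination. This is a simplification under stated assumptions: the class and method names are invented, and a real bridge would use the JMS API and, ideally, a distributed transaction so a request is never lost or duplicated between the two systems (the very problem discussed next).

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of bridging a WfMS worklist to a message queue.
// A BlockingQueue stands in for a JMS destination here.
public class WorklistBridge {
    private final BlockingQueue<String> messageQueue =
        new ArrayBlockingQueue<>(100);

    // Move a work request from the WfMS worklist to the message queue.
    // In production this hand-off is the step that must be atomic.
    void forward(String workRequest) {
        messageQueue.offer(workRequest);
    }

    // The back-end system consumes requests at its own pace;
    // returns null if no request is waiting.
    String nextRequest() {
        return messageQueue.poll();
    }

    public static void main(String[] args) {
        WorklistBridge bridge = new WorklistBridge();
        bridge.forward("print-and-mail-check:claim-42");
        System.out.println(bridge.nextRequest());
    }
}
```

The asynchronous decoupling shown here is what lets the back-end printing system lag behind the workflow without blocking it.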

The issue is then interfacing the WfMS API to a messaging API. As this interface code moves work requests between the worklist queues and the messaging queues, it has to avoid losing or duplicating any of the requests. Ideally, moving the requests should be performed transactionally, which requires a two-phase (distributed) transaction between the WfMS and the messaging system. Many such systems (both workflow and messaging) don't support external distributed transactions at this time. Likewise, a work request will often produce results data that needs to be stored in the database before the request is considered complete. This involves a three-way distributed transaction between the messaging, database and workflow systems.

Conclusion
We've now seen the importance of workflow, the basic workflow concepts, and how workflow relates to the rest of our application. Workflow models business processes, something that applications can't do well and that databases can't do at all. A workflow management system separates the business processes from the applications that hook into them and manages the execution of those processes. It separates the process from the organization that performs the work and from the domain in which the work is performed. Finally, a workflow management system prepares work to be performed by human users, by automated systems, or by a combination of both.

Much as database management systems did 10 or 20 years ago, workflow management systems have come into their own. They are rapidly becoming an indispensable part of an enterprise application architecture.

More Stories By Bobby Woolf

Bobby Woolf is a senior architect at GemStone Systems (www.gemstone.com), a Brokat company, and a member of their Professional Services division. He specializes in developing application architectures using various J2EE technologies and embeddable tools.
