In this short series of articles, I will explain how process discovery, a technique used in process mining, can be applied to the event logs of a generic application or process to extract enough information to re-create an accurate process model in Business Process Model and Notation (BPMN). BPMN is the standard widely used by BPM-based business applications.
The goal of my work is to create a BPMN-compliant file by applying a process discovery technique to an application's event logs. How can this be done?
First, some definitions.
“Process mining is the extraction of information and workflow models from event logs.” (source: Process Mining, a research agenda, via ScienceDirect.com)
Business process discovery, a part of process mining, is a set of techniques for building a visual representation of a business process from its event data. There are many tools available for process discovery. For my work, I’m using bupaR (https://www.bupar.net/). bupaR is an open-source, integrated suite of R packages for handling and analyzing business process data. It currently consists of 8 packages, including the central package, which support the different stages of a process mining workflow.
In the world of Business Process Management (BPM), processes are modeled using the BPMN (Business Process Model and Notation) 2.0 standard, which makes BPMN a useful way to describe a process visually in a form familiar to application designers and business analysts.
Process discovery through log data
For process discovery, at least three pieces of information are required for each event:
- case id
- task name
- timestamp
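To make that structure concrete, here is a minimal sketch in plain Python (not bupaR) of what such an event log could look like and how its events can be grouped into one ordered trace per case. Each event carries a case id, a task name, and a timestamp, which is assumed here to be the field that orders events within a case. The values are invented for illustration and are not the dataset used in this article, and the `build_traces` helper is a hypothetical name.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: (case_id, task_name, timestamp).
# All values are invented for illustration only.
EVENT_LOG = [
    ("case-1", "Receive order", "2021-01-04 09:00"),
    ("case-2", "Receive order", "2021-01-04 09:05"),
    ("case-1", "Check stock", "2021-01-04 09:10"),
    ("case-2", "Check stock", "2021-01-04 09:20"),
    ("case-1", "Ship order", "2021-01-04 10:00"),
    ("case-2", "Cancel order", "2021-01-04 10:30"),
]


def build_traces(event_log):
    """Group events by case id and order each case's tasks by timestamp."""
    events_by_case = defaultdict(list)
    for case_id, task_name, timestamp in event_log:
        parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
        events_by_case[case_id].append((parsed, task_name))
    # Sorting the (timestamp, task) pairs orders the tasks chronologically.
    return {
        case_id: [task for _, task in sorted(events)]
        for case_id, events in events_by_case.items()
    }


traces = build_traces(EVENT_LOG)
for case_id, tasks in traces.items():
    print(case_id, "->", " > ".join(tasks))
```

The case id says which process instance an event belongs to, the task name says what happened, and the timestamp says in which order, which is why a log missing any of the three is hard to turn into a process view.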
For this example, I’m using a fake dataset.
process data example
From this data, it is possible to generate a view of the process with bupaR:
Here is how the same process looks in BPMN format:
In BPMN, when multiple paths split or join according to a condition, a gateway is used. Gateways, symbolized by a diamond shape, are not present in event logs, as they are specific to BPMN.
Without gateways, the model is less precisely defined, harder to read, and no longer conforms to the BPMN format. Building a tool that detects gateways from log data would address this problem.
Is it possible to use process discovery to create a visualization of the process in BPMN format using only the event logs?
To do this, we must first be able to detect the split and join gateways. The next step will be to create an algorithm capable of detecting gateways and integrating them into the event logs.
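As a rough intuition for what "detecting" a gateway from logs can mean, the sketch below flags tasks that are directly followed by more than one distinct task (split candidates) and tasks that are directly preceded by more than one distinct task (join candidates). It is a naive illustration in plain Python, not the algorithm this series goes on to describe. The traces and the function names `directly_follows` and `gateway_candidates` are hypothetical, and the sketch makes no attempt to tell exclusive gateways from parallel ones.

```python
from collections import defaultdict

# Hypothetical traces: one chronologically ordered list of task names per case.
TRACES = [
    ["Receive order", "Check stock", "Ship order", "Close order"],
    ["Receive order", "Check stock", "Cancel order", "Close order"],
]


def directly_follows(traces):
    """Collect, for each task, the tasks seen right after and right before it."""
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    for trace in traces:
        for current, following in zip(trace, trace[1:]):
            successors[current].add(following)
            predecessors[following].add(current)
    return successors, predecessors


def gateway_candidates(traces):
    """Flag tasks with several successors (splits) or predecessors (joins)."""
    successors, predecessors = directly_follows(traces)
    splits = {task: sorted(nxt) for task, nxt in successors.items() if len(nxt) > 1}
    joins = {task: sorted(prev) for task, prev in predecessors.items() if len(prev) > 1}
    return splits, joins


splits, joins = gateway_candidates(TRACES)
print("split candidates:", splits)
print("join candidates:", joins)
```

With these invented traces, "Check stock" comes out as a split candidate and "Close order" as a join candidate. Real logs contain loops, noise, and concurrency, so a practical detector needs more than this simple count of neighbors.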
In my next article, I will describe the gateway-detection algorithm and what it does, and in a following article I’ll explain how to use it.
For further reading: