(This post was originally part of the 7.0.0-alpha6 release blog post and was later extracted into its own post.)
In 7.0.0-alpha6 we introduced the concept of activity instances and the activity instance tree to the core process engine. This post explains the motivation and gives some insight into the internal implementation.
Why do we need an Activity Instance Tree?
The activity instance tree contains a node for each activity that is currently active in the process instance. Some activity instances are scopes (such as Embedded Subprocesses). Such scope activity instances may have child activity instances:
|Activity Instance Tree and Scopes|
So far so good. But is that not the execution tree provided by the process engine? To some extent yes, but not in all cases. There are numerous cases where the execution tree is not aligned with activity scoping in BPMN 2.0.
Scoping vs. Concurrency
The process engine uses the concept of a "child execution" both for scoping and for concurrency. If a non-concurrent execution reaches a parallel gateway, it stays there and a concurrent child execution is spawned for each outgoing sequence flow. The effect is that the child executions are nested below an inactive execution waiting in the parallel gateway (fork). This concept is an internal implementation detail. It is not aligned with the BPMN 2.0 specification and is usually very hard to understand for new users of the process engine:
In the execution tree (left-hand side) you can see an inactive scope execution waiting in the parallel gateway (even though the process instance has already completed that activity and is now executing the service tasks). The executions executing the service tasks are nested below that inactive scope execution. In the activity instance tree (right-hand side), this execution is not visible.
In the activity instance tree, the root node represents the process instance itself (i.e. the instance of the process definition activity) and its children represent the active activity instances for both service tasks.
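To make the fork behavior more concrete, here is a minimal sketch of how an execution tree ends up with an inactive parent and concurrent children. The `Execution` class, the `fork` method and the generated child IDs are hypothetical stand-ins invented for this illustration; they are not the engine's actual classes or ID scheme.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ForkSketch {

    /** Simplified stand-in for an execution; not the engine's real class. */
    static final class Execution {
        final String id;
        final String activityId;   // activity this execution is positioned at
        final boolean concurrent;  // true for children spawned at a fork
        boolean active = true;
        final List<Execution> children = new ArrayList<>();

        Execution(String id, String activityId, boolean concurrent) {
            this.id = id;
            this.activityId = activityId;
            this.concurrent = concurrent;
        }
    }

    /** Leaves the parent waiting at the gateway and spawns one concurrent child per target. */
    static void fork(Execution parent, List<String> targetActivityIds) {
        parent.active = false; // the parent stays in the gateway, inactive
        int n = 1;
        for (String target : targetActivityIds) {
            // Child IDs are made up here; the engine generates its own.
            parent.children.add(new Execution(parent.id + "." + n++, target, true));
        }
    }

    public static void main(String[] args) {
        Execution root = new Execution("1", "fork", false);
        fork(root, Arrays.asList("serviceTask1", "serviceTask2"));
        System.out.println(root.id + " at " + root.activityId + ", active=" + root.active);
        for (Execution child : root.children) {
            System.out.println("  " + child.id + " at " + child.activityId + ", active=" + child.active);
        }
    }
}
```

The inactive parent is exactly the node that has no counterpart in the activity instance tree, where the root stands for the process instance instead.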
Another example is Parallel Multi Instance. Consider the following examples:
1) Single parallel multi instance task:
2) Multi instance subprocess:
The additional executions in the execution trees on the left-hand side are needed for internal process engine reasons. In cockpit we want to visualize the activity instance tree, which is easier to understand and more aligned with BPMN 2.0.
Execution Tree Compacting and Optimization
A second reason for introducing the activity instance tree is that the process engine compacts and optimizes the execution tree at runtime. Consider the example of a parallel gateway with two user tasks. Initially both T1 and T2 are active. In the execution tree we will see the inactive concurrent root execution waiting at the parallel gateway and two child executions, one for each task. The activity instance tree has the same structure, but the root node corresponds to the process instance itself and is not waiting in the parallel gateway. After task T2 is completed, the process engine will compact the execution tree, prune the execution for T1 and replace it with the root execution. If the execution for T1 references variables or tasks, they are moved to the root execution. The activity instance tree looks different: it still contains an activity instance for T1 and for the process definition itself.
|Execution Tree compacting|
A ramification of this behavior is that in the execution tree there is no concept of "activity instance identity". There is no unique identifier representing an instance of an activity. In general, it is not guaranteed that the execution that enters an activity instance is the same execution that completes it. In the example above, T1 is started by an execution with Id=2 and ended by an execution with Id=1.
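The identity problem can be illustrated with a small, deliberately simplified sketch. The `Execution` class and the `completeAndCompact` method are hypothetical and ignore everything the real engine does besides moving the position (for example, moving variables and tasks); the activity instance IDs `ai-T1` and `ai-T2` are made up. The execution IDs follow the example above.

```java
import java.util.ArrayList;
import java.util.List;

public class CompactionSketch {

    /** Simplified stand-in for an execution; not the engine's real class. */
    static final class Execution {
        final String id;
        String activityId;          // activity this execution is positioned at
        String activityInstanceId;  // plays the role of the ACT_INST_ID_ column
        final List<Execution> children = new ArrayList<>();

        Execution(String id, String activityId, String activityInstanceId) {
            this.id = id;
            this.activityId = activityId;
            this.activityInstanceId = activityInstanceId;
        }
    }

    /** Removes the finished child; if exactly one child remains, folds it into the root. */
    static void completeAndCompact(Execution root, Execution finished) {
        root.children.remove(finished);
        if (root.children.size() == 1) {
            Execution survivor = root.children.remove(0);
            // The root takes over the survivor's position and activity instance ID.
            root.activityId = survivor.activityId;
            root.activityInstanceId = survivor.activityInstanceId;
        }
    }

    /** Returns the execution positioned at the given activity, or null if there is none. */
    static Execution findAt(Execution node, String activityId) {
        if (activityId.equals(node.activityId)) {
            return node;
        }
        for (Execution child : node.children) {
            Execution found = findAt(child, activityId);
            if (found != null) {
                return found;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Execution root = new Execution("1", "fork", null);
        Execution t1 = new Execution("2", "T1", "ai-T1");
        Execution t2 = new Execution("3", "T2", "ai-T2");
        root.children.add(t1);
        root.children.add(t2);

        Execution before = findAt(root, "T1");
        System.out.println("before: execution=" + before.id + ", activityInstance=" + before.activityInstanceId);

        completeAndCompact(root, t2);

        Execution after = findAt(root, "T1");
        System.out.println("after:  execution=" + after.id + ", activityInstance=" + after.activityInstanceId);
    }
}
```

In this sketch the execution positioned at T1 changes from Id=2 to Id=1, while the activity instance ID it carries stays the same. That stable ID is what makes the activity instance a usable handle.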
(This is why in general you should never use execution IDs for message correlation, BTW!!!)
In cockpit we want to allow users to select activity instances and explore their details (variables, tasks ...). If we used the execution IDs, the following behavior could occur: the user selects T1 and we write the execution ID to the URL. While he is looking at the details, T2 is completed and the execution tree is compacted by the process engine. Now the user presses F5 (refresh) in the browser. He gets an error since the execution he previously selected does not exist anymore, even though the task is still active. Fail!
(Even worse: he could send the effectively-non-deep-link to a colleague.)
Unifying the model for History and Execution
The third reason for adding the activity instance tree to the process engine is that it is better aligned with the way the process engine history works. In the process engine history we always had the concept of historic activity instances. Aligning the history and runtime tree structures allows us to use the same client-side code in cockpit for visualizing both running and historic activity instances. In addition, cockpit also works if history is turned off.
How is the Activity Instance Tree implemented?
The activity instance tree is squeezed into the execution tree. We added a new column to the execution table named ACT_INST_ID_:
|New ACT_INST_ID_ column added in Execution table.|
It was not necessary to change anything else as far as the database is concerned. The values of all other columns remain untouched and we have the same number of executions as before.
The activity instance IDs are generated by the Atomic Operations that start/end activities:
|New Atomic Operation Type Hierarchy|
There are test cases verifying that activity instance ID generation and activity start/end listener invocation work as expected (i.e. an activity instance ID is generated BEFORE the activity START listeners are invoked, etc.).
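The ordering guarantee mentioned above (ID first, listeners second) can be sketched as follows. This is not the engine's atomic operation code: the `Execution` class, the `startActivity` method, the use of `Consumer` as a listener type and the use of `UUID` for ID generation are all assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.function.Consumer;

public class ActivityStartSketch {

    /** Simplified stand-in for an execution; not the engine's real class. */
    static final class Execution {
        String activityInstanceId;
    }

    /** Hypothetical "activity start" step: assign the ID first, then notify the listeners. */
    static void startActivity(Execution execution, List<Consumer<Execution>> startListeners) {
        // UUID is an assumption for this sketch; the engine uses its own ID generator.
        execution.activityInstanceId = UUID.randomUUID().toString();
        for (Consumer<Execution> listener : startListeners) {
            listener.accept(execution);
        }
    }

    public static void main(String[] args) {
        List<Consumer<Execution>> listeners = new ArrayList<>();
        listeners.add(e -> System.out.println("START listener sees ID: " + e.activityInstanceId));
        startActivity(new Execution(), listeners);
    }
}
```

Because the ID is assigned before the loop, every start listener can already read it, which is the property the test cases check.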
How can I retrieve an Activity Instance Tree?
Currently you can retrieve the activity instance tree only for a whole process instance.
Using the Java API:
ActivityInstance rootActivityInstance = runtimeService.getActivityInstance(pi.getProcessInstanceId());
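Once you have the root node, the tree can be walked recursively. The sketch below runs without an engine: the local `Node` interface is a stand-in that only assumes the accessors `getId()`, `getActivityId()` and `getChildActivityInstances()`, modeled on the engine's `ActivityInstance` interface. `SimpleNode`, `render` and all IDs are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ActivityInstanceTreePrinter {

    /** Local stand-in for the accessors this sketch assumes on ActivityInstance. */
    interface Node {
        String getId();
        String getActivityId();
        Node[] getChildActivityInstances();
    }

    /** Hypothetical in-memory node, used only to run the sketch without an engine. */
    static final class SimpleNode implements Node {
        private final String id;
        private final String activityId;
        private final Node[] children;

        SimpleNode(String id, String activityId, Node... children) {
            this.id = id;
            this.activityId = activityId;
            this.children = children;
        }

        public String getId() { return id; }
        public String getActivityId() { return activityId; }
        public Node[] getChildActivityInstances() { return children; }
    }

    /** Depth-first walk producing one indented line per activity instance. */
    static List<String> render(Node root) {
        List<String> lines = new ArrayList<>();
        render(root, 0, lines);
        return lines;
    }

    private static void render(Node node, int depth, List<String> out) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        line.append(node.getActivityId()).append(" (").append(node.getId()).append(")");
        out.add(line.toString());
        for (Node child : node.getChildActivityInstances()) {
            render(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        Node tree = new SimpleNode("pi-1", "process",
                new SimpleNode("ai-2", "serviceTask1"),
                new SimpleNode("ai-3", "serviceTask2"));
        for (String line : render(tree)) {
            System.out.println(line);
        }
    }
}
```

With a real engine, the same recursion should work against the `rootActivityInstance` returned above, provided the accessor names match your engine version.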
You can also use the REST API:
Step by step, we will roll out support for activity instances in the public API, add an Activity Instance Query, etc.
Will executions be deprecated?
I am not sure yet but my gut feeling is yes. The concept of "Execution" is a proprietary internal process engine concern and will gradually be replaced by the activity instance model. We might go down a road where you can still query for executions but you will in fact get activity instances as result, in order not to break the API. But not before 7.1 or probably 8.0 :)