Professor Marlon Dumas stands at the forefront of process mining research at the University of Tartu, where he leads a group of 15 researchers focused on business process management and process mining. He is also the co-founder of Apromore, a collaborative business process analytics platform supporting the full spectrum of process mining functionality. On top of that, he is the co-author of “Fundamentals of Business Process Management”.
This is the Mining Your Business podcast, a show all about process mining, data science, and advanced business analytics. I am Patrick, and with me, as always, is my colleague Jakub.
Marlon Dumas, a professor at the University of Tartu and co-founder of Apromore, an open source process mining tool, is joining us on the podcast today. And we are going to try to answer the question: what is process mining 2.0? And if you're wondering, wait, what even is process mining 1.0, then no worries. We have got you covered.
Let's get into it.
Dear process mining community, brace yourselves. We have been preparing the ground for this kind of episode for quite some time. We discussed what process mining is, what it can do and bring to your organization, what some of the trends are, and how vital it is to ensure proper user adoption. Today we take process mining to the next level.
We have invited a person who stands at the forefront of process mining research at the University of Tartu, where he is leading a group of 15 researchers focused on business process management and process mining. There they are developing methods for the automated discovery of business process improvement opportunities from event logs. Please welcome Marlon Dumas. Marlon, welcome to our show.
Thank you very much for the invitation. I'm always very excited to share my passion for business process management in general, because it is my strong belief that process mining is part of it, and for process mining in particular. You know, it's really the place to be at the moment in terms of helping organizations improve their processes.
It is. We are very excited to have you, and we were also very fortunate to have a brief discussion with you in Eindhoven a couple of months ago at the ICPM conference, where you also gave a short lecture. Today we have a very specific topic in mind, where we are really going from process mining into this advanced, augmented approach to process mining. But before we go there, I wanted to ask you a little question. At the moment you are a professor at the university, but at the same time you co-founded a software platform called Apromore, with which, as you put it, you are trying to democratize process mining. Can you maybe introduce us to this software a bit? What is it about?
Yes, Apromore is a big adventure. I have been a researcher in the field for 20 years in BPM already, and for ten years in process mining. For a very long time we were working on process modeling and process analysis, simulation, and execution in my research group, in cooperation with other groups. I was at the time in Australia, in Brisbane, a very exciting place in terms of BPM, and we came up with the concept of an advanced process model repository where you could have models and data seen together, and you could use them to look at your process from multiple perspectives: not just traditional process modeling, but also looking at the data, computing performance metrics, enhancing your process models, etc. In 2010 we launched an open source project called Apromore. It stood for Advanced Process Model Repository. I launched it together with two other researchers: Marcello La Rosa, now the CEO of Apromore, who is sitting in Melbourne, and our colleague Wil van der Aalst, who at the time was working in Eindhoven. Apromore is actually a name that Wil van der Aalst coined, as it was actually one of the directions he was working on. We worked with Wil and with Marcello for quite some time evolving the Apromore open source software, bringing it from an emphasis on, let's say, pure models and model enhancement to a fully fledged process mining tool. That is what made us stand out with respect to the other open source initiatives in the field of process mining at the time. Naturally, people took notice of it, and in 2016-2017 we had quite a nice user base for the open source edition. Then we had a lot of people knocking at our doors asking: can you help me launch my process mining initiative using your open source software, can you help us set it up, and so on.
It became too much at some point to do as side work from our jobs as professors, so Marcello La Rosa and I, together with some other folks in our groups, decided to launch Apromore as a company. We got some seed funding from the University of Melbourne to redevelop all this open source code more professionally, and we launched Apromore in 2019 as a spinoff of the University of Melbourne. We are managing the company and growing it, and we still keep our part-time professorships and our research groups, because we strongly believe in this osmosis between research and the product. A lot of the ideas I'll be talking about today are research ideas, but we hope that in the next five years we will be in a position to move them into the product as we go along.
That's very interesting. And I have to say that both of the colleagues you've mentioned, Wil van der Aalst and Marcello La Rosa, are on our shortlist of future guests. So, dear listeners, I think in the future you will hear from both of them; stay tuned. Anyway, thank you for this little introduction to the tool you are helping develop, and I'm happy there are also open source options that people can just pick up to get into process mining. But the main topic for our discussion today is really to navigate through where process mining has been in the last couple of years, where it is now, and where it is heading. I have to say that we are also seeing this within our own projects; we are, let's say, more business oriented, really implementing solutions in real life, and I know that research is always a couple of steps ahead. To that end, I know that you've created this pyramid of different steps in process mining, and I'll just say it out loud: there are five steps, starting with descriptive process mining, going into diagnostic, predictive, prescriptive, and all the way to augmented. Our goal for today's episode is to dissect, step by step, what you mean by that and where we are going. And hence the first question: what is descriptive process mining?
Yes. Before I say what descriptive process mining is, I would like to pick up on your metaphor of the pyramid, and truly a pyramid is the right metaphor, because what I'm going to talk about is not something that should be approached as a big bang. It should be approached incrementally. That is very important: if you want to fail, do it big bang; if you want to succeed, you have to start from the bottom. The bottom of the pyramid that we are trying to build in my research group is descriptive process mining, and descriptive process mining, which you can relate to descriptive analytics, means you try to describe the as-is situation. There are three main operations in descriptive process mining. The first is automated process discovery: you take the data and you discover the process, the rework loops, the structure of the process, where the exit points and the entry points are, and so on. In there I will also put something called task mining, where you try to go down to a particular task and understand how the task is performed, arguably using different data. So it is different, but at the end of the day it's about describing the process. The second capability of descriptive process mining is conformance checking. Conformance checking is about taking your business rules, your policies, your reference processes, comparing reality against them, and finding the deviations. And the third component is what I will call process performance mining, which is looking into the performance of your process from the perspective of, for example, time or cost or revenue, or customer metrics, quality metrics, etc. That is something you can do on top of process models, or using different types of dashboards, BI tools, etc.
The combination of these three capabilities is what I call descriptive process mining, and everything I will say from this point on should only be attempted by companies that have reached a point where they are able to understand their processes. Otherwise you will be building castles in the sand. That is very important. On top of that, there's a whole pyramid that gets built, right?
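As a concrete illustration of the automated process discovery step described above, the most basic artifact a discovery algorithm builds is a directly-follows graph: a count of how often one activity immediately follows another across all cases. Here is a minimal sketch in Python; the four-activity order handling log is entirely invented for illustration:

```python
from collections import Counter

# Hypothetical event log: one activity sequence (trace) per case.
log = [
    ["create", "check", "approve", "pay"],
    ["create", "check", "rework", "check", "approve", "pay"],
    ["create", "check", "approve", "pay"],
]

# Count each directly-follows pair of activities across all traces.
dfg = Counter()
for trace in log:
    for a, b in zip(trace, trace[1:]):
        dfg[(a, b)] += 1

for (a, b), n in sorted(dfg.items()):
    print(f"{a} -> {b}: {n}")
```

The rework loop shows up immediately as the pair check -> rework followed by rework -> check; real discovery algorithms then filter and structure this graph into a full process model.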
When you were talking about all the things you can do in descriptive process mining, like conformance checking and automated process discovery, I felt like a lot of companies could spend upwards of five years on this step alone, right? So is that necessary? How much time do you think a company really needs to invest in this step before it is able to go up to the next level?
It is an ongoing process. You should never think about descriptive process mining as something you do in one year or two years or three years. It's something you should do as long as you want to remain competitive, which for many companies means always. So indeed, it will never be finished. But you do need to reach a certain level of maturity, and there are two tests I would apply for that maturity. One is that the data is there, the right data is there. That is a very important foundation, and when you talk about spending years, one has to say that it is very often on the data that we spend a lot of time. Again, that should not be done in a big bang manner: you shouldn't spend a humongous amount of time just finding the data before you derive value from it. It should be done in spirals. You should start with an important core process where it is feasible to get the data, for example out of your ERP system or your CRM system, or a combination of different systems, within a few months. I would never recommend spending more than a couple of months on your first initiative before you can deliver value. Immediately, you can start analyzing that data using descriptive process mining techniques and deliver value. And maybe after you have done that with two or three processes, or with a couple of end-to-end processes like your order-to-cash process or your procure-to-pay process, then it is the right time to, on the one hand, continue expanding your descriptive process mining efforts and, on the other hand, move into the next step in the augmented BPM pyramid.
Yeah, I feel like we should invite you to our scoping workshops when we are talking to our customers and telling them that it takes time, and that they really need to invest time and effort into building this base of the pyramid, which is descriptive process mining. However, we already discussed this in other episodes, so since we have you here as an expert who knows, observes, and co-creates the future, let's move to the next step right away, which is diagnostic process mining. That is essentially something that comes after you successfully implement and, let's say, interpret your process, correct?
Correct. And it comes very naturally. You understood your order-to-cash process and you understood roughly where the pain points are: for example, where the bottlenecks are, or the rework loops, or the high-effort activities. And reality is never homogeneous or uniform. When you look at a bottleneck or some other friction point, it never happens across the board. It doesn't happen for all your customers; it doesn't happen for all your regions or all your countries when you are a multinational company. It happens in some places and not in others. Every process has a certain amount of what we call deviance: positive deviance, where some areas are performing better than others, and negative deviance, which is just the counterpart, where some areas are performing less well than others, or some areas are complying with your process and other areas of your organization are not. So it is very natural, once you have done your first descriptive process mining effort, even on your very first process, to ask yourself: okay, what is the difference between those cases that are violating the process and those that are not, or those that are complying and those that are not? That's where diagnostic process mining comes in. It's like a doctor who not only finds that you have a bit of a fever, or you are sneezing, or you have a bit of back pain, but now wants to diagnose why you are having it. Some people call that operation variant analysis, some others call it comparative process mining, and if you have been on the ground, you have done it. Maybe you just didn't have a word for it.
You take the event log of a process and you split it into two or three or four parts, for example by different products, different customers, different regions, and you start comparing. Your goal is to find the reason why certain areas of your business are not performing according to expectations. Some vendors call it root cause analysis, but I'm going to warn you: no tool at the moment, as we speak, really uses causal inference techniques to detect root causes. They use correlation techniques. So they find that, well, this deviation happens mostly in Belgium, or in the Netherlands. And that does not mean that it's because you are Dutch that you deviate, or because you are Flemish. It doesn't mean that; it's not the cause. But there is a light at the end of the tunnel. In the field of e-commerce and recommender systems, companies like Facebook, Uber, Booking.com, etc. are using a very cool set of techniques called causal inference, or causal machine learning, to really separate correlation from causation in streams of events, for example customer visits to a website, and to tell you what is the cause behind the fact that a customer bought or didn't buy, what the causal factors are. And I think 2022 will be the year of vendors bringing this causal machine learning technology into process mining and really realizing the vision of diagnostic process mining, where you are truly able to dig down into the causes of poor performance.
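The gap between correlation and causation described above can be shown with a toy, entirely made-up example: suppose deviations look "Belgian", but only because Belgian sites still run a legacy system. A minimal sketch in Python, assuming invented case counts:

```python
# Made-up case counts: (country, system, deviated?) -> number of cases.
# Legacy-system cases deviate 30% of the time, new-system cases 5%,
# regardless of country; Belgium simply has more legacy cases.
cases = {
    ("BE", "legacy", True): 24, ("BE", "legacy", False): 56,
    ("BE", "new",    True): 1,  ("BE", "new",    False): 19,
    ("NL", "legacy", True): 6,  ("NL", "legacy", False): 14,
    ("NL", "new",    True): 4,  ("NL", "new",    False): 76,
}

def deviation_rate(keep):
    """Deviation rate over the cases selected by the predicate `keep`."""
    dev = sum(n for (c, s, d), n in cases.items() if keep(c, s) and d)
    tot = sum(n for (c, s, d), n in cases.items() if keep(c, s))
    return dev / tot

# Naive correlation: Belgium deviates 2.5x more than the Netherlands.
print(deviation_rate(lambda c, s: c == "BE"))   # 0.25
print(deviation_rate(lambda c, s: c == "NL"))   # 0.10

# Controlling for the confounder: within each system, country is irrelevant.
print(deviation_rate(lambda c, s: c == "BE" and s == "legacy"))  # 0.3
print(deviation_rate(lambda c, s: c == "NL" and s == "legacy"))  # 0.3
```

Causal machine learning techniques automate exactly this kind of adjustment for confounders, across many candidate factors at once.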
I had a question regarding the diagnostic step, because I sometimes see that we have clients with millions of cases and I don't know how many hundreds of millions of activities. So the space in which the process can vary is huge, right? In this almost infinite space of where things can go wrong, how do you even go about asking the right question about your process? You gave the metaphor of the doctor: you describe the symptoms and then the doctor asks you some questions to narrow down on what the cause is. So how do we ask the right questions to get the most out of a process?
The questions should be driven by your KPIs, your performance indicators, which should themselves be driven by the company's strategy and its current situation in the market. So it really all has to be strategically aligned. When we start a process mining effort, I always say: do not start it if you do not know what KPIs you are trying to improve. That is essential. When you do descriptive process mining, you can get a little lost and forget your KPIs and performance objectives, but as you move on, and particularly when you move into diagnostic process mining, you have to have your KPIs very clear: the whole hierarchy from strategy-level KPIs to tactical KPIs to your process performance indicators and all the measures that are required. Your questions should be driven by that. Say I want to achieve a lower defect rate. Then your question becomes: why is it that in some areas of the business, or for some types of customers, I am observing a high level of customer complaints? That will determine what your question is. Now, regarding the factors in the middle: you're right that in an event log with, say, 10 million events and 100 columns, it is very difficult to navigate through it and find them. That is why we need more automation in place, and that is where I think the vendors have been a little bit lazy and have just used correlation techniques, like logistic regression, to say: well, these are the five factors out of hundreds that are affecting your performance. That is where more effort is needed, and we need to start moving into a whole set of machine learning techniques to automatically find, let's say, the possible half a dozen to a dozen causes of your customer complaints, if that is what you care about.
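To make concrete what a correlation-based factor ranking looks like, here is a minimal logistic regression trained from scratch on a tiny, invented event-log sample with three binary case attributes, with factors ranked by coefficient magnitude; this is roughly the approach described above. All attribute names and numbers are illustrative assumptions:

```python
from math import exp

# Invented case attributes: (is_rush_order, region_BE, has_discount) -> complaint?
# Rush orders complain in 3 of 4 cases; the other two attributes are balanced.
data = [
    ((1, 1, 1), 1), ((1, 0, 0), 1), ((1, 1, 0), 1), ((1, 0, 1), 0),
    ((0, 1, 1), 0), ((0, 0, 0), 0), ((0, 1, 0), 0), ((0, 0, 1), 1),
]
names = ["is_rush_order", "region_BE", "has_discount"]

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

# Batch gradient descent on the logistic loss (one bias, three weights).
bias, w = 0.0, [0.0, 0.0, 0.0]
for _ in range(5000):
    gb, gw = 0.0, [0.0, 0.0, 0.0]
    for x, y in data:
        err = sigmoid(bias + sum(wi * xi for wi, xi in zip(w, x))) - y
        gb += err
        for i, xi in enumerate(x):
            gw[i] += err * xi
    bias -= 0.5 * gb / len(data)
    for i in range(3):
        w[i] -= 0.5 * gw[i] / len(data)

# Rank candidate factors by coefficient magnitude.
ranking = sorted(zip(names, w), key=lambda nw: -abs(nw[1]))
print(ranking)  # is_rush_order comes out on top
```

The caveat from the discussion still applies: a large coefficient signals correlation with complaints, not a root cause.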
Speaking of causation and correlation: thinking of the data that we work with, if we are talking about procure-to-pay or any other process that comes from one system, you see that process and you already see, okay, Germany is performing better than Italy, or vice versa. But what you see is only some trace, some digital footprint, from one system, and it probably doesn't give you the complete picture, because I can imagine that things like a different design of the tool people are using, or maybe different local processes, can influence these differences between instances. Do you think that involving more data, and you already mentioned task mining and different sources for these processes at the beginning, could actually improve this root cause analysis?
There are two elements to the quality of root cause analysis: one is the input and the other is the method. You need good input and you need a good method to find your root causes. I was complaining a little bit about the fact that we vendors have not yet invested enough in the method, but of course, in parallel, organizations should also be investing in the input. A lot of the causes of, for example, some customers complaining cannot be seen if you only reason at the level of activities, at the granularity at which they are recorded in the enterprise system. You have to go one step deeper, and you may see, for example, that the problem comes from data entry errors, and those data entry errors you can see if you dig into the screens your workers are visiting when they are performing a task. You can pinpoint that when they open a couple of Excel spreadsheets while resolving a customer complaint, the complaint will be resolved negatively and your net promoter score will go down. That is something you cannot see with the enterprise system data alone; you need to go into the task mining data.
So should there be some initiative, as part of all the diagnostic process mining steps, to put a focus on the data quality itself?
Absolutely, yes. All the input has to be good, so we need data completeness, and we also need data quality attributes like the accuracy of the data, the homogeneity of the data, and so on. I think at every step of the pyramid we need to be caring about data: at the descriptive level, the diagnostic level, and the upper levels, as we discussed. That is a common theme. On the other hand, one should remember that if we do not deliver value, we are not going to climb this pyramid; we are going to be shut down before we have even managed to put the first step into the descriptive part. So it all has to be done in spirals. You have to start by looking into one process and getting the data out of the enterprise system. Unless there is really no data in an enterprise system, I would not recommend a company do task mining before process mining. You have to find the problems at the level of the rocks and the stones before you go on and look at the pebbles and the sand. That's very important to keep in mind. But of course, once you have seen issues at the level of the stones, then you can start triggering additional efforts to look at the details. And by that time you should have generated enough business value to justify the budgets you will need to go to the deeper levels of this mine.
Yeah, this is already fascinating, because a lot of customers and companies that are investing in process mining actually get to this point where they are able to locate the bottlenecks and the waste, act upon that, and actually get value out of it. However, now we are getting into what you call process mining 2.0, and that actually starts with predictive process mining. This is something very, very interesting, and I would like you to explain to us what predictive process mining actually means.
Sure. Predictive process mining is the next layer of the pyramid. Once you have gone through looking at your as-is process from a descriptive perspective, spotting the issues and the performance problems with respect to your KPIs, and analyzing the process, the next natural step is to start asking yourself: what will happen if I do something in the process, or what will happen to this case in the future? That type of question, about the future, is the realm of predictive process mining. In predictive process mining there are two parts. One is the tactical management question, which many people in organizations have: what if I automate two or three activities in my process? What if I bring two or three additional resources into a task? Or what if 10% of my workforce is sick with the COVID Omicron variant next week, which is something we are getting very worried about? What will be the impact on my delivery times during the Christmas period, which is so important to me if, for example, I am an e-retailer? That kind of question can be answered by means of digital process twins, which are usually based on simulation. Traditionally, simulation is done using models that are manually built, and anybody who has done a simulation using traditional methods knows that it takes months to develop a good simulation model manually. Here is where process mining is making a breakthrough: by combining process mining with simulation, specifically using process mining to discover accurate and reliable data-driven simulation models, we are able to decrease the time it takes to get a good simulation model and, at the same time, increase the accuracy and reliability of those simulation models.
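As a back-of-the-envelope version of the "what if I bring in additional resources" question, one can even skip full simulation and use a classic analytical queueing model. The sketch below uses the Erlang C formula for an M/M/c queue, with made-up numbers (5 cases arriving per hour, each clerk resolving 2 per hour); a discovered digital twin would replace these assumptions with parameters mined from the event log:

```python
from math import factorial

def erlang_c(lam, mu, c):
    """Probability that an arriving case must queue in an M/M/c system."""
    a = lam / mu                      # offered load
    rho = a / c                       # server utilization (must be < 1)
    top = a**c / (factorial(c) * (1 - rho))
    return top / (sum(a**k / factorial(k) for k in range(c)) + top)

def avg_wait_hours(lam, mu, c):
    """Average time a case waits before service starts."""
    return erlang_c(lam, mu, c) / (c * mu - lam)

lam, mu = 5.0, 2.0                    # assumed arrival and service rates
for c in (3, 4):
    print(f"{c} clerks: avg wait {60 * avg_wait_hours(lam, mu, c):.1f} min")
```

With these assumptions, a fourth clerk cuts the average wait from roughly 42 minutes to about 6: exactly the kind of impact estimate a what-if analysis is after, though a real digital twin also captures routing, rework, and schedules that this formula ignores.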
We have invested a lot in Apromore on that, and we can see the benefits for companies using what-if process mining, as we call it, to go beyond finding problems, towards exploring opportunities and estimating the impact of those improvement opportunities. That is the first part of predictive process mining, this what-if process mining. The second part is more at the operational level. Your processes are running as we speak, and there are numbers you care about, such as SLA violations. You want to avoid those SLA violations, and if you are, for example, a claims manager, you want to know which claims are likely to violate their SLA. That's where you can use event logs, exactly the same type of event logs you use for descriptive and diagnostic process mining, in order to identify which cases are going to violate their SLA. That is done using machine learning, specifically supervised machine learning techniques like decision trees, ensemble models, and deep learning techniques. This is a technique that is already ripe for adoption, but at the moment it's being done in a very manual manner. We are constantly getting requests like: I want a predictive dashboard for my claims handling process, or I want a predictive dashboard for my order-to-cash process. And we are building these dashboards a little bit manually. The next step will be more automation, what is called automated predictive process monitoring, where these machine learning models are built and maintained automatically, so that we can spread the use of predictive process monitoring more widely in your organization. Then it doesn't get confined to one or two processes, but can really be used across the organization.
Now, I have a lot of questions after hearing all that. My first question is: how are we supposed to envision this digital twin, or simulation model, as you called it? If you were to answer the question "what if I automate this step to 100%?", does the simulated case just wander through the process until it gets to that activity, and instead of being, say, 50% likely to be automated and 50% likely to be manual, we just set the automated path to 100%, based on the historical data that I have? Is that kind of right, or am I completely off the mark?
Yes, it is correct. Predictive process monitoring techniques will typically be based on continuous, stream-based data ingestion, where the data flows from your enterprise system into the predictive monitor more or less in real time, or possibly with a certain lag or in small batches, say a data refresh every half an hour. Then we take the current cases and feed them into what we call a predictive monitor, which uses a machine learning model to assign a probability of a negative outcome, or a probability of a delay, to every case. And then we act on those probabilities in different ways; it could be by setting thresholds, which is what you were mentioning: if the probability of being late is more than 75%, skip this task, or perform this task automatically, etc.
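A stripped-down version of such a predictive monitor can be built from outcome frequencies of historical case prefixes; real engines use richer features and proper supervised models, but the threshold logic is the same. The activity names and the log below are invented:

```python
from collections import defaultdict

# Invented history: (activity sequence of a finished case, SLA violated?)
history = [
    (("create", "check", "approve"), False),
    (("create", "check", "approve"), False),
    (("create", "check", "rework", "approve"), True),
    (("create", "check", "rework", "approve"), True),
    (("create", "check", "rework", "approve"), True),
    (("create", "check", "rework", "approve"), False),
]

# "Model": for every prefix seen in history, the fraction of late cases.
stats = defaultdict(lambda: [0, 0])           # prefix -> [late, total]
for trace, late in history:
    for i in range(1, len(trace) + 1):
        stats[trace[:i]][0] += int(late)
        stats[trace[:i]][1] += 1

def late_probability(prefix):
    late, total = stats[tuple(prefix)]
    return late / total if total else 0.0

def should_escalate(prefix, threshold=0.75):
    """Flag a running case when its predicted lateness crosses the threshold."""
    return late_probability(prefix) >= threshold

print(should_escalate(["create", "check"]))            # False: 3/6 were late
print(should_escalate(["create", "check", "rework"]))  # True: 3/4 were late
```

The monitor re-scores each running case after every new event, so a case that starts off harmless gets flagged the moment its prefix starts resembling the late ones.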
Okay. So based on that, what about what I would call the spaces in between activities? Like a coworker walking across to someone's desk and asking something; that is not really tracked in process mining, because there's just no activity for it, right? So how good is this analysis at predicting these externalities, given there's really no data for them? As we've seen in the last year, all of a sudden there's a worldwide pandemic, or a ship gets stuck in the Suez Canal, or anything like that: externalities you just cannot simulate for. Those are pretty big examples, but how good is the simulation at dealing with all the externalities that could possibly affect the business?
Yeah, so that is one of the biggest problems, or limitations, of predictive process monitoring, and it is very important to keep it in mind no matter what people out there are telling you about AI: AI is going to transform the world, AI is doing this, AI is doing that. Any technique based on machine learning is only as good as the data you have, and one of the qualities of that data is whether it is reflective of reality. In particular, when reality changes, which is what happened with COVID, your predictive models will basically be thrown in the garbage. I had a lot of requests in the middle of last year: oh, we are really getting into trouble handling this COVID pandemic, can you help us build some predictive dashboards, or upgrade the ones we already have to do this or that? In at least three cases I rejected the project. I said no, and you should not do it, no matter what. I know that my competitor bidding in your RFP is going to tell you that they can, and it is false; we have a reputation to maintain. You cannot build a supervised machine learning model on pre-COVID data and pretend that it is going to tell you something about the behavior of the process during the COVID pandemic. That is something you just have to acknowledge as a limitation of predictive monitoring. There is no black magic: what these techniques are doing is finding patterns in the past and extrapolating that these patterns can be used to predict what lies ahead in the future. So you have to take it with a grain of salt, and you have to be very aware of the applicability of the technique.
So if you want to beat a robot, just act as unexpectedly as possible and you might stand a chance. However, I have a question regarding the adoption of these things. Process mining, as complex as it may seem from the outside, is still explainable: you are doing a visualization of whatever is happening in your system, and you show and describe it via the process map with all its variants, and with some practice and some education everybody can, to a certain degree, understand it. How do you go about explaining to business users, who essentially know only their KPIs and maybe some interpretation of the past, that here we have a very accurate predictive model of your future performance? And then you try to persuade them that some step, which a very senior person in the company will say doesn't make any sense, will generate 20% more revenue if they change it?
Yeah, there are two aspects to the answer. The first part is that predictive process monitoring should not be used to prescribe what to do; it is the wrong technology for that. Never use a pure predictive model to say we should do this. Predictive process monitoring stops at telling you: "this is what is likely to happen", but it's not a technology suitable for telling you what you should do. The second part of the answer is the following. Predictive process monitoring tells you that this case is going to be late, and indeed this can be useful; if my prediction is accurate, it should ring some bells, and my process participants and managers should do something about it. Everybody agrees on that. But indeed, for the process participant, for the worker, to be able to do something with your prediction, the prediction has to be explainable. You should be able to say not only that this case will be late with a probability of 90%, but also to give an explanation as to why that is the case. For example, in the predictive process monitoring engine of Apromore this is done by showing you similar cases from the past, because all of these techniques are based on identifying similarities between the present and the past. So you can show that this is a situation similar to one another customer was in in the past, and that customer ended up with a negative outcome. This kind of example-driven explanation is a very pragmatic approach to get the message across and to instill some explainability into the techniques. There are other, more sophisticated explainability techniques that are getting a bit of traction, like SHAP values, for example. Everybody says: explainability, yes, I know about it, SHAP values, LIME, etc. Unfortunately, those techniques are more suitable for people with a background in data science and statistics.
Your process worker or your manager out there will not necessarily be able to digest that kind of explanation. So yes, predictions should be accompanied by explanations, and no, never use predictive process monitoring to prescribe what to do. That is the role of the next layer in the pyramid.
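The example-driven explanation described here, retrieving similar past cases to justify a prediction, can be illustrated as a simple nearest-neighbour lookup. This is only a sketch of the idea, not Apromore's actual implementation; the case features and data below are invented:

```python
import numpy as np

# Hypothetical feature vectors for historical cases (claim amount, days open,
# number of handovers) and whether each case ended up late (1) or on time (0).
history = np.array([
    [1200.0,  5.0, 2.0],
    [ 300.0,  1.0, 1.0],
    [2500.0, 12.0, 4.0],
    [ 150.0,  2.0, 1.0],
    [1800.0,  9.0, 3.0],
])
was_late = np.array([1, 0, 1, 0, 1])

def explain_by_similar_cases(current_case, k=3):
    """Return the k most similar historical cases and their outcomes,
    as an example-driven explanation of a prediction."""
    # Normalise features so no single one dominates the distance.
    sigma = history.std(axis=0)
    dists = np.linalg.norm((history - current_case) / sigma, axis=1)
    nearest = np.argsort(dists)[:k]
    return [(int(i), int(was_late[i])) for i in nearest]

# A running case that resembles the large, slow, many-handover ones:
# its nearest neighbours all ended up late, which is the explanation
# shown alongside the "likely to be late" prediction.
similar = explain_by_similar_cases(np.array([2000.0, 10.0, 3.0]))
```

The point is not the distance metric but the communication device: "here are three past cases that looked like yours, and they all ran late" is digestible for a process worker in a way a SHAP plot is not.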
Absolutely. Absolutely. And I'm glad you mentioned that, because as soon as you mentioned 90% probability or 10% probability, my mind went to people thinking about the weather. Like when they say, oh, there's a 10% chance that it's going to rain, and they say, I won't need an umbrella, it's only 10%, and then it rains and they're super unhappy. Like, the weather app told me it wasn't going to rain. So, you know, basing your decisions on these probabilities, I can see how that can be a little bit dicey when it comes to these strategic goals. So now that you've already mentioned prescriptive process mining, can you tell us what that is?
Yes. So the next layer in the pyramid is prescriptive process mining, the layer above predictive process mining. And I remind you, it is a pyramid, and a pyramid also implies that not every process that you analyze using a lower layer can be analyzed or managed using the upper layer, because each layer in the pyramid adds some requirements on the data that you have, the type of process that you have, the interventions you can do in your process, the type of questions you have. So never think this is a square building, that everybody, for every process, should do descriptive, diagnostic, predictive, and prescriptive process mining. No, only some processes will be suitable for diagnostic process mining, only a subset will be suitable for predictive process mining, and only some of the processes for which you build predictive dashboards will be suitable for prescriptive process monitoring. Now, what is prescriptive process monitoring? It is a technology that recommends. It is like a recommender system that recommends you to take certain actions in order to, for example, avoid an SLA violation in a given case or in a set of cases in your process, or to avoid a customer complaint. In prescriptive process monitoring there is an element of predictive process monitoring. Of course, these actions that you will take, like for example giving a phone call to the customer, giving them a discount, assigning a specific case manager to their case, these actions are driven partially by predictions, because we think this case will end up on the wrong side of the SLA, so we are going to allocate a case manager, maybe a special case manager, to get it through the line on time. But prescriptive process monitoring differs from predictive monitoring in that it will not necessarily do something just because the probability is 90%; it does something because that thing will add value. What does that mean?
You have a certain KPI, say your KPI is that 95% of my cases fulfill the SLA. And then you put in a prescriptive process monitor that tries to trigger actions in such a way that you can achieve that SLA, but at the same time in such a way that you optimize a certain cost function. Yeah, sure, I could achieve my SLA by just throwing in hundreds of claims handlers, but I just don't have them. My resources are finite, every resource has a cost, so I need to trade off between different KPIs. Some are quality KPIs, others are cost KPIs. A prescriptive process monitor is an engine that triggers interventions to optimize your KPIs, taking into account the predictions, but also taking into account the cost and the effects of different actions. And the type of technology you deploy is actually a little bit different from predictive process monitoring. The type of technology you use for prescriptive process monitoring is usually something called uplift modeling. Uplift modeling is an interesting machine learning or statistical technique that tells you what the benefit is of taking an action in a given state of a given process. For example, by giving a phone call to the customer, you will reduce by 10% the probability that the case will be late. That score is an uplift score. Using it, you can decide which customers you should call or not call so that your SLA is fulfilled and your costs are optimized at the same time.
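The arithmetic behind an uplift score can be shown with a toy calculation: compare the outcome rate of cases where the intervention was done against cases where it was not, per customer segment. A real uplift model would use machine-learning estimators and adjust for confounders; this minimal sketch with invented data only illustrates what the score means:

```python
import numpy as np

# Hypothetical log of past claims: customer segment, whether we called
# the customer (treatment, 1/0), and whether the case was late (1/0).
segment  = np.array(["gold", "gold", "gold", "gold", "std", "std", "std", "std"])
called   = np.array([1, 1, 0, 0, 1, 1, 0, 0])
was_late = np.array([0, 0, 1, 1, 1, 0, 1, 0])

def uplift(seg):
    """Estimated reduction in the probability of being late if we call
    a customer in this segment: control late-rate minus treated late-rate."""
    mask = segment == seg
    treated_rate = was_late[mask & (called == 1)].mean()
    control_rate = was_late[mask & (called == 0)].mean()
    return control_rate - treated_rate  # positive means calling helps

# For "gold" customers calling drops the late rate from 100% to 0%
# (uplift 1.0); for "std" customers it makes no difference (uplift 0.0),
# so with finite capacity we would spend the phone calls on "gold" cases.
```

This is exactly the trade-off in the text: the decision to intervene is driven by the expected benefit of the action, not just by the predicted probability of being late.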
Yeah, I had a question about whether you could actually give us a use case, and you just have, so I guess we can skip that one. But building on prescriptive process mining, we are actually getting to the last step of the pyramid, and this is the peak that, in your current understanding of this pyramid and the process mining we are striving for, we are trying to reach. So after prescriptive we get to augmented, and I guess this is the moment where it's just the system doing things for us. Is that correct?
Yes. Although I don't call it augmented process mining anymore, because it is broader than process mining in a certain way. I call it augmented process management, or augmented process execution. Augmented process execution is where you go beyond prescribing, which means recommending, beyond predicting, beyond describing or diagnosing, and you actually link your insights to actions. And there are two aspects to augmented process management: there is the acting and there is the interacting. So augmented BPM technology has these two aspects, acting and interacting. Acting means that you connect the insights you get, the signals you get from the different layers of process mining, from your predictions, from your prescriptions, from your root cause analysis, etc., to an external system that allows you to trigger certain actions automatically, like sending some emails to the customer or interacting with them in the middle of a conversation. So that part is triggering actions automatically in order to maximize a certain KPI, not only recommending but doing, and learning through feedback loops from what you do. So learning that for certain types of customers it is worth sending them an email, but for others it is better to give them a phone call, in a way becoming situation-aware. That is the action part of augmented BPM. And the second part is the interaction part. An augmented BPM system is also a system that is able to interact with the workers and with the end users automatically. And that is where there is a connection between process mining, BPM, and conversational agents or conversational technology, in particular chatbots. A connection that so far has not really happened, but that I see will naturally happen somewhere around 2024 or 2025, as the lower layers of the pyramid reach a certain level of maturity.
An augmented BPM system is a system that can use insights from predictions, prescriptions, and diagnosis in order to understand the situation of a given case, and to have interactions with the workers and with the customers of the process in order to figure out how to take this case to a successful outcome, while at the same time optimizing certain KPIs, for example cost KPIs. So to summarize, an augmented BPM system is a system that uses insights from predictions, from prescriptions, from diagnosis, the lower layers of that process mining stack, in order to trigger certain actions in different systems and to interact with process workers, managers, and end users in order to maximize certain KPIs, which could be combinations of quality KPIs, cost KPIs, etc. At the moment this is at the level of a vision, of course. The marketing teams of the vendors, including possibly my own marketing team, will tell you that we do it, but people who are on the ground understand that, well, we don't do it yet, but we want to do it and we need to walk in that direction. And I'm pretty confident that building blocks like chatbots, conversational agents, and situation-aware artificial intelligence are becoming ripe enough to start piloting this kind of technology in real-life settings.
Now, I had a question based on the maturity of this AI, because when I hear this, my hair starts to stand up a little bit. I get a little unsure whether all these AI models are as good as we think. I mean, the setup of this cost function for which activity the system should be executing seems ripe for not thinking through all the potential outcomes. My first thought when you mentioned this was: what if I say I want to increase my automation rate, and the augmented process just says, well, if we want to do that, I am just going to do no activity at all, because then there is less chance of it being done manually. So I'm just not going to do anything anymore. Right? So just misinterpreting the tasks that we give it, because machines, you know, AI, don't think, but act differently than we expect them to. So how much fear is there for you that something like this could happen?
Yeah. So that kind of fear actually already starts at the prescriptive level, and one has to be very careful there too. Imagine I have a prescriptive process monitor on my claims handling process, and this prescriptive process monitor is recommending me to, for example, give a phone call to a customer or to send them some message proactively, etc., and imagine that I am following those recommendations. Or it is recommending me to assign resources to a claim. That's a very good example, because if I start assigning resources to claims that are running late, you know what will happen? Other claims that were not going to run late will start running late. That is what we call second-order effects. So already prescriptive process monitoring, if you follow the prescriptions, can have unforeseen second-order effects. So it is very important that every prescriptive monitoring system is embedded in a strict A/B testing regime, where every intervention that you are going to start doing first goes through A/B testing, starting with a small population of cases. So you will say, I will only follow the recommendations for 10% of cases, and then I will compare what I observe there with a 10% random sample of the remaining cases, and I will figure out if we are doing the right thing, if the prescription is having the benefits that it should have, and if it is not having some other side effects somewhere. So that is already true in layer four, in prescriptive process monitoring, let alone in layer five, which is where you are raising the issue, where the system has some level of autonomy and triggers actions itself. So naturally this should be done in a very controlled way, starting with a small amount of A/B testing, to show that indeed the system can be given some autonomy in a given setting, for example autonomy in terms of triggering certain types of emails during the process.
And then you can expand that to the entire set of cases. Everything you do in layer four and layer five of the pyramid should go through A/B testing in one way or another.
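The gated rollout described above, follow the prescriptions for about 10% of cases and keep a comparable 10% control group, amounts to random assignment at case arrival. A minimal sketch, with made-up group names:

```python
import random

random.seed(42)  # fixed seed so the split is reproducible in this sketch

def assign_group(pilot_fraction=0.1):
    """Route a small random share of arriving cases to the pilot (follow
    the prescriptive recommendations) and an equal share to a control
    group handled business-as-usual but tracked for comparison."""
    r = random.random()
    if r < pilot_fraction:
        return "pilot"
    if r < 2 * pilot_fraction:
        return "control"
    return "rest"

groups = [assign_group() for _ in range(10_000)]
pilot_n = groups.count("pilot")
control_n = groups.count("control")
# With 10% + 10% we expect roughly 1,000 cases in each monitored group.
# Late-rates (and side effects on the "rest") are then compared between
# pilot and control before the rollout is expanded.
```

Because both groups are drawn at random from the same case population, a difference in their outcome rates can be attributed to the interventions rather than to the kinds of cases they contain.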
And so I'm assuming the results of this A/B testing can be fed back into the model to kind of make it smarter and act differently the next time.
Absolutely. In fact, it is interesting you mention that, because this technique I mentioned, uplift modeling, which companies like Uber, Booking, Airbnb, or Facebook are using to manage their recommendations, this same technique is used to analyze A/B testing data. So you can feed into this technique observational data, as we call it, which is historical data, event logs, but you can also feed A/B testing data into it, and you can combine these two data sources to determine how good an intervention really is with respect to some KPIs. Of course, that is the quantitative part of it. In addition to that, there is a qualitative part. A/B testing is not just about looking at numbers; it is also about interpreting them. As you say, I could be improving my costs, which means bringing the cost to zero by doing nothing, but then I might be having some side effects somewhere else. So we also need to be doing qualitative assessments and applying just common sense in everything we do.
Marlon, this is all very fascinating, and I think we could talk about this for the next 3 hours, but the truth is that a podcast episode is usually one hour, and we are slowly reaching that point. Where could people go to eventually learn about this more in depth? I know you are a lecturer, but there must also be some other materials, if there are, let's say, some listeners who really want to go deeper into the academic research on these topics.
Yes. I maintain a series of posts on my LinkedIn account, where I write about these topics, for example causal process mining, which is the diagnosis layer, or predictive process mining, etc. And there I always write about the vision and also give some pointers to literature on the topic where you could go deeper. So that is one entry point. Other than that, I cannot put my finger on a book or a reference that will really tell you the whole story, because these are moving pieces as we speak. These are visions that are being built just now. In one year's time there will certainly be more authoritative references: if you want to know about predictive process monitoring, go here; if you want to know about prescriptive process monitoring, go there; or augmented BPM, go there. I'm pretty sure that's a question I can answer for you in one year's time. So be sure to invite me to another podcast episode.
I was just about to say that we already need to schedule another meeting in a year's time to discuss this again in depth. Anyway, Marlon, this has been really lovely. I am very happy to have deep-dived into this topic with an authority like yourself. So thank you very much for accepting our invitation and for coming to the show.
Thanks to you, Jakub and thanks to you, Patrick.
As for the rest of you, our listeners, we are as usual very happy to have you listening to our show and going deeper and deeper on our journey through process mining. I hope that you are enjoying it at least as much as we do, because this is really fascinating. If you have any questions, you can reach out to us at firstname.lastname@example.org. Tell your colleagues, your managers, or people you know who could be interested in process mining about our podcast, and help us grow our show. Thank you very much. Thank you, Patrick. Thank you, Marlon. And talk to you later. Bye bye.