Trace2Vec-CDD: A Framework for Concept Drift Detection in Business Process Logs using Trace Embedding

Document Type : Semantic Technology-Kahani


Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran.



Business processes change over time during their execution, for example because of new legislation or seasonal effects. Detecting such changes is known as business process drift detection. In existing methods, the accuracy of drift detection depends on the choice of window size. Furthermore, most methods must select appropriate features to describe the relations between traces or events. This paper draws on the notion of trace embedding to propose a new framework, Trace2Vec-CDD, for the automatic detection of sudden process drifts. The main contributions of the proposed approach are: (i) it is independent of windows; (ii) trace embedding, which is used for drift detection, makes it possible to extract all features from the relations between traces automatically; and (iii) in experiments on synthetic event logs, the approach outperforms current methods in accuracy and drift detection delay.


