Until last post Validate Data Model by Studio Analysis, I completed creating ETL programs in Data Integration. But this is still test product because this programs are executed manually by operator. In this post, I would like to share the final piece of Data Integration, how to run ETL programs automatically after production release.
There are several ways to achieve automatic ETL, for example in my production system I am using scheduler in Machine Learning Workbench to trigger Jupyter Notebook, then operate Data Integration via Pycelonis (Python API for Celonis EMS).
[Read More]
Validate Data Model by Studio Analysis
At last post Construct My First Data Model, I created Data Model and load data to it. Normally initial load is not perfect, so I should check data in Data Model. Today I would like to share how to validate my Data Model using Analysis. By the way, Celonis EMS main function is now Studio (and App for viewer) and Analysis is also part of Studio, so today I will create Studio instead of Process Analytics to create Analysis.
[Read More]
Construct My First Data Model
Today I would like to create draft version of data model then complete series of ETL (Extraction, Transformation, Load) started from last November.
As I showed in Adjust Time Zone of Event Time in Global Transformation, I created global VIEW for activity table. Before creating Data Model, I would like to create one more global VIEW against case table. As below SQL, it is quite simple to copy all columns of issues in Planio connection, plus one more column _CASE_KEY that is converting from interger id to character.
[Read More]
Determine Process Mining Tables based on Project Goal
I talked about how to consider Case ID in advance in last post. That is same as goal setting of process mining project, it means what would like to be measured in process mining and why. Case is the unit of measuring performance and grouping activities.
Considering my case of Planio, for example I would like to measure 1) Throughput time from raising issue to closing it, 2) How many users are involved until closing issue.
[Read More]
Create Key Column of Activity Table
Celonis Data Model always require unique key in case tabel (case key) to group activities belong to each case. How about activity table ? Activity table do not have explicit key column, instead combination of case key, activity name, timestamp, and sorting number are similar to activity key (those four columns are configured in Data Model).
This week I was asked to create activity table key column due to duplication check purpose.
[Read More]
Categorize and Name Activity
In the previous post Transform Source System Tables to Minimize Data Model Tables, I recommended to convert some kind of source system tables to Activity. Today I focus on Activity and would like to share my way how to categorize and name Activity.
First point is to split Activity name to two parts, more general part and detail part. For example, in SAP ECC or S4HANA Order to Cash process, general Activity name is Create Sales Order when data committed in VA01 transaction.
[Read More]
Transform Source System Tables to Minimize Data Model Tables
In the last part of previous post Utilize N-M relationship between Activity and Dimension Tables, I said just mimicing source system’s table structure to data model is not useful for Celonis data model. Also I feel loading time to data model is exponentially long when one more table is added. So minimizing data model table is good practice especially for real time analytics.
In case of SAP Order to Cash (O2C) scenario, case table is sales order item (VBAP).
[Read More]
Utilize N-M relationship between Activity and Dimension Tables
When you determine data model structure by yourself, basically you should follow snowflake schema writtern in Understand how Tables are joined in Data Model. From case table perspective, n side is case table, and 1 side is another dimension table. On the other hand, activity table is the first case that 1 side is case table. If you determine second dimension table that 1 side is case table, please note that you can not use both activity and that dimension table at once because those tables are N-M relationship via case table.
[Read More]
Understand how Tables are joined in Data Model
In the Understand Difference between Dimension and KPI post, I mentioned that Columns used as dimensions and KPIs are inplicitly joined based on data model, but I did not mention how to join tables in data model. Today I would like to say about it.
Celonis Data Model support snowflake schema that consists of multiple tables, each table pair join as 1:N relationship. Normally activity table is top of N side and dimension tables including case table are bottom of 1 side as pyramid hierarchy.
[Read More]