As described in Start Deep Dive to Machine Learning and Action Flow, it is possible to manipulate Celonis EMS functions from the Machine Learning Workbench. Functions related to EMS data were already discussed in the last post, Store Time Series of Metrics to Table. The Machine Learning Workbench also enables developers to manipulate programs in the EMS. Today I would like to demonstrate some functions for manipulating the Transformation (SQL) scripts developed in Data Integration.
[Read More]
Execute Periodic ETL Automatically
As of the last post, Validate Data Model by Studio Analysis, I had completed creating the ETL programs in Data Integration. But this is still a test product, because these programs are executed manually by an operator. In this post, I would like to share the final piece of Data Integration: how to run the ETL programs automatically after the production release.
There are several ways to achieve automatic ETL; for example, in my production system I use the scheduler in the Machine Learning Workbench to trigger a Jupyter Notebook, which then operates Data Integration via Pycelonis (the Python API for Celonis EMS).
[Read More]
Validate Data Model by Studio Analysis
In the last post, Construct My First Data Model, I created a Data Model and loaded data into it. The initial load is normally not perfect, so I should check the data in the Data Model. Today I would like to share how to validate my Data Model using an Analysis. By the way, the main front end of Celonis EMS is now Studio (and Apps for viewers), and Analysis is now part of Studio, so today I will create the Analysis in Studio instead of Process Analytics.
[Read More]
Construct My First Data Model
Today I would like to create a draft version of the data model and complete the series of ETL (Extraction, Transformation, Load) work started last November.
As I showed in Adjust Time Zone of Event Time in Global Transformation, I created a global VIEW for the activity table. Before creating the Data Model, I would like to create one more global VIEW, this time for the case table. As the SQL below shows, it is quite simple: copy all columns of issues in the Planio connection, plus one more column, _CASE_KEY, which converts the integer id to a character type.
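A minimal sketch of what such a view could look like, assuming Vertica SQL and the Planio issues table; the view name here is illustrative, not necessarily the one used in the post:

```sql
-- Hypothetical global VIEW over the case table (names are assumptions).
-- All columns of issues are copied, plus _CASE_KEY derived by casting
-- the integer id to a character type.
CREATE VIEW issues_case_v AS
SELECT
    CAST(i.id AS VARCHAR(50)) AS "_CASE_KEY",
    i.*
FROM issues AS i;
```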
[Read More]
Inspect Table Data by SELECT statement
Today I would like to share a small tip for transformation tasks. Of course, the main purpose of a transformation is to create the event log table and so on, but we are not always sure about the source system's behavior, so before completing a transformation SQL I must investigate the source system's data. As I have already shown, I can use simple SELECT statements to inspect tables.
Previously, when I was familiar with SAP ECC, the only way to easily inspect an SAP table was to download the table data with the SE16 transaction, open it in Microsoft Excel, and simply filter the data or use a pivot table.
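In a Data Integration transformation, the same kind of inspection is just a few ad-hoc queries. A small sketch, assuming the Planio issues table:

```sql
-- Peek at a handful of rows to see what the columns contain
SELECT * FROM issues LIMIT 10;

-- Sanity-check the total row count
SELECT COUNT(*) FROM issues;

-- Value distribution of one column, like a quick Excel pivot table
SELECT status_id, COUNT(*) AS cnt
FROM issues
GROUP BY status_id
ORDER BY cnt DESC;
```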
[Read More]
Adjust Time Zone of Event Time in Global Transformation
Until now I have not cared about the value of the event time that is stored in the source system, such as Planio. If I analyze only one system, as in my current project, there is no problem, but if I want to analyze data from multiple source systems, I should take care of the time zone of each system.
In my case, I previously analyzed multiple SAP ECC systems located in various areas of the world (US, EU, JP etc.
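As an illustration of the kind of adjustment involved, Vertica can attach a zone to a naive timestamp and then convert it; the column name and zones below are assumptions for the sketch:

```sql
-- Interpret the stored naive timestamp as UTC, then express it in Japan time.
-- The first AT TIME ZONE attaches a zone (producing TIMESTAMPTZ),
-- the second converts that instant into the target zone.
SELECT
    created_on,
    (created_on AT TIME ZONE 'UTC') AT TIME ZONE 'Asia/Tokyo' AS created_on_jst
FROM issues;
```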
[Read More]
Handle Day based Activity as Milestone
As of the last post, Unite SQL statements by CASE Expression, I had already created four activities, and they fulfilled the requirements of an event log. Going back to Consider Case ID before Starting Transformation, the case ID is the biggest requirement. Although not as big a requirement as the case ID, the event time is important too. In process mining, the event time should be year, month, and day plus hour, minute, and second (YYYY-MM-DD HH:MI:SS in Vertica format). I assume the event time in process mining usually refers to a timestamp recorded automatically by the system in response to some action.
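For a day-based field, one common workaround is to cast the date to a timestamp pinned to a fixed time of day, so the milestone sorts predictably against system-recorded events. A sketch, where the due_date column and the milestone name are hypothetical:

```sql
-- due_date carries no clock time, so pin the milestone to end of day;
-- the activity name and column names here are placeholders.
SELECT
    CAST(i.id AS VARCHAR(50))                            AS "_CASE_KEY",
    'Reach Due Date'                                     AS "ACTIVITY",
    CAST(i.due_date AS TIMESTAMP) + INTERVAL '23:59:59'  AS "EVENTTIME"
FROM issues AS i
WHERE i.due_date IS NOT NULL;
```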
[Read More]
Unite SQL statements by CASE Expression
Up to the post Split Long SQL Using Views, I had created four activities, each with its own SQL, and I found that three of the four shared the same table join pattern. So I created a VIEW to shorten the JOIN predicate in each SQL. It is nice to shorten the total statement volume, but almost the entire statement except for the JOIN predicate is still duplicated across the SQLs. When a change is required, maintaining each SQL separately is annoying work. Today I would like to integrate the SQLs using a CASE expression.
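Roughly, the idea is to keep the shared JOIN pattern in a single SELECT and let a CASE expression decide the activity name per row. A sketch, assuming Redmine-style journal tables behind Planio and placeholder status values rather than the post's actual mapping:

```sql
-- One statement instead of several near-identical ones: the JOIN appears
-- once, and the CASE expression maps each status change to an activity.
INSERT INTO _CEL_ACTIVITIES ("_CASE_KEY", "ACTIVITY", "EVENTTIME")  -- assumed target table
SELECT
    CAST(j.journalized_id AS VARCHAR(50)) AS "_CASE_KEY",
    CASE d.value
        WHEN '2' THEN 'Start Progress'  -- placeholder ids and names
        WHEN '3' THEN 'Resolve Issue'
        WHEN '5' THEN 'Close Issue'
    END AS "ACTIVITY",
    j.created_on AS "EVENTTIME"
FROM journals AS j
JOIN journal_details AS d ON d.journal_id = j.id
WHERE d.prop_key = 'status_id'
  AND d.value IN ('2', '3', '5');
```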
[Read More]
Split Long SQL Using Views
In the last post, Compose Activity from Joining Multiple Tables, I wrote SQL to create the second activity, Close Issue. The final version of the SQL statement was long even though I used just three tables. In a real process mining project I handled hundreds of tables and wrote quite long SQLs, and I faced the same patterns of SQL in multiple activities. So I introduced VIEWs into my SQLs to group the same pattern of SELECT SQL, similar to creating a function (method) in programming.
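As a sketch of the idea, again assuming Redmine-style journal tables, the shared join pattern can live in one view that every activity SQL then selects from:

```sql
-- The repeated join pattern extracted once, like a reusable function.
CREATE VIEW issue_status_changes AS
SELECT
    j.journalized_id AS issue_id,
    d.old_value      AS old_status_id,
    d.value          AS new_status_id,
    j.created_on     AS changed_on
FROM journals AS j
JOIN journal_details AS d ON d.journal_id = j.id
WHERE j.journalized_type = 'Issue'
  AND d.property = 'attr'
  AND d.prop_key = 'status_id';
```

Each activity SQL then shrinks to a short SELECT against issue_status_changes, filtered by the status value it cares about.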
[Read More]
Compose Activity from Joining Multiple Tables
In the last post, Insert Simple Record into Activity Table, I created SQL to insert the Raise Issue activity. That SQL was simple because only one table, issues, was used as the data source. Today I will create the SQL for the Close Issue activity, which requires multiple tables.
As in the previous post, first I should analyze which columns compose the Close Issue activity. This activity happens when the Status column of an issue is changed, so I need the change history.
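Since Planio is based on Redmine, the change history presumably lives in journal tables; a Close Issue statement could then look roughly like this, with every table and column name an assumption about that schema:

```sql
-- Join the issue change history to the status master and keep only
-- the changes that moved an issue into a closed status.
SELECT
    CAST(j.journalized_id AS VARCHAR(50)) AS "_CASE_KEY",
    'Close Issue'                         AS "ACTIVITY",
    j.created_on                          AS "EVENTTIME"
FROM journals AS j
JOIN journal_details AS d ON d.journal_id = j.id
JOIN issue_statuses  AS s ON s.id = CAST(d.value AS INT)
WHERE j.journalized_type = 'Issue'
  AND d.prop_key = 'status_id'
  AND s.is_closed = true;  -- is_closed assumed to be a boolean flag
```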
[Read More]