Insert Simple Record into Activity Table


In the last post of Determine Process Mining Tables based on Project Goal, I talked strategy of transforming Planio data to event log. Then I wrote SQL to create activity table.

From now I would like to insert activity record to that table. Based on the discussion of last post, at least I need two activities Raise Issue and Close Issue to measure throughput time between them. That is the next step to achieve my goal.

First, Raise Issue activity is happened when Planio issue record is created. As I already looked at Configure Endpoint for Suitable Extraction, when new issue record is created, it is transferred to issues table in Celonis EMS. I can see created_on column as EVENTTIME and author$id column as changed_by in issue table, and of course issues table has its primary key column id, it is enough to use only issues table to create Raise Issue activity. Simply I can write draft SQL as below.

-- draft --
INSERT INTO _cel_pl_activities (
     _CASE_KEY
    ,ACTIVITY_EN
    ,EVENTTIME
    ,changed_by
)
SELECT DISTINCT
     id AS _CASE_KEY
    ,'Raise Issue' AS ACTIVITY_EN
    ,created_on AS EVENTTIME
    ,author$id AS changed_by
FROM issues
;

Before executing all INSERT statement, I recommend to execute SELECT part of this SQL beforehand (as below), to validate table or column name and expected result. If you execute INSERT statement many times, duplicated record are inserted to issues table and it is difficult to validate SELECT result.

image

For example, in this case I was not sure whether author$id is appropriate value for operating user ID, so I tried "author$id"||"author$name" instead at SELECT statement and confirmed this is actually user ID (by the way, I decide not to include author$name because it is not required for further analysis. If author$id is not there and it is required to hide personal information, then I use pseudonymization. see Use Pseudonymized Column as Grouping Key).

To complete first SQL, I need two more columns _SORTING and _CELONIS_CHANGE_DATE. _SORTING is the integer value to prioritize activity order if two or more activities are same EVENTTIME. It is sometimes happened if I changed many fields and save data at once e.g. I assume there is no activity that happened same timing as Raise Issue, so any value is fine. This time I set 0. Regarding _CELONIS_CHANGE_DATE, I already saw this column at Verify Cloning Table Contents via Delta Load. This column is automatically inserted timing of extraction. It is effective for future debug, so I decide same strategy to my activity table too. I can use NOW() function to insert timinig of INSERT execution.

Final version of SQL to insert Raise Issue is below.

-- final --
INSERT INTO _cel_pl_activities (
     _CASE_KEY
    ,ACTIVITY_EN
    ,EVENTTIME
    ,_SORTING 
    ,changed_by
    ,_CELONIS_CHANGE_DATE
)
SELECT DISTINCT
     id AS _CASE_KEY
    ,'Raise Issue' AS ACTIVITY_EN
    ,created_on AS EVENTTIME
    ,0 AS _SORTING 
    ,author$id AS changed_by
    ,NOW() AS _CELONIS_CHANGE_DATE
FROM issues
;

Today is the simplest transformation case in process mining. Next time I would like to show another case of joining tables.

Kaz


See also