Use Pseudonymized Column as Grouping Key

One of the biggest headaches for a data engineer like me is how to ensure data security when extracting data. Personal information in particular must be handled carefully; otherwise I may be penalized under regional laws (e.g. GDPR). When I operate Celonis EMS, I try not to extract sensitive information in the first place; for example, I do not extract the customer address table (the ADRC table in SAP etc.). But this information is sometimes useful as a grouping key for counting cases etc. [Read More]
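As a rough illustration of the idea in this teaser, the sketch below replaces a sensitive value with a keyed hash and then counts cases per hashed value. It is only a minimal sketch under my own assumptions: the key `SECRET_KEY`, the function `pseudonymize`, and the sample rows are all hypothetical, and this is not how Celonis EMS itself pseudonymizes columns.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret key (assumption); in practice keep it outside the code.
SECRET_KEY = b"example-secret-key"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 digest; the same input and key give the same digest."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Made-up sample rows: (case id, customer city)
rows = [("C1", "Berlin"), ("C2", "Munich"), ("C3", "Berlin")]

# Group by the pseudonymized value instead of the readable one.
cases_per_group = Counter(pseudonymize(city) for _, city in rows)

for group, count in cases_per_group.items():
    print(group[:12], count)
```

Because equal inputs map to equal digests, the hashed column still works as a grouping key even though the original value is no longer readable.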

Understand Delta Load Configuration Difference in Adding Column Scenario

Last time I showed the behavior when I added a new record and then extracted that record by Delta Load (Verify Cloning Table Contents via Delta Load). Delta Load is an effective way to minimize extraction effort, but it cannot always be applied. Today, continuing from the previous post, I would like to add a column to the cloned table and observe the behavior of the extraction task. Once a system, including its database, goes into operation, its requirements normally change and its functions and database are extended etc. [Read More]

Verify Cloning Table Contents via Delta Load

Following last week’s Minimize Extraction Time by Delta Load Option, today I would like to insert a new record into the Postgres table and then try Delta Load again to extract it. To do this, I will start by opening pgAdmin, which is already running on my local machine after docker-compose. The first step is to enter localhost:5050 in my browser, then at the login screen enter as email and pgadmin as password, then click the login button. [Read More]

Minimize Extraction Time by Delta Load Option

Last week I extracted a Postgres table and looked at the log to understand the mechanism of data transfer. At that time I used the Full Load option to extract data, which replaces all table contents and the schema with the latest version. That is the easiest way to synchronize tables between the source system (Postgres) and Celonis, but it takes a lot of time to complete. So I should also use the second option, Delta Load, to minimize extraction time. [Read More]
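To make the difference between the two options concrete, here is a toy sketch in plain Python. It is an assumption-laden illustration, not the Celonis implementation: the `Row` layout, the `updated_at` column, and the functions `full_load` and `delta_load` are hypothetical names I introduce only for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical row layout (assumption): (id, value, updated_at)
Row = Tuple[int, str, int]


def full_load(source: List[Row]) -> Dict[int, Row]:
    """Rebuild the target from scratch with every source row."""
    return {row[0]: row for row in source}


def delta_load(target: Dict[int, Row], source: List[Row], last_run: int) -> int:
    """Upsert only rows changed after last_run; return how many rows were moved."""
    changed = [row for row in source if row[2] > last_run]
    for row in changed:
        target[row[0]] = row
    return len(changed)


source = [(1, "a", 10), (2, "b", 20)]
target = full_load(source)          # first run: transfer everything

source.append((3, "c", 30))         # a new record arrives in the source
moved = delta_load(target, source, last_run=20)
print(moved, len(target))
```

In this sketch the delta run moves a single row instead of the whole table, which is where the time saving comes from. Note that this simple version does not notice rows deleted in the source.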

Look at Data Transfer Process by Data Job Log

Last week I posted Connect to Celonis and Bring Back Instruction to look at how the Extractor works to connect Celonis and Postgres. This week I would like to extract data from Postgres and look at the data transfer process through the data job log. In Data Integration, I create a new Data Job with the Data Connection I created last week, then create a new extraction task. In the next screen I add new table public. [Read More]

Connect to Celonis and Bring Back Instruction

Last week I started the Data Integration series and posted Run Extractor on Your Local Machine to prepare my Extractor and Postgres database. Today I will start using the Extractor and show you the mechanism to extract data safely. This week I will start the Extractor and Postgres with the following two steps (same as last week). Open VS Code and open a terminal at the celonis-postgres folder. Enter docker-compose up in the VS Code terminal. [Read More]

Run Extractor on Your Local Machine

Starting this week I would like to share my experience with the Data Integration functions (Extraction, Transformation, Load etc.). To do this, I will create sample source systems and build code in the Celonis training environment. As the first topic, I would like to explain the on-premise Extractor, which sits between your source systems (SAP, Oracle etc.) and Celonis EMS and supports transferring data. By the way, because I do not want to pay for source system licences for this blog, I will use the open source Postgres database. [Read More]