News

Building the foundations for Federated Healthcare Research

Published on:

June 1, 2026

On 7 May 2026, the INDICATE training session on the Extract, Transform, Load (ETL) process used in the project took place. The session was delivered by Celia Alvarez-Romero, María Parra Rodríguez-Armijo and María González-Lopez.

The programme is designed to support data providers in implementing ETL processes and adopting the Common Data Model (CDM) within the INDICATE infrastructure effectively, securely and in a fully standardised way. It helps participants, such as clinicians and data engineers, build both the conceptual understanding and the practical skills needed to work with interoperable health data.

This training session helped participants better understand the INDICATE data architecture, including its dual CDM approach. Participants also learned more about the technical requirements, tools and skills needed to implement ETL processes successfully in their organisations.

During the session, the importance of good data preparation was explained. Before starting the ETL process, healthcare data must be organised, checked and prepared correctly. The Data Provider Handbook supports organisations with practical guidance and explains the minimum technical and procedural requirements needed to transform local intensive care data into the OMOP Common Data Model (OMOP CDM).

The session also explained how the federated approach in INDICATE works. Data stays safely stored at each hospital or organisation and is not transferred to a central database. This helps protect patient privacy and supports secure collaboration between partners across Europe.
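As a rough illustration of the federated idea described above, the sketch below has each site compute a small summary locally and share only that summary, never patient-level rows. The function names (`local_summary`, `combine`) and the choice of a pooled mean are assumptions made for this example; they are not taken from the INDICATE infrastructure.

```python
def local_summary(values):
    """Run inside one hospital: reduce local values to a count and a sum."""
    return {"n": len(values), "sum": float(sum(values))}


def combine(summaries):
    """Run by the coordinating party: pool site summaries into one mean.

    Only the per-site counts and sums are needed, so no individual
    record ever leaves its site.
    """
    n = sum(s["n"] for s in summaries)
    total = sum(s["sum"] for s in summaries)
    return total / n if n else None


# Hypothetical heart-rate values held at two separate sites.
site_a = local_summary([80, 90])
site_b = local_summary([70])
pooled_mean = combine([site_a, site_b])
```

Real federated deployments typically add further safeguards on top of this pattern, such as access control and checks on very small counts.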

Participants also followed the INDICATE data workflow from local ICU source systems to data ready for federated use. The session explained how information from clinical environments can be identified, extracted, transformed through ETL logic, loaded into a local OMOP CDM instance and checked before analysis. HL7 FHIR was presented only as an optional support layer for structured access or interoperability when available or useful, while the main transformation pathway focuses on preparing local ICU data in OMOP CDM. The workflow also highlighted the importance of the local environment, where execution, semantic alignment, validation and governance controls come together before data can support distributed analyses.
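The extract, transform and load steps of this workflow can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the INDICATE ETL implementation: the source field names (`patient`, `code`, `time`, `value`), the `SOURCE_TO_CONCEPT` mapping and its concept IDs are placeholders invented for the example. The target field names follow the OMOP CDM measurement table, and a concept ID of 0 is the OMOP convention for "no matching concept".

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical mapping from local source codes to standard concept IDs.
# The IDs are placeholders, not verified OMOP vocabulary values.
SOURCE_TO_CONCEPT = {"HR": 1001, "SPO2": 1002}


@dataclass
class MeasurementRow:
    """Subset of the OMOP CDM measurement table used in this sketch."""
    person_id: int
    measurement_concept_id: int
    measurement_datetime: datetime
    value_as_number: float
    measurement_source_value: str


def extract(raw_rows):
    """Yield only source rows that actually carry a value."""
    for row in raw_rows:
        if row.get("value") not in (None, ""):
            yield row


def transform(row):
    """Map one source row to the target structure; unmapped codes get 0."""
    return MeasurementRow(
        person_id=int(row["patient"]),
        measurement_concept_id=SOURCE_TO_CONCEPT.get(row["code"], 0),
        measurement_datetime=datetime.fromisoformat(row["time"]),
        value_as_number=float(row["value"]),
        measurement_source_value=row["code"],
    )


def run_etl(raw_rows):
    """Extract, transform and load into an in-memory stand-in for the local table."""
    table = []
    table.extend(transform(r) for r in extract(raw_rows))
    return table


raw = [
    {"patient": "7", "code": "HR", "time": "2026-05-07T10:00:00", "value": "82"},
    {"patient": "7", "code": "XYZ", "time": "2026-05-07T10:05:00", "value": "1.5"},
    {"patient": "7", "code": "HR", "time": "2026-05-07T10:10:00", "value": ""},
]
loaded = run_etl(raw)
```

Keeping the original code in `measurement_source_value` means an unmapped row can be traced back and remapped later instead of being lost.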

The ETL tooling section showed how this workflow can be translated into practical implementation steps for data providers. These steps include confirming local readiness, profiling source data, defining mappings, implementing transformation logic, loading the local OMOP CDM, running quality checks and refining the process when issues are detected. The tools were presented as parts of an iterative workflow rather than as isolated components, supporting profiling, mapping, vocabulary alignment, implementation and post-load validation. This helped clarify how INDICATE moves from architecture to execution, turning complex ICU data into reliable, comparable and analysis-ready resources.
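To make the post-load validation step more concrete, here is a small sketch of the kind of check that could feed the "refine when issues are detected" loop. It is not one of the tools presented in the session: the `quality_report` function, the plausibility ranges and the concept IDs are assumptions for illustration only.

```python
def quality_report(rows, plausible_ranges):
    """Summarise two simple post-load checks on measurement-like rows.

    rows: dicts with 'measurement_concept_id' and 'value_as_number'.
    plausible_ranges: hypothetical {concept_id: (low, high)} bounds.
    """
    total = len(rows)
    # Concept ID 0 marks rows whose source code could not be mapped.
    unmapped = sum(1 for r in rows if r["measurement_concept_id"] == 0)

    implausible = 0
    for r in rows:
        bounds = plausible_ranges.get(r["measurement_concept_id"])
        if bounds is not None:
            low, high = bounds
            if not (low <= r["value_as_number"] <= high):
                implausible += 1

    return {
        "total": total,
        "unmapped_fraction": unmapped / total if total else 0.0,
        "implausible": implausible,
    }


rows = [
    {"measurement_concept_id": 1001, "value_as_number": 80.0},
    {"measurement_concept_id": 0, "value_as_number": 1.5},
    {"measurement_concept_id": 1001, "value_as_number": 900.0},
    {"measurement_concept_id": 1001, "value_as_number": 60.0},
]
report = quality_report(rows, {1001: (0, 300)})
```

A high unmapped fraction would point back to the mapping step, while implausible values would point back to the transformation logic or the source data, which is what makes the workflow iterative.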

The INDICATE training programme on Data Model & Data Enablement consists of five sessions. The next and final session will take place on 17 June, 14:00–16:00 CEST.