
Author: Irene Gebuis

The OMOP Common Data Model explained: speaking the language of health data

The second training of the INDICATE Training Programme on Interoperability, OMOP and Vocabularies took place on April 9, 2026. The programme is designed to support data providers in using the INDICATE infrastructure effectively, securely, and in a fully standardised way. It helps participants, such as clinicians and data engineers, build both the conceptual understanding and practical skills needed to work with interoperable health data.

During this second training, led by Maxim Moinat (Data Engineer and OHDSI Collaborator, Erasmus MC) and moderated by Boris Delange (MD, Medical Informatics, Université de Rennes), participants learned that data from different hospitals and institutions must be combined and compared to enable research at a European level. However, this is only possible when data is structured in a way that makes comparison meaningful and reliable.

This is where standardisation becomes essential. Without a shared structure, data remains fragmented across systems, making large-scale analysis difficult or even impossible. By harmonising data into a common format, researchers can generate evidence that is consistent, reproducible, and scalable across countries.

The OMOP Common Data Model provides exactly this: a shared way of organising patient data and a shared vocabulary for describing clinical events, so that hospitals across Europe can describe the same reality in the same terms. Maxim walked participants through the main building blocks of the model and showed how they apply to ICU data, with concrete examples detailed during the session. He also presented the wider OHDSI community and European networks such as EHDEN and DARWIN EU, which already federate data on hundreds of millions of patients.
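To make the idea of a shared vocabulary concrete, here is a minimal sketch in Python. It is not taken from the training material. The hospital names, local codes, and concept IDs (1001, 1002) are hypothetical placeholders, not real OMOP concept IDs. The field names loosely follow the OMOP MEASUREMENT table, and the use of concept ID 0 for unmapped codes follows the OMOP convention.

```python
from dataclasses import dataclass

# Hypothetical lookup from (site, local code) to a standard concept ID.
# The IDs are placeholders for illustration, not real OMOP concept IDs.
SOURCE_TO_STANDARD = {
    ("hospital_a", "HR"): 1001,            # heart rate, local code at site A
    ("hospital_b", "Herzfrequenz"): 1001,  # same clinical concept at site B
    ("hospital_a", "SBP"): 1002,           # systolic blood pressure
}

UNMAPPED_CONCEPT_ID = 0  # OMOP convention for "no matching concept"


@dataclass(frozen=True)
class MeasurementRow:
    """A simplified row, loosely modelled on the OMOP MEASUREMENT table."""
    person_id: int
    measurement_concept_id: int
    value_as_number: float
    measurement_source_value: str  # original local code, kept for traceability


def to_standard(site: str, person_id: int, local_code: str, value: float) -> MeasurementRow:
    """Translate one local record into the shared structure and vocabulary."""
    concept_id = SOURCE_TO_STANDARD.get((site, local_code), UNMAPPED_CONCEPT_ID)
    return MeasurementRow(person_id, concept_id, value, local_code)


row_a = to_standard("hospital_a", 1, "HR", 72.0)
row_b = to_standard("hospital_b", 2, "Herzfrequenz", 80.0)
# Both rows now share one concept ID, so they can be compared directly.
print(row_a.measurement_concept_id == row_b.measurement_concept_id)
```

In this sketch, two hospitals that label heart rate differently end up with the same concept ID, while each row keeps its original local code in the source value field.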

Maxim then walked participants through the full journey from raw hospital data to interoperable, OMOP-formatted data, step by step, from the initial exploration of the source system to the final validation of the mapped database. At each stage, he introduced the corresponding tools from the OHDSI ecosystem, a suite of open-source resources designed to support data providers throughout the process. He also showed how the INDICATE Data Dictionary, presented in Session 1, fits into this journey by guiding data providers on which clinical concepts to prioritise for mapping.

The session concluded with key take-home messages on the importance of clear mapping specifications, vocabulary alignment, and the value of a shared data model for enabling collaborative research across institutions and countries.

Overall, the training provided participants with both a conceptual and practical understanding of how the OMOP CDM and the surrounding OHDSI ecosystem support interoperable and scalable health data research within INDICATE.

The next training will focus on the ETL workflow, data preparation requirements, and data quality expectations, and is planned for May 7, 2026.

Read more about the first training.

Marcel Giemsa

Position: Work Package 6

Who am I & what do I do?

My name is Marcel Giemsa, and I work as a Research Associate at the Department of Cardiology, Pulmonology and Angiology at the University Hospital Düsseldorf, in the group of Prof. Dr. Dr. med. Christian Jung. I hold a Bachelor’s degree in Biology from Ruhr University Bochum, and this summer I will complete a second Bachelor’s degree in Computer Science at Heinrich Heine University Düsseldorf. I originally joined Christian’s group as a student assistant, and after some time he asked me whether I would like to take on a larger role within the team — which is how I ended up working on INDICATE.

What am I up to during INDICATE?

Within INDICATE, I am responsible for the coordination of Work Package 6 and take on technical coordination tasks across the project. This includes a fair amount of hands-on project management, as well as supporting data analysis activities. A core part of my role is acting as a bridge between the clinical and the technical domains: translating clinical requirements into technical specifications, and making sure that the technical work stays aligned with the real-world needs of the ICUs and clinicians we serve. Because WP6 sits at the intersection of so many topics, a lot of my day-to-day work is about keeping the different threads connected and making sure information flows between the people who need it.

What motivates me to be part of INDICATE?

Combining medicine and IT has been something I wanted to do for as long as I can remember. It is the reason I studied both Biology and Computer Science in the first place. Working with Prof. Jung has been a great experience from day one: we quickly realized that our backgrounds complement each other well, and there is a lot of mutual trust in what each of us brings to the table. On top of that, AI in medicine is a field I find genuinely exciting, and at the same time one of the hardest when it comes to getting access to high-quality data. INDICATE tackles exactly that bottleneck, and being able to contribute to a project that works on this problem at a European scale is a rare opportunity that I didn’t want to miss.

What do I expect to accomplish within INDICATE?

On a personal level, my main goal is to deeply understand how a large EU project like INDICATE is structured — from governance and reporting to the technical coordination between dozens of partners — and to learn which pitfalls tend to come up along the way. EU projects are complex, and a lot of the knowledge about how to run them well is experience-based. I want to build exactly that kind of experience, so that in future EU projects I can help avoid problems before they occur and contribute from an even stronger starting position. Beyond that, I hope to see WP6 deliver results that genuinely support the wider project and the clinicians and researchers who will eventually work with the INDICATE infrastructure.

How does my background or expertise contribute to the goals of INDICATE?

My dual background is what I try to bring into the project every day. From Biology, I bring scientific working practices and an understanding of biological and medical processes, which helps me engage meaningfully with clinical partners and the use cases we are building around. From Computer Science, I bring a solid foundation in machine learning and in thinking about data and systems. In a project like INDICATE, where clinicians, data scientists, engineers, and ML researchers all need to work together, the biggest challenge is often not the individual disciplines but the communication between them. Because I know both “camps” from the inside, I can translate between them, ask the right questions on either side, and help make sure that technical decisions respect clinical reality — and vice versa. That is the contribution I try to make within INDICATE.

INDICATE Training Programme – Legal Framework

In order to support all consortium members in using the INDICATE infrastructure effectively, correctly, and securely, we are organizing a three-session series of the INDICATE Training Programme on Legal Framework, running in parallel with and complementing the ongoing Data Models sessions. 

The programme will give participants a comprehensive understanding of the INDICATE legal framework, covering GDPR and EHDS principles, data protection and privacy-enhancing technologies, governance and rulebook structures, and practical skills to navigate data access processes, compliance requirements, and organizational implementation challenges within INDICATE.

Session dates

All sessions will be held from 14:00 to 16:00 (CEST) via Zoom.

  • May 4 – Session 1 | Understanding GDPR
  • June 24 – Session 2 | Understanding and using Data Access
  • September 10 – Session 3 | Understanding the Rulebook and legal onboarding steps

Vacancy: Statistician / Applied Mathematician (INDICATE Project)

Position Overview

AP-HP Assistance publique – Hôpitaux de Paris, a valued partner for the INDICATE project, is seeking a highly motivated Statistician / Applied Mathematician / Data Scientist to contribute to the development and validation of predictive models of organ failure in critically ill patients. The position is part of the European INDICATE project and focuses on translational research at the interface between medicine, statistics, and artificial intelligence.

Scientific Scope

INDICATE focuses on predicting major organ failures in ICU patients using multimodal data (clinical, biological, and high-frequency physiological signals). The goal is to identify early predictive signatures of organ dysfunction (renal, respiratory and cardiovascular) and support personalized decision-making in critical care.

Methodological Framework

The candidate will implement and validate advanced statistical and machine learning models, including supervised learning, time-series modeling, and trajectory analysis. Key aspects include feature engineering from high-frequency data, handling missing data, model calibration and discrimination assessment, and external validation when available.

Required skills

  • Strong background in statistics, applied mathematics, or data science
  • Experience in predictive modeling and machine learning
  • Programming skills: Python (mandatory), SQL; Java/C++ is a plus
  • Interest in biomedical applications and clinical data

Contract and Conditions

  • Fixed-term contract (18 months)
  • Full-time (100%)
  • Location: INSERM U942, Paris (AP-HP / Université Paris Cité)
  • English required; French not mandatory

Application process

To apply for this position, please send your CV and motivation letter to Dr. Benjamin Deniau (benjamin.deniau@aphp.fr) and Ms. Fatima Zunara (fatima.zunara@aphp.fr).


INDICATE Training session on Onboarding & Data Model: Unlocking ICU data across Europe without moving patient data

This week marked the first session of the INDICATE Training Programme, designed to support data providers in using the INDICATE infrastructure effectively, securely, and in a fully standardized way.

The session, guided by moderator Maxim Moinat (Data Engineer, Erasmus MC) and co-moderated by Maarten Ligtenberg (Co-founder, Cradeq), provided a solid introduction to several key building blocks: the INDICATE onboarding framework, interoperability in complex healthcare data environments, federated data infrastructure, and secure data-sharing principles.

Jan van den Brand (technical lead INDICATE) highlighted key challenges in ICU clinical decision-making and innovation, driven by fragmented data, a lack of standardized data-sharing agreements, and limited secure infrastructure. He illustrated this using a metaphor: hospitals today resemble a house with different types of power sockets, where every device requires its own adapter to function.

In this analogy, medical and AI software represent the appliances, while hospital systems such as electronic health records and laboratory databases represent the power sources. Without a shared standard, hospitals are often forced to build and maintain these “adapters” themselves, increasing complexity, cost, and operational risk. This underlines the need for shared standards and interoperable data models.

A central theme, introduced by our presenter Boris Delange (Doctor in Medical Informatics, Université de Rennes), was the reality of hospital data: each institution often uses its own “language” to describe the same clinical concepts. This creates significant challenges for interoperability and data integration, while also highlighting the importance of standardization for enabling meaningful reuse of healthcare data in research and innovation. 

Boris also addressed the broader context of Hospital Information Systems and Clinical Data Warehouses, focusing on challenges related to data quality, semantic alignment, and making heterogeneous data usable beyond clinical care. Despite its value, a large proportion of hospital data (97%!) remains underutilized for research purposes.

INDICATE addresses this challenge by developing a federated data infrastructure, where data remains securely stored within its original institution (the data never leaves the hospital) while becoming interoperable and accessible for analysis across organisations through shared standards.
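The federated principle can be illustrated with a minimal Python sketch. This is an assumption-laden toy example, not the INDICATE platform's actual implementation. The function names `local_summary` and `pooled_mean` and the sample values are hypothetical. Each hospital computes a small summary on its own data, and only those summaries are combined.

```python
from typing import Dict, List


def local_summary(values: List[float]) -> Dict[str, float]:
    """Runs inside one hospital: returns aggregates only, never patient rows."""
    return {"n": len(values), "total": sum(values)}


def pooled_mean(summaries: List[Dict[str, float]]) -> float:
    """Runs at the coordinating side: combines per-site aggregates."""
    n = sum(s["n"] for s in summaries)
    if n == 0:
        raise ValueError("no observations reported by any site")
    return sum(s["total"] for s in summaries) / n


# Hypothetical per-site values that stay inside each hospital.
site_a = local_summary([1.0, 2.0, 3.0])
site_b = local_summary([5.0])

# Only the two small summaries are shared; the pooled mean is 11 / 4.
print(pooled_mean([site_a, site_b]))  # 2.75
```

A real federated network adds governance, access control, and disclosure checks on top of this idea. The sketch only shows why a shared data model matters: every site must compute the same summary in the same way for the combined result to be meaningful.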

The training programme consists of five sessions. The next session will take place on April 9, 14:00–16:00 CEST.

The training sessions are organised by Maarten Ligtenberg, Melania Istrate, Elisa Vera, Jan van den Brand, Aliza Bos, Maaike van Zuilen, and Irene Gebuis, in a collaboration between Work Packages 1 and 5 and the INDICATE Training and Education Workgroup.

Moving from Vision to Implementation in federated ICU Research

We’re gathered in Brussels for the INDICATE Design Workshop – a two-day event bringing together hospitals, technical experts, clinicians and communication advisors from across Europe. INDICATE is building a federated data infrastructure for intensive care data. That means: collaborating on better care and research, without patient data ever leaving a single hospital.

Day 1 ‘Understanding the data provider journey’

  • Jan van den Brand walked us through the INDICATE mission (why federated ICU data matters for European healthcare) and the Onboarding Blueprint (the journey from commitment to production)
  • Bert Cappelle gave a detailed and inspiring live demo on how to conduct a study with federated data within the INDICATE platform

The afternoon was hands-on: data providers and guests worked through a stakeholder mapping exercise, identifying who is needed to implement INDICATE within a hospital and whether those stakeholders can actually be named today. This was followed by a data provider gap analysis focused on, for example, identity and access management.

This year we are making the shift from concept to implementation in real hospital environments. That takes collaboration, honesty about challenges, and a willingness to learn from each other. And that’s what we did today!

Day 2 ‘What does it actually take to onboard a hospital as a data provider?’

During day 2 of the INDICATE Design Workshop we spent the day working through what it actually takes to onboard a hospital as a data provider: not in theory, but in practice. 

What does the onboarding journey look like? What do data providers need to implement INDICATE in their organisation, and who needs to be involved? During the day we challenged ourselves to rethink the onboarding process and follow-up steps from a data provider's perspective. By the end of the day, all attending data providers had developed an actionable implementation roadmap for the next three months.

One principle that kept coming back: patient data never leaves the hospital. That’s not just a technical design choice, it’s the foundation of trust that makes the federated data network possible.

Next meeting? On Monday, March 30, the INDICATE Training Programme on Data Enablement & Data Model will start! We look forward to welcoming you to the session!

Thanks to all who actively participated in person and online!

Celia Alvarez-Romero (Servicio Andaluz de Salud), Marcel Giemsa (Universitätsklinikum Düsseldorf), Bert Cappelle (UZ Gent), Christian Jung (Universitätsklinikum Düsseldorf), Kirsten Colpaert (UZ Gent), Maurizio Cecconi (ESICM), Anouk Kruiswijk (KPMG), Maaike van Zuilen (Erasmus MC – philogirl), Maarten Ligtenberg (Cradeq), Daniel Laxar (Medical University of Vienna), Maria Theodorakopoulou (Hellenic Society of Intensive Care Medicine, HSICM), Maurice Walny (Charité – Universitätsmedizin Berlin), Joost Schotsman (UMC Utrecht), Rachit Gupta (KPMG), Maxim Moinat (Erasmus MC), Irene Gebuis (philogirl), Alexander Lang, Nils Woge, Kai Marten Vogl, Daniel Wetzler, Lorenz Kapral, Natalja Zilinski.