  Work Package 4



Technical infrastructure

Objectives

Work Package 4 focuses on the technical implementation of the federated infrastructure for ICU data. Its aims are:

  • deploying an interoperable and secure federated infrastructure for trusted ICU datasets in the EU, with interoperable links to other federated data infrastructures;
  • providing secure and interoperable platforms for the aggregation of ICU datasets for secondary analysis and the training of AI algorithms; and
  • offering an easy-to-access front end in four EU languages at the clinical workforce level.

The infrastructure encompasses a central node that provides the core functionalities required for the effective implementation of the governance, such as distributed data access, an integrated toolbox, and components for trust and data privacy.

Standard and secure

Our approach emphasises leveraging existing technologies and frameworks, such as Gaia-X and Azure components, to create a federated infrastructure for ICU data that excels in security, interoperability, and user experience.

By aligning with established standards, collaborating with DSSC, and utilising advanced Azure services, the INDICATE project ensures a robust and sustainable technical implementation, fostering secure cross-border access to ICU data within the European Digital Infrastructure Consortium (EDIC) ecosystem.

Tasks

Task 4.1 | Develop and deploy central components to the infrastructure

This task will focus on the development and deployment of the central components, utilising existing open-source components that are part of the Gaia-X ecosystem wherever possible. This includes the following subtasks:

  • T4.1.1 Implement Authentication and Authorisation services. To ensure secure interactions, the infrastructure will implement robust authentication and authorisation mechanisms compliant with eIDAS standards. Azure AD (Microsoft Entra ID) will be adapted to meet eIDAS requirements, enabling secure federated identity management. Authentication and authorisation will be kept at the safest level possible by leveraging mechanisms such as multi-factor authentication (MFA), least-privilege administrative access through role-based access control (RBAC) with Entra roles, and conditional access policies. Tenant restrictions will ensure that only capable and authorised entities gain access to the tenant and that data meant to remain on the tenant stays there. The central identity management system will interoperate with local solutions through standardised protocols such as SAML (Security Assertion Markup Language) or OpenID Connect. These protocols allow seamless communication between the central system and local identity providers, ensuring secure authentication and authorisation processes. Local identity management system nodes must adhere to the following requirements:
    • Digital identity clearly identifies the associated unique person
    • Completely implemented identity lifecycle process fully supporting the ability of users to join the infrastructure, move between organisations, or leave the infrastructure. In case of off-boarding, accesses are revoked, and accounts are disabled
    • Controlled and auditable management (grant/revoke) of privileges and business roles
    • SAML or OpenID protocol supported
    • Compliance with GDPR policy
  • T4.1.2 Establish a Verifiable Data Registry and Trust Anchor. Following the process described by Gaia-X’s trust framework, the central node will offer a verifiable data registry and act as a trust anchor. Data providers, data users, and service providers can make use of this certification process to make a digital handshake and initiate federated data access processes.
  • T4.1.3 Set up layered logging, monitoring, and notification services. High availability and incident response capabilities will be maintained through logging, monitoring, and notification services. Utilising Azure Monitor and Azure Security Center, the system will provide real-time insights into service health, resource usage, and security threats for the most crucial elements of the central node and infrastructure connector services. Automated notifications will enable swift incident response and forensic analysis, with the goal of identifying threats within the solution before they can cause damage. The KPIs tracked to monitor the health of the infrastructure will include system uptime, response time, data processing throughput, error rates, and resource utilisation (CPU, memory, storage). Additionally, local nodes deployed from the provided templates will by design have similar capabilities, enabling complete visibility of all health, diagnostics, and security information in near real time. Centralised identities will also be monitored through built-in logging and monitoring services such as Entra sign-in logs and Entra audit logs, to verify that authentication is legitimate and that malicious identity attacks are discovered and mitigated without delay.
  • T4.1.4 Set up Repository services. Open code principles will be adhered to by leveraging open GitHub repositories. This approach allows data users and data providers the flexibility to use their preferred tools and coding languages, promoting an open and collaborative ecosystem. Security best practices and a secure Continuous Integration/Continuous Deployment (CI/CD) pipeline will be implemented to ensure the integrity and safety of the stored code. In this way, version-controlled code, containerised software, and machine learning models will be securely stored, facilitating collaborative development and streamlined deployment.
  • T4.1.5 Implement ticketing systems. Efficient handling of technical support requests is vital for the success of the federated infrastructure for ICU data. The INDICATE project team will provide 2nd and 3rd line support through federated helpdesks and security incident response teams. However, recognising the importance of seamless support, we will work with the DSSC national contact points to establish the 1st line of support. To ensure integrated and cohesive support operations, the ticketing system employed for incident reporting and issue resolution will align closely with the DSSC tooling. This approach enables efficient communication between the project team, DSSC national contact points, and other stakeholders, ensuring a streamlined process for incident response, security support, and technical assistance. The integrated ticketing system will facilitate prompt and effective collaboration, enhancing the overall support experience for end-users and contributors.
  • T4.1.6 Deploy a Portal as a Graphical User Interface in four EU languages. A user-friendly front-end portal, accessible in four European languages, will serve as the entry point for individual end-users. Developed using Azure App Service and Azure API Management, the portal will provide a seamless experience, enabling users to access services, navigate datasets, and initiate requests within the federated infrastructure.
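
The identity lifecycle requirements listed under T4.1.1 (one identity per person, join/move/leave, auditable grant and revoke, revocation on off-boarding) can be illustrated with a minimal in-memory sketch. It is not the project's implementation: the class names, the role and organisation strings, and the choice to clear roles when a user moves between organisations are all assumptions made for illustration. A real local node would delegate these operations to its identity provider.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Identity:
    """A digital identity bound to exactly one person (hypothetical model)."""
    person_id: str
    organisation: str
    roles: set = field(default_factory=set)
    enabled: bool = True


class IdentityRegistry:
    """In-memory stand-in for a local identity management node (illustrative only)."""

    def __init__(self):
        self._identities = {}  # person_id -> Identity: one identity per person
        self.audit_log = []    # (timestamp, person_id, action) tuples

    def _audit(self, person_id, action):
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append((stamp, person_id, action))

    def _active(self, person_id):
        identity = self._identities[person_id]
        if not identity.enabled:
            raise PermissionError(f"{person_id} is disabled")
        return identity

    def get(self, person_id):
        return self._identities[person_id]

    def join(self, person_id, organisation):
        if person_id in self._identities:
            raise ValueError(f"{person_id} already has an identity")
        self._identities[person_id] = Identity(person_id, organisation)
        self._audit(person_id, f"join:{organisation}")

    def grant(self, person_id, role):
        self._active(person_id).roles.add(role)
        self._audit(person_id, f"grant:{role}")

    def revoke(self, person_id, role):
        self._active(person_id).roles.discard(role)
        self._audit(person_id, f"revoke:{role}")

    def move(self, person_id, new_organisation):
        identity = self._active(person_id)
        identity.roles.clear()  # assumption: roles do not carry over between organisations
        identity.organisation = new_organisation
        self._audit(person_id, f"move:{new_organisation}")

    def leave(self, person_id):
        identity = self._identities[person_id]
        identity.roles.clear()    # off-boarding: accesses are revoked
        identity.enabled = False  # off-boarding: the account is disabled
        self._audit(person_id, "leave")
```

Every state change goes through a method that writes to `audit_log`, which is one simple way to keep privilege management controlled and auditable. After `leave`, any further `grant` raises `PermissionError`.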

Task 4.2 | Implement Data Federation Network integration

This task builds upon the Gaia-X framework to integrate a diverse network of data providers, data users, service providers, and infrastructure providers. This will enable clinicians, researchers, and innovators to operate on the infrastructure through pre-built, configurable Landing Zones deployed at local nodes. These Landing Zones will include secure processing environments where data that has gone through the ETL process described in WP2 can be promoted for federated access. The counterpart for data users is an equally secure processing environment that acts as an orchestrator for the federated data access. The structured approach within these zones supports secure data processing and seamless integration, allowing participants to focus on their core tasks without concerns about infrastructure intricacies. Development of the templates for Landing Zones includes the following subtasks:

  • T4.2.1 Provide Secure Processing Environment specifications, empowering participants to choose the most suitable secure processing environments based on their unique requirements. By ensuring that all selected environments meet consistent standards, participants can confidently leverage secure processing environments of their choice, aligning with the overarching security protocols of the federated infrastructure for ICU data. In this context, INDICATE also references the DARWIN-EU Safe Processing Environment, showcasing the project’s commitment to integrating proven and trusted solutions into its framework.
  • T4.2.2 Develop templates for configuring cloud resources (VMs, containers, or container clusters, e.g. Kubernetes) using Infrastructure as Code (IaC) within the secure processing environments.
  • T4.2.3 Create networking configurations within the Landing Zones and interconnections between the Landing Zones that adhere to stringent security best practices and the principle of least privilege. This approach guarantees secure communications within the infrastructure, preventing any unwanted public access and ensuring data integrity and confidentiality.
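
The least-privilege networking described in T4.2.3 amounts to a default-deny policy: a connection is permitted only if an explicit rule allows it. The sketch below illustrates that idea under stated assumptions. The zone names (`user-spe`, `provider-spe`, `central-node`), the port, and the direction of the allowed flows are hypothetical and not taken from the project's actual network design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllowRule:
    """One explicitly permitted flow between two zones (hypothetical model)."""
    source: str
    destination: str
    port: int


class LandingZonePolicy:
    """Default-deny policy: anything not listed as an AllowRule is refused."""

    def __init__(self, allow_rules):
        self._allowed = frozenset(allow_rules)

    def is_allowed(self, source, destination, port):
        return AllowRule(source, destination, port) in self._allowed


# Assumed example flows: a data user's environment reaches the central node,
# and the central node reaches a data provider's environment, both over HTTPS.
policy = LandingZonePolicy([
    AllowRule("user-spe", "central-node", 443),
    AllowRule("central-node", "provider-spe", 443),
])

print(policy.is_allowed("user-spe", "central-node", 443))   # True: explicitly allowed
print(policy.is_allowed("internet", "provider-spe", 443))   # False: no rule, so denied
```

Because the policy only stores allowed flows, public access is blocked without any explicit deny rule. In an IaC template the same rule list could be rendered into the cloud provider's network security configuration.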

Task 4.3 | Deploy a centralised Metadata Catalogue

In this task we will implement a centralised metadata catalogue, but rather than imposing a one-size-fits-all solution, we recognise the diverse range of metadata catalogues employed by our participants.

Therefore, our strategy revolves around creating a centralised metadata catalogue that seamlessly integrates and communicates with these local and varied catalogues. The central metadata catalogue does not operate in isolation; instead, it remains in synchronisation with the local metadata catalogues maintained by various participants. This is achieved through a robust publisher-subscriber message system, enhancing interoperability, and ensuring that metadata remains up to date and accurate across the entire infrastructure. To implement the centralised metadata catalogue we will perform the following subtasks:

  • T4.3.1 Set up a pub-sub messaging system to integrate local metadata catalogues. In this model, the central hub acts as the central subscriber, while local metadata-catalogue nodes function as publishers. Local nodes emit updates as messages to the central message queue, facilitating real-time communication and data exchange. To organise these messages effectively, topics or event channels are established within the message queue system. Each local metadata-catalogue node emits events to specific topics that align with the type or category of metadata being updated. To implement this communication, we will leverage established message queue technologies such as Apache Kafka, RabbitMQ, NATS, or cloud-based services like Azure Event Hubs. These technologies provide a robust foundation for real-time data exchange, ensuring that metadata updates are efficiently propagated throughout the infrastructure.
  • T4.3.2 Integrate with the BBMRI-ERIC broker service. Our metadata catalogue will integrate with the BBMRI-ERIC broker service, enhancing the accessibility and discoverability of metadata. Through this link, INDICATE participants can negotiate access, so that data providers and consumers can collaborate and exchange data within a trusted environment. By adopting this approach, INDICATE not only respects the existing metadata infrastructures of participants but also enhances the overall accessibility and usability of metadata within the federated ecosystem. This integration fosters a cohesive environment where metadata remains current, reliable, and readily available to clinicians, researchers, and innovators, supporting their diverse needs within the project’s objectives.
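
The publisher-subscriber model in T4.3.1 can be sketched in a few lines. This is an in-memory illustration only: the `MessageBroker` class stands in for a real message queue such as Apache Kafka or Azure Event Hubs, and the topic name, node name, and message fields are hypothetical.

```python
from collections import defaultdict


class MessageBroker:
    """Minimal in-memory topic broker standing in for a real message queue."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of handler callables

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Deliver the message to every handler subscribed to this topic.
        for handler in self._subscribers[topic]:
            handler(topic, message)


class CentralCatalogue:
    """Central subscriber that keeps the latest metadata record per node and dataset."""

    def __init__(self):
        self.records = {}  # (node, dataset_id) -> latest message

    def on_update(self, topic, message):
        key = (message["node"], message["dataset_id"])
        self.records[key] = message  # newer updates overwrite older ones


broker = MessageBroker()
central = CentralCatalogue()
broker.subscribe("metadata.dataset", central.on_update)

# A local metadata-catalogue node publishes an update, then a correction.
broker.publish("metadata.dataset",
               {"node": "node-a", "dataset_id": "icu-001", "title": "ICU admissions"})
broker.publish("metadata.dataset",
               {"node": "node-a", "dataset_id": "icu-001", "title": "ICU admissions v2"})

print(central.records[("node-a", "icu-001")]["title"])  # ICU admissions v2
```

Keying records by node and dataset identifier means a repeated update replaces the earlier entry, which is one simple way to keep the central catalogue in step with the local ones. A production broker would add persistence, ordering guarantees, and authentication, none of which this sketch attempts.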

Task 4.4 | Deploy Digital Marketplace

The centralised marketplace will facilitate service discovery, negotiation, and access management. Utilising Azure Cognitive Search and Azure Logic Apps, a search engine and recommender system will ensure that relevant data services are findable. Clear terms of use, minimal service requirements, and Service Level Agreements (SLAs) will be outlined.

Task 4.5 | Set up Helpdesk and CIRT

The INDICATE project team will provide 2nd and 3rd line support through federated helpdesks and security incident response teams. However, recognising the importance of seamless support, we will work with the DSSC national contact points to establish 1st line of support. This task includes:

  • T4.5.1 Setting up central (3rd line) support and national-level (2nd line) technical support teams in collaboration with the DSSC national contact points.
  • T4.5.2 Setting up a Cybersecurity and Incident Response team to monitor and mitigate cyber threats and proactively implement cybersecurity measures.

Lead

Mathias Syx

Task lead representatives

Amina Shah
Sylvain Robert

Contributors