Automate Curation and Publishing of Personal Health Data Through Artificial Intelligence


Curate health data once with the help of patients and use many times.

  • Capture metadata on data sources
  • Orchestrate the use of multiple curation and integration tools
  • Engage and train patients to resolve semantic issues the machine cannot solve

Semantic interoperability is still an unsolved problem: data collected for its primary purpose will remain heterogeneous, and standardisation will remain a burden for health data providers, hospitals, GPs, authorities and many others.

AIDAVA intends to take an innovative approach to data interoperability. By capturing appropriate metadata on data sources and by orchestrating multiple curation tools (including AI-based ones), the project aims to demonstrate that it is possible to work with data sources in any format and to semi-automatically integrate and transform them into any format needed for secondary data use. As such, AIDAVA will take charge of standardising health data from their native format and solve many of the data interoperability issues.

To achieve this vision, it is important that citizens are engaged. By 2030, EU citizens will be able to control their data. Some will want to make sure their health data - or the health data of their loved ones - are fully integrated and of high quality to support preventive medicine and the best quality of care. AIDAVA will help citizens - after minimal training in health and data literacy - to clean (curate) their data and subsequently share them in any relevant standardised format.

Key Facts

  • The AIDAVA project is a 4-year research project funded by the European Union, running from September 2022 to August 2026 and involving 14 partners from 9 countries. (Download our fact sheet)
  • All data from an individual are integrated and curated within a Personal Health Knowledge Graph (PHKG).
  • Each PHKG is an instance of a common reference knowledge graph - based on ontologies derived from SNOMED, HL7 FHIR resource profiles, LOINC, and other domain-specific terminologies.
  • Data sources are expected to be provided with several kinds of metadata - FAIR metadata and additional content-related metadata - to support automation, within a formalised Data Transfer Specification.
  • Automation of curation is based on the orchestration of multiple tools (optical character recognition (OCR), syntactic transformation, semantic transformation, entity deduplication, natural language processing (NLP), feature extraction from imaging, etc.) that are triggered based on metadata on the data sources.
  • AIDAVA will maximise the use of existing tools and develop novel tools for NLP in three languages and for machine learning (ML) based data quality enhancement.
  • Empowering citizens requires an AI-based conversational assistant with explainability capabilities, which will be developed within the project.
  • Health Data Intermediaries (regulated by the European Data Governance Act) will empower patients to be in control of their data, and facilitate curation and sharing. The Data Governance Act provides a framework to enhance trust in voluntary data sharing for the benefit of businesses and citizens. More information on the Data Governance Act can be found on the website of the European Commission.
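
The idea of triggering curation tools from metadata on the data sources, as described in the key facts above, can be pictured with a small sketch. This is an illustration only, not AIDAVA's actual design: the `SourceMetadata` fields, the rule table and the tool names are all hypothetical, chosen to mirror the tool categories listed above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class SourceMetadata:
    """Hypothetical metadata record describing one data source."""
    source_id: str
    media_type: str             # e.g. "application/pdf", "text/csv"
    is_scanned: bool = False    # source is an image of a document
    is_free_text: bool = False  # source contains narrative text


# Each rule pairs a condition on the metadata with the name of a tool.
# Rule order defines the order in which the selected tools would run.
Rule = Tuple[Callable[[SourceMetadata], bool], str]

RULES: List[Rule] = [
    (lambda m: m.is_scanned, "ocr"),
    (lambda m: m.is_scanned or m.is_free_text, "nlp"),
    (lambda m: m.media_type == "text/csv", "syntactic_transformation"),
    (lambda m: True, "semantic_transformation"),
    (lambda m: True, "entity_deduplication"),
]


def plan_pipeline(meta: SourceMetadata) -> List[str]:
    """Return the ordered list of tool names whose condition matches."""
    return [tool for condition, tool in RULES if condition(meta)]


scanned_letter = SourceMetadata(
    "discharge-letter-01", "application/pdf", is_scanned=True
)
lab_export = SourceMetadata("lab-results-01", "text/csv")

print(plan_pipeline(scanned_letter))
print(plan_pipeline(lab_export))
```

In this sketch a scanned letter is routed through OCR and NLP before semantic transformation, while a structured CSV export skips both and starts with syntactic transformation. The point is only that the pipeline is selected by the metadata rather than hard-coded per source.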