
Dr. Adrien Coulet

Abstract

Predicting clinical outcomes from patient care pathways represented with temporal knowledge graphs

Background: With the increasing availability of healthcare data, predictive modeling has found many applications in the biomedical domain, such as estimating the level of risk for various conditions, which in turn can guide clinical decision making. However, it remains unclear how knowledge graph data representations and their embeddings, which are competitive in some settings, could benefit biomedical predictive modeling.

Method: We generated synthetic but realistic data of patients with intracranial aneurysm and experimented on the task of predicting their clinical outcome. We reduced this task to the classification of a subtype of nodes and, as a baseline, evaluated performance on the same task using tabular data. Next, we generated various graph-based representations of the same dataset, including a representation following the schema proposed by the SPHN (Swiss Personalized Healthcare Network), and investigated how the schema adopted for representing, first, individual data and, second, temporal data affects predictive performance.

Results: Our study illustrates that, in our case, the SPHN graph representation combined with Graph Convolutional Network (GCN) embeddings reaches the best performance on a predictive task from observational data. We emphasize the importance of the adopted schema and of taking literal values into account in the representation of individual data. Our study also moderates the relative impact of various time encodings on GCN performance.

Availability: This work has been accepted for publication and presentation in the Research track of the Extended Semantic Web Conference 2025. The preprint of the article is available at https://arxiv.org/abs/2502.21138.

Researcher and co-leader of the HeKA team

National Institute for Research in Digital Science and Technology (Inria), France