Integrating large-scale neuroimaging research datasets: harmonisation of white matter hyperintensity measurements across Whitehall and UK Biobank datasets

Valentina Bordin, Ilaria Bertani, Irene Mattioli, Vaanathi Sundaresan, Paul McCarthy, Sana Suri, Enikő Zsoldos, Nicola Filippini, Abda Mahmood, Luca Melazzini, Maria Marcella Laganà, Giovanna Zamboni, Archana Singh-Manoux, Mika Kivimäki, Klaus P. Ebmeier, Giuseppe Baselli, Mark Jenkinson, Clare E. Mackay, Ludovica Griffanti

NeuroImage, May 2021


Large-scale neuroimaging datasets present the possibility of providing normative distributions for a wide variety of neuroimaging markers, which would vastly improve the clinical utility of these measures. However, a major challenge is our current poor ability to integrate measures across different large-scale datasets, due to inconsistencies in imaging and non-imaging measures across the different protocols and populations. Here we explore the harmonisation of white matter hyperintensity (WMH) measures across two major studies of healthy elderly populations, the Whitehall II imaging sub-study and the UK Biobank. We identify pre-processing strategies that maximise the consistency across datasets and utilise multivariate regression to characterise study sample differences contributing to differences in WMH variations across studies. We also present a parser to harmonise WMH-relevant non-imaging variables across the two datasets. We show that we can provide highly calibrated WMH measures from these datasets with: (1) the inclusion of a number of specific standardised processing steps; and (2) appropriate modelling of sample differences through the alignment of demographic, cognitive and physiological variables. These results open up a wide range of applications for the study of WMHs and other neuroimaging markers across extensive databases of clinical data.


Published May 21, 2021 9:13 AM - Last modified May 21, 2021 9:13 AM