An extraction pipeline for analysis of hematopoietic stem cell transplantation data / Von Asmuth, E.G.J., Halkes, C.J.M., Versluis, J., Eikema, D.-J.A., Angelucci, E., Bazarbachi, A., Ciceri, F., Greco, R., Hazenberg, M., Kalwak, K., Mclornan, D.P., Neven, B., Risitano, A.M., Steinbuch, M., Sureda, A., Snowden, J., Lankester, A.C., Putter, H., De Wreede, L.C.. - In: BONE MARROW TRANSPLANTATION. - ISSN 0268-3369. - 61:4(2026), pp. 400-407. [10.1038/s41409-026-02818-z]

An extraction pipeline for analysis of hematopoietic stem cell transplantation data

Ciceri F.
2026-01-01

Abstract

Many clinical studies are based on registry analyses, but the exact approaches to data extraction and pre-processing are rarely reported, even though they are critical for the reliability and reproducibility of results. We aimed to develop an open-source data extraction pipeline that generates a ready-to-analyze dataset focused on relevant determinants of outcomes after hematopoietic stem cell transplantation (HSCT). The pipeline was developed using EBMT registry data, including 54,457 allogeneic and 63,651 autologous HSCT procedures. It determines HLA matching from molecular data, assesses cytogenetic risk for acute myeloid leukemia and myelodysplastic syndrome, processes molecular markers, assigns the hematopoietic cell transplantation comorbidity index (HCT-CI) based on comorbidities, and maps disease states to simplified categories. We prospectively assessed the recently developed disease risk stratification system (DRSS), showing that the pipeline produces results consistent with previous studies. The hazard ratio correlation between our cohort and the original DRSS derivation cohort was 0.92, and the 2-year AUC was 0.616, indicating similar effects and predictive performance. We aim to establish a new standard by promoting transparent, standardized and uniform extraction of registry data, thereby enhancing reproducibility in registry studies.
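Publication year: 2026
Language: English
Publisher: Springer Nature
Volume: 61
Issue: 4
First page: 400
Last page: 407
Number of pages: 8
Publication status: Published
Peer review: Anonymous experts
Relevance: International
Sustainable Development Goal: Goal 3: Good health and well-being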
Number of authors: 19
Document type: info:eu-repo/semantics/article
1 Journal contribution::1.1.1 Journal article - Review
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11768/203860