Prevalence and clinical characteristics of patients with rheumatoid arthritis with interstitial lung disease using unstructured healthcare data and machine learning

Jan 30, 2024 | Journal: RMD Open

Jose A Román Ivorra, Ernesto Trallero-Araguas, Maria Lopez Lasanta, Laura Cebrián, Leticia Lojo, Belén López-Muñíz, Julia Fernández-Melon, Belén Núñez, Lucia Silva-Fernández, Raúl Veiga Cabello, Pilar Ahijado, Isabel De la Morena Barrio, Nerea Costas Torrijo, Belén Safont, Enrique Ornilla, Juliana Restrepo, Arantxa Campo, Jose L Andreu, Elvira Díez, Alejandra López Robles, Elena Bollo, Diego Benavent, David Vilanova, Sara Luján Valdés, Raul Castellanos-Moreira


Objectives: Real-world data on rheumatoid arthritis (RA) and its association with interstitial lung disease (ILD) are still scarce. This study aimed to estimate the prevalence of RA and of ILD in patients with RA (RAILD) in Spain, and to compare the clinical characteristics of patients with RA with and without ILD, using natural language processing (NLP) on electronic health records (EHR).

Methods: Observational, case-control, retrospective and multicentre study based on the secondary use of unstructured clinical data from adult patients with RA and RAILD from nine hospitals between 2014 and 2019. NLP was used to extract unstructured clinical information from EHR and standardise it into SNOMED-CT terminology. The prevalences of RA and RAILD were calculated, and a descriptive analysis was performed. Characteristics of patients with RAILD and of patients with RA without ILD (RAnonILD) were compared.

Results: From a source population of 3 176 165 patients and 64 241 683 EHRs, 13 958 patients with RA were identified. Of those, 5.1% of patients additionally had ILD (RAILD). The overall age-adjusted prevalences of RA and RAILD were 0.53% and 0.02%, respectively. The most common ILD subtype was usual interstitial pneumonia (29.3%). Compared with RAnonILD patients, RAILD patients were older and had more comorbidities, notably infections (33.6% vs 16.5%, p<0.001), malignancies (15.9% vs 8.5%, p<0.001) and cardiovascular disease (25.8% vs 13.9%, p<0.001). RAILD patients also had a higher inflammatory burden, reflected in more pharmacological prescriptions and higher inflammatory parameters, and presented higher in-hospital mortality with a higher risk of death (HR 2.32; 95% CI 1.59 to 2.81, p<0.001).
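As a rough illustration of how the counts in the Results relate to a prevalence figure, the sketch below computes crude (unadjusted) prevalences from the numbers quoted there. It is not the study's method: the reported 0.53% and 0.02% are age-adjusted, and the reference population used for that adjustment is not given here, so the crude values are expected to differ. The RAILD patient count is an assumption, back-calculated from the rounded 5.1%, and the function and variable names are illustrative only.

```python
# Figures quoted in the Results paragraph.
SOURCE_POPULATION = 3_176_165   # patients in the source population
RA_PATIENTS = 13_958            # patients identified with RA
RAILD_SHARE_OF_RA = 0.051       # 5.1% of RA patients also had ILD (rounded)


def crude_prevalence_pct(cases: float, population: int) -> float:
    """Return crude prevalence as a percentage (no age adjustment)."""
    return 100.0 * cases / population


# Assumption: the exact RAILD count is not stated, so it is
# approximated from the rounded 5.1% share of RA patients.
raild_patients_approx = round(RA_PATIENTS * RAILD_SHARE_OF_RA)

ra_crude = crude_prevalence_pct(RA_PATIENTS, SOURCE_POPULATION)
raild_crude = crude_prevalence_pct(raild_patients_approx, SOURCE_POPULATION)

print(f"Approximate RAILD patients: {raild_patients_approx}")
print(f"Crude RA prevalence: {ra_crude:.2f}%")
print(f"Crude RAILD prevalence: {raild_crude:.3f}%")
```

The crude RA figure comes out lower than the reported age-adjusted 0.53%, which is consistent with the adjustment reweighting age groups rather than with any discrepancy in the counts.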

Conclusions: We estimated the age-adjusted prevalence of RA and RAILD by analysing real-world data through NLP. At the time of inclusion, RAILD patients were more vulnerable than RAnonILD patients, with higher comorbidity and inflammatory burden, which correlated with higher mortality.

Citation: RMD Open. 2024 Jan 30;10(1):e003353. doi: 10.1136/rmdopen-2023-003353