Systemic Anticancer Therapy Timelines Extraction From Electronic Medical Records Text: Algorithm Development and Validation
Background: The systemic treatment of cancer typically requires multiple anticancer agents, given in combination, sequentially, or both. Clinical narrative texts often describe the temporal sequencing of systemic anticancer therapy (SACT) in detail, which makes automated extraction of SACT timelines an important and potentially tractable task.

Objective: We aimed to explore automatic methods for extracting patient-level SACT timelines from clinical narratives in electronic medical records (EMRs).

Methods: We used two datasets from two institutions: (1) a colorectal cancer (CRC) dataset comprising the entire EMR of the 199 patients in the THYME dataset, and (2) the 2024 ChemoTimelines shared task dataset, which includes 149 patients with ovarian cancer, breast cancer, and melanoma. We explored two approaches: finetuning smaller language models trained to attend to events and time expressions, and few-shot prompting of large language models (LLMs). Evaluation followed the 2024 ChemoTimelines shared task configuration: Subtask1 involves constructing SACT timelines from manually annotated SACT event and time expression mentions provided as input alongside the patient's notes, and Subtask2 requires extracting SACT timelines directly from the patient's notes.

Results: Our task-specific finetuned EntityBERT model achieved a 93% F1 score, outperforming the best result in Subtask1 of the 2024 ChemoTimelines shared task (90%). It ranked second in Subtask2. The LLMs (LLaMA2, LLaMA3.1, Mixtral) lagged the task-specific finetuned model on both the THYME and shared task datasets. On the shared task datasets, the best LLM performance was 77% macro F1, 16 percentage points lower than the task-specific finetuned system (Subtask1).

Conclusions: We explored approaches for patient-level timeline extraction through the SACT timeline extraction task. Our results and analysis add to the knowledge of extracting treatment timelines from EMR clinical narratives using language modeling methods.
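
For readers unfamiliar with the macro F1 metric quoted in the results, here is a minimal sketch of how a macro-averaged F1 over patient timelines could be computed. It is not the shared task's official scorer. It assumes a timeline is a set of (SACT event, temporal relation, date) triples, that a predicted triple counts only on an exact match, and that the macro average is taken over patients. The patient IDs, drug names, relation labels, and dates are made up for illustration.

```python
from typing import Dict, Set, Tuple

# Assumed representation: (SACT event, temporal relation, normalized date)
Triple = Tuple[str, str, str]


def f1(gold: Set[Triple], pred: Set[Triple]) -> float:
    """Exact-match F1 between one patient's gold and predicted timelines."""
    if not gold and not pred:
        return 1.0  # assumption: an empty timeline predicted as empty is correct
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold: Dict[str, Set[Triple]], pred: Dict[str, Set[Triple]]) -> float:
    """Unweighted mean of per-patient F1 scores."""
    scores = [f1(timeline, pred.get(pid, set())) for pid, timeline in gold.items()]
    return sum(scores) / len(scores)


# Hypothetical data: two patients, one missed triple for patient_a.
gold = {
    "patient_a": {
        ("carboplatin", "begins-on", "2013-02-01"),
        ("carboplatin", "ends-on", "2013-05-10"),
    },
    "patient_b": {("paclitaxel", "begins-on", "2014-07-15")},
}
pred = {
    "patient_a": {("carboplatin", "begins-on", "2013-02-01")},
    "patient_b": {("paclitaxel", "begins-on", "2014-07-15")},
}

score = macro_f1(gold, pred)
print(f"macro F1 = {score:.3f}")
```

Because each patient contributes equally to the average, a short timeline with one error lowers the score as much as a long timeline with proportionally many errors.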
#CancerResearch #MedicalRecords #AnticancerTherapy #MachineLearning #HealthcareInnovation