Iterative Prompt Refinement For Radiation Oncology Symptom Extraction Using Teacher-student Large Language Models

Khanmohammadi Reza, Ghanem Ahmed I, Verdecchia Kyle, Hall Ryan, Elshaikh Mohamed, Movsas Benjamin, Bagher-ebadian Hassan, Chetty Indrin, Ghassemi Mohammad M., Thind Kundan. arXiv 2024

Tags: GPT, Model Architecture, Prompting

This study introduces a novel teacher-student architecture that uses Large Language Models (LLMs) to improve the extraction of prostate cancer radiotherapy symptoms from clinical notes. Mixtral, the student model, first extracts symptoms; GPT-4, the teacher model, then refines the prompt based on Mixtral’s performance. This iterative process used 294 single-symptom clinical notes covering 12 symptoms, with up to 16 rounds of refinement per epoch. Extraction improved on both single-symptom and multi-symptom notes. For 59 single-symptom notes, accuracy increased from 0.51 to 0.71, precision from 0.52 to 0.82, recall from 0.52 to 0.72, and F1 score from 0.49 to 0.73. For 375 multi-symptom notes, accuracy rose from 0.24 to 0.43, precision from 0.6 to 0.76, recall from 0.24 to 0.43, and F1 score from 0.20 to 0.44. These results demonstrate the effectiveness of advanced prompt engineering with LLMs for radiation oncology use.
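The refinement loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `student` and `teacher` are hypothetical callables standing in for Mixtral and GPT-4, the keep-the-best-prompt rule and the accuracy-based scoring are assumptions, and the toy stand-ins exist only so the sketch runs without any model API. Only the cap of 16 rounds comes from the abstract.

```python
from typing import Callable, List, Tuple

Note = Tuple[str, str]                    # (note text, gold symptom label)
Error = Tuple[str, str, str]              # (note text, gold label, student prediction)
Student = Callable[[str, str], str]       # (prompt, note text) -> predicted label
Teacher = Callable[[str, List[Error]], str]  # (prompt, errors) -> revised prompt


def accuracy(prompt: str, notes: List[Note], student: Student) -> float:
    """Fraction of notes for which the student predicts the gold label."""
    correct = sum(student(prompt, text) == label for text, label in notes)
    return correct / len(notes)


def refine_prompt(initial_prompt: str, notes: List[Note], student: Student,
                  teacher: Teacher, max_rounds: int = 16) -> Tuple[str, float]:
    """Let the teacher rewrite the prompt for up to max_rounds rounds.

    Assumption: a revised prompt is kept only if it scores strictly better.
    """
    best_prompt = initial_prompt
    best_score = accuracy(best_prompt, notes, student)
    for _ in range(max_rounds):
        if best_score == 1.0:
            break  # nothing left for the teacher to fix
        # Collect the student's mistakes under the current best prompt.
        errors: List[Error] = []
        for text, label in notes:
            predicted = student(best_prompt, text)
            if predicted != label:
                errors.append((text, label, predicted))
        candidate = teacher(best_prompt, errors)
        score = accuracy(candidate, notes, student)
        if score > best_score:
            best_prompt, best_score = candidate, score
    return best_prompt, best_score


# Toy stand-ins (hypothetical): the "student" follows "cue => label" hint
# lines found in the prompt; the "teacher" appends one hint per error.
def toy_student(prompt: str, note: str) -> str:
    for line in prompt.splitlines():
        if "=>" in line:
            cue, label = line.split("=>", 1)
            if cue.strip() in note.lower():
                return label.strip()
    return "none"


def toy_teacher(prompt: str, errors: List[Error]) -> str:
    hints = [f"{text.lower()} => {gold}" for text, gold, _ in errors]
    return "\n".join([prompt, *hints])


NOTES: List[Note] = [
    ("Patient reports feeling tired", "fatigue"),
    ("Burning on urination", "dysuria"),
]
```

Calling `refine_prompt("Extract the symptom.", NOTES, toy_student, toy_teacher)` returns the revised prompt together with its accuracy on `NOTES`. In a real setting the two callables would wrap model calls, and the score would be measured on held-out notes rather than the notes used for refinement.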
