Free-Text Responses in a Nationally Representative Experimental Survey about End-of-Life Care Choices: ChatGPT-4o-Assisted Qualitative Analytical Study
Background: Little is known about how surrogates make end-of-life care choices for patients who lack the ability to make decisions for themselves.

Objective: (1) To identify key themes that emerged from participants’ free-text responses to a large, nationally representative vignette survey about surrogate decision-making in end-of-life care and (2) to determine whether an advanced artificial intelligence chatbot could assist us in performing qualitative analyses accurately and efficiently.

Methods: Our dataset included 3,931 free-text responses from a nationally representative survey of 6,109 individuals. In this qualitative study, we first familiarized ourselves with the free-text responses and hand-coded the first 200 responses, at which point we reached saturation. We then created a codebook with initial themes, subthemes, and illustrative quotes. Subsequently, we prompted ChatGPT-4o to identify frequent keywords, generate themes, and extract quotable quotes. We validated its output by comparing the artificial intelligence’s keyword counts against counts from qualitative software (NVivo) and by cross-validating artificial intelligence-generated quotes against the original transcripts.

Results: We identified several key themes. Surrogates more often chose comfort care for care recipients with dementia, particularly at advanced stages. They also strongly weighed the patients’ perceived quality of life and functional status. Many reported making surrogate decisions based on their own lived experiences or values rather than on the patients’ previously stated wishes. There was no significant difference between the artificial intelligence’s and the qualitative software’s keyword counts. The most frequent keywords included “life” (n=2,051), “quality” (n=903), and “dementia” (n=507). Overall, the artificial intelligence-generated themes closely aligned with the human-generated themes described above. Manual coding of the first 200 free-text responses required 4 hours, including codebook development.
In contrast, ChatGPT-4o generated themes in
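The keyword-count validation described in the Methods can, in principle, be reproduced as a simple frequency comparison. The Python sketch below is illustrative only: the example responses, keyword list, and tolerance threshold are hypothetical and are not drawn from the study data or from NVivo itself.

```python
import re
from collections import Counter

def keyword_counts(responses, keywords):
    """Count case-insensitive, whole-word occurrences of each keyword
    across a list of free-text responses."""
    counts = Counter()
    for text in responses:
        for kw in keywords:
            counts[kw] += len(
                re.findall(rf"\b{re.escape(kw)}\b", text, flags=re.IGNORECASE)
            )
    return counts

def compare_counts(ai_counts, software_counts, tolerance=0.05):
    """Return keywords whose AI-reported count deviates from the
    software-derived count by more than `tolerance` (relative).
    The tolerance value is a hypothetical choice for illustration."""
    discrepancies = {}
    for kw, sw_n in software_counts.items():
        ai_n = ai_counts.get(kw, 0)
        if sw_n and abs(ai_n - sw_n) / sw_n > tolerance:
            discrepancies[kw] = (ai_n, sw_n)
    return discrepancies

# Hypothetical usage with made-up responses:
responses = [
    "Quality of life matters most.",
    "Her dementia was advanced; quality of life was poor.",
]
counts = keyword_counts(responses, ["life", "quality", "dementia"])
# An empty discrepancy dict indicates agreement within tolerance.
print(compare_counts(counts, counts))
```

A check of this kind surfaces only keywords whose counts disagree, so agreement between the chatbot's counts and the software's counts yields an empty result.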