TY - JOUR
T1 - Large Language Model–Assisted Surgical Consent Forms in Non-English Language
T2 - Content Analysis and Readability Evaluation
AU - Oh, Namkee
AU - Kim, Jongman
AU - Park, Sunghae
AU - An, Sunghyo
AU - Lee, Eunjin
AU - Do, Hayeon
AU - Baik, Jiyoung
AU - Gwon, Suk Min
AU - Rhu, Jinsoo
AU - Choi, Gyu Seong
AU - Park, Seonmin
AU - Cho, Jai Young
AU - Lee, Hae Won
AU - Lee, Boram
AU - Jeong, Eun Sung
AU - Lee, Jeong Moo
AU - Choi, Young Rok
AU - Kwon, Jieun
AU - Kim, Kyeong Deok
AU - Kim, Seok Hwan
AU - Chun, Gwang Sik
N1 - Publisher Copyright:
© Namkee Oh, Jongman Kim, Sunghae Park, Sunghyo An, Eunjin Lee, Hayeon Do, Jiyoung Baik, Suk Min Gwon, Jinsoo Rhu, Gyu-Seong Choi, Seonmin Park, Jai Young Cho, Hae Won Lee, Boram Lee, Eun Sung Jeong, Jeong-Moo Lee, YoungRok Choi, Jieun Kwon, Kyeong Deok Kim, Seok-Hwan Kim, Gwang-Sik Chun.
PY - 2025
Y1 - 2025
N2 - Background: Surgical consent forms convey critical information; yet, their complex language can limit patient comprehension. Large language models (LLMs) can simplify complex information and improve readability, but evidence of the impact of LLM-generated modifications on content preservation in non-English consent forms is lacking. Objective: This study evaluates the impact of LLM-assisted editing on the readability and content quality of surgical consent forms in Korean—particularly consent documents for standardized liver resection—across multiple institutions. Methods: Standardized liver resection consent forms were collected from 7 South Korean medical institutions, and these forms were simplified using ChatGPT-4o. Thereafter, readability was assessed using KReaD and Natmal indices, while text structure was evaluated based on character count, word count, sentence count, words per sentence, and difficult word ratio. Content quality was analyzed across 4 domains—risk, benefit, alternative, and overall impression—using evaluations from 7 liver resection specialists. Statistical comparisons were conducted using paired 2-sided t tests, and a linear mixed-effects model was applied to account for institutional and evaluator variability. Results: Artificial intelligence–assisted editing significantly improved readability, reducing the KReaD score from 1777 (SD 28.47) to 1335.6 (SD 59.95) (P<.001) and the Natmal score from 1452.3 (SD 88.67) to 1245.3 (SD 96.96) (P=.007). Sentence length and difficult word ratio decreased significantly, contributing to increased accessibility (P<.05). However, content quality analysis showed a decline in the risk description scores (before: 2.29, SD 0.47 vs after: 1.92, SD 0.32; P=.06) and overall impression scores (before: 2.21, SD 0.49 vs after: 1.71, SD 0.64; P=.13). The linear mixed-effects model confirmed significant reductions in risk descriptions (β1=−0.371; P=.01) and overall impression (β1=−0.500; P=.03), suggesting potential omissions in critical safety information. Despite this, qualitative analysis indicated that evaluators did not find explicit omissions but perceived the text as overly simplified and less professional. Conclusions: Although LLM-assisted surgical consent forms significantly enhance readability, they may compromise certain aspects of content completeness, particularly in risk disclosure. These findings highlight the need for a balanced approach that maintains accessibility while ensuring medical and legal accuracy. Future research should include patient-centered evaluations to assess comprehension and informed decision-making as well as broader multilingual validation to determine LLM applicability across diverse health care settings.
KW - ChatGPT-4o
KW - informed consent
KW - large language model
KW - liver resection
KW - natural language processing
KW - operative
KW - readability
KW - surgical consent form
KW - surgical procedures
UR - https://www.scopus.com/pages/publications/105008681182
U2 - 10.2196/73222
DO - 10.2196/73222
M3 - Article
C2 - 40537063
AN - SCOPUS:105008681182
SN - 1439-4456
VL - 27
JO - Journal of Medical Internet Research
JF - Journal of Medical Internet Research
M1 - e73222
ER -