Probing Good-Enough Processing in Large Language Models with a Paraphrasing Task
- Authors
- Jonghyun Lee; Jeong-Ah Shin
- Issue Date
- Jan-2026
- Publisher
- 한국영어학회 (Korean Association for the Study of English Language and Linguistics)
- Keywords
- large language models; garden-path sentences; good-enough processing; syntactic processing; paraphrasing task; ChatGPT
- Citation
- 영어학 (Korean Journal of English Language and Linguistics), v.26, pp. 127-141
- Pages
- 15
- Indexed
- SCOPUS; KCI
- Journal Title
- 영어학 (Korean Journal of English Language and Linguistics)
- Volume
- 26
- Start Page
- 127
- End Page
- 141
- URI
- https://scholarworks.dongguk.edu/handle/sw.dongguk/63821
- DOI
- 10.15738/kjell.26..202601.127
- ISSN
- 1598-1398; 2586-7474
- Abstract
- This study investigates whether large language models (LLMs) exhibit human-like ‘good-enough’ processing patterns in syntactic comprehension or demonstrate mechanical accuracy. Previous research using forced-choice question-answering paradigms revealed that LLMs display incomplete syntactic reanalysis similar to humans when processing garden-path sentences. However, concerns arose that these patterns might reflect methodological artifacts rather than genuine processing characteristics, as direct questioning could bias models toward initial misinterpretations. To address this limitation, we employed a paraphrasing task that requires comprehensive sentence reformulation rather than binary responses, following Patson et al. (2009). We tested GPT-3.5 and GPT-4 on 24 garden-path sentences containing Optionally Transitive (OT) and Reflexive Absolute Transitive (RAT) verbs. Results demonstrate that good-enough processing patterns persist across both paradigms, with LLMs continuing to exhibit partial reanalysis in garden-path conditions even when generating full paraphrases. This confirms that previously observed error patterns represent genuine syntactic processing characteristics rather than experimental artifacts. Notably, GPT-4 showed improved performance in the paraphrasing task compared to forced-choice experiments, suggesting task-dependent variation in processing depth. Both models exhibited human-like incomplete processing despite their substantial computational resources, indicating that their pattern-matching mechanisms favor processing shortcuts over complete syntactic interpretation. These findings reveal that LLMs demonstrate good-enough processing similar to humans, with performance varying systematically across task formats.
- Appears in
Collections - College of Humanities > Division of English Language & Literature > 1. Journal Articles
