Detailed Information


How are Korean Neural Language Models ‘surprised’ Layerwisely?

Authors
최선주; 박명관; 김유희
Issue Date
Nov-2021
Publisher
한국언어과학회 (The Korean Association of Language Sciences)
Keywords
KR-BERT; KoBERT; linguistic anomaly; surprisal gap; layerwise; Korean neural language models; layerwise neural network analysis
Citation
언어과학, v.28, no.4, pp. 301-317
Pages
17
Indexed
KCI
Journal Title
언어과학
Volume
28
Number
4
Start Page
301
End Page
317
URI
https://scholarworks.dongguk.edu/handle/sw.dongguk/4213
DOI
10.14384/kals.2021.28.4.301
ISSN
1225-2522
2508-4267
Abstract
Since the introduction of BERT, recent work has succeeded in detecting when a word is anomalous given its sentence context. Because a likelihood score is not an appropriate tool for identifying the exact nature of a linguistic anomaly, Li et al. (2021) adopted Gaussian models for density estimation at the intermediate layers of pretrained language models. They found that different English pretrained language models employ separate mechanisms to recognize different types of linguistic anomaly. Following Li et al.'s methodology, we probe whether Korean counterparts such as KoBERT and KR-BERT are sensitive to different levels of linguistic anomaly, just as English-based language models are. To this end, we construct an experiment with a suite of test data involving morphosyntactic, semantic, and commonsense anomalies in Korean and apply the two Korean models to the test sentences. We find that KoBERT and KR-BERT show relatively larger surprisal gaps across layers when the anomaly is morphosyntactic than when it is semantic. By contrast, commonsense anomalies do not produce a surprisal gap in any layer. We thus report that, like their English counterparts, KoBERT and KR-BERT use different mechanisms to track different types of linguistic anomaly.
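The abstract does not spell out how a surprisal gap is computed, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes a Gaussian is fitted to reference vectors from one layer, that surprisal is the negative log-density of a token vector under that Gaussian, and that the gap is the surprisal of the anomalous token minus that of its control. The diagonal covariance is a simplification, the vectors are random stand-ins for real hidden states, and the function names are hypothetical.

```python
import math
import random


def fit_diag_gaussian(vectors):
    """Estimate per-dimension mean and variance (diagonal covariance, an assumption)."""
    n = len(vectors)
    dim = len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dim)]
    variances = [
        sum((v[d] - means[d]) ** 2 for v in vectors) / n + 1e-6  # small floor for stability
        for d in range(dim)
    ]
    return means, variances


def surprisal(vector, means, variances):
    """Negative log-density of `vector` under the fitted diagonal Gaussian."""
    return sum(
        0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        for x, mu, var in zip(vector, means, variances)
    )


def surprisal_gap(anomalous_vec, control_vec, means, variances):
    """Surprisal of the anomalous token minus surprisal of its control token."""
    return surprisal(anomalous_vec, means, variances) - surprisal(
        control_vec, means, variances
    )


# Toy stand-ins for hidden states at a single layer (not real model output).
rng = random.Random(0)
dim = 8
reference = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(500)]
means, variances = fit_diag_gaussian(reference)

control_vec = [0.0] * dim    # close to the centre of the reference distribution
anomalous_vec = [3.0] * dim  # far from the centre
gap = surprisal_gap(anomalous_vec, control_vec, means, variances)
print(f"surprisal gap at this layer: {gap:.2f}")
```

In a layerwise analysis, the same fit and comparison would be repeated for each layer's hidden states, giving one gap per layer for each anomaly type.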
Appears in Collections
College of Humanities > Division of English Language & Literature > 1. Journal Articles

Related Researcher

Park, Myung Kwan
College of Humanities (Division of English Language and Literature)