Improved utilization methodology of BERT specialized in text classification
- Authors
- So, H.; Rhee, J.
- Issue Date
- 5-Mar-2021
- Publisher
- Association for Computing Machinery
- Keywords
- BERT; Text classification; Triplet loss; Word embedding
- Citation
- ACM International Conference Proceeding Series, pp. 114-118
- Pages
- 5
- Indexed
- SCOPUS
- Journal Title
- ACM International Conference Proceeding Series
- Start Page
- 114
- End Page
- 118
- URI
- https://scholarworks.dongguk.edu/handle/sw.dongguk/5598
- DOI
- 10.1145/3471985.3472384
- Abstract
- Recent language models are pre-trained to produce universal word representations. This study proposes a BERT-Triplet model and an accompanying utilization methodology that generate word representations specialized for the text classification task. Unlike existing language models, the proposed model uses the class labels of the data during pre-training, so that the embedding vectors of words or sentences with a high probability of belonging to the same class are placed close together in the vector space. The proposed methodology improves classification performance and is expected to be applicable to various sub-fields of text classification and to language models other than BERT. © 2021 ACM.
- Appears in Collections
- College of Engineering > Department of Industrial and Systems Engineering > 1. Journal Articles
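The triplet objective the abstract refers to can be sketched as follows. This is a minimal illustration of a standard triplet loss, not the authors' exact BERT-Triplet formulation: the function name, the use of Euclidean distance, and the margin value are all assumptions for the example.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss on embedding vectors.

    Pulls the positive (same-class) embedding toward the anchor and
    pushes the negative (different-class) embedding at least `margin`
    farther away, measured by Euclidean distance.
    """
    d_pos = np.linalg.norm(anchor - positive)  # anchor-positive distance
    d_neg = np.linalg.norm(anchor - negative)  # anchor-negative distance
    return max(d_pos - d_neg + margin, 0.0)

# When the positive is already close and the negative far, the loss is
# zero; when the negative is closer than the positive, the loss grows,
# which is what drives same-class embeddings together during training.
loss = triplet_loss(np.zeros(3), np.zeros(3), np.full(3, 10.0))
```

In the paper's setting, the anchor/positive/negative vectors would be BERT sentence embeddings chosen by class label during pre-training; how the triplets are sampled is not specified by the abstract.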
