A self-imitation learning approach for scheduling evaporation and encapsulation stages of OLED display manufacturing systems
- Authors
- Lee, Donghun; Park, In-Beom; Kim, Kwanho
- Issue Date
- Jun-2025
- Publisher
- Elsevier Ltd
- Keywords
- Deep reinforcement learning; Eligibility return; OLED display manufacturing scheduling; Self-imitation learning; Total tardiness
- Citation
- Robotics and Computer-Integrated Manufacturing, v.93, pp 1 - 14
- Pages
- 14
- Indexed
- SCIE; SCOPUS
- Journal Title
- Robotics and Computer-Integrated Manufacturing
- Volume
- 93
- Start Page
- 1
- End Page
- 14
- URI
- https://scholarworks.dongguk.edu/handle/sw.dongguk/57817
- DOI
- 10.1016/j.rcim.2024.102917
- ISSN
- 0736-5845; 1879-2537
- Abstract
- In modern organic light-emitting diode (OLED) manufacturing systems, scheduling is a key decision-making problem for improving productivity. In particular, scheduling the evaporation and encapsulation stages involves complicated constraints such as job splitting, preventive maintenance, machine eligibility, family setups, and heterogeneous job release times. Reinforcement learning (RL) has drawn increasing attention in recent years as an alternative way to solve such complicated scheduling problems efficiently. Unfortunately, the performance of RL-based scheduling methods might not be satisfactory, because machine eligibility restrictions cause unexpected correlations between actions, which makes the credit assignment problem harder to address. To minimize the total tardiness, this article proposes a self-imitation learning-based scheduling method in which an agent exploits its past good experiences to explore efficiently. Furthermore, a novel return design is introduced to overcome the credit assignment problem by considering machine eligibility restrictions. To verify the effectiveness and efficiency of the proposed method, numerical experiments are carried out on datasets that simulate real-world OLED display manufacturing systems. Experimental results demonstrate that the proposed method outperforms the baselines, including rule-based methods, meta-heuristics, and another deep reinforcement learning (DRL)-based method, in terms of total tardiness, while requiring less computation time than the meta-heuristics.
- Appears in Collections
- College of Engineering > Department of Industrial and Systems Engineering > 1. Journal Articles
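
The abstract names total tardiness as the objective to be minimized. As a minimal sketch of that objective under its standard scheduling definition (the sum over jobs of max(0, completion time - due date)), the snippet below computes it for a finished schedule. The function name, argument names, and sample values are hypothetical and are not taken from the article; the article's own formulation may include details not shown here.

```python
from typing import Sequence


def total_tardiness(completion_times: Sequence[float],
                    due_dates: Sequence[float]) -> float:
    """Sum of max(0, C_j - d_j) over all jobs j.

    Standard textbook definition; names here are illustrative assumptions,
    not identifiers from the article.
    """
    if len(completion_times) != len(due_dates):
        raise ValueError("completion_times and due_dates must have equal length")
    # A job finished on or before its due date contributes zero tardiness.
    return sum(max(0.0, c - d) for c, d in zip(completion_times, due_dates))


# Hypothetical three-job schedule: the first job is early, the other two are late.
example = total_tardiness([5.0, 9.0, 12.0], [6.0, 8.0, 10.0])
```

In this made-up example the early job adds nothing, and the two late jobs add 1 and 2 time units respectively.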
