TY - JOUR
T1 - A self-imitation learning approach for scheduling evaporation and encapsulation stages of OLED display manufacturing systems
AU - Lee, Donghun
AU - Park, In Beom
AU - Kim, Kwanho
N1 - Publisher Copyright: © 2024
PY - 2025/6
Y1 - 2025/6
AB - In modern organic light-emitting diode (OLED) manufacturing systems, scheduling is a key decision-making problem for improving productivity. In particular, scheduling the evaporation and encapsulation stages is subject to complicated constraints such as job splitting, preventive maintenance, machine eligibility, family setups, and heterogeneous job release times. Reinforcement learning (RL) has drawn increasing attention in recent years as an alternative for solving such complicated scheduling problems efficiently. However, the performance of RL-based scheduling methods may be unsatisfactory because machine eligibility restrictions cause unexpected correlations between actions, which makes the credit assignment problem more challenging. To minimize the total tardiness, this article proposes a self-imitation learning-based scheduling method in which an agent exploits its past good experiences to explore efficiently. Furthermore, a novel return design that accounts for machine eligibility restrictions is introduced to overcome the credit assignment problem. To demonstrate the effectiveness and efficiency of the proposed method, numerical experiments are carried out on datasets that simulate real-world OLED display manufacturing systems. Experimental results show that the proposed method outperforms the baselines, including rule-based methods, meta-heuristics, and another deep reinforcement learning (DRL)-based method, in terms of total tardiness, while requiring less computation time than the meta-heuristics.
KW - Deep reinforcement learning
KW - Eligibility return
KW - OLED display manufacturing scheduling
KW - Self-imitation learning
KW - Total tardiness
UR - http://www.scopus.com/inward/record.url?scp=85210372182&partnerID=8YFLogxK
U2 - 10.1016/j.rcim.2024.102917
DO - 10.1016/j.rcim.2024.102917
M3 - Article
AN - SCOPUS:85210372182
SN - 0736-5845
VL - 93
JO - Robotics and Computer-Integrated Manufacturing
JF - Robotics and Computer-Integrated Manufacturing
M1 - 102917
ER -