dc.identifier.uri |
http://dx.doi.org/10.15488/11231 |
|
dc.identifier.uri |
https://www.repo.uni-hannover.de/handle/123456789/11318 |
|
dc.contributor.author |
van Ekeris, Tilo
|
|
dc.contributor.author |
Meyes, Richard
|
|
dc.contributor.author |
Meisen, Tobias
|
|
dc.contributor.editor |
Herberger, David
|
|
dc.contributor.editor |
Hübner, Marco
|
|
dc.date.accessioned |
2021-08-19T08:32:14Z |
|
dc.date.issued |
2021 |
|
dc.identifier.citation |
van Ekeris, T.; Meyes, R.; Meisen, T.: Discovering Heuristics And Metaheuristics For Job Shop Scheduling From Scratch Via Deep Reinforcement Learning. In: Herberger, D.; Hübner, M. (Eds.): Proceedings of the Conference on Production Systems and Logistics : CPSL 2021. Hannover : publish-Ing., 2021, S. 709-718. DOI:https://doi.org/10.15488/11231 |
|
dc.description.abstract |
Scheduling is the mathematical problem of allocating tasks to resources considering certain constraints. The goal is to achieve the best possible scheduling quality given a quality metric like makespan. Typical scheduling problems, including the classic Job Shop Scheduling Problem (JSP or JSSP), are NP-hard, meaning it is infeasible to use optimal solvers for large problem sizes. Instead, heuristics are frequently used to find suboptimal solutions in polynomial time, especially in real-world applications. Recently, Deep Reinforcement Learning (DRL) has also been applied to find solutions for planning problems like the JSP. In DRL, agents learn solution strategies for specific problem classes through trial and error. In this paper, we explore the connection between known heuristics and DRL: heuristics always rely on features that can be extracted from the considered problem with low computational effort. We show that DRL agents whose observations are limited to the underlying features of well-known heuristics learn the behaviour of the higher-quality heuristics from scratch, while they do not learn the behaviour of lower-quality heuristics that would also be possible learning outcomes given the same feature as observation. Additionally, we motivate the use of DRL as a metaheuristic generator by training with the features of multiple basic heuristics. We show promising results indicating that this learned metaheuristic finds better schedules in terms of makespan than any single simple heuristic, while requiring only simple computations in the time-critical solution phase and thus being faster than optimal solvers. |
eng |
dc.language.iso |
eng |
|
dc.publisher |
Hannover : publish-Ing. |
|
dc.relation.ispartof |
https://doi.org/10.15488/11229 |
|
dc.relation.ispartof |
Proceedings of the Conference on Production Systems and Logistics : CPSL 2021 |
|
dc.rights |
CC BY 3.0 DE |
|
dc.rights.uri |
https://creativecommons.org/licenses/by/3.0/de/ |
|
dc.subject |
Deep Reinforcement Learning (DRL) |
eng |
dc.subject |
Production planning |
eng |
dc.subject |
Scheduling |
eng |
dc.subject |
Job Shop Scheduling (JSP, JSSP) |
eng |
dc.subject |
Proximal Policy Optimization (PPO) |
eng |
dc.subject |
Heuristics |
eng |
dc.subject |
Metaheuristics |
eng |
dc.subject.classification |
Konferenzschrift |
ger |
dc.subject.ddc |
620 | Ingenieurwissenschaften und Maschinenbau
|
|
dc.title |
Discovering Heuristics And Metaheuristics For Job Shop Scheduling From Scratch Via Deep Reinforcement Learning |
eng |
dc.type |
BookPart |
|
dc.type |
Text |
|
dc.relation.essn |
2701-6277 |
|
dc.description.version |
publishedVersion |
|
tib.accessRights |
frei zugänglich |
|