Increasingly fast development cycles and individualized products pose major challenges for today's smart
production systems in times of Industry 4.0. These systems must be flexible and continuously adapt to changing
conditions while still guaranteeing high throughput and robustness against external disruptions. Deep reinforcement learning (RL) algorithms, which have already achieved impressive success with Google DeepMind's
AlphaGo, are increasingly being applied to production systems to meet these requirements. Unlike supervised
and unsupervised machine learning techniques, deep RL algorithms learn based on recently collected sensor and process data in direct interaction with the environment and are able to make decisions in real time.
As such, deep RL algorithms seem promising given their potential to provide decision support in complex
environments, such as production systems, and simultaneously adapt to changing circumstances.
While different use cases for deep RL have emerged, a structured overview and integration of findings on their
application are missing. To address this gap, this contribution provides a systematic literature review of
existing deep RL applications in the field of production planning and control as well as production logistics.
From a performance perspective, it became evident that deep RL can significantly outperform heuristics in
overall performance and provide superior solutions for various industrial use cases. Nevertheless, safety and
reliability concerns must be overcome before the widespread use of deep RL is possible, which requires
more intensive testing of deep RL in real-world applications in addition to the already ongoing intensive simulations.