This paper presents QTM, a novel hybrid approach that combines Tsetlin Machines (TMs) with Q-learning to solve the Job Shop Scheduling Problem (JSSP). The proposed model integrates the pattern recognition capabilities of TMs with the decision-making strengths of Q-learning to optimize scheduling decisions. We implement a job shop scheduling (JSS) environment that handles complex scenarios with multiple jobs, operations, and machines while maintaining comprehensive state tracking. The QTM framework employs a novel reward function that balances makespan minimization with machine utilization, and an action selection mechanism that combines TM predictions with scheduling factors. We evaluate our approach on multiple benchmark datasets, including Taillard and Lawrence instances, as well as a real-world battery disassembly case study. Experimental results show that QTM consistently outperforms traditional dispatching rules such as FIFO, MWKR, and SPT, achieving lower makespans and optimality gaps across instances. In the battery disassembly case study, QTM achieved a makespan of 581 with an optimality gap of 3.38%, significantly better than traditional heuristic-based methods. On the Lawrence instances, QTM maintained an average optimality gap of 22.98%, while on the Taillard instances it achieved a gap of 30.30%, showing particular strength in handling larger, more complex scheduling scenarios. Although it does not match the state-of-the-art performance of far more complicated deep learning approaches, QTM offers a resource-efficient and transparent alternative that outperforms standard Q-learning, making it suitable for practical industrial applications with constrained computational resources.
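The abstract reports optimality gaps without defining them. The sketch below assumes the definition commonly used in the scheduling literature: the percentage by which a schedule's makespan exceeds a reference makespan, which is the optimum or best-known value for the instance. The function name and the numbers in the example are illustrative assumptions and are not taken from the paper.

```python
def optimality_gap(makespan: float, reference: float) -> float:
    """Return the percentage gap of a makespan over a reference makespan.

    Assumes the common definition 100 * (makespan - reference) / reference,
    where `reference` is the optimal or best-known makespan for the instance.
    The abstract does not state which definition the authors use.
    """
    if reference <= 0:
        raise ValueError("reference makespan must be positive")
    return 100.0 * (makespan - reference) / reference


# Illustrative values only (not results from the paper):
# a schedule of length 110 against a reference of 100.
example_gap = optimality_gap(makespan=110, reference=100)
print(f"gap = {example_gap:.2f}%")
```

Under this definition, a lower gap means a schedule closer to the reference solution, and a gap of zero means the reference makespan was matched.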
*** Title, author list and abstract as submitted during Camera-Ready version delivery. Small changes that may have occurred during processing by Springer may not appear in this window.