In this paper, we apply reinforcement learning to the power management control of multi-type air-conditioners in buildings. In general, reinforcement learning requires several tens of thousands of training episodes before the control performance reaches a practical level. Applying it directly to air-conditioning control at 10-minute intervals would therefore require an unrealistically long training period, on the order of several years. We attempt to shorten the learning period by training in advance on a virtual building that emulates the dynamic characteristics of the actual building. Since it is difficult to reproduce the air-conditioning environment of the actual building exactly, we propose a method that selects the closest of several virtual buildings based on the differences in immediate reward.
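The selection step can be illustrated with a minimal sketch. The abstract does not specify how the differences in immediate reward are measured, so the sketch assumes a mean absolute difference over a short observation window in which the actual building and each virtual building receive the same control actions. The names `reward_gap`, `select_closest_building`, and `virtual_A` to `virtual_C`, as well as all reward values, are hypothetical and chosen only for illustration.

```python
from statistics import mean


def reward_gap(actual_rewards, virtual_rewards):
    """Mean absolute difference between two equal-length reward sequences.

    Assumption: this metric is a stand-in; the paper may use a different one.
    """
    if len(actual_rewards) != len(virtual_rewards):
        raise ValueError("reward sequences must have the same length")
    return mean(abs(a - v) for a, v in zip(actual_rewards, virtual_rewards))


def select_closest_building(actual_rewards, candidates):
    """Return the name of the candidate whose rewards are closest to the actual ones."""
    return min(
        candidates,
        key=lambda name: reward_gap(actual_rewards, candidates[name]),
    )


# Hypothetical immediate rewards, one value per 10-minute control step.
actual = [-1.0, -0.8, -1.2, -0.9]
candidates = {
    "virtual_A": [-1.5, -1.4, -1.6, -1.3],
    "virtual_B": [-1.1, -0.7, -1.2, -1.0],
    "virtual_C": [-0.2, -0.3, -0.1, -0.4],
}

closest = select_closest_building(actual, candidates)
print(closest)
```

Under these assumptions, the policy pre-trained on the selected virtual building would serve as the starting point for control of the actual building.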