An Evolutionary Multitasking-Based Unsupervised Learning Framework for Learning to Optimize


Research field: Artificial Intelligence



Conference: 2025 IEEE Congress on Evolutionary Computation (CEC)


Abstract

Learning to optimize (L2O) is a paradigm for training a learnable optimizer that can quickly infer optimal solutions with less computation. While supervised learning (SL) and reinforcement learning (RL) are prevalent, SL requires the optimal solutions of the training instances beforehand, and RL relies heavily on meticulous reward design. More importantly, both paradigms struggle to generalize to new problem instances. This paper therefore proposes a novel learning framework, named evolutionary multitasking-based unsupervised learning (EMTUL), which eliminates the dependency on optimal labels and complicated rewards and maintains diversified parameter vectors to improve generalization. Taking the traveling salesman problem (TSP) as a case study, a lightweight learnable optimizer named tour generator (TrGen) is devised. During training, each instance is treated as a task, and an evolutionary multitasking (EMT) algorithm is employed to train the TrGen on multiple tasks simultaneously. An epoch ends when the algorithm terminates, and a set of candidate parameter vectors is preserved for the subsequent epoch based on distribution diversity and generalization performance. This approach gradually directs the training process towards diversified search regions that are beneficial for generalization. After training, the set of candidate parameter vectors serves as a knowledge reserve. When encountering new instances, the learnable optimizer can either directly infer optimal solutions via the knowledge reserve or fine-tune its parameters to accommodate the specifics of each new instance. The main benefits of EMTUL are: 1) it directly uses the objective function as the loss function, dispensing with the need for optimal solutions and for a differentiable loss or reward function; 2) it maintains a set of elite parameter vectors to efficiently handle new instances.
Finally, experimental results demonstrate that TrGen, with few model parameters and trained under EMTUL, effectively identif...
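The training loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: `decode_tour` is a hypothetical two-parameter stand-in for TrGen, and `evolve` is a plain (mu + lambda) evolution strategy rather than an EMT algorithm, with no knowledge transfer between tasks and no diversity-based selection. All function names and parameters are assumptions made for illustration. What the sketch does show is the core idea of using the TSP objective (tour length) directly as a label-free, non-differentiable loss, and of keeping a small set of elite parameter vectors to reuse on new instances.

```python
import math
import random


def tour_length(coords, tour):
    """Length of the closed tour: the TSP objective, used directly as the loss."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def decode_tour(params, coords):
    """Hypothetical stand-in for a learnable tour generator (not TrGen).

    Visits cities in order of their angle around a learned centre point
    params = (cx, cy); ties are broken by city index.
    """
    cx, cy = params
    return sorted(
        range(len(coords)),
        key=lambda i: (math.atan2(coords[i][1] - cy, coords[i][0] - cx), i),
    )


def fitness(params, instances):
    """Mean tour length over all training instances (each one a 'task')."""
    total = sum(tour_length(c, decode_tour(params, c)) for c in instances)
    return total / len(instances)


def evolve(instances, pop_size=20, generations=30, keep=5, sigma=0.1, seed=0):
    """Plain (mu + lambda) evolution strategy (an assumption, not EMT).

    Returns the `keep` best (fitness, params) pairs, sorted by fitness,
    as a small 'knowledge reserve'.
    """
    rng = random.Random(seed)
    pop = [(rng.random(), rng.random()) for _ in range(pop_size)]
    scored = sorted((fitness(p, instances), p) for p in pop)
    for _ in range(generations):
        parents = [p for _, p in scored[:keep]]
        # Each child is a Gaussian perturbation of a randomly chosen elite.
        children = [
            tuple(v + rng.gauss(0.0, sigma) for v in rng.choice(parents))
            for _ in range(pop_size)
        ]
        # Elites survive alongside children, so the best fitness never worsens.
        scored = sorted(scored[:keep] + [(fitness(c, instances), c) for c in children])
    return scored[:keep]


def infer(reserve, coords):
    """Try every reserved parameter vector on a new instance; keep the shortest tour."""
    tours = [decode_tour(params, coords) for _, params in reserve]
    return min(tours, key=lambda t: tour_length(coords, t))
```

A typical use would be to call `evolve` on a list of random training instances (each a list of `(x, y)` tuples) and then pass the returned reserve to `infer` together with an unseen instance. Because no optimal tours or gradients are required, the same loop would work for any objective that can be evaluated on a decoded solution.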


Author Profile
Wei Wang

School of Artificial Intelligence and Automation Huazhong University of Science and Technology Wuhan China

Andorra
Author Profile
Yindong Shen

School of Artificial Intelligence and Automation Huazhong University of Science and Technology Wuhan China

Andorra

📄 Paper Information

Publication year: 2025
Citations: 67
Country of publication: Andorra
Site: IEEE

Related papers (357)