Research field: Networking
Conference: International Conference on the Applications of Evolutionary Computation (Part of EvoStar)
Many embedded applications have strict energy, memory, and time constraints, making neural network (NN) inference particularly challenging. Recently, a novel NN architecture called Fast Feedforward Networks (FFFs) has been proposed to achieve inference with extremely lightweight computational demands and minimal latency. Yet, the memory footprint of such NNs remains a challenge. In this paper, we attempt to overcome this challenge by using a weight-sharing technique, called weight virtualization, proposing different virtualization methods that take advantage of the peculiarities of the FFFs’ tree-based architecture. We further optimize the model’s size (resulting from the virtualization configuration) and performance via multi-objective evolutionary optimization based on NSGA-II. Our experiments (https://github.com/DIOL-UniTN/MOE-VFFF) show that, in different benchmarks, leaf virtualization can reduce the memory footprint by up to 13x with negligible accuracy loss.
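To make the memory argument concrete, the following is a minimal sketch of how sharing leaf weights could shrink the footprint of a tree-structured network. It is not the paper's implementation: the names `TreeConfig`, `dense_footprint`, `virtualized_footprint`, and `leaf_to_page`, the one-entry-per-leaf cost of the mapping table, and all parameter counts are hypothetical, chosen only to illustrate the idea of several leaves pointing at the same physical weight page.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TreeConfig:
    """Hypothetical description of a binary, tree-structured network."""
    num_leaves: int   # number of leaf networks (a full binary tree is assumed)
    leaf_params: int  # parameters stored per leaf (assumed equal for all leaves)
    node_params: int  # parameters stored per internal routing node


def dense_footprint(cfg: TreeConfig) -> int:
    """Parameter count when every leaf keeps its own weights."""
    internal_nodes = cfg.num_leaves - 1  # holds for a full binary tree
    return internal_nodes * cfg.node_params + cfg.num_leaves * cfg.leaf_params


def virtualized_footprint(cfg: TreeConfig, leaf_to_page: List[int]) -> int:
    """Parameter count when leaves share physical weight pages.

    leaf_to_page[i] is the index of the shared page used by leaf i.
    Assumption: the mapping table costs one stored entry per leaf.
    """
    if len(leaf_to_page) != cfg.num_leaves:
        raise ValueError("need exactly one page index per leaf")
    internal_nodes = cfg.num_leaves - 1
    distinct_pages = len(set(leaf_to_page))
    mapping_table = cfg.num_leaves
    return (internal_nodes * cfg.node_params
            + distinct_pages * cfg.leaf_params
            + mapping_table)


# Illustrative numbers only, not taken from the paper.
cfg = TreeConfig(num_leaves=16, leaf_params=1000, node_params=10)
mapping = [i % 2 for i in range(cfg.num_leaves)]  # 16 leaves share 2 pages

dense = dense_footprint(cfg)
virt = virtualized_footprint(cfg, mapping)
ratio = dense / virt
```

In this toy setting the mapping itself is the design variable: fewer distinct pages means less memory but also less capacity, which is the kind of size-versus-accuracy trade-off that a multi-objective search such as NSGA-II can explore.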
| Publication year | 2025 |
|---|---|
| Citations | 0 |
| Country of publication | Andorra |
| Site | Springer |
| Likes | 0 |