Research field: Artificial Intelligence
Venue: Pattern Recognition and Image Analysis
Low-bit quantization of neural networks is of great practical importance since it helps to significantly reduce the memory footprint and power consumption as well as increase the computational speed, which is especially important for mobile devices. Acting as a compromise between accurate 8-bit quantization and computationally efficient 4-bit quantization, 4.6-bit quantization is one of the promising methods. In this work, theoretical and practical aspects of training 4.6-bit convolutional neural networks with a HardTanh-type activation function are studied. Applying such activations, one can combine quantization and activation operations, simplifying the training procedure and reducing the computational cost of execution. Combinations of three different initialization strategies and four quantization algorithms are theoretically and experimentally studied for the class of neural networks involved. Special attention is paid to quantizing blocks with residual connections. According to the results, the best accuracy can be achieved by using initialization that balances the variance of gradients in the neural network with a layer-by-layer quantization algorithm that calibrates the weights of the neural network layers similar to the AdaQuant algorithm. For quantization of a block with a residual connection, an extended HardTanh-type activation function should be used, which is subsequently combined with the quantization operation.
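
The idea of merging a HardTanh-type activation with quantization can be illustrated with a minimal sketch. It assumes a uniform grid on the clamp interval; the function name `fused_hardtanh_quant`, the bounds `lo`/`hi`, and the level count `n_levels` are hypothetical and not taken from the paper. The default of 24 levels is chosen only because log2(24) ≈ 4.58; the abstract does not state the actual level counts.

```python
import numpy as np

def fused_hardtanh_quant(x, n_levels=24, lo=-1.0, hi=1.0):
    """Clamp x to [lo, hi] and snap it to a uniform grid in a single step.

    Assumption: a uniform grid with n_levels points; this is an illustration,
    not the quantization scheme of the paper.
    """
    scale = (hi - lo) / (n_levels - 1)            # grid step
    # Clipping the rounded code to [0, n_levels - 1] saturates the input exactly
    # as HardTanh would, so no separate activation pass is required.
    codes = np.clip(np.round((x - lo) / scale), 0, n_levels - 1)
    values = lo + codes * scale                   # dequantized activations
    return codes.astype(np.int32), values

# Usage: inputs outside [-1, 1] saturate to the first or last grid point.
codes, values = fused_hardtanh_quant(np.array([-2.0, 0.3, 3.0]), n_levels=5)
```

Because the clamp and the rounding share one `clip`, the saturation of the activation and the range limit of the quantizer coincide, which is the kind of simplification the abstract refers to.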
| Publication year | 2025 |
|---|---|
| Citations | 0 |
| Country of publication | Andorra |
| Site | Springer |