ITRT(IT Research Trends)

Training 4.6-Bit Convolutional Neural Networks with a HardTanh Activation Function

Research field: Artificial Intelligence

Keywords: #algorithm #algorithms #efficient #simplifying #helps

Venue: Pattern Recognition and Image Analysis

Abstract

Low-bit quantization of neural networks is of great practical importance since it helps to significantly reduce the memory footprint and power consumption as well as increase the computational speed, which is especially important for mobile devices. Acting as a compromise between accurate 8-bit quantization and computationally efficient 4-bit quantization, 4.6-bit quantization is one of the promising methods. In this work, theoretical and practical aspects of training 4.6-bit convolutional neural networks with a HardTanh-type activation function are studied. Applying such activations, one can combine quantization and activation operations, simplifying the training procedure and reducing the computational cost of execution. Combinations of three different initialization strategies and four quantization algorithms are theoretically and experimentally studied for the class of neural networks involved. Special attention is paid to quantizing blocks with residual connections. According to the results, the best accuracy can be achieved by using initialization that balances the variance of gradients in the neural network with a layer-by-layer quantization algorithm that calibrates the weights of the neural network layers similar to the AdaQuant algorithm. For quantization of a block with a residual connection, an extended HardTanh-type activation function should be used, which is subsequently combined with the quantization operation.
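The abstract's claim that a HardTanh-type activation can be combined with the quantization operation can be illustrated with a minimal sketch. It assumes a plain uniform quantizer over the HardTanh range [-1, 1]. The function names and the value `LEVELS = 15` are hypothetical placeholders chosen for illustration; they are not taken from the paper and do not reproduce its 4.6-bit scheme. Because the quantizer already clips its input to the same range that HardTanh saturates at, applying HardTanh first changes nothing, so a single fused step is enough.

```python
def hardtanh(x, lo=-1.0, hi=1.0):
    """HardTanh: identity inside [lo, hi], saturating outside."""
    return max(lo, min(hi, x))


def quantize(x, levels, lo=-1.0, hi=1.0):
    """Uniform quantizer (assumed for illustration): clip x to [lo, hi]
    and return the index of the nearest of `levels` evenly spaced bins."""
    step = (hi - lo) / (levels - 1)
    clipped = max(lo, min(hi, x))
    return int(round((clipped - lo) / step))


def dequantize(q, levels, lo=-1.0, hi=1.0):
    """Map a bin index back to its real value in [lo, hi]."""
    step = (hi - lo) / (levels - 1)
    return lo + q * step


# Hypothetical bin count; NOT the bin count used in the paper.
LEVELS = 15

samples = [-3.0, -1.0, -0.4, 0.0, 0.26, 1.0, 2.5]

# Two-step pipeline: activation first, then quantization.
separate = [quantize(hardtanh(x), LEVELS) for x in samples]

# Fused pipeline: the quantizer's own clipping plays the role of HardTanh.
fused = [quantize(x, LEVELS) for x in samples]

assert separate == fused
```

In this sketch the fused path skips one elementwise pass over the activations, which is the kind of saving the abstract refers to. How the paper handles scales, bin counts and the extended activation for residual blocks is not shown here.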

A. V. Trusov

Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, 119333 Moscow, Russian Federation

Andorra

📄 Paper information

Publication year	2025
Citations	0
Country of publication	Andorra
Site	Springer
Likes	0
