TCX: A RISC Style Tensor Computing Extension and a Programmable Tensor Processor


Research field: Cryptography



Venue: ACM Transactions on Embedded Computing Systems, Volume 22, Issue 3


Abstract

Neural network processors and accelerators are domain-specific architectures deployed to meet the high computational requirements of deep learning algorithms. This article proposes a new instruction set extension for tensor computing, TCX, using Reduced Instruction Set Computer (RISC) instructions enhanced with variable-length tensor extensions. It features a multi-dimensional register file, dimension registers, and fully generic tensor instructions. It can be seamlessly integrated into existing RISC Instruction Set Architectures and provides software compatibility for scalable hardware implementations. We present a tensor accelerator implementation of the tensor extensions using an out-of-order RISC microarchitecture. The tensor accelerator is scalable in computation units from several hundred to tens of thousands. An optimized register renaming mechanism is described that allows for many physical tensor registers without requiring architectural support for large tensor register names. We describe new tensor load and store instructions that reduce bandwidth requirements using tensor dimension registers. Implementations may balance data bandwidth and computation utilization for different types of tensor computations such as element-wise, depthwise, and matrix-multiplication. We characterize the computation precision of tensor operations to balance area, generality, and accuracy loss for several well-known neural networks. The TCX processor runs at 1 GHz and sustains 8.2 tera-operations per second (TOPS) using a compute unit with 4,096 multiply-accumulate (MAC) units. It occupies 12.8 mm² while dissipating 0.46 W/TOPS in TSMC 28-nm technology.
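The headline figures in the abstract can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope reading, not the authors' methodology. It assumes that one multiply-accumulate counts as two operations (a common convention that the abstract does not state) and that the quoted W/TOPS figure applies at the sustained throughput. All function and constant names are hypothetical.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
# Assumption (not stated in the abstract): one MAC = 2 operations
# (one multiply plus one add).

MAC_UNITS = 4096          # multiply-accumulate units, from the abstract
CLOCK_HZ = 1e9            # 1 GHz, from the abstract
OPS_PER_MAC = 2           # assumed convention
SUSTAINED_TOPS = 8.2      # from the abstract
AREA_MM2 = 12.8           # from the abstract
WATTS_PER_TOPS = 0.46     # from the abstract


def peak_tops(mac_units, clock_hz, ops_per_mac=OPS_PER_MAC):
    """Peak throughput in tera-operations per second."""
    return mac_units * clock_hz * ops_per_mac / 1e12


def implied_power_watts(tops, watts_per_tops):
    """Power implied by an energy-efficiency figure at a given throughput."""
    return tops * watts_per_tops


def area_efficiency(tops, area_mm2):
    """Throughput per unit of silicon area, in TOPS/mm^2."""
    return tops / area_mm2


peak = peak_tops(MAC_UNITS, CLOCK_HZ)
power = implied_power_watts(SUSTAINED_TOPS, WATTS_PER_TOPS)
density = area_efficiency(SUSTAINED_TOPS, AREA_MM2)

print(f"peak throughput:  {peak:.3f} TOPS")
print(f"implied power:    {power:.2f} W")
print(f"area efficiency:  {density:.2f} TOPS/mm^2")
```

Under these assumptions the peak works out to 8.192 TOPS, which matches the quoted 8.2 TOPS after rounding. The implied power is about 3.8 W and the area efficiency about 0.64 TOPS/mm².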


Authors

Tailin Liang
University of Science and Technology Beijing, China, and Hua Xia General Processor Technologies, Haidian Qu, Beijing, China

Lei Wang
University of Science and Technology Beijing, Beijing, China

Shaobo Shi
University of Science and Technology Beijing, China, and Hua Xia General Processor Technologies, Haidian Qu, Beijing, China

📄 Paper information

Publication year: 2023
Citations: 1
Publication country: Andorra
Site: ACM
