MY201868A

MY201868A - Neural network processor using compression and decompression of activation data to reduce memory bandwidth utilization

Info

Publication number: MY201868A
Application number: MYPI2019006051A
Authority: MY
Inventors: Leon Corkery Joseph; Eliot Lundell Benjamin; Marvin Wall Larry; Balling Mcbride Chad; Ashok Ambardekar Amol; Petre George; D Cedola Kent; Bobrov Boris
Original assignee: Microsoft Technology Licensing, LLC
Priority date: 2017-04-17
Filing date: 2018-04-16
Publication date: 2024-03-21
Also published as: EP3612946B1; US20180300634A1; CN110546628A; CN110520857A; CN118153639A; CN110520909A; EP3613026B1; BR112019021541A2; US20200233820A1; KR20230152828A; PH12019550191A1; US11030131B2; SG11201909175XA; WO2018194846A1; EP3612991B1; CL2019002864A1; WO2018194994A2; CN110582785A; MX2023008178A; CN110520909B

Abstract

A deep neural network ("DNN") module can compress and decompress neuron-generated activation data to reduce the utilization of memory bus bandwidth. A compression unit (200) can receive an uncompressed chunk of data (202) generated by a neuron in the DNN module. The compression unit generates a mask portion (208) and a data portion (210) of a compressed output chunk. The mask portion encodes the presence and location of the zero and non-zero bytes in the uncompressed chunk of data. The data portion stores truncated non-zero bytes from the uncompressed chunk of data. A decompression unit (500) can receive a compressed chunk of data (204) from memory in the DNN processor or memory of an application host. The decompression unit decompresses the compressed chunk of data using the mask portion (208) and the data portion (210). This can reduce memory bus utilization, allow a DNN module to complete processing operations more quickly, and reduce power consumption. (Figure 4)
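
The mask-and-data scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `compress` and `decompress`, the `keep_bits` parameter, the choice of one mask bit per byte (bit i for byte i), and truncation by dropping low-order bits are all assumptions made for illustration. The abstract states only that the mask records where the zero and non-zero bytes are and that the data portion holds truncated non-zero bytes.

```python
from typing import List, Tuple


def compress(chunk: bytes, keep_bits: int = 8) -> Tuple[int, List[int]]:
    """Split a chunk into a mask portion and a data portion.

    Assumption: mask bit i is 1 when byte i is non-zero, and each
    non-zero byte is truncated to its top `keep_bits` bits.
    """
    if not 1 <= keep_bits <= 8:
        raise ValueError("keep_bits must be between 1 and 8")
    shift = 8 - keep_bits
    mask = 0
    data: List[int] = []
    for i, byte in enumerate(chunk):
        if byte != 0:
            mask |= 1 << i              # record the position of a non-zero byte
            data.append(byte >> shift)  # keep only the high-order bits
    return mask, data


def decompress(mask: int, data: List[int], length: int, keep_bits: int = 8) -> bytes:
    """Rebuild a chunk of `length` bytes from the mask and data portions."""
    shift = 8 - keep_bits
    values = iter(data)
    out = bytearray(length)             # zero bytes are restored implicitly
    for i in range(length):
        if (mask >> i) & 1:
            out[i] = next(values) << shift  # dropped low-order bits come back as 0
    return bytes(out)


# Hypothetical 6-byte activation chunk with two non-zero bytes.
chunk = bytes([0x00, 0x7F, 0x00, 0x00, 0xF3, 0x00])
mask, data = compress(chunk)
restored = decompress(mask, data, len(chunk))
```

With `keep_bits=8` nothing is truncated, so the round trip is exact. With a smaller `keep_bits`, zero bytes are still restored exactly, but non-zero bytes lose their low-order bits, which is the lossy trade-off this sketch assumes. Sparse activation data, where most bytes are zero, is the case in which storing only a mask and the non-zero bytes saves the most memory traffic.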