WO2018121670A1 - Apparatus and system, chip, and electronic device for compression/decompression - Google Patents

Apparatus and system, chip, and electronic device for compression/decompression

Info

Publication number
WO2018121670A1
WO2018121670A1 (PCT/CN2017/119364)
Authority
WO
WIPO (PCT)
Prior art keywords
data
neural network
module
encoding
video
Prior art date
Application number
PCT/CN2017/119364
Other languages
English (en)
French (fr)
Inventor
陈天石
罗宇哲
郭崎
刘少礼
陈云霁
Original Assignee
上海寒武纪信息科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海寒武纪信息科技有限公司
Priority to EP17889129.7A priority Critical patent/EP3564864A4/en
Publication of WO2018121670A1 publication Critical patent/WO2018121670A1/zh
Priority to US16/457,397 priority patent/US10462476B1/en
Priority to US16/561,012 priority patent/US10834415B2/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/625Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/13Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/91Entropy coding, e.g. variable length coding [VLC] or arithmetic coding

Definitions

  • the present disclosure relates to the field of artificial neural network technologies, and in particular, to an apparatus and system, a chip, and an electronic device for compressing/decompressing neural network data.
  • Artificial neural networks (ANNs), neural networks (NNs) for short, have made great progress in many fields such as intelligent control and machine learning.
  • neural networks have once again become a hot issue in the field of artificial intelligence.
  • the size of neural networks has become larger and larger.
  • Google Inc. has proposed the concept of “large-scale deep learning”, hoping to build intelligent computer systems on Google's platform for integrating global information.
  • the present disclosure provides an apparatus and system, chip, and electronic device for compression/decompression of neural network data to reduce the pressure of storage space and memory access bandwidth.
  • a compression apparatus for neural network data includes: a model conversion module 120 for converting neural network numerical data into video-like data; and a data encoding module 131, coupled to the model conversion module 120, for encoding the video-like data in a video encoding manner to obtain a compression result.
  • the video-like data refers to data in which, after conversion by the model conversion module, each original neural network numerical value is converted into an integer within a preset range, corresponding to the representation of one pixel; together these integers form the data of a corresponding video.
  • the model conversion module 120 converts neural network numerical data into video-like data in one of two ways:
  • the first way: determine the data range of the neural network numerical data as [-b, a], where a is a positive integer greater than or equal to the maximum value of the entire neural network numerical data, and -b is a negative integer less than or equal to the minimum value of the entire neural network numerical data.
  • the model conversion module 120 performs the conversion according to the following formula:
  • I = round((w + b) / (a + b) × (2^t − 1))
  • where I is an integer in the interval [0, 2^t − 1], that is, the representation of one pixel; w is the true data value of the neural network numerical data in the range [-b, a]; a and b are both positive integers, and t is a positive integer.
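As an illustration, here is a minimal sketch of such a mapping. The linear rescale and the function name are assumptions consistent with the stated variable ranges, not the patent's verbatim formula:

```python
def to_pixel(w, a, b, t=8):
    """Map a real weight w in [-b, a] to an integer 'pixel' in [0, 2**t - 1].
    Hypothetical linear mapping inferred from the stated ranges."""
    levels = (1 << t) - 1               # 2**t - 1, e.g. 255 when t = 8
    i = round((w + b) / (a + b) * levels)
    return max(0, min(levels, i))       # clamp against rounding overshoot

# e.g. weights in [-1, 1] mapped onto 8-bit pixel values
print(to_pixel(-1.0, a=1, b=1))  # 0
print(to_pixel(0.0, a=1, b=1))   # 128
print(to_pixel(1.0, a=1, b=1))   # 255
```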
  • neural network numerical data has spatial local similarity; that is, spatially adjacent neurons and weight connections tend to be similar, much as video pixels are similar within and between frames, so it is possible to compress the neural network by video compression.
  • the model conversion module 120 converts the weights and offsets of each convolution kernel in the convolutional neural network numerical data, integrates the integers obtained from the conversion into data corresponding to one video frame, and combines the frame-like data obtained from the weights and offsets of the plurality of convolution kernels to obtain the video-like data.
  • the convolutional neural network numerical data refers to: neural network numerical data of a convolutional neural network.
  • the “integration” refers to preserving the information of the convolutional neural network's convolution kernels when each block of frame-like data is converted into convolution kernel data; a linked list or another data structure may be used for storage.
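A hypothetical sketch of this kernel-to-frame integration, assuming a linear pixel mapping over [-b, a] and using a plain list as the "linked list or other data structure" that records the kernel layout:

```python
def kernels_to_frames(kernels, a, b, t=8):
    """Hypothetical sketch: each convolution kernel (flattened weights plus
    its offset) becomes one video-like 'frame' of pixel integers, and a side
    record keeps the layout needed to de-integrate the frames later."""
    levels = (1 << t) - 1
    frames, layout = [], []
    for weights, offset in kernels:
        values = list(weights) + [offset]           # weights, then offset
        frame = [round((w + b) / (a + b) * levels) for w in values]
        frames.append(frame)
        layout.append({"n_weights": len(weights)})  # kernel-shape record
    return frames, layout

frames, layout = kernels_to_frames([([0.5, -0.5], 0.0), ([1.0, -1.0], 0.25)], a=1, b=1)
print(len(frames), layout[0]["n_weights"])  # 2 2
```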
  • the data encoding module 131 includes: an encoding sub-module for encoding the video-like data in a video encoding manner to obtain a data encoding result; and an integration sub-module for integrating the data encoding result and the encoding process information to obtain a compression result.
  • the encoding sub-module includes: a prediction unit 130a, configured to perform predictive coding using the correlation between neighboring data in the video-like data; a transform unit 130b, configured to perform orthogonal transform coding on the video-like data processed by the prediction unit to compress the data;
  • a quantization unit 130c, configured to perform quantization coding on the video-like data processed by the transform unit, reducing the coding length of the data without degrading the data quality;
  • an entropy encoding unit 130d, configured to perform rate compression encoding on the video-like data processed by the quantization unit by using the statistical characteristics of the data to reduce data redundancy.
  • the prediction unit 130a, the transform unit 130b, the quantization unit 130c, and the entropy encoding unit 130d share the same data buffer unit or respectively correspond to a data buffer unit.
  • the encoding sub-module includes: a deep automatic codec unit 130e for further encoding the video-like data output by the model conversion module and using the hidden layer output as the encoding result;
  • the deep automatic codec unit 130e is trained with the video-like data as both the training input and the ideal output, minimizing the reconstruction error so that the output becomes substantially the same as the input video-like data.
  • if the output equals the input, the output can be regarded as a reconstruction of the input; where the output differs from the input, the difference is the reconstruction error.
  • minimizing the reconstruction error through training means minimizing this difference; “substantially the same” has no strict benchmark here and only means similar.
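A minimal illustration of the reconstruction error being minimized. Mean squared error is one common choice; the text does not fix a particular error measure:

```python
def reconstruction_error(inputs, outputs):
    """Mean squared reconstruction error between the autoencoder's input
    and its output -- the quantity that training drives toward zero."""
    return sum((x - y) ** 2 for x, y in zip(inputs, outputs)) / len(inputs)

print(reconstruction_error([1.0, 2.0], [1.0, 2.0]))  # 0.0 (perfect reconstruction)
print(reconstruction_error([1.0, 2.0], [1.0, 3.0]))  # 0.5
```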
  • the compression apparatus further includes: a structure information encoding module for encoding the neural network structure information to obtain neural network structure data.
  • the neural network numerical data includes weight data and bias data of the neural network.
  • the neural network structure information includes: a connection mode between neurons, a number of neurons in a layer, and a type of activation function.
  • the structural information encoding module encodes the neural network structure information by recording the number of intra-layer neurons in each layer of the neural network, encoding the activation function type, and representing the connection relationship of neurons between adjacent layers with an adjacency matrix; using the layer number as the index, with the neuron count, activation function type number, and adjacency matrix as the indexed result, yields the neural network structure data.
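A hypothetical sketch of such an index structure; the activation-type numbering and field names are assumptions, not taken from the patent:

```python
def encode_structure(layer_sizes, activations, adjacency):
    """Hypothetical index structure for the neural network structure data:
    layer number -> (neuron count, activation-type code, adjacency matrix
    to the next layer). Activation numbering is assumed."""
    act_codes = {"sigmoid": 0, "tanh": 1, "relu": 2}
    return {
        layer: {
            "neurons": n,
            "activation": act_codes[activations[layer]],
            "adjacency": adjacency[layer] if layer < len(adjacency) else None,
        }
        for layer, n in enumerate(layer_sizes)
    }

# 2-3-1 fully connected net: adjacency[i][j][k] = 1 if neuron j of layer i
# connects to neuron k of layer i+1
net = encode_structure(
    [2, 3, 1],
    ["relu", "relu", "sigmoid"],
    [[[1, 1, 1], [1, 1, 1]], [[1], [1], [1]]],
)
print(net[0]["neurons"], net[2]["activation"])  # 2 0
```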
  • the compression apparatus of the present disclosure further includes: a data cache module 140 for buffering neural network numerical data; and a controller module 110, connected to the data cache module 140, the model conversion module 120, and the data encoding module 131, for sending control commands to perform the following operations:
  • a data cache instruction is sent to the data cache module 140 to obtain the compression result from the data encoding module 131 and cache the compression result.
  • a decompression device for neural network data.
  • the decompression device includes: a data decoding module 132, configured to obtain a compression result, and decode the compression result by using a video decoding manner corresponding to the compression result; and a model conversion module 120 connected to the data decoding module 132, It is used to restore the decoded video-like data into neural network numerical data.
  • the data decoding module 132 includes: a de-integration sub-module for de-integrating the compression result to obtain the data encoding result and the encoding process information; and a decoding sub-module, configured to extract the coding mode information from the encoding process information and decode the data encoding result in the decoding manner corresponding to that coding mode information to obtain the video-like data.
  • the model conversion module restores the decoded video-like data into neural network numerical data in one of two ways:
  • the first way: determine that the data range of the neural network numerical data is [-b, a], where a is a positive integer greater than or equal to the maximum value of the entire neural network numerical data, and -b is a negative integer less than or equal to the minimum value.
  • the model conversion module 120 operates according to the following formula to restore the neural network numerical data:
  • w = I × (a + b) / (2^t − 1) − b
  • where w is the restored data value of the neural network numerical data, lying in the range [-b, a] it occupied before conversion by the compression device's model conversion module, and I is the video-like data, an integer in the interval [0, 2^t − 1]; t is a positive integer.
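A sketch of the restoring direction, assuming the inverse of a linear compression-side mapping (function and variable names are illustrative, not from the patent):

```python
def from_pixel(i, a, b, t=8):
    """Restore a weight in [-b, a] from a pixel integer in [0, 2**t - 1].
    Assumed inverse linear mapping, consistent with the stated ranges."""
    levels = (1 << t) - 1
    return i * (a + b) / levels - b

print(from_pixel(0, a=1, b=1))    # -1.0
print(from_pixel(255, a=1, b=1))  # 1.0
```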
  • the neural network numerical data herein refers to the previously compressed neural network numerical data, because the range before compression is [-b, a], so the decompressed neural network numerical data is also in this interval.
  • the model conversion module 120 converts the data of the corresponding video frame in the video-like data, and converts each frame into a weight and bias of a convolution kernel of the convolutional neural network.
  • the data converted by each frame is integrated to obtain the overall information of the weight and offset of each convolution kernel of the convolutional neural network, thereby restoring the neural network numerical data.
  • the convolutional neural network numerical data refers to: neural network numerical data of a convolutional neural network.
  • integration means that when each block of frame-like data is converted into convolution kernel data, the information of the convolution kernels of the entire convolutional neural network is obtained; a linked list or another data structure may be used.
  • the decompression device further includes: a neural network restoration module, configured to decode the neural network structure data, obtain neural network structure information, and transform the neural network structure information and the restored neural network numerical data. Restore the neural network together.
  • the neural network numerical data is weight data and bias data of a neural network.
  • the neural network structure information includes: the connection manner between neurons, the number of intra-layer neurons, and the activation function type; the neural network structure data is the data obtained by encoding the neural network structure information.
  • the neural network numerical data is the weight data and offset data of the neural network; or the decompression device further includes: a non-numeric data decoding module, configured to perform decoding to obtain the corresponding neural network non-numeric data, wherein the neural network non-numeric data is one or more of the following: data on the connection manner between neurons, and layer data.
  • the decompression apparatus of the present disclosure further includes: a data cache module 140 for buffering the compression result; and a controller module 110, connected to the model conversion module 120, the data decoding module 132, and the data cache module 140, for issuing control instructions to the three to perform the following operations:
  • a data conversion instruction is sent to the model conversion module 120 to convert the video-like data into neural network numerical data.
  • a system for compression/decompression of neural network data includes: a compression device, which is the compression device described above; and a decompression device, which is the decompression device described above.
  • the compression device and the decompression device share a data cache module 140, a controller module 110, and a model conversion module 120.
  • a chip comprising: a compression device as described above; and/or a decompression device as described above; and/or a compression/decompression system as described above.
  • the chip includes a storage component, and the compression device is disposed outside the storage component for compressing the neural network data to be written into the storage component; or the chip includes an input port, and the compression device is disposed outside the input port for compressing the input neural network data; or the chip includes a data transmitting end, and the compression device is disposed at the data transmitting end for compressing the neural network data to be sent.
  • the chip includes a storage component, and the decompression device is disposed outside the storage component for decompressing the compressed neural network data read from the storage component; or the chip includes an output port, and the decompression device is disposed outside the output port for decompressing the input compressed neural network data; or the chip includes a data receiving end, and the decompression device is disposed at the data receiving end for decompressing the received compressed neural network data.
  • an electronic device comprising: the chip as described above.
  • the electronic device includes a data processing device, a robot, a computer, a printer, a scanner, a tablet, a smart terminal, a mobile phone, a driving recorder, a navigator, a sensor, a camera, a cloud server, a means of transportation, a household appliance, and/or a medical device.
  • the means of transportation includes an airplane, a ship, and/or a vehicle.
  • the household appliance includes a television, an air conditioner, a microwave oven, a refrigerator, a rice cooker, a humidifier, a washing machine, an electric lamp, a gas stove, and/or a range hood.
  • the medical device includes a nuclear magnetic resonance instrument, a B-mode ultrasound scanner, and/or an electrocardiograph.
  • the apparatus and system, chip, and electronic device for compressing/decompressing neural network data of the present disclosure have at least one of the following beneficial effects:
  • the present disclosure can achieve high-efficiency compression and decompression of a large-scale neural network model, thereby greatly reducing the storage space and transmission pressure of the neural network model, thereby adapting to the trend of expanding the size of the neural network in the era of big data.
  • FIG. 1 is a schematic structural diagram of a compression apparatus for compressing neural network data according to a first embodiment of the present disclosure.
  • FIG. 2 is a schematic structural view of a data encoding module in the compression device shown in FIG. 1.
  • FIG. 3 is a flow chart of a controller module in FIG. 1 transmitting a control command to perform an operation.
  • FIG. 4 is a schematic structural diagram of a decompression apparatus for decompressing a neural network data compression result according to a second embodiment of the present disclosure.
  • FIG. 5 is a schematic structural diagram of a data decoding module in the decompression device shown in FIG. 4.
  • FIG. 6 is a flow chart of a controller module transmitting control commands in the decompression device shown in FIG. 4 to perform an operation.
  • FIG. 7 is a schematic structural diagram of a compression/decompression system for neural network data compression results according to a third embodiment of the present disclosure.
  • Figure 8 is a schematic illustration of a compression process and a decompression process in the compression/decompression system of Figure 7.
  • FIG. 9 is a schematic structural view of a second embodiment of a compression device according to the present disclosure.
  • Figure 10 is a block diagram showing the second embodiment of the decompression device according to the present disclosure.
  • 131-data encoding module 131a-integration sub-module
  • 132-data decoding module 132a-de-integration sub-module
  • 130a-prediction unit 130b-transform unit; 130c-quantization unit;
  • 130d-entropy coding unit 130d'-entropy decoding unit
  • Video coding and decoding technology is a very mature technology.
  • Traditional video coding and decoding technology uses techniques such as prediction, transform, and entropy coding; since the rise of deep learning, the use of deep neural networks for video encoding and decoding has become a new research hotspot.
  • neural network data refers to a collection of neural network information, including neural network numerical data and neural network structural information.
  • the neural network numerical data includes: the weight data and offset data of the neurons, which are genuinely numerical data. After careful research and comparison, the applicant found that neural network numerical data exhibits the same local correlation as the pixels of a video image; therefore, encoding and decoding the neural network model with video codec methods, and thereby compressing the neural network model, is a viable technical route.
  • the neural network structure information includes: the connection mode between neurons, the number of neurons, and the type of activation function.
  • a compression apparatus for compressing neural network data. In an actual system environment, the compression device can be installed near the storage component to compress the neural network data written into the storage component; it can also be placed near the input port to compress the input neural network data; or the compression device can be disposed at the transmitting end of the data for compressing the data to be transmitted.
  • the neural network data includes: neural network numerical data, specifically, weight data and offset data of the neuron.
  • the compression apparatus for compressing neural network numerical data in this embodiment includes: a controller module 110, a model conversion module 120, a data encoding module 131, and a data cache module 140.
  • the data cache module 140 is configured to cache the neural network numerical data obtained from the external storage module 200.
  • the model conversion module 120 is coupled to the data cache module 140 for converting neural network numerical data into video-like data.
  • the data encoding module 131 is coupled to the model conversion module 120 for encoding video-like data in a video encoding manner.
  • the controller module 110 is connected to the model conversion module 120, the data encoding module 131, and the data cache module 140, and is used to issue control commands to the three to coordinate the work.
  • the controller module 110 sends a control instruction to perform the following operations:
  • Step S302: sending a data read instruction to the data cache module 140, requesting the neural network numerical data from the external storage module 200 and buffering it;
  • neural network numerical data has spatial local similarity; that is, spatially adjacent neurons and weight connections tend to be similar, much as video pixels are similar within and between frames, so it is possible to compress the neural network by video compression.
  • Step S304 sending a data read instruction to the model conversion module 120 to read the neural network value data from the data cache module 140;
  • Step S306 sending a data conversion instruction to the model conversion module 120, so that it converts the read neural network numerical data into video-like data;
  • the video-like data here refers to the original neural network numerical data converted, by the model conversion module, into a series of integer values within a preset range, such as integers in the interval [0, 255]; each corresponds to the representation of one pixel, and together these integers constitute the video-like data.
  • the following takes two specific kinds of neural network numerical data as examples:
  • the model conversion module can operate according to the following formula:
  • I = round((w + b) / (a + b) × 255)
  • where I is an integer in the interval [0, 255], that is, the representation of one pixel, and w is the true data value of the neural network numerical data in the range [-b, a].
  • 8bpp indicates that the pixel depth is 8, that is, each pixel is represented by 8-bit data; in this case one pixel can take 2^8 = 256 values (colors).
  • the neural network numerical data includes convolutional neural network numerical data, for convolutional neural network numerical data:
  • the model conversion module 120 converts the weights and offsets of each convolution kernel in the convolutional neural network numerical data, and integrates the weights and the integers obtained by the offset conversion to obtain data corresponding to the video frames.
  • the frame-like data obtained from the weights and offsets of the various convolution kernels are combined to obtain the video-like data.
  • the convolutional neural network numerical data refers to: neural network numerical data of a convolutional neural network.
  • the “integration” refers to preserving the information of the convolutional neural network's convolution kernels when each block of frame-like data is converted into convolution kernel data; a linked list or another data structure may be used for storage.
  • Step S308: sending a data cache instruction to the data cache module 140 to obtain the video-like data from the model conversion module 120 and buffer it;
  • Step S310 sending a data read instruction to the data encoding module 131 to read the video-like data from the data cache module 140;
  • Step S312: sending a data encoding instruction to the data encoding module 131, the encoding instruction including the coding mode information, so that the module encodes the video-like data in the corresponding coding mode to obtain a data encoding result;
  • the data encoding module 131 includes: an encoding sub-module for encoding the video-like data in a video encoding manner to obtain a data encoding result; and an integration sub-module 131a for integrating the data encoding result and the encoding process information to obtain a compression result.
  • the encoding sub-module further includes: a prediction unit 130a, a transform unit 130b, a quantization unit 130c, an entropy encoding unit 130d, and a depth automatic codec unit 130e.
  • the prediction unit 130a performs predictive coding on the video-like data (the converted neural network numerical data) using the correlation between adjacent data. "Adjacent" here means spatially close.
  • the predictive coding refers to predictive coding according to the intra-frame and inter-frame similarity of the video-like data. For example, given three consecutive video frames, if the middle frame is removed, the middle frame can still be predicted from its similarity to the two remaining frames, or the last frame can be predicted from the first two, so that only two frames of information need to be stored instead of three.
  • the similarity of the weights of the neural network units corresponding to different convolution kernels is used for prediction, and the difference between the predicted values and the actual values is encoded to achieve compression.
  • the predictive encoded video data and the original video data have the same representation.
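The store-the-difference idea behind this predictive coding can be sketched as follows; the simple previous-frame predictor is an illustrative assumption, not the disclosure's exact predictor.

```python
# Sketch: predict each frame from its predecessor and keep only the residual
# (difference); similar frames yield small residuals that are cheaper to encode.

def predict_encode(frames):
    """Return the first frame plus per-frame residuals."""
    residuals = [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        residuals.append([c - p for p, c in zip(prev, cur)])
    return residuals

def predict_decode(residuals):
    """Invert predict_encode by re-adding each residual to the prior frame."""
    frames = [residuals[0]]
    for res in residuals[1:]:
        frames.append([p + r for p, r in zip(frames[-1], res)])
    return frames

frames = [[10, 20, 30], [11, 21, 31], [13, 23, 33]]
enc = predict_encode(frames)          # residuals are small values
assert predict_decode(enc) == frames  # lossless round trip
```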
  • the transform unit 130b performs orthogonal transform coding on the video-like data processed by the prediction unit 130a, thereby achieving the purpose of compression.
  • the quantization unit 130c quantizes the video-like data processed by the transform unit, which can reduce the coding length of the data without degrading the data quality.
  • the quantization step parameter is set by the user according to experience and the application scenario, balancing the compression ratio against the degree of data recovery. The quantization can be written as FQ(u,v) = round(F(u,v)/Q_step), where:
  • FQ(u,v) is the quantized value of F(u,v), and Q_step is the quantization step;
  • round() is the rounding function (its output is the integer nearest to the input real number).
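A minimal sketch of this quantization and its inverse, assuming a single user-chosen step `Q_step` (the name is illustrative); dividing and rounding shortens code lengths, and the inverse multiplies back, so small differences are lost — which is the lossy part of the compression.

```python
# Quantize: FQ = round(F / Q_step); dequantize: F' = FQ * Q_step.

def quantize(values, q_step):
    return [round(v / q_step) for v in values]

def dequantize(q_values, q_step):
    return [q * q_step for q in q_values]

coeffs = [12.4, -3.2, 0.4, 7.9]
q = quantize(coeffs, q_step=2)        # small integers, shorter codes
restored = dequantize(q, q_step=2)    # approximation of the original values
```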
  • the entropy coding unit 130d performs rate compression coding on the video-like data processed by the quantization unit by using the statistical characteristics of the data, such as Huffman coding and arithmetic coding.
  • a binary code of a short word length is assigned to a symbol having a high probability of occurrence
  • a binary code of a long word length is assigned to a symbol having a small probability of occurrence, thereby obtaining a code having the shortest average code length.
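This variable-length code assignment can be illustrated with textbook Huffman coding — a standard algorithm, not necessarily the disclosure's exact implementation:

```python
import heapq
from collections import Counter

# Huffman coding: repeatedly merge the two least-frequent subtrees; symbols
# merged earlier (rarer) end up deeper, i.e. with longer codes.

def huffman_codes(symbols):
    counts = Counter(symbols)
    if len(counts) == 1:                      # degenerate single-symbol case
        return {next(iter(counts)): "0"}
    # heap entries: (frequency, tie_breaker, {symbol: code_so_far})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

codes = huffman_codes("aaaabbc")   # 'a' is frequent, so it gets the shortest code
```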
  • the coding mode includes: prediction, transform, quantization, and entropy coding.
  • the video-like data sequentially passes through the prediction unit 130a, the transform unit 130b, the quantization unit 130c, and the entropy coding unit 130d.
  • the output of one module is the input of the latter module.
  • after passing through the prediction unit 130a, a set of video-like data becomes an encoding of the difference between the predicted values and the actual values; it then enters the transform unit 130b, is further compressed by the two-dimensional DCT transform, and then enters the quantization unit 130c so that its code length is shortened.
  • the coding redundancy is reduced by the Huffman coding of the entropy coding unit 130d, thereby achieving a better compression effect.
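The fixed chaining of units, where each unit's output is the next unit's input, can be sketched as function composition; the stage functions below are simplified stand-ins for the real prediction/transform/quantization/entropy units.

```python
# Sketch: run data through a fixed pipeline of stages, each consuming the
# previous stage's output (stand-ins for units 130a-130d).

def chain(stages, data):
    for stage in stages:
        data = stage(data)
    return data

predict = lambda xs: [xs[0]] + [b - a for a, b in zip(xs, xs[1:])]  # residuals
quantize = lambda xs: [x // 2 for x in xs]                          # coarser values

out = chain([predict, quantize], [10, 12, 16])
```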
  • the prediction unit 130a, the transform unit 130b, the quantization unit 130c, and the entropy coding unit 130d share the same data buffer unit or respectively correspond to a data buffer unit.
  • in this embodiment the video-like data sequentially passes through the prediction unit 130a, the transform unit 130b, the quantization unit 130c, and the entropy coding unit 130d, but the disclosure is not limited thereto; in other coding methods, the data may pass through other units and modules as needed. How to configure a specific coding mode should be clear to those skilled in the art and is not described again herein.
  • the depth automatic codec unit 130e encodes the data by using the working principle of the depth auto-encoder.
  • the working principle of the deep automatic codec unit is that the encoding end output is the encoding result, and the encoder training adopts the method of minimizing the reconstruction error, as described below.
  • the deep automatic codec unit is trained by using the video-like data as both the training input and the ideal output, minimizing the reconstruction error, so that the output becomes substantially the same as the input video-like data, thereby implementing the deep automatic codec unit.
  • the hidden-layer output is used as the encoding result, and the final output is used as the decoding result. Since the number of hidden-layer neurons is smaller than the number of input neurons, the input data can be compressed. It should be noted that the deep automatic codec unit integrates the decoder-end information of the deep auto-encoder with the encoding result, so that the result can later be decoded.
  • the output is expected to be the same as the input, so the output can be regarded as a reconstruction of the input. In practice the output differs from the input, and that difference is the reconstruction error; training the unit to make this difference as small as possible is the minimization of reconstruction error referred to above.
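The minimize-the-reconstruction-error training can be illustrated with a toy auto-encoder: a single hidden unit compresses 2-D points that lie on a line. Every detail here (linear units, tied weights, learning rate, numeric gradients) is an illustrative assumption, not the disclosure's training procedure.

```python
# Toy linear auto-encoder: encode 2 inputs into 1 hidden value and decode back;
# gradient descent shrinks the reconstruction error toward zero.

data = [(x, 2.0 * x) for x in (0.2, 0.5, -0.3, 0.8)]   # points on one line

def reconstruction_error(w, samples):
    err = 0.0
    for x in samples:
        h = w[0] * x[0] + w[1] * x[1]        # encode: 2 inputs -> 1 hidden value
        r = (w[0] * h, w[1] * h)             # decode: reconstruct 2 outputs
        err += (r[0] - x[0]) ** 2 + (r[1] - x[1]) ** 2
    return err

w = [0.1, 0.1]                               # tied encoder/decoder weights
initial = reconstruction_error(w, data)
lr, eps = 0.02, 1e-6
for _ in range(200):                         # numeric-gradient descent
    base = reconstruction_error(w, data)
    grad = []
    for i in range(2):
        w2 = list(w)
        w2[i] += eps
        grad.append((reconstruction_error(w2, data) - base) / eps)
    w = [wi - lr * g for wi, g in zip(w, grad)]
final = reconstruction_error(w, data)        # far smaller than `initial`
```

Because the hidden layer (one value) is smaller than the input (two values), the learned code compresses the data, exactly as the text describes.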
  • the coding instruction may specify one of the above coding modes, or a combination of two or more of them (the order of the coding modes when combined is not limited), and other video coding methods may also be used.
  • the prediction unit (130a), the transform unit (130b), the quantization unit (130c), and the entropy coding unit (130d) may share the same data buffer unit, or each may have a separate data cache unit; this does not affect the implementation of the present disclosure.
  • the sequence of instructions in the controller module can be determined by a program written by the user, and the neural network numerical data can be compressed by using a compression method that the user desires to use. Users can combine different coding methods by writing related programs.
  • the controller module compiles the relevant program into an instruction, and decodes the instruction into a related control instruction, thereby realizing control of each module and the encoding process.
  • the process of compressing data is essentially a process of encoding data, and the encoding process in the above process can be equivalently regarded as part of a compression process or a compression process.
  • Step S314 sending an integration instruction to the data encoding module 131, so that the data encoding result and the encoding process information are integrated to obtain a compression result;
  • the compression result includes two parts: the first part is the data encoding result of the neural network numerical data, and the second part is the encoding process information.
  • the coding process information may include: information of the coding mode and information of the decoding end of the depth auto-encoder (when the deep automatic codec unit is used).
  • the information of the encoding method refers to the manner in which the encoding is performed, agreed upon in advance. For example, if a field in the instruction is "1", video encoding is used; if it is "2", the deep auto-encoder is used; if it is "3", the video encoding method is applied first and then the deep auto-encoder.
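The agreed meaning of the mode field can be sketched as a dispatch; only the "1"/"2"/"3" convention comes from the text, while the stage functions are simplified stand-ins.

```python
# Dispatch on the agreed encoding-mode field: "1" video coding, "2" deep
# auto-encoder, "3" video coding followed by the auto-encoder.

def video_encode(data):          # stand-in for predict/transform/quantize/entropy
    return [d // 2 for d in data]

def autoencoder_encode(data):    # stand-in for the deep auto-encoder path
    return data[: max(1, len(data) // 2)]

def encode(mode_field, data):
    if mode_field == "1":
        return video_encode(data)
    if mode_field == "2":
        return autoencoder_encode(data)
    if mode_field == "3":        # video coding first, then the auto-encoder
        return autoencoder_encode(video_encode(data))
    raise ValueError("unknown encoding-mode field: %r" % mode_field)

result = encode("3", [8, 6, 4, 2])
```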
  • Step S316 sending a data cache instruction to the data cache module 140, obtaining a compression result from the data encoding module 131, and buffering the compression result;
  • Step S318, sending a data storage instruction to the data cache module 140 to save the compression result to the external storage module 200.
  • in this embodiment the compression result is output to the external storage module, but in other embodiments of the present disclosure, the compression result may be directly transmitted, or cached in the data encoding module 131 or the data cache module 140; outputting it to the external storage module 200 is merely an optional implementation of the present disclosure.
  • a decompression apparatus for decompressing neural network data compression results is provided. It should be noted that, for brevity, technical features of the above-described compression device embodiments that are equally applicable here are incorporated by reference and not repeated.
  • the decompression device of this embodiment may be placed around a storage component, for decompressing the compressed neural network data read out of the storage component; it may be placed around an output port, where the compressed neural network data is decompressed and then output; or it may be placed at the data receiving end, for decompressing the received compressed neural network data.
  • the neural network data is neural network numerical data.
  • the decompression device for decompressing the compression result of the neural network numerical data is similar to the compression device of the first embodiment, and includes: a controller module 110', a model conversion module 120', a data decoding module 132, and a data cache module 140'.
  • the connection relationship of each module in the decompression device in this embodiment is similar to the connection relationship of the compression device in the first embodiment, and will not be described in detail herein.
  • the structure and function of the controller module 110', the model conversion module 120', and the data cache module 140' are similar to those of the corresponding modules in the compression device, and are not described again herein.
  • the data cache module 140' is configured to cache the compression result.
  • the data decoding module 132 is coupled to the model conversion module 120' for decoding the compression result using a video decoding manner corresponding to the compression result.
  • the model conversion module 120' is connected to the data decoding module 132, and restores the decoded video-like data to neural network numerical data.
  • the controller module 110' is coupled to the model conversion module 120', the data decoding module 132, and the data cache module 140' for issuing control commands to the three to coordinate their work.
  • the operations performed by the respective modules in the decompression device of the present embodiment are inverse to the operations performed by the corresponding modules of the compression device of the first embodiment.
  • the controller module 110' sends a control instruction to perform the following operations:
  • Step S602 sending a data read instruction to the data cache module 140', causing it to request the compression result from the external storage module 200 and cache the compression result;
  • the compression result here includes two parts: the first part is the data encoding result of the neural network numerical data, and the second part is the encoding process information.
  • Step S604 sending a data read instruction to the data decoding module 132 to read the compression result from the data cache module 140';
  • Step S606 sending a de-integration instruction to the data decoding module 132, so that it separates the encoding process information and the data encoding result from the compression result;
  • Step S608 sending a data read instruction to the data decoding module 132, and reading the encoding process information from the data decoding module 132;
  • Step S610 selecting a decoding instruction according to the encoding process information
  • the encoding process information may include: information of the encoding mode, and information of the depth autoencoder decoding end (when the deep automatic codec unit is used). Therefore, it is possible to obtain from the encoding process information which encoding mode or combination of encoding modes is used to encode the neural network numerical data, and accordingly generate corresponding decoding instructions.
  • the decoding instruction includes which decoding method is used to decode the data encoding result in the compression result.
  • Step S612 sending a decoding instruction to the data decoding module 132, so that it decompresses the data encoding result in the compression result to obtain video-like data;
  • the data decoding module 132 includes: a de-integration sub-module 132a for de-integrating the compression result to obtain a data encoding result and encoding process information; and a decoding sub-module for extracting encoding mode information from the encoding process information. And decoding the data encoding result by using a decoding method corresponding to the encoding mode information to obtain video-like data.
  • the decoding submodule further includes a prediction unit 130a, a transform unit 130b, a quantization unit 130c, an entropy decoding unit 130d', and a depth auto codec unit 130e. The operations performed by each unit are inverse to the related operations in the encoding module.
  • the quantization unit 130c performs inverse quantization processing on the compression result processed by the entropy decoding unit 130d'.
  • the following inverse quantization process is used: F'(u,v) = FQ(u,v) × Q_step, where F'(u,v) is the restored approximation of F(u,v) and Q_step is the quantization step used during encoding.
  • the transform unit 130b performs inverse orthogonal transform on the data compression result processed by the quantization unit to perform decoding.
  • Equation 2-1: the inverse two-dimensional discrete cosine transform for an N×N matrix is expressed as: f(x,y) = Σ_{u=0}^{N-1} Σ_{v=0}^{N-1} c(u)c(v)F(u,v)cos[(2x+1)uπ/(2N)]cos[(2y+1)vπ/(2N)], where c(0) = √(1/N) and c(k) = √(2/N) for k > 0.
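A naive sketch of the N×N forward DCT used by the transform unit and the inverse transform for this N×N matrix, verifying that they round-trip; the O(N^4) loops are for illustration only, not a production transform.

```python
import math

# Orthonormal 2-D DCT-II and its inverse (DCT-III), textbook definitions.

def c(k, n):
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

def dct2d(f):
    n = len(f)
    return [[c(u, n) * c(v, n) * sum(
        f[x][y] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                * math.cos((2 * y + 1) * v * math.pi / (2 * n))
        for x in range(n) for y in range(n))
        for v in range(n)] for u in range(n)]

def idct2d(F):
    n = len(F)
    return [[sum(
        c(u, n) * c(v, n) * F[u][v]
        * math.cos((2 * x + 1) * u * math.pi / (2 * n))
        * math.cos((2 * y + 1) * v * math.pi / (2 * n))
        for u in range(n) for v in range(n))
        for y in range(n)] for x in range(n)]

block = [[52, 55], [61, 59]]
restored = idct2d(dct2d(block))   # round-trips up to floating-point error
```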
  • the prediction unit 130a decodes the compression result processed by the transformation unit using the correlation between adjacent data in the original neural network numerical data.
  • the prediction unit 130a may add the predicted value to the correlation difference to restore the original value.
  • the depth automatic codec unit 130e decodes the neural network numerical data encoded by the deep autoencoder (as indicated by a broken line in FIG. 5).
  • the depth automatic codec unit 130e first parses, from the input data, the decoder-end information of the depth auto-encoder used during encoding, constructs a decoder from that information, and uses the decoder to decode the neural network numerical data encoded by the depth auto-encoder.
  • the encoding instruction may be an encoding method or a combination of two or more encoding methods.
  • the data decoding module 132 sequentially decodes the data by using a corresponding decoding manner.
  • the encoded data sequentially passes through the entropy decoding unit 130d', the quantization unit 130c, the transform unit 130b, and the prediction unit 130a.
  • the output of one module is the input of the latter module.
  • a set of compressed neural network numerical data input to the data decoding module is first decoded by the entropy decoding unit 130d' using the decoding process corresponding to Huffman encoding; the result enters the quantization unit 130c for inverse quantization, then the transform unit 130b for the inverse transform, and finally the prediction unit 130a adds the predicted values to the differences, outputting the decoding result.
  • Step S614 sending a data read instruction to the data cache module 140', so that the data cache module 140' reads the video-like data from the data decoding module 132, and caches;
  • Step S616 sending a data read instruction to the model conversion module 120, and causing the model conversion module 120 to read the video-like data from the data cache module 140';
  • Step S618, sending a data conversion instruction to the model conversion module 120, so that the model conversion module 120 converts the video-like data into neural network numerical data;
  • the model conversion module (120) operates according to the following formula to restore the neural network numerical data: w = I/255 × (a + b) − b, where:
  • w is the restored data value of the neural network numerical data in the range [-b, a]
  • I is the video-like data, an integer in the interval [0, 255].
  • the above formula corresponds to the case where the pixel depth is 8, and corresponding to the case where the pixel depth is t, "255" in the above formula should be replaced by "(2 t -1)", where t is a positive integer.
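The restoration formula, generalized to pixel depth t, can be checked with a small round trip. The forward mapping used here (value to pixel) is the assumed inverse of the stated restoration formula; all function names are illustrative.

```python
# Restoration: w = I / (2**t - 1) * (a + b) - b maps a pixel integer I in
# [0, 2**t - 1] back to a value in [-b, a]; for t = 8 the divisor is 255.

def pixel_to_value(I, a, b, t=8):
    return I / (2 ** t - 1) * (a + b) - b

def value_to_pixel(w, a, b, t=8):   # assumed forward mapping for the round trip
    return round((w + b) / (a + b) * (2 ** t - 1))

a, b, t = 2, 3, 8                   # value range [-3, 2], 8-bit pixels
assert pixel_to_value(0, a, b, t) == -3.0     # endpoints map exactly
assert pixel_to_value(255, a, b, t) == 2.0
w = 1.25
restored = pixel_to_value(value_to_pixel(w, a, b, t), a, b, t)
# quantization to 256 levels makes the round trip approximate, not exact
```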
  • the model conversion module 120 converts the data of each video frame in the video-like data: each frame is converted into the weights and offset of one kind of convolution kernel of the convolutional neural network, and the data converted from all frames are integrated to obtain the overall weight and offset information of each convolution kernel, thereby restoring the neural network numerical data.
  • the convolutional neural network numerical data refers to: neural network numerical data of a convolutional neural network.
  • Step S620 sending a data read instruction to the data cache module 140', so that the data cache module 140' requests the neural network value data to the model conversion module 120, and caches;
  • Step S622 send a data write command to the data cache module 140', so that the data cache module 140' writes the neural network value data to the external storage module 200;
  • in this embodiment the decoding result is output to the external storage module, but in other embodiments of the present disclosure, the decoding result may be directly transmitted, or cached in the model conversion module or the data cache module; outputting it to the external storage module is merely an optional implementation of the present disclosure.
  • the decompression process is essentially a decoding process, so the decoding process in the above process can be equivalently regarded as part of the decompression process or the decompression process.
  • a compression/decompression system is also provided.
  • the compression/decompression system of the present embodiment integrates the compression device of the first embodiment and the decompression device of the second embodiment. And, the compression device and the decompression device share the controller module (110, 110'), the model conversion module (120, 120'), and the data cache module (140, 140'). And, the data encoding module 131 in the compression device and the data decoding module 132 in the decompression device are integrated into the data encoding/decoding module 130.
  • the data encoding module 131 and the data decoding module 132 share a prediction unit 130a, a transform unit 130b, a quantization unit 130c, and a depth auto codec unit 130e.
  • the entropy encoding unit 130d and the entropy decoding unit 130d' exist as a single module in the system, which both encodes and decodes data.
  • first, the neural network data is stored in the external storage module 200; then the controller module 110 sends control commands to the relevant modules to control the compression process; the data cache module 140 reads the neural network data from the external storage module and caches it; next, the model conversion module 120 reads the neural network data from the data cache module 140, converts it into video-like data, and stores the video-like data back to the data cache module 140; then the data encoding module 131 reads the video-like data from the data cache module 140, and the data passes in turn through the prediction unit 130a, the transform unit 130b, the quantization unit 130c, and the entropy encoding unit 130d to complete the compression; subsequently, the data cache module 140 reads the compressed data from the data encoding/decoding module 130; finally, the data cache module 140 writes the compression result to the external storage module. The data amount of the compression result is greatly reduced, which facilitates operations such as storage and transmission, as shown in the compression process in FIG.
  • the data to be decompressed is stored in the external storage module 200; it is neural network data that was compressed through the prediction, transform, quantization, and entropy coding processes described above. In the following process, the controller module 110 sends the control instructions.
  • the data cache module 140 reads the data to be decompressed from the external storage module 200.
  • the data decoding module 132 reads the data to be decompressed from the data cache module 140; the data is subjected in turn to processing by the entropy decoding unit 130d', the quantization unit 130c, the transform unit 130b, and the prediction unit 130a, and is decompressed into video-like data.
  • the data cache module 140 then reads the video-like data from the data encoding/decoding module 130. Subsequently, the data cache module 140 provides the video-like data to the model conversion module 120, which converts it into neural network data. Finally, the data cache module 140 reads the neural network data from the model conversion module 120 and writes it to the external storage module 200, thereby restoring the neural network data, as shown in the decompression process in FIG.
  • the neural network data includes: neural network numerical data and neural network structural information.
  • the neural network structure information includes: a connection mode between neurons, a number of neurons in a layer, a type of activation function, and the like.
  • the neural network structure information may not be compressible in the manner of the first embodiment of the compression device.
  • the compression apparatus of this embodiment further includes: a structure information encoding module 133 for encoding the neural network structure information to obtain neural network structure data.
  • the structure information encoding module encodes in the following manner: the number of neurons in each layer of the neural network is recorded; the activation function types are encoded; and an adjacency matrix is used to indicate the connection relationship of neurons between adjacent layers.
  • an element of 1 in the i-th row and j-th column of the adjacency matrix represents that the i-th neuron of the upper layer is connected to the j-th neuron of the next layer; otherwise they are not connected.
  • in this way, an index structure is obtained with the layer number as the index key and the activation function type number and the adjacency matrix as the index result; this is the neural network structure data.
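The layer-indexed structure described above can be sketched as a small encoder; the dict layout and the activation-function numbering are illustrative assumptions, not from the disclosure.

```python
# Sketch: per layer, index from layer number to (neuron count, activation
# function type id, adjacency matrix to the next layer).

ACTIVATIONS = {"relu": 0, "sigmoid": 1, "tanh": 2}   # assumed numbering

def encode_structure(layers):
    """layers: list of dicts with 'size', 'activation', 'connections', where
    connections[i][j] == 1 iff neuron i connects to neuron j of the next layer."""
    structure = {}
    for layer_no, layer in enumerate(layers):
        structure[layer_no] = {
            "neurons": layer["size"],
            "activation": ACTIVATIONS[layer["activation"]],
            "adjacency": layer.get("connections"),
        }
    return structure

layers = [
    {"size": 2, "activation": "relu",
     "connections": [[1, 0, 1], [0, 1, 1]]},                  # 2 -> 3 neurons
    {"size": 3, "activation": "sigmoid", "connections": None}, # output layer
]
data = encode_structure(layers)
```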
  • the obtained neural network structure data can be further compressed, or stored together with the compressed neural network numerical data.
  • the model conversion module and the data encoding module for compressing the neural network numerical data in this embodiment are the same as the corresponding modules in the first embodiment of the compression device, and are not described herein again.
  • Figure 10 is a block diagram showing the second embodiment of the decompression device according to the present disclosure.
  • the decompression device further includes: a neural network recovery module 134, configured to decode the neural network structure data to obtain neural network structure information, and to restore the neural network using the structure information together with the restored neural network numerical data.
  • the neural network structure information includes: a connection mode between neurons, a number of neurons in a layer, a type of activation function, and the like.
  • the neural network structure data is data after the neural network structure information is encoded in the manner of the second embodiment of the compression device.
  • the neural network structure data is restored to the neural network structure information, and the neural network recovery is realized together with the restored neural network numerical data.
  • a compression/decompression system is also provided.
  • the compression/decompression system of the present embodiment integrates the second embodiment of the compression device and the second embodiment of the decompression device.
  • a chip is provided, comprising: the compression device of the first or second compression device embodiment, the decompression device of the first or second decompression device embodiment, or the compression/decompression system described in the third embodiment.
  • its position on the chip can be:
  • the chip includes a storage component, the compression device being disposed outside the storage component for compressing neural network data transmitted to the storage component; or
  • the chip includes an input port, the compression device being disposed outside the input port for compressing the input neural network data; or
  • the chip includes a data sending end, and the compressing device is disposed at the data sending end, and is configured to compress the neural network data to be sent;
  • its setting position on the chip can be:
  • the chip includes a storage component, the decompression device being disposed outside the storage component for decompressing the compressed neural network data read from the storage component; or
  • the chip includes an output port, the decompression device being disposed outside the output port for decompressing the compressed neural network data to be output;
  • the chip includes a data receiving end, and the decompressing device is disposed at the data receiving end for decompressing the received compressed neural network data.
  • the compression/decompression device can be implemented inside the chip, where the interaction between the device and the chip is faster; placing it off-chip may reduce the interaction speed, but if the user does not need a neural network chip and only needs a compression/decompression device, it can be used independently.
  • the present disclosure discloses a board that includes the chip package structure described above.
  • the present disclosure discloses an electronic device that includes the above described chip.
  • the electronic device includes a data processing device, a robot, a computer, a printer, a scanner, a tablet, a smart terminal, a mobile phone, a driving recorder, a navigator, a sensor, a camera, a cloud server, a video camera, a projector, a watch, an earphone, a mobile storage, a wearable device, a vehicle, a household appliance, and/or a medical device.
  • the vehicle includes an airplane, a ship, and/or a car;
  • the household appliance includes a television, an air conditioner, a microwave oven, a refrigerator, a rice cooker, a humidifier, a washing machine, an electric lamp, a gas stove, and/or a range hood;
  • the medical device includes a nuclear magnetic resonance instrument, a B-ultrasound machine, and/or an electrocardiograph.
  • in this embodiment the external storage module and the data cache module are separate modules; in other embodiments of the present disclosure, they may also exist as a whole, that is, the two modules may be combined into one module with a storage function, which can also implement the present disclosure;
  • one or more or all of the above modules (such as the prediction unit, the transform unit, etc.) may each have a corresponding data cache module, or may have no corresponding data cache module and use an external storage module instead, which can also implement the present disclosure;
  • the present disclosure can be implemented by using a hard disk, a memory body, or the like as an external storage module;
  • the external storage module may be replaced with an input/output module for inputting and outputting data, for example, for a compression device, compressing the input neural network data or The compressed neural network data is output; for the decompression device, the input neural network data is decompressed or the decompressed neural network data is output, and the present disclosure can also be implemented.
  • the present disclosure can implement high-efficiency compression and decompression of large-scale neural network models, greatly reducing the storage space and transmission pressure of neural network models, thereby adapting to the trend of ever-growing neural networks in the era of big data; it can be applied in various fields involving neural network data and has strong promotion and application value.
  • each functional unit in various embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software program module.
  • the integrated unit if implemented in the form of a software program module and sold or used as a standalone product, may be stored in a computer readable memory. Based on such understanding, the technical solution of the present disclosure, or all or part of the technical solution, may be embodied in the form of a software product stored in a memory. A number of instructions are included to cause a computer device (which may be a personal computer, server or network device, etc.) to perform all or part of the steps of the methods described in various embodiments of the present disclosure.
  • the foregoing memory includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other media that can store program codes.
  • Each functional unit/module may be hardware, such as the hardware may be a circuit, including digital circuits, analog circuits, and the like.
  • Physical implementations of hardware structures include, but are not limited to, physical devices including, but not limited to, transistors, memristors, and the like.
  • the computing modules in the computing device can be any suitable hardware processor, such as a CPU, GPU, FPGA, DSP, ASIC, and the like.
  • the storage unit may be any suitable magnetic storage medium or magneto-optical storage medium such as RRAM, DRAM, SRAM, EDRAM, HBM, HMC, and the like.

Abstract

The present disclosure provides a device and system, a chip, a chip package structure, a board card, and an electronic device for the compression/decompression of neural network data. The compression device includes: a model conversion module; and a data encoding module connected to the model conversion module. The decompression device includes: a data decoding module; and a model conversion module connected to the data decoding module. The system includes the above compression device and decompression device. The present disclosure compresses/decompresses neural network data, achieving a high compression rate and greatly reducing the storage space and transmission pressure of neural network models.

Description

Compression/Decompression Device and System, Chip, and Electronic Device. Technical Field
The present disclosure relates to the technical field of artificial neural networks, and in particular to a device and system, a chip, and an electronic device for the compression/decompression of neural network data.
Background
Artificial neural networks (ANNs), or neural networks (NNs) for short, are algorithmic mathematical models that imitate the behavioral characteristics of animal neural networks and perform distributed parallel information processing. Such a network depends on the complexity of the system and processes information by adjusting the interconnections among a large number of internal nodes.
At present, neural networks have made great progress in many fields such as intelligent control and machine learning. With the rise of deep learning, neural networks have once again become a hot topic of artificial intelligence research. With the broad combination of big data and deep learning, the scale of neural networks keeps growing. Researchers at Google proposed the concept of "large-scale deep learning", hoping to use Google as a platform to integrate global information and build intelligent computing systems.
With the continuous development of deep learning technology, current neural network models are becoming ever larger, placing ever higher demands on storage performance and memory-access bandwidth. Without compression, not only is a large amount of storage space required, but the demands on memory-access bandwidth are also very high. Against the background of the ever-increasing scale of neural networks, compressing neural networks, as a new technical concept, has become fully necessary.
Content of the Disclosure
(I) Technical problem to be solved
In view of the above technical problems, the present disclosure provides a device and system, a chip, and an electronic device for the compression/decompression of neural network data, to reduce the pressure on storage space and memory-access bandwidth.
(II) Technical solution
According to one aspect of the present disclosure, a compression device for neural network data is provided. The compression device includes: a model conversion module 120 for converting neural network numerical data into video-like data; and a data encoding module 131, connected to the model conversion module 120, for encoding the video-like data in a video encoding manner to obtain a compression result.
In some embodiments of the present disclosure, in the compression device, the video-like data refers to the data corresponding to a video that is jointly constituted by the integers obtained when, after conversion by the model conversion module, each original neural network numerical datum is converted into an integer value within a preset range, corresponding to the representation of a pixel.
In some embodiments of the present disclosure, in the compression device, the model conversion module 120 converts neural network numerical data into video-like data in one of the following two ways:
First way: determine that the data range of the neural network numerical data is [-b, a], where a is a positive integer greater than or equal to the maximum value of the entire neural network numerical data, and -b is a negative integer less than or equal to the minimum value of the entire neural network model data.
The model conversion module 120 converts according to the following formula:
I = round((w + b)/(a + b) × (2^t − 1))
where I is an integer in the interval [0, (2^t − 1)], i.e., the representation of one pixel; w is the true data value of the neural network numerical data in the range [-b, a]; a and b are both positive integers, and t is a positive integer;
Because neural network numerical data have spatial local similarity, that is, spatially close neurons and weight connections may be similar, analogous to the pixel and inter-frame similarity of video, video compression methods may be adopted to compress neural networks.
Second way: for convolutional neural network numerical data, the model conversion module 120 converts the weights and offset of each kind of convolution kernel in the convolutional neural network numerical data, and integrates the integers obtained from converting the weights and offsets to obtain data corresponding to a video frame; the video-frame-like data obtained from the weights and offsets of the multiple kinds of convolution kernels are combined to obtain the video-like data.
Here, the convolutional neural network numerical data refers to: the neural network numerical data of a convolutional neural network.
Here, "integration" means: after each piece of data similar to a video frame is converted into convolution kernel data, the information of the convolution kernels of the entire convolutional neural network is obtained, which may specifically be stored using a linked list or another data structure.
In some embodiments of the present disclosure, the data encoding module 131 includes: an encoding submodule for encoding the video-like data in a video encoding manner to obtain a data encoding result; and an integration submodule for integrating the data encoding result and the encoding process information to obtain a compression result.
In some embodiments of the present disclosure, the encoding submodule includes: a prediction unit 130a for performing predictive coding using the correlation between adjacent data of the video-like data; a transform unit 130b for performing orthogonal transform coding on the video-like data processed by the prediction unit, to compress the data; a quantization unit 130c for performing quantization coding on the video-like data processed by the transform unit, reducing the coding length of the data without degrading data quality; and an entropy coding unit 130d for performing rate compression coding on the video-like data processed by the quantization unit using the statistical characteristics of the data, to reduce data redundancy.
In some embodiments of the present disclosure, the prediction unit 130a, the transform unit 130b, the quantization unit 130c, and the entropy coding unit 130d share the same data cache unit or each correspond to a respective data cache unit.
In some embodiments of the present disclosure, the encoding submodule includes: a deep automatic codec unit 130e for further encoding the video-like data output by the model conversion module, taking the hidden-layer output as the encoding result; the deep automatic codec unit 130e is trained by using the video-like data as both the training input and the ideal output, minimizing the reconstruction error, so that the output becomes substantially the same as the input video-like data.
In the deep automatic codec unit, the output is expected to be the same as the input, so the output can be regarded as a reconstruction of the input. In practice the output differs from the input; the difference is the reconstruction error, and training to make this difference as small as possible is the minimization of reconstruction error described above. "Substantially the same" here has no strict criterion and only means similar.
在本公开的一些实施例中,所述压缩装置还包括:结构信息编码模块,用于将神经网络结构信息进行编码,得到神经网络结构数据。
在本公开的一些实施例中,所述神经网络数值数据包括:神经网络的权值数据和偏置数据。
在本公开的一些实施例中,所述神经网络结构信息包括:神经元之间的连接方式、层内神经元数目、激活函数种类。所述结构信息编码模块采用如下方式对神经网络结构信息进行编码:记录神经网络各层的层内神经元数目;对激活函数种类进行编码;用邻接矩阵表示各相邻层之间神经元的连接关系,从而得到以层号为索引号,神经元激活函数种类编号和邻接矩阵为索引结果的索引结构,即为神经网络结构数据。
在本公开的一些实施例中,本公开压缩装置还包括:数据缓存模块140,用于缓存神经网络数值数据;控制器模块110,与所述数据缓存模块140、模型转换模块120和数据编码模块131相连接,用于发送控制指令,以执行如下操作:
向数据缓存模块140发送数据读取指令,令其向外界请求神经网络数值数据,并将该神经网络数值数据进行缓存;
向模型转换模块120发送数据读取指令,令其从数据缓存模块140中读取神经网络数值数据;
向模型转换模块120发送数据转换指令,令其将读取的神经网络数值数据转换为类视频数据;
向数据缓存模块140发送数据读取指令,令其向模型转换模块120请求类视频数据,并进行缓存;
向数据编码模块131发送数据读取指令,令其从数据缓存模块140读取类视频数据;
向数据编码模块131发送数据编码指令,该编码指令中包含编码方式的信息,令其对采用该编码方式对应的单元对类视频数据进行编码,得到数据编码结果;
向数据编码模块131发送整合指令,令其将数据编码结果和编码过程信息进行整合,得到压缩结果;
向数据缓存模块140发送数据缓存指令,令其从数据编码模块131中获得压缩结果,并将压缩结果进行缓存。
根据本公开的另一个方面,还提供了一种用于神经网络数据的解压缩装置。该解压缩装置包括:数据解码模块132,用于得到压缩结果,采用与压缩结果对应的视频解码方式对所述压缩结果进行解码;以及模型转换模块120,与所述数据解码模块132相连接,用于将解码后的类视频数据复原为神经网络数值数据。
在本公开的一些实施例中,本公开解压缩装置中,所述数据解码模块132包括:解整合子模块,用于将压缩结果进行解整合,得到数据编码结果和编码过程信息;以及解码子模块,用于从所述编码过程信息中提取编码方式信息,利用该编码方式信息对应的解码方式对所述数据编码结果进行解码,得到类视频数据。
在本公开的一些实施例中,所述模型转换模块采用以下两种方式其中之一将解码后的类视频数据复原为神经网络数值数据:
第一种方式:确定神经网络数值数据的数据范围为[-b,a],a是大于或者等于整个神经网络数值数据的最大值的正整数,-b是小于或等于整个神经网络模型数据的最小值的负整数。
模型转换模块120按以下公式操作,从而复原神经网络数值数据:
w = I / (2^t − 1) × (a + b) − b
其中,w是压缩装置模型转换模块转换前在[-b,a]范围内的神经网络数值数据的复原数据值,I为类视频数据,其是在[0,(2^t−1)]区间内的整数,t为正整数。
可以理解的是,此处的神经网络数值数据其实是指之前压缩后的神经网络数值数据,因为压缩前的范围是[-b,a],所以解压后的神经网络数值数据也是在这个区间。
第二种方式:对于卷积神经网络数值数据,模型转换模块120将类视频数据中对应视频帧的数据进行转换,每一帧转换为卷积神经网络的一种卷积核的权值和偏置,将各帧转换的数据整合起来,得到卷积神经网络各卷积核的权值和偏置的整体信息,从而复原神经网络数值数据。
其中,所述卷积神经网络数值数据是指:卷积神经网络的神经网络数值数据。
其中,上述的“整合”是指:当把每一个类似于视频帧的数据转变为卷积核数据后,就得到了整个卷积神经网络卷积核的信息,具体可以采用链表或其他数据结构。
在本公开的一些实施例中,所述解压缩装置还包括:神经网络复原模块,用于对神经网络结构数据进行解码,得到神经网络结构信息,将神经网络结构信息与复原后的神经网络数值数据一起复原神经网络。
在本公开的一些实施例中,所述神经网络数值数据为神经网络的权值数据和偏置数据。
在本公开的一些实施例中,所述神经网络结构信息包括:神经元之间的连接方式、层内神经元数目、激活函数种类,所述神经网络结构数据是对神经网络结构信息进行编码后的数据。
所述神经网络数值数据为神经网络的权值数据和偏置数据;或者所述解压缩装置还包括:非数值数据解码模块,用于将神经网络非数值数据进行解码,得到相应的神经网络的非数值数据,其中,所述神经网络非数值数据为以下数据中的一种或多种:神经元之间的连接方式的数据和层数据。
在本公开的一些实施例中,本公开解压缩装置还包括:数据缓存模块140,用于缓存压缩结果;控制器模块110,与所述模型转换模块120、数据解码模块132和数据缓存模块140连接,用于向三者下达控制指令,以执行以下操作:
向数据缓存模块140发送数据读取指令,令其向外部请求压缩结果,并将该压缩结果缓存;
向数据解码模块132发送数据读取指令,令其从数据缓存模块140中读取压缩结果;
向数据解码模块132发送解整合指令,令其从所述压缩结果中解码出编码过程信息和数据压缩结果;
向数据解码模块132发送数据读取指令,从数据解码模块132读取编码过程信息;
根据编码过程信息选择解码指令;
向数据编码解码模块132发送解码指令,令其将压缩结果中的数据压缩结果进行解压缩,得到类视频数据;
向数据缓存模块140发送数据读取指令,令其从数据解码模块132读取类视频数据,并缓存;
向模型转换模块120发送数据读取指令,令其从数据缓存模块140中读取类视频数据;
向模型转换模块120发送数据转换指令,令其将类视频数据转换为神经网络数值数据。
根据本公开的再一个方面,还提供了一种用于神经网络数据的压缩/解压缩的系统。该系统包括:压缩装置,为以上所述的压缩装置;以及解压缩装置,为以上所述的解压缩装置。
在本公开的一些实施例中,所述压缩装置和解压缩装置共用数据缓存模块140、控制器模块110和模型转换模块120。
根据本公开的再一个方面,还提供了一种芯片,包括:如上所述的压缩装置;和/或如上所述的解压缩装置;和/或如上所述的压缩/解压缩的系统。
在本公开的一些实施例中,对于所述压缩装置或所述系统中的压缩装置:所述芯片包括存储部件,所述压缩装置设置于存储部件外侧,用于对传入存储部件的神经网络数据进行压缩;或者所述芯片包括输入端口,所述压缩装置设置于输入端口外侧,用于压缩输入的神经网络数据;或者所述芯片包括数据发送端,所述压缩装置设置于数据发送端,用于对欲发送的神经网络数据进行压缩。
在本公开的一些实施例中,对于所述解压缩装置或所述系统中的解压缩装置:所述芯片包括存储部件,所述解压缩装置设置于存储部件外侧,用于对从存储部件读出的压缩后的神经网络数据进行解压缩;或者所述芯片包括输出端口,所述解压缩装置设置于输出端口外侧,用于解压缩输入的压缩后的神经网络数据;或者所述芯片包括数据接收端,所述解压缩装置设置于数据接收端,用于对接收的压缩后的神经网络数据进行解压缩。
根据本公开的再一个方面,还提供了一种电子装置,包括:如上所述的芯片。
在本公开的一些实施例中,所述的电子装置,包括数据处理装置、机器人、电脑、打印机、扫描仪、平板电脑、智能终端、手机、行车记录仪、导航仪、传感器、摄像头、云端服务器、相机、摄像机、投影仪、手表、耳机、移动存储、可穿戴设备、交通工具、家用电器、和/或医疗设备。
在本公开的一些实施例中,所述交通工具包括飞机、轮船和/或车辆;所述家用电器包括电视、空调、微波炉、冰箱、电饭煲、加湿器、洗衣机、电灯、燃气灶、油烟机;所述医疗设备包括核磁共振仪、B超仪和/或心电图仪。
(三)有益效果
从上述技术方案可以看出,本公开用于神经网络数据的压缩/解压缩的装置和系统、芯片、电子装置至少具有以下有益效果其中之一:
(1)借用视频编解码方法对神经网络数据进行压缩/解压缩,达到较高的压缩率,大幅减少了神经网络模型的存储空间和传输压力;
(2)在数据压缩/解压缩模块中,集成了多种压缩或解压缩的算法,能够大幅度加速压缩/解压缩过程;
(3)具有专用的数据缓存模块和控制器模块来服务于多种视频编解码专用模块,支持多种视频编解码技术的组合,极大地增加了装置的灵活性和实用性,同时支持新兴的运用深度神经网络进行压缩解压的技术。
综上,本公开可以实现大规模神经网络模型的高效压缩与解压,从而大幅减少神经网络模型的存储空间和传输压力,从而适应大数据时代神经网络规模不断扩大的趋势。
附图说明
图1为根据本公开第一实施例用于压缩神经网络数据的压缩装置的结构示意图。
图2为图1所示压缩装置中数据编码模块的结构示意图。
图3为图1所示压缩装置中控制器模块发送控制指令,以执行操作的流程图。
图4为本公开第二实施例用于解压缩神经网络数据压缩结果的解压缩装置的结构示意图。
图5为图4所示解压缩装置中数据解码模块的结构示意图。
图6为图4所示解压缩装置中控制器模块发送控制指令,以执行操作的流程图。
图7为本公开第三实施例用于神经网络数据压缩结果的压缩/解压缩系统的结构示意图。
图8为图7所示压缩/解压缩系统中压缩过程和解压缩过程的示意图。
图9为根据本公开压缩装置第二实施例的结构示意图。
图10为根据本公开解压缩装置第二实施例的结构示意图。
【本公开主要元件符号说明】
110、110′-控制器模块;
120、120′-模型转换模块;
140、140′-数据缓存模块;
130-数据编/解码模块;
131-数据编码模块; 131a-整合子模块;
132-数据解码模块; 132a-解整合子模块;
130a-预测单元; 130b-变换单元; 130c-量化单元;
130d-熵编码单元; 130d′-熵解码单元;
130e-深度自动编解码器单元;
133-结构信息编码模块;
134-神经网络复原模块;
200-外部存储模块。
具体实施方式
视频编码解码技术是一项十分成熟的技术,传统的视频编码解码技术采用预测、变换和熵编码等技术,深度学习兴起后,利用深度神经网络进行视频编解码也成为新的研究热点。
从广义上来讲,神经网络数据是指神经网络信息的集合体,包括神经网络数值数据和神经网络结构信息。
神经网络数值数据包括:神经元的权值数据和偏置数据,其实际上为数值数据。申请人经过认真地研究和比较后发现:神经网络数值数据与视频图像的像素一样具有局部相关性,因此运用视频编解码的方法来进行神经网络模型的编解码,进而压缩神经网络模型,将是一种可行的技术路线。
此外,神经网络结构信息包括:神经元之间的连接方式、神经元的数目、激活函数种类。对于这些神经网络结构信息,也可通过编码用数字表示。
为使本公开的目的、技术方案和优点更加清楚明白,以下结合具体实施例,并参照附图,对本公开进一步详细说明。
一、压缩装置第一实施例
在本公开的第一个示例性实施例中,提供了一种用于压缩神经网络数据的压缩装置。在实际的系统环境中,该压缩装置可以安装在存储部件周围,为传入存储部件的神经网络数据进行压缩处理;也可以将其置于输入端口周围,压缩输入的神经网络数据;还可以将压缩装置设置于数据的发送端,用于压缩发送数据。
本实施例中,神经网络数据包括:神经网络数值数据,具体而言为神经元的权值数据和偏置数据。请参照图1,本实施例用于压缩神经网络数值数据的压缩装置包括:控制器模块110、模型转换模块120、数据编码模块131和数据缓存模块140。
本实施例中,数据缓存模块140用于缓存由外部存储模块200获得的神经网络数值数据。模型转换模块120与数据缓存模块140相连接,用于将神经网络数值数据转化为类视频数据。数据编码模块131与模型转换模块120相连接,用于采用视频编码的方式对类视频数据进行编码。控制器模块110与模型转换模块120、数据编码模块131和数据缓存模块140连接,用于向三者下达控制指令,令其协调工作。
请参照图3,本实施例中,控制器模块110发送控制指令,以执行如下操作:
步骤S302,向数据缓存模块140发送数据读取指令,令其向外部存储模块200请求神经网络数值数据,并将该神经网络数值数据进行缓存;
步骤S304,向模型转换模块120发送数据读取指令,令其从数据缓存模块140中读取神经网络数值数据;
步骤S306,向模型转换模块120发送数据转换指令,令其将读取的神经网络数值数据转换为类视频数据;
其中,此处的类视频数据是指经过模型转换模块的转换后,原来的每个神经网络数值数据被转换为一系列预设范围内的整数值,如[0,255]区间内的整数值,对应于一个个像素的表示,这些整数共同所构成的类似视频的数据。以下以两种特定的神经网络数值数据为例进行说明:
(1)确定神经网络模型数据的数据范围[-b,a],其中,a是大于或者等于整个神经网络数值数据的最大值的正整数,-b是小于或等于整个神经网络模型数据的最小值的负整数。
将神经网络模型数据转换为0~255的整数(此时对应于8bpp),模型转换模块可以按以下公式操作:
I = round( (w + b) / (a + b) × 255 )
其中,I是在[0,255]区间内的整数,即一个像素的表示;w是在[-b,a]范围内的神经网络数值数据的真实数据值。其中,8bpp表明像素深度是8,即每个像素点用8位数据来表示。在这种情况下,一个像素可以有2的8次方,即256种颜色。
本领域技术人员应当理解,上述公式中的“255”对应于像素深度为8的情况,来源于“(2^8−1)”,对于像素深度为t的情况,上述公式中的“255”应当用(2^t−1)来代替,其中t为正整数。
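为便于理解,下面给出上述线性映射及其逆运算的一个最小化示意实现(Python/NumPy;函数名 weights_to_pixels、pixels_to_weights 为本文示例所假设,并非本公开限定的实现方式):

```python
import numpy as np

# 示意:将 [-b, a] 内的神经网络数值数据线性映射到 [0, 2^t - 1] 的整数像素,
# 并按逆公式复原;t 为像素深度(t=8 时对应 [0, 255])。
def weights_to_pixels(w, a, b, t=8):
    w = np.asarray(w, dtype=np.float64)
    return np.rint((w + b) / (a + b) * (2 ** t - 1)).astype(np.int64)

def pixels_to_weights(i, a, b, t=8):
    i = np.asarray(i, dtype=np.float64)
    return i / (2 ** t - 1) * (a + b) - b

w = np.array([-0.93, -0.1, 0.0, 0.37, 0.99])     # 假设 a = b = 1
p = weights_to_pixels(w, a=1, b=1, t=8)          # [0, 255] 内的整数像素
w_rec = pixels_to_weights(p, a=1, b=1, t=8)      # 复原的数值数据
```

可以看出该映射是有损的:复原误差至多为半个量化步长,即 (a+b) / (2·(2^t−1));t 越大,精度越高。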
(2)举例说明,神经网络数值数据包括卷积神经网络数值数据,对于卷积神经网络数值数据:
模型转换模块120将卷积神经网络数值数据中的每一种卷积核的权值和偏置进行转换,并将权值和偏置转换后得到的整数整合起来,得到对应视频帧的数据,多种卷积核的权值和偏置得到的类似视频帧的数据结合起来就得到类视频数据。
其中,所述卷积神经网络数值数据是指:卷积神经网络的神经网络数值数据。
其中,所述的“整合”是指:当把每一个类似于视频帧的数据转变为卷积核数据后,就得到了整个卷积神经网络卷积核的信息,具体可以采用链表或其他数据结构进行存储。
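下面用一段示意代码说明这种“按卷积核组帧”的整合方式(Python/NumPy;数据组织与函数名均为示例性假设,实际存储也可采用链表等其他数据结构):

```python
import numpy as np

# 示意:把每种卷积核的权值与偏置展平并映射为整数,拼成一“帧”;
# 多种卷积核得到多帧,帧的序列即“类视频数据”(此处用 Python list 保存)。
def kernel_to_frame(weight, bias, a, b, t=8):
    flat = np.concatenate([np.ravel(weight), np.ravel(bias)])
    return np.rint((flat + b) / (a + b) * (2 ** t - 1)).astype(np.int64)

kernels = [
    (np.random.uniform(-1, 1, (3, 3)), np.random.uniform(-1, 1, (1,))),
    (np.random.uniform(-1, 1, (3, 3)), np.random.uniform(-1, 1, (1,))),
]
video_like = [kernel_to_frame(w, bs, a=1, b=1) for (w, bs) in kernels]
```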
步骤S308,向数据缓存模块140发送数据读取指令,令其向模型转换模块120请求类视频数据,并进行缓存;
步骤S310,向数据编码模块131发送数据读取指令,令其从数据缓存模块140读取类视频数据;
步骤S312,向数据编码模块131发送数据编码指令,该编码指令中包含编码方式的信息,令其对采用该编码方式对应的单元对类视频数据进行编码,得到数据编码结果;
请参照图2,本实施例中,数据编码模块131包括:编码子模块,用于采用视频编码的方式对所述类视频数据进行编码,得到数据编码结果;以及整合子模块131a,用于将数据编码结果和编码过程信息进行整合,得到压缩结果。其中,编码子模块进一步包括:预测单元130a,变换单元130b,量化单元130c,熵编码单元130d和深度自动编解码器单元130e。
其中,在第一种编码方式中:
(1)预测单元130a利用类视频数据(转换后的神经网络数值数据)相邻数据之间的相关性进行预测编码。此处的“相邻”是指空间上相近。
其中,所述的预测编码是指根据类视频数据帧内和帧间相似性进行预测编码。举个例子,如果有三个连续的视频帧,抽掉中间一帧,仍然可以根据前后两帧与中间帧的相似性把中间帧预测出来,或者根据前两帧把最后一帧预测出来,这样只要存储两帧的信息而不是三帧。
具体在本实施例中,利用不同卷积核对应的神经网络单元的权值的相似性,对神经网络权值进行预测,将预测值与实际值之差进行编码,以达到压缩目的。其中,预测编码后的类视频数据和原类视频数据具有相同的表现形式。
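上述帧间预测的思想可以用最简单的“差分预测”示意(Python/NumPy;以前一帧作为预测值,仅为说明原理的假设性实现,实际视频编码中的预测方式更为复杂):

```python
import numpy as np

# 示意:帧间预测编码,只存第一帧和逐帧残差(实际值与预测值之差);
# 解码时以前一帧为预测值逐帧累加,即可无损复原。
def predict_encode(frames):
    frames = [np.asarray(f, dtype=np.int64) for f in frames]
    return [frames[0]] + [frames[k] - frames[k - 1] for k in range(1, len(frames))]

def predict_decode(residuals):
    out = [np.asarray(residuals[0], dtype=np.int64)]
    for r in residuals[1:]:
        out.append(out[-1] + r)
    return out

frames = [np.array([10, 12, 13]), np.array([11, 12, 14]), np.array([11, 13, 15])]
restored = predict_decode(predict_encode(frames))
```

若相邻帧(相邻卷积核的权值)高度相似,残差将集中于 0 附近,后续的熵编码即可获得更短的平均码长。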
(2)变换单元130b对经过预测单元130a处理后的类视频数据进行正交变换编码,从而达到压缩的目的。
例如,对类视频数据进行二维离散余弦变换(DCT)时,设f(m,n)为N×N的离散类视频数据,则二维DCT变换表示为:
F(u,v) = (2/N) · c(u) · c(v) · Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} f(m,n) · cos[ (2m+1)uπ / (2N) ] · cos[ (2n+1)vπ / (2N) ]      (2-1)
其中,u,v=0,1,……,N-1,当u=0(v=0)时,c(u)=1/√2(c(v)=1/√2);
当u=1,2,……,N-1(v=1,2,……,N-1)时,c(u)=1(c(v)=1);f(m,n)为编码前在矩阵中位置为(m,n)的值,F(u,v)为编码后在矩阵中位置为(u,v)的值。
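下面按公式(2-1)的形式直接实现二维DCT及其逆变换以作验证(Python/NumPy;O(N^4) 的朴素实现,仅用于说明原理,工程上应使用快速算法;归一化系数按正交DCT的常见约定假设):

```python
import numpy as np

# c(0) = 1/√2,其余 c(k) = 1,与正文中的系数约定一致
def c(k):
    return 1 / np.sqrt(2) if k == 0 else 1.0

def dct2(f):
    """二维 DCT:逐点按双重求和计算 F(u, v)。"""
    N = f.shape[0]
    m = np.arange(N)
    F = np.empty((N, N))
    for u in range(N):
        for v in range(N):
            cos_u = np.cos((2 * m + 1) * u * np.pi / (2 * N))
            cos_v = np.cos((2 * m + 1) * v * np.pi / (2 * N))
            F[u, v] = (2 / N) * c(u) * c(v) * (cos_u[:, None] * cos_v[None, :] * f).sum()
    return F

def idct2(F):
    """二维逆 DCT:对 u、v 求和复原 f(m, n)。"""
    N = F.shape[0]
    k = np.arange(N)
    cu = np.array([c(u) for u in range(N)])
    f = np.empty((N, N))
    for m in range(N):
        for n in range(N):
            cos_u = np.cos((2 * m + 1) * k * np.pi / (2 * N))
            cos_v = np.cos((2 * n + 1) * k * np.pi / (2 * N))
            f[m, n] = (2 / N) * ((cu[:, None] * cu[None, :]) * F
                                 * (cos_u[:, None] * cos_v[None, :])).sum()
    return f

x = np.arange(16, dtype=np.float64).reshape(4, 4)
x_rec = idct2(dct2(x))   # 正交变换本身无损,可复原原数据
```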
(3)量化单元130c对经过变换单元处理后的类视频数据进行量化编码,其可以在不降低数据质量的前提下减少数据的编码长度。
例如,采用标量量化技术时,对数据采用如下处理:
FQ(u,v) = round( F(u,v) / Q_step )           (3-1)
其中,F(u,v)为经过变换单元处理的类视频数据中的任意位置的数据(u,v=0,1,……,N-1),Q_step为量化步长,该量化步长参数由用户依据经验和场景设定,同时兼顾压缩比和数据复原程度,FQ(u,v)为F(u,v)的量化值,round()为取整函数(其输出为与输入实数最接近的整数)。
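标量量化(公式3-1)及其反量化(后文公式3-2)可示意如下(Python/NumPy;Q_step 的取值为示例假设):

```python
import numpy as np

# 示意:标量量化 FQ = round(F / Q_step) 与反量化 F ≈ FQ · Q_step;
# Q_step 越大,量化后整数越小、压缩越强,但复原误差也越大(至多 Q_step / 2)。
def quantize(F, q_step):
    return np.rint(np.asarray(F, dtype=np.float64) / q_step).astype(np.int64)

def dequantize(FQ, q_step):
    return FQ.astype(np.float64) * q_step

F = np.array([3.2, -7.9, 0.4, 12.5])
FQ = quantize(F, q_step=2.0)
F_rec = dequantize(FQ, q_step=2.0)
```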
(4)熵编码单元130d利用数据的统计特性对经过量化单元处理后的类视频数据进行码率压缩编码,如采用哈夫曼编码和算数编码等。
例如在进行哈夫曼编码时,对出现概率大的符号分配短字长的二进制码,对出现概率小的符号分配长字长的二进制码,从而得到平均码长最短的码。
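哈夫曼编码的构码过程可用如下示意实现(Python标准库;基于最小堆自底向上合并的简化版本,仅为说明原理):

```python
import heapq
from collections import Counter

# 示意:哈夫曼编码,高频符号分配短码、低频符号分配长码,结果为前缀码。
def huffman_code(symbols):
    freq = Counter(symbols)
    # 堆元素:(频数, 去重序号, {符号: 码字});序号用于打破频数相同时的比较平局
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tick = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (n1 + n2, tick, merged))
        tick += 1
    return heap[0][2]

data = "aaaabbbcc d"
codes = huffman_code(data)
encoded = "".join(codes[s] for s in data)
```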
总的来说,本种编码方式包含:预测、变换、量化和熵编码,在数据编码模块131中,类视频数据依次经过预测单元130a、变换单元130b、量化单元130c和熵编码单元130d,前一个模块的输出为后一个模块的输入。例如一组类视频数据,经过预测单元130a后变为预测值与实际值的差的编码结果,进入变换单元130b后经过二维DCT变换被进一步压缩,再进入量化单元130c使得其编码长度被缩短,最后通过熵编码单元130d的哈夫曼编码减少编码冗余,从而达到较好的压缩效果。
此外,本领域技术人员能够理解,在第一种编码方式中,预测单元130a、变换单元130b、量化单元130c、熵编码单元130d共用同一数据缓存单元或分别对应一数据缓存单元。
需要说明的是,虽然本实施例中,类视频数据依次经过预测单元130a、变换单元130b、量化单元130c和熵编码单元130d,但本公开并不以此为限,在其他编码方式中,也可以是经过其他需要的单元和模块。本领域技术人员应当清楚在具体的编码方式中如何设定,在此不再赘述。
其中,在第二种编码方式中,深度自动编解码器单元130e利用深度自动编码器的工作原理对数据进行编码。
其中,深度自动编解码器单元的工作原理即是编码端输出为编码结果,编码器训练采用最小化重构误差的方法,如下所述。
其中,深度自动编解码器单元通过将类视频数据作为训练输入和理想输出利用最小化重构误差的方法进行训练,使输出成为与输入类视频数据 基本相同的数据,从而深度自动编解码器单元将隐层输出作为编码结果,将最终输出作为解码结果,由于隐层的神经元数目少于输入神经元数目,因此可以对输入的数据进行压缩。需要注意的是,深度自动编解码器单元会将深度自动编码器的解码器端的信息进行编码并编入编码结果,供解码使用。
需要解释的是,在深度自动编解码器单元中,希望输出和输入一样,因此输出可以看作对输入的重构。而实际上输出和输入不同,则不同的地方就是重构误差,通过训练使得重构误差最小化就是如上所述的最小化重构误差。需要说明的是,所述编码指令中可以是上述的一种编码方式,也可以是上述两种编码方式组合(在组合时不限定编码方式的顺序),也可以采用其他的视频编码方式。
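最小化重构误差的训练过程可以用一个线性自动编码器的小例子示意(Python/NumPy;网络规模、学习率、迭代次数均为示例假设;隐层维度2小于输入维度4,隐层输出即压缩后的编码结果,解码端权重需随编码结果一并保存供解码使用):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=64)   # 人为构造相关性,使低维隐层足以重构
X[:, 3] = X[:, 1] + 0.1 * rng.normal(size=64)

W_enc = rng.normal(scale=0.5, size=(4, 2))       # 编码端权重
W_dec = rng.normal(scale=0.5, size=(2, 4))       # 解码端权重
err_init = float(np.mean((X @ W_enc @ W_dec - X) ** 2))

lr = 0.05
for _ in range(3000):                            # 梯度下降最小化重构误差
    H = X @ W_enc                                # 隐层输出 = 编码结果
    E = H @ W_dec - X                            # 重构误差
    W_dec -= lr * (H.T @ E) / len(X)
    W_enc -= lr * (X.T @ (E @ W_dec.T)) / len(X)

err_final = float(np.mean((X @ W_enc @ W_dec - X) ** 2))
```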
同样的,关于该深度自动编解码器单元中,其可以是同上述的预测单元(130a)、变换单元(130b)、量化单元(130c)、熵编码单元(130d)共用同一数据缓存单元;也可以是具有一独立的数据缓存单元,均不影响本公开的实现。
具体来讲,控制器模块中的指令序列可以由用户编写的程序决定,可以采用用户希望采用的压缩方式对神经网络数值数据进行压缩。用户通过编写相关程序,将不同的编码方式进行组合。控制器模块将相关程序编译为指令,并将指令译码为相关控制指令,实现对各模块及编码过程的控制。
关于编码的具体过程,可以参照视频编码的相关说明,此处不再进一步详细说明。
需要进一步说明的是,压缩数据的过程实质上是对数据进行编码的过程,以上过程中的编码过程可等同视为压缩过程或压缩过程的一部分。
步骤S314,向数据编码模块131发送整合指令,令其将数据编码结果和编码过程信息进行整合,得到压缩结果;
在此步骤之后,压缩结果中包括两部分内容:第一部分为对神经网络数值数据的数据编码结果,第二部分是编码过程信息。其中,该编码过程信息中可以包含:编码方式的信息、深度自动编码器解码端信息(采用深度自动编解码器单元时)。
此处,编码方式的信息是指采用何种方式进行编码,是事先约定好的。 例如:指令中某个域是“1”就用视频编码,是“2”就用深度自动编码器,是“3”就先用视频编码方式再用深度自动编码器等等。
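编码方式域与编/解码流程的对应关系可示意如下(Python;方式编号沿用上文约定的1/2/3,各编解码函数为占位实现,仅演示按编码过程信息分派、以及组合编码时逆序解码的逻辑):

```python
# 示意(假设的约定):1=视频编码,2=深度自动编码器,3=先视频编码再深度自动编码器。
def video_enc(x): return ("V", x)
def video_dec(x): assert x[0] == "V"; return x[1]
def dae_enc(x): return ("D", x)
def dae_dec(x): assert x[0] == "D"; return x[1]

ENCODE = {1: [video_enc], 2: [dae_enc], 3: [video_enc, dae_enc]}
DECODE = {1: [video_dec], 2: [dae_dec], 3: [dae_dec, video_dec]}  # 解码按相反顺序

def compress(data, mode):
    for f in ENCODE[mode]:
        data = f(data)
    return {"payload": data, "mode": mode}   # 数据编码结果 + 编码过程信息

def decompress(packed):
    data = packed["payload"]
    for f in DECODE[packed["mode"]]:
        data = f(data)
    return data

restored = decompress(compress("weights", 3))
```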
步骤S316,向数据缓存模块140发送数据缓存指令,令其从数据编码模块131中获得压缩结果,并将压缩结果进行缓存;
步骤S318,向数据缓存模块140发送数据存储指令,令其将压缩结果存至外部存储模块200。
需要说明的是,虽然本实施例中是将压缩结果输出到外部存储模块,但在本公开其他实施例中,还可以是将该压缩结果直接传输出去,或者是将压缩结果缓存于数据编码模块131或者数据缓存模块140中,均是本公开可选的实现方式。
至此,本实施例用于压缩神经网络数值数据的压缩装置介绍完毕。
二、解压缩装置第一实施例
在本公开的第二个示例性实施例中,提供了一种用于解压缩神经网络数据压缩结果的解压缩装置。需要说明的是,为了达到简要说明的目的,上述压缩装置实施例中任何可作相同应用的技术特征叙述皆并于此,无需再重复相同叙述。
在实际的系统环境中,本实施例解压缩装置可以安装在存储部件周围,用于为传出存储部件的压缩后的神经网络数据进行解压缩处理;也可以将其置于输出端口周围,用于解压缩输出的压缩后的神经网络数据;还可以将解压缩装置置于数据的接收端,用于解压缩接收的压缩后的神经网络数据。
本实施例中,神经网络数据为神经网络数值数据。请参照图4,本实施例用于解压缩神经网络数值数据压缩结果的解压缩装置与第一实施例的压缩装置的结构类似,包括:控制器模块110′、模型转换模块120′、数据解码模块132和数据缓存模块140′。本实施例解压缩装置中各模块的连接关系,与第一实施例中压缩装置的连接关系类似,此处不再详细说明。
其中,控制器模块110′、模型转换模块120′和数据缓存模块140′的结构和功能与压缩装置中相应模块的结构和功能类似,此处不再赘述。
本实施例中,数据缓存模块140′用于缓存压缩结果。数据解码模块132与所述模型转换模块120′相连接,用于采用与压缩结果对应的视频解码方式对所述压缩结果进行解码。模型转换模块120′与所述数据解码模块132相连接,用于将解码后的类视频数据复原为神经网络数值数据。控制器模块110′与模型转换模块120′、数据解码模块132和数据缓存模块140′连接,用于向三者下达控制指令,令其协调工作。
与第一实施例压缩装置不同的是,本实施例解压缩装置中各个模块执行的操作与第一实施例压缩装置相应模块执行的操作相逆。具体而言,请参照图6,本实施例中,控制器模块110′发送控制指令,以执行如下操作:
步骤S602,向数据缓存模块140′发送数据读取指令,令其向外部存储模块200请求压缩结果,并将该压缩结果缓存;
如上所述,此处压缩结果中包括两部分内容:第一部分为对神经网络数值数据的数据编码结果,第二部分是编码过程信息。
步骤S604,向数据解码模块132发送数据读取指令,令其从数据缓存模块140′中读取压缩结果;
步骤S606,向数据解码模块132发送解整合指令,令其从压缩结果中解码出编码过程信息和数据压缩结果;
步骤S608,向数据解码模块132发送数据读取指令,从数据解码模块132读取编码过程信息;
步骤S610,根据编码过程信息选择解码指令;
如上所述,编码过程信息中可以包含:编码方式的信息、深度自动编码器解码端信息(采用深度自动编解码器单元时)。因此,可以从编码过程信息中得到采用的何种编码方式或编码方式的组合对神经网络数值数据进行的编码,并据此产生相应的解码指令。该解码指令中包含采用何种解码方式对压缩结果中的数据编码结果进行解码。
步骤S612,向数据编码解码模块132发送解码指令,令其将压缩结果中的数据压缩结果进行解压缩,得到类视频数据;
其中,数据解码模块132包括:解整合子模块132a,用于将压缩结果进行解整合,得到数据编码结果和编码过程信息;以及解码子模块,用于从所述编码过程信息中提取编码方式信息,利用该编码方式信息对应的解码方式对所述数据编码结果进行解码,得到类视频数据。而解码子模块进一步包括:预测单元130a,变换单元130b,量化单元130c,熵解码单元130d′和深度自动编解码器单元130e。各个单元执行的操作与编码模块中的相关操作相逆。
其中,在第一种解码方式中(如图5中实线所示):
(1)熵解码单元130d′可以对压缩结果执行与编码数据时所使用的熵编码方法对应的熵解码过程,如进行哈夫曼编码对应的解码过程。
(2)量化单元130c将经过熵解码单元130d′处理的压缩结果进行反量化处理。如对于经过标量量化技术处理的数据,采用以下反量化过程:
F(u,v) = FQ(u,v) · Q_step           (3-2)
所有的参数定义与公式3-1相同,此处不再重述。
(3)变换单元130b对经过量化单元处理的数据压缩结果进行反正交变换进行解码。
例如,与公式2-1相逆,对于N×N矩阵的二维离散余弦逆变换表示为:
f(m,n) = (2/N) · Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} c(u) · c(v) · F(u,v) · cos[ (2m+1)uπ / (2N) ] · cos[ (2n+1)vπ / (2N) ]
所有的参数定义与公式2-1相同,此处不再重述。
(4)预测单元130a利用原神经网络数值数据中相邻数据之间的相关性对经过变换单元处理的压缩结果进行解码。
例如:预测单元130a可以将预测值与相关差值相加,以恢复原值。
其中,在第二种解码方式中,深度自动编解码器单元130e对经过深度自动编码器编码的神经网络数值数据进行解码(如图5中虚线所示)。
例如,在解码过程中,深度自动编解码器单元130e首先从输入数据中解码出编码时所使用的深度自动编码器的解码端信息,用这些解码端信息构造一个解码器,再利用该解码器对经过深度自动编码器编码的神经网络数值数据进行解码。
在第一实施例中,编码指令中可以是一种编码方式,也可以是两种或两种以上的编码方式组合。与第一实施例对应,如果输入至数据解码模块132中的数据采用的是两种或两种以上的编码方式,则数据解码模块132依次采用相应的解码方式对数据进行解码。
例如:当输入数据解码模块132的数据所用的编码方法为预测、变换、量化和哈夫曼编码时,编码数据依次经过熵解码单元130d′,量化单元130c,变换单元130b和预测单元130a,前一个单元的输出为后一个单元的输入。例如一组输入数据解码模块的经过压缩的神经网络数值数据,经过熵解码单元130d′进行哈夫曼编码对应的解码过程进行解码,解码结果进入量化单元130c进行反量化,接着进入变换单元130b进行反变换,最后进入预测单元130a使得预测值与相关差值相加,从而输出解码结果。
关于解码的具体过程,可以参照视频解码的相关说明,此处不再进一步详细说明。
步骤S614,向数据缓存模块140′发送数据读取指令,令数据缓存模块140′从数据解码模块132读取类视频数据,并缓存;
步骤S616,向模型转换模块120发送数据读取指令,令模型转换模块120从数据缓存模块140′中读取类视频数据;
步骤S618,向模型转换模块120发送数据转换指令,令模型转换模块120将类视频数据转换为神经网络数值数据;
关于转换过程,其与第一实施例中模型转换模块执行的过程相逆。
以第一种方式为例:确定神经网络数值数据的数据范围为[-b,a],a是大于或者等于整个神经网络数值数据的最大值的正整数,-b是小于或等于整个神经网络模型数据的最小值的负整数。
模型转换模块(120)按以下公式操作,从而复原神经网络数值数据:
w = I / 255 × (a + b) − b
其中,w是在[-b,a]范围内的神经网络数值数据的真实数据值,I为类视频数据,其是在[0,255]区间内的整数。
同样的,上述公式对应于像素深度为8的情况,对应像素深度为t的情况,上述公式中的“255”应当用“(2^t−1)”来代替,其中t为正整数。
以第二种方式为例:对于卷积神经网络数值数据,模型转换模块120将类视频数据中对应视频帧的数据进行转换,每一帧转换为卷积神经网络的一种卷积核的权值和偏置,将各帧转换的数据整合起来,得到卷积神经网络各卷积核的权值和偏置的整体信息,从而复原神经网络数值数据。
其中,所述卷积神经网络数值数据是指:卷积神经网络的神经网络数 值数据。
其中,当把每一个类似于视频帧的数据转变为卷积核数据后,就得到了整个卷积神经网络卷积核的信息,具体可以采用链表或其他数据结构。
步骤S620,向数据缓存模块140′发送数据读取指令,令数据缓存模块140′向模型转换模块120请求神经网络数值数据,并缓存;
步骤S622,向数据缓存模块140′发送数据写入指令,令数据缓存模块140′将神经网络数值数据写入外部存储模块200;
需要说明的是,虽然本实施例中是将解码结果输出到外部存储模块,但在本公开其他实施例中,还可以是将该解码结果直接传输出去,或者是将解码结果缓存于模型转换模块或者数据缓存模块中,均是本公开可选的实现方式。
需要进一步说明的是,解压过程实质上是一个解码的过程,因此以上的过程中的解码过程可以等同视为解压缩过程或解压缩过程的一部分。
至此,本实施例用于解压缩神经网络数值数据的解压缩装置介绍完毕。
三、压缩/解压缩系统第一实施例
在本公开的第三个示例性实施例中,还提供了一种压缩/解压缩系统。如图7所示,本实施例压缩/解压缩系统集成了第一实施例压缩装置与第二实施例的解压缩装置。并且,压缩装置和解压缩装置共用控制器模块(110、110′)、模型转换模块(120、120′)和数据缓存模块(140、140′)。并且,压缩装置中的数据编码模块131和解压缩装置中的数据解码模块132集成为数据编/解码模块130。在数据编解码模块130中,数据编码模块131和数据解码模块132共用预测单元130a,变换单元130b,量化单元130c和深度自动编解码器单元130e。熵编码模块130d和熵解码模块130d′在系统中作为一个模块存在,即实现数据的编码,也实现数据的解码。
以下分别对本实施例压缩/解压缩系统的压缩过程和解压缩过程进行概要性说明:
压缩过程:
首先,将神经网络数据存入外部存储模块200;接着,控制器模块110发送控制指令给相关模块控制压缩过程;数据缓存模块140从外部存储模块中读取神经网络数据并缓存;接着,模型转换模块120从数据缓存模块140中读取神经网络数据并将其转换为类视频数据,随后将这些类视频数据存至数据缓存模块140;然后,数据编码模块131从数据缓存模块140读取类视频数据,这些数据依次经过预测单元130a,变换单元130b,量化单元130c和熵编码单元130d的处理完成压缩过程;随后,数据缓存模块140从数据编码解码模块130中读取压缩后的数据。最后,数据缓存模块140将压缩结果写入外部存储模块,压缩结果的数据量大大减小,方便进行存储、传输等操作,如图8中压缩过程所示。
解压缩过程:
首先,将待解压的数据存入外部存储模块200中,这些数据是神经网络数据经过预测、变换、量化和熵编码过程所压缩而成的;在接下来的过程中控制器模块110发送控制指令至各相关模块从而控制解压过程。数据缓存模块140从外部存储模块200读取待解压的数据。接着,数据解码模块132从数据缓存模块140读取待解压数据,这些数据依次经过熵解码单元130d′,量化单元130c,变换单元130b和预测单元130a的处理,解压为类视频数据。然后,数据缓存模块140从数据编码解码模块130中读取类视频数据。随后,数据缓存模块140将类视频数据存至模型转换模块120,模型转换模块120将其转换为神经网络数据。最后,数据缓存模块140从模型转换模块120读取神经网络数据,再将其写入外部存储模块200,从而实现神经网络数据的复原,如图8中解压缩过程所示。
至此,本实施例用于压缩/解压缩神经网络数据的压缩/解压缩系统介绍完毕。
四、压缩装置第二实施例
如上所述,神经网络数据包括:神经网络数值数据和神经网络结构信息。所述神经网络结构信息包括:神经元之间的连接方式、层内神经元数目、激活函数种类等。
对于神经网络结构信息,其不可以采用压缩装置第一实施例的方式进行压缩。
图9为根据本公开压缩装置第二实施例的结构示意图。如图9所示,本实施例压缩装置还包括:结构信息编码模块133,用于将神经网络结构信息进行编码,得到神经网络结构数据。
该结构信息编码模块采用如下方式进行编码:
(1)记录神经网络各层的层内神经元数目;
(2)对激活函数种类进行编码,如Relu函数用1表示,Sigmoid函数用2表示;
(3)用邻接矩阵表示各相邻层之间神经元的连接关系,例如:邻接矩阵第i行第j列的元素为1表示上一层的第i个神经元和下一层的第j个神经元相连,反之则不相连。
通过上述方式,就可以得到一个以层号为索引号,神经元激活函数种类编号和邻接矩阵为索引结果的索引结构,即为神经网络结构数据。
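上述索引结构可示意如下(Python/NumPy;激活函数编号沿用上文约定的1=Relu、2=Sigmoid,字典与元组的具体组织方式为示例假设):

```python
import numpy as np

# 示意:以层号为索引,记录层内神经元数目、激活函数种类编号,
# 以及与上一层之间的邻接矩阵(元素为 1 表示对应两个神经元相连)。
def encode_structure(layer_sizes, activation_ids, connections):
    structure = {}
    for idx, size in enumerate(layer_sizes):
        adj = None                                # 输入层没有"上一层",无邻接矩阵
        if idx > 0:
            adj = np.zeros((layer_sizes[idx - 1], size), dtype=np.uint8)
            for i, j in connections[idx - 1]:     # (上一层第 i 个, 本层第 j 个) 相连
                adj[i, j] = 1
        structure[idx] = {"neurons": size, "activation": activation_ids[idx], "adjacency": adj}
    return structure

# 2-3-1 的全连接小网络:隐层用 Relu(编号 1),输出层用 Sigmoid(编号 2)
sizes = [2, 3, 1]
acts = [0, 1, 2]
conns = [
    [(i, j) for i in range(2) for j in range(3)],
    [(i, 0) for i in range(3)],
]
net = encode_structure(sizes, acts, conns)
```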
对于得到的神经网络结构数据,其就可以和压缩后的神经网络数值数据一起被压缩或被存储。
本实施例中的神经网络数值数据进行压缩的模型转换模块、数据编码模块与压缩装置第一实施例中相应模块相同,此处不再赘述。
通过增加本实施例中的结构信息编码模块,实现了神经网络结构数据和神经网络数值数据的一并处理。
五、解压缩装置第二实施例
在本公开的第五个实施例中,提供了另外一种解压缩装置。
图10为根据本公开解压缩装置第二实施例的结构示意图。本实施例与解压缩装置第一实施例的区别在于,本实施例中,所述解压缩装置还包括:神经网络复原模块134,用于对神经网络结构数据进行解码,得到神经网络结构信息,将神经网络结构信息与复原后的神经网络数值数据一起复原神经网络。
如上所述,神经网络结构信息包括:神经元之间的连接方式、层内神经元数目、激活函数种类等。而神经网络结构数据为对神经网络结构信息采用压缩装置第二实施例的方式编码之后的数据。
通过增加本实施例中的神经网络复原模块,实现了神经网络结构数据还原为神经网络结构信息,进而与复原后的神经网络数值数据一起实现神经网络的复原。
六、压缩/解压缩系统第二实施例
在本公开的第六个示例性实施例中,还提供了一种压缩/解压缩系统。本实施例压缩/解压缩系统集成了压缩装置第二实施例与解压缩装置第二实施例。
七、其他实施例
在本公开的另外一些实施例里,还提供了一种芯片,其包括了:压缩装置第一实施例或压缩装置第二实施例所述的压缩装置、解压缩装置第一实施例和解压缩装置第二实施例所述的解压缩装置,或如第三实施例所述的压缩/解压缩系统。
对于压缩装置而言,其在芯片上的设置位置可以为:
1.所述芯片包括存储部件,所述压缩装置设置于存储部件外侧,用于对传入存储部件的神经网络数据进行压缩;或者
2.所述芯片包括输入端口,所述压缩装置设置于输入端口外侧,用于压缩输入的神经网络数据;或者
3.所述芯片包括数据发送端,所述压缩装置设置于数据发送端,用于对欲发送的神经网络数据进行压缩;
对于解压缩装置而言,其在芯片上的设置位置可以为:
1.所述芯片包括存储部件,所述解压缩装置设置于存储部件外侧,用于对从存储部件读出的压缩后的神经网络数据进行解压缩;或者
2.所述芯片包括输出端口,所述解压缩装置设置于输出端口外侧,用于解压缩输入的压缩后的神经网络数据;或者
3.所述芯片包括数据接收端,所述解压缩装置设置于数据接收端,用于对接收的压缩后的神经网络数据进行解压缩。
本领域技术人员可以理解的是,如果芯片需要压缩和解压缩功能,可以在芯片内部实现压缩/解压缩装置,装置与芯片的交互速度更快,置于片外可能降低交互速度,但如果用户并不需要神经网络芯片,只想要一个压缩/解压缩装置,也完全可以独立使用。
在本公开的另外一些实施例里,还提供了一种芯片封装结构,其包括了上述芯片。
在本公开的另外一些实施例里,本公开公开了一个板卡,其包括了上述芯片封装结构。
在一个实施例里,本公开公开了一个电子装置,其包括了上述芯片。
该电子装置包括数据处理装置、机器人、电脑、打印机、扫描仪、平板电脑、智能终端、手机、行车记录仪、导航仪、传感器、摄像头、云端服务器、相机、摄像机、投影仪、手表、耳机、移动存储、可穿戴设备、交通工具、家用电器、和/或医疗设备。
所述交通工具包括飞机、轮船和/或车辆;所述家用电器包括电视、空调、微波炉、冰箱、电饭煲、加湿器、洗衣机、电灯、燃气灶、油烟机;所述医疗设备包括核磁共振仪、B超仪和/或心电图仪。
至此,已经结合附图对本实施例进行了详细描述。依据以上描述,本领域技术人员应当对本公开用于神经网络数据的压缩/解压缩的装置和系统、芯片、电子装置有了清楚的认识。
需要说明的是,在附图或说明书正文中,未绘示或描述的实现方式,均为所属技术领域中普通技术人员所知的形式,并未进行详细说明。此外,上述对各元件和方法的定义并不仅限于实施例中提到的各种具体结构、形状或方式,本领域普通技术人员可对其进行简单地更改或替换,例如:
(1)在上述实施例中,外部存储模块和数据缓存模块为分离的两个模块,而在本公开的另外一些实施例中,外部存储模块和数据缓存模块还可以整体的形式存在,即将两个模块合并为一个具有存储功能的模块,其同样可以实现本公开;
(2)上述的一个或者多个或者全部的模块(比如预测子模块,变换子模块,等等)都有一个对应的数据缓存模块,或者都没有对应的数据缓存模块,而是有一个外部存储模块,均可以实施本公开;
(3)可以用硬盘、内存体等作为外部存储模块来实现本公开;
(4)在本公开的其他一些实施例中,外部存储模块可以用输入输出模块替代,用来进行数据的输入和输出,例如:对于压缩装置而言,其对输入的神经网络数据进行压缩或将压缩后的神经网络数据进行输出;对于解压缩装置而言,其对输入的神经网络数据进行解压缩或将解压缩后的神经网络数据进行输出,同样可以实现本公开。
综上所述,本公开可以实现大规模神经网络模型的高效压缩与解压,从而大幅减少神经网络模型的存储空间和传输压力,从而适应大数据时代神经网络规模不断扩大的趋势,可以应用到神经网络数据的各个领域,具有较强的推广应用价值。
还需要说明的是,本文可提供包含特定值的参数的示范,但这些参数无需确切等于相应的值,而是可在可接受的误差容限或设计约束内近似于相应值。除非特别描述或必须依序发生的步骤,上述步骤的顺序并无限制于以上所列,且可根据所需设计而变化或重新安排。并且上述实施例可基于设计及可靠度的考虑,彼此混合搭配使用或与其他实施例混合搭配使用,即不同实施例中的技术特征可以自由组合形成更多的实施例。
前面的附图中所描绘的进程或方法可通过包括硬件(例如,电路、专用逻辑等)、固件、软件(例如,被承载在非瞬态计算机可读介质上的软件),或两者的组合的处理逻辑来执行。虽然上文按照某些顺序操作描述了进程或方法,但是,应该理解,所描述的某些操作能以不同顺序来执行。此外,可并行地而非顺序地执行一些操作。
另外,在本公开各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件程序模块的形式实现。
所述集成的单元如果以软件程序模块的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储器中。基于这样的理解,本公开的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储器中,包括若干指令用以使得一台计算机设备(可为个人计算机、服务器或者网络设备等)执行本公开各个实施例所述方法的全部或部分步骤。而前述的存储器包括:U盘、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、移动硬盘、磁碟或者光盘等各种可以存储程序代码的介质。
各功能单元/模块都可以是硬件,比如该硬件可以是电路,包括数字电路,模拟电路等等。硬件结构的物理实现包括但不局限于物理器件,物理器件包括但不局限于晶体管,忆阻器等等。所述计算装置中的计算模块可以是任何适当的硬件处理器,比如CPU、GPU、FPGA、DSP和ASIC等等。所述存储单元可以是任何适当的磁存储介质或者磁光存储介质,比如RRAM,DRAM,SRAM,EDRAM,HBM,HMC等等。以上所述的具体实施例,对本公开的目的、技术方案和有益效果进行了进一步详细说明,所应理解的是,以上所述仅为本公开的具体实施例而已,并不用于限制本公开,凡在本公开的精神和原则之内,所做的任何修改、等同替换、改进等,均应包含在本公开的保护范围之内。

Claims (19)

  1. 一种用于神经网络数据的压缩装置,包括:
    模型转换模块(120),用于将神经网络数值数据转化为类视频数据;以及
    数据编码模块(131),与所述模型转换模块(120)相连接,用于采用视频编码的方式对所述类视频数据进行编码,得到压缩结果。
  2. 根据权利要求1所述的压缩装置,其中,所述类视频数据是指经过模型转换模块的转换后,原来的每个神经网络数值数据被转换为一系列预设范围内的整数值,对应于一个个像素的表示,这些整数共同所构成的对应视频的数据。
  3. 根据权利要求2所述的压缩装置,其中,所述模型转换模块(120)采用以下两种方式其中之一将神经网络数值数据转化为类视频数据:
    第一种方式:确定神经网络数值数据的数据范围[-b,a],a是大于或者等于整个神经网络数值数据的最大值的正整数,-b是小于或等于整个神经网络模型数据的最小值的负整数;
    模型转换模块(120)按以下公式操作进行转换:
    I = round( (w + b) / (a + b) × (2^t − 1) )
    其中,I是在[0,(2^t−1)]区间内的整数,即一个像素的表示;w是在[-b,a]范围内的神经网络数值数据的真实数据值,t为正整数;
    第二种方式:对于卷积神经网络数值数据,模型转换模块(120)将卷积神经网络数值数据中的每一种卷积核的权值和偏置进行转换,并将权值和偏置转换后得到的整数整合起来,得到对应视频帧的数据,从多种卷积核的权值和偏置得到的类似视频帧的数据结合起来就得到类视频数据。
  4. 根据权利要求1所述的压缩装置,其中,所述数据编码模块(131)包括:
    编码子模块,用于采用视频编码的方式对所述类视频数据进行编码,得到数据编码结果;以及
    整合子模块,用于将数据编码结果和编码过程信息进行整合,得到压缩结果。
  5. 根据权利要求4所述的压缩装置,其中,所述编码子模块包括:
    预测单元(130a),用于利用类视频数据相邻数据之间的相关性进行预测编码;
    变换单元(130b),用于对经过预测单元处理后的类视频数据进行正交变换编码,以压缩数据;
    量化单元(130c),用于对经过变换单元处理后的类视频数据进行量化编码,在不降低数据质量的前提下减少数据的编码长度;以及
    熵编码单元(130d),用于利用数据的统计特性对经过量化单元处理后的类视频数据进行码率压缩编码,以减少数据冗余。
  6. 根据权利要求4所述的压缩装置,其中,所述预测单元(130a)、变换单元(130b)、量化单元(130c)、熵编码单元(130d)共用同一数据缓存单元或分别对应一数据缓存单元。
  7. 根据权利要求4所述的压缩装置,其中,所述编码子模块包括:
    深度自动编码器单元,用于对模型转换模块输出的类视频数据进一步编码,将隐层输出作为编码结果;
    其中,所述深度自动编码器单元通过将类视频数据作为训练输入和理想输出利用最小化重构误差的方法进行训练,使输出成为与输入类视频数据基本相同的数据。
  8. 根据权利要求1至7中任一项所述的压缩装置,还包括:
    结构信息编码模块,用于将神经网络结构信息进行编码,得到神经网络结构数据。
  9. 根据权利要求8所述的压缩装置,其中,
    所述神经网络数值数据包括:神经网络的权值数据和偏置数据;
    所述神经网络结构信息包括:神经元之间的连接方式、层内神经元数目、激活函数种类;
    所述结构信息编码模块采用如下方式对神经网络结构信息进行编码:记录神经网络各层的层内神经元数目;对激活函数种类进行编码;用邻接矩阵表示各相邻层之间神经元的连接关系,从而得到以层号为索引号,神经元激活函数种类编号和邻接矩阵为索引结果的索引结构,即为神经网络结构数据。
  10. 根据权利要求1至7中任一项所述的压缩装置,还包括:
    数据缓存模块(140),用于缓存神经网络数值数据;
    控制器模块(110),与所述数据缓存模块(140)、模型转换模块(120)和数据编码模块(131)相连接,用于发送控制指令,以执行如下操作:
    向数据缓存模块(140)发送数据读取指令,令其向外界请求神经网络数值数据,并将该神经网络数值数据进行缓存;
    向模型转换模块(120)发送数据读取指令,令其从数据缓存模块(140)中读取神经网络数值数据;
    向模型转换模块(120)发送数据转换指令,令其将读取的神经网络数值数据转换为类视频数据;
    向数据缓存模块(140)发送数据读取指令,令其向模型转换模块(120)请求类视频数据,并进行缓存;
    向数据编码模块(131)发送数据读取指令,令其从数据缓存模块(140)读取类视频数据;
    向数据编码模块(131)发送数据编码指令,该编码指令中包含编码方式的信息,令其对采用该编码方式对应的单元对类视频数据进行编码,得到数据编码结果;
    向数据编码模块(131)发送整合指令,令其将数据编码结果和编 码过程信息进行整合,得到压缩结果;
    向数据缓存模块(140)发送数据缓存指令,令其从数据编码模块(131)中获得压缩结果,并将压缩结果进行缓存。
  11. 一种用于神经网络数据的解压缩装置,包括:
    数据解码模块(132),用于得到压缩结果,采用与压缩结果对应的视频解码方式对所述压缩结果进行解码;以及
    模型转换模块(120),与所述数据解码模块(132)相连接,用于将解码后的类视频数据复原为神经网络数值数据。
  12. 根据权利要求11所述的解压缩装置,其中,所述数据解码模块(132)包括:
    解整合子模块,用于将压缩结果进行解整合,得到数据编码结果和编码过程信息;以及
    解码子模块,用于从所述编码过程信息中提取编码方式信息,利用该编码方式信息对应的解码方式对所述数据编码结果进行解码,得到类视频数据。
  13. 根据权利要求11所述的解压缩装置,其中,所述模型转换模块采用以下两种方式其中之一将解码后的类视频数据复原为神经网络数值数据:
    第一种方式:确定神经网络数值数据的数据范围为[-b,a],a是大于或者等于整个神经网络数值数据的最大值的正整数,-b是小于或等于整个神经网络模型数据的最小值的负整数;
    模型转换模块(120)按以下公式操作,从而复原神经网络数值数据:
    w = I / (2^t − 1) × (a + b) − b
    其中,w是在[-b,a]范围内的神经网络数值数据的真实数据值,I为类视频数据,其是在[0,(2^t−1)]区间内的整数,t为正整数;
    第二种方式:对于卷积神经网络数值数据,模型转换模块(120)将类视频数据中对应视频帧的数据进行转换,每一帧转换为卷积神经网络的一种卷积核的权值和偏置,将各帧转换的数据整合起来,得到卷积神经网络各卷积核的权值和偏置的整体信息,从而复原神经网络数值数据。
  14. 根据权利要求11至13中任一项所述的解压缩装置,还包括:
    神经网络复原模块,用于神经网络结构数据进行解码,得到神经网络结构信息,将神经网络结构信息与复原后的神经网络数值数据一起复原神经网络;
    其中,所述神经网络数值数据为神经网络的权值数据和偏置数据;所述神经网络结构信息包括:神经元之间的连接方式、层内神经元数目、激活函数种类,所述神经网络结构数据是对神经网络结构信息进行编码后的数据。
  15. 根据权利要求11至13中任一项所述的解压缩装置,还包括:
    数据缓存模块(140),用于缓存压缩结果;
    控制器模块(110),与所述模型转换模块(120)、数据解码模块(132)和数据缓存模块(140)连接,用于向三者下达控制指令,以执行以下操作:
    向数据缓存模块(140)发送数据读取指令,令其向外部请求压缩结果,并将该压缩结果缓存;
    向数据解码模块(132)发送数据读取指令,令其从数据缓存模块(140)中读取压缩结果;
    向数据解码模块(132)发送解整合指令,令其从所述压缩结果中解码出编码过程信息和数据压缩结果;
    向数据解码模块(132)发送数据读取指令,从数据解码模块(132)读取编码过程信息;
    根据编码过程信息选择解码指令;
    向数据编码解码模块(132)发送解码指令,令其将压缩结果中的数据压缩结果进行解压缩,得到类视频数据;
    向数据缓存模块(140)发送数据读取指令,令其从数据解码模块(132)读取类视频数据,并缓存;
    向模型转换模块(120)发送数据读取指令,令其从数据缓存模块(140)中读取类视频数据;
    向模型转换模块(120)发送数据转换指令,令其将类视频数据转换为神经网络数值数据。
  16. 一种用于神经网络数据的压缩/解压缩的系统,包括:
    压缩装置,为权利要求1~10中任一项所述的压缩装置;以及
    解压缩装置,为权利要求11至15中任一项所述的解压缩装置。
  17. 根据权利要求16所述的压缩/解压缩的系统,其中:
    所述压缩装置为权利要求9所述的压缩装置;以及
    所述解压缩装置为权利要求14所述的解压缩装置;
    其中,所述压缩装置和解压缩装置共用数据缓存模块(140)、控制器模块(110)和模型转换模块(120)。
  18. 一种芯片,包括:
    如权利要求1~10中任一项所述的压缩装置;和/或
    如权利要求11~15中任一项所述的解压缩装置;和/或
    如权利要求16或17所述的压缩/解压缩的系统;
    其中,对于所述压缩装置或所述系统中的压缩装置:
    所述芯片包括存储部件,所述压缩装置设置于存储部件外侧,用于对传入存储部件的神经网络数据进行压缩;或者
    所述芯片包括输入端口,所述压缩装置设置于输入端口外侧,用于压缩输入的神经网络数据;或者
    所述芯片包括数据发送端,所述压缩装置设置于数据发送端,用于对欲发送的神经网络数据进行压缩;
    和/或,对于所述解压缩装置或所述系统中的解压缩装置:
    所述芯片包括存储部件,所述解压缩装置设置于存储部件外侧,用于对从存储部件读出的压缩后的神经网络数据进行解压缩;或者
    所述芯片包括输出端口,所述解压缩装置设置于输出端口外侧,用于解压缩输入的压缩后的神经网络数据;或者
    所述芯片包括数据接收端,所述解压缩装置设置于数据接收端,用于对接收的压缩后的神经网络数据进行解压缩。
  19. 一种电子装置,包括:如权利要求18所述的芯片。
PCT/CN2017/119364 2016-12-30 2017-12-28 压缩/解压缩的装置和系统、芯片、电子装置 WO2018121670A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP17889129.7A EP3564864A4 (en) 2016-12-30 2017-12-28 DEVICES FOR COMPRESSION / DECOMPRESSION, SYSTEM, CHIP AND ELECTRONIC DEVICE
US16/457,397 US10462476B1 (en) 2016-12-30 2019-06-28 Devices for compression/decompression, system, chip, and electronic device
US16/561,012 US10834415B2 (en) 2016-12-30 2019-09-05 Devices for compression/decompression, system, chip, and electronic device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611270091.6 2016-12-30
CN201611270091 2016-12-30

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/457,397 Continuation-In-Part US10462476B1 (en) 2016-12-30 2019-06-28 Devices for compression/decompression, system, chip, and electronic device

Publications (1)

Publication Number Publication Date
WO2018121670A1 true WO2018121670A1 (zh) 2018-07-05

Family

ID=62707005

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/119364 WO2018121670A1 (zh) 2016-12-30 2017-12-28 压缩/解压缩的装置和系统、芯片、电子装置

Country Status (4)

Country Link
US (2) US10462476B1 (zh)
EP (1) EP3564864A4 (zh)
CN (1) CN108271026B (zh)
WO (1) WO2018121670A1 (zh)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3564864A4 (en) * 2016-12-30 2020-04-15 Shanghai Cambricon Information Technology Co., Ltd DEVICES FOR COMPRESSION / DECOMPRESSION, SYSTEM, CHIP AND ELECTRONIC DEVICE
CN109409518B (zh) * 2018-10-11 2021-05-04 北京旷视科技有限公司 神经网络模型处理方法、装置及终端
CN111047020B (zh) * 2018-10-12 2022-11-29 上海寒武纪信息科技有限公司 支持压缩及解压缩的神经网络运算装置及方法
CN111045726B (zh) * 2018-10-12 2022-04-15 上海寒武纪信息科技有限公司 支持编码、解码的深度学习处理装置及方法
KR102621118B1 (ko) * 2018-11-01 2024-01-04 삼성전자주식회사 영상 적응적 양자화 테이블을 이용한 영상의 부호화 장치 및 방법
CN111381878A (zh) * 2018-12-28 2020-07-07 上海寒武纪信息科技有限公司 数据处理装置、方法、芯片及电子设备
CN110009621B (zh) * 2019-04-02 2023-11-07 广东工业大学 一种篡改视频检测方法、装置、设备及可读存储介质
EP3742349A1 (en) 2019-05-24 2020-11-25 Samsung Electronics Co., Ltd. Decompression apparatus and control method thereof
CN110248191A (zh) * 2019-07-15 2019-09-17 山东浪潮人工智能研究院有限公司 一种基于深层卷积神经网络的视频压缩方法
CN112445772A (zh) * 2019-08-31 2021-03-05 上海寒武纪信息科技有限公司 用于数据压缩和解压缩的装置和方法
US11671110B2 (en) 2019-11-22 2023-06-06 Tencent America LLC Method and apparatus for neural network model compression/decompression
KR20210136123A (ko) * 2019-11-22 2021-11-16 텐센트 아메리카 엘엘씨 신경망 모델 압축을 위한 양자화, 적응적 블록 파티셔닝 및 코드북 코딩을 위한 방법 및 장치
US11496151B1 (en) * 2020-04-24 2022-11-08 Tencent America LLC Neural network model compression with block partitioning
US11611355B2 (en) * 2020-06-22 2023-03-21 Tencent America LLC Techniques for parameter set and header design for compressed neural network representation
WO2022116207A1 (zh) * 2020-12-04 2022-06-09 深圳市大疆创新科技有限公司 编码方法、解码方法和编码装置、解码装置
CN116530079A (zh) * 2021-01-08 2023-08-01 深圳市大疆创新科技有限公司 编码方法、解码方法和编码装置、解码装置
US20220284282A1 (en) * 2021-03-05 2022-09-08 Qualcomm Incorporated Encoding techniques for neural network architectures
CN113742003B (zh) * 2021-09-15 2023-08-22 深圳市朗强科技有限公司 一种基于fpga芯片的程序代码执行方法及设备
CN114422607B (zh) * 2022-03-30 2022-06-10 三峡智控科技有限公司 一种实时数据的压缩传输方法

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105184362A (zh) * 2015-08-21 2015-12-23 中国科学院自动化研究所 基于参数量化的深度卷积神经网络的加速与压缩方法

Family Cites Families (12)

Publication number Priority date Publication date Assignee Title
JP3058774B2 (ja) * 1993-01-29 2000-07-04 株式会社河合楽器製作所 映像合成装置及び映像合成方法
US20060245500A1 (en) * 2004-12-15 2006-11-02 David Yonovitz Tunable wavelet target extraction preprocessor system
US7668397B2 (en) * 2005-03-25 2010-02-23 Algolith Inc. Apparatus and method for objective assessment of DCT-coded video quality with or without an original video sequence
GB2519070A (en) * 2013-10-01 2015-04-15 Sony Corp Data encoding and decoding
US9818032B2 (en) * 2015-10-28 2017-11-14 Intel Corporation Automatic video summarization
CN108427990B (zh) * 2016-01-20 2020-05-22 中科寒武纪科技股份有限公司 神经网络计算系统和方法
EP3557484B1 (en) * 2016-12-14 2021-11-17 Shanghai Cambricon Information Technology Co., Ltd Neural network convolution operation device and method
EP3561736A4 (en) * 2016-12-20 2020-09-09 Shanghai Cambricon Information Technology Co., Ltd MULTIPLICATION AND ADDITION DEVICE FOR MATRICES, COMPUTER DEVICE WITH NEURONAL NETWORK AND PROCESS
EP3564864A4 (en) * 2016-12-30 2020-04-15 Shanghai Cambricon Information Technology Co., Ltd DEVICES FOR COMPRESSION / DECOMPRESSION, SYSTEM, CHIP AND ELECTRONIC DEVICE
US11551067B2 (en) * 2017-04-06 2023-01-10 Shanghai Cambricon Information Technology Co., Ltd Neural network processor and neural network computation method
US10657439B2 (en) * 2017-10-24 2020-05-19 Shanghai Cambricon Information Technology Co., Ltd Processing method and device, operation method and device
US10540574B2 (en) * 2017-12-07 2020-01-21 Shanghai Cambricon Information Technology Co., Ltd Image compression method and related device

Patent Citations (1)

Publication number Priority date Publication date Assignee Title
CN105184362A (zh) * 2015-08-21 2015-12-23 中国科学院自动化研究所 基于参数量化的深度卷积神经网络的加速与压缩方法

Non-Patent Citations (4)

Title
FU, YAN ET AL.: "Lossless Data Compression with Neural Network Based on Maximum Entropy Theory", JOURNAL OF UNIVERSITY ELECTRONIC SCIENCE AND TECHNOLOGY OF CHINA, vol. 36, no. 6, 31 December 2007 (2007-12-31), pages 1245 - 1248, XP009515320, ISSN: 1001-0548 *
GONG, YUNCHAO ET AL.: "Compressing Deep Convolutional Networks Using Vector Quantization", ICLR, 18 December 2014 (2014-12-18), pages 1 - 10, XP055262159 *
HAN, SONG ET AL.: "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding", ICLR, 15 February 2016 (2016-02-15), pages 1 - 5, XP055393078 *
See also references of EP3564864A4 *

Also Published As

Publication number Publication date
EP3564864A4 (en) 2020-04-15
US10462476B1 (en) 2019-10-29
US20190394477A1 (en) 2019-12-26
EP3564864A1 (en) 2019-11-06
CN108271026A (zh) 2018-07-10
US10834415B2 (en) 2020-11-10
CN108271026B (zh) 2020-03-31
US20190327479A1 (en) 2019-10-24

Similar Documents

Publication Publication Date Title
WO2018121670A1 (zh) 压缩/解压缩的装置和系统、芯片、电子装置
Hu et al. Learning end-to-end lossy image compression: A benchmark
US20200160565A1 (en) Methods And Apparatuses For Learned Image Compression
US20200145692A1 (en) Video processing method and apparatus
WO2018120019A1 (zh) 用于神经网络数据的压缩/解压缩的装置和系统
WO2018121798A1 (zh) 基于深度自动编码器的视频编解码装置及方法
CN103581665A (zh) 转码视频数据
WO2019056898A1 (zh) 一种编码、解码方法及装置
US20230362378A1 (en) Video coding method and apparatus
WO2023279961A1 (zh) 视频图像的编解码方法及装置
CN111641826A (zh) 对数据进行编码、解码的方法、装置与系统
US11483585B2 (en) Electronic apparatus and controlling method thereof
Akbari et al. Learned multi-resolution variable-rate image compression with octave-based residual blocks
TWI826160B (zh) 圖像編解碼方法和裝置
WO2023193629A1 (zh) 区域增强层的编解码方法和装置
CN115866253A (zh) 一种基于自调制的通道间变换方法、装置、终端及介质
CN116095183A (zh) 一种数据压缩方法以及相关设备
WO2023050720A1 (zh) 图像处理方法、图像处理装置、模型训练方法
WO2022179509A1 (zh) 音视频或图像分层压缩方法和装置
CN114501031B (zh) 一种压缩编码、解压缩方法以及装置
CN114359100A (zh) 图像色彩增强方法、装置、存储介质与电子设备
CN115409697A (zh) 一种图像处理方法及相关装置
WO2023040745A1 (zh) 特征图编解码方法和装置
WO2022155818A1 (zh) 图像编码、解码方法及装置、编解码器
CN103179392A (zh) 图像处理设备以及图像处理方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17889129

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2017889129

Country of ref document: EP

Effective date: 20190730