CN107729995A - Method and system for accelerating a neural network processor, and neural network processor - Google Patents
Method and system for accelerating a neural network processor, and neural network processor
- Publication number
- CN107729995A (application CN201711054139.4A)
- Authority
- CN
- China
- Prior art keywords
- data
- weight
- neural network
- packet
- unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
Abstract
The invention provides a method for accelerating a neural network processor and a corresponding neural network processor, wherein non-zero elements are extracted from the original data packets and weight packets of the neural network model to be processed and a position mark is set for each packet, the position mark of each packet indicating whether the element at the corresponding position in the packet is zero; during computation, based on the position marks, only the data and weights whose elements at the same position are both non-zero are loaded into the computing units of the neural network processor to participate in the operation. In this way, the scale of data handled by the neural network processor can be effectively reduced, thereby reducing on-chip storage overhead, accelerating operation speed and reducing energy consumption, making the neural network processing system more efficient.
Description
Technical field
The present invention relates to neural network processors, and more particularly to methods for accelerating neural network model computation.
Background art
Deep learning has achieved important breakthroughs in recent years, and neural network models trained with deep learning algorithms have achieved remarkable results in application fields such as image recognition, speech processing and intelligent robotics. A deep neural network simulates the neural connection structure of the human brain by building a model, describing data features hierarchically through multiple transformation stages when processing signals such as images, sound and text. With the continuous increase in neural network complexity, neural network techniques suffer in practical applications from problems such as high resource consumption, slow operation speed and high energy consumption. Using hardware accelerators in place of traditional software computation has become an effective way to improve the efficiency of neural network computation, for example neural network processors implemented with graphics processing units, application-specific processor chips and field-programmable gate arrays (FPGAs).
At present, neural network processors generally take the trained weight data as an input signal and carry out arithmetic operations on-chip together with the data signal. A neural network processor is both computation-intensive and memory-access-intensive: a large number of parameter iterations occur during neural network operation, and the computing units need to access memory extensively. As the scale of neural network data keeps growing, intensive memory-access operations not only occupy a large amount of on-chip resources of the neural network processor, but also reduce its operation speed.
Summary of the invention
Therefore, it is an object of the invention to overcome the above defects of the prior art and to provide a method for improving the operation speed of a neural network processor, and a corresponding neural network processor.
The purpose of the present invention is achieved through the following technical solutions:
In one aspect, the invention provides a method for accelerating a neural network processor, the method including:
step 1) for the data packets and weight packets of the neural network model to be loaded, extracting the non-zero elements and setting the position mark of each packet, the position mark of each packet indicating whether the element at the corresponding position in the packet is zero;
step 2) loading the non-zero elements and position marks of each data packet and weight packet into the storage units of the neural network processor;
step 3) matching data and weights based on the position marks, and loading only the data and weights whose elements at the same position are both non-zero into the computing units of the neural network processor to participate in the operation.
The above method may further include extracting non-zero elements and their position marks from the output data of the computing units of the neural network processor, and saving them into the data storage unit.
In the above method, step 3) may include:
comparing, in order, each bit in the binary form of the position mark of a data packet with each bit in the binary form of the position mark of a weight packet;
loading the data and weights at positions where the positions are the same and the marks are both 1 into the computing units of the neural network processor to participate in the operation.
In another aspect, the invention provides a neural network processor including a control unit, computing units, a weight storage unit, a data storage unit and a data matching unit, wherein the control unit is used to control the scheduling, computation and storage of the related data; the weight storage unit stores the non-zero elements and their position marks of the trained neural network weight packets; the data storage unit stores the non-zero elements and their position marks of the neural network original data packets and intermediate result data; and the data matching unit is used to match, based on the position marks, the weights from the weight storage unit with the data from the data storage unit, loading into the computing units only the data and weights whose elements at the same position are both non-zero.
The above neural network processor may further include a data compression unit for extracting non-zero elements from the output data of the computing units, setting their position marks, and saving them into the data storage unit.
In the above neural network processor, the data matching unit may include one or more comparators.
In the above neural network processor, the data compression unit may include an input register, an output register and a comparator; the input register receives the data from the computing units, the comparator judges whether the data is zero, and if the data is non-zero, the data and the corresponding register number are loaded into the output register while the flag bit is set to 1.
In another aspect, the invention provides a system for accelerating a neural network processor, the system including:
a data preprocessing device, for extracting the non-zero elements from the data packets and weight packets of the neural network model to be loaded and setting the position mark of each packet, the position mark of each packet indicating whether the element at the corresponding position in the packet is zero, and for loading the non-zero elements and position marks of each data packet and weight packet into the storage units of the neural network processor;
a data matching device, for matching data and weights based on the position marks, and loading only the data and weights whose elements at the same position are both non-zero into the computing units of the neural network processor to participate in the operation.
The above system may further include a data compression device for extracting non-zero elements and their position marks from the output data of the computing units of the neural network processor and saving them into the data storage unit.
In the above system, the data matching device may be configured to:
compare, in order, each bit in the binary form of the position mark of a data packet with each bit in the binary form of the position mark of a weight packet; and
load the data and weights at positions where the positions are the same and the marks are both 1 into the computing units of the neural network processor to participate in the operation.
Compared with the prior art, the advantage of the invention is that it effectively reduces the scale of data handled by the neural network processor, thereby reducing on-chip storage overhead, accelerating operation speed and reducing energy consumption, so that the neural network processing system performs more efficiently.
Brief description of the drawings
Embodiments of the present invention are further illustrated below with reference to the drawings, wherein:
Fig. 1 is a flow diagram of the method for accelerating a neural network processor according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of an example compressed storage format for weights according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of an example compressed storage format for data according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of an example weight compression process according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of an example data compression process according to an embodiment of the present invention;
Fig. 6 is a structural diagram of the neural network processor according to an embodiment of the present invention;
Fig. 7 is a structural diagram of the data matching unit according to an embodiment of the present invention;
Fig. 8 is a structural diagram of the data compression unit according to an embodiment of the present invention;
Fig. 9 is a flow diagram of computation using the neural network processor according to an embodiment of the present invention.
Detailed description
In order to make the purpose, technical scheme and advantages of the present invention clearer, the present invention is described in more detail below through specific embodiments in conjunction with the accompanying drawings. It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.
The inventors found in research that a large number of zero values exist in the data and weights participating in neural network computation; such data and weights have no numerical influence on the operation result after operations such as multiplication and addition in the calculation process. However, storing, loading and computing on these zero-valued data and weights occupies a large amount of on-chip resources and consumes unnecessary working time, making it difficult to meet the performance requirements of a neural network processor.
In one embodiment of the invention, a method for accelerating a neural network processor is provided. As shown in Fig. 1, the method mainly includes: 1) for the original data packets and weight packets of the neural network model to be loaded, extracting the non-zero elements and setting the position mark of each packet, the position mark of a packet indicating whether the element at the corresponding position in the packet is zero; 2) loading the non-zero elements and position marks of the data packets and weight packets into the storage units of the neural network processor; 3) matching data and weights based on the position marks, and loading only the data and weights whose elements at the same position are both non-zero into the computing units of the neural network processor to participate in the operation.
More specifically, in step 1), for the original data packets and weight packets of the neural network model to be loaded, the non-zero elements are extracted and the position mark of each packet is set. In neural network computation, the weights and data to be processed are usually split, stored and loaded in the same pattern as multiple packets or sequences, and the number of elements in each packet can be determined according to the scale of the computing units of the neural network processor actually used. This process of extracting non-zero elements and setting position marks can also be understood as re-encoding or compressing the neural network data and weights to be processed: the weight sequences and data sequences obtained after re-encoding or compression no longer retain the elements whose value is zero. The storage format of the weights after the processing of step 1) is shown in Fig. 2 and includes two parts: <weight> and <mark>; the storage format of the data is shown in Fig. 3 and likewise includes two parts: <data non-zero elements> and <mark>. The mark (also referred to as the position mark) indicates whether the element at the corresponding position in the packet is zero; for example, if the value of the element at a given position in a packet is 0, the mark for that position can be set to 0, and if the element at that position is non-zero, the mark for that position can be set to 1.
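By way of illustration, this re-encoding can be modelled in a few lines of software (a minimal sketch, not part of the patent; the name compress_packet and the most-significant-bit-first mark order are assumptions consistent with the Fig. 4 example below):

```python
def compress_packet(packet):
    """Re-encode one packet (a fixed-size group of elements) into the
    <non-zero elements> + <position mark> format of Fig. 2 / Fig. 3.

    The mark is an integer with one bit per element; the bit for the
    1st position of the packet is the most significant, and a bit is 1
    exactly when the element at that position is non-zero.
    """
    nonzero = [v for v in packet if v != 0]
    mark = 0
    for v in packet:
        mark = (mark << 1) | (1 if v != 0 else 0)
    return nonzero, mark
```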
Fig. 4 illustrates the process of compressing weights, described with packets of four elements each. As shown in Fig. 4, above the line are the original weights, and below the line are the weights obtained after the processing of step 1). In the first group of weights, the non-zero elements are 1.5 and 2.5, located at the 1st and 4th positions of the packet; therefore, after re-encoding or compression, this group of weights shown below the line retains these two non-zero elements, and the position mark corresponding to the group is 1001. The second group of original weights contains three non-zero elements, at the 1st, 3rd and 4th positions of the group; therefore, after re-encoding or compression, this group retains these three non-zero elements, and its position mark is set to 1011. The third group of weight values contains two non-zero elements, 3 and 4, and its corresponding position mark is set to 0011.
The data compression process shown in Fig. 5 is similar to the weight compression process shown in Fig. 4, again with packets of four elements each: above the line is the original data, and below the line is the data obtained after the processing of step 1). In the first group of data, the elements with values 1 and 2 are the 1st and 4th elements respectively; therefore, after re-encoding, this group of data retains the two non-zero elements, and its corresponding position mark is 1001. The second group of original data contains three non-zero elements, the 1st, 2nd and 4th elements of the group, and its position mark is set to 1101. The third group of data retains three non-zero elements after compression, and its position mark is set to 1011.
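Applied to the first groups of Fig. 4 and Fig. 5, the compress_packet sketch above reproduces the stored elements and marks described in the text:

```python
# First group of weights in Fig. 4: 1.5 and 2.5 at the 1st and 4th positions.
w_nz, w_mark = compress_packet([1.5, 0.0, 0.0, 2.5])
assert w_nz == [1.5, 2.5] and format(w_mark, "04b") == "1001"

# First group of data in Fig. 5: the values 1 and 2 at the 1st and 4th positions.
d_nz, d_mark = compress_packet([1.0, 0.0, 0.0, 2.0])
assert d_nz == [1.0, 2.0] and format(d_mark, "04b") == "1001"
```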
With continued reference to Fig. 1, after the above processing, in step 2) the non-zero elements and position marks of the weight packets and data packets are loaded into the storage units of the neural network processor; for example, they can be loaded into the weight storage unit and the data storage unit of the neural network processor respectively. Then, in step 3), during computation, data is read from the data storage unit and weights are read from the weight storage unit, the data and weights are matched based on the position marks, and only the data and weights whose elements at the same position are both non-zero are loaded into the computing units of the neural network processor to participate in the operation. For example, each bit of the position mark of a data packet is compared in order with the corresponding bit of the position mark of a weight packet; if the marks at the same position are both 1, the weight and data at that position are loaded into a computing unit. It can be seen that for a packet containing 4 elements, the position mark corresponding to the packet is in effect an integer (its numerical range being between 0 and 2^4 − 1) whose binary form indicates bit by bit whether each element in the packet is 0. Therefore, the neural network processor stores only the non-zero elements of the data packets and weight packets plus one position mark per packet, which can greatly reduce memory occupation; and only the non-zero data and weights are loaded into the computing units, which both improves computation speed and improves the utilization of the computing units.
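Functionally, this comparator-based matching can be sketched in software as follows (an illustrative model rather than the hardware circuit; match_pairs is an assumed name, and marks are assumed to be stored most-significant-bit first as in the examples above):

```python
def match_pairs(d_nz, d_mark, w_nz, w_mark, width=4):
    """Walk both position marks bit by bit (most significant bit first,
    i.e. from the 1st packet position) and keep only the (data, weight)
    pairs whose mark bits are both 1.

    d_nz and w_nz hold only the non-zero elements, so two running
    indices track how far each compressed list has been consumed.
    """
    pairs = []
    di = wi = 0
    for pos in range(width - 1, -1, -1):
        d_bit = (d_mark >> pos) & 1
        w_bit = (w_mark >> pos) & 1
        if d_bit and w_bit:
            pairs.append((d_nz[di], w_nz[wi]))
        di += d_bit  # a set bit consumes one stored non-zero element
        wi += w_bit
    return pairs

# Third weight group of Fig. 4 (elements 3 and 4, mark 0011) matched against
# the first data group of Fig. 5 (mark 1001): only the 4th position survives.
print(match_pairs([1.0, 2.0], 0b1001, [3.0, 4.0], 0b0011))  # [(2.0, 4.0)]
```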
In yet another embodiment, the method also includes applying the same re-encoding or compression to each group of data output by the computing units of the neural network processor, in the same manner as the above processing of the weights and original data, saving only the non-zero elements and position marks of each group into the storage unit. Since neural network computation produces many intermediate calculation results, preserving only the non-zero elements of these intermediate results can further optimize the utilization of storage and computing resources in the neural network processor.
Fig. 6 is a structural diagram of the neural network processor according to an embodiment of the present invention. The neural network processor uses a storage-control-computation structure, wherein the storage structure stores the data participating in the computation and the processor operation instructions; the control structure includes a decoding circuit, which parses the operation instructions and generates control signals to control the on-chip scheduling and storage of data and the neural network computation process; and the computation structure includes arithmetic logic units, which carry out the neural network computation operations in the processor. As shown in Fig. 6, the control unit can communicate with the data storage unit, the weight storage unit, the instruction storage unit and the computing units; the control unit obtains the instructions stored in the instruction storage unit, parses them, and generates control signals to control the computing units to carry out the neural network computation. The weight storage unit stores the trained neural network weights, and the data storage unit stores the various data related to the neural network computation, which may include the original feature data of the neural network model, the parameters participating in intermediate-layer computation, the data output from the computing units, and so on. The computing units carry out the corresponding neural network computation according to the control signals generated by the control unit. The computing units are associated with one or more storage units; they can obtain the data and weights to be computed from the data storage unit and the weight storage unit, and can write data to the data storage unit.
Differently from existing neural network processors, however, what is stored in the weight storage unit and data storage unit shown in Fig. 6 is the re-encoded or compressed data described above: only the non-zero elements and position marks of each data packet and weight packet are saved. In addition, a data matching unit is added between the output of the storage units and the input of the computing units, and a data compression unit is added between the output of the computing units and the input of the storage units. The data matching unit matches the weights and data stored in the weight storage unit and data storage unit in the re-encoded or compressed format; for example, it reads the position marks of a data packet and a weight packet, compares each bit of the binary forms of the position marks in order, and loads only the data and weights whose elements at the same position are both non-zero into the computing units of the neural network processor to participate in the operation, thereby ensuring that the compressed weights are computed correctly against the corresponding compressed data. Fig. 7 shows a structural diagram of an example data matching unit. The data matching unit contains one or more comparators, whose role is to compare the position mark of the data with the position mark of the weights; only the data and weights whose positions are the same and whose marks are both 1 are loaded into the buffer queues of the computing-unit array to await computation.
What is shown in Fig. 6 is only an example in which the computing units share one data matching unit. In another embodiment, a corresponding data matching unit may instead be provided in each computing unit. In this way, during the computation of the neural network model, the data from the data storage unit is shared among the computing units, different weight values from the weight storage unit are connected to the respective computing units, and each computing unit matches the position marks of the weights against the position marks of the data through its own data matching unit, performing the subsequent computation only on the matched data and weights at corresponding positions; the computing units can work in parallel.
With continued reference to Fig. 6, the data compression unit between the output of the computing units and the input of the storage units compresses the intermediate calculation results output by the computing units on the neural network processor chip, retaining only the non-zero elements and not storing the zero-valued elements. In the same manner as the processing of the weights and original data described above, only the non-zero elements and position marks of each group of data output by the computing units are saved into the storage unit, thereby further optimizing the utilization of storage and computing resources in the neural network processor. Fig. 8 shows a structural diagram of an example data compression unit. The data compression unit is composed of input registers, output registers and a comparator. The data to be compressed is first placed into the input register group of the compression unit; the comparator then judges whether each entered value is zero. If a value is non-zero, the data and the corresponding register number are loaded into an output register, and the comparison result is recorded in the flag bit: the flag bit is 0 for a zero value and 1 for a non-zero value.
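A behavioural sketch of such a unit may make the register-level description concrete (register widths and timing are simplified away; compress_unit is an assumed name):

```python
def compress_unit(input_registers):
    """Behavioural model of the Fig. 8 data compression unit: the value
    in each input register is checked by the comparator; non-zero values
    are forwarded to the output registers together with their register
    number, and the corresponding flag bit is set to 1 (0 for a zero).
    """
    output_registers = []  # (register number, value) pairs
    flags = 0
    for reg_no, value in enumerate(input_registers):
        flags <<= 1
        if value != 0:  # the comparator's zero test
            output_registers.append((reg_no, value))
            flags |= 1
    return output_registers, flags

# An intermediate result [0, 8, 0, 9] keeps two values; the flags read 0101.
print(compress_unit([0.0, 8.0, 0.0, 9.0]))  # ([(1, 8.0), (3, 9.0)], 5)
```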
Fig. 9 shows a flow diagram of neural network computation using the neural network processor according to an embodiment of the present invention, in which each computing unit of the neural network processor includes its own data matching unit. As shown in Fig. 9, the control unit addresses the storage unit, reads and parses the instruction to be executed, obtains the input data from the storage unit according to the storage address obtained by parsing the instruction, and loads the data and weights in packet units from the data storage unit and the weight storage unit respectively into the computing units. During the computation of the neural network model, the data packets from the data storage unit are shared among the computing units according to the control instructions, and the weight packets from the weight storage unit are connected to the respective corresponding computing units. Then, the data matching unit provided in each computing unit matches the data against the weights based on the position marks of the received weight packets and data packets, and the relevant arithmetic operations of the neural network computation are performed only on the data and weights whose positions are the same and whose marks are both 1. The relevant computation results of each computing unit are provided to the data compression unit, which extracts the non-zero elements from them, sets the position marks, and outputs the result to the data storage unit.
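Putting the earlier sketches together, one computing-unit pass of this flow can be modelled as a sparse multiply-accumulate (a sketch only; the actual operation scheduled by the control unit may differ):

```python
def sparse_dot(data_packet, weight_packet):
    """End-to-end model of one computing-unit pass from Fig. 9: compress
    both packets, match them by position mark, and accumulate only the
    products of the matched pairs. The multiply-accumulate stands in for
    whatever neural network operation the control unit has scheduled.
    """
    d_nz, d_mark = compress_packet(data_packet)
    w_nz, w_mark = compress_packet(weight_packet)
    acc = 0.0
    for d, w in match_pairs(d_nz, d_mark, w_nz, w_mark):
        acc += d * w  # only non-zero pairs ever reach the multiplier
    return acc

# Two of the four positions are zero and never reach the computing unit.
print(sparse_dot([1.0, 0.0, 0.0, 2.0], [1.5, 0.0, 0.0, 2.5]))  # 6.5
```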
In yet another embodiment, a system for accelerating a neural network processor is also provided, including an off-chip compression device and the neural network processor described above. The off-chip compression device extracts the non-zero values from the original data packets and weight packets of the neural network model to be processed and sets the position marks, and then loads the processed data and weights into the data storage unit and weight storage unit of the neural network processor respectively.
In yet another embodiment, a system for accelerating a neural network processor is also provided, the system including a data preprocessing device and a data matching device. The data preprocessing device is used to extract, from the original data packets and weight packets of the neural network model to be loaded, the non-zero elements, to set the position marks of the packets, and to load them into the storage units of the neural network processor. The data matching device is used to match data and weights according to the position marks, and to load only the data and weights whose elements at the same position are both non-zero into the computing units of the neural network processor to participate in the operation. In another embodiment, the system may further include a data compression device for extracting the non-zero elements from the output data of the computing units of the neural network processor and setting the position marks, and then saving them into the data storage unit of the neural network processor.
Although the present invention has been described by means of preferred embodiments, the present invention is not limited to the embodiments described here, and also includes various changes and modifications made without departing from the present invention.
Claims (10)
1. A method for accelerating a neural network processor, the method including:
step 1) for the data packets and weight packets of the neural network model to be loaded, extracting the non-zero elements and setting the position mark of each packet, the position mark of each packet indicating whether the element at the corresponding position in the packet is zero;
step 2) loading the non-zero elements and position marks of each data packet and weight packet into the storage units of the neural network processor;
step 3) matching data and weights based on the position marks, and loading only the data and weights whose elements at the same position are both non-zero into the computing units of the neural network processor to participate in the operation.
2. The method according to claim 1, further including extracting non-zero elements and their position marks from the output data of the computing units of the neural network processor, and saving them into the data storage unit.
3. The method according to claim 1, wherein step 3) includes:
comparing, in order, each bit in the binary form of the position mark of a data packet with each bit in the binary form of the position mark of a weight packet;
loading the data and weights at positions where the positions are the same and the marks are both 1 into the computing units of the neural network processor to participate in the operation.
4. A neural network processor, including a control unit, computing units, a weight storage unit, a data storage unit and a data matching unit, wherein the control unit is used to control the scheduling, computation and storage of the related data; the weight storage unit stores the non-zero elements and their position marks of the trained neural network weight packets; the data storage unit stores the non-zero elements and their position marks of the neural network original data packets and intermediate result data; and the data matching unit is used to match, based on the position marks, the weights from the weight storage unit with the data from the data storage unit, loading into the computing units only the data and weights whose elements at the same position are both non-zero.
5. The neural network processor according to claim 4, further including a data compression unit for extracting non-zero elements from the output data of the computing units, setting their position marks, and saving them into the data storage unit.
6. The neural network processor according to claim 4 or 5, wherein the data matching unit includes one or more comparators.
7. The neural network processor according to claim 5, wherein the data compression unit includes an input register, an output register and a comparator; the input register receives the data from the computing units, the comparator judges whether the data is zero, and if the data is non-zero, the data and the corresponding register number are loaded into the output register while the flag bit is set to 1.
8. A system for accelerating a neural network processor, the system including:
a data preprocessing device, for extracting the non-zero elements from the data packets and weight packets of the neural network model to be loaded and setting the position mark of each packet, the position mark of each packet indicating whether the element at the corresponding position in the packet is zero, and for loading the non-zero elements and position marks of each data packet and weight packet into the storage units of the neural network processor;
a data matching device, for matching data and weights based on the position marks, and loading only the data and weights whose elements at the same position are both non-zero into the computing units of the neural network processor to participate in the operation.
9. The system according to claim 8, further including:
a data compression device, for extracting non-zero elements and their position marks from the output data of the computing units of the neural network processor, and saving them into the data storage unit.
10. The system according to claim 8, wherein the data matching device is configured to:
compare, in order, each bit in the binary form of the position mark of a data packet with each bit in the binary form of the position mark of a weight packet; and
load the data and weights at positions where the positions are the same and the marks are both 1 into the computing units of the neural network processor to participate in the operation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711054139.4A CN107729995A (en) | 2017-10-31 | 2017-10-31 | Method and system for accelerating a neural network processor, and neural network processor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711054139.4A CN107729995A (en) | 2017-10-31 | 2017-10-31 | Method and system for accelerating a neural network processor, and neural network processor |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107729995A (en) | 2018-02-23 |
Family
ID=61203664
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711054139.4A Pending CN107729995A (en) | 2017-10-31 | 2017-10-31 | Method and system for accelerating a neural network processor, and neural network processor |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107729995A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1719435A (en) * | 2004-07-07 | 2006-01-11 | 联发科技股份有限公司 | Method and apparatus for implementing DCT/IDCT |
CN105260776A (en) * | 2015-09-10 | 2016-01-20 | 华为技术有限公司 | Neural network processor and convolutional neural network processor |
CN106447034A (en) * | 2016-10-27 | 2017-02-22 | 中国科学院计算技术研究所 | Neural network processor based on data compression, design method and chip |
CN106529670A (en) * | 2016-10-27 | 2017-03-22 | 中国科学院计算技术研究所 | Neural network processor based on weight compression, design method, and chip |
CN106650924A (en) * | 2016-10-27 | 2017-05-10 | 中国科学院计算技术研究所 | Processor based on time dimension and space dimension data flow compression and design method |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019091020A1 (en) * | 2017-11-10 | 2019-05-16 | 中国科学院计算技术研究所 | Weight data storage method, and neural network processor based on method |
US11531889B2 (en) | 2017-11-10 | 2022-12-20 | Institute Of Computing Technology, Chinese Academy Of Sciences | Weight data storage method and neural network processor based on the method |
CN108809522B (en) * | 2018-07-09 | 2021-09-14 | 上海大学 | Method for realizing multi-code deep learning decoder |
CN108809522A (en) * | 2018-07-09 | 2018-11-13 | 上海大学 | Implementation method of a multi-code deep learning decoder |
CN111160516B (en) * | 2018-11-07 | 2023-09-05 | 杭州海康威视数字技术股份有限公司 | Convolutional layer sparsification method and device for deep neural network |
CN111160516A (en) * | 2018-11-07 | 2020-05-15 | 杭州海康威视数字技术股份有限公司 | Convolutional layer sparsification method and device for a deep neural network |
CN111260044A (en) * | 2018-11-30 | 2020-06-09 | 上海寒武纪信息科技有限公司 | Data comparator, data processing method, chip and electronic equipment |
WO2020134550A1 (en) * | 2018-12-29 | 2020-07-02 | 深圳云天励飞技术有限公司 | Data compression method and related device |
CN109886394B (en) * | 2019-03-05 | 2021-06-18 | 北京时代拓灵科技有限公司 | Method and device for processing weight of ternary neural network in embedded equipment |
CN109886394A (en) * | 2019-03-05 | 2019-06-14 | 北京时代拓灵科技有限公司 | Ternary neural network weight processing method and device in embedded equipment |
CN111105018A (en) * | 2019-10-21 | 2020-05-05 | 深圳云天励飞技术有限公司 | Data processing method and device |
CN111105018B (en) * | 2019-10-21 | 2023-10-13 | 深圳云天励飞技术有限公司 | Data processing method and device |
CN111882028A (en) * | 2020-06-08 | 2020-11-03 | 北京大学深圳研究生院 | Convolution operation device for convolution neural network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107729995A (en) | Method and system for accelerating a neural network processor, and neural network processor | |
CN107491811A (en) | Method and system for accelerating a neural network processor, and neural network processor | |
CN106529670B (en) | Neural network processor based on weight compression, design method and chip | |
CN107844829A (en) | Method and system for accelerating a neural network processor, and neural network processor | |
CN106447034B (en) | Neural network processor based on data compression, design method and chip | |
CN107103113B (en) | Automated design method, device and optimization method for neural network processors | |
CN109740747B (en) | Operation method, device and related product | |
CN107341544A (en) | Reconfigurable accelerator based on a divisible array and implementation method thereof | |
CN106951926A (en) | Deep learning system method and device with a mixed architecture | |
CN110110851A (en) | FPGA accelerator for LSTM neural networks and acceleration method thereof | |
CN109376843A (en) | FPGA-based rapid EEG signal classification method, implementation method and device | |
CN110379416A (en) | Neural network language model training method, device, equipment and storage medium | |
CN105512723A (en) | Artificial neural network computing device and method for sparse connection | |
CN106951395A (en) | Parallel convolution operation method and device for compressed convolutional neural networks | |
CN109190756A (en) | Arithmetic device based on Winograd convolution and neural network processor comprising the device | |
CN108231086A (en) | FPGA-based deep learning speech enhancement device and method | |
Zhang et al. | Spiking neural P systems with a generalized use of rules | |
CN107992940A (en) | Implementation method and device of convolutional neural networks on FPGA | |
CN104424507B (en) | Prediction method and prediction device of echo state network | |
CN110321997A (en) | Highly parallel computing platform, system and computing implementation method | |
CN106875005A (en) | Adaptive-threshold neuron information processing method and system | |
CN110109543A (en) | C-VEP recognition method based on subject transfer | |
CN114333074A (en) | Human body posture estimation method based on a dynamic lightweight high-resolution network | |
CN114332545A (en) | Image data classification method and device based on a low-bit spiking neural network | |
CN109657794A (en) | Distributed deep neural network performance modelling method based on instruction queues |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180223 |