CN109978143A - Stacked autoencoder based on SIMD architecture and encoding method - Google Patents
- Publication number: CN109978143A (application CN201910251530.6A)
- Authority: CN (China)
- Prior art keywords: layer, neural network, weight, SRAM, bias
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The present invention provides a stacked autoencoder based on a SIMD architecture and an encoding method. The autoencoder comprises a DMA interface module, a neural network inference module, and a neural network training module. The DMA interface module reads data from off-chip DDR in DMA mode, stores it in on-chip SRAM by partition, and writes the final operation result back to DDR in DMA mode. The neural network inference module performs classification inference on new samples using the trained weights and biases. The neural network training module updates the weights and biases of the neural network layer by layer, propagating backward from the last layer. The invention has the advantages that the autoencoder places no limit on the number of neural network layers supported, so it supports the inference and training of large-scale neural networks, and ping-pong operation overlaps part of the computation time with the memory-access time, giving the invention good practical significance and broad application prospects.
Description
Technical field
The present invention relates to the field of hardware implementation of artificial intelligence algorithms, and more particularly to a stacked autoencoder based on a SIMD architecture and an encoding method.
Background art
Since the development of the electronic computer began in 1941, technology has been able to create machine intelligence. The term "artificial intelligence" (Artificial Intelligence) was coined at the Dartmouth Conference in 1956; since then, researchers have developed numerous theories and principles, and the concept of artificial intelligence has expanded accordingly. Before 2007, limited by the algorithms and data of the time, artificial intelligence did not yet place especially strong demands on chips, and general-purpose CPU chips could provide sufficient computing power. Later, with the rapid development of the high-definition video and gaming industries, graphics processor (GPU) chips advanced quickly. Because a GPU has more logic units for processing data and a highly parallel organization, it outperforms a CPU in processing graphics data and complex algorithms; and because deep-learning AI models have many parameters, large data scales, and heavy computation, GPUs subsequently replaced CPUs as the mainstream AI chip. Under the great wave of artificial intelligence, many manufacturers also run machine-learning algorithms on field-programmable gate arrays (FPGAs); thanks to their high flexibility, FPGAs have a huge market in the industrial internet and industrial-robotics fields. In addition to the GPU and FPGA classes of intelligent-algorithm accelerator chips, Google has proposed the TPU, an application-specific processor designed for specific intelligent algorithms, whose chip area is smaller and power consumption lower than those of FPGAs and GPUs.
Communication networks are the foundation of the artificial-intelligence boom. With the arrival of the 5G communication era, the interconnection of all things will generate massive data, and large-scale neural networks require powerful computing capability. As an important neural network algorithm, the stacked autoencoder algorithm is widely used in application scenarios such as face recognition and geographic-information mapping. Based on a reconfigurable intelligent acceleration core, the present invention proposes a hardware implementation of the stacked autoencoder algorithm on a SIMD architecture. Compared with hardware acceleration approaches such as GPUs and FPGAs, this implementation achieves high resource utilization and fast hardware execution. As a hardware implementation of a typical intelligent algorithm, the method has good reference value and broad application prospects.
Summary of the invention
The present invention aims to overcome the above deficiencies of the prior art, effectively reduce the training time of a neural network, make full use of storage resources, and accelerate the computation speed of training and inference. To this end it provides a stacked autoencoder based on a SIMD architecture and an encoding method, realized by the following technical scheme:
The stacked autoencoder based on a SIMD architecture is based on a neural network and comprises:
a DMA interface module, which reads data from off-chip DDR in DMA mode, stores it in on-chip SRAM by partition, and writes the final operation result back to DDR in DMA mode;
a neural network inference module, which performs classification inference on new samples using the trained weights and biases; and
a neural network training module, which, after forward propagation of the training samples, back-propagates from the last layer of the neural network according to the gradient-descent algorithm and updates the weights and biases of the neural network.
In a further design of the stacked autoencoder based on a SIMD architecture, the SRAM storing each layer of the neural network contains 4N source-data banks, and the SRAM is divided into four parts of N banks each, as follows:
the first part of the SRAM stores the inputs x_j;
the second and third parts of the SRAM store the weights W_ij;
the fourth part of the SRAM stores the calculated result of each layer of the neural network;
a constant memory stores the biases b_i (a layout sketch follows).
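By way of illustration, the following minimal C++ sketch models this four-part bank organization; the constants and type names (N, BANK_DEPTH, LayerSram, constant_memory_biases) are assumptions for the example, with the bank count and depth taken from the embodiment described later.

```cpp
#include <array>
#include <cstddef>

// Four-part SRAM bank organization (names and sizes are illustrative
// assumptions; 4N = 32 banks and the 4K depth follow the embodiment below).
constexpr std::size_t N          = 8;     // banks per partition
constexpr std::size_t BANK_DEPTH = 4096;  // addresses per bank

struct Bank {
    std::array<float, 2 * BANK_DEPTH> data{};  // 64-bit addresses: 2 floats each
};

struct LayerSram {
    std::array<Bank, N> part1_inputs;   // x_j          (banks 0-7)
    std::array<Bank, N> part2_weights;  // W_ij, ping   (banks 8-15)
    std::array<Bank, N> part3_weights;  // W_ij, pong   (banks 16-23)
    std::array<Bank, N> part4_outputs;  // h_i results  (banks 24-31)
};

// Biases b_i are kept apart in a small constant memory.
std::array<float, 1024> constant_memory_biases{};
```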
Based on the above stacked autoencoder based on a SIMD architecture, a stacked autoencoding method based on a SIMD architecture is provided. The method comprises an algorithm inference process and an algorithm training process. The algorithm inference process comprises:
Step 1-1) initialize the inputs x_j of all first-layer neurons, the biases b_i, and the weights W_ij between the first-layer neurons and the first neuron of the second layer of the neural network;
Step 1-2) calculate the output of the first neuron of the second layer of the neural network according to formula (1), in which the multiply-accumulate part is completed by a 32-way parallel multiply-add tree structure; after the calculation is completed, move the weights W_ij of the second neuron into the third part of the SRAM;

    a_i = Σ_j W_ij · x_j + b_i,   h_i = s(a_i)        (1)

In formula (1), h_i denotes the calculated result of each layer of the neural network, a_i denotes the multiply-accumulate of the weights with the inputs, and s(·) denotes the sigmoid activation function;
Step 1-3) perform ping-pong operations to move in the weights, complete the output calculation of the second layer of the neural network, and store the calculated results in the fourth part of the SRAM;
Step 1-4) use the output of the second layer of the neural network as the input of the third layer, calculate the output of the third layer of the neural network, and store it in the first part of the SRAM, overwriting the previous contents;
Step 1-5) continuing with this access and calculation pattern, obtain the result of the last layer of the neural network, read the result from the SRAM, and write it back to DDR in DMA mode (a sketch of the per-neuron calculation follows);
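As an illustration of formula (1), the following minimal C++ sketch computes one neuron's output; the reduction is written as a plain loop rather than the 32-way parallel multiply-add tree used in hardware, and the function and variable names are assumptions.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sigmoid activation s(.) from formula (1).
float sigmoid(float a) { return 1.0f / (1.0f + std::exp(-a)); }

// Output of one neuron per formula (1): a_i = sum_j W_ij * x_j + b_i,
// h_i = s(a_i). The hardware computes the sum with a 32-way parallel
// multiply-add tree; a plain loop is used here for clarity.
float neuron_output(const std::vector<float>& w_i,  // weights W_ij of neuron i
                    const std::vector<float>& x,    // inputs x_j
                    float b_i)                      // bias b_i
{
    float a_i = b_i;
    for (std::size_t j = 0; j < x.size(); ++j)
        a_i += w_i[j] * x[j];
    return sigmoid(a_i);
}
```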
The algorithm training process comprises forward propagation and backpropagation. The forward propagation comprises the following steps:
Step 2-1-1) initialize the inputs x_j of the first layer, the biases b_i, and the weights W_ij of the first neuron of the second layer;
Step 2-1-2) calculate the output of the first neuron of the second layer according to a_i = Σ_j W_ij · x_j + b_i and h_i = s(a_i), where the multiply-accumulate is completed by the 32-way parallel multiply-add tree structure; after the calculation is completed, move the weights W_ij of the second neuron into the third part of the SRAM and calculate the output result of the second neuron;
Step 2-1-3) use ping-pong operations to move in the weights; when the output calculation of the 512 neurons of the second layer of the neural network is completed, store the outputs in the fourth part of the SRAM and write the data back to DDR in DMA mode;
Step 2-1-4) use the output of the second layer of the neural network as the input of the third layer, calculate the output of the third layer of the neural network, and store it in the first part of the SRAM, overwriting the previous contents;
Step 2-1-5) after completing the above steps, obtain the result of the last layer of the neural network, read the result from the SRAM, and write it back to DDR in DMA mode;
In the backpropagation, the label data is denoted Std and the error term is denoted delta. The backpropagation specifically comprises the following steps:
Step 2-2-1) read in the neural network label data Std from DDR in DMA mode, and subtract it from the calculated last-layer data of the neural network to obtain the error delta of the last layer of the neural network;
Step 2-2-2) read in the transposed weights W_ji of each neuron of the second-to-last layer of the neural network in DMA mode with ping-pong buffering, store the weights W_ji in the second and third parts of the SRAM, and update the biases and weights according to formula (2) until the weights and biases of the last layer have been updated; after the update is completed, store the results in the parts of the SRAM where the original weights and biases were located, overwriting them, and write the updated biases and weights to DDR in DMA mode;
Step 2-2-3) calculate the delta of the previous layer in the same way, calculate and update its weights and biases, and write the updated biases and weights to DDR in DMA mode;
Step 2-2-4) propagate layer by layer toward the front in this way until the weights and biases of all layers of the neural network are updated and written back to DDR, completing one training pass of the neural network (a sketch of this update follows).
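Formula (2) is not reproduced in this text, so the following C++ sketch assumes the standard gradient-descent update W_ij ← W_ij − lr·delta_i·x_j, b_i ← b_i − lr·delta_i; the function names and the learning-rate parameter are assumptions, not part of the patent.

```cpp
#include <cstddef>
#include <vector>

// Layer-by-layer update of steps 2-2-1 to 2-2-4 under the assumed
// gradient-descent rule (formula (2) itself is not reproduced here):
//   W_ij <- W_ij - lr * delta_i * x_j,   b_i <- b_i - lr * delta_i
void update_layer(std::vector<std::vector<float>>& W,  // weights W[i][j] of this layer
                  std::vector<float>& b,               // biases b_i
                  const std::vector<float>& x,         // this layer's inputs x_j,
                                                       // written back to DDR during forward propagation
                  const std::vector<float>& delta,     // error terms delta_i of this layer
                  float lr)                            // learning rate (assumed parameter)
{
    for (std::size_t i = 0; i < W.size(); ++i) {
        for (std::size_t j = 0; j < x.size(); ++j)
            W[i][j] -= lr * delta[i] * x[j];
        b[i] -= lr * delta[i];
    }
}

// The previous layer's delta (step 2-2-3) is formed with the transposed
// weights W_ji read in by ping-pong DMA:
//   delta_prev[j] = h_j * (1 - h_j) * sum_i W[i][j] * delta[i]
// where h_j * (1 - h_j) is the derivative of the sigmoid at neuron j.
```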
In a further design of the stacked autoencoding method based on a SIMD architecture, in step 1-5), if the total number of layers of the neural network is odd, the result of the last layer is read from the first part of the SRAM; if the total number of layers is even, the result of the last layer is read from the fourth part of the SRAM.

The advantages of the present invention are as follows:
There is no limit on the number of neural network layers supported by the stacked autoencoder based on a SIMD architecture of the present invention, so it supports the inference and training of large-scale neural networks; moreover, ping-pong operation overlaps part of the computation time with the memory-access time. The invention therefore has good practical significance and broad application prospects.
Brief description of the drawings
Fig. 1 is a schematic diagram of a single autoencoder in the stacked autoencoder algorithm.
Fig. 2 is a schematic diagram of multiple single autoencoders stacked into a complete stacked autoencoder.
Fig. 3 is a flow chart of the stacked autoencoding method based on a SIMD architecture.
Fig. 4 is a schematic diagram of the implementation of the calculations in the inference part and the forward-propagation part of training of the stacked autoencoder algorithm.
Fig. 5 is a schematic diagram of the storage scheme of the stacked autoencoder algorithm.
Detailed description of the embodiments
The technical scheme of the present invention is described in detail below with reference to the attached drawings.
As shown in Fig. 1, the autoencoder of this embodiment is divided into an input layer, a hidden layer, and an output layer. Stacking multiple single autoencoders forms the stacked autoencoder shown in Fig. 2, which consists of one input layer, multiple hidden layers, and one output layer; whether a Softmax classifier is needed at the end is defined according to actual requirements.
The autoencoder mainly consists of a DMA interface module, a neural network inference module, and a neural network training module. The present invention ping-pong-stores the operation results of each layer of the neural network and the weights of each neuron of each layer, maximizing resource utilization; at the same time, data transfers are carried out according to the SRAM partitions and the calculated results are consolidated, which improves the computation speed of the algorithm.
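The following C++ sketch illustrates the ping-pong weight buffering in software terms; the vector copy from ddr_weights stands in for the DMA transfer, which in hardware would run concurrently with the computation. The structure and function names are assumptions.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Ping-pong weight buffering: while the multiply-add tree consumes the
// weights of neuron i from one SRAM partition, the DMA fills the other
// partition with neuron i+1's weights, overlapping memory access with
// computation. The vector copy below stands in for that DMA transfer.
struct Layer {
    std::vector<std::vector<float>> ddr_weights;  // per-neuron weight vectors in DDR
    std::vector<float> bias;                      // biases b_i in constant memory
};

std::vector<float> layer_forward(const Layer& L, const std::vector<float>& x)
{
    std::vector<float> out(L.ddr_weights.size());
    std::vector<float> ping = L.ddr_weights.at(0);  // preload first weight vector
    std::vector<float> pong;
    bool use_ping = true;

    for (std::size_t i = 0; i < out.size(); ++i) {
        // Prefetch the next neuron's weights into the idle buffer
        // (SRAM parts 2/3); in hardware this runs alongside the loop below.
        if (i + 1 < out.size())
            (use_ping ? pong : ping) = L.ddr_weights[i + 1];

        const std::vector<float>& w = use_ping ? ping : pong;
        float a = L.bias[i];
        for (std::size_t j = 0; j < x.size(); ++j)
            a += w[j] * x[j];                    // 32-way multiply-add tree in hardware
        out[i] = 1.0f / (1.0f + std::exp(-a));   // h_i = s(a_i)

        use_ping = !use_ping;                    // swap the ping and pong buffers
    }
    return out;
}
```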
An embodiment of the present invention is described in detail below; a cycle-accurate system-integration project model was built in the SystemC language for verification. In the embodiment the neural network has 7 layers, whose neuron counts from front to back are 1024, 512, 256, 128, 256, 512, and 1024. Data such as the inputs, weights, and biases of the neural network are 32-bit floating-point numbers of the IEEE 754 standard. The computing array consists of 4 PEs (Processing Elements, each containing 4 complex multipliers, 4 complex adders, 1 real adder, 1 real multiplier, and 1 transcendental-function unit), corresponding to 32 banks; each bank is assumed to have a depth of 4K and a bank width of 64 bits, so one bank address stores 2 source data. The technical scheme of the present invention is further introduced below with this embodiment and with reference to the attached drawings.
The flow chart of the hardware algorithm implementation is shown in Fig. 3. Before the algorithm starts, the weights of all layers, together with their transposes, must first be stored in DDR so that training can update the weights. The detailed steps of the training and inference processes are as follows:
The inference process of the stacked autoencoder algorithm is as follows:
S1: initialize the inputs x_j of the 1024 neurons of the first layer, the biases b_i, and the weights W_ij between the first-layer neurons and the first of the 512 neurons of the second layer. As shown in Fig. 5, the inputs x_j are stored in banks 0-7, the weights are stored in banks 8-15, and the biases b_i are stored in the constant memory.
S2: calculate the output of the first neuron of the second layer according to a_i = Σ_j W_ij · x_j + b_i and h_i = s(a_i); the overall hardware structure of the multiply-accumulate calculation is shown in Fig. 4 and is completed by the 32-way parallel multiply-add tree structure. After the calculation is completed, move the weights W_ij of the second neuron into the third part, bank_3.
S3: move in the weights with ping-pong buffering until the output calculation of the second layer of the neural network is completed, and store it in the fourth part of the SRAM, bank_4.
S4: use the output of the second layer of the neural network as the input of the third layer, calculate the output of the third layer, and store it in the first part of the SRAM, bank_1, overwriting the previous contents.
S5: continuing with this access and calculation pattern, obtain the result of the last layer of the neural network, read the result from the SRAM, and write it back to DDR in DMA mode (if the total number of layers of the neural network is odd, it is read from the first part of the SRAM, bank_1; if the total number of layers is even, it is read from the fourth part of the SRAM, bank_4). A sketch of this loop follows.
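The following C++ sketch summarizes the S1-S5 loop, making the part-1/part-4 alternation explicit; the types and names are assumptions.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Inference loop S1-S5 (types and names assumed). Layer outputs alternate
// between SRAM part 1 (bank_1) and part 4 (bank_4), so the final result
// sits in part 1 when the total layer count is odd and in part 4 when it
// is even, matching the readback rule in S5.
struct FcLayer {
    std::vector<std::vector<float>> W;  // weights W[i][j]
    std::vector<float> b;               // biases b_i
};

std::vector<float> infer(const std::vector<FcLayer>& layers,
                         std::vector<float> x)  // input starts in part 1
{
    bool result_in_part1 = true;
    for (const FcLayer& L : layers) {
        std::vector<float> h(L.W.size());
        for (std::size_t i = 0; i < h.size(); ++i) {
            float a = L.b[i];
            for (std::size_t j = 0; j < x.size(); ++j)
                a += L.W[i][j] * x[j];
            h[i] = 1.0f / (1.0f + std::exp(-a));  // h_i = s(a_i)
        }
        x = std::move(h);
        result_in_part1 = !result_in_part1;  // output lands in the other part
    }
    // For the 7-layer example there are 6 layer transitions, so
    // result_in_part1 ends up true: read the result from bank_1, as in S5.
    return x;
}
```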
The training process of the stacked autoencoder algorithm is as follows:
The algorithm training process is divided into forward propagation and backpropagation. The only difference between forward propagation and the algorithm inference process is that the calculated result of every layer must be written back to DDR in DMA mode so that backpropagation can use it; backpropagation uses the gradient-descent algorithm.
Forward propagation:
S1: initialize the inputs x_j of the first layer, the biases b_i, and the weights W_ij of the first neuron of the second layer.
S2: calculate the output of the first neuron of the second layer according to a_i = Σ_j W_ij · x_j + b_i and h_i = s(a_i); the overall hardware structure of the multiply-accumulate calculation is shown in Fig. 4 and is completed by the 32-way parallel multiply-add tree structure. After the calculation is completed, move the weights W_ij of the second neuron into the third part, bank_3, and calculate the output result of the second neuron.
S3: move in the weights with ping-pong buffering until the outputs of the 512 neurons of the second layer of the neural network are calculated. Store them in the fourth part of the SRAM, bank_4, i.e. banks 24-31, and write the data back to DDR in DMA mode.
S4: use the output of the second layer of the neural network as the input of the third layer, calculate the output of the third layer, and store it in the first part of the SRAM, bank_1, i.e. banks 0-7, overwriting the previous contents.
S5: continuing with this access and calculation pattern, obtain the result of the last layer of the neural network, i.e. its 7th layer, read the result from the SRAM, and write it back to DDR in DMA mode. The total number of layers of the neural network in this example is 7, an odd number, so the result is read from the first part of the SRAM, bank_1, i.e. banks 0-7.
Backpropagation (gradient descent):
The label data is denoted Std and the error term is denoted delta.
S6: read in the neural network label data Std from DDR in DMA mode, and subtract it from the calculated last-layer (7th-layer) data of the neural network to obtain the error delta of the last layer of the neural network.
S7: read in the transposed weights W_ji of each neuron of the second-to-last layer of the neural network in DMA mode with ping-pong buffering, store them in the second and third parts of the SRAM, bank_2 and bank_3, and update the weights and biases of the last layer according to the bias- and weight-update method.
After the update is completed, store the results in the parts of the SRAM where the original weights and biases were located, overwriting them, and write the updated biases and weights to DDR in DMA mode.
S8: calculate the delta of the previous layer in the same way, calculate and update its weights and biases, and write them to DDR in the same way.
S9: propagate layer by layer toward the front in this way until the weights and biases of all layers of the neural network are updated and written back to DDR, completing one training pass of the neural network.
The present invention stores the inputs and weights of the stacked autoencoder algorithm in different partitions of the SRAM, so the variables required by a calculation can be accessed without conflict; through ping-pong operation and time-multiplexing of the computing resources, the calculation process of the algorithm executes quickly, greatly improving resource utilization and hardware execution speed. The implementation therefore has broad application prospects.
The foregoing is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any changes or substitutions that can be easily conceived by anyone skilled in the art within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (4)
1. A stacked autoencoder based on a SIMD architecture, based on a neural network, characterized by comprising:
a DMA interface module, which reads data from off-chip DDR in DMA mode, stores it in on-chip SRAM by partition, and writes the final operation result back to DDR in DMA mode;
a neural network inference module, which performs classification inference on new samples using the trained weights and biases; and
a neural network training module, which, after forward propagation of the training samples, back-propagates from the last layer of the neural network according to the gradient-descent algorithm and updates the weights and biases of the neural network.
2. The stacked autoencoder based on a SIMD architecture according to claim 1, characterized in that the SRAM storing each layer of the neural network contains 4N source-data banks, and the SRAM is divided into four parts of N banks each, as follows:
the first part of the SRAM stores the inputs x_j;
the second and third parts of the SRAM store the weights W_ij;
the fourth part of the SRAM stores the calculated result of each layer of the neural network;
a constant memory stores the biases b_i.
3. A stacked autoencoding method based on a SIMD architecture using the stacked autoencoder according to any one of claims 1-2, characterized by comprising an algorithm inference process and an algorithm training process, the algorithm inference process comprising:
Step 1-1) initialize the inputs x_j of all first-layer neurons, the biases b_i, and the weights W_ij between the first-layer neurons and the first neuron of the second layer of the neural network;
Step 1-2) calculate the output of the first neuron of the second layer of the neural network according to formula (1), in which the multiply-accumulate calculation is completed by the 32-way parallel multiply-add tree structure; after the calculation is completed, move the weights W_ij of the second neuron into the third part of the SRAM;

    a_i = Σ_j W_ij · x_j + b_i,   h_i = s(a_i)        (1)

In formula (1), h_i denotes the calculated result of each layer of the neural network, a_i denotes the multiply-accumulate of the weights with the inputs, and s(·) denotes the sigmoid activation function;
Step 1-3) perform ping-pong operations to move in the weights, complete the output calculation of the second layer of the neural network, and store the calculated results in the fourth part of the SRAM;
Step 1-4) use the output of the second layer of the neural network as the input of the third layer, calculate the output of the third layer of the neural network, and store it in the first part of the SRAM, overwriting the previous contents;
Step 1-5) continuing with this access and calculation pattern, obtain the result of the last layer of the neural network, read the result from the SRAM, and write it back to DDR in DMA mode;
the algorithm training process comprises forward propagation and backpropagation, the forward propagation comprising the following steps:
Step 2-1-1) initialize the inputs x_j of the first layer, the biases b_i, and the weights W_ij of the first neuron of the second layer;
Step 2-1-2) calculate the output of the first neuron of the second layer according to a_i = Σ_j W_ij · x_j + b_i and h_i = s(a_i), where the multiply-accumulate is completed by the 32-way parallel multiply-add tree structure; after the calculation is completed, move the weights W_ij of the second neuron into the third part of the SRAM and calculate the output result of the second neuron;
Step 2-1-3) use ping-pong operations to move in the weights; when the output calculation of the 512 neurons of the second layer of the neural network is completed, store the outputs in the fourth part of the SRAM and write the data back to DDR in DMA mode;
Step 2-1-4) use the output of the second layer of the neural network as the input of the third layer, calculate the output of the third layer of the neural network, and store it in the first part of the SRAM, overwriting the previous contents;
Step 2-1-5) after completing the above steps, obtain the result of the last layer of the neural network, read the result from the SRAM, and write it back to DDR in DMA mode;
in the backpropagation, the label data is denoted Std and the error term is denoted delta, and the backpropagation specifically comprises the following steps:
Step 2-2-1) read in the neural network label data Std from DDR in DMA mode, and subtract it from the calculated last-layer data of the neural network to obtain the error delta of the last layer of the neural network;
Step 2-2-2) read in the transposed weights W_ji of each neuron of the second-to-last layer of the neural network in DMA mode with ping-pong buffering, store the weights W_ji in the second and third parts of the SRAM, and update the biases and weights according to formula (2) until the weights and biases of the last layer have been updated; after the update is completed, store the results in the parts of the SRAM where the original weights and biases were located, overwriting them, and write the updated biases and weights to DDR in DMA mode;
Step 2-2-3) calculate the delta of the previous layer in the same way, calculate and update its weights and biases, and write the updated biases and weights to DDR in DMA mode;
Step 2-2-4) propagate layer by layer toward the front in this way until the weights and biases of all layers of the neural network are updated and written back to DDR, completing one training pass of the neural network.
4. The stacked autoencoding method based on a SIMD architecture according to claim 3, characterized in that in step 1-5), if the total number of layers of the neural network is odd, the result of the last layer is read from the first part of the SRAM; if the total number of layers of the neural network is even, the result of the last layer is read from the fourth part of the SRAM.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910251530.6A CN109978143B (en) | 2019-03-29 | 2019-03-29 | Stack type self-encoder based on SIMD architecture and encoding method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109978143A true CN109978143A (en) | 2019-07-05 |
CN109978143B CN109978143B (en) | 2023-07-18 |
Family
ID=67081767
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910251530.6A Active CN109978143B (en) | 2019-03-29 | 2019-03-29 | Stack type self-encoder based on SIMD architecture and encoding method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109978143B (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110138259A1 (en) * | 2009-12-03 | 2011-06-09 | Microsoft Corporation | High Performance Digital Signal Processing In Software Radios |
CN106991477A (en) * | 2016-01-20 | 2017-07-28 | 南京艾溪信息科技有限公司 | A kind of artificial neural network compression-encoding device and method |
CN108446766A (en) * | 2018-03-21 | 2018-08-24 | 北京理工大学 | A kind of method of quick trained storehouse own coding deep neural network |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022013648A1 (en) * | 2020-07-13 | 2022-01-20 | International Business Machines Corporation | Methods for detecting and monitoring bias in software application using artificial intelligence and devices thereof |
GB2611981A (en) * | 2020-07-13 | 2023-04-19 | Ibm | Methods for detecting and monitoring bias in software application using artificial intelligence and devices thereof |
US11861513B2 (en) | 2020-07-13 | 2024-01-02 | International Business Machines Corporation | Methods for detecting and monitoring bias in a software application using artificial intelligence and devices thereof |
CN114202067A (en) * | 2021-11-30 | 2022-03-18 | 山东产研鲲云人工智能研究院有限公司 | Bandwidth optimization method for convolutional neural network accelerator and related equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109978143B (en) | 2023-07-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107437096B (en) | Image classification method based on parameter efficient depth residual error network model | |
CN107239824A (en) | Apparatus and method for realizing sparse convolution neutral net accelerator | |
CN107153873A (en) | A kind of two-value convolutional neural networks processor and its application method | |
CN107832843A (en) | A kind of information processing method and Related product | |
CN110073359A (en) | Valid data for convolutional neural networks are laid out | |
CN107203808B (en) | A kind of two-value Convole Unit and corresponding two-value convolutional neural networks processor | |
CN106951395A (en) | Towards the parallel convolution operations method and device of compression convolutional neural networks | |
CN106951962A (en) | Compound operation unit, method and electronic equipment for neutral net | |
CN107169563A (en) | Processing system and method applied to two-value weight convolutional network | |
CN108416422A (en) | A kind of convolutional neural networks implementation method and device based on FPGA | |
CN107578099A (en) | Computing device and method | |
CN108256628A (en) | Convolutional neural networks hardware accelerator and its working method based on multicast network-on-chip | |
CN107209871A (en) | Convolution matrix with readjustment is multiplied to the depth tile for depth convolutional neural networks | |
CN110543939B (en) | Hardware acceleration realization device for convolutional neural network backward training based on FPGA | |
CN108416327A (en) | A kind of object detection method, device, computer equipment and readable storage medium storing program for executing | |
CN107239823A (en) | A kind of apparatus and method for realizing sparse neural network | |
CN107578095A (en) | Neural computing device and the processor comprising the computing device | |
CN107423816A (en) | A kind of more computational accuracy Processing with Neural Network method and systems | |
CN107967516A (en) | A kind of acceleration of neutral net based on trace norm constraint and compression method | |
CN107256424A (en) | Three value weight convolutional network processing systems and method | |
CN108053848A (en) | Circuit structure and neural network chip | |
CN109978143A | Stacked autoencoder based on SIMD architecture and encoding method | |
CN110163350A (en) | A kind of computing device and method | |
CN112508190A (en) | Method, device and equipment for processing structured sparse parameters and storage medium | |
KR20190089685A (en) | Method and apparatus for processing data |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |