CN107368454A - A GPU-accelerated forward substitution method for large batches of isomorphic sparse lower triangular equation systems - Google Patents
Publication number: CN107368454A (application CN201710478883.0A)
Authority: CN (China)
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Classifications

 G—PHYSICS
 G06—COMPUTING; CALCULATING; COUNTING
 G06F—ELECTRIC DIGITAL DATA PROCESSING
 G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
 G06F17/10—Complex mathematical operations
 G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
 G06F17/12—Simultaneous equations, e.g. systems of linear equations

 G—PHYSICS
 G06—COMPUTING; CALCULATING; COUNTING
 G06F—ELECTRIC DIGITAL DATA PROCESSING
 G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
 G06F17/10—Complex mathematical operations
 G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
Abstract
The invention discloses a GPU-accelerated forward substitution method for large batches of isomorphic sparse lower triangular equation systems. The method comprises the following steps: (1) on the CPU, from the LU symbolic factorization result of a series of n-order linear equation systems whose coefficient matrices share an identical sparsity structure, i.e. the sparsity structure of the lower triangular factor L1, the rows of L1 are partitioned into parallelizable layers; L1~LN share the same sparsity structure and therefore the same layering result; (2) the CPU transfers the data required for the LU forward substitution to the GPU; (3) task allocation and device memory optimization: the forward-substitution tasks for the matrices L1~LN are distributed over a large number of GPU threads for execution, and device memory is used according to the coalesced-access principle; (4) on the GPU, the layered LU forward-substitution kernel Batch_LUForward is launched layer by layer in increasing layer order. The invention improves power flow calculation speed and provides a basis for online analysis.
Description
Technical field
The invention belongs to the field of high-performance computing applications in power systems, and in particular relates to a GPU-accelerated forward substitution method for large batches of isomorphic sparse lower triangular equation systems.
Background art
Power flow calculation is the most widely used, most basic and most important electrical computation in power systems. In studies of power system operating modes and in planning, power flow calculations are needed to compare the feasibility, reliability and economy of operating modes or planned power supply schemes, and online power flow calculation is needed for real-time monitoring of the operating state of the power system. In the traditional Newton-Raphson power flow calculation, solving the correction equation systems accounts for about 70% of the power flow calculation time, so the speed of this solution step determines the overall performance of the program.
With the ever-growing integration of new energy sources, the increasing uncertainty of the grid, and the electricity market, probabilistic power flow has become an indispensable analysis tool in day-to-day power system operation. In probabilistic power flow, the core and also the most time-consuming part is the large-batch power flow calculation. Exploiting the fact that the batched power flow cases in probabilistic power flow are highly similar and share the same topology, the present invention proposes a batch-processing parallel scheme based on GPU parallelism.
The GPU is a many-core parallel processor whose number of processing units far exceeds that of the CPU. Traditional GPUs were responsible only for graphics rendering, leaving most processing to the CPU. Today's GPU has developed into a multi-core, multi-threaded programmable processor with powerful computing capability and high memory bandwidth. In the general-purpose computing model, the GPU works as a co-processor of the CPU, and high-performance computing is accomplished through reasonable task decomposition and allocation.
The batched solution of sparse lower triangular equation systems is an important part of probabilistic power flow calculation. Solving a lower triangular system is the most common operation in the solution of linear equation systems; it is the step that follows LU factorization and is usually called forward substitution. After LU symbolic factorization of the J matrix, i.e. the sparsity structure shared by the coefficient matrices of the batch of linear equation systems, the sparsity structure of the lower triangular factor L is obtained. According to the nonzero structure of L, the rows of L are partitioned into parallelizable layers; the computations of the rows within one layer are mutually independent, with no data dependence, so they can naturally be processed in parallel and are well suited to GPU acceleration. The solution of the lower triangular systems in sparse power flow can be completed through effective cooperation between the CPU and the GPU. At present, researchers at home and abroad focus on thread design for distributing the computational load, but lack in-depth study of per-thread computation patterns and data indexing schemes, so the advantages of the GPU are not fully exploited.
It is therefore highly desirable to solve the above problems.
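For reference, forward substitution for a single sparse system can be written as follows. This is a minimal sketch, assuming CSR storage with column indices sorted within each row and the diagonal of L stored as the last entry of its row; the storage layout and names are assumptions for illustration, not fixed by the invention. This single-system operation is what the invention batches across N systems on the GPU.

#include <vector>

// Solve L y = b for one sparse lower triangular matrix L (CSR layout).
void lu_forward(const std::vector<int> &rowPtr, const std::vector<int> &colIdx,
                const std::vector<double> &val, const std::vector<double> &b,
                std::vector<double> &y, int n)
{
    for (int j = 0; j < n; ++j) {
        double acc = b[j];
        int p = rowPtr[j];
        for (; colIdx[p] < j; ++p)     // off-diagonal nonzeros L(j, i), i < j
            acc -= val[p] * y[colIdx[p]];
        y[j] = acc / val[p];           // p now points at the diagonal L(j, j)
    }
}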
Summary of the invention
Object of the invention: the object of the present invention is to provide a GPU-accelerated forward substitution method for large batches of isomorphic sparse lower triangular equation systems that is suitable for the batched solution of lower triangular equation systems in probabilistic power flow calculation, improves power flow calculation speed, and provides a basis for online analysis.
Power flow calculation: a term from electrical engineering, referring to the calculation of the distribution of active power, reactive power and voltage across a power network, given the network topology, component parameters, and generation and load parameters.
GPU: graphics processing unit (English: Graphics Processing Unit, abbreviation: GPU).
Technical scheme: to achieve the above object, the invention discloses a GPU-accelerated forward substitution method for large batches of isomorphic sparse lower triangular equation systems, the method comprising the following steps:
(1) on the CPU, from the LU symbolic factorization result of a series of n-order linear equation systems whose coefficient matrices share an identical sparsity structure, i.e. the sparsity structure of the lower triangular factor L1, partition the rows of L1 into parallelizable layers; L1~LN share the same sparsity structure and the same layering result;
(2) the CPU transfers the data required for the LU forward substitution to the GPU;
(3) task allocation and device memory optimization: distribute the forward-substitution tasks for the matrices L1~LN over a large number of GPU threads for execution, and use device memory according to the coalesced-access principle;
(4) on the GPU, launch the layered LU forward-substitution kernel Batch_LUForward layer by layer in increasing layer order.
In step (1), the parallelization layering assigns the n rows of the lower triangular factor L1 to M layers. Rows belonging to the same layer are mutually independent, so their forward substitution can be carried out in parallel. The number of rows contained in layer k is L(k), where k denotes the layer number; the row numbers of all rows in layer k are stored in the mapping table Map_k.
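As an illustration (an example supplied here for clarity, not taken from the patent), suppose n = 4 and the only off-diagonal nonzeros of L1 are L(2,1), L(4,2) and L(4,3). Rows 1 and 3 depend on no earlier rows and form layer 1, so L(1) = 2 and Map_1 = {1, 3}; row 2 depends on row 1 and forms layer 2 with Map_2 = {2}; row 4 depends on rows 2 and 3 and forms layer 3 with Map_3 = {4}. The two rows of layer 1 can be forward-substituted simultaneously, after which layer 2 and then layer 3 are processed.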
Preferably, in step (2), the data required for the LU forward substitution include: the lower triangular factors L1~LN, the matrix dimension n, the parallelization layering result of L1, and the right-hand-side vectors b1~bN of the linear equation systems.
Furthermore, in step (3), the LU forward-substitution operations on the same row of the N isomorphic sparse matrices L1~LN are assigned to different threads of the same thread block. To guarantee coalesced memory access, the matrices L1~LN are stored contiguously in memory to form one matrix that is logically of N rows, which is then transposed.
Further, in step (4), the LU forward-substitution kernel on the GPU is defined as Batch_LUForward<N_blocks, N_threads>, where the thread block size N_threads is fixed at 128. When computing layer k, the number of thread blocks is N_blocks = L(k), and the total number of threads is N_blocks × N_threads; the kernel Batch_LUForward<L(k), N_threads> is launched to compute all rows belonging to layer k. The computation flow of Batch_LUForward<L(k), N_threads> is:
(4.1) CUDA automatically assigns each thread a thread block index blockID and a within-block thread index threadID;
(4.2) blockID and threadID are assigned to the variables bid and t; together, bid and t index the t-th thread of the bid-th thread block. The 128 threads of the bid-th thread block are responsible for the forward substitution of row j = Map_k(bid) of the matrices L1~LN, where the t-th thread computes the forward substitution of row j of matrix L_t, with t = threadID + m × 128 (m = 0, 1, ..., N/128);
(4.3) in the t-th thread of the bid-th thread block, it is checked whether t is less than N; if so, execution continues, otherwise the thread stops;
(4.4) the variable i is incremented from 1 to j-1; whenever L_t(j, i) ≠ 0, the j-th element y_t(j) of the forward-substitution result y_t is updated via y_t(j) = y_t(j) - y_t(i) × L_t(j, i), with y_t(j) initialized to b_t(j);
(4.5) y_t(j) is updated via y_t(j) = y_t(j) / L_t(j, j).
Beneficial effects: compared with the prior art, the present invention has the following notable advantages. First, the invention works from the LU symbolic factorization result computed on the CPU for the large batch of isomorphic Jacobian matrices, i.e. the sparse format of the lower triangular factor L1, which reduces unnecessary floating-point computation. Second, the parallelization layering of L1 is performed on the CPU and the result is transferred to the GPU, reducing the logic operations the GPU must perform. Third, the forward-substitution operations of the batched matrices are distributed over a large number of threads, and device memory usage is optimized according to the GPU memory access model, achieving coalesced memory access and improving memory operation speed. Finally, the GPU launches the layered LU forward-substitution kernel Batch_LUForward layer by layer in increasing layer order; in this combined CPU-GPU mode, the CPU controls the overall flow and handles the basic data while the GPU performs the layered forward substitution on the lower triangular factors of the sparse power flow equation systems, improving the efficiency of the LU forward substitution for power system linear equation systems and alleviating the time-consuming power flow calculation problem in power system operation analysis.
Brief description of the drawings
Fig. 1 is a flow diagram of the present invention;
Fig. 2 is the example used by the present invention;
Fig. 3 is a schematic diagram of the kernel task allocation and memory optimization of the present invention.
Embodiment
The technical scheme of the present invention is further described below with reference to the accompanying drawings.
As shown in Fig. 1, the present invention discloses a GPU-accelerated forward substitution method for large batches of isomorphic sparse lower triangular equation systems, implemented in the following steps:
Step 1: parallelization layering of the sparse matrix L on the CPU
On the CPU, from the LU symbolic factorization result of the series of isomorphic linear equation system coefficient matrices, i.e. the sparsity structure of the lower triangular factor L1, the rows of L1 are partitioned into parallelizable layers: the n rows of L1 are assigned to M layers, rows belonging to the same layer are mutually independent, and their forward substitution can be carried out in parallel. The number of rows contained in layer k is L(k), where k denotes the layer number; the row numbers of all rows in layer k are stored in the mapping table Map_k.
For the principle of parallelization layering, see "Direct Methods for Sparse Linear Systems", Timothy A. Davis, SIAM, Philadelphia, 2006, and "Parallel Algorithm Design and Architecture Optimization for Irregular Problems", Chen Xiaoming.
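A minimal host-side sketch of this layering follows (names are assumptions for illustration; the shared sparsity structure of L1 is taken in CSR form as rowPtr/colIdx). A row's layer is one more than the deepest layer among the earlier rows it depends on; layers are numbered from 0 here, while the text above numbers them from 1.

#include <vector>
#include <algorithm>

// Returns map[k] = Map_k, the list of row numbers belonging to layer k.
std::vector<std::vector<int>> levelize(const std::vector<int> &rowPtr,
                                       const std::vector<int> &colIdx, int n)
{
    std::vector<int> level(n, 0);
    int maxLevel = 0;
    for (int j = 0; j < n; ++j) {                          // rows in order: any dependency i < j is already leveled
        for (int p = rowPtr[j]; p < rowPtr[j + 1]; ++p) {
            int i = colIdx[p];
            if (i < j)                                     // off-diagonal nonzero L(j, i): row j needs y(i)
                level[j] = std::max(level[j], level[i] + 1);
        }
        maxLevel = std::max(maxLevel, level[j]);
    }
    std::vector<std::vector<int>> map(maxLevel + 1);
    for (int j = 0; j < n; ++j)
        map[level[j]].push_back(j);
    return map;
}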
Step 2: the CPU transfers the data required for the LU forward substitution to the GPU
The CPU reads the basic grid data, and the layering result of L1 is transferred to the GPU together with the basic grid data in a single transfer before the kernel is launched, reducing the data interaction between the CPU and the GPU. The required data include: the lower triangular factors L1~LN, the matrix dimension n, the parallelization layering result of L1, and the right-hand-side vectors b1~bN of the linear equation systems.
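A sketch of this one-shot transfer using the CUDA runtime API is given below. The interleaved host arrays h_val and h_b are produced as described in Step 3 below; all function and variable names and the CSR-style layout are assumptions for illustration.

#include <cuda_runtime.h>
#include <vector>

// Allocate device buffers and copy everything the kernel needs in one pass:
// the structure shared by L1..LN, the interleaved values, and b1..bN.
void uploadForwardData(const std::vector<int> &rowPtr, const std::vector<int> &colIdx,
                       const double *h_val, const double *h_b, int n, int N,
                       int **d_rowPtr, int **d_colIdx,
                       double **d_val, double **d_b, double **d_y)
{
    size_t nnz = rowPtr[n];                                   // nonzeros of the shared structure
    cudaMalloc((void **)d_rowPtr, sizeof(int) * (n + 1));
    cudaMalloc((void **)d_colIdx, sizeof(int) * nnz);
    cudaMalloc((void **)d_val, sizeof(double) * nnz * N);     // interleaved values of L1..LN
    cudaMalloc((void **)d_b, sizeof(double) * (size_t)n * N);
    cudaMalloc((void **)d_y, sizeof(double) * (size_t)n * N);
    cudaMemcpy(*d_rowPtr, rowPtr.data(), sizeof(int) * (n + 1), cudaMemcpyHostToDevice);
    cudaMemcpy(*d_colIdx, colIdx.data(), sizeof(int) * nnz, cudaMemcpyHostToDevice);
    cudaMemcpy(*d_val, h_val, sizeof(double) * nnz * N, cudaMemcpyHostToDevice);
    cudaMemcpy(*d_b, h_b, sizeof(double) * (size_t)n * N, cudaMemcpyHostToDevice);
    // The per-layer row lists Map_k are small and are copied once in the same way.
}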
Step 3: task allocation and device memory optimization
The task allocation scheme is illustrated with the forward substitution of the lower triangular matrix of dimension 8 shown in Fig. 2. The forward-substitution operations on the same row of the N isomorphic sparse matrices L1~LN are assigned to different threads of the same thread block. The concrete allocation scheme is shown in Fig. 3: the 7th thread block is responsible for computing the 7th row of the sparse matrices L1~LN. To guarantee coalesced memory access, the matrices L1~LN are stored contiguously in memory to form one matrix that is logically of N rows, which is then transposed; as shown in Fig. 3, the data read by the 32 threads of one warp in the 7th thread block then lie contiguously in memory, improving memory access speed. A sketch of this interleaving step is given below.
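A minimal sketch of the interleaving step, with assumed names: the values of L1~LN initially sit matrix by matrix (an N × nnz logical matrix); after the transpose, val[p*N + t] holds the value of L_t at structural position p, so a warp whose 32 threads handle systems t, ..., t+31 reads 32 consecutive doubles, i.e. a coalesced access.

#include <cstddef>

// Interleave the per-matrix value arrays so that the N values of one
// structural position are contiguous (valByMatrix and val are assumed names).
void interleaveValues(const double *valByMatrix,   // valByMatrix[t*nnz + p], matrix-major
                      double *val,                 // val[p*N + t], position-major
                      std::size_t nnz, int N)
{
    for (std::size_t p = 0; p < nnz; ++p)          // structural position in the shared pattern
        for (int t = 0; t < N; ++t)                // system index
            val[p * N + t] = valByMatrix[(std::size_t)t * nnz + p];
}

The right-hand-side vectors b1~bN are interleaved in the same way, so that b[j*N + t] = b_t(j).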
Step 4: on the GPU, launch the layered LU forward-substitution kernel Batch_LUForward layer by layer in increasing layer order
The LU forward-substitution kernel on the GPU is defined as Batch_LUForward<N_blocks, N_threads>, with the thread block size N_threads fixed at 128. When computing layer k, the number of thread blocks is N_blocks = L(k), and the total number of threads is N_blocks × N_threads; the kernel Batch_LUForward<L(k), N_threads> is launched to compute all rows belonging to layer k. The computation flow of Batch_LUForward<L(k), N_threads> is:
(4.1) CUDA automatically assigns each thread a thread block index blockID and a within-block thread index threadID;
(4.2) blockID and threadID are assigned to the variables bid and t; together, bid and t index the t-th thread of the bid-th thread block. The 128 threads of the bid-th thread block are responsible for the forward substitution of row j = Map_k(bid) of the matrices L1~LN, where the t-th thread computes the forward substitution of row j of matrix L_t, with t = threadID + m × 128 (m = 0, 1, ..., N/128);
(4.3) in the t-th thread of the bid-th thread block, it is checked whether t is less than N; if so, execution continues, otherwise the thread stops;
(4.4) the variable i is incremented from 1 to j-1; whenever L_t(j, i) ≠ 0, the j-th element y_t(j) of the forward-substitution result y_t is updated via y_t(j) = y_t(j) - y_t(i) × L_t(j, i), with y_t(j) initialized to b_t(j);
(4.5) y_t(j) is updated via y_t(j) = y_t(j) / L_t(j, j).
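Under the layout of the previous sketches, the per-layer kernel can be sketched in CUDA as follows. The patent fixes the kernel name Batch_LUForward, the 128-thread block size, one thread block per row of the current layer, and the thread-to-system mapping t = threadID + m × 128; the CSR arrays (rowPtr/colIdx, shared by L1~LN) and the interleaved layouts val[p*N + t], b[i*N + t], y[i*N + t] are assumptions carried over from the sketches above.

#include <cuda_runtime.h>

__global__ void Batch_LUForward(const int *rowPtr, const int *colIdx,
                                const double *val, const double *b, double *y,
                                const int *mapK,   // Map_k: rows belonging to layer k
                                int N)             // number of systems in the batch
{
    int bid = blockIdx.x;                          // one block per row of layer k
    int j = mapK[bid];                             // row handled by this thread block
    // t = threadID + m*128: thread threadID handles systems threadID, threadID+128, ...
    for (int t = threadIdx.x; t < N; t += blockDim.x) {
        double acc = b[j * N + t];                 // y_t(j) initialized to b_t(j)
        int p = rowPtr[j];
        for (; colIdx[p] < j; ++p)                 // off-diagonal nonzeros L_t(j, i), i < j
            acc -= y[colIdx[p] * N + t] * val[p * N + t];
        y[j * N + t] = acc / val[p * N + t];       // divide by the diagonal L_t(j, j)
    }
}

// Host side, launched layer by layer in increasing layer order:
//   for (int k = 0; k < M; ++k)
//       Batch_LUForward<<<Lk[k], 128>>>(d_rowPtr, d_colIdx, d_val, d_b, d_y, d_mapK[k], N);

Consecutive threads of a warp access consecutive elements of val, b and y, which is exactly the coalesced pattern prepared in Step 3.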
Claims (5)
1. A GPU-accelerated forward substitution method for large batches of isomorphic sparse lower triangular equation systems, characterized in that the method comprises the following steps:
(1) on the CPU, from the LU symbolic factorization result of a series of n-order linear equation systems whose coefficient matrices share an identical sparsity structure, i.e. the sparsity structure of the lower triangular factor L1, partitioning the rows of L1 into parallelizable layers, L1~LN sharing the same sparsity structure and the same layering result;
(2) transferring, by the CPU, the data required for the LU forward substitution to the GPU;
(3) task allocation and device memory optimization: distributing the forward-substitution tasks for the matrices L1~LN over a large number of GPU threads for execution, and using device memory according to the coalesced-access principle;
(4) on the GPU, launching the layered LU forward-substitution kernel Batch_LUForward layer by layer in increasing layer order.
2. The GPU-accelerated forward substitution method for large batches of isomorphic sparse lower triangular equation systems according to claim 1, characterized in that: in step (1), the parallelization layering assigns the n rows of the lower triangular factor L1 to M layers; rows belonging to the same layer are mutually independent, and their forward substitution can be carried out in parallel; the number of rows contained in layer k is L(k), where k denotes the layer number; the row numbers of all rows in layer k are stored in the mapping table Map_k.
3. The GPU-accelerated forward substitution method for large batches of isomorphic sparse lower triangular equation systems according to claim 1, characterized in that: in step (2), the data required for the LU forward substitution include the lower triangular factors L1~LN, the matrix dimension n, the parallelization layering result of L1, and the right-hand-side vectors b1~bN of the linear equation systems.
4. The GPU-accelerated forward substitution method for large batches of isomorphic sparse lower triangular equation systems according to claim 1, characterized in that: in step (3), the LU forward-substitution operations on the same row of the N isomorphic sparse matrices L1~LN are assigned to different threads of the same thread block; to guarantee coalesced memory access, the matrices L1~LN are stored contiguously in memory to form one matrix that is logically of N rows, which is then transposed.
5. The GPU-accelerated forward substitution method for large batches of isomorphic sparse lower triangular equation systems according to claim 1, characterized in that: in step (4), the LU forward-substitution kernel on the GPU is defined as Batch_LUForward<N_blocks, N_threads>, where the thread block size N_threads is fixed at 128; when computing layer k, the number of thread blocks is N_blocks = L(k), and the total number of threads is N_blocks × N_threads; the kernel Batch_LUForward<L(k), N_threads> is launched to compute all rows belonging to layer k; the computation flow of Batch_LUForward<L(k), N_threads> is:
(4.1) CUDA automatically assigns each thread a thread block index blockID and a within-block thread index threadID;
(4.2) blockID and threadID are assigned to the variables bid and t; together, bid and t index the t-th thread of the bid-th thread block; the 128 threads of the bid-th thread block are responsible for the forward substitution of row j = Map_k(bid) of the matrices L1~LN, wherein the t-th thread computes the forward substitution of row j of matrix L_t, with t = threadID + m × 128 (m = 0, 1, ..., N/128);
(4.3) in the t-th thread of the bid-th thread block, it is checked whether t is less than N; if so, execution continues, otherwise the thread stops;
(4.4) the variable i is incremented from 1 to j-1; whenever L_t(j, i) ≠ 0, the j-th element y_t(j) of the forward-substitution result y_t is updated via y_t(j) = y_t(j) - y_t(i) × L_t(j, i), with y_t(j) initialized to b_t(j);
(4.5) y_t(j) is updated via y_t(j) = y_t(j) / L_t(j, j).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
CN201710478883.0A | 2017-06-22 | 2017-06-22 | A GPU-accelerated forward substitution method for large batches of isomorphic sparse lower triangular equation systems
Publications (1)
Publication Number | Publication Date
CN107368454A | 2017-11-21
Family ID: 60306389
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title
CN106026107A | 2016-07-26 | 2016-10-12 | Southeast University | GPU-accelerated QR decomposition method for the power flow Jacobian matrix
CN106157176A | 2016-07-26 | 2016-11-23 | Southeast University | GPU-accelerated LU decomposition method for the power flow Jacobian matrix
CN106354479A | 2016-08-12 | 2017-01-25 | Southeast University | GPU-accelerated QR decomposition method for large numbers of isomorphic sparse matrices
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title
CN111416441A | 2020-04-09 | 2020-07-14 | Southeast University | Power grid topology analysis method based on GPU hierarchical acceleration
CN113297537A | 2021-06-04 | 2021-08-24 | Institute of Software, Chinese Academy of Sciences | High-performance method and device, oriented to GPU platforms, for solving sparse structured triangular equation systems
CN113297537B | 2021-06-04 | 2022-10-25 | Institute of Software, Chinese Academy of Sciences | High-performance method and device for solving sparse structured triangular equation systems
CN115396065A | 2022-10-26 | 2022-11-25 | Nanjing University of Posts and Telecommunications | Low-latency decoding method for sparse random linear network coding
Legal Events
Date | Code | Title | Description
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 2017-11-21