CN109725937A - Parallel processing architecture and IC chip for machine learning decorrelation operation - Google Patents
- Publication number
- CN109725937A (application CN201811435477.7A)
- Authority
- CN
- China
- Prior art keywords
- decorrelation
- parallel processing
- processing architecture
- output
- channel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a parallel processing architecture and an integrated-circuit (IC) chip for machine-learning decorrelation operations. The architecture comprises decorrelation units, each having two input channels and two output channels, which apply a decorrelation operation to the data vectors arriving on the input channels and emit the results on the output channels. The parallel processing architecture comprises N-1 layers of decorrelation units, each layer containing N units; adjacent layers are offset from one another and interconnected. The input of the first layer receives N input data vectors, each of which feeds two decorrelation units (adjacent units, or the first and last units of the layer); the output of the last layer provides 2N output data vectors, where N is an integer greater than 2. The invention maximizes the number of output channels that can be created; performs all necessary calculations in a pipelined fashion while exploiting the greatest possible degree of symmetry; and provides a practical, systematic method for cross-checking the accuracy and consistency of the final output data.
Description
Technical field
The present invention relates to a parallel processing architecture and an IC chip for machine-learning decorrelation operations.
Background technique
Artificial intelligence is transforming every industry in unprecedented ways. Its applications famously span an enormous range, from intelligent stock-trading software to the control systems of self-driving cars. "Artificial intelligence" and "machine learning" are two very popular terms that are often used interchangeably. However, under the common reading of machine learning, namely the view that "we can simply let machines touch the data and learn for themselves", machine learning is better regarded as one way of implementing artificial intelligence. Many mathematical concepts and algorithms are closely tied to machine learning, and singular value decomposition (SVD) from linear algebra is arguably the most popular and most important of them.
With the arrival of the big-data era, our ability to collect and acquire data keeps growing. These data are typically high-dimensional, large-scale, and complex. High dimensionality can seriously degrade the efficiency of data-mining algorithms, so "dimensionality reduction" has become a top priority of big-data mining and machine learning, and singular value decomposition has become the key tool of dimensionality reduction. The above is a simple, non-mathematical introduction to singular value decomposition from the perspective of information retrieval and data mining. Singular value decomposition also goes by other names, such as principal component analysis (PCA) and orthogonal functional analysis. Beyond being a highly useful and important mathematical tool for machine learning, its uses have extended to many other disciplines, including psychology and sociology, weather and atmospheric science, and astronomy.
" singular value decomposition " and the key concept " feature decomposition " (i.e. the calculating of characteristic value and feature vector) in mathematics are close
Cut phase is closed.(note: singular value relevant to " singular value decomposition " is actually the square root of characteristic value.) have at present it is several very well
Calculation method, " singular value decomposition " can be calculated.One of more known method is " iteration QR algorithm "
(IterativeQR Algorithm).(note: " iteration QR algorithm " refers to John G.F.Francis and Vera
N.Kublanovskaya in the 1950s end according to QR decompose global concept the mathematics mistake of independent invention using iteration
Journey.In addition, this is related to the iteration of some macroscopic aspects by the iteration QR algorithm of Francis and Kublanovskaya exploitation
Thought should not be obscured with following QR decomposition algorithms referred to.) there are four types of QR to decompose (QR Decomposition) calculation in history
Method, i.e. " classical Gram-Schmidt algorithm " (Classical Gram-Schmidt), " Givens rotation " (Givens
Rotation), " Householder transformation " (Householder Transformation) and " improvement Gram Schmidt calculation
Method " (Modified Gram Schmidt).In in the past few decades, linear algebra field to these four QR decomposition algorithms and
Its advantage and disadvantage conducts in-depth research." improvement Gram Schmidt algorithm " be generally considered to input signal channel (
Exactly given matrix column vector) execute " mutually orthogonal " process (being equivalent to the decorrelation operation for executing complete set) most
Structuring, the most intuitive and most stable of method of numerical value.And on the other hand, " improvement Gram Schmidt algorithm " still has certain lack
It falls into, such as causes to influence computational efficiency without symmetrical treatment framework, need using interminable tie line, it can not be by whole
In a processing framework using most short communication line connection adjacent processing units acquisition advantage etc..
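For reference, the modified Gram-Schmidt procedure discussed above can be sketched as follows. This is an illustrative implementation, not code from the patent; the function and variable names are ours.

```python
import math

def modified_gram_schmidt(cols):
    """Orthonormalize a list of column vectors (lists of floats) by
    modified Gram-Schmidt: as each new orthonormal direction is found,
    it is immediately removed from ALL remaining columns, which is
    numerically more stable than the classical variant."""
    cols = [list(c) for c in cols]  # work on copies
    q = []
    for i in range(len(cols)):
        norm = math.sqrt(sum(x * x for x in cols[i]))
        qi = [x / norm for x in cols[i]]
        q.append(qi)
        # deflate every remaining column against qi right away
        for j in range(i + 1, len(cols)):
            r = sum(a * b for a, b in zip(qi, cols[j]))
            cols[j] = [x - r * a for x, a in zip(cols[j], qi)]
    return q

# Example: orthonormalize two correlated 3-vectors
q = modified_gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
dot = sum(a * b for a, b in zip(q[0], q[1]))
print(abs(dot) < 1e-12)  # True: the outputs are mutually orthogonal
```

The immediate deflation of the remaining columns is exactly what distinguishes the modified variant from the classical one, and it is also the step whose sequential data dependence motivates the parallel architecture described below.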
Summary of the invention
The present invention proposes a parallel processing architecture and an IC chip for machine-learning decorrelation operations, solving the prior-art problems that the lack of a symmetric processing architecture hurts computational efficiency, that long interconnect lines are required, and that the advantage of connecting adjacent processing units with the shortest communication lines throughout the processing architecture cannot be obtained.
The technical scheme of the present invention is realized as follows:
A parallel processing architecture for machine-learning decorrelation operations, comprising:
a decorrelation unit with two input channels and two output channels, which applies a decorrelation operation to the data vectors arriving on the input channels and emits the results on the output channels;
the parallel processing architecture comprises N-1 layers of decorrelation units, each layer containing N decorrelation units; adjacent layers are offset from one another and interconnected;
the input of the first layer receives N input data vectors, each of which feeds two decorrelation units (adjacent units, or the first and last units of the layer);
the output of the last layer provides 2N output data vectors, where N is an integer greater than 2.
Preferably, the left input data vector of the decorrelation unit is X_i^(m-1)(k), the right input data vector is X_j^(m-1)(k), the left output data vector is X_i^m(k), and the right output data vector is X_j^m(k),
where m is the decorrelation level;
i, j are channel indices;
k indexes the individual data samples in each output-channel data sequence;
* denotes the complex conjugate;
K is the total number of data samples in each output-channel data sequence.
Preferably, each input data vector feeds the left input channel of one decorrelation unit and the right input channel of another decorrelation unit.
Preferably, the parallel processing architecture is a planar structure.
Preferably, the parallel processing architecture is a three-dimensional cylindrical structure.
An IC chip comprising any of the above parallel processing architectures for machine-learning decorrelation operations.
The beneficial effects of the present invention are:
1) the number of output channels that can be created is maximized;
2) all necessary calculations are performed in a pipelined fashion, exploiting and realizing the greatest possible degree of symmetry;
3) a practical, systematic method is provided for cross-checking the accuracy and consistency of the final output data.
Detailed description of the invention
To explain the embodiments of the invention or the technical solutions of the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Apparently, the drawings described below show only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of one embodiment of the decorrelation unit;
Fig. 2 is a schematic structural diagram of a parallel processing architecture for machine-learning decorrelation operations according to the present invention;
Fig. 3 is a functional block diagram of the pre-processing stage, the parallel processing architecture, and the post-processing stage;
Fig. 4 is a three-dimensional structural diagram of a parallel processing architecture for machine-learning decorrelation operations according to the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention will now be described clearly and completely with reference to the accompanying drawings. Apparently, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
As shown in Fig. 1, the invention proposes a parallel processing architecture for machine-learning decorrelation operations, comprising decorrelation units, each with two input channels and two output channels, which apply a decorrelation operation to the data vectors arriving on the input channels and emit the results on the output channels. In this embodiment, a decorrelation unit is called a Decorrelation Cell, abbreviated DC. The decorrelation unit is the theoretical foundation of the invention: its two input data vectors each have K data points and its two output data vectors each have K data points; in this embodiment, K is an integer equal to N.
The parallel processing architecture comprises N-1 layers of decorrelation units, each layer containing N decorrelation units; adjacent layers are offset from one another and interconnected.
The input of the first layer receives N input data vectors, each of which feeds two decorrelation units (adjacent units, or the first and last units of the layer).
The output of the last layer provides 2N output data vectors, where N is an integer greater than 2.
A common calculation formula for the decorrelation operation is given below; because of differences in data normalization and other factors, the decorrelation operation may vary slightly from application to application.
Suppose the left input data vector of a decorrelation unit is X_i^(m-1)(k) and the right input data vector is X_j^(m-1)(k), with left output data vector X_i^m(k) and right output data vector X_j^m(k). The decorrelation unit generates the right output data channel by processing the left input data channel, so as to remove the component correlated between the left and right input data channels; when processing ends, the final right output data channel is decorrelated from the right input data channel. Likewise, the left output data channel is generated by processing the right input data channel; in that process, the component correlated between the right and left input data channels is removed, and when processing ends the final left output data channel is also decorrelated from the left input data channel. The decorrelation operation of the decorrelation unit is equivalent to an orthogonalization operation.
Here, m is the decorrelation level;
i, j are channel indices;
k indexes the individual data samples in each output-channel data sequence;
* denotes the complex conjugate;
K is the total number of data samples in each output-channel data sequence.
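The projection arithmetic described in words above can be sketched as follows. This is an illustrative implementation: the patent's exact normalization may differ, and the helper names (`decorrelate_pair`, `project_out`) are ours.

```python
def decorrelate_pair(left, right):
    """One 2-in/2-out decorrelation cell. Each output is one input with
    the component correlated with the OTHER input projected out, using
    the complex conjugate and a sum over all K samples. Returns
    (left_out, right_out): left_out is decorrelated from the left
    input, right_out from the right input."""
    def project_out(target, ref):
        # correlation coefficient: sum_k target[k]*conj(ref[k]) / sum_k |ref[k]|^2
        num = sum(t * r.conjugate() for t, r in zip(target, ref))
        den = sum(abs(r) ** 2 for r in ref)
        c = num / den
        return [t - c * r for t, r in zip(target, ref)]

    return project_out(right, left), project_out(left, right)

# The outputs are orthogonal to the opposite-side inputs:
left = [1 + 0j, 2 + 0j, 3 + 0j]
right = [1 + 0j, 1 + 0j, 1 + 0j]
l_out, r_out = decorrelate_pair(left, right)
xcorr = sum(a * b.conjugate() for a, b in zip(l_out, left))
print(abs(xcorr) < 1e-12)  # True
```

Chaining such cells so that every channel is eventually projected against every other channel yields the full orthogonalization the architecture performs.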
In an embodiment of the invention, in the first layer of decorrelation units, each input data vector feeds the left input channel of one decorrelation unit and the right input channel of another decorrelation unit.
As shown in Fig. 2, the processing order of the invention is illustrated with the output channel X_4^3(k; 3,2,1):
The superscript "3" indicates that the output channel X_4^3(k; 3,2,1) has passed through three levels of decorrelation. (Note: the superscript "0" on the input-channel symbols in Fig. 2 marks the initial stage of an input channel, before any decorrelation operation has been performed.)
The argument "k" in parentheses indexes the individual data samples in each output-channel data sequence; "k" runs from 1 to K, where K is the total number of data samples in a given output-channel data sequence.
The subscript "4", together with the remaining symbols of X_4^3(k; 3,2,1), indicates that this particular output channel was generated by carrying out all possible and necessary decorrelation operations between the fourth input channel and the other input channels.
Since the invention concerns the development of an "N-input, 2N-output" processing architecture, each input channel corresponds to two output channels. In the present example, X_4^3(k; 3,2,1) and X_4^3(k; 1,2,3) both correspond to, in other words both evolve from, the same fourth input channel.
In theory these two output channels should produce the same output data sequence, but because of system noise and other causes they may not produce identical sequences. In this example, the main difference between X_4^3(k; 3,2,1) and X_4^3(k; 1,2,3) arises from the different orders in which they perform the decorrelation operations. That is, the generation of the output channel X_4^3(k; 3,2,1) starts from the decorrelation of the fourth input channel against the third, whereas the generation of X_4^3(k; 1,2,3) starts from the decorrelation of the fourth input channel against the first, and so on. In other words, the number sequence in parentheses records the order in which the decorrelation operations occur. This numbering scheme makes every decorrelation step in the entire processing architecture clear and unambiguous. Understanding and exploiting the order of the decorrelation operations is extremely important for specific signal-processing conditions and machine-learning applications. In addition, the numbering and coefficient scheme described above also supports the engineering development of applications that use or involve this processing-architecture invention.
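Under the assumption that each input channel's two descendant output channels are decorrelated against the remaining channels in descending and ascending cyclic order respectively (the exact layer-by-layer interconnect is fixed by Fig. 2, which we cannot see here), the label bookkeeping for the "N-input, 2N-output" ring can be sketched as:

```python
def output_labels(n):
    """For an n-input / 2n-output ring architecture, return, for each
    input channel i (1..n), the two operation-order labels of its two
    output channels: one decorrelating against cyclic neighbours in
    descending order, one in ascending order (n-1 layers, one
    neighbour per layer). Assumed wiring, for illustration only."""
    def wrap(c):
        return (c - 1) % n + 1  # map any integer onto channels 1..n

    labels = {}
    for i in range(1, n + 1):
        down = tuple(wrap(i - s) for s in range(1, n))  # e.g. 4 -> 3,2,1
        up = tuple(wrap(i + s) for s in range(1, n))    # e.g. 4 -> 1,2,3
        labels[i] = (down, up)
    return labels

# For N=4, channel 4 yields exactly the two labels discussed in the text:
lab = output_labels(4)
print(lab[4])  # ((3, 2, 1), (1, 2, 3))
```

Each of the 2N labels lists every other channel exactly once, which matches the statement that each output channel has undergone all necessary decorrelation operations after N-1 levels.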
As shown in Fig. 3, the input of the parallel processing architecture of the invention is preceded by a pre-processing stage, and its output is followed by a post-processing stage. The pre-processing stage concerns the preparation of the input data and its proper arrangement, while the post-processing stage concerns the conversion of the output data for a particular application. For example, singular value decomposition and QR decomposition are closely related to the present invention; their relationship manifests in the output-data conversion tasks performed by the post-processing stage for particular applications.
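One way a post-processing stage could relate the architecture's orthogonalized outputs to a QR factorization is sketched below. This is an assumption-laden illustration, not the patent's method: `recover_r` and the sample vectors are ours, and we assume the decorrelation pipeline has produced orthonormal columns.

```python
def recover_r(q_cols, a_cols):
    """Post-processing sketch: if decorrelation yields orthonormal
    columns q_cols spanning the same space as the original columns
    a_cols, then R[i][j] = <q_i, a_j> is the triangular factor with
    A = Q R (the inner products above the diagonal vanish)."""
    return [[sum(qv * av for qv, av in zip(q, a)) for a in a_cols]
            for q in q_cols]

# Orthonormal pair q1, q2 and columns a1 = 2*q1, a2 = q1 + 3*q2:
q_cols = [[0.6, 0.8], [-0.8, 0.6]]
a_cols = [[1.2, 1.6], [-1.8, 2.6]]
r = recover_r(q_cols, a_cols)
# r is numerically [[2.0, 1.0], [0.0, 3.0]]: upper triangular, A = Q R
```

In this reading, the parallel architecture supplies Q and the (cheap) post-processing inner products supply R, which is what connects the invention to QR-based algorithms such as the iterative QR algorithm for SVD.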
As shown in Fig. 2, the parallel processing architecture of the present invention may be a planar structure.
As shown in Fig. 4, the parallel processing architecture of the present invention may also be a three-dimensional cylindrical structure.
The parallel processing architecture of the invention can be converted between its two-dimensional and three-dimensional forms, allowing flexible deployment.
The invention also provides an IC chip comprising any of the above parallel processing architectures for machine-learning decorrelation operations. This flexible transformation between the 2D and 3D forms offers rich possibilities for minimizing chip area and for other chip-design optimization tasks, because the 3D cylindrical structure can be flattened, rolled, and scaled in different directions.
The present invention describes in detail a parallel, modular, and flexibly scalable computing architecture for processing the data and signals in a wide variety of artificial-intelligence and machine-learning applications. Unlike traditional data- or signal-processing applications, which usually involve multiple input channels but a single output channel, the present invention involves multiple input channels and multiple output channels. Many advanced and diverse applications, such as adaptive pulse-Doppler processing for real-time radar signal processing and the improvement of magnetoencephalography signal quality, require exactly this kind of "multiple-input, multiple-output" processing architecture. This novel parallel computing architecture is expected to provide clear guidance for next-generation artificial-intelligence chip design and to accelerate theoretical and application innovation in the field of machine learning.
The beneficial effects of the present invention are:
1) the number of output channels that can be created is maximized;
2) all necessary calculations are performed in a pipelined fashion, exploiting and realizing the greatest possible degree of symmetry;
3) a practical, systematic method is provided for cross-checking the accuracy and consistency of the final output data.
The technical solution above discloses the improvements of the invention; technical content not disclosed in detail can be realized by those skilled in the art through the prior art.
The above are merely preferred embodiments of the present invention and are not intended to limit the invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the invention shall be included in the protection scope of the present invention.
Claims (6)
1. A parallel processing architecture for machine-learning decorrelation operations, characterized by comprising:
a decorrelation unit with two input channels and two output channels, which applies a decorrelation operation to the data vectors arriving on the input channels and emits the results on the output channels;
the parallel processing architecture comprises N-1 layers of decorrelation units, each layer containing N decorrelation units; adjacent layers are offset from one another and interconnected;
the input of the first layer receives N input data vectors, each of which feeds two decorrelation units (adjacent units, or the first and last units of the layer);
the output of the last layer provides 2N output data vectors, where N is an integer greater than 2.
2. The parallel processing architecture for machine-learning decorrelation operations according to claim 1, characterized in that: the left input data vector of the decorrelation unit is X_i^(m-1)(k), the right input data vector is X_j^(m-1)(k), the left output data vector is X_i^m(k), and the right output data vector is X_j^m(k),
where m is the decorrelation level;
i, j are channel indices;
k indexes the individual data samples in each output-channel data sequence;
* denotes the complex conjugate;
K is the total number of data samples in each output-channel data sequence.
3. The parallel processing architecture for machine-learning decorrelation operations according to claim 1, characterized in that: each input data vector feeds the left input channel of one decorrelation unit and the right input channel of another decorrelation unit.
4. The parallel processing architecture for machine-learning decorrelation operations according to claim 1, characterized in that: the parallel processing architecture is a planar structure.
5. The parallel processing architecture for machine-learning decorrelation operations according to claim 1, characterized in that: the parallel processing architecture is a three-dimensional cylindrical structure.
6. An IC chip, characterized by comprising the parallel processing architecture for machine-learning decorrelation operations according to any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811435477.7A CN109725937A (en) | 2018-11-28 | 2018-11-28 | Parallel processing architecture and IC chip for machine learning decorrelation operation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109725937A | 2019-05-07
Family
ID=66295877
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811435477.7A Pending CN109725937A (en) | 2018-11-28 | 2018-11-28 | Parallel processing architecture and IC chip for machine learning decorrelation operation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109725937A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110361701A (en) * | 2019-07-16 | 2019-10-22 | 广州市高峰科技有限公司 | Radar electronic anti-clutter device based on side-lobe blanking and side-lobe cancellation
CN110361702A (en) * | 2019-07-16 | 2019-10-22 | 广州市高峰科技有限公司 | Processing method for radar jamming signals
CN110471883A (en) * | 2019-07-16 | 2019-11-19 | 广州市高峰科技有限公司 | Artificial intelligence accelerator with cyclically symmetric decorrelation architecture
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190507 |