CN109725937A - Parallel processing architecture and IC chip for machine learning decorrelation operation - Google Patents

Parallel processing architecture and IC chip for machine learning decorrelation operation

Info

Publication number
CN109725937A
Authority
CN
China
Prior art keywords
decorrelation
parallel processing
processing architecture
output
channel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811435477.7A
Other languages
Chinese (zh)
Inventor
袁闻峰 (Yuan Wenfeng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Gaofeng Technology Co Ltd
Original Assignee
Guangzhou Gaofeng Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Gaofeng Technology Co Ltd filed Critical Guangzhou Gaofeng Technology Co Ltd
Priority to CN201811435477.7A priority Critical patent/CN109725937A/en
Publication of CN109725937A publication Critical patent/CN109725937A/en
Pending legal-status Critical Current

Landscapes

  • Complex Calculations (AREA)

Abstract

The invention discloses a parallel processing architecture and an integrated circuit (IC) chip for machine-learning decorrelation operations. The architecture comprises decorrelation units, each with two input channels and two output channels, which perform a decorrelation operation on the data vectors received on the input channels and deliver the results on the output channels. The parallel processing architecture comprises N-1 layers of decorrelation units, each layer containing N decorrelation units; adjacent layers are offset from one another and interconnected. The input side of the first layer receives N input data vectors, each input data vector feeding two adjacent decorrelation units or the first and last units of a layer; the output side of the last layer delivers 2N output data vectors, N being an integer greater than 2. The invention maximizes the number of output channels that can be created; it uses and realizes the maximum degree of symmetry in a pipelined fashion to carry out all necessary computations; and it provides a practical, systematic method for cross-checking the accuracy and consistency of the final output data.

Description

Parallel processing architecture and IC chip for machine learning decorrelation operation
Technical field
The present invention relates to a parallel processing architecture and an IC chip for machine-learning decorrelation operations.
Background technique
Artificial intelligence is changing every industry in an unprecedented way. It is well known that AI-related applications span a very wide range, from intelligent stock-trading software to the control systems of self-driving vehicles. "Artificial intelligence" and "machine learning" are two very popular terms that are often used interchangeably. However, under the common interpretation of machine learning, namely the view that "we can simply let machines access data and learn for themselves", machine learning should be regarded as one way of implementing artificial intelligence. Many mathematical concepts and algorithms are closely related to machine learning; among them, "singular value decomposition" (SVD) from linear algebra is perhaps the most popular and most important.
With the arrival of the big-data era, our ability to collect and acquire data has grown ever stronger. These data are characteristically high-dimensional, large-scale, and complex, and high dimensionality can seriously degrade the efficiency of data-mining algorithms. "Dimensionality reduction" has therefore become the top priority of big-data mining and machine learning, and singular value decomposition has become the key tool for dimensionality reduction. The above is a simple, non-mathematical introduction to SVD from the perspective of information retrieval and data mining. SVD also goes by other names, such as principal component analysis (PCA) and orthogonal function analysis. In short, besides being a highly useful and important mathematical tool for machine learning, SVD has found uses in many other disciplines, including psychology and sociology, weather and atmospheric science, and astronomy.
Singular value decomposition is closely related to the key mathematical concept of "eigendecomposition" (the computation of eigenvalues and eigenvectors). (Note: the singular values involved in SVD are in fact the square roots of eigenvalues.) Several well-established methods exist for computing the SVD. One of the better known is the "iterative QR algorithm". (Note: the iterative QR algorithm refers to the iterative mathematical procedure independently invented by John G. F. Francis and Vera N. Kublanovskaya at the end of the 1950s, based on the overall concept of QR decomposition. The iterative QR algorithm developed by Francis and Kublanovskaya involves iteration at a macroscopic level and should not be confused with the QR decomposition algorithms mentioned below.) Historically there have been four QR decomposition algorithms: classical Gram-Schmidt, Givens rotation, Householder transformation, and modified Gram-Schmidt. Over the past few decades, the linear algebra community has studied these four algorithms and their respective strengths and weaknesses in depth.
The modified Gram-Schmidt algorithm is generally considered the most structured, most intuitive, and most numerically stable way to make the input signal channels (that is, the column vectors of a given matrix) mutually orthogonal, which is equivalent to performing a complete set of decorrelation operations. On the other hand, modified Gram-Schmidt still has certain shortcomings: it lacks a symmetric processing architecture, which hurts computational efficiency; it requires long interconnect lines; and it cannot exploit the advantage of connecting adjacent processing units with the shortest possible communication lines throughout the processing architecture.
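For background only, the modified Gram-Schmidt procedure discussed above can be sketched in a few lines of plain Python. This is the generic textbook algorithm, not the patent's hardware procedure; the function name and list-of-columns interface are illustrative, and linearly independent columns are assumed.

```python
# A generic modified Gram-Schmidt (MGS) QR sketch using plain Python lists.
# Each column is orthogonalized in turn; every later column then has its
# component along the freshly normalized column removed ("decorrelated").

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def modified_gram_schmidt(cols):
    """Return (Q, R) with A = Q R, Q's columns orthonormal, R upper-triangular.
    `cols` is a list of linearly independent column vectors."""
    n = len(cols)
    q = [list(c) for c in cols]          # work on a copy
    r = [[0.0] * n for _ in range(n)]
    for i in range(n):
        r[i][i] = dot(q[i], q[i]) ** 0.5
        q[i] = [x / r[i][i] for x in q[i]]
        for j in range(i + 1, n):        # decorrelate the remaining columns
            r[i][j] = dot(q[i], q[j])
            q[j] = [x - r[i][j] * qi for x, qi in zip(q[j], q[i])]
    return q, r
```

The inner loop is the point of contact with this invention: each update removes the correlated component of one channel with respect to another, exactly the elementary operation the decorrelation units below perform in parallel.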
Summary of the invention
The present invention proposes a parallel processing architecture and an IC chip for machine-learning decorrelation operations, solving the prior-art problems that the absence of a symmetric processing architecture hurts computational efficiency, that long interconnect lines are required, and that the advantage of connecting adjacent processing units with the shortest possible communication lines throughout the processing architecture cannot be obtained.
The technical solution of the present invention is realized as follows:
A parallel processing architecture for machine-learning decorrelation operations, comprising:
a decorrelation unit with two input channels and two output channels, which performs a decorrelation operation on the data vectors received on the input channels and delivers the results on the output channels;
the parallel processing architecture comprises N-1 layers of decorrelation units, each layer containing N decorrelation units; adjacent layers are offset from one another and interconnected;
the input side of the first layer receives N input data vectors, each input data vector feeding two adjacent decorrelation units or the first and last decorrelation units of the layer;
the output side of the last layer delivers 2N output data vectors, N being an integer greater than 2.
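Purely as an illustration of the channel bookkeeping described above (an assumed reading of the layout, performing no arithmetic and not part of the claims), the counts implied by an N-input architecture can be written as:

```python
# Counts for the "N input, 2N output" layout described above: N-1 layers of
# N two-in/two-out decorrelation cells, with the last layer's N cells each
# contributing two output channels. Function name is illustrative only.

def architecture_shape(n):
    """Return (layers, cells_per_layer, total_cells, output_channels)."""
    assert n > 2, "N must be an integer greater than 2"
    layers = n - 1
    cells_per_layer = n
    total_cells = layers * cells_per_layer
    # each last-layer cell has two output channels: N cells x 2 = 2N vectors
    output_channels = 2 * cells_per_layer
    return layers, cells_per_layer, total_cells, output_channels
```

For example, N = 4 gives 3 layers of 4 cells (12 cells in total) and 8 output data vectors, matching the "2N outputs from N inputs" statement.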
Preferably, the left-side input data vector of the decorrelation unit is X_i^(m-1)(k), the right-side input data vector is X_j^(m-1)(k), the left-side output data vector is X_i^m(k), and the right-side output data vector is X_j^m(k),
wherein m is the decorrelation level;
i, j are the channel indices;
k indexes the specific data sample within each output-channel data sequence;
* denotes the complex conjugate;
K is the total number of data samples in each output-channel data sequence.
Preferably, each input data vector is fed to the left-side input channel of one decorrelation unit and to the right-side input channel of another decorrelation unit.
Preferably, the parallel processing architecture is a planar structure.
Preferably, the parallel processing architecture is a three-dimensional cylindrical structure.
An IC chip comprising the parallel processing architecture for machine-learning decorrelation operations according to any of the above.
The beneficial effects of the present invention are:
1) the number of output channels that can be created is maximized;
2) the maximum degree of symmetry is used and realized in a pipelined fashion to carry out all necessary computations;
3) a practical, systematic method is provided for cross-checking the accuracy and consistency of the final output data.
Detailed description of the invention
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a structural schematic diagram of one embodiment of the decorrelation unit;
Fig. 2 is a structural schematic diagram of a parallel processing architecture for machine-learning decorrelation operations according to the present invention;
Fig. 3 is a functional block diagram of the pre-processing stage, the parallel processing architecture, and the post-processing stage;
Fig. 4 is a three-dimensional structural diagram of a parallel processing architecture for machine-learning decorrelation operations according to the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention will be described below clearly and completely with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
As shown in Fig. 1, the present invention proposes a parallel processing architecture for machine-learning decorrelation operations, comprising a decorrelation unit with two input channels and two output channels, which performs a decorrelation operation on the data vectors received on the input channels and delivers the results on the output channels. In this embodiment, the decorrelation unit is called a Decorrelation Cell (DC). The decorrelation unit is the theoretical basis of the invention: its two input data vectors each contain K data points, and its two output data vectors each contain K data points; in this embodiment, K is an integer equal to N.
The parallel processing architecture comprises N-1 layers of decorrelation units, each layer containing N decorrelation units; adjacent layers are offset from one another and interconnected.
The input side of the first layer receives N input data vectors, each input data vector feeding two adjacent decorrelation units or the first and last decorrelation units of the layer.
The output side of the last layer delivers 2N output data vectors, N being an integer greater than 2.
The common calculation formula for the decorrelation operation is given below; because of differences in data normalization and other factors, the decorrelation operation may differ slightly between applications.
Let the left-side input data vector of the decorrelation unit be X_i^(m-1)(k), the right-side input data vector be X_j^(m-1)(k), the left-side output data vector be X_i^m(k), and the right-side output data vector be X_j^m(k). In its standard projection form the operation is
X_j^m(k) = X_j^(m-1)(k) - [Σ_{n=1..K} X_j^(m-1)(n)·X_i^(m-1)*(n) / Σ_{n=1..K} X_i^(m-1)(n)·X_i^(m-1)*(n)]·X_i^(m-1)(k),
with the symmetric expression (i and j exchanged) for the left-side output. The decorrelation unit generates the right-side output data channel by processing the left-side input data channel, so as to remove the correlated component between the left-side and right-side input data channels; when the processing is finished, the final right-side output data channel is decorrelated from the left-side input data channel. The left-side output data channel is generated by processing the right-side input data channel, in the course of which the correlated component between the right-side and left-side input data channels is removed; when the processing is finished, the final left-side output data channel is likewise decorrelated from the right-side input data channel. The decorrelation operation of the decorrelation unit is equivalent to an orthogonalization operation.
wherein m is the decorrelation level;
i, j are the channel indices;
k indexes the specific data sample within each output-channel data sequence;
* denotes the complex conjugate;
K is the total number of data samples in each output-channel data sequence.
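A minimal numerical sketch of one 2-in/2-out decorrelation cell, consistent with the symbol definitions above (k as the sample index, * as the complex conjugate, K samples per channel), may clarify the operation. This assumes the standard projection form; as the text notes, the exact normalization may vary between applications, and the function names are illustrative, not from the patent.

```python
# One decorrelation cell: each output channel is the opposite input channel
# with its correlated component along the other input removed.

def correlate(u, v):
    """Sum over k = 1..K of u(k) * conj(v(k))."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def decorrelation_cell(left_in, right_in):
    """Return (left_out, right_out) for one 2-in/2-out cell."""
    # right output: right input minus its projection onto the left input
    c_r = correlate(right_in, left_in) / correlate(left_in, left_in)
    right_out = [r - c_r * l for r, l in zip(right_in, left_in)]
    # left output: left input minus its projection onto the right input
    c_l = correlate(left_in, right_in) / correlate(right_in, right_in)
    left_out = [l - c_l * r for l, r in zip(left_in, right_in)]
    return left_out, right_out
```

After one pass, the right output is decorrelated from the left input and the left output from the right input, which is the per-cell behavior the description attributes to the DC.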
In the embodiment of the present invention, in the first layer of decorrelation units, each input data vector is fed to the left-side input channel of one decorrelation unit and to the right-side input channel of another decorrelation unit.
As shown in Fig. 2, the processing sequence of the invention is illustrated using the output channel X_4^3(k; 3,2,1).
The superscript "3" indicates that the output channel X_4^3(k; 3,2,1) has undergone three levels of decorrelation operations. (Note: the superscript "0" on the input-channel symbols in Fig. 2 indicates that, at the initial stage, the input channels have not yet undergone any decorrelation operation.)
The index "k" in parentheses refers to the specific data sample within each output-channel data sequence. "k" runs from 1 to K, where K is the total number of data samples in the given output-channel data sequence.
The subscript "4" and the remaining symbols of X_4^3(k; 3,2,1) indicate that this particular output channel was generated after the 4th input channel had completed all possible and necessary decorrelation operations with the other input channels.
Since the present invention concerns the development of an "N-input, 2N-output" processing architecture, each input channel corresponds to two output channels. In this illustrative example:
X_4^3(k; 3,2,1) and X_4^3(k; 1,2,3) both correspond to, in other words both evolve from, the same 4th input channel.
Theoretically, these two output channels will produce the same output data sequence. Owing to system noise and other causes, however, they will not necessarily do so. In this example, the main difference between X_4^3(k; 3,2,1) and X_4^3(k; 1,2,3) arises from the different orders in which the decorrelation operations are executed. That is, the generation of the output channel X_4^3(k; 3,2,1) begins with the decorrelation of the 4th input channel against the 3rd, whereas the generation of X_4^3(k; 1,2,3) begins with the decorrelation of the 4th input channel against the 1st, and so on. In other words, the sequence of numbers in parentheses represents the order in which the decorrelation operations occur. This numbering scheme therefore makes all the decorrelation steps in the entire processing architecture clear and unambiguous. Understanding and exploiting the order of the decorrelation operations is extremely important for specific signal-processing conditions and machine-learning applications. In addition, the numbering and index scheme described above supports the engineering development of applications that use or build on this processing-architecture invention.
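The cross-check described above can be demonstrated numerically. The sketch below (plain Python, not the patent's circuit; function and channel names are illustrative) decorrelates the same channel against the other channels in two different orders. Because each order removes the same subspace, the two results agree in exact arithmetic, which is what makes comparing paired outputs a useful consistency check when noise is present.

```python
# Decorrelating channel 4 against channels {3,2,1} versus {1,2,3}:
# the reference channels are first mutually decorrelated (Gram-Schmidt
# style), then their components are removed from the target channel.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def decorrelate(target, ref):
    """Remove from `target` its component along `ref` (one DC-style step)."""
    c = dot(target, ref) / dot(ref, ref)
    return [t - c * r for t, r in zip(target, ref)]

def decorrelate_chain(target, refs):
    """Decorrelate `target` against `refs`, processed in the given order."""
    basis = []
    for r in refs:                 # orthogonalize the references first
        for b in basis:
            r = decorrelate(r, b)
        basis.append(r)
    for b in basis:                # then strip them from the target
        target = decorrelate(target, b)
    return target
```

Running the chain with reference orders (3,2,1) and (1,2,3) on the same four channels yields outputs that match to floating-point precision and are decorrelated from every reference channel.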
As shown in Fig. 3, the input side of the parallel processing architecture of the invention is further provided with a pre-processing stage, and the output side with a post-processing stage. The pre-processing stage concerns the preparation of the input data and its suitable arrangement, while the post-processing stage concerns the conversion of the output data for particular applications. For example, singular value decomposition and QR decomposition are closely related to the present invention; this relationship comes into play in the output-data conversion tasks performed by the post-processing stage for particular applications.
As shown in Fig. 2, the parallel processing architecture of the present invention is a planar structure.
As shown in Fig. 4, the parallel processing architecture of the present invention is a three-dimensional cylindrical structure.
The parallel processing architecture of the invention can be converted between its two-dimensional and three-dimensional forms, allowing flexible deployment.
The invention also provides an IC chip comprising any of the above parallel processing architectures for machine-learning decorrelation operations. The flexible transformation between the 2D and 3D forms of the present invention offers unlimited possibilities for chip-area minimization and other chip-design optimization tasks, because the 3D cylindrical structure can be flattened, rolled, and scaled in different directions.
The present invention describes in detail a parallel, modular, and flexibly scalable computing architecture for processing data and signals in various artificial-intelligence and machine-learning applications. Unlike traditional data- or signal-processing applications, which usually involve multiple input channels but a single output channel, the present invention involves multiple input channels and multiple output channels. Many advanced, diverse applications, such as adaptive pulse-Doppler processing in real-time radar signal processing and the improvement of magnetoencephalography signal quality, require exactly this kind of multiple-input, multiple-output processing architecture. This novel parallel computing architecture is expected to provide clear guidance for next-generation artificial-intelligence chip design and to accelerate theoretical and applied innovation in the field of machine learning.
The beneficial effects of the present invention are:
1) the number of output channels that can be created is maximized;
2) the maximum degree of symmetry is used and realized in a pipelined fashion to carry out all necessary computations;
3) a practical, systematic method is provided for cross-checking the accuracy and consistency of the final output data.
The above technical solutions disclose the improvements of the present invention; technical content not disclosed in detail can be realized by those skilled in the art through the prior art.
The above are merely preferred embodiments of the present invention and are not intended to limit the invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (6)

1. A parallel processing architecture for machine-learning decorrelation operations, characterized by comprising:
a decorrelation unit with two input channels and two output channels, which performs a decorrelation operation on the data vectors received on the input channels and delivers the results on the output channels;
the parallel processing architecture comprises N-1 layers of decorrelation units, each layer containing N decorrelation units; adjacent layers are offset from one another and interconnected;
the input side of the first layer receives N input data vectors, each input data vector feeding two adjacent decorrelation units or the first and last decorrelation units;
the output side of the last layer delivers 2N output data vectors, N being an integer greater than 2.
2. The parallel processing architecture for machine-learning decorrelation operations according to claim 1, characterized in that the left-side input data vector of the decorrelation unit is X_i^(m-1)(k), the right-side input data vector is X_j^(m-1)(k), the left-side output data vector is X_i^m(k), and the right-side output data vector is X_j^m(k),
wherein m is the decorrelation level;
i, j are the channel indices;
k indexes the specific data sample within each output-channel data sequence;
* denotes the complex conjugate;
K is the total number of data samples in each output-channel data sequence.
3. The parallel processing architecture for machine-learning decorrelation operations according to claim 1, characterized in that each input data vector is fed to the left-side input channel of one decorrelation unit and to the right-side input channel of another decorrelation unit.
4. The parallel processing architecture for machine-learning decorrelation operations according to claim 1, characterized in that the parallel processing architecture is a planar structure.
5. The parallel processing architecture for machine-learning decorrelation operations according to claim 1, characterized in that the parallel processing architecture is a three-dimensional cylindrical structure.
6. An IC chip, characterized by comprising the parallel processing architecture for machine-learning decorrelation operations according to any one of claims 1-5.
CN201811435477.7A 2018-11-28 2018-11-28 Parallel processing architecture and IC chip for machine learning decorrelation operation Pending CN109725937A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811435477.7A CN109725937A (en) 2018-11-28 2018-11-28 Parallel processing architecture and IC chip for machine learning decorrelation operation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811435477.7A CN109725937A (en) 2018-11-28 2018-11-28 Parallel processing architecture and IC chip for machine learning decorrelation operation

Publications (1)

Publication Number Publication Date
CN109725937A true CN109725937A (en) 2019-05-07

Family

ID=66295877

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811435477.7A Pending CN109725937A (en) 2018-11-28 2018-11-28 Parallel processing architecture and IC chip for machine learning decorrelation operation

Country Status (1)

Country Link
CN (1) CN109725937A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110361701A * 2019-07-16 2019-10-22 Guangzhou Gaofeng Technology Co Ltd Radar anti-clutter device based on side-lobe blanking and side-lobe cancellation
CN110361702A * 2019-07-16 2019-10-22 Guangzhou Gaofeng Technology Co Ltd Processing method for radar jamming signals
CN110471883A * 2019-07-16 2019-11-19 Guangzhou Gaofeng Technology Co Ltd Artificial intelligence accelerator with cyclically symmetric decorrelation architecture


Similar Documents

Publication Publication Date Title
Raissi Forward–backward stochastic neural networks: deep learning of high-dimensional partial differential equations
CN109725937A (en) Parallel processing architecture and IC chip for machine learning decorrelation operation
Dym Linear algebra in action
CN101776934A (en) Carry generation and transfer function generator and reversible and optimal addition line design method
CN105913118A (en) Artificial neural network hardware implementation device based on probability calculation
Velinov et al. Practical application of simplex method for solving linear programming problems
Singh et al. Performance Measure of Similis and FPGrowth Algorithm
Balasubramaniam et al. Delay-range dependent stability criteria for neural networks with Markovian jumping parameters
Xu et al. Sticker DNA computer model—part I: theory
Bhagavathi et al. A time-optimal multiple search algorithm on enhanced meshes, with applications
Karavay et al. Qubit fault detection in SoC logic
CN103473599A (en) Genetic algorithm and Kalman filtering based RBFN (Radial Basis Function Networks) combined training method
CN110471883A (en) Artificial intelligence accelerator with cyclically symmetric decorrelation architecture
Gan et al. Subsequence-level entity attention lstm for relation extraction
CN102262669A (en) Fast outputting method from Chinese Pinyin to Chinese character internal code
Balogun A modified linear search algorithm
Cagnoni et al. Real-World Applications of Evolutionary Computing: EvoWorkshops 2000: EvoIASP, EvoSCONDI, EvoTel, EvoSTIM, EvoRob, and EvoFlight, Edinburgh, Scotland, UK, April 17, 2000 Proceedings
Aksoy et al. Performance evaluation of RULES-3 induction system for data mining
He et al. Fuzzy clustering method based on perturbation
CN101739565A (en) Large-capacity pattern recognition method
Dan et al. An algorithm for synthesis of quantum reversible logic circuits based on decomposition
Żelazny et al. Solving multi-objective permutation flowshop scheduling problem using CUDA
Chen et al. Quantum FPGA architecture design
Onwubolu et al. Manufacturing cell grouping using similarity coefficient-distance measure
Karthauser et al. Dynamics of coset dimensional reduction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190507