CN106855918A - Principal component analysis method for processing large-scale matrix data - Google Patents

Principal component analysis method for processing large-scale matrix data

Info

Publication number
CN106855918A
Authority
CN
China
Prior art keywords
matrix
data
principal component
calculate
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611153472.6A
Other languages
Chinese (zh)
Inventor
喻文健
谷昱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201611153472.6A priority Critical patent/CN106855918A/en
Publication of CN106855918A publication Critical patent/CN106855918A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16ZINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS, NOT OTHERWISE PROVIDED FOR
    • G16Z99/00Subject matter not provided for in other main groups of this subclass

Landscapes

  • Complex Calculations (AREA)

Abstract

The present invention proposes a principal component analysis method for processing large-scale matrix data, comprising: generating a random matrix Ω; computing matrices G and H from the original data matrix A; initializing the variable j = 1 and the m × l matrix Q and l × n matrix B as zero matrices; letting G[j,j+b] and Ω[j,j+b] denote columns j through j+b of G and Ω respectively and, when j > 1, computing G[j,j+b] - QBΩ[j,j+b] and overwriting G[j,j+b] with the result; performing a reduced QR decomposition of G[j,j+b] to obtain a column-orthonormal matrix Q[j,j+b] and an upper triangular square matrix R; if j > 1, performing a reduced QR decomposition of Q[j,j+b] - Q(Q^T Q[j,j+b]), overwriting Q[j,j+b] with the resulting column-orthonormal matrix, and overwriting R with the product of the resulting upper triangular matrix and R; letting H[j,j+b] denote columns j through j+b of H and computing the matrix B_temp, by one formula if j = 1 and by another otherwise [formulas not reproduced in this text]; assigning the value j+b+1 to the variable j; if j ≤ l, returning to step four, and otherwise performing the next step; and performing a singular value decomposition of B to obtain the first k principal component vectors and the corresponding singular values. The invention suits a wide range of big data analysis scenarios and offers high computational efficiency and practicality.

Description

Principal component analysis method for processing large-scale matrix data
Technical field
The present invention relates to the technical field of big data analysis, and in particular to a principal component analysis method for processing large-scale matrix data.
Background
Principal component analysis (PCA) is a commonly used data analysis method. PCA uses matrix computations to extract a set of principal basis vectors (the principal feature components) of the original data in a linear space, then projects the original data onto this basis to reduce the dimension of high-dimensional data. The reduced data can be further clustered or classified to support artificial intelligence applications such as feature extraction, automatic classification, and recognition. As an important unsupervised learning method, PCA is now widely used in applications related to data mining and machine learning.
In practice, data can often be expressed as a matrix. Without loss of generality, regard each data item as one row of a matrix A, so that the number of columns equals the dimension of each data item. The goal of principal component analysis is to find several principal feature components of the original data, which can be obtained by eigenvalue decomposition or singular value decomposition of a matrix. The eigenvalue-based method first computes A^T A and then performs its eigenvalue decomposition; the eigenvectors belonging to the largest eigenvalues are the required "principal components". The method based on singular value decomposition decomposes A directly: A = UΣV^T, where U and V are orthogonal matrices and Σ is a diagonal matrix whose diagonal entries are arranged in descending order; the first several columns of V are the required "principal components". If the data dimension is not very high, i.e. A has far fewer columns than rows, the eigenvalue-based method is the more efficient one, because the matrix A^T A it processes is of small order.
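The equivalence of the two routes described above can be checked numerically. The following is a minimal sketch in Python with NumPy; the matrix sizes, the random test data, and all variable names are illustrative assumptions, and no mean-centering is applied because the text works with A directly.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 6))   # 200 data items (rows), 6 features (columns)
k = 2                               # number of principal components wanted

# Route 1: eigenvalue decomposition of the small n x n matrix A^T A.
evals, evecs = np.linalg.eigh(A.T @ A)        # eigh returns ascending eigenvalues
order = np.argsort(evals)[::-1][:k]           # indices of the k largest
V_eig = evecs[:, order]
sigma_eig = np.sqrt(evals[order])             # singular values of A

# Route 2: singular value decomposition of A itself.
U, sigma, Vt = np.linalg.svd(A, full_matrices=False)
V_svd = Vt[:k].T

# The two sets of principal components agree up to the sign of each column.
agreement = np.abs(np.sum(V_eig * V_svd, axis=0))
```

Both routes give the same directions; the first only ever factorizes an n × n matrix, which is why it is attractive when n is much smaller than m.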
On the other hand, with the rapid development of mobile devices, the Internet, sensor networks, and genetic engineering, data sources have diversified and data volumes have grown exponentially; this is the so-called "big data" era. How to store, analyze, and manage ever-growing data sets within acceptable time and space limits is a problem facing traditional data processing. Research shows that 85% of current data can be expressed as numeric data (ordinary integers and real numbers) either directly or after conversion, and the "table" structures that hold numeric data in databases are usually treated as matrices. It has therefore become extremely important to develop effective "big matrix" analysis methods that account for how such data are generated, stored, and used. Specifically, because the data are so large, they may be stored in a distributed fashion (on different computer nodes of a network), or stored on a hard disk without fitting into memory in full (because of memory capacity limits). In other application scenarios the data may be produced and acquired gradually as a "data stream", which does not suit the traditional store-then-compute approach. Traditional principal component analysis requires an eigenvalue decomposition or singular value decomposition of the whole matrix, and the underlying algorithms read the matrix elements repeatedly (to compute the first k principal components, the matrix must be read in full at least k times). They are clearly unsuitable for big data whose read cost is as high as in the scenarios above.
Against this background, randomized matrix computation methods, including algorithms for eigenvalue decomposition and singular value decomposition, have attracted wide attention in recent years. The paper N. Halko, P.-G. Martinsson and J. A. Tropp, "Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions", SIAM Review, 53 (2011), no. 2, pp. 217-288 (hereinafter SIAM2011) proposes a randomized singular value decomposition algorithm that needs fewer passes over the matrix data. The method multiplies the original matrix A by a random matrix with only k columns to obtain a k-dimensional subspace of the column space of A, then computes an orthonormal basis matrix Q of that subspace and the approximate factorization A ≈ QB, where B is a matrix with only k rows. Finally, a conventional singular value decomposition of the smaller matrix B yields approximations of the first k singular values of A and the corresponding left and right singular vectors. SIAM2011 also analyzes the accuracy of this approximation theoretically, showing that with very high probability the error stays within a small bound, and proposes several techniques for improving the accuracy of the result.
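The basic randomized scheme summarized above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the reference implementation from SIAM2011; the function name `randomized_svd`, the Gaussian choice of random matrix, and the exactly rank-5 test matrix are assumptions made for the example. The comments mark the two places where A is read.

```python
import numpy as np

def randomized_svd(A, k, rng):
    """Approximate the leading k singular triplets of A via A ≈ Q B."""
    n = A.shape[1]
    Omega = rng.standard_normal((n, k))   # random matrix with only k columns
    Y = A @ Omega                         # first pass over A
    Q, _ = np.linalg.qr(Y)                # reduced QR: Q is m x k, orthonormal columns
    B = Q.T @ A                           # second pass over A; B has only k rows
    U_b, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Q @ U_b, s, Vt                 # left vectors, singular values, right vectors

rng = np.random.default_rng(1)
# Test matrix of exact rank 5, so k = 5 captures it up to rounding error.
A = rng.standard_normal((300, 5)) @ rng.standard_normal((5, 40))
U, s, Vt = randomized_svd(A, k=5, rng=rng)
rel_err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
```

The product `Q.T @ A` is the second read of A that the method described in this document sets out to avoid.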
It should be noted that although the method of SIAM2011 greatly reduces the number of passes over the matrix elements compared with traditional singular value decomposition algorithms, it still needs at least two passes. Its computational efficiency therefore leaves room for improvement, and it cannot meet the requirements of processing streaming big data.
Summary of the invention
The present invention aims to solve at least one of the above technical problems.
To this end, an object of the invention is to provide a principal component analysis method for processing large-scale matrix data that suits a wide range of big data analysis scenarios and offers high computational efficiency and practicality.
To achieve this object, an embodiment of the invention discloses a principal component analysis method for processing large-scale matrix data, comprising the following steps:
S1: Generate in memory a random matrix Ω with n rows and l columns.
S2: Take the original data matrix A, compute matrices G and H from it, and store G and H in memory, where G = AΩ, H = A^T G, and A is an m × n matrix.
S3: Initialize the variable j = 1, and initialize the m × l matrix Q and the l × n matrix B as zero matrices.
S4: Let G[j,j+b] and Ω[j,j+b] denote columns j through j+b of G and Ω respectively. When j > 1, compute G[j,j+b] - QBΩ[j,j+b] and overwrite G[j,j+b] with the result, where b is a nonnegative integer not greater than l-j.
S5: Perform a reduced QR decomposition of G[j,j+b] to obtain an m × (b+1) column-orthonormal matrix Q[j,j+b] and an upper triangular square matrix R, where Q[j,j+b] is stored in columns j through j+b of Q.
S6: If j > 1, perform a reduced QR decomposition of Q[j,j+b] - Q(Q^T Q[j,j+b]), overwrite Q[j,j+b] with the resulting m × (b+1) column-orthonormal matrix, multiply the resulting upper triangular matrix by R, and overwrite R with the product.
S7: Let H[j,j+b] denote columns j through j+b of H. If j = 1, compute [formula not reproduced in this text]; otherwise compute [formula not reproduced in this text]. The result is a (b+1) × n matrix B_temp, which is stored in rows j through j+b of B.
S8: Assign the value j+b+1 to the variable j.
S9: If j ≤ l, return to S4; otherwise go to S10.
S10: Perform a singular value decomposition of B: B = UΣV^T, where the first k columns of V are the first k principal component vectors and the first k diagonal entries of Σ are the corresponding singular values.
In addition, the principal component analysis method for processing large-scale matrix data according to the above embodiment may have the following additional technical features:
In some examples, in S1, the parameter l is an integer at least 5 greater than k.
In some examples, S1 further comprises: S11: generating an n × l random matrix Ω with random number generator software; S12: initializing the variable i = 0 and choosing the variable P as a nonnegative integer less than 10; S13: if i = P, ending execution, and otherwise continuing with S14; S14: computing the matrix product AΩ, performing a reduced QR decomposition of the result, and assigning the resulting m × l column-orthonormal matrix to G; S15: computing the matrix product A^T G, performing a reduced QR decomposition of the result, and assigning the resulting n × l column-orthonormal matrix to Ω; S16: incrementing i by 1 and returning to S13.
In some examples, in S2, depending on how the original data matrix A is produced or where it comes from, the matrices G = AΩ and H = A^T G are computed by traversing the elements of A once.
In some examples, S2 further comprises: S21: allocating a two-dimensional array in memory to store the n × l matrix H and initializing its entries to 0; S22: obtaining a preset number of rows of the original data matrix A and storing them in memory, letting these rows form an s × n matrix A_i, and computing the matrix product G_i = A_i Ω, where G_i occupies the corresponding rows of G; S23: computing H + A_i^T G_i and assigning the result to H; S24: checking whether all rows of A have been obtained, and if so stopping, otherwise returning to S22.
The principal component analysis method for processing large-scale matrix data according to embodiments of the invention builds on the existing randomized singular value decomposition algorithm but, through its improvements, reduces the number of passes the main part of the algorithm makes over the data matrix from two to one while keeping the accuracy of the original algorithm unchanged. It therefore suits a wide range of big data analysis scenarios and offers high computational efficiency and practicality.
Additional aspects and advantages of the invention are set forth in part in the description below; in part they will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of embodiments taken together with the accompanying drawing, in which:
Fig. 1 is a flow chart of the principal component analysis method for processing large-scale matrix data according to an embodiment of the invention.
Detailed description of the embodiments
Embodiments of the invention are described in detail below; an example of the embodiments is shown in the accompanying drawing.
The principal component analysis method for processing large-scale matrix data according to an embodiment of the invention is described below with reference to the drawing.
Fig. 1 is a flow chart of the principal component analysis method for processing large-scale matrix data according to one embodiment of the invention. As shown in Fig. 1, the method comprises the following steps:
Step S1: Generate in memory a random matrix Ω with n rows and l columns, where the parameter l is an integer at least 5 greater than k and k is the number of principal components sought.
In one embodiment of the invention, the elements of the random matrix Ω are uniform random numbers or standard normal random numbers, or are obtained in a more elaborate way; the possibilities are not enumerated here.
In one embodiment of the invention, step S1 further comprises:
S11: Generate an n × l random matrix Ω with random number generator software.
S12: Initialize the variable i = 0; the variable P is a nonnegative integer less than 10.
S13: If i = P, end execution; otherwise continue with step S14.
S14: Compute the matrix product AΩ, perform a reduced QR decomposition of the result, and assign the resulting m × l column-orthonormal matrix to G.
S15: Compute the matrix product A^T G, perform a reduced QR decomposition of the result, and assign the resulting n × l column-orthonormal matrix to Ω.
S16: Increment i by 1 and return to S13. These iterations improve the accuracy of the result.
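The loop S11 to S16 can be sketched as below. This is a minimal illustration in Python with NumPy under the assumption that A is available as an in-memory array (each iteration multiplies by A and by A^T); the function name `refine_omega` and the test sizes are hypothetical.

```python
import numpy as np

def refine_omega(A, l, P, rng):
    """Run P rounds of the S14/S15 iteration and return the refined Omega."""
    n = A.shape[1]
    Omega = rng.standard_normal((n, l))        # S11: n x l random matrix
    for _ in range(P):                         # S12, S13, S16: loop counter i = 0..P-1
        G, _ = np.linalg.qr(A @ Omega)         # S14: m x l, orthonormal columns
        Omega, _ = np.linalg.qr(A.T @ G)       # S15: n x l, orthonormal columns
    return Omega

rng = np.random.default_rng(5)
A = rng.standard_normal((100, 30))
Omega_refined = refine_omega(A, l=8, P=2, rng=rng)
Omega_plain = refine_omega(A, l=8, P=0, rng=rng)   # P = 0 skips the iteration
```

With P = 0 the function simply returns the initial random matrix, which corresponds to S13 terminating immediately.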
Step S2: Take the original data matrix A, compute matrices G and H from it, and store G and H in memory, where G = AΩ, H = A^T G, and A is an m × n matrix in which each row represents one data item, m rows in total.
In one embodiment of the invention, in step S2, depending on how the original data matrix A is produced or where it comes from, the matrices G = AΩ and H = A^T G are computed by traversing the elements of A once.
On this basis, step S2 further comprises:
S21: Allocate a two-dimensional array in memory to store the n × l matrix H, and initialize its entries to 0.
S22: Obtain a preset number of rows of the original data matrix A and store them in memory; let these rows form an s × n matrix A_i, and compute the matrix product G_i = A_i Ω, where G_i occupies the corresponding rows of G.
S23: Compute H + A_i^T G_i and assign the result to (overwrite) H.
S24: Check whether all rows of the original data matrix A have been obtained. If so, stop; otherwise return to step S22.
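Steps S21 to S24 amount to accumulating G and H block by block, since H = A^T G equals the sum of A_i^T G_i over the row blocks. A minimal sketch in Python with NumPy follows; the function name `accumulate_G_H`, the generator of row blocks, and the block size of 128 rows are assumptions for illustration only.

```python
import numpy as np

def accumulate_G_H(row_blocks, Omega):
    """Compute G = A @ Omega and H = A.T @ G while seeing each row of A once."""
    n, l = Omega.shape
    H = np.zeros((n, l))              # S21: n x l accumulator, initialized to 0
    G_parts = []
    for A_i in row_blocks:            # S22: next s x n block of rows
        G_i = A_i @ Omega             #      the corresponding rows of G
        G_parts.append(G_i)
        H += A_i.T @ G_i              # S23: H <- H + A_i^T G_i
    return np.vstack(G_parts), H      # S24: stop once all rows are consumed

rng = np.random.default_rng(2)
A = rng.standard_normal((1000, 20))
Omega = rng.standard_normal((20, 6))
blocks = (A[r:r + 128] for r in range(0, 1000, 128))   # stands in for disk or stream reads
G, H = accumulate_G_H(blocks, Omega)
```

Only one block A_i, plus G, H, and Ω, needs to be in memory at any time; the blocks may have unequal sizes.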
Step S3: Initialize the variable j = 1, and initialize the m × l matrix Q and the l × n matrix B as zero matrices.
Step S4: Let G[j,j+b] and Ω[j,j+b] denote columns j through j+b of G and Ω respectively. When j > 1, compute G[j,j+b] - QBΩ[j,j+b] and overwrite G[j,j+b] with the result, where b is a nonnegative integer not greater than l-j.
Step S5: Perform a reduced QR decomposition of G[j,j+b] to obtain an m × (b+1) column-orthonormal matrix Q[j,j+b] and an upper triangular square matrix R, where Q[j,j+b] is stored in columns j through j+b of Q.
Step S6: If j > 1, perform a reduced QR decomposition of Q[j,j+b] - Q(Q^T Q[j,j+b]), overwrite Q[j,j+b] with the resulting m × (b+1) column-orthonormal matrix, multiply the resulting upper triangular matrix by R, and overwrite R with the product.
Step S7: Let H[j,j+b] denote columns j through j+b of H. If j = 1, compute [formula not reproduced in this text]; otherwise compute [formula not reproduced in this text]. The result is a (b+1) × n matrix B_temp, which is stored in rows j through j+b of B.
Step S8: Assign the value j+b+1 to the variable j.
Step S9: If j ≤ l, return to S4; otherwise go to S10.
Step S10: Perform a singular value decomposition of B to obtain the first k principal component vectors and the corresponding singular values. Specifically, the decomposition is:
B = UΣV^T,
where the first k columns of V are the first k principal component vectors and the first k diagonal entries of Σ are the corresponding singular values.
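Steps S3 to S10 can be sketched as follows in Python with NumPy. The formulas for B_temp in S7 are not present in this text, so the sketch substitutes an expression derived from the requirement B = Q^T A, namely B_temp = R^{-T}(H[j,j+b]^T - Ω[j,j+b]^T B^T B), which reduces to R^{-T} H[j,j+b]^T when j = 1. That expression is an assumption of this example, not a quotation of the patent's formula; the function name `single_pass_qb`, the 0-based indexing, and the `block` parameter (equal to b+1) are likewise illustrative.

```python
import numpy as np

def single_pass_qb(G, H, Omega, block):
    """Build Q (m x l) and B (l x n) from G = A @ Omega and H = A.T @ G only."""
    G = G.copy()                       # blocks of G are overwritten, as in S4
    m, l = G.shape
    n = Omega.shape[0]
    Q = np.zeros((m, l))               # S3
    B = np.zeros((l, n))
    j = 0                              # 0-based first column of the current block
    while j < l:                       # S9
        e = min(j + block, l)          # block covers columns j .. e-1
        Om_j = Omega[:, j:e]
        if j > 0:                      # S4: subtract what Q B already explains
            G[:, j:e] -= Q[:, :j] @ (B[:j] @ Om_j)
        Q_j, R = np.linalg.qr(G[:, j:e])                 # S5: reduced QR
        if j > 0:                      # S6: re-orthogonalize against earlier columns
            Q_j, R_hat = np.linalg.qr(Q_j - Q[:, :j] @ (Q[:, :j].T @ Q_j))
            R = R_hat @ R
        rhs = H[:, j:e].T              # S7 (assumed formula, see lead-in)
        if j > 0:
            rhs = rhs - (Om_j.T @ B[:j].T) @ B[:j]
        B[j:e] = np.linalg.solve(R.T, rhs)               # applies R^{-T}
        Q[:, j:e] = Q_j
        j = e                          # S8
    return Q, B

rng = np.random.default_rng(3)
A = rng.standard_normal((200, 50))
k, l = 5, 10
Omega = rng.standard_normal((50, l))
G = A @ Omega                          # in practice gathered in the single pass
H = A.T @ G
Q, B = single_pass_qb(G, H, Omega, block=5)
_, s, Vt = np.linalg.svd(B, full_matrices=False)         # S10
V_k, s_k = Vt[:k].T, s[:k]
```

Note that `single_pass_qb` never touches A; the usage lines form G and H directly from A only to keep the example short. Solving with R^T assumes R is nonsingular, which can fail if l exceeds the numerical rank of A.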
It should be noted that the above embodiment assumes the original data matrix A is an m × n matrix in which each row represents one data item, m rows in total; on that basis, the purpose of the invention is to compute the first k principal component vectors of the data and the corresponding singular values. If instead each column of A represents one data item, the computation described in the above embodiment can be applied to its transpose A^T.
To make the principal component analysis method for processing large-scale matrix data easier to understand, the invention is explained in further detail below with a specific embodiment.
In this embodiment, the method can be implemented in any programming language and executed on a computing device with a CPU and memory. The random number generator, matrix addition and subtraction, multiplication, transposition, matrix inversion (or solution of linear systems), QR decomposition, and singular value decomposition used in this embodiment are all prior art and are available by calling the numerical library of the chosen programming language.
This embodiment considers a large data matrix A stored on a hard disk. It holds a certain kind of time-series data with a relatively large number of features per data item, for example 100,000, and 500,000 data items in total. Each value is stored as a double-precision floating-point number occupying 8 bytes, so the whole data set requires about 400 GB of storage. Suppose 1,000 principal features are to be extracted, i.e. the first 1,000 principal component vectors are to be computed. The following steps complete the computation with a single pass over the data (parameters k = 1000, m = 500,000, n = 100,000):
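The storage figure quoted above, and the much smaller footprint of the matrices kept in memory, follow from simple arithmetic. The sketch below is in Python and assumes l = k + 5 (the smallest value allowed by the condition on l stated earlier) and decimal gigabytes (10^9 bytes); the variable names are illustrative.

```python
m, n, k = 500_000, 100_000, 1000
l = k + 5                      # assumed: smallest l satisfying l >= k + 5
bytes_per_value = 8            # double precision

size_A_gb = m * n * bytes_per_value / 1e9   # data matrix A on disk: 400 GB
size_G_gb = m * l * bytes_per_value / 1e9   # m x l matrix such as G or Q: about 4.02 GB
size_H_gb = n * l * bytes_per_value / 1e9   # n x l matrix such as H or Omega: about 0.80 GB
```

The in-memory matrices are roughly two orders of magnitude smaller than A, which is what makes a single sequential read of A sufficient.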
Step 1: Determine the value of l from the parameter k, setting l = k + 5.
Step 2: Generate in memory a standard normal random matrix Ω with n rows and l columns.
Step 3: Compute the matrices G = AΩ and H = A^T G with one pass over the matrix A on the hard disk, as follows:
Step 3.1: Allocate two-dimensional arrays in memory to store the n × l matrix H and the m × l matrix G, initialize their entries to 0, open the disk file that stores A, and place the read pointer at the start of the file.
Step 3.2: Starting at the file pointer, read 1,000 rows of A into memory and let them form the matrix A_i. Compute G_i = A_i Ω and store the result G_i in the corresponding rows of G.
Step 3.3: Compute H + A_i^T G_i and assign the result to (overwrite) H.
Step 3.4: If the file storing A has not been read to the end, return to Step 3.2; otherwise go to Step 4.
Step 4: Initialize the variable j = 1, and initialize the m × l matrix Q and the l × n matrix B as zero matrices.
Step 5: Set the value of b to 19.
Step 6: Let G[j,j+b] and Ω[j,j+b] denote columns j through j+b of G and Ω respectively. If j > 1, compute G[j,j+b] - QBΩ[j,j+b] and overwrite G[j,j+b] with the result.
Step 7: Perform a reduced QR decomposition of G[j,j+b] to obtain an m × (b+1) column-orthonormal matrix Q[j,j+b] and an upper triangular square matrix R; Q[j,j+b] is stored in columns j through j+b of Q.
Step 8: If j > 1, perform a reduced QR decomposition of Q[j,j+b] - Q(Q^T Q[j,j+b]), overwrite Q[j,j+b] with the resulting m × (b+1) column-orthonormal matrix, multiply the resulting upper triangular matrix by R, and overwrite R with the product.
Step 9: Let H[j,j+b] denote columns j through j+b of H. If j = 1, compute [formula not reproduced in this text]; otherwise compute [formula not reproduced in this text]. The result is a (b+1) × n matrix B_temp, which is stored in rows j through j+b of B.
Step 10: Assign the value j+b+1 to the variable j.
Step 11: If j ≤ l, return to Step 5; otherwise go to Step 12.
Step 12: Perform a singular value decomposition of B, i.e. B = UΣV^T. The first k columns of V are then the desired first k principal component vectors, and the first k diagonal entries of Σ are the corresponding singular values.
Because reading data from the hard disk takes far longer than computing in memory, the above algorithm, which reads the disk data only once, runs about twice as fast as the corresponding algorithm in SIAM2011 and greatly shortens the overall big data analysis.
The first 1,000 principal component vectors obtained by the above process form the matrix V, a 100,000 × 1,000 column-orthonormal matrix. Computing AV then yields a 500,000 × 1,000 reduced data matrix, which still represents 500,000 data items but with a greatly reduced dimension. The reduced data can be used for further clustering, classification, and other data mining and analysis.
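The projection AV can itself be formed block by block, so the reduced data matrix is obtained without holding A in memory, although it does require reading the rows of A again. A minimal sketch in Python with NumPy, in which the function name `project_rows`, the block size, and the small test matrix are assumptions:

```python
import numpy as np

def project_rows(row_blocks, V):
    """Stack A_i @ V over row blocks A_i, giving the reduced data matrix A @ V."""
    return np.vstack([A_i @ V for A_i in row_blocks])

rng = np.random.default_rng(4)
A = rng.standard_normal((500, 30))
V = np.linalg.svd(A, full_matrices=False)[2][:3].T      # 30 x 3, stands in for the computed V
reduced = project_rows((A[r:r + 100] for r in range(0, 500, 100)), V)
```

Each row of `reduced` is the low-dimensional representation of the corresponding data item.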
In summary, the principal component analysis method for processing large-scale matrix data according to embodiments of the invention builds on the existing randomized singular value decomposition algorithm but, through its improvements, reduces the number of passes the main part of the algorithm makes over the data matrix from two to one while keeping the accuracy of the original algorithm unchanged. It therefore suits a wide range of big data analysis scenarios and offers high computational efficiency and practicality.
In this specification, descriptions referring to "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" mean that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. Such terms do not necessarily refer to the same embodiment or example, and the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples.
Although embodiments of the invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principle and spirit of the invention, whose scope is defined by the claims and their equivalents.

Claims (5)

1. A principal component analysis method for processing large-scale matrix data, characterized by comprising the following steps:
S1: generating in memory a random matrix Ω with n rows and l columns;
S2: taking the original data matrix A, computing matrices G and H from it, and storing G and H in memory, where G = AΩ, H = A^T G, and the original data matrix A is an m × n matrix;
S3: initializing the variable j = 1, and initializing the m × l matrix Q and the l × n matrix B as zero matrices;
S4: letting G[j,j+b] and Ω[j,j+b] denote columns j through j+b of G and Ω respectively, and, when j > 1, computing G[j,j+b] - QBΩ[j,j+b] and overwriting G[j,j+b] with the result, where b is a nonnegative integer not greater than l-j;
S5: performing a reduced QR decomposition of G[j,j+b] to obtain an m × (b+1) column-orthonormal matrix Q[j,j+b] and an upper triangular square matrix R, where Q[j,j+b] is stored in columns j through j+b of Q;
S6: if j > 1, performing a reduced QR decomposition of Q[j,j+b] - Q(Q^T Q[j,j+b]), overwriting Q[j,j+b] with the resulting m × (b+1) column-orthonormal matrix, multiplying the resulting upper triangular matrix by R, and overwriting R with the product;
S7: letting H[j,j+b] denote columns j through j+b of H; if j = 1, computing [formula not reproduced in this text], and otherwise computing [formula not reproduced in this text], the result being a (b+1) × n matrix B_temp, and storing B_temp in rows j through j+b of B;
S8: assigning the value j+b+1 to the variable j;
S9: if j ≤ l, returning to S4, and otherwise performing S10;
S10: performing a singular value decomposition of B: B = UΣV^T, where the first k columns of V are the first k principal component vectors and the first k diagonal entries of Σ are the corresponding singular values.
2. The principal component analysis method for processing large-scale matrix data according to claim 1, characterized in that, in S1, the parameter l is an integer at least 5 greater than k.
3. The principal component analysis method for processing large-scale matrix data according to claim 1, characterized in that S1 further comprises:
S11: generating an n × l random matrix Ω with random number generator software;
S12: initializing the variable i = 0, the variable P being a nonnegative integer less than 10;
S13: if i = P, ending execution, and otherwise continuing with S14;
S14: computing the matrix product AΩ, performing a reduced QR decomposition of the result, and assigning the resulting m × l column-orthonormal matrix to G;
S15: computing the matrix product A^T G, performing a reduced QR decomposition of the result, and assigning the resulting n × l column-orthonormal matrix to Ω;
S16: incrementing i by 1 and returning to S13.
4. The principal component analysis method for processing large-scale matrix data according to claim 1, characterized in that, in S2, depending on how the original data matrix A is produced or where it comes from, the matrices G = AΩ and H = A^T G are computed by traversing the elements of the original data matrix A once.
5. The principal component analysis method for processing large-scale matrix data according to claim 1, characterized in that S2 further comprises:
S21: allocating a two-dimensional array in memory to store the n × l matrix H, and initializing the entries of H to 0;
S22: obtaining a preset number of rows of the original data matrix A and storing them in memory, letting these rows form an s × n matrix A_i, and computing the matrix product G_i = A_i Ω, where G_i occupies the corresponding rows of G;
S23: computing H + A_i^T G_i and assigning the result to H;
S24: checking whether all rows of the original data matrix A have been obtained, and if so stopping, otherwise returning to S22.
CN201611153472.6A 2016-12-14 2016-12-14 Principal component analysis method for processing large-scale matrix data Pending CN106855918A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611153472.6A CN106855918A (en) 2016-12-14 2016-12-14 Principal component analysis method for processing large-scale matrix data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611153472.6A CN106855918A (en) 2016-12-14 2016-12-14 Principal component analysis method for processing large-scale matrix data

Publications (1)

Publication Number Publication Date
CN106855918A true CN106855918A (en) 2017-06-16

Family

ID=59125855

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611153472.6A Pending CN106855918A (en) 2016-12-14 2016-12-14 Principal component analysis method for processing large-scale matrix data

Country Status (1)

Country Link
CN (1) CN106855918A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110443061A (en) * 2018-05-03 2019-11-12 阿里巴巴集团控股有限公司 Data encryption method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170616
