CN104346520A - Neural network based data dimension reduction system and dimension reduction method thereof
- Publication number
- CN104346520A (application CN201410362559.9A)
- Authority
- CN
- China
- Prior art keywords
- reference point
- data
- dimensionality reduction
- neuroid
- formula
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention discloses a neural network based data dimension reduction system and a dimension reduction method thereof. The system comprises a data acquisition system connected to a control system; the control system comprises a neural network based data dimension reduction module. Combined with the dimension reduction method, the system effectively overcomes the defects of the prior art: a large computational cost, uncertain neighborhood determination and lack of practicality.
Description
Technical field
The invention belongs to the technical field of data dimension reduction, and in particular relates to a neural network based data dimension reduction system and a dimension reduction method thereof.
Background art
At present, images, video and some complex communication signals collected by a data acquisition system are transferred to a control system and are usually stored as high-dimensional data. In use, such data occupy excessive control-system resources and are computationally expensive and time-consuming to process; in severe cases they can cause the control system to crash.
Existing control systems therefore generally apply dimension reduction to the high-dimensional data (images, video and complex communication signals) obtained from the data acquisition system before using them. Existing dimension reduction methods, however, commonly suffer from the following problems:
(1) The computational cost is still very large: in existing dimension reduction algorithms, the step that computes geodesic distances on a k-nearest-neighbor graph has time complexity $O(kN^2 \log N)$, and the distance-preserving mapping step has time complexity $O(N^3)$;
(2) The neighborhood is hard to determine: existing dimension reduction algorithms compute geodesic distances on a k-nearest-neighbor graph, but as early as 2002 an article in the journal Science questioned this approach: if k is too large, short-circuit errors occur; if k is too small, the graph breaks into fragments. The only remedy is to search for a suitable k by trial, which further increases the computational cost, and the dimension reduction result often deviates substantially from the original high-dimensional data or is even completely distorted;
(3) Lack of practicality: each newly arriving high-dimensional data point can change the whole k-nearest-neighbor graph, so everything must be recomputed; online processing is therefore difficult.
Summary of the invention
The object of the present invention is to provide a neural network based data dimension reduction system and a dimension reduction method thereof. The system comprises a data acquisition system connected to a control system, and the control system contains a neural network based data dimension reduction module. Combined with its dimension reduction method, the system effectively avoids the defects of the prior art: a large computational cost, uncertain neighborhood determination and lack of practicality.
To overcome the deficiencies of the prior art, the invention provides a neural network based data dimension reduction system and a dimension reduction method thereof, as follows:
A neural network based data dimension reduction system comprises a data acquisition system 1; the data acquisition system 1 is connected to a control system 2, and the control system 2 contains a neural network based data dimension reduction module 3.
The dimension reduction method of the neural network based data dimension reduction system comprises the following steps:
Step 1: The data acquisition system first sends the collected signal data, such as images or video, to the control system 2; the control system 2 then starts the neural network based data dimension reduction module 3, which organizes the received signal data into a high-dimensional data set and stores it;
Step 2: The neural network based data dimension reduction module 3 then determines the manifold topology reference points of the high-dimensional data. This process begins with initialization. First, the reference point set is set to $A = \{L_1, L_2\}$, where $L_1$ is the first reference point and $L_2$ is the second reference point, both being high-dimensional data points chosen at random from the high-dimensional data set. The module then sets up an edge set $C \subseteq A \times A$, two activation count variables with initial value 0, two distance threshold variables with initial value $\|L_1 - L_2\|$, and a first connection age variable with initial value 0. $A \times A$ represents the connection relations between the reference points of the reference point set; the initial value of $C$ is the empty set, meaning that the first and second reference points are initially not connected. The two activation count variables belong to the first reference point and the second reference point respectively, as do the two distance threshold variables (the first and the second distance threshold variable). The first connection age variable represents the duration of the connection between the first reference point and the second reference point;
Step 3: The input and competition stage follows. The data acquisition system continues to collect one item of signal data, such as an image or a video, and sends it to the control system. The neural network based data dimension reduction module 3 in the control system stores the received signal data as one high-dimensional data point, which serves as a new data sample $\xi \in R^D$, where $R^D$ is the high-dimensional real space, $R$ denotes the real numbers and $D$ is the dimension of the high-dimensional data. The Euclidean distance between each reference point in $A$ and the new data sample $\xi$ is then computed; the reference points with the smallest and the second-smallest Euclidean distance are the winner reference point $s_1$ and the runner-up reference point $s_2$ respectively, as expressed by formula (1) and formula (2):
$s_1 = \arg\min_{L_i \in A} \|\xi - L_i\|$ (1)
$s_2 = \arg\min_{L_i \in A \setminus \{s_1\}} \|\xi - L_i\|$ (2)
The winner reference point $s_1$ and the runner-up reference point $s_2$ are thus the two reference points most similar to $\xi$. The reference point update stage follows: the neural network based data dimension reduction module 3 checks whether the distance from $\xi$ to $s_1$ exceeds the distance threshold of $s_1$, or the distance from $\xi$ to $s_2$ exceeds the distance threshold of $s_2$. If either condition holds, the new data sample $\xi$ is placed into the reference point set $A$ as a new reference point with value $\xi$, i.e. $A = A \cup \{\xi\}$, and execution returns to step 3;
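The competition and insertion test of step 3 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the NumPy array layout (one reference point per row) and the direction of the threshold comparison (distance greater than threshold) are assumptions made for this sketch.

```python
import numpy as np

def find_winners(nodes, xi):
    """Return the indices of the nearest and second-nearest reference
    points to xi, plus all distances (hypothetical helper)."""
    dists = np.linalg.norm(nodes - xi, axis=1)
    order = np.argsort(dists)
    return int(order[0]), int(order[1]), dists

def should_insert(dists, s1, s2, thresholds):
    """Assumed test: xi becomes a new reference point when it lies beyond
    the distance threshold of the winner or of the runner-up."""
    return bool(dists[s1] > thresholds[s1] or dists[s2] > thresholds[s2])

# Illustrative values only.
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
thresholds = np.array([1.0, 1.0, 1.0])
xi = np.array([0.2, 0.1])
s1, s2, dists = find_winners(nodes, xi)
insert = should_insert(dists, s1, s2, thresholds)
```

When `should_insert` returns true, the sample would be appended to `nodes` and the next sample read; otherwise processing continues with step 4.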
Step 4: If no connection exists between $s_1$ and $s_2$, the operation $C = C \cup \{(s_1, s_2)\}$ is performed, i.e. a connection is established between the two most similar reference points, and a second age variable with initial value 0 is set, representing the duration of the connection between the winner reference point $s_1$ and the runner-up reference point $s_2$. Then, for every $L_i$ with $(s_1, L_i) \in C$, the third age variable is incremented by 1, i.e. the connection duration of every reference point connected to $s_1$ is increased by 1; the third age variable represents the duration of the connection between the winner reference point $s_1$ and each reference point $L_i$ connected to it, where $i$ is a natural number variable. An activation count variable is set for the winner reference point $s_1$ and incremented by 1, its value counting up from 0. The operations $s_1 = s_1 + \varepsilon(t)\|\xi - s_1\|$ and $s_2 = s_2 + \varepsilon'(t)\|\xi - s_2\|$ are then performed, i.e. $s_1$ and $s_2$ are moved toward the new data sample, where $t$ is the running time of the neural network based data dimension reduction system;
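A rough sketch of the step 4 bookkeeping is given below. Several things are assumptions of the sketch rather than statements of the source: connections are stored as a dictionary from an unordered pair of indices to an age, the step sizes `eps1` and `eps2` are plain numbers standing in for $\varepsilon(t)$ and $\varepsilon'(t)$ (whose definitions are not recoverable here), and the move uses the vector difference $\xi - s$ rather than its norm so that the reference point actually shifts toward $\xi$.

```python
import numpy as np

def update_winners(nodes, edges, wins, s1, s2, xi, eps1, eps2):
    """Apply the step 4 updates in place (hypothetical helper).

    nodes: (n, D) float array of reference points
    edges: dict mapping frozenset({i, j}) -> connection age
    wins:  list of activation counts, one per reference point
    """
    key = frozenset((s1, s2))
    if key not in edges:
        edges[key] = 0            # new connection starts with age 0
    for pair in edges:            # age every connection incident to s1
        if s1 in pair:
            edges[pair] += 1
    wins[s1] += 1                 # activation count of the winner
    # Assumed vector form of the move toward xi.
    nodes[s1] += eps1 * (xi - nodes[s1])
    nodes[s2] += eps2 * (xi - nodes[s2])

# Illustrative values only.
nodes = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 0.0]])
edges = {frozenset((0, 2)): 3}
wins = [0, 0, 0]
update_winners(nodes, edges, wins, 0, 1, np.array([1.0, 0.0]), 0.5, 0.1)
```

Following the order given in the text, a freshly created connection is aged in the same pass, so it ends the step with age 1.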
Step 5: The neural network based data dimension reduction module 3 checks every connection $(L_i, L_j) \in C$ between reference points together with its current connection age; if the connection age exceeds $age_{max}$, the connection is removed from $C$. Here $age_{max}$ is the predefined maximum connection duration, $i$ and $j$ are unequal natural numbers, and the connection age of $(L_i, L_j)$ is the duration of the connection between $L_i$ and $L_j$;
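The pruning rule of step 5 amounts to a filter over the connection ages. The sketch below assumes the same hypothetical storage as a dictionary from an unordered index pair to an age, and assumes that a connection is removed when its age is strictly greater than `age_max`.

```python
def prune_old_edges(edges, age_max):
    """Return a new edge-age map without connections older than age_max."""
    return {pair: age for pair, age in edges.items() if age <= age_max}

# Illustrative values only.
edges = {frozenset((0, 1)): 2, frozenset((1, 2)): 51}
kept = prune_old_edges(edges, age_max=50)
```

Returning a new dictionary avoids deleting entries while iterating over the same container.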
Step 6: The neural network based data dimension reduction module 3 then updates the distance thresholds of the reference points: by formula (3) and formula (4), the distance thresholds of the winner reference point $s_1$ and of the runner-up reference point $s_2$ are set to the maximum distance between $s_1$, respectively $s_2$, and its adjacent reference points. The denoising stage follows: whenever the total number of data samples input so far is an integer multiple of a preset value $\lambda$, the module checks all reference points in the reference point set $A$; if some reference point $L_i$ has only one connected reference point and its activation count is less than the preset minimum activation count $M_{min}$, the reference point $L_i$ is deleted from $A$. Execution then returns to step 2;
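The threshold update and the denoising check of step 6 could look roughly like this. The helper names and the data layout are assumptions; in particular, leaving the threshold unchanged for a reference point without neighbours is a choice made for this sketch, since the text does not cover that case.

```python
import numpy as np

def neighbours(edges, i):
    """Indices connected to reference point i (edges keyed by frozenset pairs)."""
    return [j for pair in edges if i in pair for j in pair if j != i]

def update_threshold(nodes, edges, i, thresholds):
    """Set the threshold of i to its largest distance to an adjacent point."""
    nbrs = neighbours(edges, i)
    if nbrs:  # assumption: keep the old threshold when i is isolated
        thresholds[i] = max(float(np.linalg.norm(nodes[i] - nodes[j])) for j in nbrs)

def noisy_nodes(edges, wins, m_min, n_seen, lam):
    """Every lam samples, list points with one neighbour and few activations."""
    if n_seen % lam != 0:
        return []
    return [i for i in range(len(wins))
            if len(neighbours(edges, i)) == 1 and wins[i] < m_min]

# Illustrative values only.
nodes = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 0.0]])
edges = {frozenset((0, 1)): 0, frozenset((0, 2)): 0}
thresholds = [0.0, 0.0, 0.0]
update_threshold(nodes, edges, 0, thresholds)
to_delete = noisy_nodes(edges, [10, 1, 10], m_min=3, n_seen=200, lam=100)
```

Deleting a reference point would also require dropping its connections and re-indexing, which is omitted here for brevity.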
Step 7: The neural network based data dimension reduction module 3 then enters the reference point similarity computation stage;
Step 8: The natural number variable $i$ is incremented by 1 and the reference point $L_i$ ($i = 1, \ldots, n$) is extracted, where $n$ is the number of reference points in the reference point set $A$. The initialization for this reference point is as follows: perform $S = \{L_i\}$ and $U = A - \{L_i\}$, where $S$ is the first intermediate set and $U$ is the second intermediate set; then set the element $D_G(i,i)$ of the $n \times n$ similarity matrix $D_G$ to 0, $D_G(i,i)$ being the similarity value between $L_i$ and itself. For each reference point $L_j \in U$: if $L_i$ and $L_j$ are connected, i.e. $(L_i, L_j) \in C$, set $D_G(i,j) = \|L_i - L_j\|$; otherwise set $D_G(i,j) = \infty$. $D_G(i,j)$ denotes the similarity value between $L_i$ and the element $L_j$ of $U$;
Step 9: The intermediate point selection stage follows: choose from $U$ the reference point $L_{min}$ with the smallest similarity value to $L_i$, i.e. $L_{min} = \arg\min_{L_j \in U} D_G(i,j)$, and move it into $S$, i.e. $S = S \cup \{L_{min}\}$, $U = U - \{L_{min}\}$;
Step 10: The edge expansion stage follows: for each reference point $L_k \in U$, where $k$ is a natural number, if $L_{min}$ and $L_k$ are connected, i.e. $(L_{min}, L_k) \in C$, and $D_G(i,min) + \|L_{min} - L_k\| < D_G(i,k)$, where $min$ is the index of $L_{min}$, perform the update of formula (5):
$D_G(i,k) = D_G(i,min) + \|L_{min} - L_k\|$ (5)
Steps 9 and 10 are then repeated until $S = A$ and $U = \emptyset$;
Step 11: Return to step 8. Once $i$ reaches $n$, all reference points in the reference point set $A$ have been processed and the $n \times n$ similarity matrix $D_G$ is obtained;
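Steps 8 to 11 describe a shortest-path computation from every reference point over the learned graph, which matches Dijkstra's algorithm with a simple linear scan for the minimum. The sketch below follows that reading; the function name, the list-of-pairs edge format and the early exit for unreachable points are assumptions of the sketch.

```python
import math

def geodesic_matrix(nodes, edges):
    """Shortest-path distances between all reference points.

    nodes: list of coordinate tuples; edges: iterable of (i, j) index pairs.
    """
    n = len(nodes)
    adj = {i: set() for i in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    def dist(a, b):
        return math.dist(nodes[a], nodes[b])

    D = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        D[i][i] = 0.0                        # step 8: initialise row i
        for j in adj[i]:
            D[i][j] = dist(i, j)
        U = set(range(n)) - {i}
        while U:
            m = min(U, key=lambda j: D[i][j])    # step 9: closest point in U
            U.remove(m)
            if math.isinf(D[i][m]):
                break                        # the rest of U is unreachable
            for k in adj[m] & U:             # step 10: relax edges, formula (5)
                cand = D[i][m] + dist(m, k)
                if cand < D[i][k]:
                    D[i][k] = cand
    return D

# Illustrative values only: a path 0 - 1 - 2 bent at a right angle.
D_G = geodesic_matrix([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)], [(0, 1), (1, 2)])
```

The linear scan keeps the code close to the wording of the steps; a heap-based priority queue would be the usual choice for larger graphs.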
Step 12: The neural network based data dimension reduction module 3 then enters the reference point dimension reduction mapping stage, in which the squared distance matrix $\Delta_n$ is computed by formula (6):
$\Delta_n(i,j) = D_G(i,j) \cdot D_G(i,j), \quad i,j = 1, \ldots, n$ (6)
The mean vector is then computed by formula (7) from the columns of $\Delta_n$, the $i$-th column being taken for $i$ from 1 to $n$;
Step 13: The mean centering matrix $H_n$ is computed by formula (8), where $\delta(i,j)$ is an intermediate parameter, generally taken as 1, and $H_n(i,j)$ is the element in row $i$ and column $j$ of the mean centering matrix $H_n$;
Step 14: The inner product matrix $B_n$ is computed by formula (9);
Step 15: Eigenvalues and eigenvectors are computed: the $d$ largest positive eigenvalues $\lambda_1, \ldots, \lambda_d$ of $B_n$ and their corresponding eigenvectors are calculated, where $d$ is the target dimension of the dimension reduction;
Step 16: The dimension reduction mapping stage of the reference points follows: the matrix $L$ of the dimension reduction mapping of the reference points is obtained by formula (10);
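Formulas (7) to (10) are not reproduced in this text, so the sketch below substitutes the standard classical multidimensional scaling construction, which follows the same sequence of steps (squared distances, centering, inner product matrix, top eigenpairs, scaled eigenvectors). Treat every formula in it other than (6) as an assumption, and `landmark_embedding` as a hypothetical name.

```python
import numpy as np

def landmark_embedding(D_G, d):
    """Embed n reference points in d dimensions from their similarity matrix."""
    n = D_G.shape[0]
    delta = D_G ** 2                          # formula (6)
    mean_vec = delta.mean(axis=0)             # assumed form of (7): column mean
    H = np.eye(n) - np.ones((n, n)) / n       # assumed form of (8): centering
    B = -0.5 * H @ delta @ H                  # assumed form of (9)
    evals, evecs = np.linalg.eigh(B)          # ascending eigenvalues
    idx = np.argsort(evals)[::-1][:d]         # d largest
    lam, V = evals[idx], evecs[:, idx]
    if np.any(lam <= 0):
        raise ValueError("fewer than d positive eigenvalues")
    L = (V * np.sqrt(lam)).T                  # assumed form of (10): d x n
    return L, lam, V, mean_vec

# Illustrative values only: three points on a line at 0, 1 and 3.
pts = np.array([0.0, 1.0, 3.0])
D = np.abs(pts[:, None] - pts[None, :])
L, lam, V, mean_vec = landmark_embedding(D, 1)
```

For this toy input the one-dimensional embedding reproduces the original pairwise distances exactly, up to a shift and a sign flip.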
Step 17: The online data dimension reduction mapping stage follows. The reference point to which the new data point belongs is determined, i.e. the reference point $L_\alpha$ nearest to the new data sample $\xi$ is determined by formula (11):
$L_\alpha = \arg\min_{L_i \in A} \|\xi - L_i\|$ (11)
Step 18: The similarity $D_S(\xi, L_i)$ between the new data sample $\xi$ and every reference point is obtained by formula (12):
$D_S(\xi, L_i) = \|\xi - L_\alpha\| + D_G(\alpha, i)$ (12)
Step 19: The squared distance vector is obtained by formula (13);
Step 20: The pseudo-inverse transpose matrix is obtained by formula (14), where $L^\#$ denotes the pseudo-inverse transpose matrix of the matrix $L$ of the dimension reduction mapping of the reference points;
Step 21: The new data sample $\xi$ is mapped to low dimension by formula (15), yielding the low-dimensional mapping vector $l_\xi$.
With these technical features, the dimension reduction method of the invention overcomes the shortcoming of conventional linear dimension reduction methods, which use Euclidean distance to represent similarity, by using geodesic distance to measure similarity instead; it thereby obtains a good dimension reduction result and provides reliable preprocessing for subsequent data analysis.
Brief description of the drawings
Fig. 1 is a schematic diagram of the connection structure of the neural network based data dimension reduction system of the invention.
Fig. 2 is a front view of the result of determining the manifold topology reference points of the high-dimensional data with the neural network based data dimension reduction module in embodiment 1 of the invention.
Fig. 3 is a top view of the result of determining the manifold topology reference points of the high-dimensional data with the neural network based data dimension reduction module in embodiment 1 of the invention.
Fig. 4 shows the result of the low-dimensional mapping in embodiment 1 of the invention after the low-dimensional mapping vectors have been obtained.
Fig. 5 is a front view of the result of determining the manifold topology reference points of the high-dimensional data with the neural network based data dimension reduction module in embodiment 2 of the invention.
Fig. 6 is a top view of the result of determining the manifold topology reference points of the high-dimensional data with the neural network based data dimension reduction module in embodiment 2 of the invention.
Fig. 7 shows the result of the low-dimensional mapping in embodiment 2 of the invention after the low-dimensional mapping vectors have been obtained.
Fig. 8 is a front view of the result of determining the manifold topology reference points of the high-dimensional data with the neural network based data dimension reduction module in embodiment 3 of the invention.
Fig. 9 is a top view of the result of determining the manifold topology reference points of the high-dimensional data with the neural network based data dimension reduction module in embodiment 3 of the invention.
Fig. 10 shows the result of the low-dimensional mapping in embodiment 3 of the invention after the low-dimensional mapping vectors have been obtained.
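The online mapping of steps 17 to 21 can be sketched as below. Formulas (11) and (12) are taken from the text; the squared-distance vector, the pseudo-inverse transpose and the final mapping are written in the form commonly used for landmark-based multidimensional scaling, which is an assumption here because formulas (13) to (15) are not reproduced. The function and variable names are hypothetical.

```python
import numpy as np

def map_new_point(xi, nodes, D_G, lam, V, mean_vec):
    """Map one new sample to low dimension using the reference points.

    nodes: (n, D) reference points; D_G: (n, n) similarity matrix;
    lam, V: top-d eigenvalues and eigenvectors of the inner product matrix;
    mean_vec: column mean of the squared similarity matrix.
    """
    alpha = int(np.argmin(np.linalg.norm(nodes - xi, axis=1)))   # formula (11)
    d_s = np.linalg.norm(xi - nodes[alpha]) + D_G[alpha]         # formula (12)
    delta_xi = d_s ** 2                    # assumed form of (13)
    L_sharp = (V / np.sqrt(lam)).T         # assumed form of (14)
    return -0.5 * L_sharp @ (delta_xi - mean_vec)                # assumed (15)

# Illustrative values only: reference points on a line at 0, 1 and 3.
nodes = np.array([[0.0], [1.0], [3.0]])
D_G = np.abs(nodes - nodes.T)
delta = D_G ** 2
H = np.eye(3) - 1.0 / 3.0
evals, evecs = np.linalg.eigh(-0.5 * H @ delta @ H)
lam, V = evals[-1:], evecs[:, -1:]         # largest eigenpair, d = 1
y = map_new_point(np.array([1.0]), nodes, D_G, lam, V, delta.mean(axis=0))
```

Because only the nearest reference point and one row of $D_G$ are needed per sample, no graph or eigen-decomposition has to be recomputed when new data arrive.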
Detailed description of the embodiments
The object of the invention is to develop an efficient, automated neural network based data dimension reduction system and a dimension reduction method thereof, which are further described below with reference to the drawings and embodiments:
Embodiment 1:
In this embodiment, the high-dimensional data set formed from the transmitted signal data, such as images or video, is the swiss_roll data set; 15000 data points of the swiss_roll data set are used to determine the reference points, and the other 5000 data points are used to obtain the low-dimensional mapping vectors $l_\xi$, as follows:
As shown in Fig. 1, Fig. 2, Fig. 3 and Fig. 4, the neural network based data dimension reduction system comprises a data acquisition system 1; the data acquisition system 1 is connected to a control system 2, and the control system 2 contains a neural network based data dimension reduction module 3.
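A data split of the kind described for this embodiment can be reproduced approximately as follows. The source does not say how its swiss_roll data were generated, so the parametrization, the random seed and the function name below are assumptions; only the 15000 / 5000 split comes from the text.

```python
import numpy as np

def make_swiss_roll(n, seed=0):
    """Sample n points from a 3-D Swiss roll (assumed common parametrization)."""
    rng = np.random.default_rng(seed)
    t = 1.5 * np.pi * (1 + 2 * rng.random(n))   # angle along the spiral
    height = 21 * rng.random(n)                 # position along the roll axis
    return np.column_stack((t * np.cos(t), height, t * np.sin(t)))

data = make_swiss_roll(20000)
train, online = data[:15000], data[15000:]      # reference points / online mapping
```

The first part would feed the reference point learning of steps 2 to 6 and the second part the online mapping of steps 17 to 21.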
The dimension reduction method of the neural network based data dimension reduction system comprises the following steps:
Step 1: The data acquisition system first sends the collected signal data, such as images or video, to the control system 2; the control system 2 then starts the neural network based data dimension reduction module 3, which organizes the received signal data into a high-dimensional data set and stores it;
Step 2: The neural network based data dimension reduction module 3 then determines the manifold topology reference points of the high-dimensional data. The goal of this process is to train a self-organizing neural network with the training data so that the training result represents the topology of the original data set and produces the required reference points and connections. The process begins with initialization. First, the reference point set is set to $A = \{L_1, L_2\}$, where $L_1$ is the first reference point and $L_2$ is the second reference point, both being high-dimensional data points chosen at random from the high-dimensional data set. The module then sets up an edge set $C \subseteq A \times A$, two activation count variables with initial value 0, two distance threshold variables with initial value $\|L_1 - L_2\|$, and a first connection age variable with initial value 0. $A \times A$ represents the connection relations between the reference points of the reference point set; the initial value of $C$ is the empty set, meaning that the first and second reference points are initially not connected. The two activation count variables belong to the first reference point and the second reference point respectively, as do the two distance threshold variables (the first and the second distance threshold variable). The first connection age variable represents the duration of the connection between the first reference point and the second reference point;
Step 3: The input and competition stage follows. The data acquisition system continues to collect one item of signal data, such as an image or a video, and sends it to the control system. The neural network based data dimension reduction module 3 in the control system stores the received signal data as one high-dimensional data point, which serves as a new data sample $\xi \in R^D$, where $R^D$ is the high-dimensional real space, $R$ denotes the real numbers and $D$ is the dimension of the high-dimensional data. The Euclidean distance between each reference point in $A$ and the new data sample $\xi$ is then computed; the reference points with the smallest and the second-smallest Euclidean distance are the winner reference point $s_1$ and the runner-up reference point $s_2$ respectively, as expressed by formula (1) and formula (2):
$s_1 = \arg\min_{L_i \in A} \|\xi - L_i\|$ (1)
$s_2 = \arg\min_{L_i \in A \setminus \{s_1\}} \|\xi - L_i\|$ (2)
The winner reference point $s_1$ and the runner-up reference point $s_2$ are thus the two reference points most similar to $\xi$. The reference point update stage follows: the neural network based data dimension reduction module 3 checks whether the distance from $\xi$ to $s_1$ exceeds the distance threshold of $s_1$, or the distance from $\xi$ to $s_2$ exceeds the distance threshold of $s_2$. If either condition holds, the new data sample $\xi$ is placed into the reference point set $A$ as a new reference point with value $\xi$, i.e. $A = A \cup \{\xi\}$, and execution returns to step 3;
Step 4: If no connection exists between $s_1$ and $s_2$, the operation $C = C \cup \{(s_1, s_2)\}$ is performed, i.e. a connection is established between the two most similar reference points, and a second age variable with initial value 0 is set, representing the duration of the connection between the winner reference point $s_1$ and the runner-up reference point $s_2$. Then, for every $L_i$ with $(s_1, L_i) \in C$, the third age variable is incremented by 1, i.e. the connection duration of every reference point connected to $s_1$ is increased by 1; the third age variable represents the duration of the connection between the winner reference point $s_1$ and each reference point $L_i$ connected to it, where $i$ is a natural number variable. An activation count variable is set for the winner reference point $s_1$ and incremented by 1, its value counting up from 0. The operations $s_1 = s_1 + \varepsilon(t)\|\xi - s_1\|$ and $s_2 = s_2 + \varepsilon'(t)\|\xi - s_2\|$ are then performed, i.e. $s_1$ and $s_2$ are moved toward the new data sample, where $t$ is the running time of the neural network based data dimension reduction system;
Step 5: The neural network based data dimension reduction module 3 checks every connection $(L_i, L_j) \in C$ between reference points together with its current connection age; if the connection age exceeds $age_{max}$, the connection is removed from $C$. Here $age_{max}$ is the predefined maximum connection duration, $i$ and $j$ are unequal natural numbers, and the connection age of $(L_i, L_j)$ is the duration of the connection between $L_i$ and $L_j$;
Step 6: The neural network based data dimension reduction module 3 then updates the distance thresholds of the reference points: by formula (3) and formula (4), the distance thresholds of the winner reference point $s_1$ and of the runner-up reference point $s_2$ are set to the maximum distance between $s_1$, respectively $s_2$, and its adjacent reference points. The denoising stage follows: whenever the total number of data samples input so far is an integer multiple of a preset value $\lambda$, the module checks all reference points in the reference point set $A$; if some reference point $L_i$ has only one connected reference point and its activation count is less than the preset minimum activation count $M_{min}$, the reference point $L_i$ is deleted from $A$. Execution then returns to step 2. After all training data samples have been input, the required manifold topology reference point set $A$ and the connections $C$ between the reference points are obtained.
Step 7: The neural network based data dimension reduction module 3 then enters the reference point similarity computation stage. Using the topology graph produced in the preceding steps, i.e. the reference points and their connections, the length of the shortest path in the graph between two reference points represents their similarity. With $n$ reference points produced, this stage computes the shortest path from each reference point to all other reference points, yielding the $n \times n$ similarity matrix $D_G$. The stage begins by setting the natural number variable $i$ to 0;
Step 8: The natural number variable $i$ is incremented by 1 and the reference point $L_i$ ($i = 1, \ldots, n$) is extracted, where $n$ is the number of reference points in the reference point set $A$. The initialization for this reference point is as follows: perform $S = \{L_i\}$ and $U = A - \{L_i\}$, where $S$ is the first intermediate set and $U$ is the second intermediate set; then set the element $D_G(i,i)$ of the $n \times n$ similarity matrix $D_G$ to 0, $D_G(i,i)$ being the similarity value between $L_i$ and itself. For each reference point $L_j \in U$: if $L_i$ and $L_j$ are connected, i.e. $(L_i, L_j) \in C$, set $D_G(i,j) = \|L_i - L_j\|$; otherwise set $D_G(i,j) = \infty$. $D_G(i,j)$ denotes the similarity value between $L_i$ and the element $L_j$ of $U$;
Step 9: The intermediate point selection stage follows: choose from $U$ the reference point $L_{min}$ with the smallest similarity value to $L_i$, i.e. $L_{min} = \arg\min_{L_j \in U} D_G(i,j)$, and move it into $S$, i.e. $S = S \cup \{L_{min}\}$, $U = U - \{L_{min}\}$;
Step 10: The edge expansion stage follows: for each reference point $L_k \in U$, where $k$ is a natural number, if $L_{min}$ and $L_k$ are connected, i.e. $(L_{min}, L_k) \in C$, and $D_G(i,min) + \|L_{min} - L_k\| < D_G(i,k)$, where $min$ is the index of $L_{min}$, perform the update of formula (5):
$D_G(i,k) = D_G(i,min) + \|L_{min} - L_k\|$ (5)
Steps 9 and 10 are then repeated until $S = A$ and $U = \emptyset$;
Step 11: Return to step 8. Once $i$ reaches $n$, all reference points in the reference point set $A$ have been processed and the $n \times n$ similarity matrix $D_G$ is obtained;
Step 12: the Data Dimensionality Reduction module 3 then based on neuroid enters reference point dimensionality reduction mapping phase, the object in this stage is at the similarity matrix D keeping n*n
g(n*n) carry out dimensionality reduction mapping to reference point under prerequisite, optimization aim is the coordinate of trying to achieve reference point in lower dimensional space, makes the Euclidean distance under lower dimensional space and similarity the most close, namely minimum error
described reference point dimensionality reduction mapping phase comprises by formula (6) calculating square distance matrix Δ
n(i, j):
Δ
n(i,j)=D
G(i,j)*D
G(i,j),(i,j=1,…n) (6)
Then by formula (7) computation of mean values vector
Described
represent Δ
ni-th row of (i, j), i value is 1 to n;
Step 13: by formula (8) computation of mean values centralization matrix H
n:
Wherein δ (i, j) is intermediate parameters, generally gets 1, H
n(i, j) represents average centralization matrix H
nthe element value of the i-th row jth row;
Step 14: by formula (9) inner product matrix B
n:
Step 15: calculate eigenwert proper vector, described calculating eigenwert proper vector comprises calculating B
nmaximum d positive eigenvalue λ
1... λ
dwith its characteristic of correspondence vector
wherein d is the target dimension of dimensionality reduction;
Step 16: the dimensionality reduction mapping phase entering reference point, described dimensionality reduction mapping phase comprises and obtains by formula (10) matrix L that maps for the dimensionality reduction of reference point:
Step 17: Enter the online data dimension reduction mapping stage. The goal of this stage is to use the information obtained in the preceding steps to map new high-dimensional data to the low-dimensional space online. The stage first determines the reference point to which the new data point belongs, that is, the reference point $L_\alpha$ nearest to the new data pattern ξ, by formula (11):
$$L_\alpha = \arg\min_{L_i \in A} \|\xi - L_i\| \qquad (11)$$
Step 18: Obtain the similarity $D_S(\xi, L_i)$ between the new data pattern ξ and every reference point by formula (12):
$$D_S(\xi, L_i) = \|\xi - L_\alpha\| + D_G(\alpha, i) \qquad (12)$$
Step 19: Obtain the squared distance vector by formula (13);
Step 20: Obtain the pseudoinverse transpose by formula (14), where $L^{\#}$ denotes the pseudoinverse transpose of the matrix L that maps the reference points:
Step 21: Map the new data pattern ξ to the low-dimensional space by formula (15) to obtain the low-dimensional mapping vector $l_\xi$:
As the drawings for Embodiment 1 show, this embodiment overcomes the shortcoming of conventional linear dimension reduction methods, which use Euclidean distance to represent similarity. It instead measures similarity by geodesic distance, which yields a good dimension reduction result and provides reliable preprocessing for subsequent data analysis.
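Steps 17 and 18 of the method can be sketched as follows: find the nearest reference point $L_\alpha$, then approximate the similarity of ξ to every reference point with formula (12). The name `approx_similarities` is hypothetical, indices are 0-based, and `d_g` stands for the matrix $D_G$ from step 11 held as a list of lists; none of these come from the patent.

```python
import math

def approx_similarities(refs, d_g, xi):
    """Return (alpha, D_S) with D_S[i] = ||xi - L_alpha|| + D_G(alpha, i)."""
    # Nearest reference point to xi, as described for formula (11).
    alpha = min(range(len(refs)), key=lambda i: math.dist(refs[i], xi))
    offset = math.dist(refs[alpha], xi)
    # Formula (12): hop to L_alpha, then follow the stored graph distances.
    return alpha, [offset + d_g[alpha][i] for i in range(len(refs))]
```

Only one Euclidean nearest-neighbour search is needed per new sample; the rest is a lookup in one row of $D_G$, which is what makes the mapping usable online.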
Embodiment 2:
In this embodiment, the high-dimensional data set built from the received signal data, such as images or video, is the swiss_roll data set. The swiss_roll data set contains 200 Gaussian noise data points; in addition, 15000 of its data points are used to determine the reference points, and another 5000 data points are used to obtain the low-dimensional mapping vectors $l_\xi$. The details are as follows:
As shown in Fig. 1, Fig. 5, Fig. 6 and Fig. 7, the neural-network-based data dimension reduction system comprises a data acquisition system 1, which is connected to a control system 2; the control system 2 contains a neural-network-based data dimension reduction module 3.
The dimension reduction method of the neural-network-based data dimension reduction system has the following steps:
Step 1: The data acquisition system first sends the acquired signal data, such as images or video, to the control system 2. The control system 2 then starts the neural-network-based data dimension reduction module 3, which first organizes the received signal data into a high-dimensional data set and stores it;
Step 2: The neural-network-based data dimension reduction module 3 then determines the manifold topology reference points of the high-dimensional data. The goal of this process is to train a self-organizing neural network on the training data so that the training result represents the topology of the original data set, generating the required reference points and connections. Specifically, the process begins with initialization. First, the reference point set $A = \{L_1, L_2\}$ is set, where $L_1$ is the first reference point and $L_2$ is the second reference point, both chosen at random from the high-dimensional data set. The module 3 then sets the edge set C, two activation count variables with initial value 0, two distance threshold variables with initial value $\|L_1 - L_2\|$, and a first connection age variable with initial value 0. The edge set C consists of pairs from A × A and is initially empty; A × A represents the connections between reference points of the reference point set, and the empty initial value means that the first and second reference points are initially unconnected. The two activation count variables belong to the first and second reference points respectively, and the two distance threshold variables are the first and second distance threshold variables. The first connection age variable represents how long the connection between the first and second reference points has lasted;
Step 3: The input and competition stage follows. The data acquisition system continues to acquire one item of signal data, such as an image or video, and sends it to the control system. The neural-network-based data dimension reduction module 3 in the control system first stores the received item as a high-dimensional datum, which serves as a new data pattern $\xi \in R^D$, where $R^D$ denotes the high-dimensional real space, R denotes the real numbers, and D denotes the dimension of the high-dimensional data. The Euclidean distance between each reference point in A and the new data pattern ξ is then calculated. The reference points with the smallest and second-smallest Euclidean distances are the winner reference point $s_1$ and the runner-up reference point $s_2$ respectively, as expressed by formula (1) and formula (2):
$$s_1 = \arg\min_{L_i \in A} \|\xi - L_i\| \qquad (1)$$
$$s_2 = \arg\min_{L_i \in A \setminus \{s_1\}} \|\xi - L_i\| \qquad (2)$$
The winner reference point $s_1$ and the runner-up reference point $s_2$ are thus the two most similar reference points. The reference point update stage follows: the module 3 checks two conditions, and if either holds, the new data pattern ξ is put into the reference point set A as a new reference point with value ξ, that is, $A = A \cup \{\xi\}$, and execution returns to step 3;
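The competition in step 3 can be sketched as below. The function names and the list-of-tuples representation are illustrative assumptions. The insertion test uses the rule common to self-organizing incremental networks, namely that ξ lies farther from $s_1$ or from $s_2$ than that point's distance threshold; the two conditions themselves are not legible in this text, so that rule is an assumption rather than the patent's stated condition.

```python
import math

def find_winners(refs, xi):
    """Return indices (s1, s2) of the nearest and second-nearest reference points."""
    order = sorted(range(len(refs)), key=lambda i: math.dist(refs[i], xi))
    return order[0], order[1]

def is_new_reference(refs, thresholds, xi):
    """Assumed insertion rule: xi falls outside the threshold of s1 or of s2."""
    s1, s2 = find_winners(refs, xi)
    return (math.dist(refs[s1], xi) > thresholds[s1]
            or math.dist(refs[s2], xi) > thresholds[s2])
```

For instance, with reference points (0, 0), (4, 0) and (0, 3), the sample (1, 0.5) has the first point as winner and the third as runner-up.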
Step 4: If no connection exists between $s_1$ and $s_2$, perform $C = C \cup \{(s_1, s_2)\}$, which connects the two most similar reference points, and set the second age variable, whose initial value is 0; the second age variable represents how long the connection between the winner reference point $s_1$ and the runner-up reference point $s_2$ has lasted. Then, for every $(s_1, L_i) \in C$, increment the third age variable, that is, add 1 to the connection duration of every reference point connected to $s_1$; the third age variable represents how long the connection between the winner reference point $s_1$ and each reference point $L_i$ connected to it has lasted, and i is a natural number variable. An activation count variable is set for the winner reference point $s_1$ and incremented; its value increases from 0. Then perform $s_1 = s_1 + \varepsilon(t)\|\xi - s_1\|$ and $s_2 = s_2 + \varepsilon'(t)\|\xi - s_2\|$, which move $s_1$ and $s_2$ toward the new data pattern, where t is the running time of the neural-network-based data dimension reduction system;
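A minimal sketch of step 4 follows. Edges are stored in a dictionary that maps an unordered pair of indices to its age; `adapt_winner`, `eps1` and `eps2` are hypothetical names. The printed update adds the scalar ε(t)‖ξ − s‖ to a vector, so the sketch assumes the intended move is along the difference vector ξ − s, which is what moving toward the new data requires. The learning-rate defaults are placeholders, since the definitions of ε(t) and ε′(t) are not legible in this text.

```python
def adapt_winner(refs, ages, counts, s1, s2, xi, eps1=0.1, eps2=0.01):
    """Apply the step-4 updates in place for winner s1 and runner-up s2."""
    key = frozenset((s1, s2))
    # Connect s1 and s2 if they are not yet connected; a new edge starts at age 0.
    if key not in ages:
        ages[key] = 0
    # Add 1 to the age of every connection that touches the winner.
    for edge in ages:
        if s1 in edge:
            ages[edge] += 1
    # Count one more activation for the winner.
    counts[s1] = counts.get(s1, 0) + 1
    # Move both points toward xi (assumed vector form of the update).
    refs[s1] = tuple(a + eps1 * (x - a) for a, x in zip(refs[s1], xi))
    refs[s2] = tuple(a + eps2 * (x - a) for a, x in zip(refs[s2], xi))
```

Following the order given in the step, a connection created in this call is aged together with the winner's other connections, so it leaves the call with age 1.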
Step 5: The neural-network-based data dimension reduction module 3 checks every connection $(L_i, L_j) \in C$ between reference points together with the current age of that connection. If the age exceeds $age_{max}$, the connection is removed from C, where $age_{max}$ is a predefined maximum connection duration, i and j are unequal natural numbers, and the age of $(L_i, L_j)$ is the duration of the connection between $L_i$ and $L_j$;
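Step 5 reduces to filtering the edge set by age. The sketch assumes an age dictionary keyed by index pairs and assumes that a connection is dropped when its age is strictly greater than $age_{max}$; the name `prune_old_edges` and the strict comparison are both assumptions, because the removal condition is not legible in this text.

```python
def prune_old_edges(ages, age_max):
    """Return a new age dictionary without connections older than age_max."""
    return {edge: age for edge, age in ages.items() if age <= age_max}
```

Returning a new dictionary avoids deleting entries from a collection while iterating over it.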
Step 6: The neural-network-based data dimension reduction module 3 then updates the distance thresholds of the reference points: the distance thresholds of the winner reference point $s_1$ and the runner-up reference point $s_2$ are updated by formula (3) and formula (4) to the maximum distance between $s_1$ or $s_2$, respectively, and its adjacent reference points. The denoising stage follows: whenever the total number of data samples input so far is an integer multiple of the preset value λ, the module 3 checks every reference point in the reference point set A. If some reference point $L_i$ has only one connected reference point and its activation count is less than the preset minimum activation count $M_{min}$, $L_i$ is deleted from the reference point set A. Execution then returns to step 2. Once all training data samples have been input, the required manifold topology reference point set A and the connections C between reference points are obtained.
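The threshold update and the denoising pass of step 6 might look like this. Edges are index pairs, `counts` maps an index to its activation count, and all function names are hypothetical. A point without neighbours keeps its old threshold here; the text does not say what should happen in that case.

```python
import math

def neighbours(edges, i):
    """Indices connected to reference point i in the edge set."""
    return {j for e in edges if i in e for j in e if j != i}

def update_threshold(refs, edges, thresholds, s):
    """Set the threshold of s to its largest distance to an adjacent point."""
    nbrs = neighbours(edges, s)
    if nbrs:
        thresholds[s] = max(math.dist(refs[s], refs[j]) for j in nbrs)

def noisy_points(refs, edges, counts, m_min):
    """Indices the denoising rule would delete: one neighbour and few activations."""
    return [i for i in range(len(refs))
            if len(neighbours(edges, i)) == 1 and counts.get(i, 0) < m_min]
```

A caller would run `noisy_points` only when the number of samples seen so far is a multiple of λ, then remove the returned points and their connections.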
Step 7: The neural-network-based data dimension reduction module 3 then enters the reference point similarity calculation stage. Using the topology graph produced by the preceding steps, that is, the reference points and their connections, the similarity between two reference points is represented by the shortest path between them in the graph. Given the n reference points produced, this stage must calculate the shortest path from each reference point to all other reference points, producing the n×n similarity matrix $D_G$. The stage begins by setting the natural number variable i to 0;
Step 8: Increment i by 1 and take the reference point $L_i$ (i = 1, …, n), where n is the number of reference points in the reference point set A. For this reference point $L_i$, enter the initialization phase of the similarity calculation stage: first perform $S = \{L_i\}$ and $U = A - \{L_i\}$, where S is the first intermediate set and U is the second intermediate set. Then set the element $D_G(i,i)$ of the n×n similarity matrix $D_G$ to 0; $D_G(i,i)$ is the similarity of $L_i$ to itself. For each reference point $L_j \in U$, if $L_i$ and $L_j$ are connected, that is, $(L_i, L_j) \in C$, set $D_G(i,j)$ to $\|L_i - L_j\|$; otherwise set $D_G(i,j)$ to ∞. $D_G(i,j)$ is the similarity between $L_i$ and the element $L_j$ of U;
Step 9: Enter the intermediate point selection stage: choose from U the reference point $L_{\min}$ with the smallest similarity value to $L_i$, that is, $L_{\min} = \arg\min D_G(i,j)$ with $L_{\min} \in U$, and move $L_{\min}$ into S, that is, $S = S \cup \{L_{\min}\}$, $U = U - \{L_{\min}\}$;
Step 10: Enter the edge expansion stage: for each reference point $L_k \in U$, where k is a natural number, if $L_{\min}$ and $L_k$ are connected, that is, $(L_{\min}, L_k) \in C$, and $D_G(i,\min) + \|L_{\min} - L_k\| < D_G(i,k)$, where min is the index of $L_{\min}$, perform the update shown in formula (5):
$$D_G(i,k) = D_G(i,\min) + \|L_{\min} - L_k\| \qquad (5)$$
Steps 9 and 10 are then repeated until S = A, at which point U is empty;
Step 11: Return to step 8. Once i reaches n, meaning that every reference point in the reference point set A has been processed, the n×n similarity matrix $D_G$ is obtained;
Step 12: The neural-network-based data dimension reduction module 3 then enters the reference point dimension reduction mapping stage. The purpose of this stage is to map the reference points to a lower dimension while preserving the n×n similarity matrix $D_G$. The optimization goal is to find coordinates for the reference points in the low-dimensional space such that the Euclidean distances there are as close as possible to the similarities, that is, the error between them is minimized. The stage first calculates the squared distance matrix $\Delta_n(i,j)$ by formula (6):
$$\Delta_n(i,j) = D_G(i,j) \cdot D_G(i,j), \quad i,j = 1,\dots,n \qquad (6)$$
The mean vector is then computed by formula (7), in which the averaged terms are the rows of $\Delta_n$, indexed by i from 1 to n;
Step 13: Compute the mean-centering matrix $H_n$ by formula (8):
where δ(i, j) is an intermediate parameter, generally taken as 1, and $H_n(i,j)$ denotes the element in row i and column j of the mean-centering matrix $H_n$;
Step 14: Compute the inner product matrix $B_n$ by formula (9):
Step 15: Calculate the eigenvalues and eigenvectors: compute the d largest positive eigenvalues $\lambda_1, \dots, \lambda_d$ of $B_n$ and their corresponding eigenvectors, where d is the target dimension of the reduction;
Step 16: Enter the dimension reduction mapping stage for the reference points, in which the matrix L giving the dimension reduction mapping of the reference points is obtained by formula (10):
Step 17: Enter the online data dimension reduction mapping stage. The goal of this stage is to use the information obtained in the preceding steps to map new high-dimensional data to the low-dimensional space online. The stage first determines the reference point to which the new data point belongs, that is, the reference point $L_\alpha$ nearest to the new data pattern ξ, by formula (11):
$$L_\alpha = \arg\min_{L_i \in A} \|\xi - L_i\| \qquad (11)$$
Step 18: Obtain the similarity $D_S(\xi, L_i)$ between the new data pattern ξ and every reference point by formula (12):
$$D_S(\xi, L_i) = \|\xi - L_\alpha\| + D_G(\alpha, i) \qquad (12)$$
Step 19: Obtain the squared distance vector by formula (13);
Step 20: Obtain the pseudoinverse transpose by formula (14), where $L^{\#}$ denotes the pseudoinverse transpose of the matrix L that maps the reference points:
Step 21: Map the new data pattern ξ to the low-dimensional space by formula (15) to obtain the low-dimensional mapping vector $l_\xi$:
As the drawings for Embodiment 1 show, this embodiment overcomes the shortcoming of conventional linear dimension reduction methods, which use Euclidean distance to represent similarity. It instead measures similarity by geodesic distance, which yields a good dimension reduction result and provides reliable preprocessing for subsequent data analysis.
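Formulas (7) to (10) of steps 12 to 16 are not legible in this text. For orientation, the corresponding quantities in standard landmark multidimensional scaling are written out below. Reading the patent's formulas as these standard forms is an assumption, as are the symbols $\bar{\delta}$, $\delta_i$, $v_i$ and $\mathbf{1}$ (the all-ones vector).

```latex
\begin{aligned}
\Delta_n(i,j) &= D_G(i,j)^2, \\
\bar{\delta} &= \frac{1}{n}\sum_{i=1}^{n} \delta_i,
  \qquad \delta_i = \text{the $i$-th column of } \Delta_n, \\
H_n &= I_n - \frac{1}{n}\,\mathbf{1}\mathbf{1}^{\mathsf T}, \\
B_n &= -\frac{1}{2}\, H_n \Delta_n H_n, \\
L &= \begin{bmatrix}
  \sqrt{\lambda_1}\, v_1^{\mathsf T} \\ \vdots \\ \sqrt{\lambda_d}\, v_d^{\mathsf T}
\end{bmatrix}
\end{aligned}
```

Here $\lambda_1 \ge \dots \ge \lambda_d > 0$ are the d largest eigenvalues of $B_n$ and $v_1, \dots, v_d$ the corresponding unit eigenvectors, so that column j of L is the d-dimensional coordinate of reference point $L_j$. Because $\Delta_n$ is symmetric, its rows and columns coincide.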
Embodiment 3:
In this embodiment, the high-dimensional data set built from the received signal data, such as images or video, is the swiss_roll data set. The swiss_roll data set contains 100 uniform noise data points; in addition, 15000 of its data points are used to determine the reference points, and another 5000 data points are used to obtain the low-dimensional mapping vectors $l_\xi$. The details are as follows:
As shown in Fig. 1, Fig. 8, Fig. 9 and Fig. 10, the neural-network-based data dimension reduction system comprises a data acquisition system 1, which is connected to a control system 2; the control system 2 contains a neural-network-based data dimension reduction module 3.
The dimension reduction method of the neural-network-based data dimension reduction system has the following steps:
Step 1: The data acquisition system first sends the acquired signal data, such as images or video, to the control system 2. The control system 2 then starts the neural-network-based data dimension reduction module 3, which first organizes the received signal data into a high-dimensional data set and stores it;
Step 2: The neural-network-based data dimension reduction module 3 then determines the manifold topology reference points of the high-dimensional data. The goal of this process is to train a self-organizing neural network on the training data so that the training result represents the topology of the original data set, generating the required reference points and connections. Specifically, the process begins with initialization. First, the reference point set $A = \{L_1, L_2\}$ is set, where $L_1$ is the first reference point and $L_2$ is the second reference point, both chosen at random from the high-dimensional data set. The module 3 then sets the edge set C, two activation count variables with initial value 0, two distance threshold variables with initial value $\|L_1 - L_2\|$, and a first connection age variable with initial value 0. The edge set C consists of pairs from A × A and is initially empty; A × A represents the connections between reference points of the reference point set, and the empty initial value means that the first and second reference points are initially unconnected. The two activation count variables belong to the first and second reference points respectively, and the two distance threshold variables are the first and second distance threshold variables. The first connection age variable represents how long the connection between the first and second reference points has lasted;
Step 3: The input and competition stage follows. The data acquisition system continues to acquire one item of signal data, such as an image or video, and sends it to the control system. The neural-network-based data dimension reduction module 3 in the control system first stores the received item as a high-dimensional datum, which serves as a new data pattern $\xi \in R^D$, where $R^D$ denotes the high-dimensional real space, R denotes the real numbers, and D denotes the dimension of the high-dimensional data. The Euclidean distance between each reference point in A and the new data pattern ξ is then calculated. The reference points with the smallest and second-smallest Euclidean distances are the winner reference point $s_1$ and the runner-up reference point $s_2$ respectively, as expressed by formula (1) and formula (2):
$$s_1 = \arg\min_{L_i \in A} \|\xi - L_i\| \qquad (1)$$
$$s_2 = \arg\min_{L_i \in A \setminus \{s_1\}} \|\xi - L_i\| \qquad (2)$$
The winner reference point $s_1$ and the runner-up reference point $s_2$ are thus the two most similar reference points. The reference point update stage follows: the module 3 checks two conditions, and if either holds, the new data pattern ξ is put into the reference point set A as a new reference point with value ξ, that is, $A = A \cup \{\xi\}$, and execution returns to step 3;
Step 4: If no connection exists between $s_1$ and $s_2$, perform $C = C \cup \{(s_1, s_2)\}$, which connects the two most similar reference points, and set the second age variable, whose initial value is 0; the second age variable represents how long the connection between the winner reference point $s_1$ and the runner-up reference point $s_2$ has lasted. Then, for every $(s_1, L_i) \in C$, increment the third age variable, that is, add 1 to the connection duration of every reference point connected to $s_1$; the third age variable represents how long the connection between the winner reference point $s_1$ and each reference point $L_i$ connected to it has lasted, and i is a natural number variable. An activation count variable is set for the winner reference point $s_1$ and incremented; its value increases from 0. Then perform $s_1 = s_1 + \varepsilon(t)\|\xi - s_1\|$ and $s_2 = s_2 + \varepsilon'(t)\|\xi - s_2\|$, which move $s_1$ and $s_2$ toward the new data pattern, where t is the running time of the neural-network-based data dimension reduction system;
Step 5: The neural-network-based data dimension reduction module 3 checks every connection $(L_i, L_j) \in C$ between reference points together with the current age of that connection. If the age exceeds $age_{max}$, the connection is removed from C, where $age_{max}$ is a predefined maximum connection duration, i and j are unequal natural numbers, and the age of $(L_i, L_j)$ is the duration of the connection between $L_i$ and $L_j$;
Step 6: The neural-network-based data dimension reduction module 3 then updates the distance thresholds of the reference points: the distance thresholds of the winner reference point $s_1$ and the runner-up reference point $s_2$ are updated by formula (3) and formula (4) to the maximum distance between $s_1$ or $s_2$, respectively, and its adjacent reference points. The denoising stage follows: whenever the total number of data samples input so far is an integer multiple of the preset value λ, the module 3 checks every reference point in the reference point set A. If some reference point $L_i$ has only one connected reference point and its activation count is less than the preset minimum activation count $M_{min}$, $L_i$ is deleted from the reference point set A. Execution then returns to step 2. Once all training data samples have been input, the required manifold topology reference point set A and the connections C between reference points are obtained.
Step 7: The neural-network-based data dimension reduction module 3 then enters the reference point similarity calculation stage. Using the topology graph produced by the preceding steps, that is, the reference points and their connections, the similarity between two reference points is represented by the shortest path between them in the graph. Given the n reference points produced, this stage must calculate the shortest path from each reference point to all other reference points, producing the n×n similarity matrix $D_G$. The stage begins by setting the natural number variable i to 0;
Step 8: Increment i by 1 and take the reference point $L_i$ (i = 1, …, n), where n is the number of reference points in the reference point set A. For this reference point $L_i$, enter the initialization phase of the similarity calculation stage: first perform $S = \{L_i\}$ and $U = A - \{L_i\}$, where S is the first intermediate set and U is the second intermediate set. Then set the element $D_G(i,i)$ of the n×n similarity matrix $D_G$ to 0; $D_G(i,i)$ is the similarity of $L_i$ to itself. For each reference point $L_j \in U$, if $L_i$ and $L_j$ are connected, that is, $(L_i, L_j) \in C$, set $D_G(i,j)$ to $\|L_i - L_j\|$; otherwise set $D_G(i,j)$ to ∞. $D_G(i,j)$ is the similarity between $L_i$ and the element $L_j$ of U;
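The initialisation in step 8 fills one row of $D_G$ with direct edge lengths and with infinity everywhere else. A sketch follows, with `init_row` as an illustrative name, 0-based indices, and the edge set C given as index pairs; these representation choices are assumptions.

```python
import math

def init_row(refs, edges, i):
    """Row i of D_G before any path search: 0 on the diagonal, the edge length
    for points connected to L_i, and infinity for all other points."""
    connected = {frozenset(e) for e in edges}   # undirected connections
    row = []
    for j in range(len(refs)):
        if j == i:
            row.append(0.0)
        elif frozenset((i, j)) in connected:
            row.append(math.dist(refs[i], refs[j]))
        else:
            row.append(math.inf)
    return row
```

Storing each connection as an unordered pair means the same row is produced whether C lists an edge as (i, j) or as (j, i).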
Step 9: Enter the intermediate point selection stage: choose from U the reference point $L_{\min}$ with the smallest similarity value to $L_i$, that is, $L_{\min} = \arg\min D_G(i,j)$ with $L_{\min} \in U$, and move $L_{\min}$ into S, that is, $S = S \cup \{L_{\min}\}$, $U = U - \{L_{\min}\}$;
Step 10: Enter the edge expansion stage: for each reference point $L_k \in U$, where k is a natural number, if $L_{\min}$ and $L_k$ are connected, that is, $(L_{\min}, L_k) \in C$, and $D_G(i,\min) + \|L_{\min} - L_k\| < D_G(i,k)$, where min is the index of $L_{\min}$, perform the update shown in formula (5):
$$D_G(i,k) = D_G(i,\min) + \|L_{\min} - L_k\| \qquad (5)$$
Steps 9 and 10 are then repeated until S = A, at which point U is empty;
Step 11: Return to step 8. Once i reaches n, meaning that every reference point in the reference point set A has been processed, the n×n similarity matrix $D_G$ is obtained;
Step 12: The neural-network-based data dimension reduction module 3 then enters the reference point dimension reduction mapping stage. The purpose of this stage is to map the reference points to a lower dimension while preserving the n×n similarity matrix $D_G$. The optimization goal is to find coordinates for the reference points in the low-dimensional space such that the Euclidean distances there are as close as possible to the similarities, that is, the error between them is minimized. The stage first calculates the squared distance matrix $\Delta_n(i,j)$ by formula (6):
$$\Delta_n(i,j) = D_G(i,j) \cdot D_G(i,j), \quad i,j = 1,\dots,n \qquad (6)$$
The mean vector is then computed by formula (7), in which the averaged terms are the rows of $\Delta_n$, indexed by i from 1 to n;
Step 13: Compute the mean-centering matrix $H_n$ by formula (8):
where δ(i, j) is an intermediate parameter, generally taken as 1, and $H_n(i,j)$ denotes the element in row i and column j of the mean-centering matrix $H_n$;
Step 14: Compute the inner product matrix $B_n$ by formula (9):
Step 15: Calculate the eigenvalues and eigenvectors: compute the d largest positive eigenvalues $\lambda_1, \dots, \lambda_d$ of $B_n$ and their corresponding eigenvectors, where d is the target dimension of the reduction;
Step 16: Enter the dimension reduction mapping stage for the reference points, in which the matrix L giving the dimension reduction mapping of the reference points is obtained by formula (10):
Step 17: Enter the online data dimension reduction mapping stage. The goal of this stage is to use the information obtained in the preceding steps to map new high-dimensional data to the low-dimensional space online. The stage first determines the reference point to which the new data point belongs, that is, the reference point $L_\alpha$ nearest to the new data pattern ξ, by formula (11):
$$L_\alpha = \arg\min_{L_i \in A} \|\xi - L_i\| \qquad (11)$$
Step 18: Obtain the similarity $D_S(\xi, L_i)$ between the new data pattern ξ and every reference point by formula (12):
$$D_S(\xi, L_i) = \|\xi - L_\alpha\| + D_G(\alpha, i) \qquad (12)$$
Step 19: Obtain the squared distance vector by formula (13);
Step 20: Obtain the pseudoinverse transpose by formula (14), where $L^{\#}$ denotes the pseudoinverse transpose of the matrix L that maps the reference points:
Step 21: Map the new data pattern ξ to the low-dimensional space by formula (15) to obtain the low-dimensional mapping vector $l_\xi$:
As the drawings for Embodiment 1 show, this embodiment overcomes the shortcoming of conventional linear dimension reduction methods, which use Euclidean distance to represent similarity. It instead measures similarity by geodesic distance, which yields a good dimension reduction result and provides reliable preprocessing for subsequent data analysis.
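Formulas (13) to (15) of steps 19 to 21 are not legible in this text. In standard landmark multidimensional scaling the out-of-sample mapping takes the form below. Reading the patent's formulas this way is an assumption, as are the symbols $\delta_\xi$ (the squared distance vector), $\bar{\delta}$ (the mean of the columns of $\Delta_n$) and $v_i$ (the unit eigenvector of $B_n$ belonging to $\lambda_i$).

```latex
\begin{aligned}
\delta_\xi(i) &= D_S(\xi, L_i)^2, \qquad i = 1,\dots,n, \\
L^{\#} &= \begin{bmatrix}
  v_1^{\mathsf T}/\sqrt{\lambda_1} \\ \vdots \\ v_d^{\mathsf T}/\sqrt{\lambda_d}
\end{bmatrix}, \\
l_\xi &= -\frac{1}{2}\, L^{\#} \left( \delta_\xi - \bar{\delta} \right)
\end{aligned}
```

Under the further assumption that $B_n = -\tfrac{1}{2} H_n \Delta_n H_n$ with $H_n = I_n - \tfrac{1}{n}\mathbf{1}\mathbf{1}^{\mathsf T}$, this form passes a simple consistency check. If ξ coincides with reference point $L_j$, then $\delta_\xi$ is column j of $\Delta_n$, and since $L^{\#}\mathbf{1} = 0$ we get $-\tfrac{1}{2} L^{\#}(\delta_\xi - \bar{\delta}) = L^{\#} B_n e_j = \sqrt{\lambda}\text{-scaled } v^{\mathsf T} e_j$, which is column j of L. A reference point is therefore mapped onto its own low-dimensional coordinate.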
The above is only a preferred embodiment of the present invention and does not limit the invention in any form. Although the invention is disclosed above by way of preferred embodiments, these are not intended to limit it. Any person skilled in the art may, without departing from the scope of the technical solution of the invention, use the technical content disclosed above to make minor changes or modifications that yield equivalent embodiments. Any simple modification, equivalent replacement or improvement made to the above embodiments in accordance with the technical essence of the invention, within its spirit and principles and without departing from the content of its technical solution, still falls within the scope of protection of the technical solution of the invention.
Claims (2)
1. A neural-network-based data dimension reduction system, comprising a data acquisition system, wherein the data acquisition system is connected to a control system, and the control system contains a neural-network-based data dimension reduction module.
2. A dimension reduction method for the neural-network-based data dimension reduction system according to claim 1, characterized in that the steps are as follows:
Step 1: The data acquisition system first sends the acquired signal data, such as images or video, to the control system. The control system then starts the neural-network-based data dimension reduction module, which first organizes the received signal data into a high-dimensional data set and stores it;
Step 2: The neural-network-based data dimensionality reduction module then determines the manifold topology reference points of the high-dimensional data. Specifically, the high-dimensional data are first initialized. Initialization comprises first setting the reference point set $A = \{L_1, L_2\}$, where $A$ is the reference point set, $L_1$ is the first reference point and $L_2$ is the second reference point; the first and second reference points are two high-dimensional data items chosen at random from the high-dimensional data set. The module then sets an edge set $C$, two activation count variables with initial value 0, two distance threshold variables with initial value $\|L_1 - L_2\|$, and a first connection age variable with initial value 0. The edge set satisfies $C \subseteq A \times A$ and its initial value is the empty set; $A \times A$ represents the connection relations between the reference points of the reference point set, and the empty initial value means that the first and second reference points are initially unconnected. The two activation count variables are the activation count variable for the first reference point and the activation count variable for the second reference point. The two distance threshold variables are the first distance threshold variable and the second distance threshold variable. The first connection age variable represents the connection duration between the first reference point and the second reference point;
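The initialization of step 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `init_reference_points`, the dictionary keys and the random demo data are all hypothetical, and reference points are identified by their index in the list `A`.

```python
import numpy as np

def init_reference_points(data, rng):
    """Set up the quantities named in step 2 from two randomly chosen samples."""
    # Choose two distinct high-dimensional samples as L1 and L2.
    i, j = rng.choice(len(data), size=2, replace=False)
    L1, L2 = data[i].astype(float), data[j].astype(float)
    d0 = float(np.linalg.norm(L1 - L2))
    return {
        "A": [L1, L2],                     # reference point set A = {L1, L2}
        "C": set(),                        # edge set C, initially empty
        "M": [0, 0],                       # activation counts, initial value 0
        "T": [d0, d0],                     # distance thresholds, initial value ||L1 - L2||
        "age": {frozenset((0, 1)): 0},     # first connection age variable, initial value 0
    }

# Hypothetical demo data: 50 samples of dimension D = 8.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 8))
state = init_reference_points(data, rng)
```

Storing an edge as a `frozenset` of two indices is a design choice made here so that $(L_i, L_j)$ and $(L_j, L_i)$ denote the same connection.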
Step 3: The input and competition stage is then entered. In this stage the data acquisition system continues to collect one item of signal data, such as an image or video, and sends it to the control system. The neural-network-based data dimensionality reduction module 3 in the control system first stores the received image or video signal data as one high-dimensional data item, which serves as a new data model $\xi \in R^D$, where $R^D$ denotes the high-dimensional real space, $R$ denotes the real numbers and $D$ is the dimension of the high-dimensional data. The Euclidean distance between each reference point in $A$ and the new data model $\xi$ is then calculated; the reference point with the smallest Euclidean distance and the reference point with the second-smallest Euclidean distance are the winner reference point $s_1$ and the runner-up reference point $s_2$ respectively, as expressed by formula (1) and formula (2). The winner reference point $s_1$ and the runner-up reference point $s_2$ are thus the two most similar reference points. The reference point update stage is entered next: the neural-network-based data dimensionality reduction module 3 judges whether either of two conditions holds; if so, the new data model $\xi$ is placed into the reference point set $A$ to generate a new reference point whose value is $\xi$, i.e. $A = A \cup \{\xi\}$, and execution returns to step 3;
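The competition and insertion test of step 3 can be sketched as below. The two conditions themselves are not legible in this text, so the comparison used here (the new sample lies farther from $s_1$ or $s_2$ than that point's distance threshold) is an assumption, as are the function names and demo values.

```python
import numpy as np

def find_winners(A, xi):
    """Return indices of the nearest (s1) and second-nearest (s2) reference points."""
    dists = np.linalg.norm(np.asarray(A, dtype=float) - xi, axis=1)
    order = np.argsort(dists)
    return int(order[0]), int(order[1]), dists

def is_new_reference_point(dists, s1, s2, T):
    """Assumed insertion test: xi lies outside the threshold of s1 or of s2."""
    return bool(dists[s1] > T[s1] or dists[s2] > T[s2])

# Hypothetical demo: three reference points in the plane, all thresholds 1.
A_demo = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
T_demo = [1.0, 1.0, 1.0]
near = find_winners(A_demo, np.array([0.2, 0.0]))
far = find_winners(A_demo, np.array([10.0, 10.0]))
```

When the test returns `True`, the sample would be appended to `A` as a new reference point and the next sample read; otherwise processing continues with step 4.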
Step 4: If no connection exists between $s_1$ and $s_2$, the operation $C = C \cup \{(s_1, s_2)\}$ is performed, i.e. a connection is established between the two most similar reference points, and a second age variable with initial value 0 is set anew; the second age variable represents the connection duration between the winner reference point $s_1$ and the runner-up reference point $s_2$. Then, for every $L_i$ with $(s_1, L_i) \in C$, an age increment operation is performed, i.e. the connection duration of every reference point connected to $s_1$ is increased by 1; the variable incremented is the third age variable, which represents the connection duration between the winner reference point $s_1$ and each reference point $L_i$ connected to it, where $i$ is a natural number variable. An activation count variable is set for the winner reference point $s_1$ and an increment operation is performed on it, so that its value increases from 0. The operations $s_1 = s_1 + \varepsilon(t)\,\|\xi - s_1\|$ and $s_2 = s_2 + \varepsilon'(t)\,\|\xi - s_2\|$ are then performed, i.e. $s_1$ and $s_2$ are moved toward the new data model, where $t$ is the running time of the neural-network-based data dimensionality reduction system;
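A hedged sketch of the step 4 update follows. Several points are assumptions rather than statements of the source: the movement is implemented with the vector difference $\xi - s$ (adding the scalar norm written in the text to a vector would not move it toward $\xi$), the rates $\varepsilon(t) = 1/t$ and $\varepsilon'(t) = 1/(100t)$ are placeholders because their definitions are not legible here, and $t$ is treated as a positive step counter. The order of operations (set the age to 0, then increment every edge at $s_1$) follows the text as read.

```python
import numpy as np

def update_winners(A, C, age, M, s1, s2, xi, t):
    """Connect s1 and s2, age the edges at s1, count the activation, move both points."""
    edge = frozenset((s1, s2))
    if edge not in C:
        C.add(edge)                 # C = C ∪ {(s1, s2)}
    age[edge] = 0                   # second age variable, initial value 0
    for e in C:
        if s1 in e:
            age[e] += 1             # every connection of s1 grows one unit older
    M[s1] += 1                      # activation count of the winner
    eps1, eps2 = 1.0 / t, 1.0 / (100.0 * t)   # assumed rate schedules
    A[s1] += eps1 * (xi - A[s1])    # move the winner toward xi
    A[s2] += eps2 * (xi - A[s2])    # move the runner-up toward xi

# Hypothetical demo state.
A_demo = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 3.0]])
C_demo = {frozenset((0, 2))}
age_demo = {frozenset((0, 2)): 4}
M_demo = [0, 0, 0]
update_winners(A_demo, C_demo, age_demo, M_demo, 0, 1, np.array([1.0, 0.0]), t=2)
```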
Step 5: The neural-network-based data dimensionality reduction module checks every connection $(L_i, L_j) \in C$ between reference points, together with the current age parameter corresponding to each such connection; if the age parameter exceeds $age_{max}$, the connection is removed from $C$, where $age_{max}$ is the predefined maximum connection duration, $i$ and $j$ are unequal natural numbers, and the age parameter is the connection duration between $L_i$ and $L_j$;
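The edge pruning of step 5 amounts to a filter over the edge set. The sketch below assumes a strict comparison with $age_{max}$ and the same hypothetical edge representation (a `frozenset` of two indices) as a key into an age dictionary.

```python
def prune_old_edges(C, age, age_max):
    """Remove every connection whose age exceeds age_max (strict '>' is assumed)."""
    expired = {e for e in C if age[e] > age_max}
    C.difference_update(expired)    # drop the expired connections from C
    for e in expired:
        del age[e]                  # forget their age variables as well
    return expired

# Hypothetical demo: one young edge, one expired edge.
young, old = frozenset((0, 1)), frozenset((1, 2))
C_demo = {young, old}
age_demo = {young: 3, old: 11}
removed = prune_old_edges(C_demo, age_demo, age_max=10)
```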
Step 6: The neural-network-based data dimensionality reduction module then performs the update stage for the distance thresholds of the reference points, in which the distance thresholds of $s_1$ and $s_2$ are updated by formula (3) and formula (4), respectively, to the maximum distance between $s_1$ or $s_2$ and its adjacent reference points; these are the distance threshold for the winner reference point $s_1$ and the distance threshold for the runner-up reference point $s_2$. The denoising stage is then entered: if the module judges that the total number of data samples input so far is an integer multiple of the set value $\lambda$, it checks all reference points in the reference point set $A$; if some reference point $L_i$ has only one connected reference point and the activation count variable for $L_i$ is less than the set minimum activation count $M_{min}$, that reference point $L_i$ is deleted from the reference point set $A$; execution then returns to the step and continues;
Step 7: The neural-network-based data dimensionality reduction module then enters the reference point similarity calculation stage;
Step 8: The natural number variable $i$ is increased by 1 and the reference point $L_i$ ($i = 1, \ldots, n$) is extracted, where $n$ is the number of reference points in the reference point set $A$. For this reference point $L_i$, the initialization phase of the reference point similarity calculation stage is entered. First the operations $S = \{L_i\}$ and $U = A - \{L_i\}$ are performed, where $S$ is the first intermediate set and $U$ is the second intermediate set. Then the element $D_G(i, i)$ of the $n \times n$ similarity matrix $D_G$ is set to 0; $D_G(i, i)$ represents the similarity value between the reference point $L_i$ and itself. For each reference point $L_j$ in $U$ ($L_j \in U$), if $L_i$ is connected to $L_j$, i.e. $(L_i, L_j) \in C$, the element $D_G(i, j)$ is set to $\|L_i - L_j\|$; otherwise $D_G(i, j)$ is set to $\infty$. $D_G(i, j)$ represents the similarity value between the reference point $L_i$ and the element $L_j$ of $U$;
Step 9: The intermediate point selection stage is entered, in which the reference point $L_{min}$ with the smallest similarity value to the reference point $L_i$ is chosen from $U$, i.e. $L_{min} = \arg\min D_G(i, j)$ with $L_{min} \in U$, and $L_{min}$ is added to $S$, i.e. $S = S \cup \{L_{min}\}$, $U = U - \{L_{min}\}$;
Step 10: The edge expansion stage is then entered. For each reference point $L_k$ in $U$ ($L_k \in U$), where $k$ is a natural number, if $L_{min}$ is connected to $L_k$, i.e. $(L_{min}, L_k) \in C$, and $D_G(i, min) + \|L_{min} - L_k\| < D_G(i, k)$, where $min$ is the index of $L_{min}$, the update shown in formula (5) is performed:
$$D_G(i, k) = D_G(i, min) + \|L_{min} - L_k\| \tag{5}$$
Steps 9 and 10 are then repeated until $S = A$;
Step 11: Execution returns to step 8; once $i$ reaches $n$, meaning that all reference points in the reference point set $A$ have been processed, the $n \times n$ similarity matrix $D_G$ is obtained;
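Steps 8 to 11 describe a shortest-path computation over the reference point graph, one source point at a time, in the manner of Dijkstra's algorithm. The sketch below is one way to write it under stated assumptions: edges are `frozenset` pairs of indices with no self-loops, and the function name `similarity_matrix` is hypothetical.

```python
import numpy as np

def similarity_matrix(A, C):
    """Graph (geodesic) distances between all reference points, following steps 8 to 11."""
    A = np.asarray(A, dtype=float)
    n = len(A)
    D_G = np.full((n, n), np.inf)
    for i in range(n):                       # step 8: take each reference point L_i in turn
        D_G[i, i] = 0.0
        for e in C:
            if i in e:
                (j,) = e - {i}               # the other endpoint of the connection
                D_G[i, j] = np.linalg.norm(A[i] - A[j])
        U = set(range(n)) - {i}              # S = {L_i}, U = A - {L_i}
        while U:                             # repeat steps 9 and 10 until S = A
            m = min(U, key=lambda j: D_G[i, j])   # step 9: closest remaining point
            U.remove(m)
            for k in U:                      # step 10: relax edges leaving L_min
                if frozenset((m, k)) in C:
                    cand = D_G[i, m] + np.linalg.norm(A[m] - A[k])
                    if cand < D_G[i, k]:
                        D_G[i, k] = cand     # formula (5)
    return D_G

# Hypothetical demo: a chain 0 - 1 - 2 plus an unconnected point 3.
A_demo = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [9.0, 9.0]])
C_demo = {frozenset((0, 1)), frozenset((1, 2))}
D_demo = similarity_matrix(A_demo, C_demo)
```

Unconnected reference points keep the value $\infty$, so the later mapping stages implicitly require a connected graph.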
Step 12: The neural-network-based data dimensionality reduction module then enters the reference point dimensionality reduction mapping stage, in which the squared distance matrix $\Delta_n(i, j)$ is calculated by formula (6). The mean vector is then calculated by formula (7), where the vector appearing in formula (7) represents the $i$-th row of $\Delta_n(i, j)$ and $i$ takes values from 1 to $n$;
Step 13: The mean-centering matrix $H_n$ is calculated by formula (8), where $\delta(i, j)$ is an intermediate parameter, generally taken as 1, and $H_n(i, j)$ is the element in row $i$ and column $j$ of the mean-centering matrix $H_n$;
Step 14: The inner product matrix $B_n$ is calculated by formula (9);
Step 15: The eigenvalues and eigenvectors are calculated, i.e. the $d$ largest positive eigenvalues $\lambda_1, \ldots, \lambda_d$ of $B_n$ and their corresponding eigenvectors are calculated, where $d$ is the target dimension of the dimensionality reduction;
Step 16: The dimensionality reduction mapping stage of the reference points is entered, in which the matrix $L$ of the dimensionality reduction mapping of the reference points is obtained by formula (10); the $n$ $d$-dimensional column vectors of $L$ are the coordinates of the $n$ reference points in the $d$-dimensional space;
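Steps 12 to 16 read like classical multidimensional scaling applied to the similarity matrix. Because formulas (6) to (10) are not legible in this text, the sketch below uses the standard classical-MDS forms as an assumption: $\Delta_n = D_G^2$ elementwise, $H_n = I - \frac{1}{n}\mathbf{1}\mathbf{1}^T$, $B_n = -\frac{1}{2} H_n \Delta_n H_n$, and row $k$ of $L$ equal to $\sqrt{\lambda_k}\, v_k^T$. It also assumes that $D_G$ is finite and that $B_n$ has at least $d$ positive eigenvalues.

```python
import numpy as np

def embed_reference_points(D_G, d):
    """Classical-MDS embedding of the reference points (assumed forms of formulas 6 to 10)."""
    n = D_G.shape[0]
    Delta = D_G ** 2                          # squared distance matrix
    delta_mean = Delta.mean(axis=1)           # mean of the columns of Delta
    H = np.eye(n) - np.ones((n, n)) / n       # mean-centering matrix
    B = -0.5 * H @ Delta @ H                  # inner product matrix
    w, V = np.linalg.eigh(B)                  # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:d]             # indices of the d largest eigenvalues
    lam, vec = w[idx], V[:, idx]
    L = np.sqrt(lam)[:, None] * vec.T         # d x n; column i is the embedding of L_i
    return L, lam, vec, delta_mean

# Hypothetical demo: four corners of a unit square with plain Euclidean distances.
P = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
D_demo = np.linalg.norm(P[:, None] - P[None, :], axis=2)
L_demo, lam_demo, vec_demo, mean_demo = embed_reference_points(D_demo, d=2)
D_embedded = np.linalg.norm(L_demo.T[:, None] - L_demo.T[None, :], axis=2)
```

For distances that are exactly Euclidean, as in the demo, the embedding reproduces all pairwise distances; for graph distances it is only an approximation.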
Step 17: The online data dimensionality reduction mapping stage is entered, in which the reference point to which a new data point belongs is determined: the reference point $L_\alpha$ nearest to the new data model $\xi$ is determined by formula (11);
Step 18: The similarity $D_S(\xi, L_i)$ between the new data model $\xi$ and every reference point is obtained according to formula (12):
$$D_S(\xi, L_i) = \|\xi - L_\alpha\| + D_G(\alpha, i) \tag{12}$$
Step 19: The squared distance vector is obtained according to formula (13);
Step 20: The pseudoinverse transpose matrix is obtained according to formula (14), where $L^{\#}$ denotes the pseudoinverse transpose of the matrix $L$ of the dimensionality reduction mapping of the reference points;
Step 21: The new data model $\xi$ is mapped to low dimension according to formula (15), yielding the low-dimensional mapping vector $l_\xi$.
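The online mapping of steps 17 to 21 can be sketched as follows. Only formula (12) is legible in this text; the other forms are assumptions borrowed from the usual landmark-MDS triangulation: the squared distance vector is $D_S^2$ elementwise, row $k$ of $L^{\#}$ is $v_k^T / \sqrt{\lambda_k}$, and $l_\xi = -\frac{1}{2} L^{\#} (\delta_\xi - \delta_\mu)$ with $\delta_\mu$ the mean vector of step 12. All names and demo values are hypothetical.

```python
import numpy as np

def map_new_sample(xi, A, D_G, lam, vec, delta_mean):
    """Map a new sample xi to d dimensions from its distances to the reference points."""
    A = np.asarray(A, dtype=float)
    alpha = int(np.argmin(np.linalg.norm(A - xi, axis=1)))   # nearest reference point (11)
    D_S = np.linalg.norm(xi - A[alpha]) + D_G[alpha]          # formula (12)
    delta_xi = D_S ** 2                                       # squared distances (13), assumed
    L_sharp = vec.T / np.sqrt(lam)[:, None]                   # pseudoinverse transpose (14), assumed
    return -0.5 * L_sharp @ (delta_xi - delta_mean)           # low-dimensional vector (15), assumed

# Hypothetical demo: four reference points on a line, embedded in d = 1 dimension.
A_demo = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
D_G_demo = np.abs(np.subtract.outer(np.arange(4.0), np.arange(4.0)))
Delta = D_G_demo ** 2
H = np.eye(4) - np.ones((4, 4)) / 4
w, V = np.linalg.eigh(-0.5 * H @ Delta @ H)
lam_demo, vec_demo = w[-1:], V[:, -1:]            # largest eigenvalue and its eigenvector
L_demo = np.sqrt(lam_demo)[:, None] * vec_demo.T  # 1 x 4 reference point embedding
l_xi = map_new_sample(A_demo[2], A_demo, D_G_demo, lam_demo, vec_demo, Delta.mean(axis=1))
```

As a consistency check, a new sample that coincides with a reference point is mapped onto that reference point's own low-dimensional coordinate, which is what the demo exercises.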
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410362559.9A CN104346520B (en) | 2014-07-28 | 2014-07-28 | A kind of Data Dimensionality Reduction system and its dimension reduction method based on neuroid |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104346520A (en) | 2015-02-11 |
CN104346520B (en) | 2017-10-13 |
Family
ID=52502108
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410362559.9A Active CN104346520B (en) | 2014-07-28 | 2014-07-28 | A kind of Data Dimensionality Reduction system and its dimension reduction method based on neuroid |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104346520B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101546332A (en) * | 2009-05-07 | 2009-09-30 | 哈尔滨工程大学 | Manifold dimension-reducing medical image search method based on quantum genetic optimization |
CN101807245A (en) * | 2010-03-02 | 2010-08-18 | 天津大学 | Artificial neural network-based multi-source gait feature extraction and identification method |
CN102269972A (en) * | 2011-03-29 | 2011-12-07 | 东北大学 | Method and device for compensating pipeline pressure missing data based on genetic neural network |
Non-Patent Citations (4)
Title |
---|
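MUKUND BALASUBRAMANIAN et al.: "The Isomap Algorithm and", 《SCIENCE》 * |
WU Zheng et al.: "Dimensionality reduction method using a restricted Boltzmann machine neural network combined with principal component analysis", 《Journal of Shanghai Jiao Tong University》 * |
WANG Jianzhong: "Manifold-learning-based data dimensionality reduction methods and their application to face recognition", 《China Doctoral Dissertations Full-text Database (Information Science and Technology)》 * |
QIAN Xiaodong et al.: "Neural network text dimensionality reduction algorithm based on signal transmission", 《Computer Engineering》 * |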
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108388869A (en) * | 2018-02-28 | 2018-08-10 | 苏州大学 | A kind of hand-written data sorting technique and system based on multiple manifold |
CN108388869B (en) * | 2018-02-28 | 2021-11-05 | 苏州大学 | Handwritten data classification method and system based on multiple manifold |
CN110955809A (en) * | 2019-11-27 | 2020-04-03 | 南京大学 | High-dimensional data visualization method supporting topology structure maintenance |
CN110955809B (en) * | 2019-11-27 | 2023-03-31 | 南京大学 | High-dimensional data visualization method supporting topology structure maintenance |
Also Published As
Publication number | Publication date |
---|---|
CN104346520B (en) | 2017-10-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107330115B (en) | Information recommendation method and device | |
Dash et al. | An outliers detection and elimination framework in classification task of data mining | |
Papadopoulos et al. | Network mapping by replaying hyperbolic growth | |
CN107808122A (en) | Method for tracking target and device | |
CN112184391A (en) | Recommendation model training method, medium, electronic device and recommendation model | |
JP2015099593A5 (en) | ||
Barman et al. | Shape: A novel graph theoretic algorithm for making consensus-based decisions in person re-identification systems | |
CN105978711B (en) | A kind of best exchange side lookup method based on minimum spanning tree | |
CN107423762A (en) | Semi-supervised fingerprinting localization algorithm based on manifold regularization | |
CN104156943B (en) | Multi objective fuzzy cluster image change detection method based on non-dominant neighborhood immune algorithm | |
Mokarram et al. | Using machine learning for land suitability classification | |
CN113780002A (en) | Knowledge reasoning method and device based on graph representation learning and deep reinforcement learning | |
CN113378656B (en) | Action recognition method and device based on self-adaptive graph convolution neural network | |
CN109756842A (en) | Wireless indoor location method and system based on attention mechanism | |
CN111488460B (en) | Data processing method, device and computer readable storage medium | |
Comarela et al. | Robot routing in sparse wireless sensor networks with continuous ant colony optimization | |
CN114743273A (en) | Human skeleton behavior identification method and system based on multi-scale residual error map convolutional network | |
CN104346520A (en) | Neural network based data dimension reduction system and dimension reducing method thereof | |
CN109492770A (en) | A kind of net with attributes embedding grammar based on the sequence of personalized relationship | |
CN113111193A (en) | Data processing method and device of knowledge graph | |
CN115759199B (en) | Multi-robot environment exploration method and system based on hierarchical graph neural network | |
Wang et al. | Decentralized recommender systems | |
CN116738983A (en) | Word embedding method, device and equipment for performing financial field task processing by model | |
Peng et al. | Graphangel: Adaptive and structure-aware sampling on graph neural networks | |
CN104636489B (en) | The treating method and apparatus of attribute data is described |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||