CN111046914A - Semi-supervised classification method based on dynamic composition - Google Patents


Info

Publication number
CN111046914A
Authority
CN
China
Prior art keywords
matrix
node
neighbor
distance
data
Prior art date
Legal status
Granted
Application number
CN201911131232.XA
Other languages
Chinese (zh)
Other versions
CN111046914B (en)
Inventor
马君亮
肖冰
敬欣怡
何聚厚
汪西莉
Current Assignee
Shaanxi Normal University
Original Assignee
Shaanxi Normal University
Priority date
Filing date
Publication date
Application filed by Shaanxi Normal University filed Critical Shaanxi Normal University
Priority to CN201911131232.XA
Publication of CN111046914A
Application granted
Publication of CN111046914B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure relates to a semi-supervised classification method based on dynamic composition, which includes: S100, preparing a data set; S200, selecting edges on the data set prepared in step S100 by using the dynamic nearest neighbor (DNN) method to obtain an adjacency matrix A; S300, calculating the similarity probability among the nodes of the adjacency matrix A generated in step S200 by using the ADW method to obtain an affinity matrix M; S400, carrying out label propagation according to the affinity matrix M obtained in step S300 to obtain the final classification result. The classification method can capture the distribution of the data: more edges are connected in dense data regions and fewer edges in sparse regions, so the density of the data is better reflected and a better classification effect is achieved.

Description

Semi-supervised classification method based on dynamic composition
Technical Field
The present disclosure relates to a data classification method, and in particular to a semi-supervised classification method based on dynamic composition, DCG (Dynamic Configuration Graph).
Background
Existing data classification methods include supervised, semi-supervised and unsupervised classification. Supervised classification requires a large amount of labeled data to train the model, which limits its application scenarios. Unsupervised classification needs no class information and is widely applicable, but the lack of class information leads to poor classification results. Semi-supervised classification requires only a small amount of labeled data, which is cheap to acquire, and can achieve good classification results by learning the data distribution from a large amount of unlabeled data, so it has wide application scenarios.
Graph-based semi-supervised classification is an important branch of semi-supervised classification; because it makes full use of the relations among the data, it often achieves good results and has attracted wide attention. However, current graph-based semi-supervised classification methods usually construct the similarity graph with the k-nearest neighbor (kNN) or ε-nearest neighbor method; only the attribute features of the data are used in constructing the graph, while the category information of the labeled data is not, so the resulting similarity graph does not reflect the actual situation well and the classification result is relatively inaccurate.
Different graph structures are constructed under different assumptions about the data distribution. An ideal graph should have the following three features: the edge selection algorithm should reflect the data distribution, selecting more neighbors in dense regions and fewer neighbors in sparse regions; the similarity measure should depend not only on distance but also on the local structure; and the graph construction algorithm should reduce the influence of manually set parameters on the construction result. Since the edge selection algorithms and similarity calculation methods in the prior art each have their own limitations and the results of the existing classification methods are not accurate enough, a new semi-supervised classification method is urgently needed to make data classification more accurate.
Disclosure of Invention
In order to solve the above problems, the present disclosure provides a semi-supervised classification method based on dynamic composition, which adopts a dynamic nearest neighbor (DNN) method to select edges and an adaptive degree weighting (ADW) method to calculate similarity. The method can well describe the local features of the data, thereby improving the accuracy of data classification.
In the semi-supervised classification method based on dynamic composition, the DNN method is first used for edge selection, dynamically selecting the D neighbors of each node; the ADW method is then used to calculate the weights of the edges, i.e. the similarity probabilities among the nodes; finally, the local and global consistency (LLGC) algorithm is used to classify on the graph.
Specifically, a semi-supervised classification method based on dynamic composition comprises the following steps:
S100, preparing a data set, wherein the data set comprises two parts, labeled data X_l and unlabeled data X_u; the labeled data X_l carry label information F_l; the characteristics of the data in the data set are described by data attribute information; l represents the number of labeled data; the data in the data set are abstracted into n nodes in an m-dimensional space, and the i-th node is denoted p_i;
S200, selecting edges on the data set prepared in step S100 by using the dynamic nearest neighbor DNN method to obtain an adjacency matrix A, specifically:
S201, calculating the Euclidean distances among the nodes in the data set to obtain a direct distance matrix S;
S202, selecting the D neighbors of each node p_i by using the dynamic nearest neighbor DNN method, taking the connecting lines between p_i and its D neighbors as the selected edges, and generating the adjacency matrix A based on the D neighbors, A being an n × n matrix; in the adjacency matrix A, if p_j is a D neighbor of p_i, the corresponding position A_ij in the matrix is 1, and otherwise 0, A_ij representing the value of the i-th row and j-th column in the adjacency matrix A;
S300, calculating the similarity probability among the nodes of the adjacency matrix A generated in step S200 by using the ADW method to obtain an affinity matrix M, specifically:
S301, defining a distance matrix S' according to the direct distance matrix S in step S201 and the adjacency matrix A defined in step S202, S'_ij representing the value of the i-th row and j-th column in the distance matrix S', specifically defined as:
when i ≠ j, S'_ij = A_ij · exp(-S_ij² / (2σ²)), i.e. a Gaussian kernel of the direct distance restricted to adjacent nodes;
when i = j, S'_ij = 0;
S302, defining a weight matrix W according to the distance matrix S' defined in step S301, the weight matrix W being an n × n matrix, W_ij describing the similarity of node p_i and node p_j, i.e. the value of the i-th row and j-th column of the weight matrix W;
S303, normalizing the weight matrix W defined in step S302 to obtain an affinity matrix M, the affinity matrix M being an n × n matrix, M_ij describing the similarity probability of node p_i and node p_j, i.e. the value of the affinity matrix M at row i, column j;
S400, carrying out label propagation according to the affinity matrix M obtained in step S300 to obtain the final classification result.
Preferably, the data sets in step S100 include the synthetic data sets TwoSpirals, ToyData, FourGaussian and TwoMoon, and the image data sets USPS, Mnist, Mnist-3495, Coil20, Coil(1500), G241d and COIL2.
Preferably, in step S201, the Euclidean distance between nodes p_i and p_j in the data set is:

d(p_i, p_j) = √( Σ_{k=1}^{m} (x_ik - x_jk)² )

where m denotes the dimension of the data, p_i and p_j denote the i-th and j-th nodes in the graph, and x_ik and x_jk are the k-th dimension coordinates of nodes p_i and p_j, respectively; the direct distance matrix S is generated from the Euclidean distances between the nodes, the direct distance matrix S being an n × n two-dimensional matrix, with S_ij, the value of the i-th row and j-th column in the matrix, storing the Euclidean distance between node p_i and node p_j.
Preferably, step S201 further includes:
sorting the Euclidean distances between each node and the other nodes in the direct distance matrix S from small to large to obtain a matrix O, and simultaneously generating an index matrix E corresponding to the direct distance matrix S; the specific process is that, for the i-th row in the direct distance matrix S, the stored distances are sorted from small to large, the distance ranked j-1 is stored in O_ij, and the position of this distance in the direct distance matrix S is stored in E_ij, so that the position in the direct distance matrix S of any distance stored in matrix O can be found through the index matrix E; the matrix O and the index matrix E are both n × n two-dimensional matrices, and E_ij represents the element of the i-th row and j-th column of the index matrix E.
Preferably, in step S202, the D neighbors of node p_i are selected by using the dynamic nearest neighbor DNN method, specifically by an algebraic method:
N(p_i) represents the D-neighbor set of p_i; the i-th row, j-th column of matrix O stores a distance from node p_i whose rank is j, and through the index matrix E the position S_im of this distance in the direct distance matrix S can be found, i.e. node p_m is the node whose distance to node p_i is ranked j, node p_m being denoted p_i^(j).
Whether p_i^(j) is a D neighbor of p_i is judged by the following criterion: the nearest neighbor p_i^(1) is added to the D neighbors and taken as a reference point; when d(p_i^(1), p_i^(j)) > d(p_i, p_i^(j)), p_i^(j) is a D neighbor of p_i; otherwise, p_i^(j) to p_i^(n-1) are not D neighbors of p_i, where p_i^(1) is the nearest neighbor of p_i, p_i^(j) represents the sample point whose distance to p_i is ranked j, and d(·,·) represents a distance metric. Then p_i^(2) is taken as a reference point: when d(p_i^(1), p_i^(j)) > d(p_i, p_i^(j)) and d(p_i^(2), p_i^(j)) > d(p_i, p_i^(j)), p_i^(j) is a D neighbor of p_i; then p_i^(3) is taken as a reference point for the judgment, and so on, until some p_i^(j) is not a D neighbor of p_i, at which point the judgment stops. The result obtained at this time is the D neighbors of node p_i, and the connecting lines between p_i and its D neighbors are the selected edges.
Preferably, in step S202, the D neighbors of node p_i are selected by using the dynamic nearest neighbor DNN method, specifically by a geometric method:
D-neighbor lookup procedure for node p_i: the i-th row, j-th column of matrix O stores a distance from node p_i whose rank is j, and through the index matrix E the position S_im of this distance in the direct distance matrix S can be found, i.e. node p_m is the node whose distance to node p_i is ranked j, node p_m being denoted p_i^(j).
The nearest neighbor p_i^(1) is added to the D neighbors; the perpendicular bisector of segment p_i p_i^(1) divides the plane into two regions, and the region on the p_i side of this perpendicular bisector is selected as the region to which the D neighbors of p_i belong. In the region to which the D neighbors belong, the neighbor nearest to p_i, namely p_i^(2), is selected; according to the perpendicular bisector of segment p_i p_i^(2), the region on the p_i side is selected as the new region to which the D neighbors belong, and in this region the point p_i^(3) nearest to p_i is selected, added, and used as the next reference point. This process is repeated until the region near p_i bounded by all the perpendicular bisectors becomes a closed region; all nodes in the closed region are the D neighbors of p_i, and the connecting lines between p_i and its D neighbors are the selected edges.
Preferably, in step S302, the weight matrix W is defined as:

W_ij = deg(p_i) · S'_ij

where p_i is a node on the graph, W_ij represents the value of the i-th row and j-th column in the weight matrix W, deg(p_i) is the degree of the node, and S'_ij represents the similarity of node p_i and node p_j, i.e. the value of the i-th row and j-th column in the distance matrix S'.
Preferably, the affinity matrix M is defined in step S303, specifically:
according to the weight matrix W, the diagonal matrix T is calculated using the following formula:

T_ii = Σ_j W_ij

where T_ii represents the value of the diagonal matrix T at row i, column i, and W_ij represents the value of the i-th row and j-th column of the weight matrix W;
the affinity matrix M is obtained after normalization using the following formula:

M = T^(-1/2) W T^(-1/2)

where T is the diagonal matrix, W is the weight matrix, and M is the affinity matrix.
Preferably, the label propagation in step S400 is performed by the local and global consistency (LLGC) method, calculated as follows:

F = (I - αM)^(-1) Y

where F is an n × c matrix, n is the number of nodes, c is the number of label types, and F_ij, the value of the i-th row and j-th column of matrix F, represents the probability that node p_i carries the j-th type of label; I is the identity matrix, α is a regulation parameter, M is the affinity matrix, and Y is the label information, an n × c matrix storing the label information of every node; Y_ij, the value of the i-th row and j-th column of matrix Y, is 1 if node p_i is marked with a class-j label and 0 otherwise; (I - αM)^(-1) propagates to each node the probability of acquiring the labels of the marked nodes;
finally, the label of node p_i is obtained, specifically:

y_i = argmax_{j ≤ c} F_ij

where F_ij is the value of the i-th row and j-th column of matrix F, argmax assigns to y_i the value of j at which F_ij attains its maximum, i.e. node p_i is marked as y_i, and the classification of the data is completed after all nodes are marked.
Compared with the prior art, the method has the following beneficial technical effects:
(1) the semi-supervised classification method based on dynamic composition provided by the present disclosure better expresses the underlying distribution characteristics of the data;
(2) the semi-supervised classification method based on dynamic composition provided by the present disclosure adopts a dynamic nearest neighbor edge selection method and then calculates the weights of the edges by the ADW method before classifying; the classification method can capture the distribution of the data, connecting more edges in dense data regions and fewer edges in sparse regions, so the density of the data is better reflected and a better classification effect is achieved.
Drawings
FIG. 1 shows a flow chart of a semi-supervised classification algorithm based on dynamic composition of the present disclosure;
FIG. 2 is a schematic diagram of a DNN edge selection method on a two-dimensional plane;
FIG. 3 is a diagram illustrating a D-nearest neighbor lookup process on a two-dimensional plane;
FIG. 4(a) shows the TwoSpirals dataset;
FIG. 4(b) shows the ToyData dataset;
FIG. 4(c) shows the FourGaussian dataset;
FIG. 4(d) shows the TwoMoon dataset;
FIG. 5(a) shows the TwoSpirals dataset DNN composition result;
FIG. 5(b) shows the TwoSpirals dataset DNN prediction result;
FIG. 5(c) shows the TwoSpirals dataset kNN composition result;
FIG. 5(d) shows the TwoSpirals dataset kNN prediction result;
FIG. 6(a) shows the ToyData dataset DNN composition result;
FIG. 6(b) shows the ToyData dataset DNN prediction result;
FIG. 6(c) shows the kNN composition result for the ToyData dataset when k = 5;
FIG. 6(d) shows the kNN prediction result for the ToyData dataset when k = 5;
FIG. 6(e) shows the kNN composition result for the ToyData dataset when k = 10;
FIG. 6(f) shows the kNN prediction result for the ToyData dataset when k = 10;
FIG. 6(g) shows the kNN composition result for the ToyData dataset when k = 15;
FIG. 6(h) shows the kNN prediction result for the ToyData dataset when k = 15.
Detailed Description
The present disclosure is explained below with reference to FIG. 1 to FIG. 6(h). In one embodiment, as shown in FIG. 1, a semi-supervised classification method based on dynamic composition is provided, comprising steps S100 to S400:
S100, preparing a data set, wherein the data set comprises two parts, labeled data X_l and unlabeled data X_u; the labeled data X_l carry label information F_l; the characteristics of the data in the data set are described by data attribute information; l represents the number of labeled data; the data in the data set are abstracted into n nodes in an m-dimensional space, and the i-th node is denoted p_i;
S200, selecting edges on the data set prepared in step S100 by using the dynamic nearest neighbor DNN method to obtain an adjacency matrix A, specifically:
S201, calculating the Euclidean distances among the nodes in the data set to obtain a direct distance matrix S;
S202, selecting the D neighbors of each node p_i by using the dynamic nearest neighbor DNN method, taking the connecting lines between p_i and its D neighbors as the selected edges, and generating the adjacency matrix A based on the D neighbors, A being an n × n matrix; in the adjacency matrix A, if p_j is a D neighbor of p_i, the corresponding position A_ij in the matrix is 1, and otherwise 0, A_ij representing the value of the i-th row and j-th column in the adjacency matrix A;
S300, calculating the similarity probability among the nodes of the adjacency matrix A generated in step S200 by using the ADW method to obtain an affinity matrix M, specifically:
S301, defining a distance matrix S' according to the direct distance matrix S in step S201 and the adjacency matrix A defined in step S202, S'_ij representing the value of the i-th row and j-th column in the distance matrix S', specifically defined as:
when i ≠ j, S'_ij = A_ij · exp(-S_ij² / (2σ²)), i.e. a Gaussian kernel of the direct distance restricted to adjacent nodes;
when i = j, S'_ij = 0;
S302, defining a weight matrix W according to the distance matrix S' defined in step S301, the weight matrix W being an n × n matrix, W_ij describing the similarity of node p_i and node p_j, i.e. the value of the i-th row and j-th column of the weight matrix W;
S303, normalizing the weight matrix W defined in step S302 to obtain an affinity matrix M, the affinity matrix M being an n × n matrix, M_ij describing the similarity probability of node p_i and node p_j, i.e. the value of the affinity matrix M at row i, column j;
S400, carrying out label propagation according to the affinity matrix M obtained in step S300 to obtain the final classification result.
In this embodiment, the dynamic nearest neighbor DNN method is used for edge selection on the prepared data set to obtain the adjacency matrix A, the affinity matrix M is obtained through calculation, and label propagation is then carried out to obtain the final classification result. The classification method can capture the distribution of the data: more edges are connected in dense data regions and fewer in sparse regions, so the density of the data is better reflected and a better classification effect is achieved. The DNN method may be implemented either by an algebraic method or by a geometric method.
In another embodiment, the classification method proposed by the present disclosure can be applied to a variety of data sets for classification, as long as the data sets contain data and matching labels, such as the synthetic data sets TwoSpirals, ToyData, FourGaussian and TwoMoon and the image data sets USPS, Mnist, Mnist-3495, Coil20, Coil(1500), G241d and COIL2.
The sample numbers, attribute dimensions and category numbers of the synthetic data sets TwoSpirals, ToyData, FourGaussian and TwoMoon are shown in Table 1:
TABLE 1 Synthetic data sets
Synthetic data set  Number of samples  Attribute dimension  Number of categories
TwoSpirals 2000 2 2
ToyData 788 2 7
FourGaussian 1200 2 4
TwoMoon 400 2 2
The sample numbers, attribute dimensions and category numbers of the image data sets USPS, Mnist, Mnist-3495, Coil20, Coil(1500), G241d and COIL2 are shown in Table 2:
TABLE 2 Image data sets
Image data set  Number of samples  Attribute dimension  Number of categories
USPS 1800 256 6
Mnist 6996 784 10
Mnist-3495 3495 784 10
Coil20 1440 1024 20
Coil(1500) 1500 241 6
G241d 1500 241 2
COIL2 1500 241 2
These data sets are all existing, publicly available data sets.
In another embodiment, in step S201, the Euclidean distance between nodes p_i and p_j in the data set is:

d(p_i, p_j) = √( Σ_{k=1}^{m} (x_ik - x_jk)² )

where m denotes the dimension of the data, p_i and p_j denote the i-th and j-th nodes in the graph, and x_ik and x_jk are the k-th dimension coordinates of nodes p_i and p_j, respectively; the direct distance matrix S is generated from the Euclidean distances between the nodes, the direct distance matrix S being an n × n two-dimensional matrix, with S_ij, the value of the i-th row and j-th column in the matrix, storing the Euclidean distance between node p_i and node p_j.
In another embodiment, step S201 further includes:
sorting the Euclidean distances between each node and the other nodes in the direct distance matrix S from small to large to obtain a matrix O, and simultaneously generating an index matrix E corresponding to the direct distance matrix S; the specific process is that, for the i-th row in the direct distance matrix S, the stored distances are sorted from small to large, the distance ranked j-1 is stored in O_ij, and the position of this distance in the direct distance matrix S is stored in E_ij, so that the position in the direct distance matrix S of any distance stored in matrix O can be found through the index matrix E; the matrix O and the index matrix E are both n × n two-dimensional matrices, and E_ij represents the element of the i-th row and j-th column of the index matrix E.
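The following minimal sketch illustrates this step, assuming the data points are the rows of a NumPy array; the function name build_distance_matrices and the use of NumPy are illustrative assumptions, not part of the patent.

```python
import numpy as np

def build_distance_matrices(X):
    """Step S201 sketch: direct distance matrix S, sorted matrix O, index matrix E."""
    # S[i, j] stores the Euclidean distance between nodes p_i and p_j.
    diff = X[:, None, :] - X[None, :, :]
    S = np.sqrt((diff ** 2).sum(axis=2))
    # E[i, j] is the column of row i of S that holds the distance ranked j (0-based),
    # and O[i, j] is that distance itself, so O[i, j] == S[i, E[i, j]].
    E = np.argsort(S, axis=1)
    O = np.take_along_axis(S, E, axis=1)
    return S, O, E
```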
In another embodiment, in step S202 the D neighbors of node p_i are selected by using the dynamic nearest neighbor DNN method, specifically by an algebraic method:
N(p_i) represents the D-neighbor set of p_i; the i-th row, j-th column of matrix O stores a distance from node p_i whose rank is j, and through the index matrix E the position S_im of this distance in the direct distance matrix S can be found, i.e. node p_m is the node whose distance to node p_i is ranked j, node p_m being denoted p_i^(j).
Whether p_i^(j) is a D neighbor of p_i is judged by the following criterion: the nearest neighbor p_i^(1) is added to the D neighbors and taken as a reference point; when d(p_i^(1), p_i^(j)) > d(p_i, p_i^(j)), p_i^(j) is a D neighbor of p_i; otherwise, p_i^(j) to p_i^(n-1) are not D neighbors of p_i, where p_i^(1) is the nearest neighbor of p_i, p_i^(j) represents the sample point whose distance to p_i is ranked j, and d(·,·) represents a distance metric. Then p_i^(2) is taken as a reference point: when d(p_i^(1), p_i^(j)) > d(p_i, p_i^(j)) and d(p_i^(2), p_i^(j)) > d(p_i, p_i^(j)), p_i^(j) is a D neighbor of p_i; then p_i^(3) is taken as a reference point for the judgment, and so on, until some p_i^(j) is not a D neighbor of p_i, at which point the judgment stops. The result obtained at this time is the D neighbors of node p_i, and the connecting lines between p_i and its D neighbors are the selected edges.
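A minimal sketch of this algebraic selection, assuming the matrices S and E from step S201; candidates are taken in order of distance and accepted while they are closer to p_i than to every already accepted reference point, which is the perpendicular-bisector test in algebraic form. The function name dnn_adjacency is an illustrative assumption.

```python
import numpy as np

def dnn_adjacency(S, E):
    """Step S202 sketch: adjacency matrix A from distances S and index matrix E."""
    n = S.shape[0]
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        accepted = [E[i, 1]]  # E[i, 0] is node i itself; E[i, 1] is its nearest neighbor
        for j in range(2, n):
            cand = E[i, j]    # candidate whose distance to p_i is ranked j
            # D-neighbor criterion: closer to p_i than to every accepted reference point.
            if all(S[q, cand] > S[i, cand] for q in accepted):
                accepted.append(cand)
            else:
                break         # once one candidate fails, all farther ones are rejected
        A[i, accepted] = 1    # edges from p_i to its D neighbors
    return A
```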
In this embodiment, specifically, the D-neighbor determination for p_i on a two-dimensional plane is shown in FIG. 2. To guarantee the connectivity of the graph structure in DNN, i.e. that no isolated nodes appear, at least one connection is established for each sample: the nearest neighbor p_i^(1) is first added to the neighborhood of p_i, and p_i and p_i^(1) are connected. The dotted line in the figure is the perpendicular bisector of segment p_i p_i^(1). Then whether p_i^(2) is a D neighbor of p_i is judged according to the previously defined D-neighbor criterion: when d(p_i^(1), p_i^(2)) > d(p_i, p_i^(2)), i.e. p_i^(2) is located on the side of the perpendicular bisector near p_i and inside the circle centered at p_i with radius d(p_i, p_i^(2)), p_i^(2) is a D neighbor of p_i, and p_i^(2) is added to the set N(p_i). A point at the same distance from p_i as an accepted neighbor is judged in the same way: whether it becomes a D neighbor depends on its position relative to the perpendicular bisectors of the neighbors already accepted. The search process for the D neighbors of a sample point p_i is essentially a search over concentric circles centered at p_i whose radii are the distances from p_i to its different neighbors. The denser the local data distribution around p_i, the more concentric circles are searched and the more D neighbors are generated.
Therefore, the method can capture the distribution of the data: more edges are connected in dense data regions and fewer edges in sparse regions, so the density of the data is better reflected and a better classification effect is obtained.
The graph constructed using DNN is connected: in graph G = (X, ε), select a connected vertex subset Z ⊆ X; find a point B ∈ X outside the vertex set that is nearest to Z, and record the point in the vertex set Z nearest to B as A. Point B is then the adaptive neighbor of point A, i.e. point B is connected with the vertex set Z, so point B is added to the connected subset Z. Repeating these operations, all points in the graph can be added to the connected subset Z, so all points in graph G = (X, ε) are connected.
The graph constructed using DNN does not have the problem of weak connectivity. By contradiction, assume that point A is an adaptive neighbor of point B but point B is not an adaptive neighbor of point A. Since point B is not an adaptive neighbor of point A, there must exist a point C such that the path cost of B-C-A is less than the path cost of B-A; but then the path cost of A-B is greater than that of A-C-B, so point A is not an adaptive neighbor of point B, contradicting the assumption. Therefore weak connectivity does not occur.
In another embodiment, in step S202 the D neighbors of node p_i are selected by using the dynamic nearest neighbor DNN method, specifically by a geometric method:
D-neighbor lookup procedure for node p_i: the i-th row, j-th column of matrix O stores a distance from node p_i whose rank is j, and through the index matrix E the position S_im of this distance in the direct distance matrix S can be found, i.e. node p_m is the node whose distance to node p_i is ranked j, node p_m being denoted p_i^(j).
The nearest neighbor p_i^(1) is added to the D neighbors; the perpendicular bisector of segment p_i p_i^(1) divides the plane into two regions, and the region on the p_i side of this perpendicular bisector is selected as the region to which the D neighbors of p_i belong. In the region to which the D neighbors belong, the neighbor nearest to p_i, namely p_i^(2), is selected; according to the perpendicular bisector of segment p_i p_i^(2), the region on the p_i side is selected as the new region to which the D neighbors belong, and in this region the point p_i^(3) nearest to p_i is selected, added, and used as the next reference point. This process is repeated until the region near p_i bounded by all the perpendicular bisectors becomes a closed region; all nodes in the closed region are the D neighbors of p_i, and the connecting lines between p_i and its D neighbors are the selected edges.
In this embodiment, a closed region means that any connecting line between a point inside the region and a point outside the region intersects the boundary of the region.
In this embodiment, specifically, the D-neighbor determination for p_i on a two-dimensional plane is shown in FIG. 3. To guarantee the connectivity of the graph structure in DNN, i.e. that no isolated nodes appear, at least one connection is established for each sample: the nearest neighbor p_i^(1) is first added to the neighborhood of p_i, and p_i and p_i^(1) are connected. Dashed line ① is the perpendicular bisector of segment p_i p_i^(1); it divides the planar area into two parts, and the D neighbors of p_i belong to the region on the p_i side, i.e. the region to the left of dashed line ①. In the region to the left of dashed line ①, the nearest neighbor p_i^(2) of p_i is selected, and p_i and p_i^(2) are connected. Dashed line ② is the perpendicular bisector of segment p_i p_i^(2); dashed lines ① and ② divide the planar area into four parts, and the D neighbors of p_i belong to the region on the p_i side, i.e. to the left of dashed line ① and below dashed line ②, in which the nearest neighbor p_i^(3) of p_i is selected, and p_i and p_i^(3) are connected. Dashed line ③ is the perpendicular bisector of segment p_i p_i^(3); dashed lines ①, ② and ③ divide the planar area into six parts, and the D neighbors of p_i belong to the region on the p_i side, i.e. to the left of dashed line ①, below dashed line ② and to the right of dashed line ③, in which the nearest neighbor p_i^(4) of p_i is selected, and p_i and p_i^(4) are connected. Dashed line ④ is the perpendicular bisector of segment p_i p_i^(4); dashed lines ①, ②, ③ and ④ divide the planar area into nine parts, and the region on the p_i side, i.e. to the left of dashed line ①, below dashed line ②, to the right of dashed line ③ and above dashed line ④, is a closed region; all nodes in the closed region are D neighbors of node p_i.
It can be seen that the more densely the local data around p_i are distributed, the more nodes the closed region contains and the more D neighbors the node has. The method can therefore capture the distribution of the data, connecting more edges in dense data regions and fewer edges in sparse regions, so it better reflects the density of the data and achieves a better classification effect.
In another embodiment, in step S302, the weight matrix W is defined as:

W_ij = deg(p_i) · S'_ij

where p_i is a node on the graph, W_ij represents the value of the i-th row and j-th column in the weight matrix W, deg(p_i) is the degree of the node, and S'_ij represents the similarity of node p_i and node p_j, i.e. the value of the i-th row and j-th column in the distance matrix S'.
In another embodiment, the affinity matrix M is defined in step S303, specifically:
according to the weight matrix W, the diagonal matrix T is calculated using the following formula:

T_ii = Σ_j W_ij

where T_ii represents the value of the diagonal matrix T at row i, column i, and W_ij represents the value of the i-th row and j-th column of the weight matrix W;
the affinity matrix M is obtained after normalization using the following formula:

M = T^(-1/2) W T^(-1/2)

where T is the diagonal matrix, W is the weight matrix, and M is the affinity matrix.
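A sketch of steps S301 to S303 under the Gaussian-kernel reading of S' used above; the bandwidth sigma and the function name adw_affinity are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def adw_affinity(S, A, sigma=1.0):
    """ADW sketch: affinity matrix M from distance matrix S and adjacency matrix A."""
    # S301: similarity restricted to adjacent node pairs, with a zero diagonal.
    S_prime = A * np.exp(-S ** 2 / (2 * sigma ** 2))
    np.fill_diagonal(S_prime, 0.0)
    # S302: W_ij = deg(p_i) * S'_ij, where deg(p_i) is the degree of node p_i.
    deg = A.sum(axis=1)
    W = deg[:, None] * S_prime
    # S303: symmetric normalization M = T^(-1/2) W T^(-1/2) with T_ii = sum_j W_ij.
    t = W.sum(axis=1)
    inv_sqrt = np.where(t > 0, 1.0 / np.sqrt(t), 0.0)
    return inv_sqrt[:, None] * W * inv_sqrt[None, :]
```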
In this embodiment, the similarity probability of the nodes in the graph under ADW is determined jointly by the similarity and the degree of the nodes: the distribution of the data is expressed by the degree of the nodes, and the similarity between the data is expressed by the Gaussian kernel function. ADW is simple to calculate, its algorithmic complexity is low, and it has the following two advantages:
(1) The formulation reduces the overfitting problem caused by weight parameterization and is insensitive to noisy data. In experiments on the synthetic data sets, ADW was observed to be robust to input noise, and its performance advantages were demonstrated on 7 real data sets.
(2) ADW has no additional tuning parameters.
In another embodiment, the label propagation in step S400 is implemented by the local and global consistency LLGC (Learning with Local and Global Consistency) method, calculated as follows:

F = (I - αM)^(-1) Y

where F is an n × c matrix, n is the number of nodes, c is the number of label types, and F_ij, the value of the i-th row and j-th column of matrix F, represents the probability that node p_i carries the j-th type of label; I is the identity matrix, α is a regulation parameter, M is the affinity matrix, and Y is the label information, an n × c matrix storing the label information of every node; Y_ij, the value of the i-th row and j-th column of matrix Y, is 1 if node p_i is marked with a class-j label and 0 otherwise; (I - αM)^(-1) propagates to each node the probability of acquiring the labels of the marked nodes;
finally, the label of node p_i is obtained, specifically:

y_i = argmax_{j ≤ c} F_ij

where F_ij is the value of the i-th row and j-th column of matrix F, argmax assigns to y_i the value of j at which F_ij attains its maximum, i.e. node p_i is marked as y_i, and the classification of the data is completed after all nodes are marked.
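A sketch of this label propagation step; the one-hot construction of Y and the default value of alpha are illustrative assumptions.

```python
import numpy as np

def llgc_predict(M, labels, n_classes, alpha=0.99):
    """Step S400 sketch: labels[i] is the class index of node i, or -1 if unlabeled."""
    n = M.shape[0]
    Y = np.zeros((n, n_classes))
    for i, y in enumerate(labels):
        if y >= 0:
            Y[i, y] = 1.0                          # Y_ij = 1 if node p_i carries label j
    F = np.linalg.solve(np.eye(n) - alpha * M, Y)  # F = (I - alpha*M)^(-1) Y
    return F.argmax(axis=1)                        # y_i = argmax_j F_ij
```

Together with the earlier sketches, an end-to-end run would read A = dnn_adjacency(S, E), M = adw_affinity(S, A), y = llgc_predict(M, labels, c).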
The steps of the semi-supervised classification method based on dynamic composition provided by the present disclosure have been specifically introduced above; the superiority of the classification method provided by the present disclosure over existing data classification methods is illustrated below by specific experimental comparison.
Experiment:
to illustrate the superiority of the semi-supervised classification method based on dynamic composition proposed by the present disclosure, experiments were performed on synthetic data sets and real data sets widely used in graph-based semi-supervised learning. The method mainly aims to verify that the proposed method can better express the potential distribution characteristics of data, and improves the semi-supervised classification method. Comparing the method provided by the present disclosure with the kNN method, the kNN method is to find k nearest neighbors of a sample, and assign an average value of attributes of the nearest neighbors to the sample, so as to obtain the attribute of the sample. One of the evaluation criteria of the graph construction method is: on the premise of using the same derivation method, if the better classification performance can be realized, the LLGC classification method is adopted in the experiment. For the multi-class problem, the Error Rate (Error Rate) is used to evaluate the performance of the algorithm, as shown in the following equation:
Error Rate = (Σ_{i=1}^{c} F_i) / (Σ_{i=1}^{c} N_i)

where c is the total number of sample classes, N_i is the number of class-i samples, and F_i is the number of misclassified samples in the i-th class.
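Written out as code, the error rate is the total number of misclassified samples divided by the total number of samples; a one-line sketch assuming the per-class counts F_i and N_i are available:

```python
def error_rate(misclassified_per_class, samples_per_class):
    # sum of F_i over all classes divided by sum of N_i over all classes
    return sum(misclassified_per_class) / sum(samples_per_class)
```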
Synthetic data set experimental results
The classification performance of DNN+LLGC and kNN+LLGC was verified using the 4 synthetic data sets in Table 1: TwoSpirals, ToyData, FourGaussian and TwoMoon. Their sample numbers, attribute dimensions and category numbers are listed in Table 1; they are randomly generated two-dimensional data. The TwoSpirals dataset contains 1000 positive and 1000 negative samples distributed in a double-spiral shape, as shown in FIG. 4(a). The ToyData dataset has 788 samples belonging to 7 classes, each sample having 2 attributes, as shown in FIG. 4(b). The FourGaussian dataset consists of 1200 samples in 4 classes, each sample having 2 attributes, as shown in FIG. 4(c). The TwoMoon dataset contains 200 positive and 200 negative samples distributed in a double-moon shape, as shown in FIG. 4(d).
Taking the TwoSpirals and ToyData datasets as examples, the graphs constructed by the DNN and kNN methods and the category prediction results are shown in FIGS. 5(a)-5(d) and 6(a)-6(h); the composition results are the graphs obtained by the edge selection and edge re-weighting methods, and the prediction results are the classification results obtained by label propagation on the composed graphs. In FIGS. 5(a)-5(d), dark dots and light dots represent the two classes of data. FIGS. 5(a) and 5(c) are the DNN and kNN graphs of the TwoSpirals dataset, respectively. In the DNN graph, edges still connect light-colored data points that are farther apart, better expressing the correlation between sample points of the same class. FIGS. 5(b) and 5(d) show the category prediction results of the DNN and kNN algorithms; the DNN predictions are more accurate. In FIGS. 6(a)-6(h), dots of different shades represent the seven classes of data. FIGS. 6(a) and 6(b) show the composition and category prediction results of the DNN method; FIGS. 6(c), 6(e) and 6(g) show the composition results of the kNN method for k = 5, 10 and 15, and FIGS. 6(d), 6(f) and 6(h) show the corresponding category prediction results.
For the classification experiments, the numbers of labeled samples are shown in Table 3; the other data were unlabeled. In the graph construction process the kNN (k = 5, 10, 15) and DNN methods were used, and LLGC was used for label propagation. Each experiment was repeated 20 times and the average accuracy was calculated. The classification results are shown in Table 3: the first column is the name of the synthetic data set, the second column the number of labeled samples, the third column the composition method, the fourth column the different values of the parameter k in the kNN method, the fifth and sixth columns the minimum and maximum degrees of the nodes in the graph, and the seventh and eighth columns the average error rate and the standard deviation of the error rate. It can be seen that, except for the TwoSpirals dataset, the classification error rate of the DNN method is smaller than that of the kNN method for all values of k.
TABLE 3 Classification Performance of synthetic datasets
Experimental results of image data set
In order to compare the classification performance of the different composition and label inference methods, combinations of them were applied to the 7 image data sets presented in Table 2. These data sets all consist of grayscale images, and the grayscale values are taken as the feature values of each image.
Randomly selected sample points in each class were labeled; the number of labeled samples per class is shown in Table 4, and the remaining sample points were unlabeled. In the experiments, the data dimensionality was reduced to 50 by PCA, which maps high-dimensional features to a low-dimensional space by retaining the important characteristics of the data and removing noise and unimportant features. The unlabeled samples were classified and each experiment was repeated 10 times; the classification results (average classification error rates) are shown in Table 4: the first column is the name of the image data set, the second column the number of labeled samples, the third column the composition method, the fourth column the different values of the parameter k in the kNN method, the fifth and sixth columns the minimum and maximum degrees of the nodes in the graph, and the seventh and eighth columns the average error rate and the standard deviation of the error rate. It can be seen that, for every data set, the classification error rate of the DNN method is smaller than that of the kNN method for all values of k. Therefore, the semi-supervised classification method based on dynamic composition can improve the classification precision.
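A sketch of the PCA preprocessing described above, assuming scikit-learn is used; the patent states only that the dimensionality is reduced to 50 by PCA, so the array shape below is a stand-in.

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(1800, 256)                # stand-in for, e.g., the USPS feature matrix
X50 = PCA(n_components=50).fit_transform(X)  # keep the 50 leading principal components
```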
TABLE 4 Classification Performance of image datasets
Although the embodiments of the present invention have been described above with reference to the accompanying drawings, the present invention is not limited to the above-described embodiments and application fields, and the above-described embodiments are illustrative, instructive, and not restrictive. Those skilled in the art, having the benefit of this disclosure, may effect numerous modifications thereto without departing from the scope of the invention as defined by the appended claims.

Claims (9)

1. A semi-supervised classification method based on dynamic composition, comprising the following steps:
S100, preparing a data set, wherein the data set comprises two parts, labeled data X_l and unlabeled data X_u; the labeled data X_l carry label information F_l; the characteristics of the data in the data set are described by data attribute information; l represents the number of labeled data; the data in the data set are abstracted into n nodes in an m-dimensional space, and the i-th node is denoted p_i;
S200, selecting edges on the data set prepared in step S100 by using the dynamic nearest neighbor DNN method to obtain an adjacency matrix A, specifically:
S201, calculating the Euclidean distances among the nodes in the data set to obtain a direct distance matrix S;
S202, selecting the D neighbors of each node p_i by using the dynamic nearest neighbor DNN method, taking the connecting lines between p_i and its D neighbors as the selected edges, and generating the adjacency matrix A based on the D neighbors, A being an n × n matrix; in the adjacency matrix A, if p_j is a D neighbor of p_i, the corresponding position A_ij in the matrix is 1, and otherwise 0, A_ij representing the value of the i-th row and j-th column in the adjacency matrix A;
S300, calculating the similarity probability among the nodes of the adjacency matrix A generated in step S200 by using the ADW method to obtain an affinity matrix M, specifically:
S301, defining a distance matrix S' according to the direct distance matrix S in step S201 and the adjacency matrix A defined in step S202, S'_ij representing the value of the i-th row and j-th column in the distance matrix S', specifically defined as:
when i ≠ j, S'_ij = A_ij · exp(-S_ij² / (2σ²));
when i = j, S'_ij = 0;
S302, defining a weight matrix W according to the distance matrix S' defined in step S301, the weight matrix W being an n × n matrix, W_ij describing the similarity of node p_i and node p_j, i.e. the value of the i-th row and j-th column of the weight matrix W;
S303, normalizing the weight matrix W defined in step S302 to obtain an affinity matrix M, the affinity matrix M being an n × n matrix, M_ij describing the similarity probability of node p_i and node p_j, i.e. the value of the affinity matrix M at row i, column j;
S400, carrying out label propagation according to the affinity matrix M obtained in step S300 to obtain the final classification result.
2. The method according to claim 1, wherein the data sets in step S100 include the synthetic data sets TwoSpirals, ToyData, FourGaussian and TwoMoon and the image data sets USPS, Mnist, Mnist-3495, Coil20, Coil(1500), G241d and COIL2.
3. The method according to claim 1, wherein in step S201 the Euclidean distance between nodes p_i and p_j in the data set is:

d(p_i, p_j) = √( Σ_{k=1}^{m} (x_ik - x_jk)² )

where m denotes the dimension of the data, p_i and p_j denote the i-th and j-th nodes in the graph, and x_ik and x_jk are the k-th dimension coordinates of nodes p_i and p_j, respectively; the direct distance matrix S is generated from the Euclidean distances between the nodes, the direct distance matrix S being an n × n two-dimensional matrix, with S_ij, the value of the i-th row and j-th column in the matrix, storing the Euclidean distance between node p_i and node p_j.
4. The method of claim 3, wherein step S201 further comprises:
sorting the Euclidean distances between each node and the other nodes in the direct distance matrix S from small to large to obtain a matrix O, and simultaneously generating an index matrix E corresponding to the direct distance matrix S; the specific process is that, for the i-th row in the direct distance matrix S, the stored distances are sorted from small to large, the distance ranked j-1 is stored in O_ij, and the position of this distance in the direct distance matrix S is stored in E_ij, so that the position in the direct distance matrix S of any distance stored in matrix O can be found through the index matrix E; the matrix O and the index matrix E are both n × n two-dimensional matrices, and E_ij represents the element of the i-th row and j-th column of the index matrix E.
5. The method of claim 4, wherein in step S202 the D neighbors of node p_i are selected by using the dynamic nearest neighbor DNN method, specifically by an algebraic method:
N(p_i) represents the D-neighbor set of p_i; the i-th row, j-th column of matrix O stores a distance from node p_i whose rank is j, and through the index matrix E the position S_im of this distance in the direct distance matrix S can be found, i.e. node p_m is the node whose distance to node p_i is ranked j, node p_m being denoted p_i^(j);
whether p_i^(j) is a D neighbor of p_i is judged by the following criterion: the nearest neighbor p_i^(1) is added to the D neighbors and taken as a reference point; when d(p_i^(1), p_i^(j)) > d(p_i, p_i^(j)), p_i^(j) is a D neighbor of p_i; otherwise, p_i^(j) to p_i^(n-1) are not D neighbors of p_i, where p_i^(1) is the nearest neighbor of p_i, p_i^(j) represents the sample point whose distance to p_i is ranked j, and d(·,·) represents a distance metric; then p_i^(2) is taken as a reference point: when d(p_i^(1), p_i^(j)) > d(p_i, p_i^(j)) and d(p_i^(2), p_i^(j)) > d(p_i, p_i^(j)), p_i^(j) is a D neighbor of p_i; then p_i^(3) is taken as a reference point for the judgment, and so on, until some p_i^(j) is not a D neighbor of p_i, at which point the judgment stops; the result obtained at this time is the D neighbors of node p_i, and the connecting lines between p_i and its D neighbors are the selected edges.
6. The method of claim 4, wherein in step S202 the D neighbors of node p_i are selected by using the dynamic nearest neighbor DNN method, specifically by a geometric method:
D-neighbor lookup procedure for node p_i: the i-th row, j-th column of matrix O stores a distance from node p_i whose rank is j, and through the index matrix E the position S_im of this distance in the direct distance matrix S can be found, i.e. node p_m is the node whose distance to node p_i is ranked j, node p_m being denoted p_i^(j);
the nearest neighbor p_i^(1) is added to the D neighbors; the perpendicular bisector of segment p_i p_i^(1) divides the plane into two regions, and the region on the p_i side of this perpendicular bisector is selected as the region to which the D neighbors of p_i belong; in the region to which the D neighbors belong, the neighbor nearest to p_i, namely p_i^(2), is selected; according to the perpendicular bisector of segment p_i p_i^(2), the region on the p_i side is selected as the new region to which the D neighbors belong, and in this region the point p_i^(3) nearest to p_i is selected, added, and used as the next reference point; this process is repeated until the region near p_i bounded by all the perpendicular bisectors becomes a closed region; all nodes in the closed region are the D neighbors of p_i, and the connecting lines between p_i and its D neighbors are the selected edges.
7. The method according to claim 1, wherein in step S302 the weight matrix W is defined as:

W_ij = deg(p_i) · S'_ij

where p_i is a node on the graph, W_ij represents the value of the i-th row and j-th column in the weight matrix W, deg(p_i) is the degree of the node, and S'_ij represents the similarity of node p_i and node p_j, i.e. the value of the i-th row and j-th column in the distance matrix S'.
8. The method according to claim 1, wherein the affinity matrix M is defined in step S303, specifically:
according to the weight matrix W, the diagonal matrix T is calculated using the following formula:

T_ii = Σ_j W_ij

where T_ii represents the value of the diagonal matrix T at row i, column i, and W_ij represents the value of the i-th row and j-th column of the weight matrix W;
the affinity matrix M is obtained after normalization using the following formula:

M = T^(-1/2) W T^(-1/2)

where T is the diagonal matrix, W is the weight matrix, and M is the affinity matrix.
9. The method of claim 1, wherein the label propagation in step S400 is performed by the local and global consistency LLGC method, calculated as follows:

F = (I - αM)^(-1) Y

where F is an n × c matrix, n is the number of nodes, c is the number of label types, and F_ij, the value of the i-th row and j-th column of matrix F, represents the probability that node p_i carries the j-th type of label; I is the identity matrix, α is a regulation parameter, M is the affinity matrix, and Y is the label information, an n × c matrix storing the label information of every node; Y_ij, the value of the i-th row and j-th column of matrix Y, is 1 if node p_i is marked with a class-j label and 0 otherwise; (I - αM)^(-1) propagates to each node the probability of acquiring the labels of the marked nodes;
finally, the label of node p_i is obtained, specifically:

y_i = argmax_{j ≤ c} F_ij

where F_ij is the value of the i-th row and j-th column of matrix F, argmax assigns to y_i the value of j at which F_ij attains its maximum, i.e. node p_i is marked as y_i, and the classification of the data is completed after all nodes are marked.
CN201911131232.XA 2019-11-20 2019-11-20 Semi-supervised classification method based on dynamic composition Active CN111046914B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911131232.XA CN111046914B (en) 2019-11-20 2019-11-20 Semi-supervised classification method based on dynamic composition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911131232.XA CN111046914B (en) 2019-11-20 2019-11-20 Semi-supervised classification method based on dynamic composition

Publications (2)

Publication Number Publication Date
CN111046914A true CN111046914A (en) 2020-04-21
CN111046914B CN111046914B (en) 2023-10-27

Family

ID=70232184

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911131232.XA Active CN111046914B (en) 2019-11-20 2019-11-20 Semi-supervised classification method based on dynamic composition

Country Status (1)

Country Link
CN (1) CN111046914B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019001071A1 (en) * 2017-06-28 2019-01-03 浙江大学 Adjacency matrix-based graph feature extraction system and graph classification system and method
CN109815986A (en) * 2018-12-24 2019-05-28 陕西师范大学 The semisupervised classification method of fusion part and global characteristics
CN109829472A (en) * 2018-12-24 2019-05-31 陕西师范大学 Semisupervised classification method based on probability neighbour
CN110309871A (en) * 2019-06-27 2019-10-08 西北工业大学深圳研究院 A kind of semi-supervised learning image classification method based on random resampling

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
朱常宝; 程勇; 高强: "Research on an image classification algorithm based on semi-supervised deep belief networks", Computer Science, no. 1 *
王娜; 王小凤; 耿国华; 宋倩楠: "A semi-supervised classification algorithm based on C-means clustering and graph transduction", Computer Applications, no. 09 *

Also Published As

Publication number Publication date
CN111046914B (en) 2023-10-27

Similar Documents

Publication Publication Date Title
Lee et al. Foreground focus: Unsupervised learning from partially matching images
CN113408605B (en) Hyperspectral image semi-supervised classification method based on small sample learning
Liu et al. Nonparametric scene parsing via label transfer
Das et al. Automatic clustering using an improved differential evolution algorithm
US7889914B2 (en) Automated learning of model classifications
CN110163258A (en) A kind of zero sample learning method and system reassigning mechanism based on semantic attribute attention
CN108897791B (en) Image retrieval method based on depth convolution characteristics and semantic similarity measurement
CN110647907B (en) Multi-label image classification algorithm using multi-layer classification and dictionary learning
CN101140623A (en) Video frequency objects recognition method and system based on supporting vectors machine
CN109635140B (en) Image retrieval method based on deep learning and density peak clustering
CN1723468A (en) Computer vision system and method employing illumination invariant neural networks
CN111275052A (en) Point cloud classification method based on multi-level aggregation feature extraction and fusion
CN113807456A (en) Feature screening and association rule multi-label classification algorithm based on mutual information
Hsieh et al. Adaptive structural co-regularization for unsupervised multi-view feature selection
Andreetto et al. Unsupervised learning of categorical segments in image collections
CN117253093A (en) Hyperspectral image classification method based on depth features and graph annotation force mechanism
Tamrakar et al. Integration of lazy learning associative classification with kNN algorithm
CN105844299A (en) Image classification method based on bag of words
CN111046914B (en) Semi-supervised classification method based on dynamic composition
Turtinen et al. Contextual analysis of textured scene images.
CN114821157A (en) Multi-modal image classification method based on hybrid model network
Jamotton et al. Insurance analytics with clustering techniques
Chowdhury et al. Compact image signature generation: An application in image retrieval
Dupé et al. Hierarchical bag of paths for kernel based shape classification
Di Ruberto et al. Decomposition of two-dimensional shapes for efficient retrieval

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant