CN111524098A - Neural network output layer cutting and template frame size determining method based on self-organizing clustering

Info

Publication number: CN111524098A (granted as CN111524098B)
Application number: CN202010265447.7A
Authority: CN (China)
Other languages: Chinese (zh)
Legal status: Granted; Active
Inventors: 郝梦茜, 张辉, 周斌, 杨柏胜, 倪少波, 靳松直, 丛龙剑, 刘严羊硕, 郑文娟, 韦海萍, 田爱国, 邵俊伟, 李建伟, 张孝赫, 张连杰, 张艺明
Assignee (original and current): Beijing Aerospace Automatic Control Research Institute
Priority and filing date: 2020-04-07 (CN202010265447.7A)
Publication dates: 2020-08-11 (CN111524098A), 2023-05-12 (CN111524098B, grant)

Classifications

    • G06T 7/0002 Image analysis; inspection of images, e.g. flaw detection
    • G06T 7/10 Image analysis; segmentation, edge detection
    • G06F 18/23213 Pattern recognition; non-hierarchical clustering techniques using statistics or function optimisation with a fixed number of clusters, e.g. K-means clustering
    • G06N 3/045 Neural networks; architecture; combinations of networks
    • G06T 2207/10004 Image acquisition modality; still image, photographic image
    • G06T 2207/20081 Special algorithmic details; training, learning
    • G06T 2207/20132 Image segmentation details; image cropping

Abstract

The invention relates to a neural network output layer cutting and template frame size determining method based on self-organizing clustering. It belongs to the technical field of target detection and identification with convolutional neural networks, and in particular provides a network output layer clipping and template frame size determination method for the SSD algorithm. Self-organizing clustering yields a good clustering result even when the size distribution of the targets is uncertain. The clustering result is used to calculate the upper-limit area of the targets and to determine the number of output layers; the output layers with overly large receptive fields and the excess layers are deleted. This reduces the network depth and the number of parameters, lowers the difficulty of model training, accelerates model convergence, improves the generalization ability of the model, and shortens computation time, improving computational efficiency.

Description

Neural network output layer cutting and template frame size determining method based on self-organizing clustering
Technical Field
The invention relates to a neural network output layer cutting and template frame size determining method based on self-organizing clustering, belongs to the technical field of target detection and identification with convolutional neural networks, and in particular provides a network output layer clipping and template frame size determination method for the SSD algorithm.
Background
In recent years, convolutional neural networks have shown performance far beyond traditional image analysis methods in image target detection and identification, with good results in civil, national defense, industrial, and other fields. At present, academic research on convolutional neural networks focuses mainly on large-target scenes in visible-light images; in such problems the targets are large, the features are rich, and training samples are plentiful, so a deeper network is needed to provide better nonlinear features for target detection and identification.
However, some special application scenarios such as remote sensing and military applications mainly use SAR and infrared images, where the imaging resolution is low, the target types are limited, the target pixel size is generally small, and the number of training samples is limited. Using a deeper network then often makes the training process difficult to converge, the training result prone to overfitting, the model's generalization poor, and the practical effect unsatisfactory.
To address this, some schemes reduce the difficulty of model training by reducing the network depth and the number of parameters to be trained, but such depth reduction basically relies on manual, experience-based adjustment, whose effect is difficult to guarantee.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: for scenes with limited target types, generally small target pixel sizes, and a limited number of training samples, an overly deep SSD network makes training convergence difficult and model generalization poor. The method clusters the target sizes in the training sample data with a self-organizing clustering algorithm to obtain the number of cluster centers and the cluster centers themselves; according to the corresponding criteria, it determines the number of SSD network output layers and the sizes of the template boxes (also called default boxes), removes unnecessary output layers, reduces the network depth and complexity, and lowers the difficulty of model training convergence. For the above problems, the scheme provides a self-organizing-clustering-based output layer clipping and template frame size determination method for the SSD algorithm: by analyzing the training sample data, the sample size distribution is extracted, from which a suitable number of network output layers and reasonable template frame sizes are determined; the original model's output layers are clipped and the template frame sizes are set accordingly, reducing the network depth and complexity, lowering the difficulty of training convergence, and shortening computation time.
The solution of the invention is:
a neural network output layer clipping and template frame size determining method based on self-organizing clustering comprises the following steps:
(1) Each target in the training data is represented by a two-dimensional feature vector (w, h), where w is the target's pixel width and h its pixel height. The number of such vectors is denoted N; the vectors are referred to as samples, and a sample is denoted x.
(2) For the N samples obtained in step (1), set the initial number of cluster centers to K, the minimum number of samples per cluster to θ_N, the standard-deviation threshold of the within-cluster sample-distance distribution to θ_S, the minimum distance between two cluster centers to θ_C, and the maximum number of iterations to I_max.
(3) Randomly select K of the N samples as initial cluster centers and let N_C = K, where N_C denotes the current number of cluster centers. Each cluster center is denoted Z_j, j = 1, 2, …, N_C; the category corresponding to each cluster center is denoted S_j, j = 1, 2, …, N_C; the number of samples in category S_j is denoted N_j, j = 1, 2, …, N_C. The iteration count is denoted I and initialized to I = 1.
(4) Traverse all samples x, calculate the distance D_j between sample x and each cluster center Z_j, and assign sample x to the category of the cluster center nearest to it.
(5) If the number of samples N_j in some category S_j satisfies N_j < θ_N, cancel that category, decrease the current number of cluster centers N_C by 1, and reassign the category's samples to the other categories by the minimum-distance criterion of (4); otherwise leave S_j unchanged.
(6) For each category S_j, take the mean of its samples x as the corrected cluster center Z_j, j = 1, 2, …, N_C.
(7) For each category S_j, calculate the average distance from the samples in the category to the cluster center: D̄_j = (1/N_j) Σ_{x∈S_j} ||x - Z_j||, j = 1, 2, …, N_C.
(8) Calculate the total average distance between all category samples and their respective cluster centers: D̄ = (1/N) Σ_{j=1,…,N_C} N_j·D̄_j.
(9) Decide among splitting, merging, and iteration for the categories S_j:
1) If the iteration count I ≥ I_max, i.e. this is the last iteration, set θ_C = 0 and jump to step (13).
2) If N_C ≤ K/2, i.e. the number of cluster centers is no more than half the specified value, go to step (10) and split the existing clusters.
3) If the iteration count I is even, or N_C ≥ 2K, do no splitting and jump to step (13); if I is odd and N_C < 2K, go to step (10) for splitting.
(10) For each category S_j, calculate the standard-deviation vector σ_j of its samples x with respect to the cluster center Z_j, j = 1, 2, …, N_C.
(11) From each standard-deviation vector σ_j calculated in (10), extract the maximum component, denoted σ_jmax, j = 1, 2, …, N_C.
(12) If, in the set of maximum components {σ_jmax}, j = 1, 2, …, N_C, some σ_jmax > θ_S and either of the following two conditions holds:
(1) D̄_j > D̄ and N_j > 2(θ_N + 1);
(2) N_C ≤ K/2;
then split Z_j into two new cluster centers and increase the number of cluster centers N_C by 1. After the splitting operation is finished, jump back to step (4) and increase the iteration count I by 1; otherwise do not operate on cluster center Z_j and proceed to step (13).
(13) Calculate the pairwise distances between the N_C cluster centers: D_ij = ||Z_i - Z_j||, i = 1, 2, …, N_C - 1, j = i + 1, …, N_C.
(14) If the distance D_ij between the two nearest cluster centers satisfies D_ij < θ_C, merge the two cluster centers into one new cluster center, merge their categories into one category, and decrease the number of cluster centers N_C by 1; otherwise do nothing.
(15) If the iteration count I ≥ I_max, end the clustering operation and go to step (16); otherwise return to step (4) and increase the iteration count I by 1.
(16) For the N_C cluster centers Z_j = (w_j, h_j), calculate the upper-limit area A_j^up of each cluster; sort the N_C values A_j^up in ascending order to obtain the maximum upper-limit area A_max^up.
(17) Determine the number of output layers L_out according to A_max^up:
If A_max^up > (300/3)², the number of output layers is L_out = 6;
if (300/5)² < A_max^up ≤ (300/3)², the number of output layers is L_out = 5;
if (300/10)² < A_max^up ≤ (300/5)², the number of output layers is L_out = 4;
if (300/19)² < A_max^up ≤ (300/10)², the number of output layers is L_out = 3;
if A_max^up ≤ (300/19)², the number of output layers is L_out = 2.
(18) The SSD algorithm has 6 output layers: conv4_3, fc7, conv8_2, conv9_2, conv10_2, conv11_2.
If L_out = 2, keep only the conv4_3 and fc7 layers and delete the convolutional layers after fc7;
if L_out = 3, keep only conv4_3, fc7, and conv8_2 and delete the convolutional layers after conv8_2;
if L_out = 4, keep only conv4_3, fc7, conv8_2, and conv9_2 and delete the convolutional layers after conv9_2;
if L_out = 5, keep only conv4_3, fc7, conv8_2, conv9_2, and conv10_2 and delete the convolutional layers after conv10_2;
if L_out = 6, do not prune the SSD network.
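To make the criterion concrete, steps (17)-(18) can be expressed in a few lines of code. The following is a minimal Python sketch under the thresholds above; the names output_layer_count and retained_layers are illustrative, and a_up_max stands for the maximum upper-limit area A_max^up from step (16).

    SSD_OUTPUT_LAYERS = ["conv4_3", "fc7", "conv8_2", "conv9_2", "conv10_2", "conv11_2"]

    def output_layer_count(a_up_max):
        # Start from L_out = 2 (a_up_max <= (300/19)^2) and add one layer
        # for each (300/s)^2 band that a_up_max exceeds.
        l_out = 2
        for s in (19, 10, 5, 3):
            if a_up_max > (300 / s) ** 2:
                l_out += 1
        return l_out  # value in 2..6

    def retained_layers(l_out):
        # L_out = 6 keeps the full SSD head; smaller values drop deeper layers.
        return SSD_OUTPUT_LAYERS[:l_out]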
(19) Determine, for each cluster center, the output layer where its template frame is placed:
For the N_C cluster centers Z_j = (w_j, h_j), calculate the area A_j = w_j × h_j.
If A_j > (300/3)², design the corresponding template box at the conv11_2 layer;
if (300/5)² < A_j ≤ (300/3)², design the corresponding template box at the conv10_2 layer;
if (300/10)² < A_j ≤ (300/5)², design the corresponding template box at the conv9_2 layer;
if (300/19)² < A_j ≤ (300/10)², design the corresponding template box at the conv8_2 layer;
if (300/38)² < A_j ≤ (300/19)², design the corresponding template box at the fc7 layer;
if A_j ≤ (300/38)², design the corresponding template box at the conv4_3 layer.
(20) Determine the template frame size corresponding to each cluster center:
The template frame sizes corresponding to the N_C cluster centers Z_j = (w_j, h_j) are respectively:
min_size = sqrt(w_j × h_j)
max_size = max(w_j, h_j)
aspect_ratio = floor(max(w_j, h_j) / min(w_j, h_j))
where floor() rounds down.
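For illustration, the layer-assignment rule of step (19) and the size formulas of step (20) can be sketched together in Python; template_layer and template_size are illustrative names, and the formulas are the reconstructions given above.

    import math

    def template_layer(w, h):
        # Route a cluster center (w, h) to the shallowest layer whose
        # (300/s)^2 band contains its area; larger areas go deeper.
        area = w * h
        for layer, s in [("conv4_3", 38), ("fc7", 19), ("conv8_2", 10),
                         ("conv9_2", 5), ("conv10_2", 3)]:
            if area <= (300 / s) ** 2:
                return layer
        return "conv11_2"

    def template_size(w, h):
        min_size = math.sqrt(w * h)      # geometric mean of width and height
        max_size = max(w, h)
        aspect_ratio = math.floor(max(w, h) / min(w, h))
        return min_size, max_size, aspect_ratio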
(21) If, after template frames have been designed for all cluster centers, some output layer has no template frame, design one by the following criteria:
An output layer with no template frame adopts the min_size, max_size, and aspect_ratio parameters of the layer nearest to it. If two output layers are at the same distance from that layer, with the shallower denoted Layer_B and the deeper Layer_T, the layer's min_size and max_size are taken as the means of the corresponding Layer_B and Layer_T values, and its aspect_ratio is the union of aspect_ratio_B of Layer_B and aspect_ratio_T of Layer_T.
(22) Add a template frame with proportion aspect_ratio = 1 to all output layers.
(23) Train the convolutional neural network after output-layer clipping and template-frame size determination to obtain a neural network model with fewer layers, lower complexity, and higher computational efficiency.
In the above scheme, in step (1), the specific method for extracting the target width w and height h from the labeling information is: read the values <xmin>, <ymin>, <xmax>, <ymax> in each <bndbox> node of the xml file, and calculate the target's width w = xmax - xmin + 1 and height h = ymax - ymin + 1.
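As an illustration, this extraction might be implemented as follows; a minimal Python sketch assuming Pascal-VOC-style xml annotation files, with load_samples an illustrative name.

    import xml.etree.ElementTree as ET

    def load_samples(xml_paths):
        # Collect one (w, h) sample per <bndbox> node across the given files.
        samples = []
        for path in xml_paths:
            root = ET.parse(path).getroot()
            for box in root.iter("bndbox"):
                xmin = float(box.find("xmin").text)
                ymin = float(box.find("ymin").text)
                xmax = float(box.find("xmax").text)
                ymax = float(box.find("ymax").text)
                samples.append((xmax - xmin + 1, ymax - ymin + 1))  # (w, h)
        return samples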
In the above scheme, in step (3), the specific method for randomly selecting K samples is: generate K random numbers α_1, α_2, …, α_K uniformly distributed on U(0, 1), and take the ceil(α_i·N)-th sample as the i-th initial cluster center, where ceil() rounds up.
In the above scheme, in step (4), the distance between sample x and cluster center Z_j is calculated as: D_j = ||x - Z_j||.
In the above scheme, in step (4), the classification method for sample x is: if D_j = min{D_i, i = 1, 2, …, N_C}, then sample x is assigned to category S_j.
In the above scheme, in step (5), the specific method for canceling a category is: cancel its cluster center, so that the number of cluster centers N_C decreases by 1; release the samples originally belonging to that category, calculate the distance between each released sample and the remaining cluster centers, and assign each released sample to the category whose cluster center is nearest.
In the above scheme, in step (6), the specific method for correcting the cluster center Z_j of each category is: Z_j = (1/N_j) Σ_{x∈S_j} x.
In the above scheme, in step (7), the specific method for calculating the average distance D̄_j from the samples in each category S_j to the cluster center is: D̄_j = (1/N_j) Σ_{x∈S_j} ||x - Z_j||.
In the above scheme, in step (8), the specific method for calculating the total average distance D̄ between all category samples and their respective cluster centers is: D̄ = (1/N) Σ_{j=1,…,N_C} N_j·D̄_j.
the above scheme calculates each class S in step (10)jWhere each sample x ═ xw,xh) To the clustering center Zj=(wj,hj) Is a standard deviation vector ofj=(σw,jh,j) The specific method comprises the following steps:
Figure BDA0002441108120000065
Figure BDA0002441108120000066
in step (11), the above scheme extracts sigma in each standard deviation vectorj=(σw,jh,j) Maximum component σ ofjmaxThe specific method comprises the following steps:
σjmax=max(σw,jh,j)
in step (12), Z isjThe specific method for splitting into two new clustering centers is as follows: clustering SjFrom medium samples x to cluster center ZjHas a standard deviation ofj=(σw,jh,j) If σ isw,j≥σh,jLet γ be (σ)w,j0); if σw,jh,jLet γ equal to (0, σ)h,j)。ZjIs split offThe two new cluster centers of (a) are: zj+kγ and Zj-k γ, wherein 0<k<1。
In the above scheme, in step (14), the specific method for merging cluster centers is: if the distance D_ij between the two nearest cluster centers satisfies D_ij < θ_C, merge their two categories into one category, cancel the cluster-center status of both centers, recalculate the cluster center of the samples released from the two categories as Z = (N_i·Z_i + N_j·Z_j)/(N_i + N_j), and decrease the number of cluster centers N_C by 1.
In the above scheme, in step (16), the maximum upper-limit area A_max^up is obtained as: A_max^up = max{A_j^up, j = 1, 2, …, N_C}, where A_j^up is the upper-limit area calculated in step (16) for cluster center Z_j = (w_j, h_j).
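Putting steps (3)-(15) together, the clustering loop can be sketched in Python as follows. This is a minimal sketch under the formulas reconstructed above, not the patent's reference implementation; self_organizing_clustering and its parameter names are illustrative.

    import numpy as np

    def self_organizing_clustering(x, K, theta_n, theta_s, theta_c, i_max,
                                   k=0.5, seed=None):
        # x: (N, 2) array of (w, h) samples; returns the final cluster centers.
        rng = np.random.default_rng(seed)
        n = len(x)
        # Step (3): the ceil(alpha_i * N)-th samples are the initial centers.
        centers = [x[max(int(np.ceil(a * n)) - 1, 0)].astype(float)
                   for a in rng.uniform(size=K)]
        for it in range(1, i_max + 1):
            # Step (4): assign every sample to its nearest center.
            dist = np.linalg.norm(x[:, None, :] - np.asarray(centers)[None, :, :], axis=2)
            label = dist.argmin(axis=1)
            # Step (5): cancel clusters with fewer than theta_n samples
            # (keeping at least the largest one), then reassign.
            counts = np.bincount(label, minlength=len(centers))
            keep = [j for j in range(len(centers)) if counts[j] >= theta_n] \
                or [int(counts.argmax())]
            centers = [centers[j] for j in keep]
            dist = np.linalg.norm(x[:, None, :] - np.asarray(centers)[None, :, :], axis=2)
            label = dist.argmin(axis=1)
            # Step (6): correct each center to the mean of its samples.
            centers = [x[label == j].mean(axis=0) for j in range(len(centers))]
            # Steps (7)-(8): within-cluster and total average distances.
            n_j = np.array([(label == j).sum() for j in range(len(centers))])
            d_j = np.array([np.linalg.norm(x[label == j] - centers[j], axis=1).mean()
                            for j in range(len(centers))])
            d_bar = (n_j * d_j).sum() / n
            # Step (9): theta_c is forced to 0 on the last pass; splitting is
            # attempted when N_C <= K/2, or when I is odd and N_C < 2K.
            theta_c_eff = 0.0 if it >= i_max else theta_c
            try_split = it < i_max and (len(centers) <= K / 2
                                        or (it % 2 == 1 and len(centers) < 2 * K))
            if try_split:
                split = False
                for j in range(len(centers)):
                    # Steps (10)-(11): per-cluster standard deviation, max component.
                    sigma = np.sqrt(((x[label == j] - centers[j]) ** 2).mean(axis=0))
                    # Step (12): split an elongated cluster along its wide axis.
                    if sigma.max() > theta_s and (
                            (d_j[j] > d_bar and n_j[j] > 2 * (theta_n + 1))
                            or len(centers) <= K / 2):
                        gamma = np.array([sigma[0], 0.0]) if sigma[0] >= sigma[1] \
                            else np.array([0.0, sigma[1]])
                        centers[j:j + 1] = [centers[j] + k * gamma,
                                            centers[j] - k * gamma]
                        split = True
                        break
                if split:
                    continue  # jump back to step (4) with I increased by 1
            # Steps (13)-(14): merge the two nearest centers if closer than theta_c.
            if len(centers) >= 2:
                pairs = [(np.linalg.norm(centers[i] - centers[j]), i, j)
                         for i in range(len(centers) - 1)
                         for j in range(i + 1, len(centers))]
                d_min, i0, j0 = min(pairs)
                if d_min < theta_c_eff:
                    merged = (n_j[i0] * centers[i0] + n_j[j0] * centers[j0]) \
                        / (n_j[i0] + n_j[j0])
                    centers = [c for m, c in enumerate(centers) if m not in (i0, j0)]
                    centers.append(merged)
            # Step (15): the loop bound i_max plays the role of the check on I.
        return centers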
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention;
FIG. 2 is a schematic diagram of the dimensions of a template frame.
Detailed Description
The invention is further illustrated by the following figures and examples.
Examples
FIG. 1 shows the specific implementation process of the SSD network output layer clipping and template frame size determination method based on self-organizing clustering according to the invention.
In FIG. 1, "extracting the width and height of the targets in the training data as feature-vector samples" corresponds to step (1):
The training data comprise 1000 pictures in total. Traverse all targets in all pictures, read the values <xmin>, <ymin>, <xmax>, <ymax> in each target's labeling <bndbox> node, record the target's width w = xmax - xmin + 1 and height h = ymax - ymin + 1, and take (w, h) as a two-dimensional feature-vector sample x for the subsequent operations. In this embodiment the number of feature vectors is N = 1858.
"Parameter initialization" in FIG. 1 corresponds to step (2):
In this embodiment, the initial number of cluster centers is set to K = 6, the minimum number of samples per cluster to θ_N = 80, the standard-deviation threshold of the within-cluster sample-distance distribution to θ_S = 5, the minimum distance between two cluster centers to θ_C = 5, and the maximum number of iterations to I_max = 100.
"Randomly selecting initial cluster centers" in FIG. 1 corresponds to step (3):
Generate K random numbers α_1, α_2, …, α_K uniformly distributed between 0 and 1; take the ceil(α_i·N)-th sample as the i-th initial cluster center, where ceil() rounds up; let N_C = K and the iteration count I = 1.
"Samples are classified by the minimum-distance criterion" in FIG. 1 corresponds to step (4):
Traverse all samples x, calculate the distance D_j = ||x - Z_j|| between sample x and each cluster center Z_j, and assign sample x to the category of the cluster center nearest to it.
"Cancel categories with too few samples" in FIG. 1 corresponds to step (5):
If the number of samples N_j in some category S_j satisfies N_j < θ_N, cancel that category and decrease the current number of cluster centers N_C by 1; release the samples originally belonging to that category, calculate the distance between each released sample and the remaining cluster centers, and assign each released sample to the category whose cluster center is nearest. Otherwise leave S_j unchanged.
"correcting cluster center" in fig. 1 corresponds to step (6):
for each category SjAll samples in (1) are averaged to obtain an average value which is the corrected clustering center
Figure BDA0002441108120000131
j=1,2,…,Nc
In fig. 1, "calculating the average distance from the samples in each class to the cluster center" corresponds to step (7):
calculate each class SjSample to cluster center Z injAverage distance of
Figure BDA0002441108120000132
"calculate the total average distance of all class samples from their respective cluster centers" in fig. 1 corresponds to step (8):
calculating the total average distance between all the class samples and the corresponding cluster centers
Figure BDA0002441108120000133
In fig. 1, "splitting, merging, and iterative operations for judgment category" corresponds to step (9):
judging the splitting, merging and iterative operation of the category, judging whether the current state needs to be split, if so, jumping to the step (10), and if not, jumping to the step (13), wherein the specific judgment method comprises the following steps:
1) if the iterative operation times I is more than or equal to ImaxI.e. the last iteration, byCAnd (4) jumping to the step (13) when the value is 0.
2) If theta is greater than thetaNK/2, namely the number of the cluster centers is equal to or less than half of the specified value, the step (10) is entered, and the existing clusters are split.
3) If the number of iterations I is even, or NCIf the K is more than or equal to 2K, the splitting treatment is not carried out, and the step (13) is skipped; if I is odd, and NC<And 2K, entering the step (10) to perform splitting treatment.
In fig. 1, "calculating the standard deviation of each class sample to the cluster center" corresponds to step (10):
calculate each class SjWhere each sample x ═ xw,xh) To the clustering center ZjIs a standard deviation vector ofj=(σw,jh,j) The specific method comprises the following steps:
Figure BDA0002441108120000141
Figure BDA0002441108120000142
the "obtaining the largest component in standard deviation" described in fig. 1 corresponds to step (11):
extracting sigma in each standard deviation vectorj=(σw,jh,j) Maximum component σ ofjmax=max(σw,jh,j)。
The "class splitting for classes satisfying the splitting condition" described in fig. 1 corresponds to step (12):
if each class SjHas a injmaxSAnd either of the following two conditions is satisfied:
(1)
Figure BDA0002441108120000143
and N isj>(θN+1)*2;
(2)NC≤K/2;
Then Z will bejSplit into two new cluster centers, class SjFrom medium samples x to cluster center ZjHas a standard deviation ofj=(σw,jh,j) If σ isw,j≥σh,jLet γ be (σ)w,j0); if σw,jh,jLet γ equal to (0, σ)h,j)。ZjTwo new clustering centers which are split are respectively Zj+kγ and Zj-k γ, wherein 0<k<1, in this embodiment, k is equal to 0.5, and let the number of clustering centers NCAdding 1; otherwise, the clustering center Z is not alignedjThe operation proceeds to step (13).
And (4) after the splitting operation is finished, adding 1 to the iterative operation times I, and returning to the step (4).
"Calculating the distance between every two cluster centers" described in FIG. 1 corresponds to step (13):
Calculate the pairwise distances between the N_C cluster centers: D_ij = ||Z_i - Z_j||, i = 1, 2, …, N_C - 1, j = i + 1, …, N_C.
"Merge categories satisfying the merging condition" described in FIG. 1 corresponds to step (14):
If the distance D_ij between the two nearest cluster centers satisfies D_ij < θ_C, merge their two categories into one category and the two cluster centers into one new cluster center: cancel the cluster-center status of both centers, recalculate the cluster center of the released samples as Z = (N_i·Z_i + N_j·Z_j)/(N_i + N_j), and decrease the number of cluster centers N_C by 1.
"Judge whether the iteration is finished" in FIG. 1 corresponds to step (15):
If the iteration count I ≥ I_max, end the clustering operation and go to step (16); otherwise return to step (4) and increase the iteration count I by 1.
The "calculating the maximum upper limit area of the cluster center" described in fig. 1 corresponds to step (16):
to NCIndividual cluster center vector (w)j,hj) Calculating the upper limit area
Figure BDA0002441108120000152
To NCAn
Figure BDA0002441108120000153
Sorting from small to large to obtain the maximum upper limit area
Figure BDA0002441108120000154
In this embodiment, 7 cluster centers are obtained at the end of clustering, which are respectively (10.9,28.7), (27.2,12.1), (9.8,4.5), (6.8,11.4), (13.9,21.7), (19.7,15.4), (11.9,10.2), and the upper limit area of each cluster center is calculated
Figure BDA0002441108120000155
Figure BDA0002441108120000156
So that the maximum upper limit area is
Figure BDA0002441108120000157
The "judgment of the number of output layers" in fig. 1 corresponds to step (17):
according to
Figure BDA0002441108120000158
Judging the number L of output layersout
If it is
Figure BDA0002441108120000159
The number of output layers is Lout=6;
If it is
Figure BDA00024411081200001510
The number of output layers is Lout=5;
If it is
Figure BDA00024411081200001511
The number of output layers is Lout=4;
If it is
Figure BDA00024411081200001512
The number of output layers is Lout=3;
If it is
Figure BDA0002441108120000161
The number of output layers is Lout=2。
In this example
Figure BDA0002441108120000162
So that the number of output layers Lout=3。
The "pruning SSD network" described in fig. 1 corresponds to step (18):
there are 6 output layers in the SSD algorithm, conv4_3, fc7, conv8_2, conv9_2, conv10_2, conv11_ 2.
If L isoutOnly conv4_3, fc7 layers are reserved when the number is 2, and convolutional layers after fc7 are deleted;
if L isoutOnly conv4_3, fc7, conv8_2 are reserved when the value is 3, and the convolutional layer after conv8_2 is deleted;
if L isoutOnly conv4_3, fc7, conv8_2 and conv9_2 are reserved for 4, and the convolution layer behind conv9_2 is deleted;
if L isoutOnly conv4_3, fc7, conv8_2, conv9_2, conv10_2 are reserved when the conv10_2 is deleted when the conv is 5The convolutional layer of (1);
if L isoutNo pruning is done for the SSD network 6.
In this example LoutTherefore, only conv4_3, fc7, conv8_2 output layers are reserved, and convolutional layers after conv8_2 are deleted.
"Determining the output layer where the corresponding template frame is located for each cluster center" described in FIG. 1 corresponds to step (19):
For each cluster center Z_j = (w_j, h_j), calculate the area A_j = w_j × h_j.
If A_j > (300/3)², design the corresponding template box at the conv11_2 layer;
if (300/5)² < A_j ≤ (300/3)², design the corresponding template box at the conv10_2 layer;
if (300/10)² < A_j ≤ (300/5)², design the corresponding template box at the conv9_2 layer;
if (300/19)² < A_j ≤ (300/10)², design the corresponding template box at the conv8_2 layer;
if (300/38)² < A_j ≤ (300/19)², design the corresponding template box at the fc7 layer;
if A_j ≤ (300/38)², design the corresponding template box at the conv4_3 layer.
The cluster centers in this embodiment are (10.9, 28.7), (27.2, 12.1), (9.8, 4.5), (6.8, 11.4), (13.9, 21.7), (19.7, 15.4), (11.9, 10.2); their areas are A_1 = 312.83, A_2 = 329.12, A_3 = 44.10, A_4 = 77.52, A_5 = 301.63, A_6 = 303.38, A_7 = 121.38, so the corresponding template frames are located at the conv8_2, conv8_2, conv4_3, fc7, conv8_2, conv8_2, and fc7 layers, respectively.
The "determine their respective template box size for each cluster center" correspondence step (20) described in fig. 1:
NCindividual clustering center Zj=(wj,hj) The corresponding template frame dimensions are:
Figure BDA0002441108120000171
max_size=max(wj,hj)
Figure BDA0002441108120000172
where floor () is rounded down.
In this embodiment, the cluster centers are (10.9, 28.7), (27.2, 12.1), (9.8, 4.5), (6.8, 11.4), (13.9, 21.7), (19.7, 15.4), (11.9, 10.2). The template frame sizes corresponding to the cluster centers are calculated as:
Cluster center 1: min_size = 17.7; max_size = 28.7; aspect_ratio = 2
Cluster center 2: min_size = 18.1; max_size = 27.2; aspect_ratio = 2
Cluster center 3: min_size = 6.6; max_size = 9.8; aspect_ratio = 2
Cluster center 4: min_size = 8.8; max_size = 11.4; aspect_ratio = 1
Cluster center 5: min_size = 17.3; max_size = 21.7; aspect_ratio = 1
Cluster center 6: min_size = 17.4; max_size = 19.7; aspect_ratio = 1
Cluster center 7: min_size = 11.0; max_size = 11.9; aspect_ratio = 1
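Assuming the reconstructed formulas of steps (19)-(20) and the template_layer/template_size sketches above, these numbers can be reproduced directly:

    centers = [(10.9, 28.7), (27.2, 12.1), (9.8, 4.5), (6.8, 11.4),
               (13.9, 21.7), (19.7, 15.4), (11.9, 10.2)]
    for w, h in centers:
        print(template_layer(w, h), template_size(w, h))
    # The first line prints conv8_2 (17.68..., 28.7, 2), matching cluster
    # center 1 and its conv8_2 placement; the embodiment keeps one decimal.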
The "outputting layer for which the template box has not been designed as described in fig. 1 relates to the template box" corresponding step (21):
if a certain output layer does not design the corresponding template frame, the min _ size, max _ size, aspect _ ratio parameter of the layer closest to it is used. If two output layers are at the same distance from the Layer, the shallow Layer is LayerBDeep Layer is LayerTThen the output layer parameter
Figure BDA0002441108120000173
Layer adopted by aspect _ ratioBAspect _ ratio ofB and LayerTAspect _ ratio ofTThe union of (a).
In this embodiment, the conv4_3, fc7, and conv8_2 output layers all have corresponding template frames, so this step is not performed. The template frames at this point are:
conv4_3 layer: min_size = 6.6; max_size = 9.8; aspect_ratio = 2
fc7 layer: min_size = 8.8; max_size = 11.4; aspect_ratio = 1
min_size = 11.0; max_size = 11.9; aspect_ratio = 1
conv8_2 layer: min_size = 17.7; max_size = 28.7; aspect_ratio = 2
min_size = 18.1; max_size = 27.2; aspect_ratio = 2
min_size = 17.3; max_size = 21.7; aspect_ratio = 1
min_size = 17.4; max_size = 19.7; aspect_ratio = 1
"Adding a template frame with proportion aspect_ratio = 1 to all output layers" described in FIG. 1 corresponds to step (22):
In this embodiment, the conv4_3 layer has no template frame with proportion 1, so one is added; the fc7 and conv8_2 layers already have template frames with proportion 1, so none needs to be added.
In the network structure finally determined by the scheme, the SSD network retains the conv4_3, fc7, and conv8_2 output layers, the convolutional layers after conv8_2 are deleted, and the template frame design of each output layer is as follows:
conv4_3 layer:
min_size = 6.6
max_size = 9.8
aspect_ratio = 1, 2
fc7 layer:
min_size = 8.8, 11.0
max_size = 11.4, 11.9
aspect_ratio = 1
conv8_2 layer:
min_size = 17.7, 18.1, 17.3, 17.4
max_size = 28.7, 27.2, 21.7, 19.7
aspect_ratio = 1, 2
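Written out as data (a sketch; the patent does not prescribe a configuration format), the final head of this embodiment is:

    TEMPLATE_FRAMES = {
        "conv4_3": {"min_size": [6.6], "max_size": [9.8], "aspect_ratio": [1, 2]},
        "fc7":     {"min_size": [8.8, 11.0], "max_size": [11.4, 11.9],
                    "aspect_ratio": [1]},
        "conv8_2": {"min_size": [17.7, 18.1, 17.3, 17.4],
                    "max_size": [28.7, 27.2, 21.7, 19.7],
                    "aspect_ratio": [1, 2]},
    }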
a schematic of the dimensions of the template frame is shown in fig. 2.
Before modification, the SSD network required 35,000 iterations to converge to mAP = 0.9; after modification, only 23,000 iterations were needed, which shows that the scheme removes unnecessary output layers, reduces the network depth and complexity, and lowers the difficulty of training convergence. Before modification the network computation took 29 ms; after modification it took 20 ms, improving computational efficiency.
The invention uses the self-organizing clustering algorithm to perform cluster analysis on the sizes of the training samples, determines the number of SSD network output layers according to the clustering result, and clips the network.
Self-organizing clustering yields a good clustering result even when the size distribution of the targets is uncertain. The clustering result is used to calculate the upper-limit area of the targets and to determine the number of output layers, and the output layers with overly large receptive fields and the excess layers are deleted. This reduces the network depth and the number of parameters, lowers the difficulty of model training, accelerates model convergence, improves the generalization ability of the model, and shortens computation time, improving computational efficiency.
The invention uses the self-organizing clustering algorithm to perform clustering analysis on the sizes of the training samples, and determines the size of the template frame according to the clustering result.
Designing the template frame sizes from the self-organizing clustering result makes the template frames closer to the real sizes of the targets, reduces the difficulty of the network's regression of target position offsets, and improves target detection accuracy.

Claims (10)

1. A neural network output layer cutting and template frame size determining method based on self-organizing clustering is characterized in that the method comprises the following steps:
(1) each target in the training data is represented by a two-dimensional feature vector (w, h), where w is the target's pixel width and h its pixel height; the number of such vectors is denoted N, the vectors are referred to as samples, and a sample is denoted x;
(2) for the N samples obtained in step (1), set the initial number of cluster centers to K, the minimum number of samples per cluster to θ_N, the standard-deviation threshold of the within-cluster sample-distance distribution to θ_S, the minimum distance between two cluster centers to θ_C, and the maximum number of iterations to I_max;
(3) randomly select K of the N samples as initial cluster centers and let N_C = K, where N_C denotes the current number of cluster centers; each cluster center is denoted Z_j, j = 1, 2, …, N_C; the category corresponding to each cluster center is denoted S_j, j = 1, 2, …, N_C; the number of samples in category S_j is denoted N_j, j = 1, 2, …, N_C; the iteration count is denoted I;
(4) traverse all samples x, calculate the distance D_j between sample x and each cluster center Z_j, and assign sample x to the category of the cluster center nearest to it;
(5) if the number of samples N_j in some category S_j satisfies N_j < θ_N, cancel that category, decrease the current number of cluster centers N_C by 1, and reassign the category's samples to the other categories by the minimum-distance criterion of (4); otherwise leave S_j unchanged;
(6) for each category S_j, take the mean of its samples x as the corrected cluster center Z_j, j = 1, 2, …, N_C;
(7) for each category S_j, calculate the average distance from the samples in the category to the cluster center: D̄_j = (1/N_j) Σ_{x∈S_j} ||x - Z_j||, j = 1, 2, …, N_C;
(8) calculate the total average distance between all category samples and their respective cluster centers: D̄ = (1/N) Σ_{j=1,…,N_C} N_j·D̄_j;
(9) decide among splitting, merging, and iteration for the categories S_j:
1) if the iteration count I ≥ I_max, i.e. this is the last iteration, set θ_C = 0 and jump to step (13);
2) if N_C ≤ K/2, i.e. the number of cluster centers is no more than half the specified value, go to step (10) and split the existing clusters;
3) if the iteration count I is even, or N_C ≥ 2K, do no splitting and jump to step (13); if I is odd and N_C < 2K, go to step (10) for splitting;
(10) for each category S_j, calculate the standard-deviation vector σ_j of its samples x with respect to the cluster center Z_j, j = 1, 2, …, N_C;
(11) from each standard-deviation vector σ_j calculated in (10), extract the maximum component, denoted σ_jmax, j = 1, 2, …, N_C;
(12) if, in the set of maximum components {σ_jmax}, j = 1, 2, …, N_C, some σ_jmax > θ_S and either of the following two conditions holds:
(a) D̄_j > D̄ and N_j > 2(θ_N + 1);
(b) N_C ≤ K/2;
then split Z_j into two new cluster centers and increase the number of cluster centers N_C by 1; after the splitting operation is finished, jump back to step (4) and increase the iteration count I by 1; otherwise do not operate on cluster center Z_j and proceed to step (13);
(13) calculate the pairwise distances between the N_C cluster centers: D_ij = ||Z_i - Z_j||, i = 1, 2, …, N_C - 1, j = i + 1, …, N_C;
(14) if the distance D_ij between the two nearest cluster centers satisfies D_ij < θ_C, merge the two cluster centers into one new cluster center, merge their categories into one category, and decrease the number of cluster centers N_C by 1; otherwise do nothing;
(15) if the iteration count I ≥ I_max, end the clustering operation and go to step (16); otherwise return to step (4) and increase the iteration count I by 1;
(16) for the N_C cluster centers Z_j = (w_j, h_j), calculate the upper-limit area A_j^up of each cluster; sort the N_C values A_j^up in ascending order to obtain the maximum upper-limit area A_max^up;
(17) determine the number of output layers L_out according to A_max^up:
if A_max^up > (300/3)², the number of output layers is L_out = 6;
if (300/5)² < A_max^up ≤ (300/3)², the number of output layers is L_out = 5;
if (300/10)² < A_max^up ≤ (300/5)², the number of output layers is L_out = 4;
if (300/19)² < A_max^up ≤ (300/10)², the number of output layers is L_out = 3;
if A_max^up ≤ (300/19)², the number of output layers is L_out = 2;
(18) the SSD algorithm has 6 output layers: conv4_3, fc7, conv8_2, conv9_2, conv10_2, conv11_2;
if L_out = 2, keep only the conv4_3 and fc7 layers and delete the convolutional layers after fc7;
if L_out = 3, keep only conv4_3, fc7, and conv8_2 and delete the convolutional layers after conv8_2;
if L_out = 4, keep only conv4_3, fc7, conv8_2, and conv9_2 and delete the convolutional layers after conv9_2;
if L_out = 5, keep only conv4_3, fc7, conv8_2, conv9_2, and conv10_2 and delete the convolutional layers after conv10_2;
if L_out = 6, do not prune the SSD network;
(19) determine, for each cluster center, the output layer where its template frame is placed:
for the N_C cluster centers Z_j = (w_j, h_j), calculate the area A_j = w_j × h_j;
if A_j > (300/3)², design the corresponding template box at the conv11_2 layer;
if (300/5)² < A_j ≤ (300/3)², design the corresponding template box at the conv10_2 layer;
if (300/10)² < A_j ≤ (300/5)², design the corresponding template box at the conv9_2 layer;
if (300/19)² < A_j ≤ (300/10)², design the corresponding template box at the conv8_2 layer;
if (300/38)² < A_j ≤ (300/19)², design the corresponding template box at the fc7 layer;
if A_j ≤ (300/38)², design the corresponding template box at the conv4_3 layer;
(20) determine the template frame size corresponding to each cluster center:
the template frame sizes corresponding to the N_C cluster centers Z_j = (w_j, h_j) are respectively:
min_size = sqrt(w_j × h_j)
max_size = max(w_j, h_j)
aspect_ratio = floor(max(w_j, h_j) / min(w_j, h_j))
where floor() rounds down;
(21) if, after template frames have been designed for all cluster centers, some output layer has no template frame, design one by the following criteria:
an output layer with no template frame adopts the min_size, max_size, and aspect_ratio parameters of the layer nearest to it; if two output layers are at the same distance from that layer, with the shallower denoted Layer_B and the deeper Layer_T, the layer's min_size and max_size are taken as the means of the corresponding Layer_B and Layer_T values, and its aspect_ratio is the union of aspect_ratio_B of Layer_B and aspect_ratio_T of Layer_T;
(22) add a template frame with proportion aspect_ratio = 1 to all output layers;
(23) train the convolutional neural network after output-layer clipping and template-frame size determination to obtain a neural network model.
2. The method for neural network output layer clipping and template frame size determination based on self-organizing clustering as claimed in claim 1, characterized in that: in step (1), the specific method for extracting the target width w and height h from the labeling information is: read the values <xmin>, <ymin>, <xmax>, <ymax> in each <bndbox> node of the xml file, and calculate the target's width w = xmax - xmin + 1 and height h = ymax - ymin + 1.
3. The method for neural network output layer clipping and template frame size determination based on self-organizing clustering as claimed in claim 1, characterized in that: in step (3), the specific method for randomly selecting K samples is: generate K random numbers α_1, α_2, …, α_K uniformly distributed on U(0, 1), and take the ceil(α_i·N)-th sample as the i-th initial cluster center, where ceil() rounds up.
4. The method for neural network output layer clipping and template frame size determination based on self-organizing clustering as claimed in claim 1, characterized in that: in step (4), the distance between sample x and cluster center Z_j is calculated as D_j = ||x - Z_j||; the classification method for sample x is: if D_j = min{D_i, i = 1, 2, …, N_C}, then sample x is assigned to category S_j.
5. The method for neural network output layer clipping and template frame size determination based on self-organizing clustering as claimed in claim 1, characterized in that: in step (5), the specific method for canceling a category is: cancel its cluster center, so that the number of cluster centers N_C decreases by 1; release the samples originally belonging to that category, calculate the distance between each released sample and the remaining cluster centers, and assign each released sample to the category whose cluster center is nearest.
6. The method for neural network output layer clipping and template box size determination based on self-organizing clustering as claimed in claim 1, wherein: in the step (6), the cluster center Zj of each class is updated as the mean of the Nj samples in that class:
Zj = (1/Nj)·Σx∈Sj x, j = 1, 2, …, NC;
in the step (7), the average distance D̄j from the samples in each class Sj to its cluster center is computed as
D̄j = (1/Nj)·Σx∈Sj ||x - Zj||;
in the step (8), the total average distance D̄ from all samples to their corresponding cluster centers is computed as
D̄ = (1/N)·Σj=1..NC Nj·D̄j,
where N is the total number of samples.
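A sketch of the updates in steps (6)-(8), assuming no class is empty (claim 5 removes under-populated classes beforehand):

```python
import numpy as np

def update_centers_and_distances(samples, labels, centers):
    """Recompute each center as its class mean (step 6), the per-class
    average distance to the center (step 7), and the overall average
    distance (step 8)."""
    n_c = len(centers)
    new_centers = np.array([samples[labels == j].mean(axis=0)
                            for j in range(n_c)])
    avg_d = np.array([np.linalg.norm(samples[labels == j] - new_centers[j],
                                     axis=1).mean()
                      for j in range(n_c)])
    counts = np.bincount(labels, minlength=n_c)
    total_avg = (counts * avg_d).sum() / len(samples)
    return new_centers, avg_d, total_avg
```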
7. The method for neural network output layer clipping and template box size determination based on self-organizing clustering as claimed in claim 1, wherein: in the step (10), the standard deviation vector σj = (σw,j, σh,j) of the samples x = (xw, xh) in each class Sj with respect to the cluster center Zj = (wj, hj) is computed componentwise as
σw,j = sqrt((1/Nj)·Σx∈Sj (xw - wj)²), σh,j = sqrt((1/Nj)·Σx∈Sj (xh - hj)²).
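The same quantity sketched in NumPy (a per-component standard deviation about the class center):

```python
import numpy as np

def class_std(samples, labels, centers, j):
    """Standard deviation vector sigma_j = (sigma_w_j, sigma_h_j) of
    class j about its cluster center (claim 7)."""
    members = samples[labels == j]
    return np.sqrt(((members - centers[j]) ** 2).mean(axis=0))
```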
8. The method for neural network output layer clipping and template box size determination based on self-organizing clustering as claimed in claim 1, wherein: in the step (11), the maximum component σjmax of each standard deviation vector σj = (σw,j, σh,j) is extracted as
σjmax = max(σw,j, σh,j).
9. The method for neural network output layer clipping and template box size determination based on self-organizing clustering as claimed in claim 1, wherein: in the step (12), the cluster center Zj is split into two new cluster centers as follows: let σj = (σw,j, σh,j) be the standard deviation of the samples x in class Sj with respect to Zj; if σw,j ≥ σh,j, let γ = (σw,j, 0); if σw,j < σh,j, let γ = (0, σh,j); the two new cluster centers obtained by splitting Zj are Zj + kγ and Zj - kγ, where 0 < k < 1.
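A sketch of the split along the axis of largest spread; the value of k is an illustrative choice within the claimed range (0, 1):

```python
import numpy as np

def split_center(center, sigma, k=0.5):
    """Split a cluster center into two (claim 9): offset by +/- k * gamma,
    where gamma keeps only the larger component of sigma."""
    gamma = np.zeros_like(center, dtype=float)
    axis = 0 if sigma[0] >= sigma[1] else 1
    gamma[axis] = sigma[axis]
    return center + k * gamma, center - k * gamma
```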
10. The method for neural network output layer clipping and template box size determination based on self-organizing clustering as claimed in claim 1, wherein: in the step (14), the cluster centers are merged as follows: if the distance Dij between the two nearest cluster centers Zi and Zj is smaller than the merging distance threshold, the two corresponding classes are merged into one, both cluster centers lose their cluster-center status, a new cluster center is recomputed from the samples released by the two classes as
Z = (Ni·Zi + Nj·Zj)/(Ni + Nj),
and the number of cluster centers NC decreases by 1;
in the step (16), the maximum area Smax is obtained as
Smax = max{Sj, j = 1, 2, …, NC},
wherein Sj = wj·hj is the area of the cluster center Zj = (wj, hj).
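A sketch of the merge in step (14); the class-size-weighted mean of the two centers equals the mean of the pooled samples released by the two classes:

```python
import numpy as np

def merge_centers(centers, counts, i, j):
    """Merge cluster centers i and j (claim 10): the new center is the
    class-size-weighted mean, and N_C decreases by 1."""
    merged = (counts[i] * centers[i] + counts[j] * centers[j]) \
             / (counts[i] + counts[j])
    keep = [m for m in range(len(centers)) if m not in (i, j)]
    new_centers = np.vstack([centers[keep], merged])
    new_counts = np.append(counts[keep], counts[i] + counts[j])
    return new_centers, new_counts
```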
CN202010265447.7A 2020-04-07 2020-04-07 Neural network output layer cutting and template frame size determining method based on self-organizing clustering Active CN111524098B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010265447.7A CN111524098B (en) 2020-04-07 2020-04-07 Neural network output layer cutting and template frame size determining method based on self-organizing clustering


Publications (2)

Publication Number Publication Date
CN111524098A true CN111524098A (en) 2020-08-11
CN111524098B CN111524098B (en) 2023-05-12

Family

ID=71901605

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010265447.7A Active CN111524098B (en) 2020-04-07 2020-04-07 Neural network output layer cutting and template frame size determining method based on self-organizing clustering

Country Status (1)

Country Link
CN (1) CN111524098B (en)



Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170161606A1 (en) * 2015-12-06 2017-06-08 Beijing University Of Technology Clustering method based on iterations of neural networks
CN108898154A (en) * 2018-09-29 2018-11-27 华北电力大学 A kind of electric load SOM-FCM Hierarchical clustering methods

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU Xuan; WEN Jun; LIU Tianqi: "Fuzzy clustering coherent generator group identification based on self-organizing neural networks" *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113095418A (en) * 2021-04-19 2021-07-09 航天新气象科技有限公司 Target detection method and system
CN113095418B (en) * 2021-04-19 2022-02-18 航天新气象科技有限公司 Target detection method and system



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant