CN111178405A - Similar object identification method fusing multiple neural networks - Google Patents
Similar object identification method fusing multiple neural networks
- Publication number
- CN111178405A CN201911310303.2A
- Authority
- CN
- China
- Prior art keywords
- training
- image
- neural network
- setting
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000013528 artificial neural network Methods 0.000 title claims abstract description 69
- 238000000034 method Methods 0.000 title claims abstract description 29
- 238000012549 training Methods 0.000 claims abstract description 73
- 238000013527 convolutional neural network Methods 0.000 claims abstract description 34
- 230000007246 mechanism Effects 0.000 claims abstract description 12
- 238000012795 verification Methods 0.000 claims abstract description 10
- 238000002372 labelling Methods 0.000 claims abstract description 7
- 238000007781 pre-processing Methods 0.000 claims abstract description 5
- 238000011478 gradient descent method Methods 0.000 claims description 12
- 238000012545 processing Methods 0.000 claims description 11
- 238000010586 diagram Methods 0.000 claims description 5
- 238000012216 screening Methods 0.000 claims description 5
- 230000009466 transformation Effects 0.000 claims description 4
- 238000013519 translation Methods 0.000 claims description 4
- 230000002238 attenuated effect Effects 0.000 claims description 3
- 238000012217 deletion Methods 0.000 claims description 3
- 230000037430 deletion Effects 0.000 claims description 3
- 238000003062 neural network model Methods 0.000 claims description 3
- 230000008569 process Effects 0.000 claims description 3
- 230000001360 synchronised effect Effects 0.000 claims description 3
- 230000006870 function Effects 0.000 description 7
- 238000001514 detection method Methods 0.000 description 6
- 238000000605 extraction Methods 0.000 description 3
- 238000001914 filtration Methods 0.000 description 3
- 238000012937 correction Methods 0.000 description 2
- 230000007547 defect Effects 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 238000007405 data analysis Methods 0.000 description 1
- 238000003708 edge detection Methods 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000009432 framing Methods 0.000 description 1
- 238000005286 illumination Methods 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 230000007306 turnover Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/751—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a similar object identification method fusing multiple neural networks. The method acquires a plurality of image sets containing similar objects through a camera, preprocesses and labels the image sets, and expands the processed data sets with a data enhancement method. The training sample sets are then trained with Cascade R-CNN, Grid R-CNN, Libra R-CNN and Retina-Net respectively; the training results of the four networks are integrated, and a multi-network voting mechanism is set according to the verification accuracy to obtain the final integrated recognition network. A real-time image of the object to be recognized is imported through the camera, the object is recognized by the integrated neural network, and the recognition result for the object in the image is finally output. The invention realizes identification of similar objects by integrating the recognition results of multiple neural networks, and can identify similar objects in a short time with high accuracy.
Description
Technical Field
The invention relates to the technical field of electric digital data processing, in particular to a similar object identification method fusing multiple neural networks for image data analysis and processing.
Background
In recent years, technological innovation in the field of image recognition has been developing continuously, and many industries are actively transforming and upgrading, using image recognition technology to reduce costs and improve efficiency. Image recognition has become an important research direction in areas such as warehouse storage and inventory and the recognition of same-type articles in supermarkets; using a computer to efficiently distinguish similar objects helps reduce the labor intensity of workers, improve working efficiency and reduce up-front costs.
In the prior art, besides discrimination of objects by humans, there are techniques that process images by computer and distinguish similar objects in an image. Currently, target identification is mainly performed as follows:
(1) preprocessing a template image and an image of the object to be detected to generate a feature library; detecting object edges in the image with operators such as the Sobel, Roberts, Prewitt and Kirsch operators; removing stray edges, burrs and the like from the edge image to preserve edge connectivity; and measuring the similarity of the object to be detected with the image-contour feature-matching cost for discrimination and identification;
(2) identifying and distinguishing similar pictures with a combination of Python and OpenCV: the images are processed with algorithms such as average hash and perceptual hash, the Hamming distance is computed, and the magnitude of the Hamming distance represents the degree of similarity between the objects (a brief sketch follows).
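As an illustration of the hash-based approach in (2), the following is a minimal sketch using OpenCV and NumPy; the hash size, file names and similarity threshold are illustrative assumptions rather than values taken from the prior art described above.

```python
# Minimal average-hash comparison sketch (illustrative, not from the patent).
import cv2
import numpy as np

def average_hash(image_path: str, hash_size: int = 8) -> np.ndarray:
    """Shrink to hash_size x hash_size, grayscale, threshold at the mean."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)  # path is illustrative
    small = cv2.resize(img, (hash_size, hash_size), interpolation=cv2.INTER_AREA)
    return (small > small.mean()).flatten()

def hamming_distance(h1: np.ndarray, h2: np.ndarray) -> int:
    """Number of differing bits; a smaller distance means more similar images."""
    return int(np.count_nonzero(h1 != h2))

d = hamming_distance(average_hash("a.jpg"), average_hash("b.jpg"))
print(f"Hamming distance: {d}")  # below ~10 of 64 bits is often treated as similar
```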
However, in the above technical solutions, edge detection takes a long time and the recognition efficiency of the Python-and-OpenCV approach is not high; both methods have certain limitations.
Disclosure of Invention
In view of the above prior art and background, the invention addresses the following problems: the traditional method of manually distinguishing similar objects takes a long time and consumes labor, hindering the competitiveness of an enterprise; and general computer neural-network recognition algorithms take a long time to recognize similar objects, their detection rate cannot meet practical requirements, and they place strict demands on the clarity of the acquired images.
The invention adopts the technical scheme that a similar object identification method fusing multiple neural networks comprises the following steps:
Step 1: acquiring a plurality of image sets containing similar objects through a camera;
Step 2: preprocessing and labeling the image sets;
Step 3: expanding the processed data set with a data enhancement method to obtain a training sample set meeting the number of samples required for neural network training;
Step 4: training the training sample set with Cascade R-CNN, Grid R-CNN, Libra R-CNN and Retina-Net respectively;
Step 5: integrating the results of the four network trainings and setting a multi-network voting mechanism that outputs results according to verification accuracy, obtaining the final integrated neural network;
Step 6: importing the image of the object to be recognized in real time through the camera, recognizing the object through the integrated neural network, and finally outputting the recognition result of the object to be recognized in the image.
Further, in the step 1, the similar objects are objects of the same type and similar shapes.
Further, in step 2, the image processing specifically includes the following steps:
Step 2.1: manually screening and checking all images in the image set, and removing images whose object occlusion rate exceeds a preset value;
Step 2.2: performing bounding-box annotation on the screened image set, and adding the specific category and name of each object in the image to the annotation file.
Further, in step 3, the data enhancement includes the following steps:
Step 3.1: for each image in the processed image set, selecting one or more of random scaling, translation, rotation, flipping, pixel deletion, contrast adjustment and affine transformation, so that the total number of images is expanded to 16 times with no duplicate images;
Step 3.2: calculating the bounding box of each object after data enhancement and storing it in the annotation file, realizing synchronous expansion of the picture set and the bounding boxes and completing the data enhancement;
Step 3.3: obtaining, after data enhancement, a training sample set meeting the number of samples required for neural network training.
Further, in step 4, initializing the neural network training models includes the following steps:
Step 4.1: initializing the Cascade R-CNN neural network training model: training a plurality of cascaded detectors; using stochastic gradient descent to set the learning rate, momentum and weight decay rate; setting a linear decay that attenuates the learning rate once every A steps; and setting the total number of training rounds, each round training on all the data;
Step 4.2: initializing the Grid R-CNN neural network training model: using stochastic gradient descent to set the learning rate, momentum and weight decay rate; setting a linear decay that attenuates the learning rate once every B steps; and setting the total number of training rounds, each round training on all the data;
Step 4.3: initializing the Libra R-CNN neural network training model: using stochastic gradient descent to set the learning rate, momentum and weight decay rate; setting a linear decay that attenuates the learning rate once every C steps; and setting the total number of training rounds, each round training on all the data;
Step 4.4: initializing the Retina-Net neural network training model: using stochastic gradient descent to set the learning rate, momentum and weight decay rate; setting a linear decay that attenuates the learning rate once every D steps; and setting the total number of training rounds, each round training on all the data.
Further, in step 5, a weight is assigned to each network according to the output accuracy of the four neural network models of step 4 on the verification set, and a multi-network voting mechanism is established on the basis of these weights; the voting mechanism combines the recognition result of each network with its proportional weight to output the final result, realizing an integrated neural network that fuses the recognition results of the various networks.
Further, in step 6, the integrated neural network can automatically process images transmitted by the camera in real time, with a processing speed of 0.2 s/frame.
The invention provides a similar object identification method fusing multiple neural networks, which combines the advantages of single networks into a multi-network ensemble. A large number of image sets containing similar objects are acquired through a camera and screened; after the data are labeled, the data set is enhanced and expanded to obtain a training sample set meeting the requirements of the neural networks. The target data set is trained with Cascade R-CNN, Grid R-CNN, Libra R-CNN and Retina-Net respectively, with the learning rate, momentum and weight decay rate set by the stochastic gradient descent optimization method; corresponding weights are determined from the verification set, and finally a voting mechanism integrates the training results of the four networks into a final multi-network recognition model. This network combines the advantages of each single network and achieves a higher recognition rate. Finally, a real-time image of the object to be recognized in the environment is imported through a camera, the object is recognized and identified by the integrated multi-neural-network recognition model, and the recognition result for the object in the image is output.
The invention integrates multiple neural networks and combines the recognition advantages of each single network: objects can be distinguished in a short time with a high recognition rate, and similar objects in real-time images can be identified efficiently and accurately. It overcomes the low efficiency and high cost of manual identification in the prior art, and at the same time makes up for the low recognition speed and accuracy of existing single networks. It can meet the requirements of practical use and can be widely applied in the future to warehouse article counting, supermarket article counting and other areas; mounted on a robot's camera, it can better realize unmanned operation and improve operating efficiency.
Drawings
FIG. 1 is a flow chart of similar object recognition with multiple neural networks fused.
Fig. 2 is a diagram of the recognition effect displayed by the built test bed.
Detailed Description
The present invention is described in further detail with reference to the following examples, but the scope of the present invention is not limited thereto.
The invention relates to a similar object identification method fusing multiple neural networks, which comprises the following steps.
Step 1: acquiring a plurality of image sets with similar objects through a camera;
in the step 1, the similar objects are the same type and similar in shape.
In the present invention, similar objects, such as mineral water of different brands, are generally objects of the same type and similar shapes.
In the invention, the image set is shot manually to obtain representative data sets with different scales, angles, placing modes and illumination conditions, and the representative data sets can be selected by the technicians in the field.
Step 2: preprocessing and labeling the image set;
In step 2, the image processing specifically includes the following steps:
Step 2.1: manually screening and checking all images in the image set, and removing images whose object occlusion rate exceeds a preset value;
Step 2.2: performing bounding-box annotation on the screened image set, and adding the specific category and name of each object in the image to the annotation file.
In the present invention, the preset value in step 2.1 may be set based on the features of the image set to be recognized and trained, for example, 70%.
In the invention, the annotation in step 2.2 may be done manually, or by computer identification with automatic box selection.
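The patent does not prescribe a particular annotation file format; as a hedged sketch, a COCO-style JSON entry is one common way to store the bounding box together with the specific category and name required by step 2.2. All field names and values below are illustrative assumptions.

```python
# Illustrative annotation entry (format and field names are assumptions).
import json

annotation = {
    "image": "shelf_001.jpg",                # illustrative file name
    "objects": [
        {"bbox": [120, 45, 310, 480],        # [x_min, y_min, x_max, y_max]
         "category": "bottled_water",        # specific category of the object
         "name": "brand_A_550ml"}            # specific name, per step 2.2
    ],
}
with open("shelf_001.json", "w") as f:
    json.dump(annotation, f, indent=2)
```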
Step 3: expanding the processed data set with a data enhancement method to obtain a training sample set meeting the number of samples required for neural network training;
in step 3, the data enhancement includes the following steps:
Step 3.1: for each image in the processed image set, selecting one or more of random scaling, translation, rotation, flipping, pixel deletion, contrast adjustment and affine transformation, so that the total number of images is expanded to 16 times with no duplicate images;
Step 3.2: calculating the bounding box of each object after data enhancement and storing it in the annotation file, realizing synchronous expansion of the picture set and the bounding boxes and completing the data enhancement;
Step 3.3: obtaining, after data enhancement, a training sample set meeting the number of samples required for neural network training.
In the invention, an embodiment of the data expansion in step 3.1 is given: the flip probability is set to 0.5, the contrast adjustment to 1.5, the rotation angle to between -45 and 45 degrees, the Gaussian filtering to 0-3.0, the mean filtering to 2-7, the median filtering to 3-11, the pixel-deletion ratio to 0.01-0.1, the color-channel swap probability to 0.05, the brightness adjustment to +/- 10 pixel values, and the translation distance to 10% of the image width.
In the invention, in step 3.2, because the enhancement operations follow mathematical rules, the bounding box or polygonal mask of each enhanced object can be calculated through those rules and stored in the annotation file.
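As a hedged sketch of steps 3.1 and 3.2, the imgaug library can apply the example parameters of the embodiment above while transforming the bounding boxes synchronously with the images; the patent does not name a library, so the API choice and file names are assumptions.

```python
# Augmentation sketch using imgaug with the example parameters given above.
import cv2
import imgaug.augmenters as iaa
from imgaug.augmentables.bbs import BoundingBox, BoundingBoxesOnImage

seq = iaa.Sequential([
    iaa.Fliplr(0.5),                                   # flip probability 0.5
    iaa.LinearContrast(1.5),                           # contrast adjusted to 1.5
    iaa.Affine(rotate=(-45, 45),                       # rotation in [-45, 45] degrees
               translate_percent={"x": (-0.1, 0.1)}),  # translation up to 10% of width
    iaa.GaussianBlur(sigma=(0.0, 3.0)),                # Gaussian filtering 0-3.0
    iaa.AverageBlur(k=(2, 7)),                         # mean filtering kernel 2-7
    iaa.MedianBlur(k=(3, 11)),                         # median filtering kernel 3-11
    iaa.Dropout(p=(0.01, 0.1)),                        # pixel-deletion ratio 0.01-0.1
    iaa.ChannelShuffle(0.05),                          # color-channel swap probability 0.05
    iaa.Add((-10, 10)),                                # brightness +/- 10 pixel values
], random_order=True)

image = cv2.imread("sample.jpg")                       # illustrative file name
boxes = BoundingBoxesOnImage(
    [BoundingBox(x1=120, y1=45, x2=310, y2=480)], shape=image.shape)
# Boxes are transformed together with the image, so the annotation file can
# be expanded synchronously with the picture set (step 3.2).
aug_image, aug_boxes = seq(image=image, bounding_boxes=boxes)
```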
Step 4: training the training sample set with Cascade R-CNN, Grid R-CNN, Libra R-CNN and Retina-Net respectively;
In step 4, the four separate neural network training models are initialized as follows:
Step 4.1: initializing the Cascade R-CNN neural network training model: training a plurality of cascaded detectors; using stochastic gradient descent to set the learning rate, momentum and weight decay rate; setting a linear decay that attenuates the learning rate once every A steps; and setting the total number of training rounds, each round training on all the data;
Step 4.2: initializing the Grid R-CNN neural network training model: using stochastic gradient descent to set the learning rate, momentum and weight decay rate; setting a linear decay that attenuates the learning rate once every B steps; and setting the total number of training rounds, each round training on all the data;
Step 4.3: initializing the Libra R-CNN neural network training model: using stochastic gradient descent to set the learning rate, momentum and weight decay rate; setting a linear decay that attenuates the learning rate once every C steps; and setting the total number of training rounds, each round training on all the data;
Step 4.4: initializing the Retina-Net neural network training model: using stochastic gradient descent to set the learning rate, momentum and weight decay rate; setting a linear decay that attenuates the learning rate once every D steps; and setting the total number of training rounds, each round training on all the data.
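A minimal PyTorch sketch of the shared optimizer setup in steps 4.1 to 4.4 follows; the concrete learning rate, momentum, weight decay, step size (A, B, C or D) and round count are placeholders, since the patent leaves these values unspecified.

```python
# Optimizer and step decay shared by the four detectors (values are placeholders).
import torch

def make_optimizer(model: torch.nn.Module, step_size: int):
    # Stochastic gradient descent with momentum and a weight decay rate.
    optimizer = torch.optim.SGD(model.parameters(),
                                lr=0.02, momentum=0.9, weight_decay=1e-4)
    # Attenuate the learning rate once every `step_size` steps
    # (A, B, C or D, depending on the network).
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                                step_size=step_size, gamma=0.1)
    return optimizer, scheduler

# Each round (epoch) trains on all of the data, for a set total of rounds:
# for epoch in range(total_rounds):
#     for images, targets in train_loader:
#         loss = compute_loss(model, images, targets)
#         optimizer.zero_grad(); loss.backward()
#         optimizer.step(); scheduler.step()
```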
In the invention, the Cascade R-CNN consists of a residual convolutional neural network of depth 50 and a feature pyramid network of depth 4; the residual network has 4 stages, and the channel numbers of the pyramid levels are 256, 512, 1024 and 2048. The region proposal network of the residual convolutional neural network has 256 input channels and 256 output channels, the window-frame aspect ratios are 0.5, 1.0 and 2.0, and the window strides on the feature layers are 4, 8, 16, 32 and 64; the classification loss uses cross entropy with the network's Sigmoid function for classification, the regression loss uses the SmoothL1Loss function, and both losses have weight 1. The target-detection extraction layer has output size 7 and sampling number 2, with feature-map strides of 4, 8, 16 and 32; the IoU thresholds of the cascaded target-detection layers are 0.5, 0.6 and 0.7, and the final connected layer has 256 input channels and 1024 output channels. Training a plurality of cascaded detectors addresses the problem of noise interference in the detection boxes.
In the invention, Grid R-CNN replaces the conventional regression-based approach to target-position correction with accurate correction of the target localization box through a fully convolutional network. It consists of a residual convolutional neural network of depth 50 and a feature pyramid network of depth 4; the residual network has 4 stages, and the channel numbers of the pyramid levels are 256, 512, 1024 and 2048. The region proposal network has 256 input channels and 256 output channels, the window-frame aspect ratios are 0.5, 1.0 and 2.0, and the window strides on the feature layers are 4, 8, 16, 32 and 64; the classification loss uses cross entropy with the network's Sigmoid function for classification, the regression loss uses the SmoothL1Loss function, and both losses have weight 1. The target-detection extraction layer of the network uses 9 grid points with 256 input channels and a cross-entropy loss, and the final connected layer has 256 input channels and 1024 output channels.
In the invention, Libra R-CNN consists of a residual convolutional neural network of depth 50 and a feature pyramid network of depth 4; the residual network has 4 stages, and the channel numbers of the pyramid levels are 256, 512, 1024 and 2048. The region proposal network has 256 input channels and 256 output channels, the window-frame aspect ratios are 0.5, 1.0 and 2.0, and the window strides on the feature layers are 4, 8, 16, 32 and 64; the classification loss uses cross entropy with the network's Sigmoid function for classification, the regression loss uses BalancedL1Loss, and both losses have weight 1. The target-detection extraction layer has output size 7 and sampling number 2, with feature-map strides of 4, 8, 16 and 32, and the final connected layer has 256 input channels and 1024 output channels.
In the invention, Retina-Net consists of a residual convolutional neural network of depth 50 and a feature pyramid network of depth 4; the residual network has 4 stages, and the channel numbers of the pyramid levels are 256, 512, 1024 and 2048. The region proposal network has 256 input channels and 256 output channels, the window-frame aspect ratios are 0.5, 1.0 and 2.0, and the window strides on the feature layers are 4, 8, 16, 32 and 64; the classification loss uses cross entropy with the network's Sigmoid function for classification, the regression loss uses the SmoothL1Loss function, and both losses have weight 1.
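The layer sizes listed in the four paragraphs above (ResNet-50 backbone, pyramid channels 256/512/1024/2048 reduced to 256, anchor aspect ratios 0.5/1.0/2.0, strides 4/8/16/32/64, RoI output size 7 with sampling number 2, cascade IoU thresholds 0.5/0.6/0.7) closely match MMDetection-style detector configurations; a condensed excerpt for the Cascade R-CNN case is sketched below. The framework mapping itself is an assumption, not something the patent states.

```python
# Condensed MMDetection-style config sketch (assumed mapping of the values above).
model = dict(
    type='CascadeRCNN',
    backbone=dict(type='ResNet', depth=50, num_stages=4),  # residual network, depth 50
    neck=dict(type='FPN',
              in_channels=[256, 512, 1024, 2048],          # per-stage channel counts
              out_channels=256, num_outs=5),
    rpn_head=dict(type='RPNHead', in_channels=256, feat_channels=256,
                  anchor_generator=dict(type='AnchorGenerator',
                                        ratios=[0.5, 1.0, 2.0],       # window aspect ratios
                                        strides=[4, 8, 16, 32, 64]),  # per-level strides
                  loss_cls=dict(type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
                  loss_bbox=dict(type='SmoothL1Loss', loss_weight=1.0)),
    roi_head=dict(type='CascadeRoIHead', num_stages=3,
                  bbox_roi_extractor=dict(
                      roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=2),
                      featmap_strides=[4, 8, 16, 32])),
)
# Cascaded stages with rising IoU thresholds 0.5 / 0.6 / 0.7:
train_cfg = dict(rcnn=[dict(assigner=dict(pos_iou_thr=t)) for t in (0.5, 0.6, 0.7)])
```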
Step 5: integrating the results of the four network trainings and setting a multi-network voting mechanism that outputs results according to verification accuracy, obtaining the final integrated neural network;
In step 5, a weight is assigned to each network according to the output accuracy of the four neural network models of step 4 on the verification set, and a multi-network voting mechanism is established on the basis of these weights; the voting mechanism combines the recognition result of each network with its proportional weight to output the final result, realizing an integrated neural network that fuses the recognition results of the various networks.
In the invention, the verification set corresponds to the training sample set.
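A minimal sketch of the multi-network voting mechanism of step 5 follows, assuming each detector outputs one label per detected object and that each network's weight is its accuracy on the verification set; all names and numbers below are illustrative.

```python
# Weighted voting across the four detectors (weights and labels are illustrative).
from collections import defaultdict

def ensemble_vote(predictions: dict, weights: dict) -> str:
    """Return the label with the largest weighted vote.

    predictions maps network name -> predicted label;
    weights maps network name -> verification-set accuracy."""
    tally = defaultdict(float)
    for net, label in predictions.items():
        tally[label] += weights[net]
    return max(tally, key=tally.get)

weights = {"cascade_rcnn": 0.97, "grid_rcnn": 0.95,   # assumed verification accuracies
           "libra_rcnn": 0.96, "retinanet": 0.93}
preds = {"cascade_rcnn": "brand_A", "grid_rcnn": "brand_A",
         "libra_rcnn": "brand_B", "retinanet": "brand_A"}
print(ensemble_vote(preds, weights))  # -> brand_A
```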
Step 6: and (3) leading in the image of the object to be recognized in real time through the camera so as to recognize the object to be recognized through the integrated neural network, and finally outputting the recognition result of the object to be recognized in the image.
In step 6, the integrated neural network can automatically process the images transmitted by the camera in real time, with a processing speed of 0.2 s/frame.
In the invention, the identification accuracy rate reaches more than 98%.
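As a hedged sketch of step 6, the loop below reads camera frames with OpenCV and hands them to the integrated network; `ensemble_predict` is a hypothetical stand-in for the fused four-network model, not an interface defined by the patent.

```python
# Real-time recognition loop sketch (camera index and model call are assumptions).
import cv2

def ensemble_predict(frame):
    """Hypothetical fused four-network detector; returns
    a list of ((x1, y1, x2, y2), label) tuples."""
    return []

cap = cv2.VideoCapture(0)  # real-time camera stream
try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        for (x1, y1, x2, y2), label in ensemble_predict(frame):
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(frame, label, (x1, y1 - 5),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
        cv2.imshow("recognition", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
finally:
    cap.release()
    cv2.destroyAllWindows()
```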
The invention performs real-time image recognition by integrating the training results of four neural network recognition models, combining the advantages of single networks into a multi-network ensemble. A large number of image sets containing similar objects are acquired through a camera and screened; after the data are labeled, the data set is enhanced and expanded to obtain a training sample set meeting the requirements of the neural networks. The target data set is trained with Cascade R-CNN, Grid R-CNN, Libra R-CNN and Retina-Net respectively, with the learning rate, momentum and weight decay rate set by the stochastic gradient descent optimization method; corresponding weights are determined from the verification set, and finally a voting mechanism integrates the training results of the four networks into a final multi-network recognition model. This network combines the advantages of each single network and achieves a higher recognition rate. Finally, a real-time image of the object to be recognized in the environment is imported through a camera, the object is recognized and identified by the integrated multi-neural-network recognition model, and the recognition result for the object in the image is output.
The invention integrates multiple neural networks and combines the recognition advantages of each single network: objects can be distinguished in a short time with a high recognition rate, and similar objects in real-time images can be identified efficiently and accurately. It overcomes the low efficiency and high cost of manual identification in the prior art, and at the same time makes up for the low recognition speed and accuracy of existing single networks. It can meet the requirements of practical use and can be widely applied in the future to warehouse article counting, supermarket article counting and other areas; mounted on a robot's camera, it can better realize unmanned operation and improve operating efficiency.
Claims (7)
1. A similar object identification method fusing multiple neural networks, characterized in that the method comprises the following steps:
step 1: acquiring a plurality of image sets with similar objects through a camera;
step 2: preprocessing and labeling the image set;
step 3: expanding the processed data set with a data enhancement method to obtain a training sample set meeting the number of samples required for neural network training;
step 4: training the training sample set with Cascade R-CNN, Grid R-CNN, Libra R-CNN and Retina-Net respectively;
step 5: integrating the results of the four network trainings and setting a multi-network voting mechanism that outputs results according to verification accuracy, obtaining the final integrated neural network;
step 6: importing the image of the object to be recognized in real time through the camera, recognizing the object through the integrated neural network, and finally outputting the recognition result of the object to be recognized in the image.
2. The method for identifying similar objects fusing multiple neural networks according to claim 1, wherein in step 1, the similar objects are objects of the same type and similar shape.
3. The method for identifying similar objects fusing multiple neural networks according to claim 1, wherein step 2 comprises the following steps:
step 2.1: manually screening and checking all images in the image set, and removing images whose object occlusion rate exceeds a preset value;
step 2.2: performing bounding-box annotation on the screened image set, and adding the specific category and name of each object in the image to the annotation file.
4. The method for identifying similar objects fusing multiple neural networks according to claim 1, wherein step 3 comprises the following steps:
step 3.1: for each image in the processed image set, selecting one or more of random scaling, translation, rotation, flipping, pixel deletion, contrast adjustment and affine transformation, so that the total number of images is expanded to 16 times with no duplicate images;
step 3.2: calculating the bounding box of each object after data enhancement and storing it in the annotation file, realizing synchronous expansion of the picture set and the bounding boxes and completing the data enhancement;
step 3.3: obtaining, after data enhancement, a training sample set meeting the number of samples required for neural network training.
5. The method for identifying similar objects fusing multiple neural networks according to claim 1, wherein in step 4, initializing the neural network training models includes the following steps:
step 4.1: initializing the Cascade R-CNN neural network training model: training a plurality of cascaded detectors; using stochastic gradient descent to set the learning rate, momentum and weight decay rate; setting a linear decay that attenuates the learning rate once every A steps; and setting the total number of training rounds, each round training on all the data;
step 4.2: initializing the Grid R-CNN neural network training model: using stochastic gradient descent to set the learning rate, momentum and weight decay rate; setting a linear decay that attenuates the learning rate once every B steps; and setting the total number of training rounds, each round training on all the data;
step 4.3: initializing the Libra R-CNN neural network training model: using stochastic gradient descent to set the learning rate, momentum and weight decay rate; setting a linear decay that attenuates the learning rate once every C steps; and setting the total number of training rounds, each round training on all the data;
step 4.4: initializing the Retina-Net neural network training model: using stochastic gradient descent to set the learning rate, momentum and weight decay rate; setting a linear decay that attenuates the learning rate once every D steps; and setting the total number of training rounds, each round training on all the data.
6. The method for identifying similar objects fusing multiple neural networks according to claim 1, wherein in step 5, a weight is assigned to each network according to the output accuracy of the four neural network models of step 4 on the verification set, and a multi-network voting mechanism is established on the basis of these weights; the voting mechanism combines the recognition result of each network with its proportional weight to output the final result, realizing an integrated neural network that fuses the recognition results of the various networks.
7. The method for identifying similar objects fusing multiple neural networks according to claim 1, wherein in step 6, the integrated neural network automatically processes the images transmitted by the camera in real time, with a processing speed of 0.2 s/frame.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911310303.2A CN111178405A (en) | 2019-12-18 | 2019-12-18 | Similar object identification method fusing multiple neural networks |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911310303.2A CN111178405A (en) | 2019-12-18 | 2019-12-18 | Similar object identification method fusing multiple neural networks |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111178405A true CN111178405A (en) | 2020-05-19 |
Family
ID=70655594
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911310303.2A Pending CN111178405A (en) | 2019-12-18 | 2019-12-18 | Similar object identification method fusing multiple neural networks |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111178405A (en) |
- 2019-12-18 CN CN201911310303.2A patent/CN111178405A/en active Pending
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040220837A1 (en) * | 2003-04-30 | 2004-11-04 | Ge Financial Assurance Holdings, Inc. | System and process for a fusion classification for insurance underwriting suitable for use by an automated system |
CN102831413A (en) * | 2012-09-11 | 2012-12-19 | 上海中原电子技术工程有限公司 | Face identification method and face identification system based on fusion of multiple classifiers |
CN106650721A (en) * | 2016-12-28 | 2017-05-10 | 吴晓军 | Industrial character identification method based on convolution neural network |
CN108921047A (en) * | 2018-06-12 | 2018-11-30 | 江西理工大学 | A kind of multi-model ballot mean value action identification method based on cross-layer fusion |
CN110390251A (en) * | 2019-05-15 | 2019-10-29 | 上海海事大学 | A kind of pictograph semantic segmentation method based on the processing of multiple neural network Model Fusion |
CN110163213A (en) * | 2019-05-16 | 2019-08-23 | 西安电子科技大学 | Remote sensing image segmentation method based on disparity map and multiple dimensioned depth network model |
CN110245644A (en) * | 2019-06-22 | 2019-09-17 | 福州大学 | A kind of unmanned plane image transmission tower lodging knowledge method for distinguishing based on deep learning |
Non-Patent Citations (2)
Title |
---|
XIN LU: "Grid R-CNN", arXiv:1811.12030, pages 1-9 *
ZHAOWEI CAI: "Cascade R-CNN: Delving into High Quality Object Detection", arXiv:1712.00726, pages 1-9 *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111652292A (en) * | 2020-05-20 | 2020-09-11 | 贵州电网有限责任公司 | Similar object real-time detection method and system based on NCS and MS |
CN111598885A (en) * | 2020-05-21 | 2020-08-28 | 公安部交通管理科学研究所 | Automatic visibility grade marking method for highway foggy pictures |
WO2023071121A1 (en) * | 2021-10-26 | 2023-05-04 | 苏州浪潮智能科技有限公司 | Multi-model fusion-based object detection method and apparatus, device and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111325713B (en) | Neural network-based wood defect detection method, system and storage medium | |
CN108960245B (en) | Tire mold character detection and recognition method, device, equipment and storage medium | |
CN110148130B (en) | Method and device for detecting part defects | |
CN108898137B (en) | Natural image character recognition method and system based on deep neural network | |
CN111062915B (en) | Real-time steel pipe defect detection method based on improved YOLOv3 model | |
CN109509187B (en) | Efficient inspection algorithm for small defects in large-resolution cloth images | |
CN107230203B (en) | Casting defect identification method based on human eye visual attention mechanism | |
CN110929756B (en) | Steel size and quantity identification method based on deep learning, intelligent equipment and storage medium | |
CN112348787B (en) | Training method of object defect detection model, object defect detection method and device | |
CN109767422A (en) | Pipe detection recognition methods, storage medium and robot based on deep learning | |
CN111539330B (en) | Transformer substation digital display instrument identification method based on double-SVM multi-classifier | |
CN111968098A (en) | Strip steel surface defect detection method, device and equipment | |
CN111178405A (en) | Similar object identification method fusing multiple neural networks | |
CN113393426B (en) | Steel rolling plate surface defect detection method | |
CN111724355A (en) | Image measuring method for abalone body type parameters | |
CN111415339B (en) | Image defect detection method for complex texture industrial product | |
CN111539957A (en) | Image sample generation method, system and detection method for target detection | |
CN111814852A (en) | Image detection method, image detection device, electronic equipment and computer-readable storage medium | |
CN114863189B (en) | Intelligent image identification method based on big data | |
CN111161295A (en) | Background stripping method for dish image | |
CN114863464B (en) | Second-order identification method for PID drawing picture information | |
CN118279304B (en) | Abnormal recognition method, device and medium for special-shaped metal piece based on image processing | |
CN110334775B (en) | Unmanned aerial vehicle line fault identification method and device based on width learning | |
CN109615610B (en) | Medical band-aid flaw detection method based on YOLO v2-tiny | |
CN117808746A (en) | Fruit quality grading method based on image processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200519 ||