CN112418362B - Target detection training sample screening method - Google Patents
- Publication number
- CN112418362B (grant), application CN202110093092.2A
- Authority
- CN
- China
- Prior art keywords
- sample
- labeled
- training
- screening
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a method for screening target-detection training samples. Models from different stages of the training process are used to detect targets in the training-set data, yielding a detection result for each image sample on each stage model M. These per-stage detection results are then screened to identify completely forgotten samples and partially forgotten samples. Because the model, rather than a human, analyzes noisy samples in a very large target-detection dataset, manual labor is saved, the subjective influence of manual data screening is eliminated, and the efficiency and accuracy of target-detection tasks performed with deep-learning methods are improved.
Description
Technical Field
The invention belongs to the technical field of target detection, and particularly relates to a method for screening a target detection training sample.
Background
In recent years, with the continuous development of artificial-intelligence technology, deep learning has made breakthrough progress in computer-vision tasks such as classification, recognition, detection, segmentation, and tracking. Compared with traditional machine-vision methods, a deep convolutional neural network trained on big data learns useful features from large amounts of data and offers advantages in speed, accuracy, and cost. However, a large part of this advantage over conventional methods rests on the availability of large amounts of effective data, particularly in the field of target detection. To provide a sufficient quantity of effective data, the current mainstream practice is data augmentation, and many other sample-generation methods have also appeared. After a sufficient number of samples is obtained, however, some noisy samples inevitably arise that the model can recognize in the early stages of training but cannot recognize in later stages. These are called "forgotten samples", and they have a negative effect on the model-training process.
At present, forgotten samples in a dataset (such as mislabeled samples) generally must be screened out manually. The workload is enormous and the judgment is not representative: some samples that people subjectively consider noise are, from the model's perspective, not noise at all or have no effect on training, so manual screening can harm the target-detection result.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method for screening target-detection training samples that addresses the technical defects described in the background, thereby reducing noise in the samples, raising the proportion of effective samples, and improving the detection efficiency and accuracy of target detection performed with deep-learning methods.
According to one aspect of the present invention, there is provided a target detection training sample screening method, including:
Targets in the training-set data are detected with the models at different stages obtained during training, yielding the detection result of each image sample on each stage model M. Each image sample is then processed as follows:
setting the recall to 1 when every pre-labeled target in the detection result of model M has a recognition box whose class matches the pre-labeled class and whose IOU is greater than or equal to A;
setting the recall to 0 when no pre-labeled target has a recognition box with a matching class and IOU greater than or equal to A;
setting 0 < recall < 1 when only some of the pre-labeled targets have recognition boxes with a matching class and IOU greater than or equal to A;
where the IOU (Intersection over Union) is the intersection-over-union ratio of the recognition-box area in the detection result and the pre-labeled box area, and A is a constant between 0 and 1 set according to an empirical value.
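The per-image recall rules above can be sketched in code. This is an illustrative sketch, not reference code from the patent; the box format `(x1, y1, x2, y2)` and the helper names are assumptions for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def image_recall(labels, detections, a=0.5):
    """Fraction of pre-labeled targets matched by a detection of the same
    class with IoU >= a. labels/detections: lists of (cls, box) pairs."""
    if not labels:
        return 1.0
    matched = sum(
        1 for cls, box in labels
        if any(d_cls == cls and iou(box, d_box) >= a
               for d_cls, d_box in detections)
    )
    return matched / len(labels)
```

A sample with two labeled targets of which the model re-detects only one would get recall 0.5, placing it in the 0 < recall < 1 case above.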
Image samples with recall > 0 on M1–Mm and recall = 0 on Mm+1–Mn are screened out as completely forgotten samples.
Image samples with recall = 1 on M1–Mm and 0 < recall < 1 on Mm+1–Mn are screened out as partially forgotten samples.
Here m < n, n is the number of model stages, and n is a natural number greater than 1.
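The two screening rules can be sketched as a classifier over the per-stage recall values of one image. This is an assumed illustration (function and label names are not from the patent): `recalls` holds the recall of the sample on each stage model M1..Mn, and `m` is the split between early and late stages.

```python
def classify_sample(recalls, m):
    """Classify one image sample from its per-stage recalls [r1, ..., rn].

    m separates the early stages M1..Mm from the late stages Mm+1..Mn.
    """
    early, late = recalls[:m], recalls[m:]
    # recall > 0 on all early stages, recall = 0 on all late stages
    if all(r > 0 for r in early) and all(r == 0 for r in late):
        return "completely_forgotten"
    # recall = 1 on all early stages, 0 < recall < 1 on all late stages
    if all(r == 1 for r in early) and all(0 < r < 1 for r in late):
        return "partially_forgotten"
    return "kept"
```

For instance, with three stage models and m = 2, recalls of [0.5, 1.0, 0.0] mark a completely forgotten sample, while [1.0, 1.0, 0.5] mark a partially forgotten one.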
Preferably, n = m + 1 and A = 0.5.
The training set consists of pre-labeled image samples; the pre-labeling information optionally includes the sample name, the target class, and the pre-labeled recognition-box coordinates.
The models at different stages are models loaded with the weight files learned from the training set, where n corresponds to the number of learning passes.
The detection result optionally contains the sample name, the recognized target class, and the recognition-box coordinates.
Compared with the prior art, the invention has at least the following beneficial effects: the model replaces manual labor in analyzing noisy samples in very large target-detection datasets, which saves labor, eliminates the subjective influence of manual data screening, and improves the efficiency and accuracy of target-detection tasks performed with deep-learning methods.
Drawings
Fig. 1 is a schematic diagram of a completely forgotten sample screened by a target detection training sample screening method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a completely forgotten sample screened by the target detection training sample screening method according to the embodiment of the present invention.
Fig. 3 is a schematic diagram of a part of forgotten samples screened by the target detection training sample screening method according to the embodiment of the present invention.
Detailed Description
In order to make the technical solutions in one or more embodiments of the present disclosure better understood, the technical solutions in one or more embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings in one or more embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of one or more embodiments of the present disclosure, but not all embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments in one or more of the specification without inventive faculty are intended to fall within the scope of one or more of the specification.
Example 1: in order to solve the above technical problems, this embodiment takes a model of a contraband detection scenario as an example, and describes a sample screening method, including the following steps:
1. and detecting the target in the training set data by using the models at different stages obtained in the training process to obtain the detection result of each image sample on the model M at different stages.
The training set consists of manually labeled image samples. The number of image samples is very large, generally thousands to hundreds of thousands. Sample acquisition is not limited to direct image capture; the training set may also include samples obtained by data augmentation and/or new samples produced by image-fusion methods disclosed in the prior art. The image type is not limited: images may be acquired by a camera, an X-ray security-inspection device, or a terahertz imaging device, but in general the images in one training set come from the same kind of device, and images acquired by different kinds of devices are not mixed into the same training set. Manual labeling means labeling the detection targets, and the pre-labeling information optionally includes the sample name, the target class, and the coordinates of the pre-labeled recognition box. The pre-labeled image samples form the training set used to train the constructed contraband-detection model.
The type of the model is not limited in this embodiment; any target-detection model based on deep learning may be used, such as a two-stage detection framework represented by Faster R-CNN or a one-stage detection framework represented by SSD or YOLO.
The training process requires the model to learn the training set multiple times; each pass (called an epoch) adjusts the model parameters, that is, a weight file is generated per epoch. In one embodiment of the invention, the model is trained for 14 epochs and the 14 weight files produced during training are saved. Loading the weight file of each epoch into the model yields the stage models M1–M14, and these stage models are used to detect the targets in the training set, giving the detection result of each image sample on each stage model.
The detection result optionally contains information such as the name of the sample, the identified target class, the coordinates of the identification frame and the like.
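The stage-wise detection pass described above can be sketched as a loop over saved weight files. This is a hedged sketch: `build_model`, `load_weights`, and `detect` are hypothetical stand-ins for whatever detection framework is used (e.g. an SSD or YOLO implementation); only the control flow mirrors the text.

```python
def stagewise_results(weight_files, training_set, build_model, load_weights, detect):
    """Return results[i][name] = detections of that sample on stage model M(i+1).

    weight_files: one saved weight file per epoch, ordered M1 .. Mn.
    training_set: iterable of dicts with at least "name" and "image" keys.
    """
    results = []
    for wf in weight_files:
        model = build_model()          # fresh model instance
        load_weights(model, wf)        # load this epoch's weights -> stage model
        stage = {s["name"]: detect(model, s["image"]) for s in training_set}
        results.append(stage)
    return results
```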
For a target in an image, when the class recognized by the model matches the target's pre-labeled class and the IOU (Intersection over Union) of the model's recognition box and the labeled box exceeds a threshold, the target is considered correctly detected by the model. The threshold may be set empirically and is preferably 0.5.
2. And screening the detection results of the models in different stages to obtain a complete forgotten sample and a partial forgotten sample.
The screening process of this noise-sample screening method for training-set data in the target-detection field is based on a forgetting mechanism of the target-detection model, so the noise samples screened out by the method are called forgotten samples in this invention. In this embodiment, the concepts of IOU and recall are introduced to screen forgotten samples in a target-detection dataset, and forgotten samples are subdivided into completely forgotten and partially forgotten samples according to the degree of forgetting.
Specifically, one image sample may contain multiple targets, and the model's detection result for one target may contain multiple recognition boxes. Each image sample is processed as follows. A completely forgotten sample is defined as an image sample for which, in the detection results of models M1–M13, the pre-labeled targets have recognition boxes with matching class and IOU ≥ 0.5 (recall > 0), but in the detection result of model M14 no pre-labeled target has a recognition box with matching class and IOU ≥ 0.5 (recall = 0).
A partially forgotten sample is defined as an image sample for which, in the detection results of models M1–M13, every pre-labeled target has a recognition box with matching class and IOU ≥ 0.5 (recall = 1), but in the detection result of model M14 only some of the pre-labeled targets have such recognition boxes (0 < recall < 1).
This model self-checking method solves the problems of manual sample screening: the workload is enormous, and subjective human judgment may delete a small number of mislabeled samples that actually benefit model robustness, introducing large errors into the data screening. Because the method screens out noisy data that is harmful to the model via model self-checking, it is more targeted than manual screening and does not treat a small number of mislabeled samples that benefit model training as noise, thereby preserving model robustness while deleting the screened-out noisy data.
Applying the method of Example 1 to a training set of 25,000 image samples screened out 10 completely forgotten samples and 167 partially forgotten samples. Figures 1 and 2 show original images of screened completely forgotten samples; the targets in these image samples are mislabeled. Figure 3 shows a screened partially forgotten sample; the image contains correctly labeled firecrackers but also a target incorrectly labeled as a pistol.
The screened-out completely and partially forgotten samples were removed, and the remaining data was used as a new training set. Training the same model separately on the new training set and the original training set and then testing showed that the model trained on the new training set had higher recognition accuracy.
Example 2: in example 1 for M1-M13And M14After the identification frame is used for judging and screening the forgotten sample composition set C1, the M is continuously carried out1-M5And M6-M14And judging and screening the forgotten samples by the obtained identification frame to form a set C2, and removing a union set of C1 and C2 to obtain a new training set. The forgetting samples with different forgetting degrees can be obtained by screening for multiple times at different stages, and the specific times can be selected according to the actual effect.
The technical scheme of the invention can also be applied to target-recognition and detection scenarios other than the contraband detection of this embodiment, such as face recognition, license-plate recognition, road recognition, autonomous driving, and lesion detection and analysis in medical CT imaging.
The preferred embodiments of the present application disclosed above are intended only to aid in the explanation of the application. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the application and the practical application, to thereby enable others skilled in the art to best understand and utilize the application. The application is limited only by the claims and their full scope and equivalents.
Claims (7)
1. A method for screening a target detection training sample is characterized by comprising the following steps:
s1, detecting the target in the training set data by using the models at different stages obtained in the training process to obtain the detection result of each image sample on the model M at different stages;
s2, the following processing is carried out on each image sample:
setting the recall to 1 when every pre-labeled target in the detection result of model M has a recognition box whose class matches the pre-labeled class and whose IOU is greater than or equal to A;
setting the recall to 0 when no pre-labeled target has a recognition box with a matching class and IOU greater than or equal to A;
setting 0 < recall < 1 when only some of the pre-labeled targets have recognition boxes with a matching class and IOU greater than or equal to A;
wherein the IOU is the intersection-over-union ratio of the recognition-box area in the detection result and the pre-labeled box area, and A is a constant between 0 and 1 set according to an empirical value;
s3-will be at M1-MmUpper recall > 0, and at Mm+1-MnTaking the image sample with the upper call being 0 as a complete forgetting sample;
will be at M1-MmUpper recall is 1, and is at Mm+1-MnTaking the image sample with upper 0 < recall < 1 as a partial forgetting sample;
wherein m belongs to n, n is the number of stages of the model, and n is a natural number more than 1;
s4: screening out a complete forgotten sample and/or a partial forgotten sample;
the models in different stages are selected from models loaded with weight files learned from a training set, and n corresponds to the learning times.
2. The method as claimed in claim 1, wherein n is m + 1.
3. The method for screening a target detection training sample according to claim 1 or 2, wherein A is 0.5.
4. The method for screening the training samples for target detection according to claim 1, wherein the training set is composed of pre-labeled image samples, and the pre-labeled information optionally includes target category and pre-labeled identification frame coordinate information.
5. The method as claimed in claim 1, wherein the detection result optionally includes the recognized target class and the recognition frame coordinate information.
6. The method as claimed in claim 1, wherein step S3 is performed a plurality of times with different values of m.
7. The method for screening a training sample for target detection according to any one of claims 1-2 and 4-6, wherein the screened sample is deleted and the remaining samples constitute a training set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110093092.2A CN112418362B (en) | 2021-01-25 | 2021-01-25 | Target detection training sample screening method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112418362A (en) | 2021-02-26
CN112418362B true CN112418362B (en) | 2021-04-30 |
Family
ID=74782558
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110093092.2A Active CN112418362B (en) | 2021-01-25 | 2021-01-25 | Target detection training sample screening method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112418362B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113344858B (en) * | 2021-05-14 | 2024-07-09 | 云从科技集团股份有限公司 | Feature detection method, device and computer storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110738264A (en) * | 2019-10-18 | 2020-01-31 | 上海眼控科技股份有限公司 | Abnormal sample screening, cleaning and training method, device, equipment and storage medium |
CN111310826A (en) * | 2020-02-13 | 2020-06-19 | 南京旷云科技有限公司 | Method and device for detecting labeling abnormity of sample set and electronic equipment |
CN111353555A (en) * | 2020-05-25 | 2020-06-30 | 腾讯科技(深圳)有限公司 | Label detection method and device and computer readable storage medium |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110738264A (en) * | 2019-10-18 | 2020-01-31 | 上海眼控科技股份有限公司 | Abnormal sample screening, cleaning and training method, device, equipment and storage medium |
CN111310826A (en) * | 2020-02-13 | 2020-06-19 | 南京旷云科技有限公司 | Method and device for detecting labeling abnormity of sample set and electronic equipment |
CN111353555A (en) * | 2020-05-25 | 2020-06-30 | 腾讯科技(深圳)有限公司 | Label detection method and device and computer readable storage medium |
Non-Patent Citations (1)
Title |
---|
AN EMPIRICAL STUDY OF EXAMPLE FORGETTING;Mariya Toneva等;《arxiv.org》;20191115;全文 * |
Also Published As
Publication number | Publication date |
---|---|
CN112418362A (en) | 2021-02-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Rijal et al. | Ensemble of deep neural networks for estimating particulate matter from images | |
CN111178197A (en) | Mass R-CNN and Soft-NMS fusion based group-fed adherent pig example segmentation method | |
CN112232450B (en) | Multi-stage comprehensive difficult sample mining method and target detection method | |
CN108830332A (en) | A kind of vision vehicle checking method and system | |
CN109509187A (en) | A kind of efficient check algorithm for the nibs in big resolution ratio cloth image | |
CN109902662B (en) | Pedestrian re-identification method, system, device and storage medium | |
CN112102229A (en) | Intelligent industrial CT detection defect identification method based on deep learning | |
CN116843650A (en) | SMT welding defect detection method and system integrating AOI detection and deep learning | |
CN110161233B (en) | Rapid quantitative detection method for immunochromatography test paper card | |
CN116310785B (en) | Unmanned aerial vehicle image pavement disease detection method based on YOLO v4 | |
CN113205163B (en) | Data labeling method and device | |
CN109146873A (en) | A kind of display screen defect intelligent detecting method and device based on study | |
US12051253B2 (en) | Method and apparatus for training a neural network classifier to classify an image depicting one or more objects of a biological sample | |
CN112365497A (en) | High-speed target detection method and system based on Trident Net and Cascade-RCNN structures | |
US20220092359A1 (en) | Image data classification method, device and system | |
CN114494780A (en) | Semi-supervised industrial defect detection method and system based on feature comparison | |
CN113158969A (en) | Apple appearance defect identification system and method | |
CN112418362B (en) | Target detection training sample screening method | |
CN114092935A (en) | Textile fiber identification method based on convolutional neural network | |
CN115272252A (en) | Improved YOLOX-based carbon fiber defect detection method, device and system | |
CN111310837A (en) | Vehicle refitting recognition method, device, system, medium and equipment | |
CN114596244A (en) | Infrared image identification method and system based on visual processing and multi-feature fusion | |
CN116521917A (en) | Picture screening method and device | |
CN116485766A (en) | Grain imperfect grain detection and counting method based on improved YOLOX | |
CN113505784B (en) | Automatic nail labeling analysis method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||