CN111723852A - Robust training method for target detection network - Google Patents

Robust training method for target detection network

Info

Publication number
CN111723852A
Authority
CN
China
Prior art keywords
label
mining
network
training
target detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010480420.XA
Other languages
Chinese (zh)
Other versions
CN111723852B (en)
Inventor
李涵生
韩鑫
亢宇鑫
崔磊
杨林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Diyingjia Technology Co ltd
Original Assignee
Hangzhou Diyingjia Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Diyingjia Technology Co ltd filed Critical Hangzhou Diyingjia Technology Co ltd
Priority to CN202010480420.XA priority Critical patent/CN111723852B/en
Publication of CN111723852A publication Critical patent/CN111723852A/en
Application granted granted Critical
Publication of CN111723852B publication Critical patent/CN111723852B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection

Abstract

The invention relates to a robust training method for a target detection network, comprising the following steps: acquiring a training sample, wherein only part of the detection targets on the training sample carry manual annotation boxes; performing feature extraction on the training sample with the target detection network and generating proposal boxes on the training sample; assigning original sampling labels to the proposal boxes, the original sampling labels comprising positive labels and negative labels; pooling the proposal boxes marked with positive labels with a pooling branch and outputting first region-of-interest features; inputting the first region-of-interest features into a mining network, wherein the mining network is a fully connected neural network that generates new proposal-box labels, namely mining labels; fusing the mining labels with the original sampling labels to generate gold labels; and using the gold labels to train the target detection network.

Description

Robust training method for target detection network
Technical Field
The invention relates to the technical field of computer vision and target detection, in particular to a robust training method for a target detection network.
Background
In recent years, object detection frameworks based on convolutional neural networks (CNNs) have become a powerful tool for various computer vision tasks and are widely applied to object localization and object counting. CNN-based detection frameworks have continued to improve, and a number of excellent architectures have been proposed. Among them, region-based detection frameworks that include a region-proposal pre-processing step (e.g., Faster R-CNN, FPN) are widely used because of their more accurate detection performance. At the same time, many approaches keep improving the feature extractor by optimizing its network architecture. However, little work has addressed how to enhance training robustness under non-optimal parameters and how to keep the network trainable under varying label quality.
Disclosure of Invention
The present application is proposed to solve the above technical problem, and provides a robust training method for a target detection network.
According to an aspect of the present application, there is provided a robust training method for a target detection network, including: acquiring a training sample, wherein only part of the detection targets on the training sample carry manual annotation boxes; performing feature extraction on the training sample with the target detection network and generating proposal boxes on the training sample; assigning original sampling labels to the proposal boxes, the original sampling labels comprising positive labels and negative labels; pooling the proposal boxes marked with positive labels with a pooling branch and outputting first region-of-interest features; inputting the first region-of-interest features into a mining network, wherein the mining network is a fully connected neural network that generates new proposal-box labels, namely mining labels; fusing the mining labels with the original sampling labels to generate gold labels; and using the gold labels to train the target detection network.
Compared with the prior art, the robust training method for the target detection network adds proposal-box mining and label fusion to the training process of the target detection network. This effectively overcomes mislabeled proposal boxes and excessive false positives among the samples caused by missing manual annotation boxes or by thresholds (the first threshold and the second threshold) set too high or too low, and improves the anti-interference capability of the network training process.
Drawings
The above and other objects, features and advantages of the present application will become more apparent by describing in more detail embodiments of the present application with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the principles of the application. In the drawings, like reference numbers generally represent like parts or steps.
FIG. 1 is a flow chart of a robust training method for a target detection network of the present invention;
FIG. 2 is a breakdown of the processing stages of FIG. 1;
FIG. 3 shows some examples of positive labels generated when training on the sparse VOC2007 training set;
FIG. 4 is a comparison chart (1) of the results of target detection networks obtained by the ordinary training method and by the training method proposed in the present application on sparse COCO;
FIG. 5 is a comparison chart (2) of the results of target detection networks obtained by the ordinary training method and by the training method proposed in the present application on COCO.
Detailed Description
Hereinafter, example embodiments of the present application will be described in detail with reference to the accompanying drawings. It should be understood that the described embodiments are only some embodiments of the present application and not all embodiments of the present application, and that the present application is not limited by the example embodiments described herein.
Summary of the application
Taking Faster R-CNN as an example of a target detection network: during training, Faster R-CNN generates proposal boxes and then computes the intersection-over-union (IoU) between each proposal box and the annotation boxes. If the IoU is larger than a manually set threshold, the proposal box is assigned a category label (positive sample); otherwise it is assigned a background label (negative sample), and the labeled proposals are used as positive and negative samples to train the network. However, if manual annotation boxes are missing from an image, proposal boxes will be assigned erroneous labels. In addition, if the manually set threshold is not optimal, the sampling of positive and negative samples is affected: a threshold set too high loses too many positive samples and reduces the network's ability to recognize targets, while a threshold set too low produces too many false positives among the sampled samples, interfering with the normal training process of the network and degrading the final performance.
Aiming at these technical problems, the invention seeks to improve the training robustness of a pathology image detection network under training data with varying annotation quality and under non-optimal parameters. The core component of the present invention is a neural network named the "mining network". The mining network learns the characteristics of positive samples and mines potential positive samples in the images. The mined positive samples typically include positive samples that were lost due to non-optimal parameters and missing annotations; by merging the mined positive samples with the originally sampled positive samples, positive proposals lost through improper manual parameter settings and missing annotations can be recovered.
Having described the general principles of the present application, various non-limiting embodiments of the present application will now be described with reference to the accompanying drawings.
Exemplary method
The robust training method for the target detection network, as shown in fig. 1, includes:
s10, obtaining a training sample, wherein a part of detection targets on the training sample carry artificial labeling boxes;
s20, performing feature extraction on the training sample by using a target detection network, and generating a suggestion box on the training sample;
the task of the target detection network is to locate and identify an object from an image, and the image space is an Euclidean space which is not an effective feature separable space, so that a feature extractor is needed to be used for feature combination of field pixels of the image, and features of a larger range and even the whole image are mapped to a high-dimensional separable space. Because the performance of the network is closely related to the separability of the feature space, the backbone network of the target detection network often utilizes a mainstream classification network that has been widely verified to extract and combine features. The classification networks are usually pre-trained on a large-scale public data set, so that the search range of a parameter space of network parameters is effectively limited by a transfer learning mode, and the training difficulty of the network on a new target detection task is further reduced. Therefore, in the invention, the classification model ResNet101 after pre-training is used as a backbone network to execute a feature extraction task.
S30, assigning original sampling labels to the proposal boxes: if the intersection-over-union (IoU) between a proposal box and a manual annotation box is greater than a set first threshold, the proposal box is marked with a positive label; if the IoU between a proposal box and the manual annotation boxes is less than a set second threshold, the proposal box is marked with a negative label; the positive labels and negative labels constitute the original sampling labels;
the proposed boxes that are neither positive nor negative do not help the network training, and therefore the number of positive labels is crucial to the training of the detector.
S40, using two pooling branches to separately pool the proposal boxes marked with positive labels, and outputting a first region-of-interest feature and a second region-of-interest feature;
the feature map region corresponding to the positive label is also called a region of interest (RoI), and two parallel pooling branches are used to pool the feature map region corresponding to the RoI, and output a RoI feature (i.e., a first RoI feature) and a RoI feature (i.e., a second RoI feature) for mining respectively. The two parallel different branch structures of the RoI pooling ensure that the mining process does not interfere with the training process of the detector. And inputting the RoI characteristics (namely the second region of interest characteristics) into the target detection network, and outputting the result by the target detection network.
S50, inputting the first region-of-interest features into a mining network, wherein the mining network is a fully connected neural network that generates new proposal-box labels, namely mining labels;
the mining network is a fully-connected neural network, the input of which is the RoI characteristic used for mining, and the hidden layer of which can be one or more layers. The mining network outputs a probability distribution (mining score) with the suggested box category activated by softmax, then the mining score is subjected to one-hot coding, and a suggested box mining label represented by m is generated.
This process can be expressed as:
m = onehot[softmax(f_mining(x_roi))],
where m denotes the proposal-box mining label, f_mining is the mining network, and x_roi is the input RoI feature used for mining.
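A sketch of the mining network of S50, under the assumption of a single hidden layer of 1024 units (the patent only states that there may be one or more hidden layers) and of a "classes + background" output; it returns both the softmax mining score and its one-hot encoding m:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MiningNetwork(nn.Module):
    """Fully connected mining network: m = onehot[softmax(f_mining(x_roi))]."""

    def __init__(self, roi_feat_dim, num_classes, hidden_dim=1024):
        super().__init__()
        self.fc1 = nn.Linear(roi_feat_dim, hidden_dim)      # hidden layer (assumed size)
        self.fc2 = nn.Linear(hidden_dim, num_classes + 1)   # classes + background (assumed)

    def forward(self, x_roi):
        x = x_roi.flatten(start_dim=1)                           # flatten pooled RoI feature
        score = F.softmax(self.fc2(F.relu(self.fc1(x))), dim=1)  # mining score
        m = F.one_hot(score.argmax(dim=1), num_classes=score.size(1))  # mining label
        return score, m
```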
S60, fusing the mining labels with the original sampling labels to generate gold labels, the gold label serving as the final label of each proposal box;
and (4) the label obtained by fusing the mining label of the suggestion box and the original sampling label is called a gold label, and the gold label is used as a real label for detection training. By generating gold tags through merging operations, it can be ensured that the performance of the probe is not affected even under worst case conditions (excavation network is invalid).
Specifically, the gold label (g) is the union of the original sampling label (a) and the proposal-box mining label (m). Through the merging operation, false negative labels in the original sampling labels (proposals that should be positive but were sampled as negative) are corrected by the proposal-box mining labels. Thus, many positive labels lost due to improper manual thresholds and missing annotations are recovered. The label merging process can be expressed as:
g_k = m_k if a_k = 0, otherwise g_k = a_k,
where a_k = 0 indicates that the proposal box indexed by k is assigned a negative label.
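The merging of S60 reduces to a simple element-wise rule when labels are encoded as integers with 0 for background; the following sketch (an assumed encoding, not the patent's code) adopts the mining label wherever the original sampling label is negative:

```python
import torch

def fuse_gold_labels(a, m):
    """a: (N,) original sampling labels; m: (N,) mining labels (class indices,
    e.g. the argmax of the one-hot mining label). Returns the gold labels g."""
    # Wherever the original sampling label is negative (a_k = 0), adopt the mining
    # label; otherwise keep the original sampling label.
    return torch.where(a == 0, m, a)
```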
S70, using the proposal boxes with their gold labels to train the target detection network. The total loss for network training is given by:
Loss_Total = L_mcls(p, g) + L_loc + L_mining(a, m),
where p is the final softmax-activated probability distribution output by the detector, L_loc is the localization loss, and L_mcls is the cross-entropy loss, which can be expressed as:
L_mcls = -(1/N_cls) Σ_i g_i · log(p_i),
where N_cls is the number of proposal boxes, p_i is the classification probability distribution of the proposal box indexed by i output by the Fast R-CNN branch, and g_i is the gold label of the proposal box indexed by i; the original sampling labels are thus refined through the gold labels.
L_mining denotes the mining loss, i.e., the cross-entropy loss for training the mining network, which can be expressed as:
L_mining = -(1/N_cls) Σ_i a_i · log(m_i),
where a_i denotes the label indexed by i among the sampling (assignment) labels, and m_i is the mining-network output indexed by i. Notably, the labels used to train the mining network are the original sampling labels. Typically, there are hundreds of labeled proposal boxes per training step, and hence hundreds of labels among the sampling labels, which ensures that the mining network can adequately learn the characteristics of positive labels. At this point, Loss_Total can be used to train the entire target detection network. The loss function comprises classification losses and a localization loss; the classification losses comprise the cross-entropy loss L_mcls and the mining loss L_mining, while the localization loss follows the conventional calculation and is not described further here.
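Putting the pieces together, the total loss could be computed as in the sketch below, assuming the classification and mining heads output raw logits so that PyTorch's cross_entropy (which applies log-softmax internally) realizes the two cross-entropy terms, and that the localization loss L_loc is computed by the standard Faster R-CNN box-regression branch and passed in unchanged:

```python
import torch.nn.functional as F

def total_loss(cls_logits, gold_labels, loc_loss, mining_logits, sample_labels):
    """Loss_Total = L_mcls(p, g) + L_loc + L_mining(a, m)."""
    l_mcls = F.cross_entropy(cls_logits, gold_labels)         # detector classification loss
    l_mining = F.cross_entropy(mining_logits, sample_labels)  # mining-network loss
    return l_mcls + loc_loss + l_mining
```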
As shown in fig. 2, the general R-CNN training process is indicated by the dotted line in the figure: proposal boxes are obtained by further refining the positions of default proposal regions (e.g., the "anchors" in Faster R-CNN), and each proposal box is then assigned a category label (or background label) and used as a training sample for the detector. The present application adds proposal-box mining and label fusion to this training process, as indicated by the dash-dot line in fig. 2, which effectively overcomes mislabeled proposal boxes and excessive false positives among the samples caused by missing manual annotation boxes or by thresholds (the first threshold and the second threshold) set too high or too low, and improves the anti-interference capability of the network training process.
To verify the validity of this patent, experiments were performed on the PASCAL VOC 2007 and MS COCO 2017 datasets. PASCAL VOC 2007 consists of 5k training images and 5k test images covering approximately 20 object classes. The COCO dataset contains about 118k training images and 5k validation images, and the validation set was used for testing. The sparse datasets were created manually by randomly deleting annotations until only one annotation per class remained in each training image, as shown in fig. 3(a) (sparse annotation). The sparsification is performed only on the PASCAL training set and the COCO training set; the PASCAL test set and the COCO validation set remain intact.
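The sparsification procedure can be sketched as follows (an illustrative reimplementation with an assumed annotation format, not the authors' script): for each training image, all but one randomly chosen annotation per class is deleted.

```python
import random
from collections import defaultdict

def sparsify(annotations):
    """annotations: list of (class_name, box) pairs for one training image.
    Returns a sparse annotation list keeping one randomly chosen box per class."""
    by_class = defaultdict(list)
    for cls, box in annotations:
        by_class[cls].append(box)
    return [(cls, random.choice(boxes)) for cls, boxes in by_class.items()]
```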
1. Experimental parameters and details:
In the experiments, Faster R-CNN is adopted as the target detection network, the feature extractor is ResNet101, and ResNet101 is pre-trained on ImageNet. The number of training steps is 150k on PASCAL and 1500k on COCO, and the batch size is 1. The learning rate is initially set to 0.0001 and divided by 10 at 60k/600k steps (PASCAL/COCO) and again at 80k/800k steps. Images are rescaled during training so that the short side is 600 pixels and the long side is at most 1000 pixels. In addition, images are randomly flipped horizontally to augment the training data. A proposal box whose IoU with an annotation is higher than 0.5 is assigned a positive label; otherwise, it is assigned a negative label.
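For reference, the schedule and augmentation described above map onto standard PyTorch utilities roughly as in the sketch below; the choice of SGD and of a MultiStepLR scheduler is an assumption, since the patent does not name the optimizer.

```python
import torch
import torchvision.transforms as T

# Random horizontal flip for training-data augmentation; resizing to a 600-pixel
# short side (long side capped at 1000 pixels) is handled by the detector's own
# preprocessing and is omitted here.
augment = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.ToTensor(),
])

params = [torch.nn.Parameter(torch.zeros(1))]   # stand-in for the detector's parameters
optimizer = torch.optim.SGD(params, lr=0.0001)  # initial learning rate from the text
# Divide the learning rate by 10 at 60k and 80k steps for PASCAL (600k/800k for COCO).
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[60000, 80000], gamma=0.1)
```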
2. Quantitative results:
TABLE 1. Mean average precision (mAP) and average recall (AR) of Faster R-CNN trained on the PASCAL training set and evaluated on the PASCAL test set
Data        This patent    mAP     AR
Sparse      ×              58.5    73.4
Sparse      √              61.5    75.5
Complete    ×              68.5    82.7
Complete    √              68.9    83.4
Table 1 lists the results evaluated on the PASCAL 2007 test set. Under training on sparse PASCAL, the method of this patent improves the mAP (mean average precision) by 3.0% and the AR by 2.1%. Meanwhile, the method achieves a 0.7% AR (average recall) improvement on the original PASCAL.
TABLE 2. Average precision (AP) results of Faster R-CNN trained on the MS COCO training set and evaluated on the MS COCO validation set
Data        This patent    AP@0.5    AP      AP-s    AP-m    AP-l
Sparse      ×              25.4      14.9    2.1     13.2    27.7
Sparse      √              28.4      16.5    2.7     15.7    30.3
Complete    ×              34.0      19.6    4.5     20.7    33.0
Complete    √              36.5      20.6    5.3     22.2    34.0
Table 2 shows the results evaluated on the COCO validation set. The method of the invention increases the AP by 1.6% and 1.0% when trained on sparse COCO and on complete COCO, respectively. In addition, the method improves AP@0.5 by 3.0% and 2.5% on sparse and complete COCO, respectively, where AP@0.5 denotes the result under the single IoU threshold of 0.5. AP-s, AP-m, and AP-l are the AP values for small, medium, and large targets, respectively.
3. Robustness analysis:
TABLE 3. Average recall (AR) results of Faster R-CNN trained on the MS COCO training set and evaluated on the MS COCO validation set
Data        This patent    AR      AR-s    AR-m    AR-l
Sparse      ×              17.4    2.1     14.9    32.7
Sparse      √              19.7    2.8     18.0    37.0
Complete    ×              23.5    4.9     24.4    40.4
Complete    √              25.7    6.0     27.4    43.4
In Table 3, the AR results of the present invention (19.7 and 25.7) improve upon the original Faster R-CNN (17.4 and 23.5) by 2.3 and 2.2 AR, respectively. In this section, the training behavior of the target detection network and the effectiveness of the present invention under different IoU thresholds are further explored. The number of positive proposal boxes at different IoU thresholds during the last training period on the PASCAL training set is counted, and the average number of positive proposal boxes per image is reported. At the same time, the mAP results of the networks trained on the PASCAL training set and evaluated on the test set are given.
TABLE 4. Average number of positive proposal boxes in the last training period (at different IoU thresholds), and mAP results evaluated on the PASCAL test set
As shown in Table 4, the mAP results of the method of the present invention outperform Faster R-CNN except when the IoU threshold is 0.3. Moreover, as the IoU threshold increases, the method of the invention achieves a more significant mAP improvement; for example, when the IoU threshold is 0.6, 0.7, and 0.8, the mAP improvements of the method are 1.0%, 2.7%, and 6.8%, respectively.
4. Qualitative results
Figs. 4 and 5 illustrate some detection results produced by the method of this patent compared with Faster R-CNN. Faster R-CNN trained on the sparse COCO dataset tends to miss some objects (red dashed boxes), an error that the method of this patent largely avoids. Meanwhile, the method of this patent obtains more accurate predictions on the COCO dataset.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein. The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the application to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (4)

1. A robust training method for a target detection network, characterized by comprising the following steps:
acquiring a training sample, wherein only part of the detection targets on the training sample carry manual annotation boxes;
performing feature extraction on the training sample with the target detection network, and generating proposal boxes on the training sample;
assigning original sampling labels to the proposal boxes, wherein the original sampling labels comprise positive labels and negative labels;
pooling the proposal boxes marked with positive labels with a pooling branch, and outputting first region-of-interest features;
inputting the first region-of-interest features into a mining network, wherein the mining network is a fully connected neural network that generates new proposal-box labels, namely mining labels;
fusing the mining labels with the original sampling labels to generate gold labels;
and using the gold labels to train the target detection network.
2. The robust training method for a target detection network according to claim 1, wherein the generation of the mining label comprises: inputting the first region-of-interest feature into the mining network, the mining network outputting a softmax-activated probability distribution over proposal-box categories, and one-hot encoding the probability distribution to generate the mining label, which is specifically expressed as:
m = onehot[softmax(f_mining(x_roi))],
where m denotes the mining label, f_mining denotes the mining network, and x_roi denotes the first region-of-interest feature.
3. The robust training method for a target detection network according to claim 1, wherein the gold label is the union of the original sampling labels and the mining label, and a false negative label among the original sampling labels (a proposal that should be positive but was marked as negative) is corrected by the mining label and restored to a positive label through the merging operation;
the label merging process is expressed as:
g_k = m_k if a_k = 0, otherwise g_k = a_k,
where a_k = 0 denotes that the proposal box indexed by k is marked with a negative label, g denotes the gold label, a denotes the original sampling label, and m denotes the mining label.
4. The robust training method for a target detection network according to claim 2, wherein the loss function for training the target detection network is:
Loss_Total = L_mcls(p, g) + L_loc + L_mining(a, m),
where p is the softmax-activated probability distribution over proposal-box categories, and g denotes the gold label;
L_mcls denotes the cross-entropy loss,
L_mcls = -(1/N_cls) Σ_i g_i · log(p_i),
where N_cls denotes the number of proposal boxes, p_i denotes the softmax-activated probability distribution over proposal-box categories indexed by i, and g_i denotes the gold label indexed by i;
L_loc denotes the localization loss;
L_mining denotes the mining loss,
L_mining = -(1/N_cls) Σ_i a_i · log(m_i).
CN202010480420.XA 2020-05-30 2020-05-30 Robust training method for target detection network Active CN111723852B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010480420.XA CN111723852B (en) 2020-05-30 2020-05-30 Robust training method for target detection network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010480420.XA CN111723852B (en) 2020-05-30 2020-05-30 Robust training method for target detection network

Publications (2)

Publication Number Publication Date
CN111723852A true CN111723852A (en) 2020-09-29
CN111723852B CN111723852B (en) 2022-07-22

Family

ID=72565402

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010480420.XA Active CN111723852B (en) 2020-05-30 2020-05-30 Robust training method for target detection network

Country Status (1)

Country Link
CN (1) CN111723852B (en)


Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180089505A1 (en) * 2016-09-23 2018-03-29 Samsung Electronics Co., Ltd. System and method for deep network fusion for fast and robust object detection
CN106599939A (en) * 2016-12-30 2017-04-26 深圳市唯特视科技有限公司 Real-time target detection method based on region convolutional neural network
US20190102646A1 (en) * 2017-10-02 2019-04-04 Xnor.ai Inc. Image based object detection
CN108197687A (en) * 2017-12-27 2018-06-22 江苏集萃智能制造技术研究所有限公司 A kind of webpage two-dimensional code generation method
CN108416287A (en) * 2018-03-04 2018-08-17 南京理工大学 A kind of pedestrian detection method excavated based on omission negative sample
US20190294177A1 (en) * 2018-03-20 2019-09-26 Phantom AI, Inc. Data augmentation using computer simulated objects for autonomous control systems
CN108875819A (en) * 2018-06-08 2018-11-23 浙江大学 A kind of object and component associated detecting method based on shot and long term memory network
WO2019238976A1 (en) * 2018-06-15 2019-12-19 Université de Liège Image classification using neural networks
CN108960143A (en) * 2018-07-04 2018-12-07 北京航空航天大学 Detect deep learning method in a kind of naval vessel in High Resolution Visible Light remote sensing images
CN109285139A (en) * 2018-07-23 2019-01-29 同济大学 A kind of x-ray imaging weld inspection method based on deep learning
US20200134454A1 (en) * 2018-10-30 2020-04-30 Samsung Sds Co., Ltd. Apparatus and method for training deep learning model
CN109800778A (en) * 2018-12-03 2019-05-24 浙江工业大学 A kind of Faster RCNN object detection method for dividing sample to excavate based on hardly possible
CN110610210A (en) * 2019-09-18 2019-12-24 电子科技大学 Multi-target detection method
CN110716792A (en) * 2019-09-19 2020-01-21 华中科技大学 Target detector and construction method and application thereof
CN111091105A (en) * 2019-12-23 2020-05-01 郑州轻工业大学 Remote sensing image target detection method based on new frame regression loss function

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BIN LIU et al.: "Study of object detection based on Faster R-CNN", 2017 Chinese Automation Congress (CAC) *
唐博恒: "Grasp point recognition for irregular 3D objects based on improved Mask RCNN", China Excellent Master's and Doctoral Dissertations Full-text Database (Master), Information Science and Technology *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113221970A (en) * 2021-04-25 2021-08-06 武汉工程大学 Deep convolutional neural network-based improved multi-label semantic segmentation method
CN114612717A (en) * 2022-03-09 2022-06-10 四川大学华西医院 AI model training label generation method, training method, use method and device
CN117572531A (en) * 2024-01-16 2024-02-20 电子科技大学 Intelligent detector embedding quality testing method and system
CN117572531B (en) * 2024-01-16 2024-03-26 电子科技大学 Intelligent detector embedding quality testing method and system

Also Published As

Publication number Publication date
CN111723852B (en) 2022-07-22

Similar Documents

Publication Publication Date Title
CN110443818B (en) Graffiti-based weak supervision semantic segmentation method and system
CN111723852B (en) Robust training method for target detection network
JP5176763B2 (en) Low quality character identification method and apparatus
CN109977895B (en) Wild animal video target detection method based on multi-feature map fusion
CN111612008A (en) Image segmentation method based on convolution network
CN111950610B (en) Weak and small human body target detection method based on precise scale matching
CN112017192B (en) Glandular cell image segmentation method and glandular cell image segmentation system based on improved U-Net network
CN113420669B (en) Document layout analysis method and system based on multi-scale training and cascade detection
CN112819840B (en) High-precision image instance segmentation method integrating deep learning and traditional processing
CN112861785B (en) Instance segmentation and image restoration-based pedestrian re-identification method with shielding function
CN112347284A (en) Combined trademark image retrieval method
CN114998220A (en) Tongue image detection and positioning method based on improved Tiny-YOLO v4 natural environment
CN107564007B (en) Scene segmentation correction method and system fusing global information
CN114255403A (en) Optical remote sensing image data processing method and system based on deep learning
CN110135428B (en) Image segmentation processing method and device
CN114429649B (en) Target image identification method and device
CN114639122A (en) Attitude correction pedestrian re-recognition method based on convolution generation countermeasure network
CN116258978A (en) Target detection method for weak annotation of remote sensing image in natural protection area
CN115273154A (en) Thermal infrared pedestrian detection method and system based on edge reconstruction and storage medium
CN114037886A (en) Image recognition method and device, electronic equipment and readable storage medium
CN111582057B (en) Face verification method based on local receptive field
CN112991280A (en) Visual detection method and system and electronic equipment
CN112861840A (en) Complex scene character recognition method and system based on multi-feature fusion convolutional network
Lee et al. Enhancement for automatic extraction of RoIs for bone age assessment based on deep neural networks
CN115937161A (en) Adaptive threshold semi-supervised based ore sorting method and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant