CN110781941A - Human-in-the-loop labeling method and device based on active learning - Google Patents


Info

Publication number
CN110781941A
CN110781941A
Authority
CN
China
Prior art keywords
sample data
detection model
target detection
positioning
unlabeled
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201910995320.8A
Other languages
Chinese (zh)
Inventor
周镇镇
李峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Wave Intelligent Technology Co Ltd
Original Assignee
Suzhou Wave Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Wave Intelligent Technology Co Ltd filed Critical Suzhou Wave Intelligent Technology Co Ltd
Priority to CN201910995320.8A priority Critical patent/CN110781941A/en
Publication of CN110781941A publication Critical patent/CN110781941A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/40 Software arrangements specially adapted for pattern recognition, e.g. user interfaces or toolboxes therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention relates to a human-in-the-loop labeling method and device based on active learning, wherein the method comprises the following steps: establishing a target detection model using labeled sample data; inputting unlabeled sample data into the target detection model for testing, so as to calculate the classification uncertainty, localization stability and localization tightness of the unlabeled sample data; calculating a composite score for each unlabeled sample from the classification uncertainty, localization stability and localization tightness; extracting unlabeled sample data whose composite score falls within a predetermined range and labeling it manually; optimizing the target detection model with the manually labeled sample data; and repeating the preceding four steps until the target detection model meets a termination condition, then labeling the remaining unlabeled sample data with the trained target detection model. The method eases the tension between the high quality demanded of image annotation in computer vision and the time and labor the annotation process consumes, so that samples can be labeled more efficiently.

Description

Human-in-the-loop labeling method and device based on active learning
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a human-in-the-loop labeling method and device based on active learning.
Background
With the development of artificial intelligence technology, the application fields of computer vision have grown ever broader, including robotics, automatic driving, intelligent medical treatment and the like. The greatest driving force behind computer vision at present is machine learning, and in particular deep learning, which is the current mainstream. Deep learning can be used for tasks such as target detection, target tracking, image classification and image segmentation, all of which depend on large amounts of annotation information for digital images.
The conventional manual image labeling method, and even the combined manual and automatic labeling methods based on active learning, are still inefficient and consume enormous cost and time; this directly limits the number of labeled images available in the image field and restricts the rapid development of image technology.
To address these defects in the prior art, an optimized labeling method is needed to solve the problems of limited tasks and low efficiency in existing labeling tools.
Disclosure of Invention
In one aspect, in view of the above objectives, the present invention provides a human-in-the-loop labeling method based on active learning, wherein the method comprises the following steps:
establishing a target detection model using labeled sample data;
inputting unlabeled sample data into the target detection model for testing, so as to calculate the classification uncertainty, localization stability and localization tightness of the unlabeled sample data;
calculating a composite score for each unlabeled sample from the classification uncertainty, localization stability and localization tightness;
extracting unlabeled sample data whose composite score falls within a predetermined range and labeling it manually;
optimizing the target detection model with the manually labeled sample data;
and repeating the preceding four steps until the target detection model meets a termination condition, then labeling the remaining unlabeled sample data with the trained target detection model.
According to an embodiment of the active-learning-based human-in-the-loop labeling method of the present invention, the establishing a target detection model using labeled sample data further comprises:
collecting the labeled sample data and preprocessing it;
and establishing the target detection model using the preprocessed sample data.
According to an embodiment of the active-learning-based human-in-the-loop labeling method of the present invention, the preprocessing at least comprises scaling, equalization and normalization.
According to an embodiment of the active-learning-based human-in-the-loop labeling method of the present invention, the establishing a target detection model using labeled sample data further comprises:
using the Conv1 to Conv5 layers of VGG16 as the backbone feature extraction network of a Faster R-CNN framework;
selecting a plurality of anchor scales and a plurality of anchor aspect ratios;
Conv6 uses 512 3×3 convolution kernels with zero padding and a stride of 1;
Conv7 uses 512 5×5 convolution kernels with zero padding and a stride of 1;
and setting separate intersection-over-union (IoU) thresholds for non-maximum suppression (NMS) on the training (labeled) data and on the test (unlabeled) data.
According to an embodiment of the active-learning-based human-in-the-loop labeling method of the present invention, the classification uncertainty is one minus the highest class probability among the class predictions for a given target box of the unlabeled data, and is calculated as:
U_B(B) = 1 - P_max(B).
According to an embodiment of the active-learning-based human-in-the-loop labeling method of the present invention, the localization tightness is the degree to which the predicted target box of the unlabeled data fits the corresponding candidate region fed into the final classifier, and is calculated as:
T(B_j^i) = IoU(B_j^i, R_j^i),
where T(B_j^i) is the localization tightness of the j-th predicted target box B_j^i, and R_j^i is the candidate region that, input to the final classifier, generates B_j^i.
According to an embodiment of the active-learning-based human-in-the-loop labeling method of the present invention, the localization stability is the degree to which the localization of a target box of the unlabeled data remains stable under added noise, and is calculated as:
S_B(B_j) = (1/N) * Σ_{n=1}^{N} IoU(B_j, C_n(B_j)),
where B_j is a target box, C_n(B_j) is the corresponding box detected at noise level n, and N is the number of noise levels.
According to an embodiment of the active-learning-based human-in-the-loop labeling method, the composite score is a weighted sum of the classification uncertainty, the localization stability and the localization tightness.
According to an embodiment of the active-learning-based human-in-the-loop labeling method, the termination condition is a predetermined number of active-learning cycles and/or a predetermined IoU target.
In another aspect, the present invention further provides a human-in-the-loop labeling device based on active learning, wherein the device includes:
at least one processor; and
a memory storing processor-executable program instructions that, when executed by the processor, carry out the steps of the method of any of the preceding embodiments.
By adopting the above technical scheme, the invention has at least the following beneficial effects: it eases the tension between the high quality demanded of image annotation in computer vision and the time and labor the annotation process consumes; the active learning method effectively reduces the amount of labeled data that deep learning requires while achieving detection performance comparable to training with far more labeled images; and computing a composite score for the unlabeled samples from the classification uncertainty, localization stability and localization tightness makes it possible to select the unlabeled samples most in need of manual labeling, so that samples are labeled more effectively.
The aspects of the embodiments provided here should not be used to limit the scope of the present invention. Other embodiments in accordance with the techniques described herein will be apparent to one of ordinary skill in the art upon study of the following figures and detailed description, and are intended to be included within the scope of the present application.
Embodiments of the invention are explained and described in more detail below with reference to the drawings, but they should not be construed as limiting the invention.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed to describe the prior art and the embodiments are briefly introduced below. Parts of the drawings are not necessarily drawn to scale; related elements may be omitted, or in some cases the scale may be exaggerated, in order to emphasize and clearly show the novel features described herein. In addition, the structural order may be arranged differently, as is known in the art.
FIG. 1 shows a schematic block diagram of an embodiment of the human-in-the-loop annotation method based on active learning according to the invention;
FIG. 2 shows a schematic diagram of the active learning process according to another embodiment of the human-in-the-loop labeling method based on active learning of the present invention.
Detailed Description
While the present invention may be embodied in various forms, there is shown in the drawings and will hereinafter be described some exemplary and non-limiting embodiments, with the understanding that the present disclosure is to be considered an exemplification of the invention and is not intended to limit the invention to the specific embodiments illustrated.
Fig. 1 shows a schematic block diagram of an embodiment of the human-in-the-loop labeling method based on active learning according to the present invention. In the embodiment shown in Fig. 1, the method comprises at least the following steps:
S10: establishing a target detection model using labeled sample data;
S20: inputting unlabeled sample data into the target detection model for testing, so as to calculate the classification uncertainty, localization stability and localization tightness of the unlabeled sample data;
S30: calculating a composite score for each unlabeled sample from the classification uncertainty, localization stability and localization tightness;
S40: extracting unlabeled sample data whose composite score falls within a predetermined range and labeling it manually;
S50: optimizing the target detection model with the manually labeled sample data;
S60: repeating steps S20 to S50 until the target detection model meets a termination condition, then labeling the remaining unlabeled sample data with the trained target detection model.
To overcome the defects in the prior art, step S10 first establishes a target detection model using labeled sample data; this is the model later used both to label sample data and to test it. The target detection model is not perfect when first established, so it needs repeated further training. Therefore, in step S20, unlabeled sample data is input into the target detection model for testing, to calculate the classification uncertainty, localization stability and localization tightness of the unlabeled sample data. Step S30 then calculates the composite score of each unlabeled sample from the classification uncertainty, localization stability and localization tightness obtained in step S20. The tested unlabeled samples are sorted by composite score, and step S40 extracts the unlabeled sample data whose composite score falls within a predetermined range for manual labeling, usually the samples ranked at the front of the ordering. Step S50 optimizes the current target detection model with the manually labeled sample data. Steps S20 to S50 are executed in a loop until the target detection model meets the termination condition, after which step S60 labels the remaining unlabeled sample data with the trained target detection model. This completes both the training of the target detection model and the labeling of the sample data based on it.
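The S10 to S60 cycle can be sketched in a few lines of Python. This is a minimal illustration only: `train`, `score` and `annotate` are hypothetical caller-supplied callbacks standing in for detector training, composite scoring and the human labeling step, and the 1% selection ratio follows step 5 of the detailed embodiment below.

```python
def active_learning_loop(labeled, unlabeled, train, score, annotate,
                         max_cycles=10, select_ratio=0.01):
    """Sketch of the S10-S60 cycle. `train(model, labeled)` fits/updates the
    detector, `score(model, sample)` returns a composite score, and
    `annotate(sample)` stands in for the human labeling step."""
    model = train(None, labeled)                  # S10: initial detector
    for _ in range(max_cycles):                   # termination: cycle budget
        if not unlabeled:
            break
        # S20-S30: score every unlabeled sample and sort ascending
        ordered = sorted(unlabeled, key=lambda s: score(model, s))
        k = max(1, int(len(ordered) * select_ratio))
        picked, unlabeled = ordered[:k], ordered[k:]   # S40: leading fraction
        labeled = labeled + [annotate(s) for s in picked]
        model = train(model, labeled)             # S50: retrain with new labels
    return model, labeled, unlabeled
```

With real components plugged in, `train` would fine-tune the Faster R-CNN detector and `score` would combine the three criteria described below.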
In some embodiments of the active-learning-based human-in-the-loop labeling method of the present invention, the step S10 of building a target detection model using the labeled sample data further includes: collecting the labeled sample data and preprocessing it; and establishing the target detection model using the preprocessed sample data. In further embodiments, the preprocessing includes at least scaling, equalization and normalization.
In one or more embodiments of the active-learning-based human-in-the-loop labeling method of the present invention, the step S10 of building a target detection model using the labeled sample data further includes:
using the Conv1 to Conv5 layers of VGG16 as the backbone feature extraction network of a Faster R-CNN framework;
selecting a plurality of anchor scales and a plurality of anchor aspect ratios;
Conv6 uses 512 3×3 convolution kernels with zero padding and a stride of 1;
Conv7 uses 512 5×5 convolution kernels with zero padding and a stride of 1;
and setting separate intersection-over-union (IoU) thresholds for non-maximum suppression (NMS) on the training (labeled) data and on the test (unlabeled) data.
That is, in these embodiments, the object detection model uses the Faster R-CNN framework with the above modifications. The last two pooling layers of the backbone feature extraction network VGG16 (conv1-conv5) used by Faster R-CNN are removed, which increases the proportion of positive samples among the candidate targets, since the targets are small and sparse in the images. Furthermore, the anchor configuration preferably uses five different anchor scales and three different anchor aspect ratios. In addition, when setting the NMS IoU thresholds for the training labeled data and the test unlabeled data, the threshold for the training data should generally be greater than that for the test data, preferably 0.7 and 0.3 respectively.
In several embodiments of the active-learning-based human-in-the-loop labeling method of the present invention, the classification uncertainty is one minus the highest class probability predicted for a given target box of the unlabeled data, calculated as:
U_B(B) = 1 - P_max(B).
When the probability of one class is close to 1.0, the probabilities of the other classes are necessarily low, indicating that the detector is confident about the class; conversely, when several classes have similar probabilities, each probability is necessarily low, since the class probabilities sum to 1. On this basis, for a specific i-th picture I_i, the image-level classification uncertainty U_C(I_i) can be calculated as the largest classification uncertainty over all target boxes.
In some embodiments of the active-learning-based human-in-the-loop labeling method of the present invention, the localization tightness is the degree to which the predicted target box of the unlabeled data fits the corresponding candidate region fed into the final classifier, and is calculated as:
T(B_j^i) = IoU(B_j^i, R_j^i),
where T(B_j^i) is the localization tightness of the j-th predicted target box B_j^i, and R_j^i is the candidate region that, input to the final classifier, generates B_j^i. A candidate region is a box, obtained by selective search or by a Region Proposal Network (RPN), that may contain a foreground target. Because target detection must not only classify the targets in a picture but also localize them, and the position and scale of a target box are continuously adjusted during network training, the quality of a box can further be measured by its localization stability.
In several embodiments of the active-learning-based human-in-the-loop labeling method of the present invention, the localization stability is the degree to which the localization of a target box of unlabeled data remains stable under added noise, and is calculated as:
S_B(B_j) = (1/N) * Σ_{n=1}^{N} IoU(B_j, C_n(B_j)),
where B_j is a target box, C_n(B_j) is the corresponding box detected at noise level n, and N is the number of noise levels applied to the picture.
For a given image I_i, the localization stability is calculated as:
S_I(I_i) = Σ_{j=1}^{M} P_max(B_j) S_B(B_j) / Σ_{j=1}^{M} P_max(B_j),
where M denotes the number of reference target boxes, and each reference box is weighted by the probability of its highest-scoring category, so as to favor boxes with higher confidence.
In one or more embodiments of the active-learning-based human-in-the-loop labeling method of the present invention, the composite score is a weighted sum of the classification uncertainty, localization stability and localization tightness, calculated as:
F(I_i) = α U_C(I_i) + β T_I(I_i) + γ S_I(I_i),
where the weights α, β and γ preferably all take the value 1.
In some embodiments of the active-learning-based human-in-the-loop labeling method of the invention, the termination condition is a predetermined number of active-learning cycles and/or a predetermined IoU criterion. When the number of active learning cycles reaches the set value, or the detection IoU of the target detection model on the validation set meets the set target, the model is considered sufficiently mature and the training process is terminated.
To facilitate understanding of the technical solutions of the embodiments of the present invention, they are described in more detail below by way of example; the described embodiments are only some of the embodiments of the present invention. Fig. 2 shows a schematic diagram of the active learning process of a further embodiment of the active-learning-based human-in-the-loop labeling method according to the present invention. The main implementation process comprises: the target detector performs target classification and localization using the already collected labeled data; images in the unlabeled sample pool are screened for subsequent manual labeling; the screened images are delivered to annotators, who add target boxes and target categories to form label files; the annotated images are added to the labeled training set; and the original detector continues to be trained with the new labeled training set.
Further, in this embodiment, the active-learning-based human-in-the-loop labeling method according to the present invention more specifically comprises the following steps and sub-steps:
and step 0, firstly, collecting the data of the existing label, and preprocessing the data to reduce the influence of the data on network training as much as possible because the quality of the data directly influences the effect and the precision of a subsequent target detection algorithm, wherein the preprocessing flow of the data comprises scale scaling, equalization and normalization.
Step 1, train a deep learning model with the existing labeled data, using the Faster R-CNN framework:
Step 1.1, initialize with weights pre-trained on the ImageNet dataset;
Step 1.2, read in the dataset through the data generation module to produce the batches required for mini-batch training, select VGG16 as the backbone feature extraction network of Faster R-CNN to extract feature maps of the images, and modify the VGG16 network as follows:
a) remove the last two pooling layers of VGG16 (conv1-conv5) to raise the proportion of positive samples among the candidate targets, since targets are small and sparse in the images;
b) Conv6 uses 512 3×3 convolution kernels with zero padding and a stride of 1;
c) Conv7 uses 512 5×5 convolution kernels with zero padding and a stride of 1;
Step 1.3, propagate the high-dimensional image features generated in step 1.2 forward to produce still higher-dimensional features;
Step 1.4, use the RPN to rapidly extract candidate regions and region scores, modifying the anchor settings of the original algorithm to use five anchor scales (16, 24, 32, 48, 96) and three anchor aspect ratios (1:2, 1:1, 2:1);
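The anchor configuration of step 1.4 can be illustrated as follows. This is a sketch: it assumes the common Faster R-CNN convention that each anchor keeps an area of scale squared while the aspect ratio h/w varies, a detail the patent does not spell out.

```python
import itertools

def make_anchors(scales=(16, 24, 32, 48, 96), ratios=(0.5, 1.0, 2.0)):
    """Generate one (w, h) pair per scale/ratio combination. Assumes the
    usual Faster R-CNN convention of constant area scale**2 per anchor."""
    anchors = []
    for s, r in itertools.product(scales, ratios):
        w = s / r ** 0.5          # so that w * h == s * s
        h = s * r ** 0.5          # and h / w == r
        anchors.append((round(w, 2), round(h, 2)))
    return anchors
```

With five scales and three ratios this yields the 15 anchor shapes placed at every feature-map position.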
Step 1.5, calculate the scaling and translation factors of the prediction box, adjusting the original Faster R-CNN formulas as follows:
t_w = min(log(w/w_a), log(1000/16))
t_h = min(log(h/h_a), log(1000/16))
t_x = (x - x_a)/w_a
t_y = (y - y_a)/h_a
where x, y, w, h denote the center horizontal and vertical coordinates, width and height of the prediction box; x_a, y_a, w_a, h_a denote those of the anchor; and t_x, t_y, t_w, t_h denote the translation factors of the horizontal and vertical coordinates and the scaling factors of the prediction box, respectively;
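The four formulas above translate directly into code. This is a sketch, not the full training pipeline; boxes and anchors are assumed to be given as (cx, cy, w, h) tuples, and the log terms are clipped at log(1000/16) as stated.

```python
import math

def regression_targets(box, anchor, clip=math.log(1000.0 / 16.0)):
    """Compute (t_x, t_y, t_w, t_h) of a prediction box against an anchor,
    clipping the log-scale terms as in step 1.5."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    t_x = (x - xa) / wa                # horizontal translation factor
    t_y = (y - ya) / ha                # vertical translation factor
    t_w = min(math.log(w / wa), clip)  # clipped width scaling factor
    t_h = min(math.log(h / ha), clip)  # clipped height scaling factor
    return t_x, t_y, t_w, t_h
```

Per step 1.6, the same function applies unchanged to the ground-truth (calibration) boxes.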
Step 1.6, calculate the scaling and translation factors of the ground-truth (calibration) box using the same formulas;
Step 1.7, correct the position of the detection target with the translation and scaling factors to obtain candidate boxes, adjusting the IoU threshold of NMS to 0.7 during training;
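For illustration, a minimal greedy NMS with a configurable IoU threshold (0.7 during training per step 1.7, 0.3 during testing per step 2) might look like this. It is a generic sketch, not the patent's exact implementation.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.7):
    """Greedy NMS: keep each box unless it overlaps an already kept,
    higher-scoring box by at least `iou_threshold`. Returns kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_threshold for j in keep):
            keep.append(i)
    return keep
```

Lowering the threshold at test time (0.3) suppresses more overlapping detections than during training (0.7).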
Step 1.8, input the feature map from step 1.3 and the candidate regions from step 1.7 into the RoI Pooling layer together to generate high-dimensional features for the corresponding regions;
Step 1.9, pass the high-dimensional features of the corresponding regions through three fully connected layers to output the target bounding boxes (bbox) and scores.
Step 2, input the large amount of unlabeled sample data into the trained deep learning model for testing, with the IoU threshold of NMS adjusted to 0.3 during testing.
Step 3, calculate the classification uncertainty, localization stability and localization tightness of the unlabeled samples:
One of the tasks of target detection is to classify the targets in the image, and the classification uncertainty measures how sure the current detector's trained model is about a target's class. Given a target box B, the classification uncertainty is calculated as:
U_B(B) = 1 - P_max(B),
where P_max(B) is the highest class probability predicted for the target box. When the probability of one class is close to 1.0, the probabilities of the other classes are necessarily low, indicating that the detector is confident about the class; conversely, when several classes have similar probabilities, each probability is necessarily low, since the class probabilities sum to 1. On this basis, for a specific i-th picture I_i, the image-level classification uncertainty U_C(I_i) can be calculated as the maximum classification uncertainty over all target boxes, i.e.
U_C(I_i) = max_B U_B(B).
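The two formulas above reduce to a few lines (a sketch; class probabilities are assumed to be post-softmax values summing to 1):

```python
def box_uncertainty(class_probs):
    """U_B(B) = 1 - P_max(B): small when the detector is confident."""
    return 1.0 - max(class_probs)

def image_uncertainty(per_box_probs):
    """U_C(I) = max over all target boxes of U_B(B)."""
    return max(box_uncertainty(p) for p in per_box_probs)
```

A single ambiguous box is enough to give the whole image a high classification uncertainty.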
Since the labels of the unlabeled images are unknown, after the RPN outputs candidate regions it is necessary to estimate whether these candidate regions are likely to contain foreground objects. Besides classifying the targets in an image, target detection must also localize them, and whether the localization is accurate is measured with the localization tightness and the localization stability.
For a given candidate target box, the localization tightness is calculated as follows:
T(B_j^i) = IoU(B_j^i, R_j^i),
where T(B_j^i) denotes the localization tightness of the j-th predicted target box B_j^i, and R_j^i denotes the candidate region that, input to the final classifier, generates B_j^i.
The score of each target box is defined as J, given by:
J(B_j^i) = |T(B_j^i) + P_max(B_j^i) - 1|.
Multiple predicted target boxes are generated for each image; for image I_i, the localization tightness score is
T_I(I_i) = min_j J(B_j^i).
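The per-box score J and the image-level tightness score can be sketched as follows. Hedged: the patent's own formulas survive only as image placeholders, so the closed form |T + P_max - 1| and the min-aggregation are assumptions consistent with the surrounding text, not a confirmed transcription.

```python
def box_tightness_score(t_iou, p_max):
    """J(B) = |T(B) + P_max(B) - 1|: high when the IoU-based tightness and
    the class confidence agree, low when they conflict. Reconstructed form;
    the original formula is an image placeholder in the patent text."""
    return abs(t_iou + p_max - 1.0)

def image_tightness_score(per_box_scores):
    """Image-level tightness T_I(I): the worst per-box score, so a single
    confidently classified but loosely localized box flags the image."""
    return min(per_box_scores)
```

Under this form, a box with T = 0.5 and P_max = 0.5 scores 0, marking the image as highly informative for annotation.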
The position and scale of a target box are continuously adjusted during network training, and the quality of a box is measured by its localization stability. For a given target box B_j, the localization stability is calculated as follows:
S_B(B_j) = (1/N) * Σ_{n=1}^{N} IoU(B_j, C_n(B_j)),
where C_n(B_j) denotes the box detected at noise level n that corresponds to the reference box B_j, and N denotes the number of noise levels applied to the picture. Across different noise levels, the localization stability of an image measures how tolerant its detections are to noise.
For a given image I_i, the localization stability is calculated as follows:
S_I(I_i) = Σ_{j=1}^{M} P_max(B_j) S_B(B_j) / Σ_{j=1}^{M} P_max(B_j),
where M denotes the number of reference target boxes, and each reference box is weighted by the probability of its highest-scoring category, so as to favor boxes with higher confidence.
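The stability computation can be sketched as follows, assuming the IoU values between each reference box and its detections under noise have already been computed (the noise model itself is left out of this sketch):

```python
def box_stability(ious_under_noise):
    """S_B(B) = (1/N) * sum_n IoU(B, C_n(B)): the mean IoU between a
    reference box and its detections at each of the N noise levels."""
    return sum(ious_under_noise) / len(ious_under_noise)

def image_stability(p_max_per_box, stability_per_box):
    """S_I(I): average of per-box stabilities weighted by each reference
    box's highest class probability, per the formula above."""
    weighted = sum(p * s for p, s in zip(p_max_per_box, stability_per_box))
    return weighted / sum(p_max_per_box)
```

The confidence weighting means low-probability boxes contribute little to the image-level stability.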
Step 4, using the classification uncertainty, localization stability and localization tightness, calculate and sort the composite scores of the unlabeled samples, defining the score of an image as F(I_i) = U_C(I_i) + T_I(I_i) + S_I(I_i). During the active learning process, the unlabeled images are sorted by composite score from low to high.
Step 5, extract the leading 1% of unlabeled samples in this ordering and deliver them to the annotators for labeling; after the samples are labeled, the new labeled data are added to the original training set, the detector is retrained, and the whole process is repeated cyclically.
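Steps 4 and 5 together amount to scoring, sorting ascending, and taking the leading fraction. A sketch: the ascending order and the 1% ratio follow the text, though whether "leading" means lowest or highest composite score is ambiguous in the translation; lowest is assumed here.

```python
def composite_score(u_c, t_i, s_i, alpha=1.0, beta=1.0, gamma=1.0):
    """F(I) = alpha*U_C(I) + beta*T_I(I) + gamma*S_I(I), weights 1 by default."""
    return alpha * u_c + beta * t_i + gamma * s_i

def select_for_annotation(scores, ratio=0.01):
    """Sort images by composite score ascending (step 4) and return the
    indices of the leading ~`ratio` fraction to hand to the annotators."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    k = max(1, int(len(scores) * ratio))
    return order[:k]
```

The selected indices identify the images routed to human annotators in each active-learning cycle.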
Step 6, in a practical embodiment, a number of active cycles and a detection IoU target are set; when the number of active learning cycles reaches the set value, or the detection IoU of the target detection model on the validation set meets the set target, the training process of the target detection model is terminated, where the IoU is calculated between the model's predictions and the validation set ground truth.
Step 7, use the model to detect the remaining unlabeled data and take the detection results as the labels of those samples.
This ends the flow of the human-in-the-loop labeling method based on active learning.
In addition, it should be noted that the scheme in the above embodiment can also be applied to an online intelligent annotation task for images: the active-learning-based human-in-the-loop labeling method is invoked in the background to train on the dataset a user uploads to the annotation platform; after the composite scores of the images to be annotated are computed, the user annotates only the images with higher annotation value, saving the time and expense of annotating images that the model can already recognize well.
Steps 0 to 7 above serve as an example of the human-in-the-loop labeling method based on active learning according to the present invention and are intended as interpretation and illustration; the steps, their order, the values and the value ranges mentioned therein are all to be understood as preferred or more preferred examples, and should not be construed as limiting the present invention.
In another aspect, the present invention further provides a human-in-the-loop labeling device based on active learning, wherein the device includes:
at least one processor; and
a memory storing processor-executable program instructions that, when executed by the processor, carry out the steps of the active-learning-based human-in-the-loop labeling method of any one of the preceding embodiments.
The devices and apparatuses disclosed in the embodiments of the present invention may be various electronic terminal apparatuses, such as a mobile phone, a personal digital assistant (PDA), a tablet computer (PAD) or a smart television, or may be a large terminal apparatus such as a server; therefore the scope of protection disclosed in the embodiments of the present invention should not be limited to a specific type of device or apparatus. The client disclosed in the embodiments of the present invention may be applied to any of the above electronic terminal devices in the form of electronic hardware, computer software, or a combination of both.
The computer-readable storage media (e.g., memory) described herein may be volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. By way of example, and not limitation, nonvolatile memory can include Read Only Memory (ROM), Programmable ROM (PROM), Erasable Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM), which can act as external cache memory. By way of example and not limitation, RAM is available in many forms, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The storage devices of the disclosed aspects are intended to comprise, without being limited to, these and other suitable types of memory.
By adopting the above technical scheme, the present invention provides at least the following beneficial effects: it eases the tension, in the current computer vision field, between the high quality demanded of image annotation and the time- and labor-consuming nature of the annotation process; the active learning method effectively reduces the amount of labeled data required for deep learning while achieving detection performance comparable to training with far more labels; and by computing a composite score for unlabeled samples from the classification uncertainty, localization stability and localization tightness, the unlabeled samples most in need of manual annotation can be selected and labeled more effectively.
It is to be understood that the features listed above for the different embodiments may be combined with each other to form further embodiments within the scope of the invention, where technically feasible. Furthermore, the specific examples and embodiments described herein are non-limiting, and various modifications of the structure, steps and sequence set forth above may be made without departing from the scope of the invention.
In this application, the use of the disjunctive is intended to include the conjunctive. The use of definite or indefinite articles is not intended to indicate cardinality. In particular, a reference to "the" object or to "a" or "an" object is intended to denote also one of a possible plurality of such objects. However, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated. Furthermore, the conjunction "or" may be used to convey features that are present simultaneously, rather than mutually exclusive alternatives; in other words, the conjunction "or" should be understood to include "and/or". The term "comprising" is inclusive and has the same scope as "including".
The above-described embodiments, particularly any "preferred" embodiments, are possible examples of implementations, and are presented merely for a clear understanding of the principles of the invention. Many variations and modifications may be made to the above-described embodiments without departing substantially from the spirit and principles of the technology described herein. All such modifications are intended to be included within the scope of this disclosure.

Claims (10)

1. A human-in-the-loop annotation method based on active learning, characterized by comprising the following steps:
establishing a target detection model by using labeled sample data;
inputting unlabeled sample data into the target detection model for testing, so as to calculate a classification uncertainty, a localization stability and a localization tightness of the unlabeled sample data;
calculating a composite score of the unlabeled sample according to the classification uncertainty, the localization stability and the localization tightness;
extracting unlabeled sample data whose composite score falls within a preset range for manual annotation;
optimizing the target detection model by using the manually annotated sample data; and
cyclically executing the preceding four steps until the target detection model meets a termination condition, and annotating the remaining unlabeled sample data based on the trained target detection model.
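As a rough sketch (not part of the claims), the cycle of claim 1 can be written as the following loop. The callables `train`, `score` and `annotate`, the per-round budget and the round limit are illustrative assumptions, not details given by the claims:

```python
def active_learning_loop(labeled, unlabeled, train, score, annotate,
                         budget_per_round=10, max_rounds=5):
    """Sketch of the claimed human-in-the-loop cycle.

    train(labeled)   -> detection model        (build/optimize steps)
    score(model, x)  -> composite score        (test + scoring steps)
    annotate(x)      -> manually labeled item  (human annotation step)
    """
    model = train(labeled)                   # establish the detector
    for _ in range(max_rounds):              # termination: cycle budget
        if not unlabeled:
            break
        # Rank unlabeled samples by composite score, highest first.
        ranked = sorted(unlabeled, key=lambda x: score(model, x), reverse=True)
        picked, unlabeled = ranked[:budget_per_round], ranked[budget_per_round:]
        labeled = labeled + [annotate(x) for x in picked]  # human in the loop
        model = train(labeled)               # optimize with the new labels
    return model, labeled, unlabeled
```

In practice `score` would combine the three criteria of the claims, and the trained model would finally annotate whatever remains in `unlabeled`.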
2. The method of claim 1, wherein the building a target detection model using labeled sample data further comprises:
collecting the labeled sample data, and preprocessing the labeled sample data;
and establishing a target detection model by utilizing the preprocessed sample data.
3. The method of claim 2, wherein the pre-processing comprises at least scaling, equalization, and normalization.
4. The method of claim 1, wherein the building a target detection model using labeled sample data further comprises:
using the Conv1 to Conv5 layers of a VGG16 backbone feature-extraction network of the Faster RCNN framework;
selecting a plurality of anchor scales and a plurality of anchor aspect ratios;
Conv6 uses 512 convolution kernels of size 3 × 3 with zero padding and a stride of 1;
Conv7 uses 512 convolution kernels of size 5 × 5 with zero padding and a stride of 1;
and setting intersection-over-union thresholds for non-maximum suppression separately for the labeled training data and the unlabeled test data.
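As an arithmetic check on the Conv6/Conv7 settings above: the claim says only that the kernels are "filled with zero values", so the pad widths below (1 for the 3 × 3 kernel, 2 for the 5 × 5 kernel) are our assumed "same" padding. With stride 1, the standard output-size formula then shows both layers preserve the spatial size of the feature map:

```python
def conv_out_size(n, kernel, stride=1, pad=0):
    # Standard convolution output-size formula:
    #   out = floor((n + 2*pad - kernel) / stride) + 1
    return (n + 2 * pad - kernel) // stride + 1

# For an example 14x14 feature map:
#   Conv6: 3x3 kernel, pad 1, stride 1 -> 14x14
#   Conv7: 5x5 kernel, pad 2, stride 1 -> 14x14
```

Without padding the same kernels would shrink the map (e.g. 14 → 12 for a 3 × 3 kernel), which is why zero filling matters here.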
5. The method of claim 1, wherein the classification uncertainty is derived from the highest probability P max(B) among the classification predictions for a given target box B of the unlabeled data, and the calculation formula of the classification uncertainty is:
U B(B) = 1 - P max(B).
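A minimal sketch of this formula (the function name is ours; `class_probs` is the per-class probability vector predicted for one target box):

```python
def classification_uncertainty(class_probs):
    # U_B(B) = 1 - P_max(B): the lower the confidence in the most
    # likely class, the higher the annotation value of the box.
    return 1.0 - max(class_probs)
```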
6. The method according to claim 1, wherein the localization tightness measures how tightly the predicted target box fits the corresponding candidate region generated for the unlabeled data, and the localization tightness is calculated by:
T(B j) = IoU(B j, R j)
wherein T(B j) is the localization tightness of the jth predicted target box B j, and R j is the corresponding candidate region, input to the final classifier, from which B j was generated.
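The formula images for this claim did not survive extraction; assuming the localization tightness is the intersection-over-union between the jth predicted box B_j and the candidate region R_j that produced it (a common choice for tightness), a sketch is:

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2) corner coordinates.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def localization_tightness(pred_box, candidate_region):
    # T(B_j) = IoU(B_j, R_j): overlap between the final prediction
    # and the region proposal that was fed to the final classifier.
    return iou(pred_box, candidate_region)
```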
7. The method according to claim 1, wherein the localization stability measures how stable the positioning of a target box of the unlabeled data is under injected noise, and the localization stability is calculated by:
S(B j) = (1/N) Σ n=1..N IoU(B j, C j(I n))
wherein S(B j) is the localization stability of the target box B j, C j(I n) is the corresponding box detected in the copy of the image corrupted with the nth noise level, and N is the number of noise levels.
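Assuming the localization stability averages the IoU between a target box and its counterparts detected in N noise-corrupted copies of the image (the formula image for this claim was lost in extraction), a self-contained sketch is:

```python
def iou(a, b):
    # (x1, y1, x2, y2) boxes; repeated here so the sketch stands alone.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def localization_stability(box, noisy_boxes):
    # S(B_j) = (1/N) * sum over n of IoU(B_j, C_j(I_n)), where
    # noisy_boxes holds the corresponding detections from the N
    # noise-corrupted copies of the image.
    if not noisy_boxes:
        return 0.0
    return sum(iou(box, nb) for nb in noisy_boxes) / len(noisy_boxes)
```

A box whose detection barely moves under noise scores near 1; an unstable box scores lower and is therefore a better candidate for manual annotation.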
8. The method of claim 1, wherein the composite score is a weighted sum of the classification uncertainty, the localization stability and the localization tightness.
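The weighted sum can be sketched as follows. The weights, and the convention that higher uncertainty, lower tightness and lower stability all raise the score (i.e. raise the annotation value), are our assumptions; the claim fixes neither:

```python
def composite_score(uncertainty, tightness, stability,
                    weights=(1.0, 1.0, 1.0)):
    # Weighted sum of the three criteria. Tightness and stability are
    # inverted so that a hard-to-localize sample gets a HIGH score.
    w_u, w_t, w_s = weights
    return (w_u * uncertainty
            + w_t * (1.0 - tightness)
            + w_s * (1.0 - stability))
```

Samples whose score falls in the preset range would then be routed to the human annotator.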
9. The method according to claim 1, wherein the termination condition is a predetermined number of active-learning cycles and/or a predetermined IoU criterion.
10. A human-in-the-loop annotation device based on active learning, the device comprising:
at least one processor; and
a memory storing processor-executable program instructions which, when executed by the processor, perform the steps of the active-learning-based human-in-the-loop annotation method of any one of claims 1 to 9.
CN201910995320.8A 2019-10-18 2019-10-18 Human-in-the-loop labeling method and device based on active learning Withdrawn CN110781941A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910995320.8A CN110781941A (en) 2019-10-18 2019-10-18 Human-in-the-loop labeling method and device based on active learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910995320.8A CN110781941A (en) 2019-10-18 2019-10-18 Human-in-the-loop labeling method and device based on active learning

Publications (1)

Publication Number Publication Date
CN110781941A true CN110781941A (en) 2020-02-11

Family

ID=69386043

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910995320.8A Withdrawn CN110781941A (en) Human-in-the-loop labeling method and device based on active learning

Country Status (1)

Country Link
CN (1) CN110781941A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111428587A (en) * 2020-03-10 2020-07-17 同济大学 Crowd counting and density estimating method and device, storage medium and terminal
CN111428587B (en) * 2020-03-10 2022-07-29 同济大学 Crowd counting and density estimating method, device, storage medium and terminal
CN111429512A (en) * 2020-04-22 2020-07-17 北京小马慧行科技有限公司 Image processing method and device, storage medium and processor
CN111429512B (en) * 2020-04-22 2023-08-25 北京小马慧行科技有限公司 Image processing method and device, storage medium and processor
CN111783844A (en) * 2020-06-10 2020-10-16 东莞正扬电子机械有限公司 Target detection model training method and device based on deep learning and storage medium
CN112614570A (en) * 2020-12-16 2021-04-06 上海壁仞智能科技有限公司 Sample set labeling method, pathological image classification method and classification model construction method and device
CN112614570B (en) * 2020-12-16 2022-11-25 上海壁仞智能科技有限公司 Sample set labeling method, pathological image classification method, classification model construction method and device
CN112968941A (en) * 2021-02-01 2021-06-15 中科视拓(南京)科技有限公司 Data acquisition and man-machine collaborative annotation method based on edge calculation
CN112968941B (en) * 2021-02-01 2022-07-08 中科视拓(南京)科技有限公司 Data acquisition and man-machine collaborative annotation method based on edge calculation
CN113221875A (en) * 2021-07-08 2021-08-06 北京文安智能技术股份有限公司 Target detection model training method based on active learning
CN113221875B (en) * 2021-07-08 2021-09-21 北京文安智能技术股份有限公司 Target detection model training method based on active learning

Similar Documents

Publication Publication Date Title
CN110781941A (en) Human-in-the-loop labeling method and device based on active learning
CN109815770B (en) Two-dimensional code detection method, device and system
CN108470172B (en) Text information identification method and device
CN108304820B (en) Face detection method and device and terminal equipment
CN109492643A (en) Certificate recognition methods, device, computer equipment and storage medium based on OCR
CN111914642B (en) Pedestrian re-identification method, device, equipment and medium
CN110765865B (en) Underwater target detection method based on improved YOLO algorithm
CN112418278A (en) Multi-class object detection method, terminal device and storage medium
CN109472193A (en) Method for detecting human face and device
CN110781962B (en) Target detection method based on lightweight convolutional neural network
CN111783819A (en) Improved target detection method based on region-of-interest training on small-scale data set
CN110135446A (en) Method for text detection and computer storage medium
CN109345460B (en) Method and apparatus for rectifying image
CN110147833A (en) Facial image processing method, apparatus, system and readable storage medium storing program for executing
Wang et al. Yolov5 enhanced learning behavior recognition and analysis in smart classroom with multiple students
US20230106178A1 (en) Method and apparatus for marking object outline in target image, and storage medium and electronic apparatus
CN113780145A (en) Sperm morphology detection method, sperm morphology detection device, computer equipment and storage medium
CN109101984B (en) Image identification method and device based on convolutional neural network
CN112699809B (en) Vaccinia category identification method, device, computer equipment and storage medium
CN116958962A (en) Method for detecting pre-fruit-thinning pomegranate fruits based on improved YOLOv8s
CN113065379A (en) Image detection method and device fusing image quality and electronic equipment
US11893784B2 (en) Assessment of image quality for optical character recognition using machine learning
CN115512207A (en) Single-stage target detection method based on multipath feature fusion and high-order loss sensing sampling
CN111127327B (en) Picture inclination detection method and device
CN111046861B (en) Method for identifying infrared image, method for constructing identification model and application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200211