CN114429577B - Flag detection method, system and equipment based on high confidence labeling strategy

Info

Publication number: CN114429577B
Application number: CN202210101441.5A
Authority: CN
Inventors: 刘欢; 张驰; 秦涛; 郑庆华; 刘炉林; 何子豪
Original assignee: Xian Jiaotong University
Current assignee: Xian Jiaotong University
Priority date: 2022-01-27
Filing date: 2022-01-27
Publication date: 2024-03-08
Anticipated expiration: 2042-01-27
Also published as: CN114429577A

Abstract

A flag detection method, system and equipment based on a high-confidence labeling strategy detect flags in pictures using the core feature information of different target flags. First, related flag pictures are collected; second, pictures of under-represented flag classes are augmented; third, a core feature labeling standard is constructed; fourth, a complete data set is built; fifth, a supervised flag detection model is trained on the constructed data set; finally, the detection model identifies flags in unknown pictures. By labeling the core feature information of each target flag according to the core feature labeling standard, the invention captures that information more effectively, raises the confidence of the labeled samples, improves the model's detection capability across different types of flag pictures, and better handles occluded and deformed flags in pictures. Unbalanced data classes are handled with an effective data enhancement method. The stated advantages are high recall, strong robustness and high efficiency.

Description

Flag detection method, system and equipment based on high confidence labeling strategy

Technical Field

The invention relates to the field of target detection, in particular to a flag detection method, a system and equipment based on a high-confidence labeling strategy.

Background

Flag detection is a technique for locating a target flag in a picture and determining its category. It is widely used in image content auditing: on large internet platforms, users upload a large number of pictures every day, some of which may contain sensitive flags. Flag detection belongs to the field of target detection, and because it relies on supervised learning it places high demands on the original training data. Labeling is the groundwork for target detection, so labeling quality directly affects detection performance. The prevailing labeling criterion in target detection tasks is global labeling, in which a rectangular box covers the whole extent of the target as far as possible. Unlike the targets of other detection tasks, however, flags are non-rigid objects that deform readily, and flag detection is often complicated by occlusion. Global labeling of the target flag therefore fails to extract enough useful discriminative information and may even introduce a large amount of noise, which makes flag detection difficult. A new flag detection method is therefore needed.

The prior art proposes a method for detecting flags in a camera video stream. First, the original flag data set is augmented using several data enhancement methods. Then, in a first detection branch, targets are detected with a combined optical flow and GMM method. Meanwhile, in a second detection branch, video frames from the expanded data set are fed to a Darknet53 backbone network to extract multi-scale feature layers, and a sample selection algorithm selects positive and negative samples to train a YOLOv3 deep neural network model for target detection. Finally, the results of the two detection branches are merged to decide whether a flag is present in the camera video stream.

That method still labels target flags in the traditional global manner and does not consider the core features of different flags, so useful discriminative information is not fully extracted. Global labeling also introduces considerable noise, which lowers the confidence of the labeled samples and makes detection especially difficult when flags are occluded or deformed.

Disclosure of Invention

The invention aims to provide a flag detection method, a system and equipment based on a high-confidence labeling strategy, so as to solve the problems.

In order to achieve the above purpose, the present invention adopts the following technical scheme:

A flag detection method based on a high-confidence labeling strategy comprises the following steps:

Step 1, flag picture acquisition: taking an Internet media website as the data source, crawling pictures of k different types of flags through an API to obtain a picture data set consisting of n flag pictures;

Step 2, unbalanced sample data enhancement: performing data enhancement operations on flag types with fewer than 100 samples, generating extended samples so that their number is similar to that of the other flag types, and adding them to the picture data set;

Step 3, high-confidence sample labeling: to address the occlusion and deformation present in flag pictures, determining the core features of each of the k collected flag types and establishing a core feature labeling standard for them;

Step 4, target flag core feature labeling: using a labeling tool, labeling the core feature area of each target flag according to the high-confidence sample labeling strategy of step 3 and recording the flag category, to obtain the label vector Y_i = {a_i, b_i, w_i, h_i, c} of the picture, where a_i, b_i are the center point coordinates of the core feature labeling area, w_i, h_i are the width and height of the core feature labeling area, and c is the flag category to which the core feature area belongs; adding the label vectors of all pictures to the data set of step 2 and dividing it into a training set and a validation set at a ratio of 8:2 to construct a complete flag detection data set;

Step 5, flag detection model establishment: extracting training samples from the data set constructed in step 4, inputting them into a YOLOv3 target detection model, and constructing and training a supervised flag detection model;

Step 6, flag detection: inputting the picture p to be identified into the detection model trained in step 5, judging whether the picture p contains a target flag, and determining the flag type and the flag position.

Further, in step 2, the data enhancement operation includes: first, analyzing the proportion of samples in each flag category in the data collected in step 1; then, for categories with fewer than 100 samples, applying color enhancement, Gaussian noise addition, two-fold enlargement, random rotation, random shearing, horizontal flipping, vertical flipping, and combined horizontal and vertical flipping to the pictures, generating extended samples in numbers similar to those of the other flag categories, and adding the extended samples to the original data set to obtain a new data set.
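The category analysis in step 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `augmentation_plan` is hypothetical, and the target count (the mean size of the categories at or above the threshold) is an assumption, since the text only asks for numbers "similar to" the other categories.

```python
import math
from collections import Counter


def augmentation_plan(labels, min_samples=100):
    """Return {category: extra copies needed per original picture}.

    Only categories with fewer than `min_samples` pictures are included
    (100 is the threshold stated in the text). The target size is assumed
    to be the mean size of the adequately represented categories.
    """
    counts = Counter(labels)
    large = [n for n in counts.values() if n >= min_samples]
    if not large:
        return {}  # no reference category to balance against
    target = sum(large) // len(large)
    plan = {}
    for category, n in counts.items():
        if n < min_samples:
            # Each original picture needs this many augmented variants.
            plan[category] = max(0, math.ceil(target / n) - 1)
    return plan


labels = ["a"] * 200 + ["b"] * 50 + ["c"] * 400
print(augmentation_plan(labels))
```

With eight augmentation operations available, a plan value of up to 8 can be met by applying each operation once per picture; larger values would need repeated random operations.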

Further, determining the flag high-confidence sample labeling in step 3 includes: first, comparing all types of flags in the data set and taking the most distinguishable area of each target flag as its core feature, or the complete flag area if no obvious core feature exists; second, constructing the flag high-confidence sample labeling strategy: when the core feature of the target flag is fully visible, labeling it with the smallest labeling box that completely covers the core feature, and leaving the non-core feature areas of the flag unlabeled; when the core feature of the target flag is deformed, labeling the deformed area as usual; when the core feature of the target flag is occluded, labeling it unless more than half of the core feature is occluded.

Further, in step 3, the core feature labeling standard covers the labeling position, the coverage area, and the handling of abnormal cases such as deformation and occlusion.

Further, in the target flag core feature labeling of step 4, Y denotes the labeling information of the data; for each picture p_i in the picture data set, based on the core features of the different flag categories, a labeling tool is used to label the core feature area of the target flag according to the labeling strategy of step 3 and to record the flag category, and the center point coordinates, width and height of each labeled area are recorded in the labeling file.

Further, in establishing the flag detection model in step 5, for the training sample data set constructed in step 4, k-means clustering is performed on the labeling box areas in all picture labeling information to obtain 9 anchors of different sizes; the pictures in the training set are preprocessed and cropped to 416×416 and converted into an RGB three-channel image matrix X; the image matrix, the anchors and the image labels are input into the detection network model for training to obtain the flag detection network model; the constructed supervised model based on core feature labeling trains a coefficient matrix W that maps the data matrix X to the labeling information matrix Y, and the training objective is as follows:

wherein l_box is the regression loss between the prediction box and the real box, l_obj is the confidence loss of the prediction box, and l_cls is the classification loss between different classes.
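The objective itself is not reproduced in this text. One form consistent with the three loss terms defined here, assuming an unweighted sum (any weighting coefficients used in the original are not known), would be:

```latex
W^{*} = \arg\min_{W} \left( l_{box} + l_{obj} + l_{cls} \right)
```

Here W^{*} is a symbol introduced only for this sketch to denote the trained coefficient matrix; each loss term is evaluated on the predictions obtained from X with parameters W against the labels Y.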

Further, the specific training process is as follows:

(1) Reading in the pictures and label information, preprocessing the images, constructing the data matrix X of the training set, performing k-means clustering on the labeling box areas in all picture labeling information to obtain 9 anchors of different sizes, and setting a termination threshold ε for judging optimization convergence;

(2) Inputting the image matrix X, the anchors and the image label matrix Y into the YOLOv3 detection network model, extracting input image features with the Darknet53 feature extraction network, and generating feature maps of 13×13, 26×26 and 52×52 grid cells using an FPN-like (Feature Pyramid Network) structure;

(3) Taking the feature map grid cell that contains the center of the real flag area of the input image as the prediction grid cell, obtaining the prediction boxes corresponding to the anchors centered on the prediction grid cell, and selecting the prediction box with the largest IoU with the real flag area as the predicted flag area;

(4) Comparing the real flag area with the predicted flag area, updating the model parameters according to the training error, and judging whether the decrease in the objective function value is smaller than ε; if not, returning to step (3) to continue training; otherwise, ending training and saving the parameter matrix W of the final detection network model.
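Step (3) of the training process can be illustrated with a small sketch. It is a simplification under stated assumptions: boxes are (cx, cy, w, h) tuples in pixels of the 416×416 input, candidate prediction boxes are the raw anchor shapes placed at the center of the prediction grid cell (a trained network would also apply learned offsets), and the function names and example anchor sizes are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def responsible_cell(cx, cy, image_size=416, grid=13):
    """Return (column, row) of the grid cell containing the box center."""
    stride = image_size / grid
    return int(cx // stride), int(cy // stride)


def best_anchor(true_box, anchors, image_size=416, grid=13):
    """Index of the anchor whose cell-centered box has the largest IoU."""
    col, row = responsible_cell(true_box[0], true_box[1], image_size, grid)
    stride = image_size / grid
    cell_cx, cell_cy = (col + 0.5) * stride, (row + 0.5) * stride
    scores = [iou(true_box, (cell_cx, cell_cy, aw, ah)) for aw, ah in anchors]
    return max(range(len(anchors)), key=scores.__getitem__)


anchors = [(10, 13), (116, 90), (373, 326)]  # illustrative (width, height) pairs
print(responsible_cell(208, 208), best_anchor((208, 208, 100, 50), anchors))
```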

Further, in the flag detection of step 6, the picture p to be identified is input into the detection model trained in step 5 to judge whether a target flag appears in the picture; the predicted label vector y of the target picture p is obtained through the mapping of the supervised flag detection model coefficient matrix W; if y is empty, no flag is detected in the target picture; if y = {a_i, b_i, w_i, h_i, c}, c ∈ {1, 2, …, k}, a flag of category c is detected in the target picture, the center point coordinates of the flag position are a_i, b_i, and the width and height of the flag are w_i, h_i.

Further, a flag detection system based on a high confidence labeling strategy, comprising:

the flag picture acquisition module is used for crawling different types of flag pictures to obtain a picture data set consisting of n flag pictures;

the sample enhancement processing module is used for carrying out data enhancement operation on flag types with smaller sample numbers, generating expanded samples with the number similar to that of other types of flag samples and adding the expanded samples into a picture data set;

the sample labeling module is used for determining the core features of each of the k collected flag types, in view of the occlusion and deformation present in flag pictures, and establishing the core feature labeling standard for them;

the target flag core feature labeling module is used for labeling the target flag core feature area according to a high-confidence sample labeling strategy by using a labeling tool, and noting the category of the flag to obtain a label vector of a picture, and constructing a complete flag detection data set;

the flag detection model building module is used for extracting training samples from the constructed data set, inputting a YOLOv3 target detection model, and constructing and training a supervised flag detection model;

and the flag detection module is used for inputting the picture p to be identified into the trained detection model, judging whether the picture p contains the target flag or not, and determining the type and the position of the flag.

Further, a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of a high confidence labeling strategy based flag detection method when executing the computer program.

Compared with the prior art, the invention has the following technical effects:

The invention labels the core feature information of each target flag according to the core feature labeling standard, which captures that information more effectively, raises the confidence of the labeled samples, improves the model's detection capability across different types of flag pictures, and better handles the identification of occluded and deformed flags in pictures; it also handles unbalanced data classes with an effective data enhancement method. The stated advantages are high recall, strong robustness and high efficiency.

The method provides a high-confidence labeling strategy that focuses on the core features of different flags and fully exploits the most discriminative features of each flag, improving detection accuracy and recall; it addresses the class imbalance problem through data enhancement, improving the overall recognition performance of the model.

Drawings

FIG. 1 is a flow chart of a method for flag detection based on a high confidence labeling strategy.

Fig. 2 is a flow chart of a data acquisition process.

Fig. 3 is a flow chart of an unbalanced sample data enhancement process.

Fig. 4 is a schematic diagram of the core features of the flag.

Fig. 5 is a flow chart of core feature labeling standard construction and picture labeling.

FIG. 6 is a flow chart of the detection model training process.

FIG. 7 is a block diagram of an implementation of image flag detection of the present invention.

Detailed Description

Embodiments of the present invention will be described in detail below with reference to the accompanying drawings and examples. It should be noted that the embodiments described herein serve only to explain the present invention and are not intended to limit it. Furthermore, the technical features of the embodiments of the present invention may be combined with one another provided they do not conflict.

The specific implementation process of the invention comprises a data acquisition process, an unbalanced sample data enhancement process, a high confidence sample labeling strategy determination process, a target flag core feature labeling process, a flag detection model establishment process and a flag detection process.

The invention discloses a flag detection method based on a high-confidence labeling strategy, comprising: 1) obtaining flag data from Internet social media platforms with a keyword-based crawler; 2) analyzing the data and applying data enhancement to under-represented classes, increasing their sample counts to improve model performance; 3) implementing the high-confidence sample labeling strategy by determining the core features of each type of flag and constructing a core feature labeling standard that covers the labeling position, the coverage area, and the handling of abnormal cases such as deformation and occlusion; 4) using a labeling tool to label the core feature area of each target flag according to the labeling standard and record the flag category, generating the picture label vectors and constructing a complete data set; 5) constructing a supervised flag detection model based on the YOLOv3 target detection model; 6) inputting the picture to be identified into the constructed detection model for target flag detection. By labeling high-confidence flag samples using the core feature information of the target flag, the method improves the model's detection capability across different types of flag pictures and better handles the identification of occluded and deformed flags in pictures. It also addresses class imbalance with an effective data enhancement method. The stated advantages over other flag detection methods are high recall, strong robustness and high efficiency.

FIG. 1 is a general flow chart of a flag detection method based on a high confidence labeling strategy of the present invention.

Data acquisition process

The specific process of data acquisition is as follows:

(1) Crawling pictures by flag-category keywords using crawler technology. Related flag tags such as 'flag of China', 'flag of Japan' and 'flag of South Korea' can be used when crawling;

(2) De-duplicating the crawled pictures of the k flag types to obtain a picture data set consisting of n flag pictures.
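The de-duplication in item (2) could be done in several ways; the text does not say which. A minimal sketch, assuming that only byte-identical files count as duplicates (near-duplicates such as resized copies would need a perceptual hash instead), with the hypothetical helper name `deduplicate`:

```python
import hashlib


def deduplicate(images):
    """Keep the first occurrence of each byte-identical picture.

    `images` is an iterable of (name, raw_bytes) pairs; the order of the
    surviving pictures is preserved.
    """
    seen = set()
    unique = []
    for name, data in images:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append((name, data))
    return unique


crawled = [("a.jpg", b"\x01\x02"), ("b.jpg", b"\x03"), ("c.jpg", b"\x01\x02")]
print([name for name, _ in deduplicate(crawled)])
```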

The above data acquisition process is shown in fig. 2.

Unbalanced sample data enhancement procedure

The data collected in step 1 are analyzed for the proportion of samples in each flag category. For categories with few samples, operations such as color enhancement, Gaussian noise addition, two-fold enlargement, random rotation, random shearing, horizontal flipping, vertical flipping, and combined horizontal and vertical flipping are applied to the pictures, generating extended samples in numbers similar to those of the other flag categories; the extended samples are added to the original data set to obtain a new data set.
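A few of these operations can be sketched with NumPy alone. This is an illustration under assumptions, not the patented implementation: pictures are H×W×3 `uint8` arrays, the noise standard deviation is an arbitrary example value, two-fold enlargement is done by nearest-neighbour pixel repetition, and the function names are hypothetical. Color enhancement, rotation and shearing are omitted because the text does not give their parameters.

```python
import numpy as np


def flip_variants(img):
    """Return the horizontal, vertical, and combined flips of a picture."""
    return [img[:, ::-1], img[::-1, :], img[::-1, ::-1]]


def add_gaussian_noise(img, sigma=10.0, rng=None):
    """Add zero-mean Gaussian noise and clip back to the 0..255 range."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def upscale_2x(img):
    """Double height and width by repeating each pixel (nearest neighbour)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


picture = np.zeros((4, 6, 3), dtype=np.uint8)
augmented = flip_variants(picture) + [add_gaussian_noise(picture), upscale_2x(picture)]
print([a.shape for a in augmented])
```

Because labeling (step 4) happens after enhancement (step 2) in this pipeline, the augmented pictures are labeled directly and no box coordinates need to be transformed.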

The unbalanced sample data enhancement process described above is shown in fig. 3.

Determining high confidence sample labeling strategy procedure

Determining the core features of the flag: all types of flags in the data set are compared, and the most distinguishable area of each target flag is taken as its core feature; if no obvious core feature exists, the complete flag area is taken as the core feature. For example, where a region within a flag is strongly distinguishing, that region is the flag's core feature, whereas the United Kingdom flag contains no single strongly distinguishing region, so the whole United Kingdom flag is its core feature. Constructing the flag high-confidence sample labeling strategy: when the core feature of the target flag is fully visible, it is labeled with the smallest labeling box that completely covers the core feature, and the non-core feature areas of the flag are left unlabeled; when the core feature of the target flag is deformed, the deformed area is labeled as usual; when the core feature of the target flag is occluded, it is labeled unless more than half of the core feature is occluded.
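The labeling strategy amounts to a short decision rule, sketched below. The data structure and names (`FlagInstance`, `box_to_annotate`, `occluded_fraction`) are hypothetical and only illustrate the rule; in practice a human annotator makes these judgments with a labeling tool.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class FlagInstance:
    flag_box: Box                # tight box around the whole flag
    core_box: Optional[Box]      # tight box around the core feature, None if the flag has none
    occluded_fraction: float     # share of the core feature that is hidden, 0.0 to 1.0


def box_to_annotate(inst: FlagInstance) -> Optional[Box]:
    """Return the box to label, or None when the instance is skipped."""
    # Flags without an obvious core feature use the whole flag as core feature.
    target = inst.core_box if inst.core_box is not None else inst.flag_box
    # More than half occluded: do not label. Deformed features are labeled as usual.
    if inst.occluded_fraction > 0.5:
        return None
    return target


print(box_to_annotate(FlagInstance((0, 0, 10, 6), (1, 1, 4, 3), 0.2)))
```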

The core features of the flag are schematically shown in fig. 4.

Target flag core feature labeling process

Y denotes the labeling information of the data. For each picture p_i in the picture data set, based on the core features of the different flag categories, a labeling tool is used to label the core feature area of the target flag according to the labeling standard of step 3 and to record the flag category, and the center point coordinates, width and height of each labeled area are recorded in a labeling file. The label vector Y_i = {a_i, b_i, w_i, h_i, c} of the picture is generated from the labeling file, where a_i, b_i are the center point coordinates of the labeled area, w_i, h_i are its width and height, and c is the flag category to which the labeled area belongs, with c = 1, 2, …, k corresponding to the k different flag types. The label vectors of all pictures are added to the original data set, which is divided into a training set and a validation set at a ratio of 8:2, thereby constructing a complete flag detection data set.
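As a minimal sketch of this step, the following converts a corner-format box (as many labeling tools export it, which is an assumption here) into the label vector (a, b, w, h, c) and performs the 8:2 split. The function names and the fixed shuffle seed are illustrative choices, not taken from the source.

```python
import random


def to_label(x_min, y_min, x_max, y_max, c):
    """Convert a corner-format box and category c into (a, b, w, h, c)."""
    w = x_max - x_min
    h = y_max - y_min
    return (x_min + w / 2, y_min + h / 2, w, h, c)


def split_8_2(items, seed=0):
    """Shuffle and split items into training (80%) and validation (20%) lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * 0.8)
    return items[:cut], items[cut:]


print(to_label(10, 20, 50, 40, 3))
train, val = split_8_2(range(10))
print(len(train), len(val))
```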

The process of constructing the labeling standard and labeling the picture is shown in fig. 5.

Flag detection model building process

The flag detection model is built on the YOLOv3 detection network. YOLOv3 adopts an FPN-like structure to improve the detection of small targets; it has 3 detection layers at different scales, each with 3 anchors of different sizes. For the training sample data set constructed in step 4, k-means clustering is performed on the labeling box areas in all picture labeling information to obtain 9 anchors of different sizes. The pictures in the training set are preprocessed and cropped to 416×416 and converted into an RGB three-channel image matrix X, and the image matrix, the anchors and the image labels are input into the detection network model for training to obtain the flag detection network model. The constructed supervised model based on core feature labeling trains a coefficient matrix W that maps the data matrix X to the labeling information matrix Y, and the training objective is as follows:

wherein l_box is the regression loss between the prediction box and the real box, l_obj is the confidence loss of the prediction box, and l_cls is the classification loss between different classes. The specific training process is as follows:

(1) Reading in the pictures and label information, preprocessing the images, constructing the data matrix X of the training set, and performing k-means clustering on the labeling box areas in all picture labeling information to obtain 9 anchors of different sizes. Meanwhile, setting a termination threshold ε for judging optimization convergence;

(2) Inputting the image matrix X, the anchors and the image label matrix Y into the YOLOv3 detection network model, extracting input image features with the Darknet53 feature extraction network, and generating feature maps of 13×13, 26×26 and 52×52 grid cells using an FPN-like structure;

(3) Taking the feature map grid cell that contains the center of the real flag area of the input image as the prediction grid cell, obtaining the prediction boxes corresponding to the anchors centered on the prediction grid cell, and selecting the prediction box with the largest IoU with the real flag area as the predicted flag area;

(4) Comparing the real flag area with the predicted flag area, updating the model parameters according to the training error, and judging whether the decrease in the objective function value is smaller than ε; if not, returning to step (3) to continue training; otherwise, ending training and saving the parameter matrix W of the final detection network model.
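The anchor clustering in item (1) can be sketched as below. The source only says that k-means is applied to the labeling box areas; using 1 − IoU between box shapes as the distance is an assumption borrowed from common YOLO practice, and the function names are hypothetical.

```python
import numpy as np


def wh_iou(boxes, centroids):
    """IoU between (N, 2) box shapes and (K, 2) centroids sharing one corner."""
    inter = (np.minimum(boxes[:, None, 0], centroids[None, :, 0])
             * np.minimum(boxes[:, None, 1], centroids[None, :, 1]))
    area_b = (boxes[:, 0] * boxes[:, 1])[:, None]
    area_c = (centroids[:, 0] * centroids[:, 1])[None, :]
    return inter / (area_b + area_c - inter)


def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors, sorted by area."""
    boxes = np.asarray(boxes, dtype=np.float64)
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        # Highest IoU is the same as the smallest (1 - IoU) distance.
        assign = np.argmax(wh_iou(boxes, centroids), axis=1)
        new = np.array([
            boxes[assign == j].mean(axis=0) if np.any(assign == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids[np.argsort(centroids[:, 0] * centroids[:, 1])]


sizes = np.random.default_rng(1).uniform(10, 300, size=(200, 2))
print(kmeans_anchors(sizes).round(1))
```

Sorting by area makes it straightforward to hand the three smallest anchors to the 52×52 layer, the next three to the 26×26 layer and the three largest to the 13×13 layer, which is the usual YOLOv3 convention.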

The training process of the detection model is shown in fig. 6.
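The stopping rule of the training process, ending once the decrease in the objective falls below the termination threshold, can be sketched independently of any deep learning framework. Here `step` is a hypothetical callable that performs one round of parameter updates and returns the current objective value; `max_iters` is an added safety cap not mentioned in the source.

```python
def train_until_converged(step, epsilon, max_iters=1000):
    """Call step() until the objective decreases by less than epsilon.

    Returns (final objective value, number of step() calls made).
    """
    prev = step()
    for i in range(1, max_iters):
        cur = step()
        if prev - cur < epsilon:
            return cur, i + 1
        prev = cur
    return prev, max_iters


# Toy objective values standing in for real training rounds.
losses = iter([10.0, 5.0, 4.0, 3.95, 3.9])
print(train_until_converged(lambda: next(losses), epsilon=0.1))
```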

Flag detection process

The picture p to be identified is input into the detection model trained in step 5 to judge whether a target flag appears in the picture. The predicted label vector y of the target picture p is obtained through the mapping of the supervised flag detection model coefficient matrix W. If y is empty, no flag is detected in the target picture; if y = {a_i, b_i, w_i, h_i, c}, c ∈ {1, 2, …, k}, a flag of category c is detected in the target picture, the center point coordinates of the flag position are a_i, b_i, and the width and height of the flag are w_i, h_i.
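Interpreting the predicted label vector can be sketched as follows. The function name, the use of `None` or an empty tuple for "no flag", and the example category names are assumptions made for illustration; categories are numbered from 1 to k as in the text.

```python
def describe_detection(y, class_names):
    """Turn a predicted label vector (a, b, w, h, c) into a readable result.

    Returns None when y is empty, meaning no flag was detected.
    """
    if not y:
        return None
    a, b, w, h, c = y
    return {
        "category": class_names[c - 1],  # c runs from 1 to k
        "box": (a - w / 2, b - h / 2, a + w / 2, b + h / 2),  # corner format
    }


names = ["flag type 1", "flag type 2"]  # placeholder names for k = 2
print(describe_detection((30, 30, 40, 20, 2), names))
print(describe_detection(None, names))
```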

The above flag detection process is shown in fig. 7.

The following are system embodiments of the present invention that may be used to perform method embodiments of the present invention.

In still another embodiment of the present invention, a flag detection system based on a high-confidence labeling strategy is provided, which can be used to implement the above flag detection method based on a high-confidence labeling strategy. Specifically, the system includes the flag picture acquisition module, the sample enhancement processing module, the sample labeling module, the target flag core feature labeling module, the flag detection model building module and the flag detection module described above.

In yet another embodiment of the present invention, a computer device is provided that includes a processor and a memory, the memory being used to store a computer program comprising program instructions and the processor being used to execute the program instructions stored in the computer storage medium. The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. As the computational and control core of the terminal, it is adapted to load and execute one or more instructions in a computer storage medium to implement the corresponding method flow or function; the processor of this embodiment can be used to perform the flag detection method based on the high-confidence labeling strategy.

Claims

1. A flag detection method based on a high confidence labeling strategy is characterized by comprising the following steps:

step 4, target flag core feature labeling: using a labeling tool, labeling the core feature area of each target flag according to the high-confidence sample labeling strategy of step 3 and recording the flag category, to obtain the label vector Y_i = {a_i, b_i, w_i, h_i, c} of the picture, where a_i, b_i are the center point coordinates of the core feature labeling area, w_i, h_i are the width and height of the core feature labeling area, and c is the flag category to which the core feature area belongs; adding the label vectors of all pictures to the data set of step 2 and dividing it into a training set and a validation set at a ratio of 8:2 to construct a complete flag detection data set;

step 6, flag detection: inputting the picture p to be identified into the trained detection model in the step 5, judging whether a target flag is contained in the picture p or not, and determining the type and the position of the flag;

in step 3, determining the flag high-confidence sample labeling includes: first, comparing all types of flags in the data set and taking the most distinguishable area of each target flag as its core feature, or the complete flag area if no obvious core feature exists; second, constructing the flag high-confidence sample labeling strategy: when the core feature of the target flag is fully visible, labeling it with the smallest labeling box that completely covers the core feature, and leaving the non-core feature areas of the flag unlabeled; when the core feature of the target flag is deformed, labeling the deformed area as usual; when the core feature of the target flag is occluded, labeling it unless more than half of the core feature is occluded;

in step 3, the core feature labeling standard covers the labeling position, the coverage area, and the handling of abnormal cases such as deformation and occlusion.

2. The flag detection method based on a high-confidence labeling strategy according to claim 1, wherein in step 2, the data enhancement operation comprises: first, analyzing the proportion of samples in each flag category in the data collected in step 1; then, for categories with fewer than 100 samples, applying color enhancement, Gaussian noise addition, two-fold enlargement, random rotation, random shearing, horizontal flipping, vertical flipping, and combined horizontal and vertical flipping to the pictures, generating extended samples in numbers similar to those of the other flag categories, and adding the extended samples to the original data set to obtain a new data set.

3. The flag detection method based on a high-confidence labeling strategy according to claim 1, wherein in the target flag core feature labeling of step 4, Y denotes the labeling information of the data; for each picture p_i in the picture data set, based on the core features of the different flag categories, a labeling tool is used to label the core feature area of the target flag according to the labeling strategy of step 3 and to record the flag category, and the center point coordinates, width and height of each labeled area are recorded in the labeling file.

4. The flag detection method based on a high-confidence labeling strategy according to claim 1, wherein in establishing the flag detection model in step 5, for the training sample data set constructed in step 3, k-means clustering is performed on the labeling box areas in all picture labeling information to obtain 9 anchors of different sizes; the pictures in the training set are preprocessed and cropped to 416×416 and converted into an RGB three-channel image matrix X; the image matrix, the anchors and the image labels are input into the detection network model for training to obtain the flag detection network model; the constructed supervised model based on core feature labeling trains a coefficient matrix W that maps the data matrix X to the labeling information matrix Y, and the training objective is as follows:

5. The method for detecting a flag based on a high confidence labeling strategy according to claim 4, wherein the specific training process is as follows:

6. The flag detection method based on a high-confidence labeling strategy according to claim 1, wherein in the flag detection of step 6, the picture p to be identified is input into the detection model trained in step 5 to judge whether a target flag appears in the picture; the predicted label vector y of the target picture p is obtained through the mapping of the supervised flag detection model coefficient matrix W; if y is empty, no flag is detected in the target picture; if y = {a_i, b_i, w_i, h_i, c}, c ∈ {1, 2, …, k}, a flag of category c is detected in the target picture, the center point coordinates of the flag position are a_i, b_i, and the width and height of the flag are w_i, h_i.

7. A flag detection system based on a high confidence labeling strategy, comprising:

the flag detection module is used for inputting the picture p to be identified into the trained detection model, judging whether the picture p contains a target flag or not and determining the type and the position of the flag;

determining the flag high-confidence sample labeling includes: first, comparing all types of flags in the data set and taking the most distinguishable area of each target flag as its core feature, or the complete flag area if no obvious core feature exists; second, constructing the flag high-confidence sample labeling strategy: when the core feature of the target flag is fully visible, labeling it with the smallest labeling box that completely covers the core feature, and leaving the non-core feature areas of the flag unlabeled; when the core feature of the target flag is deformed, labeling the deformed area as usual; when the core feature of the target flag is occluded, labeling it unless more than half of the core feature is occluded;

the core feature labeling standard covers the labeling position, the coverage area, and the handling of abnormal cases such as deformation and occlusion.

8. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the high confidence labeling strategy based flag detection method as claimed in any of claims 1 to 6 when the computer program is executed by the processor.