CN112990300A - Foreground identification method, device, equipment and computer readable storage medium - Google Patents
- Publication number
- CN112990300A (application CN202110265506.5A)
- Authority
- CN
- China
- Prior art keywords
- category
- picture
- recognized
- feature
- characteristic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
Abstract
The application provides a foreground identification method comprising the following steps: first, a feature representation of a picture to be recognized is generated; then, according to the feature representation of the picture to be recognized and pre-generated feature representations corresponding to each of at least one known foreground category and a background category, it is determined whether the feature representation of the picture to be recognized falls within a set feature range of any known foreground category or of the background category; if not, the picture to be recognized is recognized as an unknown foreground category. In this way, whether the image in the picture to be recognized belongs to a known foreground category or the background category is determined through feature comparison, and when it belongs to neither, the picture is recognized as an unknown foreground category rather than being attributed to the background category, which improves the recognition accuracy of foreground categories. The application also provides a foreground identification apparatus, a device and a computer-readable storage medium.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to a foreground identification method, apparatus, device, and computer-readable storage medium.
Background
With the rapid development of deep learning, network models are evolving toward lighter and more efficient designs, and intelligent wearable devices have therefore attracted more and more extensive research. Intelligent wearable devices greatly improve quality of life in shopping, travel, social contact and other aspects, and play a vital role in special fields such as clinical treatment and living assistance for disabled people.
A vision-based blind guiding system built on target detection is one of the important wearable devices for intelligently assisting visually impaired people. Such a system takes images or video as input data, detects target objects (such as pedestrians or vehicles), and then informs the visually impaired person of the detected targets by voice or other means, thereby assisting them in going out, daily living and so on.
However, existing vision-based blind guiding systems built on target detection can only detect a predefined, limited set of foreground categories and classify all remaining categories as background. In special situations, such as encountering a new model of automobile or a rare wild animal, the system classifies such samples as background because they were never seen in the training stage, and therefore cannot remind the visually impaired person in time to avoid them effectively, which creates danger.
Disclosure of Invention
The application provides a foreground identification method, apparatus, device and computer-readable storage medium, which can improve the recognition accuracy of foreground categories.
In a first aspect, the present application provides a foreground identification method, including:
generating a characteristic representation of the picture to be identified;
determining whether the feature representation of the picture to be recognized falls within a set feature range of any known foreground category or of the background category according to the feature representation of the picture to be recognized and pre-generated feature representations corresponding to each of at least one known foreground category and a background category;
and if the characteristic representation of the picture to be recognized does not fall into any known foreground category or the set characteristic range of the background category, recognizing the picture to be recognized as an unknown foreground category.
Optionally, the method further includes:
if the characteristic representation of the picture to be recognized falls into the set characteristic range of any known foreground category, recognizing the picture to be recognized as the known foreground category;
and if the picture to be recognized falls into the set characteristic range of the background category, recognizing the picture to be recognized as the background category.
Optionally, the determining whether the feature representation of the picture to be recognized falls within a set feature range of any known foreground category or the background category includes:
for each known foreground category, calculating a first characteristic distance between the picture to be identified and the known foreground category by utilizing the characteristic representation of the picture to be identified and the characteristic representation of the known foreground category; determining whether the picture to be identified falls into a preset feature range of the known foreground category or not according to whether the first feature distance is smaller than a first preset threshold or not;
for the background category, calculating a second characteristic distance between the picture to be recognized and the background category by utilizing the characteristic representation of the picture to be recognized and the characteristic representation of the background category; and determining whether the picture to be identified falls into a preset feature range of the background category or not according to whether the second feature distance is smaller than a second preset threshold or not.
Optionally, the determining, according to whether the first feature distance is smaller than a first preset threshold, whether the picture to be identified falls within the preset feature range of the known foreground category includes:
and judging whether the first characteristic distance is smaller than a first preset threshold value, if so, determining that the picture to be recognized falls into a preset characteristic range of the known foreground category, and if not, determining that the picture to be recognized does not fall into the preset characteristic range of the known foreground category.
Optionally, the determining, according to whether the second feature distance is smaller than a second preset threshold, whether the picture to be recognized falls within the preset feature range of the background category includes:
and judging whether the second characteristic distance is smaller than a second preset threshold value, if so, determining that the picture to be recognized falls into the preset characteristic range of the background category, and if not, determining that the picture to be recognized does not fall into the preset characteristic range of the background category.
Optionally, the feature representation of the known foreground category includes:
a feature representation of at least one learnable prototype belonging to the known foreground class;
and for each learnable prototype of the known foreground class, a feature representation of a feature space range based on the feature representation of the learnable prototype.
Optionally, the feature representation of the background category includes:
a feature representation of at least one learnable prototype belonging to the context class;
and for each learnable prototype of the background category, a feature representation of a feature space range based on the feature representation of the learnable prototype.
Optionally, the method further includes:
and if the picture to be recognized is recognized as the unknown foreground category, classifying and recognizing the picture to be recognized through a clustering algorithm.
In a second aspect, the present application provides a foreground identification apparatus, including:
the feature generating unit is used for generating a feature representation of the picture to be identified;
the characteristic measurement unit is used for determining whether the characteristic representation of the picture to be identified falls into any set characteristic range of a known foreground category or a background category according to the characteristic representation of the picture to be identified and the characteristic representation corresponding to each of the at least one known foreground category and the background category;
and the foreground identification unit is used for identifying the picture to be identified as an unknown foreground category if the characteristic representation of the picture to be identified does not fall into any known foreground category or the set characteristic range of the background category.
Optionally, the foreground identifying unit is further configured to:
if the characteristic representation of the picture to be recognized falls into the set characteristic range of any known foreground category, recognizing the picture to be recognized as the known foreground category;
and if the picture to be recognized falls into the set characteristic range of the background category, recognizing the picture to be recognized as the background category.
Optionally, when determining whether the feature representation of the to-be-identified picture falls within a set feature range of any known foreground category or the background category, the feature measurement unit is specifically configured to:
for each known foreground category, calculating a first characteristic distance between the picture to be identified and the known foreground category by utilizing the characteristic representation of the picture to be identified and the characteristic representation of the known foreground category; determining whether the picture to be identified falls into a preset feature range of the known foreground category or not according to whether the first feature distance is smaller than a first preset threshold or not;
for the background category, calculating a second characteristic distance between the picture to be recognized and the background category by utilizing the characteristic representation of the picture to be recognized and the characteristic representation of the background category; and determining whether the picture to be identified falls into a preset feature range of the background category or not according to whether the second feature distance is smaller than a second preset threshold or not.
Optionally, the feature measurement unit is specifically configured to, when determining whether the to-be-identified picture falls within a preset feature range of the known foreground category according to whether the first feature distance is smaller than a first preset threshold:
and judging whether the first characteristic distance is smaller than a first preset threshold value, if so, determining that the picture to be recognized falls into a preset characteristic range of the known foreground category, and if not, determining that the picture to be recognized does not fall into the preset characteristic range of the known foreground category.
Optionally, the feature measurement unit is specifically configured to, when determining whether the to-be-identified picture falls within a preset feature range of the background category according to whether the second feature distance is smaller than a second preset threshold:
and judging whether the second characteristic distance is smaller than a second preset threshold value, if so, determining that the picture to be recognized falls into the preset characteristic range of the background category, and if not, determining that the picture to be recognized does not fall into the preset characteristic range of the background category.
Optionally, the feature representation of the known foreground category includes:
a feature representation of at least one learnable prototype belonging to the known foreground class;
and for each learnable prototype of the known foreground class, a feature representation of a feature space range based on the feature representation of the learnable prototype.
Optionally, the feature representation of the background category includes:
a feature representation of at least one learnable prototype belonging to the context class;
and for each learnable prototype of the background category, a feature representation of a feature space range based on the feature representation of the learnable prototype.
Optionally, the apparatus further comprises:
and the classification identification unit is used for performing classification identification on the picture to be identified through a clustering algorithm if the picture to be identified is identified as the unknown foreground category.
In a third aspect, the present application provides an electronic device, comprising: a processor, a memory;
the memory for storing a computer program;
the processor is used for executing the foreground identification method by calling the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the foreground identification method described above.
In the technical solution provided by the application, a feature representation of the picture to be recognized is generated first; then, according to the feature representation of the picture to be recognized and the pre-generated feature representations corresponding to each of at least one known foreground category and a background category, it is determined whether the feature representation of the picture to be recognized falls within the set feature range of any known foreground category or of the background category; if not, the picture to be recognized is recognized as an unknown foreground category. In this way, the application determines through feature comparison whether the image in the picture to be recognized belongs to a known foreground category or the background category, and when it belongs to neither, recognizes the picture as an unknown foreground category instead of attributing it to the background category, thereby improving the recognition accuracy of foreground categories.
Drawings
Fig. 1 is a schematic flowchart of a foreground identification method shown in the present application;
fig. 2 is a schematic diagram illustrating a foreground recognition apparatus according to the present application;
fig. 3 is a schematic structural diagram of an electronic device shown in the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application; rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. The word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining", depending on the context.
Referring to fig. 1, a schematic flow chart of a foreground identification method provided in an embodiment of the present application is shown, where the method includes the following steps:
s101: and generating a characteristic representation of the picture to be identified.
In the embodiment of the present application, for a current image that needs to be recognized (for example, an image captured by a wearable device used by a visually impaired person), it is necessary to identify which parts of the current image are foreground images and which are background images, and to identify the one or more image categories contained in the foreground. Therefore, in the actual image recognition process, one or more pictures to be recognized can be divided from the current image; each picture to be recognized may be a region framed from the current image by a candidate box or in another manner. Each picture to be recognized belongs either to a background image or to a foreground image, and if it belongs to a foreground image, it belongs to a specific image category (such as a car, a tree, and the like).
It should be noted that, when it is determined that there are one or more pictures to be recognized in the current image, the subsequent content will introduce the image recognition process with respect to one of the pictures to be recognized, and the recognition manner of other pictures to be recognized is the same.
In S101, for a certain picture to be recognized, features of the picture to be recognized need to be analyzed to generate a feature representation of the picture to be recognized, so as to recognize image categories in the picture to be recognized by using the feature representation through subsequent steps, that is, determine whether the picture to be recognized is a foreground image or a background image, and when the picture to be recognized is a foreground image, determine whether the picture to be recognized belongs to a known foreground category or an unknown foreground category.
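As a concrete illustration of S101, a feature representation can be produced by pooling the candidate region and projecting it into an embedding space. The sketch below is a minimal stand-in, assuming `projection` plays the role of a trained backbone network; the function and parameter names are hypothetical and not taken from the application:

```python
import numpy as np

def extract_feature(patch: np.ndarray, projection: np.ndarray) -> np.ndarray:
    """Generate a fixed-length feature representation of one picture to be recognized.

    `patch` is an H x W x C image region; `projection` is a stand-in for a
    trained backbone network (a hypothetical placeholder, not the application's model).
    """
    pooled = patch.mean(axis=(0, 1))               # global average pool over pixels -> (C,)
    feat = projection @ pooled                     # map into a D-dimensional embedding
    return feat / (np.linalg.norm(feat) + 1e-12)   # L2-normalize for distance comparison

rng = np.random.default_rng(0)
patch = rng.random((32, 32, 3))                    # one candidate region
projection = rng.standard_normal((128, 3))         # D = 128
feature = extract_feature(patch, projection)       # feature representation of the patch
```

The normalized vector can then be compared against category feature representations by distance in the subsequent steps.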
S102: and determining whether the characteristic representation of the picture to be recognized falls into a set characteristic range of any known foreground category or background category according to the characteristic representation of the picture to be recognized and the characteristic representation corresponding to at least one known foreground category and background category.
In the embodiment of the present application, it is necessary to generate feature representations of one or more known foreground categories and feature representations of a background image in advance, so as to identify an image category of the to-be-identified picture based on the feature representations of the to-be-identified picture and the feature representations corresponding to the known foreground categories and the background categories.
In one implementation of the embodiment of the present application, for each known foreground class, the feature representation of the known foreground class includes: a feature representation of at least one learnable prototype belonging to the known foreground class, and, for each learnable prototype of the known foreground class, a feature representation of a feature space range with the feature representation of the learnable prototype as a reference; similarly, the feature representation of the background category includes: the feature representation of the feature space range is based on the feature representation of the learnable prototype for each learnable prototype of the background category.
Specifically, in this implementation, N known foreground classes may be predefined, namely class 1, class 2, ..., class N, with the background class labeled class 0. K learnable prototypes are set for each known foreground class and the background class (for example, when a certain foreground class is the car class, K different types of car prototypes can be selected). In this way, each known foreground class can be characterized by its K learnable prototypes; that is, for each known foreground class, K features representing that class can be obtained through feature learning and used as the feature representation of that class. Similarly, K features representing the background class may be obtained through feature learning and used as the feature representation of the background class.
The above-mentioned feature representation sets of the N known foreground categories and the background category can be defined symbolically, denoted as the set M = {m_i^k | 0 ≤ i ≤ N, 1 ≤ k ≤ K}, where K ≥ 1, N ≥ 1, and m_i^k represents the kth feature of the ith class. In particular, when K is 1 (i.e., each known foreground class can be characterized by 1 learnable prototype), the set reduces to M = {m_i | 0 ≤ i ≤ N}.
for each known foreground category and background category, a learnable radius characterization can be set for each prototype of each category, and recorded as a setWherein K is more than or equal to 1, N is more than or equal to 1,a feature representation representing a feature space range with reference to the kth feature of the ith class, which may specifically be a feature space within a learnable radius with reference to the kth featureAnd (3) a range.
The following describes how M and R are generated.
Prototype representations are established for the predefined N known foreground categories and the background category by training an open set target detection model: through prototype learning, a prototype anchor point of each category in the feature space and a corresponding feature space range are established. Specifically, the prototype anchor point representations M and the prototype radius representations R of the N known foreground categories and the background category are obtained through learning.
The open set target detection algorithm used by the open set target detection model can be regarded as a natural extension of traditional target detection, and can be applied to most of the existing mainstream general target detection algorithms, for example the one-stage detectors FCOS, CenterNet and RetinaNet, and the two-stage detector Faster R-CNN. Therefore, the present application adopts an open set target detection algorithm based on general target detection, introduces the open set concept into target detection, is applicable to most mainstream target detection algorithms, does not restrict the detection framework, and can therefore serve more application scenarios.
Existing target detection algorithms can be roughly divided into two parts: a feature map is extracted through the feature extraction backbone network, and the final detection boxes and category classification are obtained through the target detection head. In the embodiment of the application, a RetinaNet detection model is used as the detection framework, and Focal Loss is adopted in training for classifying the candidate boxes in the sample images:

FL(p_t) = -(1 - p_t)^γ · log(p_t) (1)

wherein p_t represents the probability predicted by the model; FL(p_t) represents the Focal Loss; γ is a hyper-parameter, the focusing parameter, used to scale down the loss of well-classified candidates.
In the open set target detection algorithm provided by the embodiment of the application, only the target detection head part needs to be modified, and each known foreground class and background class are made to learn prototype representation through prototype learning.
In order to implement prototype learning, K learnable prototypes can be set for each known foreground category and the background category, recorded as the set M = {m_i^k}; that is, each known foreground and background category may be characterized by K learnable prototypes. When K is 1, the set is denoted M = {m_i}; that is, each known foreground and background category is characterized by 1 learnable prototype.
The following description is given with K = 1; when K > 1, the algorithm process is similar.
When training the open set target detection model, a loss function for prototype learning constrains the feature representations of each class (i.e., the N known foreground classes and the background class) to cluster in the feature space as close as possible to the corresponding prototype anchor point (i.e., the prototype representation); this prototype loss is denoted pl and given as formula (2).
wherein f_i represents the classification layer feature of the target detection head; N represents the total number of categories; pl is the name of the loss function (pl loss); m_i represents the prototype representation of the ith category.
Because the feature space distribution of each category is different, a range parameter can further be learned, namely the radius representation of each known foreground category and the background category, recorded as the set R = {r_i^k}. When K is 1, it is recorded as R = {r_i}. The corresponding loss function for the prototype radius is denoted plr and given as formula (3).
plr can be regarded as a regularization term of pl in formula (2): plr further constrains the feature space distribution of the classes, so that intra-class features are distributed more compactly in the feature space and inter-class feature distributions are more separated, which helps obtain a more robust classification.
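A minimal sketch of prototype learning with K = 1, assuming a squared-distance clustering form for pl and a hinge on the learnable radius for plr; both concrete forms are assumptions consistent with the description of formulas (2) and (3), not the application's exact definitions:

```python
import numpy as np

def pl_loss(features, M, labels):
    """Assumed clustering form of pl: pull each feature toward the prototype m_i of its class."""
    diffs = features - M[labels]                  # (B, D) offsets from the class prototypes
    return float(np.mean(np.sum(diffs ** 2, axis=1)))

def plr_loss(features, M, R, labels):
    """Assumed hinge form of plr: penalize features lying outside the learnable radius r_i."""
    d2 = np.sum((features - M[labels]) ** 2, axis=1)
    return float(np.mean(np.maximum(0.0, d2 - R[labels] ** 2)))
```

Features sitting exactly on their prototypes incur zero pl and zero plr; features straying beyond the radius of their class are penalized by plr, which is what makes the intra-class distribution compact.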
Based on the above equations (1) - (3), the final Loss function Loss of the open set target detection model training is as follows:
Loss = FL(p_t) + α·pl + β·plr (4)
wherein α and β are weight parameters; for the meanings of FL(p_t), pl and plr, see formulae (1) to (3) above.
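The combination in formula (4) is then a weighted sum; a direct sketch, in which the default weight values are illustrative:

```python
def total_loss(fl: float, pl: float, plr: float,
               alpha: float = 0.5, beta: float = 0.5) -> float:
    """Loss = FL(p_t) + alpha * pl + beta * plr, formula (4)."""
    return fl + alpha * pl + beta * plr
```

alpha and beta trade off classification quality against prototype compactness during training.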
The prototype anchor point representation M and the prototype radius representation R of N known foreground types and background types can be obtained through the prototype learning. Wherein, for each known foreground class, the prototype anchor point representation in M about the known foreground class and the prototype radius representation in R about the known foreground class can be used as feature representations of the known foreground class; for a background class, the prototype anchor point characterization in M for that background class and the prototype radius characterization in R for that background class may be used as feature representations for that background class.
As described above, each known foreground category and the background category may be represented by K learnable prototypes, that is, each category is represented by K features. The number of features of each category may be the same, or the numbers of features corresponding to the known foreground categories and the background category may be partially or entirely different. In addition, the above description uses the RetinaNet framework as the target detection framework; it can be replaced with other general target detection frameworks.
Then, according to the feature representation of the picture to be recognized and the feature representations corresponding to the N known foreground categories and the background category, it may be determined whether the feature representation of the picture to be recognized falls within the set feature range of one of the known foreground categories, or within the set feature range of the background category. The set feature range of a known foreground category may be the feature range determined by the prototype anchor point representation and the prototype radius representation of that known foreground category; the set feature range of the background category may be the feature range determined by the prototype anchor point representation and the prototype radius representation of the background category.
In one implementation manner of the embodiment of the present application, the "determining whether the feature representation of the picture to be recognized falls within a set feature range of any known foreground category or background category" in S102 may include:
for each known foreground category, calculating a first characteristic distance between the picture to be recognized and the known foreground category by utilizing the characteristic representation of the picture to be recognized and the characteristic representation of the known foreground category; determining whether the picture to be identified falls into a preset feature range of the known foreground category or not according to whether the first feature distance is smaller than a first preset threshold or not;
for the background category, calculating a second characteristic distance between the picture to be recognized and the background category by utilizing the characteristic representation of the picture to be recognized and the characteristic representation of the background category; and determining whether the picture to be recognized falls into the preset feature range of the background category or not according to whether the second feature distance is smaller than a second preset threshold or not.
Specifically, in this implementation manner, for each known foreground category, a first feature distance between the picture to be recognized and the known foreground category may be calculated by using the feature representation of the picture to be recognized and the feature representation of the known foreground category. It is then determined whether the first feature distance is smaller than a first preset threshold: if so, the picture to be recognized is determined to fall within the preset feature range of the known foreground category; if not, it is determined not to fall within that range. Similarly, the feature representation of the picture to be recognized and the feature representation of the background category can be used to calculate a second feature distance between the picture to be recognized and the background category, and it is determined whether the second feature distance is smaller than a second preset threshold: if so, the picture to be recognized is determined to fall within the preset feature range of the background category; if not, it is determined not to. The first preset threshold and the second preset threshold may be the same or different.
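The decision logic of this implementation can be sketched as follows. The Euclidean metric, the min-over-prototypes reduction, and all names here are illustrative assumptions on my part; the embodiment deliberately leaves the distance metric open.

```python
import numpy as np

def min_prototype_distance(feature, anchors):
    # Smallest Euclidean distance from the feature representation to any
    # of a category's prototype anchors (one illustrative metric choice).
    return np.linalg.norm(anchors - feature, axis=-1).min()

def classify(feature, fg_anchors, bg_anchors, t1, t2):
    # fg_anchors: (N, K, D) anchors of the N known foreground categories;
    # bg_anchors: (K, D) anchors of the background category;
    # t1, t2: the first and second preset thresholds from the text.
    for idx, anchors in enumerate(fg_anchors):
        if min_prototype_distance(feature, anchors) < t1:  # first feature distance
            return f"known_foreground_{idx}"
    if min_prototype_distance(feature, bg_anchors) < t2:   # second feature distance
        return "background"
    return "unknown_foreground"                            # corresponds to step S103
```

With this sketch, a picture whose feature representation lies far from every foreground prototype and every background prototype falls through both checks and is labeled as an unknown foreground category.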
The embodiments of the present application do not limit the distance metric used: the Euclidean distance, the cosine distance, the Mahalanobis distance, or a combination of multiple distances, among others, may all be used.
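The candidate metrics named above might look like this in code; the combination weights in `combined` are an arbitrary illustration of "combined calculation of various distances", not values from the patent.

```python
import numpy as np

def euclidean(a, b):
    return float(np.linalg.norm(a - b))

def cosine_distance(a, b):
    # 1 - cosine similarity; 0 for parallel vectors, 2 for opposite ones.
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def mahalanobis(a, b, cov_inv):
    # cov_inv is the inverse covariance matrix of the feature distribution;
    # with cov_inv = identity this reduces to the Euclidean distance.
    d = a - b
    return float(np.sqrt(d @ cov_inv @ d))

def combined(a, b, cov_inv, w=(0.5, 0.3, 0.2)):
    # One arbitrary way to combine several distances: a weighted sum.
    return (w[0] * euclidean(a, b)
            + w[1] * cosine_distance(a, b)
            + w[2] * mahalanobis(a, b, cov_inv))
```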
S103: and if the characteristic representation of the picture to be recognized does not fall into the set characteristic range of any known foreground category or background category, recognizing the picture to be recognized as an unknown foreground category.
In the embodiment of the present application, if it is determined through step S102 that the feature representation of the picture to be recognized does not fall within the set feature range of any known foreground category or the background category, the picture to be recognized is identified as an unknown foreground category. Conversely, if it is determined through step S102 that the feature representation of the picture to be recognized falls within the set feature range of a known foreground category, the picture to be recognized is recognized as that known foreground category; and if it is determined through step S102 that the picture to be recognized falls within the set feature range of the background category, the picture to be recognized is recognized as the background category.
Further, the embodiment of the present application may further include: and if the picture to be identified is identified as the unknown foreground category, classifying and identifying the picture to be identified through a clustering algorithm. That is, if the picture to be recognized is recognized as the unknown foreground category, any clustering algorithm existing or appearing in the future may be adopted to recognize the specific category to which the image in the picture to be recognized belongs, for example, it is recognized that the image in the picture to be recognized is a new type of automobile.
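As a toy stand-in for the "any clustering algorithm" mentioned above, the following threshold-based grouping illustrates how pictures identified as unknown foreground might be split into candidate new categories. The algorithm and the `eps` parameter are illustrative assumptions; the patent prescribes no particular clustering method.

```python
import numpy as np

def greedy_cluster(features, eps):
    # Toy threshold-based grouping: a feature joins the first cluster whose
    # representative (that cluster's first member) lies within eps, and
    # otherwise starts a new cluster. Returns one cluster label per feature.
    reps, labels = [], []
    for f in features:
        for i, r in enumerate(reps):
            if np.linalg.norm(f - r) < eps:
                labels.append(i)
                break
        else:
            reps.append(f)
            labels.append(len(reps) - 1)
    return labels
```

Each resulting cluster can then be inspected as a candidate specific category, e.g. the "new type of automobile" example in the text.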
In the foreground identification method provided in the embodiment of the present application, a feature representation of a to-be-identified picture is generated first, and then, according to the feature representation of the to-be-identified picture and a pre-generated feature representation corresponding to each of at least one known foreground category and a background category, whether the feature representation of the to-be-identified picture falls within a set feature range of any known foreground category or any known background category is determined, and if not, the to-be-identified picture is identified as the unknown foreground category. Therefore, the embodiment of the application determines whether the characteristic representation of the picture to be identified falls into the set characteristic range of the known foreground category or the background category through characteristic comparison, so as to determine whether the image in the picture to be identified belongs to the known foreground category or the background category, and when the image does not belong to the known foreground category or the background category, the picture to be identified is identified as the unknown foreground category, the unknown foreground category is not classified as the background category, and the identification accuracy of the foreground category is improved.
Referring to fig. 2, a schematic composition diagram of a foreground identification apparatus provided in an embodiment of the present application is shown, where the apparatus includes:
a feature generation unit 210, configured to generate a feature representation of a picture to be identified;
a feature measurement unit 220, configured to determine whether the feature representation of the to-be-identified picture falls within a set feature range of any known foreground category or any known background category according to the feature representation of the to-be-identified picture and feature representations corresponding to at least one known foreground category and the background category;
a foreground identifying unit 230, configured to identify the picture to be identified as an unknown foreground category if the feature of the picture to be identified does not fall within a set feature range of any known foreground category or the background category.
In an implementation manner of the embodiment of the present application, the foreground identifying unit 230 is further configured to:
if the characteristic representation of the picture to be recognized falls into the set characteristic range of any known foreground category, recognizing the picture to be recognized as the known foreground category;
and if the picture to be recognized falls into the set characteristic range of the background category, recognizing the picture to be recognized as the background category.
In an implementation manner of the embodiment of the present application, when determining whether the feature representation of the to-be-recognized picture falls within a set feature range of any known foreground category or the background category, the feature measurement unit 220 is specifically configured to:
for each known foreground category, calculating a first characteristic distance between the picture to be identified and the known foreground category by utilizing the characteristic representation of the picture to be identified and the characteristic representation of the known foreground category; determining whether the picture to be identified falls into a preset feature range of the known foreground category or not according to whether the first feature distance is smaller than a first preset threshold or not;
for the background category, calculating a second characteristic distance between the picture to be recognized and the background category by utilizing the characteristic representation of the picture to be recognized and the characteristic representation of the background category; and determining whether the picture to be identified falls into a preset feature range of the background category or not according to whether the second feature distance is smaller than a second preset threshold or not.
In an implementation manner of the embodiment of the present application, when determining whether the to-be-identified picture falls within a preset feature range of the known foreground category according to whether the first feature distance is smaller than a first preset threshold, the feature measurement unit 220 is specifically configured to:
and judging whether the first characteristic distance is smaller than a first preset threshold value, if so, determining that the picture to be recognized falls into a preset characteristic range of the known foreground category, and if not, determining that the picture to be recognized does not fall into the preset characteristic range of the known foreground category.
In an implementation manner of the embodiment of the present application, when determining whether the to-be-identified picture falls within a preset feature range of the background category according to whether the second feature distance is smaller than a second preset threshold, the feature measurement unit 220 is specifically configured to:
and judging whether the second characteristic distance is smaller than a second preset threshold value, if so, determining that the picture to be recognized falls into the preset characteristic range of the background category, and if not, determining that the picture to be recognized does not fall into the preset characteristic range of the background category.
In an implementation manner of the embodiment of the present application, the feature representation of the known foreground category includes:
a feature representation of at least one learnable prototype belonging to the known foreground class;
and for each learnable prototype of the known foreground class, a feature representation of a feature space range based on the feature representation of the learnable prototype.
In an implementation manner of the embodiment of the present application, the feature representation of the context category includes:
a feature representation of at least one learnable prototype belonging to the context class;
and for each learnable prototype of the background category, a feature representation of a feature space range based on the feature representation of the learnable prototype.
In an implementation manner of the embodiment of the present application, the apparatus further includes:
and the classification identification unit is used for performing classification identification on the picture to be identified through a clustering algorithm if the picture to be identified is identified as the unknown foreground category.
The implementation process of the functions and actions of each unit in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the application. One of ordinary skill in the art can understand and implement it without inventive effort.
An embodiment of the present application further provides an electronic device, a schematic structural diagram of the electronic device is shown in fig. 3, the electronic device 3000 includes at least one processor 3001, a memory 3002, and a bus 3003, and the at least one processor 3001 is electrically connected to the memory 3002; the memory 3002 is configured to store at least one computer executable instruction, and the processor 3001 is configured to execute the at least one computer executable instruction so as to perform the steps of any of the foreground identification methods as provided by any of the embodiments or any alternative embodiments of the present application.
Further, the processor 3001 may be an FPGA (Field-Programmable Gate Array) or another device with logic processing capability, such as an MCU (Microcontroller Unit) or a CPU (Central Processing Unit).
By applying the method and the device, whether the characteristic representation of the picture to be identified falls into the set characteristic range of the known foreground category or the background category is determined through characteristic comparison, whether the image in the picture to be identified belongs to the known foreground category or the background category is determined, and when the image does not belong to the known foreground category or the background category, the picture to be identified is identified as the unknown foreground category, the unknown foreground category is not classified as the background category, and the identification accuracy of the foreground category is improved.
The embodiment of the present application further provides another computer-readable storage medium, which stores a computer program, where the computer program is used for implementing the steps of any one of the foreground identification methods provided in any one of the embodiments or any one of the optional implementation manners of the present application when the computer program is executed by a processor.
The computer-readable storage medium provided by the embodiments of the present application includes, but is not limited to, any type of disk including floppy disks, hard disks, optical disks, CD-ROMs, and magneto-optical disks, ROMs (Read-Only Memories), RAMs (Random Access Memories), EPROMs (Erasable Programmable Read-Only Memories), EEPROMs (Electrically Erasable Programmable Read-Only Memories), flash memories, magnetic cards, or optical cards. That is, a readable storage medium includes any medium that stores or transmits information in a form readable by a device (e.g., a computer).
By applying the method and the device, whether the characteristic representation of the picture to be identified falls into the set characteristic range of the known foreground category or the background category is determined through characteristic comparison, whether the image in the picture to be identified belongs to the known foreground category or the background category is determined, and when the image does not belong to the known foreground category or the background category, the picture to be identified is identified as the unknown foreground category, the unknown foreground category is not classified as the background category, and the identification accuracy of the foreground category is improved.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the scope of protection of the present application.
Claims (11)
1. A foreground identification method, comprising:
generating a characteristic representation of the picture to be identified;
determining whether the characteristic representation of the picture to be recognized falls into a set characteristic range of any known foreground category or the background category according to the characteristic representation of the picture to be recognized and the characteristic representation corresponding to at least one known foreground category and background category;
and if the characteristic representation of the picture to be recognized does not fall into any known foreground category or the set characteristic range of the background category, recognizing the picture to be recognized as an unknown foreground category.
2. The method of claim 1, further comprising:
if the characteristic representation of the picture to be recognized falls into the set characteristic range of any known foreground category, recognizing the picture to be recognized as the known foreground category;
and if the picture to be recognized falls into the set characteristic range of the background category, recognizing the picture to be recognized as the background category.
3. The method according to claim 1, wherein the determining whether the feature representation of the picture to be recognized falls within a set feature range of any known foreground category or the background category comprises:
for each known foreground category, calculating a first characteristic distance between the picture to be identified and the known foreground category by utilizing the characteristic representation of the picture to be identified and the characteristic representation of the known foreground category; determining whether the picture to be identified falls into a preset feature range of the known foreground category or not according to whether the first feature distance is smaller than a first preset threshold or not;
for the background category, calculating a second characteristic distance between the picture to be recognized and the background category by utilizing the characteristic representation of the picture to be recognized and the characteristic representation of the background category; and determining whether the picture to be identified falls into a preset feature range of the background category or not according to whether the second feature distance is smaller than a second preset threshold or not.
4. The method according to claim 3, wherein determining whether the picture to be recognized falls within a preset feature range of the known foreground category according to whether the first feature distance is smaller than a first preset threshold includes:
and judging whether the first characteristic distance is smaller than a first preset threshold value, if so, determining that the picture to be recognized falls into a preset characteristic range of the known foreground category, and if not, determining that the picture to be recognized does not fall into the preset characteristic range of the known foreground category.
5. The method according to claim 3, wherein determining whether the picture to be recognized falls within a preset feature range of the background category according to whether the second feature distance is smaller than a second preset threshold includes:
and judging whether the second characteristic distance is smaller than a second preset threshold value, if so, determining that the picture to be recognized falls into the preset characteristic range of the background category, and if not, determining that the picture to be recognized does not fall into the preset characteristic range of the background category.
6. The method of claim 1, wherein the feature representation of the known foreground class comprises:
a feature representation of at least one learnable prototype belonging to the known foreground class;
and for each learnable prototype of the known foreground class, a feature representation of a feature space range based on the feature representation of the learnable prototype.
7. The method of claim 1, wherein the feature representation of the context class comprises:
a feature representation of at least one learnable prototype belonging to the context class;
and for each learnable prototype of the background category, a feature representation of a feature space range based on the feature representation of the learnable prototype.
8. The method according to any one of claims 1-7, further comprising:
and if the picture to be recognized is recognized as the unknown foreground category, classifying and recognizing the picture to be recognized through a clustering algorithm.
9. A foreground recognition apparatus, comprising:
the characteristic generating unit is used for generating a characteristic representation set of the picture to be identified;
the characteristic measurement unit is used for determining whether the characteristic representation of the picture to be identified falls into any set characteristic range of a known foreground category or a background category according to the characteristic representation of the picture to be identified and the characteristic representation corresponding to each of the at least one known foreground category and the background category;
and the foreground identification unit is used for identifying the picture to be identified as an unknown foreground category if the characteristic representation of the picture to be identified does not fall into any known foreground category or the set characteristic range of the background category.
10. An electronic device, comprising: a processor, a memory;
the memory for storing a computer program;
the processor is configured to execute the foreground identification method according to any one of claims 1 to 8 by calling the computer program.
11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the foreground recognition method of any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110265506.5A CN112990300A (en) | 2021-03-11 | 2021-03-11 | Foreground identification method, device, equipment and computer readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110265506.5A CN112990300A (en) | 2021-03-11 | 2021-03-11 | Foreground identification method, device, equipment and computer readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112990300A true CN112990300A (en) | 2021-06-18 |
Family
ID=76336356
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110265506.5A Pending CN112990300A (en) | 2021-03-11 | 2021-03-11 | Foreground identification method, device, equipment and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112990300A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108961267A (en) * | 2018-06-19 | 2018-12-07 | Guangdong OPPO Mobile Telecommunications Corp., Ltd. | Image processing method, picture processing unit and terminal device |
CN111291817A (en) * | 2020-02-17 | 2020-06-16 | Beijing Megvii Technology Co., Ltd. | Image recognition method and device, electronic equipment and computer readable medium |
CN111639653A (en) * | 2020-05-08 | 2020-09-08 | Zhejiang Dahua Technology Co., Ltd. | False detection image determining method, device, equipment and medium |
CN111834004A (en) * | 2020-05-25 | 2020-10-27 | Hangzhou Shenrui Bolian Technology Co., Ltd. | Unknown disease category identification method and device based on centralized space learning |
US20210019892A1 (en) * | 2019-07-15 | 2021-01-21 | Google Llc | Video Background Substraction Using Depth |
- 2021-03-11: application CN202110265506.5A filed in China (CN), published as CN112990300A; status: Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108961267A (en) * | 2018-06-19 | 2018-12-07 | Guangdong OPPO Mobile Telecommunications Corp., Ltd. | Image processing method, picture processing unit and terminal device |
US20210019892A1 (en) * | 2019-07-15 | 2021-01-21 | Google Llc | Video Background Substraction Using Depth |
CN111291817A (en) * | 2020-02-17 | 2020-06-16 | Beijing Megvii Technology Co., Ltd. | Image recognition method and device, electronic equipment and computer readable medium |
CN111639653A (en) * | 2020-05-08 | 2020-09-08 | Zhejiang Dahua Technology Co., Ltd. | False detection image determining method, device, equipment and medium |
CN111834004A (en) * | 2020-05-25 | 2020-10-27 | Hangzhou Shenrui Bolian Technology Co., Ltd. | Unknown disease category identification method and device based on centralized space learning |
Non-Patent Citations (2)
Title |
---|
PANDIT, TRUPTI M ET AL.: "Suspicious Object Detection In Surveillance Videos For Security Applications", 2016 International Conference on Inventive Computation Technologies (ICICT), 1 March 2017 (2017-03-01), pages 309 - 313 *
WANG, QIUHONG: "UAV-based moving target monitoring in multi-obstacle environments", China Master's Theses Full-text Database, Engineering Science and Technology II, 15 February 2020 (2020-02-15), pages 031 - 1221 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111444821B (en) | Automatic identification method for urban road signs | |
CN110020592B (en) | Object detection model training method, device, computer equipment and storage medium | |
CN112131978B (en) | Video classification method and device, electronic equipment and storage medium | |
CN106897738B (en) | A kind of pedestrian detection method based on semi-supervised learning | |
Salimi et al. | Visual-based trash detection and classification system for smart trash bin robot | |
US20170364742A1 (en) | Lip-reading recognition method and apparatus based on projection extreme learning machine | |
CN111813997B (en) | Intrusion analysis method, device, equipment and storage medium | |
JP2022141931A (en) | Method and device for training living body detection model, method and apparatus for living body detection, electronic apparatus, storage medium, and computer program | |
CN106372666B (en) | A kind of target identification method and device | |
CN109255289B (en) | Cross-aging face recognition method based on unified generation model | |
CN106407911A (en) | Image-based eyeglass recognition method and device | |
JP6756406B2 (en) | Image processing equipment, image processing method and image processing program | |
CN105956570B (en) | Smiling face's recognition methods based on lip feature and deep learning | |
CN110276252B (en) | Anti-expression-interference face recognition method based on generative countermeasure network | |
CN113869449A (en) | Model training method, image processing method, device, equipment and storage medium | |
CN113255557A (en) | Video crowd emotion analysis method and system based on deep learning | |
CN113724286A (en) | Method and device for detecting saliency target and computer-readable storage medium | |
CN117437691A (en) | Real-time multi-person abnormal behavior identification method and system based on lightweight network | |
CN109815887B (en) | Multi-agent cooperation-based face image classification method under complex illumination | |
Chang et al. | Robust abandoned object detection and analysis based on online learning | |
CN111652080A (en) | Target tracking method and device based on RGB-D image | |
KR101847175B1 (en) | Method for object recognition and apparatus thereof | |
CN112990300A (en) | Foreground identification method, device, equipment and computer readable storage medium | |
CN107180244B (en) | Image detection method and device based on cascade classifier | |
Yao et al. | Extracting robust distribution using adaptive Gaussian Mixture Model and online feature selection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210618 |