CN113963189A - Object classification method and device, electronic equipment and storage medium - Google Patents
- Publication number: CN113963189A
- Application number: CN202010629378.3A
- Authority
- CN
- China
- Prior art keywords
- information
- image
- category
- classified
- preset
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING
- G06F18/24—Classification techniques (G06F—Electric digital data processing; G06F18/00—Pattern recognition; G06F18/20—Analysing)
- G06N3/045—Combinations of networks (G06N—Computing arrangements based on specific computational models; G06N3/00—Computing arrangements based on biological models; G06N3/02—Neural networks; G06N3/04—Architecture, e.g. interconnection topology)
- G06N3/08—Learning methods (G06N3/02—Neural networks)
Abstract
The application provides an object classification method, an object classification device, an electronic device, and a computer-readable storage medium. The object classification method comprises the following steps: acquiring a state image containing an object to be classified; performing feature extraction processing on the state image to obtain information of a first image feature of the state image; determining information of a first identification category of the object to be classified according to the information of the first image feature; and, when the information of the first identification category is detected to be the same as information of a preset reference category, acquiring information of a second identification category of the object to be classified as information of a target category of the object to be classified, wherein the second identification category belongs to the first identification category. With the application, objects can be classified at a fine granularity, and the accuracy of object classification is improved.
Description
Technical Field
The present application relates to the field of computer vision technologies, and in particular, to an object classification method and apparatus, an electronic device, and a computer-readable storage medium.
Background
In recent years, computer vision technology has developed rapidly. It uses imaging systems in place of visual organs as input sensors, and a computer in place of the brain to perform processing and interpretation. Classification is one of the most common processing tasks in computer vision. Accurate object classification has positive significance for many application scenarios, such as counting whether a certain object appears in a certain place or sorting express parcels.
In the prior art, a target classification network is generally trained on images collected for each class, so that the network captures the characteristic features of each class and can then identify the class of an object from an image.
However, in the course of research and practice on the prior art, the inventors found that when the objects to be recognized include both objects whose features differ greatly and objects whose features are highly similar, the classification result may be wrong, because it is difficult for an existing target classification network to capture the distinguishing features between objects with high similarity. For example, in an express sorting scene, four kinds of objects in the sorting yard need to be identified: two visually similar types of forklift, people, and tools. Because the two forklift types are highly similar to each other, inaccurate target classification easily occurs. Existing target classification networks therefore struggle to classify objects at a fine granularity, so the accuracy of object classification is relatively low.
Disclosure of Invention
The application provides an object classification method and apparatus, an electronic device, and a computer-readable storage medium, aiming to solve the problem that the accuracy of object classification is relatively low because existing target classification networks find it difficult to classify objects accurately at a fine granularity.
In a first aspect, the present application provides a method of classifying an object, the method comprising:
acquiring a state image containing an object to be classified;
performing feature extraction processing on the state image to obtain information of a first image feature of the state image;
determining information of a first identification category of the object to be classified according to the information of the first image characteristic;
when the information of the first identification category is detected to be the same as information of a preset reference category, acquiring information of a second identification category of the object to be classified as information of a target category of the object to be classified, wherein the second identification category belongs to the first identification category.
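The two-stage decision of the method above can be sketched in Python. This is an illustrative sketch under assumptions, not the patented implementation: `coarse_classify`, `fine_classify`, and the category names are hypothetical stand-ins for the first and second classification sub-networks and their outputs.

```python
def classify_object(state_image, coarse_classify, fine_classify, reference_categories):
    """Two-stage cascade: determine the coarse (first identification) category
    first; refine to a subclass only when the coarse category is one of the
    preset reference categories that need finer classification."""
    first_category = coarse_classify(state_image)      # first identification category
    if first_category in reference_categories:         # matches a preset reference category
        # the second identification category belongs to the first one
        return fine_classify(state_image, first_category)
    return first_category                              # coarse class is already the target class

# Hypothetical stand-ins for the two classification sub-networks.
coarse = lambda img: "vehicle" if img["wheels"] else "person"
fine = lambda img, cls: "forklift" if img["has_forks"] else "pallet truck"

print(classify_object({"wheels": True, "has_forks": True}, coarse, fine, {"vehicle"}))   # forklift
print(classify_object({"wheels": False, "has_forks": False}, coarse, fine, {"vehicle"}))  # person
```

Only ambiguous coarse classes pay the cost of the second stage; everything else returns after one pass.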
In a second aspect, the present application provides an object classification apparatus comprising:
an acquisition unit for acquiring a state image containing an object to be classified;
the processing unit is used for carrying out feature extraction processing on the state image to obtain information of a first image feature of the state image;
the first classification unit is used for determining information of a first identification category of the object to be classified according to the information of the first image characteristic;
and the second classification unit is used for acquiring information of a second identification category of the object to be classified as information of the target category of the object to be classified when the information of the first identification category is detected to be the same as the information of the preset reference category, wherein the second identification category belongs to the first identification category.
In a possible implementation manner of the present application, the second classification unit is further specifically configured to:
when the information of the first identification category is detected to be the same as the information of the preset reference category, performing feature extraction processing according to the information of the first identification category and the state image to obtain information of a second image feature of the state image;
and determining the information of the second identification category of the object to be classified according to the information of the second image characteristic, wherein the information is used as the information of the target category of the object to be classified.
In a possible implementation manner of the present application, the first classification unit is further specifically configured to:
and when the information of the first identification category is detected to be different from the information of the preset reference category, taking the information of the first identification category as the information of the target category of the object to be classified.
In a possible implementation manner of the present application, the first classification unit is further specifically configured to:
inputting the information of the first image feature into a first classification sub-network of a preset cascade classification network, so that the first classification sub-network determines information of a first identification category of the object to be classified according to the information of the first image feature.
in a possible implementation manner of the present application, the second classification unit is further specifically configured to:
when the fact that the information of the first identification category is the same as the information of the preset reference category is detected, the state image is input into a second classification sub-network of the preset cascade classification network, so that the second classification sub-network performs feature extraction processing on the state image to obtain information of second image features of the state image, and the information of the second identification category of the object to be classified is determined according to the information of the second image features to serve as the information of the target category of the object to be classified.
In a possible implementation manner of the present application, the object classification apparatus further includes a training unit, and the training unit is further specifically configured to:
obtaining a sample image and obtaining a class label of the sample image, wherein the sample image comprises a first sample image, a second sample image and a third sample image, the class label comprises a first class label and a second class label, the first class label comprises a first class label of the first sample image, a first class label of the second sample image and a first class label of the third sample image, and the second class label comprises a second class label of the second sample image and a second class label of the third sample image;
inputting the sample image into a first preset sub-network of a preset model, so that the first preset sub-network performs feature extraction on the sample image to obtain information of a first sample feature of the sample image, and determining information of a first prediction category of the sample image according to the information of the first sample feature;
when the information of the first prediction type is detected to be the information of a preset reference type, inputting the sample image into a second preset sub-network of the preset model, so that the second preset sub-network performs feature extraction on the sample image to obtain information of a second sample feature of the sample image, and determining the information of the second prediction type of the sample image according to the information of the second sample feature;
and updating the model parameters of the preset model according to the information of the first prediction category, the information of the second prediction category, the first category label and the second category label until the preset model is taken as a preset cascade classification network when the preset model meets the preset training stopping condition.
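The staged training routing described by the training unit above can be sketched as follows. This is a toy sketch under assumptions: the stub predictors stand in for the neural sub-networks of the preset model, and the label names are illustrative; a real implementation would compute losses and back-propagate through both sub-networks.

```python
# Every sample supervises the first sub-network; only samples whose first
# prediction equals the preset reference category are also routed to the
# second sub-network, mirroring the claim above.
REFERENCE_CATEGORY = "vehicle"

def first_subnetwork(sample):
    # Hypothetical coarse predictor: assume it outputs the first-class label.
    return sample["first_label"]

def second_subnetwork(sample):
    # Hypothetical fine predictor: assume it outputs the second-class label.
    return sample["second_label"]

def train_step(samples):
    stage1_pairs, stage2_pairs = [], []  # (prediction, label) pairs a loss would consume
    for s in samples:
        pred1 = first_subnetwork(s)
        stage1_pairs.append((pred1, s["first_label"]))
        if pred1 == REFERENCE_CATEGORY:  # first prediction hits the reference category
            stage2_pairs.append((second_subnetwork(s), s["second_label"]))
    return stage1_pairs, stage2_pairs

samples = [
    {"first_label": "person"},                                   # first sample image: first-class label only
    {"first_label": "vehicle", "second_label": "forklift"},      # second and third sample images carry
    {"first_label": "vehicle", "second_label": "pallet truck"},  # both label levels
]
s1, s2 = train_step(samples)
```

Note how the label structure matches the claim: every sample has a first-class label, but only samples inside the reference class need a second-class label.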
In one possible implementation manner of the present application, the training unit is further specifically configured to:
acquiring an original image of an object to be identified;
calling a preset GAN network, and generating a plurality of supplementary images according to the original image;
the original image and the supplementary image are taken as sample images.
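The GAN-based augmentation above can be sketched as a data-assembly step. This is a sketch under assumptions: `generate_supplementary` is a hypothetical stand-in that jitters pixel values, whereas the patent calls a preset GAN network to generate the supplementary images.

```python
import random

def generate_supplementary(original, n, seed=0):
    """Stand-in for the preset GAN generator: derive n supplementary images
    from one original by jittering pixel values. A real implementation would
    sample a trained GAN generator conditioned on the object instead."""
    rng = random.Random(seed)
    supplements = []
    for _ in range(n):
        supplements.append([min(255, max(0, p + rng.randint(-8, 8))) for p in original])
    return supplements

original_image = [10, 200, 30, 40]  # toy flattened "original image of the object to be identified"
# The original image and the supplementary images together form the sample images.
sample_images = [original_image] + generate_supplementary(original_image, 3)
```

This addresses the first cause of misclassification discussed later: categories with too few samples get synthetic extra samples before training.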
In a possible implementation manner of the present application, the obtaining unit is further specifically configured to:
acquiring a plurality of images collected in a specific area;
taking the plurality of images as the state images.
in a possible implementation manner of the present application, the object classification device further includes a statistical unit, and the statistical unit is further specifically configured to:
the method further comprises the following steps:
and determining the occurrence frequency of objects of a preset category in the specific area according to the information of the target category.
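The counting step above can be sketched with Python's `collections.Counter`. The category names are illustrative, not from the patent.

```python
from collections import Counter

def count_category(target_categories, preset_category):
    """Count how often objects of a preset category were identified across the
    target-category results of images collected in a specific area."""
    return Counter(target_categories)[preset_category]

# One target-category result per state image collected in the area.
results = ["forklift", "person", "forklift", "tool", "forklift"]
print(count_category(results, "forklift"))  # → 3
```

`Counter` returns 0 for categories that never appeared, so no special-casing is needed for absent classes.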
In a third aspect, the present application further provides an electronic device, where the electronic device includes a processor and a memory, where the memory stores a computer program, and the processor executes the steps in any one of the object classification methods provided in the present application when calling the computer program in the memory.
In a fourth aspect, the present application also provides a computer-readable storage medium having a computer program stored thereon, which is loaded by a processor to perform the steps of the object classification method.
The method comprises the steps of obtaining a state image containing an object to be classified; performing feature extraction processing on the state image to obtain information of a first image feature of the state image; determining information of a first identification category of the object to be classified according to the information of the first image characteristic; and when the information of the first identification category is detected to be the same as the information of the preset reference category, acquiring the information of the second identification category of the object to be classified as the information of the target category of the object to be classified. In one aspect, classification of an object to be classified can be achieved based on an image.
On the other hand, the first identification category of the object to be classified (i.e., the large class to which it belongs) is determined first; only when the first identification category matches the preset reference category (i.e., a preset large class whose members need further classification) is the second identification category (i.e., the small class to which the object belongs) determined and taken as the target category of the object to be classified. The object to be classified can thus be classified at a fine granularity. This overcomes the difficulty existing target classification networks have in capturing the distinguishing features between highly similar objects, and improves the accuracy of object classification.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a schematic diagram of a scene of an object classification system provided in an embodiment of the present application;
FIG. 2 is a schematic flow chart diagram illustrating an embodiment of an object classification method provided in an embodiment of the present application;
fig. 3 is a schematic flowchart of an object classification method according to another embodiment of the present application;
FIG. 4 is a scene schematic diagram of a status image provided in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an embodiment of an object classification device provided in the embodiments of the present application;
fig. 6 is a schematic structural diagram of an embodiment of an electronic device provided in the embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the description of the embodiments of the present application, it should be understood that the terms "first", "second", and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying any number of technical features indicated. Thus, features defined as "first" or "second" may explicitly or implicitly include one or more of the described features. In the description of the embodiments of the present application, "a plurality" means two or more unless specifically defined otherwise.
The following description is presented to enable any person skilled in the art to make and use the application. In the following description, details are set forth for the purpose of explanation. It will be apparent to one of ordinary skill in the art that the present application may be practiced without these specific details. In other instances, well-known processes have not been described in detail so as not to obscure the description of the embodiments of the present application with unnecessary detail. Thus, the present application is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed in the embodiments herein.
The embodiment of the application provides an object classification method, an object classification device, electronic equipment and a computer-readable storage medium. The object classification device may be integrated in an electronic device, and the electronic device may be a server or a terminal.
First, before describing the embodiments of the present application, the related contents of the embodiments of the present application with respect to the application context will be described.
Computer vision (CV) technology is a science that studies how to make machines "see": using cameras and computers instead of human eyes to recognize, track, and measure targets, and further processing the images so that they become more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision research attempts to build artificial intelligence systems that can capture information from images or multidimensional data. Computer vision technologies generally include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technologies, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric technologies such as face recognition and fingerprint recognition.
Object classification is a common processing task in computer vision. Accurate object classification has positive significance for many application scenarios, such as counting whether a certain object appears in a certain place or sorting express parcels.
For example, in the field of logistics there is often a task of finely classifying personnel, materials, and tools inside transit stations and related network points, where not only must the large classes be accurately separated, but the small classes within each large class must also be clearly distinguished.
However, in practical applications it has been found that using an existing target classification network for object classification produces large errors, that is, many images are assigned to the wrong category.
The inventors found that such misjudgments mainly have two causes. One is that some categories have few samples, so the features of their data cannot be learned during training. The other is that sample groups within a large class (such as flatbed trucks and small trucks within the class "vehicle") are extremely similar, so samples that belong to different subclasses are grouped together and errors arise in fine classification; for example, a small truck may be misjudged as a flatbed truck.
Based on the above-mentioned defects of the prior art, the embodiments of the present application provide an object classification method, which overcomes the defects of the prior art to at least some extent.
The execution body of the object classification method in the embodiments of the present application may be the object classification apparatus provided in the embodiments, or an electronic device integrating that apparatus, such as a server, a physical host, or user equipment (UE). The object classification apparatus may be implemented in hardware or software, and the UE may specifically be a terminal device such as a smartphone, tablet computer, notebook computer, palmtop computer, desktop computer, or personal digital assistant (PDA).
The electronic device can operate standalone or as part of a device cluster. By applying the object classification method provided in the embodiments of the present application, it can accurately identify both objects that need fine classification and objects that need only coarse classification, thereby improving the accuracy of object classification.
In the embodiments of the present application, object classification means extracting the features of an image and identifying, from those features, the target category of the object contained in the image (for example, extracting the features of an image of an express sorting area and identifying the object in the image as a daily product, food, or the like). The resulting target category has positive significance for some application scenarios, such as counting whether an object appears in a certain place or sorting express parcels.
Referring to fig. 1, fig. 1 is a scene schematic diagram of an object classification system according to an embodiment of the present application. The object classification system may include an electronic device 100, and an object classification apparatus is integrated in the electronic device 100. For example, the electronic device may acquire a status image containing an object to be classified; performing feature extraction processing on the state image to obtain information of a first image feature of the state image; determining information of a first identification category of the object to be classified according to the information of the first image characteristic; and when the information of the first identification category is detected to be the same as the information of the preset reference category, acquiring the information of the second identification category of the object to be classified as the information of the target category of the object to be classified.
In addition, as shown in fig. 1, the object classification system may further include a memory 200 for storing data, such as image data, video data, and the like, captured over a period of time by a camera deployed in a specific scene area (e.g., a sorting floor).
It should be noted that the scene schematic diagram of the object classification system shown in fig. 1 is merely an example. The object classification system and scene described in the embodiments of the present application are intended to illustrate the technical solution more clearly and do not limit it; as a person of ordinary skill in the art knows, with the evolution of object classification systems and the appearance of new service scenes, the technical solution provided in the embodiments of the present application is also applicable to similar technical problems.
In the following, the object classification method provided in an embodiment of the present application is described with an electronic device as the execution body; for simplicity of description, the execution body is omitted in the subsequent method embodiments.
Referring to fig. 2, fig. 2 is a schematic flow chart of an embodiment of an object classification method provided in an embodiment of the present application. It should be noted that, although a logical order is shown in the flow chart, in some cases the steps shown or described may be performed in a different order. The object classification method includes steps S10 to S40, in which:
and S10, acquiring a state image containing the object to be classified.
The object to be classified may be any object. For example, in an express sorting scene, if a forklift in the sorting yard needs to be identified, that is, forklifts are to be classified, the object to be classified is the forklift. If people in the sorting yard need to be identified, the object to be classified is a person. Likewise, if tools in the sorting yard need to be identified, the object to be classified is a tool.
Unless otherwise indicated, the state image referred to hereinafter is an image containing the object to be classified.
And S20, performing feature extraction processing on the state image to obtain information of the first image feature of the state image.
The first image feature is an image feature of a state image obtained by performing feature extraction processing on the state image.
In some embodiments, step S20 may be implemented by a convolutional neural network. For example, the state image may be input into a convolutional neural network, which performs feature extraction on the state image to obtain the information of its first image feature. For a specific embodiment of step S20 ("performing feature extraction processing on the state image to obtain information of the first image feature of the state image"), reference may be made to the later description of step A2 ("inputting the sample image into the first preset sub-network of the preset model, so that the first preset sub-network performs feature extraction on the sample image to obtain information of the first sample feature of the sample image"), which is not repeated here.
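As an illustration of the kind of operation a convolutional layer performs during feature extraction (this is a minimal pure-Python sketch, not the patent's actual network), a single convolution, ReLU, and global average pool reduce an image to one scalar "feature":

```python
def conv2d_valid(image, kernel):
    """'Valid' 2-D convolution (really cross-correlation, as in most CNN
    libraries) of a 2-D list `image` with a 2-D list `kernel`."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

def first_image_feature(image, kernel):
    """Feature-extraction sketch: convolve, apply ReLU, then global-average-pool
    to a single scalar standing in for the 'first image feature'."""
    fmap = conv2d_valid(image, kernel)
    activated = [[max(0.0, v) for v in row] for row in fmap]
    n = len(activated) * len(activated[0])
    return sum(v for row in activated for v in row) / n

image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
edge_kernel = [[-1, 1]]  # responds to vertical edges (left-to-right intensity change)
feature = first_image_feature(image, edge_kernel)
```

A real network stacks many such kernels and layers, producing a feature vector rather than one scalar, but the per-layer computation has this shape.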
And S30, determining the information of the first identification category of the object to be classified according to the information of the first image characteristics.
For example, first, a preset first-class regression function is obtained. And then, determining the information of the first identification category of the object to be classified contained in the state image according to a preset first category regression function and the information of the first image characteristic. Wherein the preset first class regression function is used for indicating the relationship between the image features and the large class (namely, the first identification class) to which the object belongs.
In some embodiments, a plurality of objects to be recognized are classified manually in advance according to classification habit and experience (the name of an object to be recognized can simply be understood as the name of its final class recognition result), determining the name of the large class and, where applicable, the name of the small class to which each object belongs. The large-class name identifies the large class to which an object belongs, and the small-class name identifies its small class.
A large class is a class whose members differ relatively strongly in features from members of other large classes. A small class distinguishes objects whose features are highly similar within the same large class.
For example, four kinds of objects in a sorting yard need to be identified: people, tools, and two similar types of forklift. Through manual pre-classification, the large classes to which they belong are named "person", "tool", and "vehicle" (for both forklift types), and the small classes to which the two forklift types belong are given the respective forklift-type names.
In some embodiments, the object to be identified may also be categorized by an electronic device, that is, the method in this embodiment further includes: and pre-classifying a plurality of objects to be recognized to obtain the names of large classes corresponding to each object and the names of small classes corresponding to each object.
Specifically, first, the name of the large class to which each object to be recognized belongs is initialized to the name of that object. Then the features of the objects to be recognized are obtained, and the feature similarity between any two objects is determined from their features. The features of an object can be characterized by its image; for example, images of two objects can be obtained, the similarity between their pixels computed, and that similarity used as the feature similarity between the two objects. When the feature similarity between several objects is detected to be greater than a preset threshold, those objects (denoted objects to be regrouped) are assigned to the same large class, and a name for that large class is determined. Then, on the one hand, the large-class names of the regrouped objects are updated to this shared name; on the other hand, the small-class name of each regrouped object is set to the object's own name. The name of the large class shared by objects whose feature similarity exceeds the preset threshold serves as the name of the preset reference category (i.e., the preset reference category in step S40).
For example, 4 objects in the sorting field, i.e., a person, a tool, a forklift and a loader, need to be identified. First, the name of each object to be recognized is used as the name of the large class to which it belongs, so the large classes of the person, tool, forklift and loader are initially named "person", "tool", "forklift" and "loader" respectively. When the feature similarity (e.g., 90%) between some objects (e.g., the forklift and the loader) is detected to be greater than a preset threshold (e.g., 70%), the forklift and the loader are grouped into the same large class, and a name (e.g., "vehicle") is determined for that large class. On the one hand, the large-class names of the forklift and the loader are updated to "vehicle"; on the other hand, the small classes to which the forklift and the loader belong are named "forklift" and "loader".
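The pre-classification just illustrated can be sketched in code. The feature vectors, the pixel-overlap similarity measure and the object names below are invented for illustration and are not the patent's actual data or algorithm.

```python
# Hypothetical sketch: objects whose pairwise feature similarity exceeds a
# threshold are merged into one shared large class, and their original names
# are kept as small-class names. A single merged_name is assumed here.

def pre_classify(objects, similarity, threshold, merged_name):
    """objects: {name: feature}; returns {name: (large_class, small_class)}."""
    names = list(objects)
    labels = {n: (n, None) for n in names}  # initially, large class = own name
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if similarity(objects[a], objects[b]) > threshold:
                # merge into the same large class; keep own names as subclasses
                labels[a] = (merged_name, a)
                labels[b] = (merged_name, b)
    return labels

# Toy pixel-overlap similarity between binary "images" (lists of 0/1).
def overlap(x, y):
    return sum(p == q for p, q in zip(x, y)) / len(x)

objs = {
    "person":   [1, 0, 0, 0, 1, 0],
    "tool":     [0, 1, 1, 0, 0, 0],
    "forklift": [1, 1, 1, 1, 0, 1],
    "loader":   [1, 1, 1, 1, 1, 1],  # 5/6 pixels match the forklift image
}
result = pre_classify(objs, overlap, threshold=0.7, merged_name="vehicle")
```

Only the forklift and the loader exceed the 70% threshold, so only they receive the shared large class "vehicle" plus a small-class name; the person and the tool keep their own names as large classes with no small class.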
It can be seen from the above that every object has a name for the large class to which it belongs, whereas only some objects have a name for a small class to which they belong.
It should be emphasized here that each object to be recognized is not simply assigned a corresponding large class and small class. Rather, the feature similarity between the objects to be identified determines whether objects with higher similarity need to be merged into the same large class, so that the features of the objects in that large class can be further extracted to determine their small classes, which serve as the target categories of those objects, thereby achieving fine-grained classification.
When the feature difference between some objects is large while the feature similarity between other objects is high, in order to accurately identify the class of an object, the large class to which the object belongs is first identified through the image features; then, according to the name of that large class, further features are extracted from the image to determine the small class to which the object belongs, so that the class of the object can be accurately identified.
The purpose of step S30 is to identify the large class to which the object belongs. That is, the first identification category refers to the large class to which the object to be classified belongs, and the name of the first identification category refers to the name of that large class.
And S40, when the information of the first identification category is detected to be the same as the information of the preset reference category, acquiring the information of the second identification category of the object to be classified as the information of the target category of the object to be classified.
Wherein the second identification category is subordinate to the first identification category. That is, the first recognition category refers to the large class to which the object to be classified belongs, and the second recognition category refers to the small class to which the object to be classified belongs. In other words, the second identification category is obtained by further classification within the first identification category.
In the embodiment of the application, in the process of determining the target class of the object to be classified, if the object to be classified has the subclass to which the object belongs, the target class of the object to be classified is determined as the subclass to which the object to be classified belongs. And if the object to be classified does not have the subclass to which the object belongs, determining the target class of the object to be classified as the major class to which the object to be classified belongs.
Therefore, it is necessary to identify a large class to which the object to be classified belongs (i.e., a first identification class of the object to be classified), and a small class to which the object to be classified belongs (i.e., a second identification class of the object to be classified).
Referring to fig. 3, fig. 3 is a schematic flowchart illustrating an object classification method according to another embodiment of the present disclosure.
In some embodiments of the present application, as shown in fig. 3, the step S40 may specifically include the following steps S41 to S42, wherein:
And S41, when it is detected that the information of the first identification category is the same as the information of the preset reference category, performing feature extraction processing according to the information of the first identification category and the state image to obtain information of a second image feature of the state image.
In order to detect whether the first recognition category is the same as the preset reference category, as shown in fig. 3, the method further includes, before step S41, step S50: detecting whether the information of the first identification category is the same as the information of a preset reference category.
The preset reference category refers to a large category to which any plurality of objects with feature similarity larger than a preset threshold belong. The preset reference categories may include a plurality of preset broad categories, for example, the preset reference categories may include a broad category named "car" and a broad category named "tool".
Specifically, firstly, the naming information corresponding to the first identification category and the naming information of the large category corresponding to the preset reference category are obtained. And then, comparing whether the naming information corresponding to the first identification category is the same as the naming information corresponding to the large category of the preset reference category. And if the naming information corresponding to the first identification category and the naming information corresponding to the preset reference category are the same, determining that the information of the first identification category is the same as the information of the preset reference category. And if the naming information corresponding to the first identification category is different from the naming information corresponding to the preset reference category, determining that the information of the first identification category is different from the information of the preset reference category.
After the step of "detecting whether the information of the first identification category is the same as the information of the preset reference category", if it is detected that they are not the same, then as one implementation the information of the first identification category may be used as the information of the target category of the object to be classified. As another implementation, no processing need be performed: since the information of the first identification category of the object to be classified has already been determined in step S30, the object to be classified can likewise be regarded as classified.
For example, if the preset reference category is "vehicle" (i.e., the naming information of the large category corresponding to the preset reference category is "vehicle") and the first identification category is "tool" (i.e., the naming information corresponding to the first identification category is "tool"), it is determined that the first identification category ("tool") is different from the preset reference category ("vehicle"). If the preset reference categories are "vehicle" and "tool" (i.e., the naming information of the large categories corresponding to the preset reference categories is "vehicle" and "tool"), and the first identification category is "tool", it is determined that the first identification category ("tool") is the same as one of the preset reference categories (the "tool" among "vehicle" and "tool").
Specifically, "performing the feature extraction processing according to the information of the first identification category and the state image to obtain the information of the second image feature of the state image" may be implemented by a neural network; for a specific embodiment, refer to the implementation of "inputting the sample image into the second predetermined sub-network of the preset model so that the second predetermined sub-network performs feature extraction on the sample image to obtain the information of the second sample feature of the sample image" in step A3 below, which is not repeated here.
And S42, determining the information of the second identification category of the object to be classified according to the information of the second image characteristic, and taking the information of the second identification category as the information of the target category of the object to be classified.
The purpose of step S42 is to identify the subclass to which the object belongs. Namely, the second recognition category refers to the subclass to which the object to be classified belongs, and the name of the second recognition category refers to the name of the subclass to which the object to be classified belongs.
For example, first, a preset second-class regression function is acquired. Then, according to a preset second class regression function and information of second image characteristics, determining information of a second identification class of the object to be classified contained in the state image, and taking the information of the second identification class of the object to be classified as information of a target class of the object to be classified. Wherein the preset second category regression function is used for indicating the relationship between the image features and the subclass (i.e. the second recognition category) to which the object belongs. The preset first-class regression function is different from the preset second-class regression function in that: the preset first-class regression function is used for regressing a large class (namely, a first identification class) to which the object belongs according to the first image features, and the preset second-class regression function is used for regressing a small class (namely, a second identification class) to which the object belongs according to the second image features.
For example, according to the information of the second image feature and a preset second class regression function, it is determined that the second identification class of the object to be classified contained in the state image is "forklift", and then the "forklift" is used as the target class of the object to be classified.
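As a minimal stand-in for the "preset second-class regression function", the sketch below scores each subclass with a softmax over a weighted sum of the second image features. The weights, feature values and class names are illustrative assumptions, not values from the patent.

```python
# Hypothetical second-class regression function: a linear softmax classifier
# mapping second image features to the subclass with the highest probability.
import math

def softmax_regression(features, weights, classes):
    # score each subclass as a weighted sum of the feature values
    scores = [sum(w * f for w, f in zip(weights[c], features)) for c in classes]
    exps = [math.exp(s) for s in scores]
    probs = [e / sum(exps) for e in exps]          # softmax normalization
    return classes[probs.index(max(probs))]        # argmax subclass

classes = ["forklift", "loader"]
weights = {"forklift": [2.0, -1.0, 0.5], "loader": [-1.0, 2.0, 0.5]}
features = [0.9, 0.1, 0.4]          # made-up second image features
predicted = softmax_regression(features, weights, classes)
```

With these invented weights the "forklift" score dominates, so the regression returns "forklift" as the second identification category.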
From the above, by comparing whether the first identification category of the object to be classified (i.e., the large class to which it belongs) is the same as a preset reference category, and, when it is, further identifying from the state image the second identification category of the object to be classified (i.e., the small class to which it belongs) to serve as its target category, the fineness of the target category can be improved, and objects with higher similarity can therefore be accurately classified.
In some scenes, the objects to be recognized include both objects with high mutual similarity and objects with large feature differences. To classify all of them accurately, it is necessary not only to use the second recognition category obtained by fine classification as the target category of some objects to be classified, but also to use the first recognition category as the target category of others.
As shown in fig. 3, to this end, the object classification method further includes step S60: when it is detected that the information of the first identification category is different from the information of the preset reference category, taking the information of the first identification category as the information of the target category of the object to be classified.
From the above, when the first recognition category is different from the preset reference category, the first recognition category is used as the target category of the object to be classified. For objects that do not need fine classification, adopting the first identification category (i.e., the large class to which the object belongs) as the target category is sufficient for accurate classification, so objects that only require coarse classification are also handled.
Therefore, according to the embodiment of the application, when the object to be recognized simultaneously comprises the object with larger characteristic difference and the object with higher characteristic similarity, the accuracy of object classification can be improved.
In some embodiments of the present application, step S40 may specifically include the following steps (1) to (4), where:
(1) and when detecting that the information of the first identification category is the same as the information of the preset reference category, performing feature extraction processing according to the information of the first identification category and the state image to obtain information of a second image feature of the state image.
(2) And determining the information of the second identification category of the object to be classified according to the information of the second image characteristic.
(3) And further detecting whether the information of the second identification category is the same as the information of the preset reference category.
(4) When it is detected that the information of the second identification category is the same as the information of the preset reference category, taking the information of the second identification category obtained in step (2) as the information of the first identification category and repeating steps (1) to (3), until the information of the second identification category is different from the information of the preset reference category, at which point the information of the second identification category of the object to be classified is taken as the information of the target category of the object to be classified.
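The repeated refinement of steps (1) to (4) amounts to descending a class hierarchy while the current result is still a preset reference category. A sketch under that reading, with an invented example hierarchy:

```python
# Hypothetical iterative refinement: keep invoking a finer classifier while
# the current category is still one of the preset reference categories.

def refine(image, classifiers, reference_categories, start):
    category = start
    while category in reference_categories:      # steps (3)-(4): repeat check
        category = classifiers[category](image)  # steps (1)-(2): one level finer
    return category

classifiers = {
    "vehicle": lambda img: "forklift",
    "forklift": lambda img: "3-ton forklift",    # a further, hypothetical level
}
refs = {"vehicle", "forklift"}
target = refine("some image", classifiers, refs, "vehicle")
```

Here "vehicle" refines to "forklift", which is itself still a reference category, so refinement continues to "3-ton forklift" and stops there.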
In some embodiments of the present application, the above steps S20-S40 may be implemented by a machine learning model. For example, the above steps S20 to S40 may be implemented by a preset cascade classification network, which is obtained by training a machine learning model; the preset cascade classification network may include a first classification subnetwork and a second classification subnetwork.
The first classification subnetwork is used for identifying the large class to which the object to be classified contained in the state image belongs, i.e., for determining the information of the first identification category of the object to be classified. The first classification subnetwork may be an open-source PeleeNet or an open-source MobileNet network.
The second classification subnetwork is used for identifying the small class to which the object to be classified contained in the state image belongs, i.e., for determining the information of the second identification category of the object to be classified. The second classification subnetwork may also be an open-source PeleeNet or an open-source MobileNet network.
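The cascade structure described above can be sketched as two subnetworks, each with a feature-extraction stage and a prediction stage. The hand-rolled extractors and threshold predictors below are placeholders for the CNN backbones (e.g. PeleeNet or MobileNet) the patent actually envisages; the class names and thresholds are invented.

```python
# Structural sketch of the preset cascade classification network: a coarse
# subnetwork runs first, and the fine subnetwork runs only when the coarse
# result is one of the preset reference categories (steps S30/S50/S41/S60).

class SubNetwork:
    def __init__(self, extract, predict):
        self.extract = extract   # feature extraction layer (a CNN in practice)
        self.predict = predict   # prediction output layer

    def __call__(self, image):
        return self.predict(self.extract(image))

class CascadeClassifier:
    def __init__(self, first, second, reference_categories):
        self.first, self.second = first, second
        self.refs = reference_categories

    def __call__(self, image):
        coarse = self.first(image)                        # first identification category
        return self.second(image) if coarse in self.refs else coarse

# Toy stand-ins: "features" are just pixel sums of a list-of-numbers image.
first = SubNetwork(sum, lambda f: "vehicle" if f > 10 else "person")
second = SubNetwork(sum, lambda f: "forklift" if f > 20 else "loader")
net = CascadeClassifier(first, second, {"vehicle"})
```

A "vehicle" coarse result triggers the fine subnetwork, while a "person" result is returned directly as the target category.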
In some embodiments of the present application, the predetermined cascade classification network is obtained by training the following steps a 1-a 4, wherein:
a1, obtaining a sample image and obtaining a class label of the sample image.
The sample image comprises a first sample image, a second sample image and a third sample image, the category labels comprise a first category label and a second category label, the first category label comprises a first category label of the first sample image, a first category label of the second sample image and a first category label of the third sample image, and the second category label comprises a second category label of the second sample image and a second category label of the third sample image.
The first class label of the first sample image is used to indicate a broad class to which the object contained in the first sample image belongs. The first class label of the second sample image is used to indicate a large class to which the object contained in the second sample image belongs. The first class label of the third sample image is used to indicate a large class to which the object contained in the third sample image belongs.
The second category label of the second sample image is used to indicate a subclass to which the object contained in the second sample image belongs. The second category label of the third sample image is used to indicate a subclass to which the object contained in the third sample image belongs.
The objects contained in the sample image are classified in advance, and the names of the large classes and the names of the small classes to which each object belongs are determined. The name of the large class to which the object belongs is used to identify the large class to which the object contained in the sample image belongs, and the name of the small class to which the object belongs is used to identify the small class to which the object contained in the sample image belongs.
Here, the first sample image is an image whose contained object belongs to a large class but has no corresponding small class. There may be a plurality of first sample images, each containing an object of a particular category.
The second sample image is an image whose contained object belongs both to a large class and to a small class. There may be a plurality of second sample images, each containing an object of a particular category.
The third sample image is likewise an image whose contained object belongs both to a large class and to a small class. There may be a plurality of third sample images, each containing an object of a particular category.
The second sample image and the third sample image are different in that the specific subclass to which the object included in the image belongs is different. For example, the major category to which the object included in the second sample image belongs is "car", and the specific minor category to which the object belongs is "van"; the object included in the third sample image belongs to the general category "car" and the specific subclass "trailer" to which it belongs. The purpose of distinguishing the second sample image from the third sample image is to emphasize that the image dataset used to train the preset cascade classification network includes two object images of the class with the greater feature similarity.
The first sample image, the second sample image and the third sample image are further distinguished here in order to emphasize that the image dataset used for training the preset cascade classification network includes both object images of two classes with high feature similarity and object images of classes with large feature differences. Those skilled in the art will understand that this is not to say the preset cascade classification network learns the features of only three classes; the numbers of first, second and third sample images may be adjusted for a specific business scenario so that the network learns the features of multiple classes and can thus identify objects of multiple classes.
In some embodiments of the present application, step a1 may specifically include: acquiring an original image of an object to be identified; calling a preset GAN network, and generating a plurality of supplementary images according to the original image; the original image and the supplementary image are taken as sample images.
Wherein a GAN (Generative Adversarial Network) can generate new images from an original image, thereby implementing data enhancement.
From the above, using the preset GAN network increases the data volume of the samples, which mitigates the problem that, when sample images are difficult to obtain, the preset cascade classification network cannot learn the features of the objects and classification accuracy suffers. Therefore, increasing the sample data volume with the preset GAN network can improve the classification accuracy of the preset cascade classification network.
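Step A1's data enhancement could be wired up as follows. A trained GAN generator is not reproduced here; the `fake_gan` stub merely perturbs pixel values to stand in for generated variants, and the image and sample counts are invented.

```python
# Hypothetical sketch of step A1: an original image plus GAN-generated
# supplementary images together form the sample set. `fake_gan` is a stub
# standing in for a real trained GAN generator.
import random

def fake_gan(original, n_samples, seed=0):
    rng = random.Random(seed)
    out = []
    for _ in range(n_samples):
        # jitter each pixel slightly to mimic a generated variant
        out.append([min(255, max(0, p + rng.randint(-10, 10))) for p in original])
    return out

original = [120, 130, 140]                      # a toy 3-pixel "image"
supplementary = fake_gan(original, n_samples=5)
samples = [original] + supplementary            # original + generated images
```

In a real pipeline, `fake_gan` would be replaced by invoking the preset GAN's generator on the original images of the objects to be identified.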
A2, inputting the sample image into a first preset sub-network of a preset model, so that the first preset sub-network performs feature extraction on the sample image to obtain information of a first sample feature of the sample image, and determining information of a first prediction category of the sample image according to the information of the first sample feature.
The first preset sub-network is correspondingly provided with a first loss function, so that the first preset sub-network can learn the category information of the large classes to which the objects in the images belong. The first loss function is set corresponding to the first prediction category output by the first predetermined sub-network.
Wherein the first prediction class is used to indicate a large class to which an object contained in the sample image belongs.
The first predetermined sub-network may include a feature extraction layer, a prediction output layer. The characteristic extraction layer is used for carrying out characteristic extraction on the sample image to obtain a first sample characteristic of the sample image; the feature extraction layer may be a Convolutional Neural Network (CNN). And the prediction output layer is used for predicting and outputting a first prediction category of the sample image according to a first prediction function and a first sample characteristic corresponding to the first preset sub-network.
For example, when the sample images (including a first sample image, a second sample image and a third sample image, whose contained objects are actually of the types "person", "storage box" and "cart" respectively) are input into the first predetermined sub-network of the predetermined model, the feature extraction layer of the first predetermined sub-network performs feature extraction on the first sample image (whose contained object is actually a "person") to obtain the information of the first sample feature of the first sample image. It likewise performs feature extraction on the second sample image (whose contained object is actually a "storage box") to obtain the information of the first sample feature of the second sample image, and on the third sample image (whose contained object is actually a "cart") to obtain the information of the first sample feature of the third sample image.
The prediction output layer of the first predetermined sub-network predicts and outputs the first prediction category of the first sample image as "person" based on its corresponding first prediction function and the information of the first sample feature of the first sample image. It predicts and outputs the first prediction category of the second sample image as "tool" according to the corresponding first prediction function and the information of the first sample feature of the second sample image, and the first prediction category of the third sample image as "vehicle" according to the corresponding first prediction function and the information of the first sample feature of the third sample image.
And A3, when it is detected that the information of the first prediction category is the same as the information of a preset reference category, inputting the sample image into a second preset sub-network of the preset model, so that the second preset sub-network extracts the features of the sample image to obtain the information of the second sample feature of the sample image, and determining the information of the second prediction category of the sample image according to the information of the second sample feature.
The second preset sub-network is correspondingly provided with a second loss function, so that the second preset sub-network can learn the category information of the small classes to which the objects in the images belong. The second loss function is set corresponding to the second prediction category output by the second predetermined sub-network.
Wherein the second prediction category is used to indicate the small class to which an object contained in the sample image belongs.
The second predetermined subnetwork may also comprise a feature extraction layer, a prediction output layer. The characteristic extraction layer is used for carrying out characteristic extraction on the sample image to obtain information of a second sample characteristic of the sample image; the feature extraction layer may be a Convolutional Neural Network (CNN). And the prediction output layer is used for predicting and outputting the information of the second prediction type of the sample image according to the second prediction function corresponding to the second preset sub-network and the information of the second sample characteristic.
For ease of understanding, the description continues with the example in step A2 above. For example, the preset reference categories are "tool" and "vehicle". Since the first prediction category of the second sample image is "tool" and the first prediction category of the third sample image is "vehicle", the second sample image and the third sample image are respectively input into the second predetermined sub-network of the predetermined model.
The feature extraction layer of the second predetermined sub-network performs feature extraction on the second sample image (the actual type of the object included therein is "storage box"), and obtains information on the second sample feature of the second sample image. The feature extraction layer of the second predetermined sub-network performs feature extraction on the third sample image (the actual type of the contained object is "cart"), to obtain information of a second sample feature of the third sample image.
And the prediction output layer of the second preset sub-network predicts and outputs a second prediction type of the second sample image as the 'containing box' according to the corresponding second prediction function and the information of the second sample characteristic of the second sample image. And the prediction output layer of the second preset sub-network predicts and outputs a second prediction type of the third sample image as the 'handcart' according to the corresponding second prediction function and the information of the second sample characteristic of the third sample image.
A4, updating the model parameters of the preset model according to the information of the first prediction category, the information of the second prediction category, the first category label and the second category label, until the preset model meets a preset training stop condition, at which point the trained preset model is taken as the preset cascade classification network.
Wherein the preset training stop condition can be set according to actual requirements. For example, training may stop when the total training loss is smaller than a preset value; or when the total training loss essentially stops changing, i.e., the difference between the total training losses of adjacent iterations is smaller than a preset value; or when the number of training iterations of the preset model reaches a maximum iteration count.
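The three stop conditions just described can be sketched as a single check; the threshold values below are illustrative assumptions.

```python
# Hypothetical stop-condition check for step A4: stop when the total loss is
# below a threshold, when it has essentially stopped changing between
# adjacent iterations, or when the maximum iteration count is reached.

def should_stop(losses, iteration, eps=1e-3, delta=1e-4, max_iter=1000):
    if iteration >= max_iter:                    # maximum iteration count reached
        return True
    if losses and losses[-1] < eps:              # total loss below threshold
        return True
    if len(losses) >= 2 and abs(losses[-1] - losses[-2]) < delta:
        return True                              # loss has plateaued
    return False
```

A training loop would append the current total loss each iteration and break as soon as `should_stop` returns True.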
Specifically, on the one hand, a model loss value of the first predetermined sub-network is determined according to the information of the first prediction category and the first category label. On the other hand, a model loss value of the second predetermined sub-network is determined according to the information of the second prediction category and the second category label. Then, the total training loss of the preset model is determined according to the model loss value of the first preset sub-network and the model loss value of the second preset sub-network, and the model parameters of the preset model (including the first prediction function corresponding to the first preset sub-network and the second prediction function corresponding to the second preset sub-network) are updated according to that total training loss, until the preset model meets the preset training stop condition, whereupon it is taken as the preset cascade classification network. At this point, the preset cascade classification network may be applied to determine the information of the first identification category and of the second identification category of an object to be classified contained in an image.
The preset cascade classification network comprises the first classification subnetwork and the second classification subnetwork corresponding to the trained preset model. The first classification function corresponding to the first classification subnetwork and the second classification function corresponding to the second classification subnetwork are, respectively, the trained first prediction function of the first preset sub-network and the trained second prediction function of the second preset sub-network.
Wherein the first classification function corresponding to the first classification subnetwork is used to indicate the data relationship between the image features and the large class to which the object belongs (i.e., the first recognition category). The second classification function corresponding to the second classification subnetwork is used to indicate the data relationship between the image features and the small class to which the object belongs (i.e., the second recognition category).
The working principle of the first classification subnetwork and the second classification subnetwork of the preset cascade classification network is similar to that of the first predetermined sub-network and the second predetermined sub-network of the preset model described in steps A2 and A3, and is not repeated here.
The model loss value of the first preset sub-network is calculated through the first loss function: substituting the first prediction category of the sample image and the category information corresponding to the first category label of the sample image into the first loss function yields the model loss value of the first preset sub-network.
The model loss value of the second preset sub-network is calculated through the second loss function: substituting the second prediction category of the sample image and the category information corresponding to the second category label of the sample image into the second loss function yields the model loss value of the second preset sub-network.
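Assuming each loss function is a cross-entropy (the patent does not fix the form) and that the total training loss is a plain sum of the two subnetwork losses (the combination rule is likewise an assumption), the loss computation can be sketched as:

```python
# Hypothetical loss computation for step A4: cross-entropy per subnetwork,
# with the total training loss taken as their sum.
import math

def cross_entropy(predicted_probs, true_index):
    # negative log-probability assigned to the true class
    return -math.log(predicted_probs[true_index])

def total_loss(first_probs, first_label, second_probs, second_label):
    loss1 = cross_entropy(first_probs, first_label)    # first subnetwork loss
    loss2 = cross_entropy(second_probs, second_label)  # second subnetwork loss
    return loss1 + loss2

# Confident, correct predictions on both subnetworks give near-zero total loss.
loss = total_loss([0.0001, 0.9999], 1, [0.9999, 0.0001], 0)
```

Parameter updates would then back-propagate this total loss through both prediction functions, matching the joint update described above.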
As can be seen from the above, the preset model is trained on sample images labeled with first class labels (corresponding to the large class to which the object in the sample image belongs) and second class labels (corresponding to the subclass to which the object in the sample image belongs). The trained preset model (namely the preset cascade classification network) can capture both the difference features between objects with large feature differences (i.e., between the large classes to which objects belong) and the difference features between objects with high feature similarity (i.e., between the subclasses to which objects belong).
Because the first preset sub-network is trained on the first, second and third sample images labeled with first class labels, the trained first preset sub-network (i.e., the first classification sub-network of the preset cascade classification network) can capture the difference features between objects with large feature differences (i.e., between the large classes to which objects belong). Because the second preset sub-network is trained on the second and third sample images labeled with second class labels, the trained second preset sub-network (i.e., the second classification sub-network of the preset cascade classification network) can capture the difference features between objects with high feature similarity (i.e., between the subclasses to which objects belong).
Therefore, classifying with the preset cascade classification network solves the problem that an existing target classification network struggles to capture the difference features between highly similar objects when the objects to be recognized simultaneously include objects with large feature differences and objects with high feature similarity. The same model (namely the preset cascade classification network) can thus accurately recognize both objects requiring fine classification and objects requiring coarse classification, improving the accuracy of object classification.
In some embodiments of the present application, when step S20 is implemented through the preset cascade classification network, step S20 may specifically include: inputting the information of the first image feature into a first classification sub-network of the preset cascade classification network, so that the first classification sub-network determines the information of the first identification category of the object to be classified according to the information of the first image feature.
Specifically, the information of the first image feature is input into a first classification sub-network of a preset cascade classification network, so that the first classification sub-network is called to regress the information of the first identification class of the object to be classified according to a corresponding first classification function and the information of the first image feature.
Correspondingly, when step S40 is implemented through the preset cascade classification network, step S40 may specifically include: when it is detected that the information of the first identification category is the same as the information of the preset reference category, inputting the state image into a second classification sub-network of the preset cascade classification network, so that the second classification sub-network performs feature extraction processing on the state image to obtain information of a second image feature of the state image, and determines the information of the second identification category of the object to be classified according to the information of the second image feature, as the information of the target category of the object to be classified.
Specifically, before step S40, it is first detected whether the information of the first identification category is the same as the information of the preset reference category. When they are detected to be the same, the state image is input into the second classification sub-network of the preset cascade classification network. The second classification sub-network is then called to perform feature extraction processing on the state image to obtain information of a second image feature of the state image, and to determine and output the information of the second identification category of the object to be classified according to the second classification function corresponding to the second classification sub-network and the information of the second image feature.
And the information of the second identification category of the object to be classified, which is output by the second classification sub-network, is the information of the target category of the object to be classified.
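The two-stage inference of steps S20 and S40 can be sketched as follows. The sub-network stubs, feature sets, and category names are hypothetical stand-ins (real sub-networks would be trained neural networks); only the routing logic follows the description above:

```python
def classify_object(state_image, first_subnet, second_subnet, reference_category):
    """Two-stage cascade: coarse category first, fine subclass only when needed."""
    first_features = first_subnet["extract"](state_image)
    first_category = first_subnet["classify"](first_features)
    if first_category == reference_category:
        # The coarse category matches the preset reference category, so the
        # second sub-network re-extracts features and predicts the subclass.
        second_features = second_subnet["extract"](state_image)
        return second_subnet["classify"](second_features)
    # Otherwise the coarse category is already the target category.
    return first_category

# Toy stand-ins for the two sub-networks (real ones would be trained models).
first_subnet = {"extract": lambda img: img,
                "classify": lambda f: "vehicle" if "wheel" in f else "person"}
second_subnet = {"extract": lambda img: img,
                 "classify": lambda f: "forklift" if "fork" in f else "cart"}

print(classify_object({"wheel", "fork"}, first_subnet, second_subnet, "vehicle"))  # prints "forklift"
```

Note that an image whose coarse category does not match the reference category never reaches the second sub-network, which is what makes fine classification selective rather than unconditional.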
From the above, it can be seen that the preset cascade classification network can capture both the difference features between objects with large feature differences and the difference features between objects with high feature similarity. This solves the problem that an existing target classification network struggles to capture the difference features between highly similar objects when the objects to be recognized simultaneously include objects with large feature differences (i.e., between the large classes to which objects belong) and objects with high feature similarity (i.e., between the subclasses to which objects belong). Therefore, the same model (namely the preset cascade classification network) can accurately recognize both objects requiring fine classification and objects requiring coarse classification, improving the accuracy of object classification.
From the above, it can be seen that the state image containing the object to be classified is obtained; performing feature extraction processing on the state image to obtain information of a first image feature of the state image; determining information of a first identification category of the object to be classified according to the information of the first image characteristic; and when the information of the first identification category is detected to be the same as the information of the preset reference category, acquiring the information of the second identification category of the object to be classified as the information of the target category of the object to be classified. In one aspect, classification of an object to be classified can be achieved based on an image.
On the other hand, since the first identification category of the object to be classified (i.e., the large class to which it belongs) is determined first, the second identification category (i.e., the subclass to which it belongs) is determined only when the first identification category is the same as the preset reference category (i.e., a preset large class whose objects need to be further finely classified), and the second identification category is then taken as the target category of the object to be classified. In this way, the object to be classified can be finely classified. Therefore, the problem that an existing target classification network struggles to capture the difference features between highly similar objects is solved, and the accuracy of object classification is improved.
In some scenarios, such as a courier sorting scenario, it is sometimes necessary to identify the frequency of people and the frequency of forklifts within a sorting yard (i.e., a specific area). Then, a plurality of images in the sorting field need to be acquired, whether the objects appearing in the sorting field are people or forklifts (namely, the object categories need to be identified based on the images) is identified based on the images, and then the frequency of the people or the forklifts is counted.
To this end, in some embodiments of the present application, the step of "acquiring a status image containing an object to be classified" comprises: acquiring a plurality of images acquired in a specific area; taking the plurality of images as the status image.
Wherein the imaging of the state image comprises a specific region.
For example, please refer to fig. 4, which is a scene diagram of a state image provided in an embodiment of the present application. The left side of fig. 4 is a schematic diagram of a sorting site, which includes sorting area 1 and sorting area 2; on the right side of fig. 4, sorting area 2 is set as the specific area, and a state image obtained by capturing an image of sorting area 2 is shown.
Specifically, the field of view of the camera used to capture state images covers sorting area 2 (so that each image captured by the camera includes the specific area). Images or videos of sorting area 2 are captured over a continuous period of time to obtain a plurality of images, and each image is taken as one state image. Alternatively, after capturing images or videos of sorting area 2 over a continuous period of time, only the images containing an object to be recognized (such as a "person", "carton", "goods", "cart" or "truck") are selected as state images.
Correspondingly, each state image is used for determining the information of the target class of the object to be classified contained in each state image according to the steps (such as steps S10-S40) of the object classification method.
In order to count the frequency of occurrence of the preset class of objects in the specific area, after determining the information of the target class of the object to be classified contained in each state image, the object classification method further includes: and determining the occurrence frequency of objects of a preset category in the specific area according to the information of the target category.
For example, images of a specific area (image 1 through image 6) are acquired continuously over a period of time as state images. After object classification is performed on these six state images, the target categories of the objects to be classified contained in them are determined to be, respectively: "person", "carton", "cart", "cart", "person", "person".
Here, if the preset categories of objects are "cart" and "person", it can be determined that the occurrence frequency of "cart" in the specific area is 2/6 = 1/3, and the occurrence frequency of "person" in the specific area is 3/6 = 1/2.
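A minimal sketch of this frequency statistic (the category list and names are illustrative, not prescribed by the patent):

```python
from collections import Counter
from fractions import Fraction

def occurrence_frequency(target_categories, preset_category):
    """Fraction of state images whose target category equals the preset category."""
    counts = Counter(target_categories)
    return Fraction(counts[preset_category], len(target_categories))

# One target category per state image, as determined by steps S10-S40.
categories = ["person", "carton", "cart", "cart", "person", "person"]
print(occurrence_frequency(categories, "cart"))    # prints 1/3
print(occurrence_frequency(categories, "person"))  # prints 1/2
```

Using `Fraction` keeps the exact ratios the description works with (2/6 = 1/3, 3/6 = 1/2) rather than rounded floats.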
From the above, it can be seen that by acquiring a plurality of images of the specific area as state images and performing object classification on each state image, the target category of the object to be classified contained in each state image is determined; according to these target categories, the occurrence frequency of objects of a preset category in the specific area can be counted. This meets the data-statistics requirements of certain scenarios, automates the statistics, and improves statistical efficiency.
In order to better implement the object classification method in the embodiment of the present application, based on the object classification method, an embodiment of the present application further provides an object classification device, as shown in fig. 5, which is a schematic structural diagram of an embodiment of the object classification device in the embodiment of the present application, where the object classification device 500 includes:
an acquiring unit 501, configured to acquire a state image including an object to be classified;
a processing unit 502, configured to perform feature extraction processing on the state image to obtain information of a first image feature of the state image;
a first classification unit 503, configured to determine information of a first identification category of the object to be classified according to the information of the first image feature;
a second classification unit 504, configured to, when it is detected that the information of the first identification category is the same as the information of a preset reference category, acquire information of a second identification category of the object to be classified as information of a target category of the object to be classified, where the second identification category belongs to the first identification category.
In a possible implementation manner of the present application, the second classification unit 504 is further specifically configured to:
when it is detected that the information of the first identification category is the same as the information of the preset reference category, perform feature extraction processing according to the information of the first identification category and the state image to obtain information of a second image feature of the state image;
and determining the information of the second identification category of the object to be classified according to the information of the second image characteristic, wherein the information is used as the information of the target category of the object to be classified.
In a possible implementation manner of the present application, the first classification unit 503 is further specifically configured to:
and when the fact that the information of the first identification category is different from the preset reference category is detected, taking the information of the first identification category as the information of the target category of the object to be classified.
In a possible implementation manner of the present application, the first classification unit 503 is further specifically configured to:
inputting the information of the first image characteristic into a first classification sub-network of a preset cascade classification network, so that the first classification sub-network determines information of a first identification class of the object to be classified according to the information of the first image characteristic;
in a possible implementation manner of the present application, the second classification unit 504 is further specifically configured to:
when the fact that the information of the first identification category is the same as the information of the preset reference category is detected, the state image is input into a second classification sub-network of the preset cascade classification network, so that the second classification sub-network performs feature extraction processing on the state image to obtain information of second image features of the state image, and the information of the second identification category of the object to be classified is determined according to the information of the second image features to serve as the information of the target category of the object to be classified.
In a possible implementation manner of the present application, the object classification apparatus further includes a training unit (not shown in the figure), and the training unit is further specifically configured to:
obtaining a sample image and obtaining a class label of the sample image, wherein the sample image comprises a first sample image, a second sample image and a third sample image, the class label comprises a first class label and a second class label, the first class label comprises a first class label of the first sample image, a first class label of the second sample image and a first class label of the third sample image, and the second class label comprises a second class label of the second sample image and a second class label of the third sample image;
inputting the sample image into a first preset sub-network of a preset model, so that the first preset sub-network performs feature extraction on the sample image to obtain information of a first sample feature of the sample image, and determining information of a first prediction category of the sample image according to the information of the first sample feature;
when the information of the first prediction type is detected to be the information of a preset reference type, inputting the sample image into a second preset sub-network of the preset model, so that the second preset sub-network performs feature extraction on the sample image to obtain information of a second sample feature of the sample image, and determining the information of the second prediction type of the sample image according to the information of the second sample feature;
and updating the model parameters of the preset model according to the information of the first prediction category, the information of the second prediction category, the first category label and the second category label until the preset model is taken as a preset cascade classification network when the preset model meets the preset training stopping condition.
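The training procedure performed by the training unit can be sketched as follows. The model dictionary, loss callables, update step, and stop condition are simplified hypothetical placeholders; a real implementation would backpropagate through neural sub-networks. Only the routing rule is taken from the description: a sample reaches the second preset sub-network only when its first prediction equals the preset reference category:

```python
def train_cascade(preset_model, samples, reference_category, max_epochs=10):
    """samples: list of (image, first_class_label, second_class_label_or_None)."""
    for _epoch in range(max_epochs):
        total_loss = 0.0
        for image, first_label, second_label in samples:
            feat1 = preset_model["subnet1_extract"](image)
            pred1 = preset_model["subnet1_predict"](feat1)
            total_loss += preset_model["loss1"](pred1, first_label)
            # Only samples whose first prediction is the preset reference
            # category are routed into the second preset sub-network.
            if pred1 == reference_category and second_label is not None:
                feat2 = preset_model["subnet2_extract"](image)
                pred2 = preset_model["subnet2_predict"](feat2)
                total_loss += preset_model["loss2"](pred2, second_label)
        preset_model["update"](total_loss)           # e.g. one gradient step
        if preset_model["should_stop"](total_loss):  # preset stop condition
            break
    return preset_model
```

When the stop condition is met, the trained preset model serves as the preset cascade classification network.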
In one possible implementation manner of the present application, the training unit is further specifically configured to:
acquiring an original image of an object to be identified;
calling a preset GAN network, and generating a plurality of supplementary images according to the original image;
the original image and the supplementary image are taken as sample images.
In a possible implementation manner of the present application, the obtaining unit 501 is further specifically configured to:
acquiring a plurality of images acquired in a specific area;
taking the plurality of images as the status image;
in a possible implementation manner of the present application, the object classification device further includes a statistical unit (not shown in the figure), and the statistical unit is further specifically configured to:
the method further comprises the following steps:
and determining the occurrence frequency of objects of a preset category in the specific area according to the information of the target category.
In a specific implementation, the above units may be implemented as independent entities, or may be combined arbitrarily to be implemented as the same or several entities, and the specific implementation of the above units may refer to the foregoing method embodiments, which are not described herein again.
Since the object classification apparatus can execute the steps in the object classification method in any embodiment of the present application corresponding to fig. 1 to 4, the beneficial effects that can be achieved by the object classification method in any embodiment of the present application corresponding to fig. 1 to 4 can be achieved, which are detailed in the foregoing description and will not be repeated herein.
In addition, in order to better implement the object classification method in the embodiment of the present application, based on the object classification method, an electronic device is further provided in the embodiment of the present application, referring to fig. 6, fig. 6 shows a schematic structural diagram of the electronic device in the embodiment of the present application, specifically, the electronic device provided in the embodiment of the present application includes a processor 601, and when the processor 601 executes a computer program stored in a memory 602, each step of the object classification method in any embodiment corresponding to fig. 1 to 4 is implemented; alternatively, the processor 601 is configured to implement the functions of the units in the corresponding embodiment of fig. 5 when executing the computer program stored in the memory 602.
Illustratively, a computer program may be partitioned into one or more modules/units, which are stored in the memory 602 and executed by the processor 601 to implement embodiments of the present application. One or more modules/units may be a series of computer program instruction segments capable of performing certain functions, the instruction segments being used to describe the execution of a computer program in a computer device.
The electronic device may include, but is not limited to, a processor 601, a memory 602. Those skilled in the art will appreciate that the illustration is merely an example of an electronic device and does not constitute a limitation of the electronic device, and may include more or less components than those illustrated, or combine some components, or different components, for example, the electronic device may further include an input output device, a network access device, a bus, etc., and the processor 601, the memory 602, the input output device, the network access device, etc., are connected via the bus.
The processor 601 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor, or any conventional processor. The processor is the control center of the electronic device, and various interfaces and lines connect the parts of the whole electronic device.
The memory 602 may be used to store computer programs and/or modules; the processor 601 implements the various functions of the electronic device by running the computer programs and/or modules stored in the memory 602 and calling the data stored in the memory 602. The memory 602 may mainly include a program storage area and a data storage area: the program storage area may store an operating system and the application programs required by at least one function (such as a sound playing function or an image playing function); the data storage area may store data created according to the use of the electronic device (such as audio data or video data). In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash memory card, at least one magnetic disk storage device, a Flash memory device, or another non-volatile solid-state storage device.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the object classification device, the electronic device and the corresponding units thereof described above may refer to the description of the object classification method in any embodiment corresponding to fig. 1 to 4, and are not described herein again in detail.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions or by associated hardware controlled by the instructions, which may be stored in a computer readable storage medium and loaded and executed by a processor.
For this reason, an embodiment of the present application provides a computer-readable storage medium, where a plurality of instructions are stored, where the instructions can be loaded by a processor to execute steps in an object classification method in any embodiment of the present application corresponding to fig. 1 to 4, and specific operations may refer to descriptions of the object classification method in any embodiment corresponding to fig. 1 to 4, which are not repeated herein.
Wherein the computer-readable storage medium may include: read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Since the instructions stored in the computer-readable storage medium can execute the steps in the object classification method in any embodiment corresponding to fig. 1 to 4 in the present application, the beneficial effects that can be achieved by the object classification method in any embodiment corresponding to fig. 1 to 4 in the present application can be achieved, which are detailed in the foregoing description and will not be repeated herein.
The foregoing detailed description is directed to a method, an apparatus, an electronic device, and a computer-readable storage medium for classifying an object provided in an embodiment of the present application, and a specific example is applied in the detailed description to explain the principles and embodiments of the present application, and the description of the foregoing embodiment is only used to help understand the method and the core idea of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.
Claims (10)
1. A method of classifying an object, the method comprising:
acquiring a state image containing an object to be classified;
performing feature extraction processing on the state image to obtain information of a first image feature of the state image;
determining information of a first identification category of the object to be classified according to the information of the first image characteristic;
when the first identification category information is detected to be the same as the preset reference category information, acquiring second identification category information of the object to be classified as target category information of the object to be classified, wherein the second identification category belongs to the first identification category.
2. The object classification method according to claim 1, wherein the acquiring, when it is detected that the information of the first identification category is the same as the information of a preset reference category, the information of the second identification category of the object to be classified as the information of the target category of the object to be classified, includes:
when it is detected that the information of the first identification category is the same as the information of the preset reference category, performing feature extraction processing according to the information of the first identification category and the state image to obtain information of a second image feature of the state image;
and determining the information of the second identification category of the object to be classified according to the information of the second image characteristic, wherein the information is used as the information of the target category of the object to be classified.
3. The object classification method according to claim 1, characterized in that the method further comprises:
and when the fact that the information of the first identification category is different from the preset reference category is detected, taking the information of the first identification category as the information of the target category of the object to be classified.
4. The object classification method according to claim 1, wherein the determining information of the first identification class of the object to be classified according to the information of the first image feature comprises:
inputting the information of the first image characteristic into a first classification sub-network of a preset cascade classification network, so that the first classification sub-network determines information of a first identification class of the object to be classified according to the information of the first image characteristic;
the determining, according to the information of the second image feature, information of a second identification category of the object to be classified as information of a target category of the object to be classified includes:
when the fact that the information of the first identification category is the same as the information of the preset reference category is detected, the state image is input into a second classification sub-network of the preset cascade classification network, so that the second classification sub-network performs feature extraction processing on the state image to obtain information of second image features of the state image, and the information of the second identification category of the object to be classified is determined according to the information of the second image features to serve as the information of the target category of the object to be classified.
5. The object classification method according to claim 4, characterized in that the method further comprises:
obtaining a sample image and obtaining a class label of the sample image, wherein the sample image comprises a first sample image, a second sample image and a third sample image, the class label comprises a first class label and a second class label, the first class label comprises a first class label of the first sample image, a first class label of the second sample image and a first class label of the third sample image, and the second class label comprises a second class label of the second sample image and a second class label of the third sample image;
inputting the sample image into a first preset sub-network of a preset model, so that the first preset sub-network performs feature extraction on the sample image to obtain information of a first sample feature of the sample image, and determining information of a first prediction category of the sample image according to the information of the first sample feature;
when the information of the first prediction type is detected to be the information of a preset reference type, inputting the sample image into a second preset sub-network of the preset model, so that the second preset sub-network performs feature extraction on the sample image to obtain information of a second sample feature of the sample image, and determining the information of the second prediction type of the sample image according to the information of the second sample feature;
and updating the model parameters of the preset model according to the information of the first prediction category, the information of the second prediction category, the first category label and the second category label until the preset model is taken as a preset cascade classification network when the preset model meets the preset training stopping condition.
6. The object classification method according to claim 5, wherein the obtaining of the sample image comprises:
acquiring an original image of an object to be identified;
calling a preset GAN network, and generating a plurality of supplementary images according to the original image;
the original image and the supplementary image are taken as sample images.
7. The object classification method according to any one of claims 1 to 6, wherein the acquiring of the state image containing the object to be classified includes:
acquiring a plurality of images acquired in a specific area;
taking the plurality of images as the status image;
the method further comprises the following steps:
and determining the occurrence frequency of objects of a preset category in the specific area according to the information of the target category.
8. An object sorting device, characterized in that it comprises:
an acquisition unit for acquiring a state image containing an object to be classified;
the processing unit is used for carrying out feature extraction processing on the state image to obtain information of a first image feature of the state image;
the first classification unit is used for determining information of a first identification category of the object to be classified according to the information of the first image characteristic;
and the second classification unit is used for, when the information of the first identification category is detected to be the same as the information of the preset reference category, acquiring information of a second identification category of the object to be classified as the information of the target category of the object to be classified, wherein the second identification category belongs to the first identification category.
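The four units of claim 8 compose into a single inference path: extract features, classify coarsely, and invoke the fine-grained classifier only when the coarse result matches the reference category. A minimal sketch, with all callables as hypothetical stand-ins for the trained sub-networks:

```python
def classify(image, extract_features, coarse_classifier, fine_classifier,
             reference_category):
    """Two-stage inference mirroring the units of claim 8."""
    features = extract_features(image)            # processing unit
    first_category = coarse_classifier(features)  # first classification unit
    if first_category == reference_category:      # second classification unit
        # Fine-grained target category within the reference category.
        return fine_classifier(image)
    # Otherwise the coarse category already is the target category.
    return first_category
```

The design choice is a cost/accuracy trade-off: the expensive fine-grained classifier runs only on the subset of images the coarse stage routes to it.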
9. An electronic device, comprising a processor and a memory, wherein the memory stores a computer program, and the processor, when calling the computer program in the memory, performs the object classification method according to any one of claims 1 to 7.
10. A computer-readable storage medium, having stored thereon a computer program which is loaded by a processor to perform the steps of the object classification method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010629378.3A CN113963189A (en) | 2020-07-03 | 2020-07-03 | Object classification method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113963189A true CN113963189A (en) | 2022-01-21 |
Family
ID=79459307
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010629378.3A Pending CN113963189A (en) | 2020-07-03 | 2020-07-03 | Object classification method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113963189A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117911795A (en) * | 2024-03-18 | 2024-04-19 | 杭州食方科技有限公司 | Food image recognition method, apparatus, electronic device, and computer-readable medium |
CN117911795B (en) * | 2024-03-18 | 2024-06-11 | 杭州食方科技有限公司 | Food image recognition method, apparatus, electronic device, and computer-readable medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Bergmann et al. | The MVTec anomaly detection dataset: a comprehensive real-world dataset for unsupervised anomaly detection | |
CN107871130B (en) | Image processing | |
Seguí et al. | Learning to count with deep object features | |
US10824832B2 (en) | Barcode tag detection in side view sample tube images for laboratory automation | |
CN114937179B (en) | Junk image classification method and device, electronic equipment and storage medium | |
CN105574550A (en) | Vehicle identification method and device | |
CN109727275B (en) | Object detection method, device, system and computer readable storage medium | |
JP2018512567A5 (en) | ||
CN107808126A (en) | Vehicle retrieval method and device | |
CN112633297A (en) | Target object identification method and device, storage medium and electronic device | |
Awang et al. | Vehicle counting system based on vehicle type classification using deep learning method | |
Bharadwaj et al. | Can holistic representations be used for face biometric quality assessment? | |
Rematas et al. | Efficient object detection and segmentation with a cascaded hough forest ism | |
US20220375202A1 (en) | Hierarchical sampling for object identification | |
CN113963189A (en) | Object classification method and device, electronic equipment and storage medium | |
CN114255435A (en) | Method and device for detecting abnormality of transport device, electronic apparatus, and storage medium | |
CN112287905A (en) | Vehicle damage identification method, device, equipment and storage medium | |
CN113743434A (en) | Training method of target detection network, image augmentation method and device | |
Cabrera-Gámez et al. | Exploring the use of local descriptors for fish recognition in lifeclef 2015 | |
Kaja et al. | Two stage intelligent automotive system to detect and classify a traffic light | |
Le et al. | Combining deep and handcrafted image features for vehicle classification in drone imagery | |
CN112990245A (en) | Article identification method, apparatus, device and storage medium | |
CN113761263A (en) | Similarity determination method and device and computer readable storage medium | |
CN105160333A (en) | Vehicle model identifying method and vehicle model identifying device | |
Göngör et al. | Design of a chair recognition algorithm and implementation to a humanoid robot |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||