CN110705613B - Object classification method - Google Patents

Object classification method

Info

Publication number
CN110705613B
CN110705613B
Authority
CN
China
Prior art keywords
classification
result
objects
target object
distance
Prior art date
Legal status
Active
Application number
CN201910885199.3A
Other languages
Chinese (zh)
Other versions
CN110705613A (en)
Inventor
Zhang Fa'en (张发恩)
Song Liang (宋亮)
Qin Yongqiang (秦永强)
Current Assignee
Innovation Qizhi (Qingdao) Technology Co., Ltd.
Original Assignee
Innovation Qizhi (Qingdao) Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Innovation Qizhi (Qingdao) Technology Co., Ltd.
Priority to CN201910885199.3A
Publication of CN110705613A
Application granted
Publication of CN110705613B
Legal status: Active


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an object classification method comprising the following steps: firstly, detecting all targets in a scene with a detection model and performing an initial coarse classification; secondly, constructing a positional relation graph of the coarsely classified targets from their mutual positions and computing relation-influence weights from the relative distances; thirdly, performing a graph convolution on the positional relation graph with the weights computed in the second step to obtain a smoothed result; and repeating these steps until the change value stops changing. By constructing a positional relation graph over different objects, the final result is not merely the classification of a single object but also takes the classifications of the surrounding objects into account. Objects that a machine learning model finds hard to distinguish, for example because of occlusion or blurring, can therefore be reclassified, and a simple inference can be made: the class of a hard-to-distinguish object is deduced from the classification results of the objects around it.

Description

Object classification method
Technical Field
The invention relates to the technical field of product classification, and in particular to an object classification method.
Background
With the development of artificial intelligence, detection and classification techniques play an increasingly important role in daily life. However, current mainstream detection and classification techniques essentially classify each object independently, without considering the interrelations between objects. In scenes where these mutual relations matter, they perform poorly and fall short of human accuracy.
Disclosure of Invention
To overcome these shortcomings of the prior art, the invention aims to provide an object classification method with which the class of a hard-to-distinguish object can be obtained by simple deduction.
In order to achieve the purpose, the invention adopts the following technical scheme:
a method of classifying an object, characterized by: the object classification method comprises the following steps
Firstly, detecting all targets in a scene using a detection model and performing an initial coarse classification;
secondly, constructing a positional relation graph of the coarsely classified targets according to their mutual positions, and computing relation-influence weights from the relative distances;
thirdly, performing a graph convolution on the positional relation graph using the relation-influence weights computed in the second step, to obtain a smoothed result;
and repeating the above steps until the change value stops changing.
The coarse classification uses a convolutional neural network such as ResNet and yields, for each object, a probability of belonging to each class.
The mutual positional relation uses the Euclidean pixel distance as the distance, and the weight is e^(-distance/k), where k is a parameter that can be tuned according to the actual effect.
The convolution in the third step is a weighting according to the weights computed in the second step, and the smoothed result is x1 = sum(wi * x0i); the change value is x1 - x0.
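Writing x_i^(t) for the probability vector of object i after t smoothing rounds and d_ij for the Euclidean pixel distance between objects i and j, the formulas above admit the following consolidated reading (a sketch of one consistent interpretation; the patent text does not spell out the indexing):

```latex
w_{ij} = e^{-d_{ij}/k}, \qquad
x_i^{(t+1)} = \sum_{j} w_{ij}\, x_j^{(t)}, \qquad
\text{repeat until } x^{(t+1)} - x^{(t)} \text{ stops changing.}
```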
The invention has the following beneficial effects:
according to the invention, through constructing the position relation diagram of different objects, the final result is not only the classification result of a single object, but also the classification result of surrounding objects is considered, namely the target object and the surrounding objects are combined to carry out the inference of one classification result, so that objects which are difficult to distinguish by a machine learning model due to the existence of shielding blurring and the like can be reclassified, and thus, simple inference can be carried out, namely the classification results of the objects which are difficult to distinguish are inferred through the classification results of the surrounding objects.
Drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings required by the embodiments are briefly described below. The drawings described below are only some embodiments of the invention; a person skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of an object classification method of the present invention;
FIG. 2 is a schematic diagram of a classification module of the object classification method of the present invention.
Detailed Description
The technical scheme of the invention is further explained by the specific implementation mode in combination with the attached drawings.
The drawings are for illustration only, are shown in schematic rather than actual form, and are not to be construed as limiting this patent. To better illustrate the embodiments of the invention, some parts of the drawings may be omitted, enlarged, or reduced; they do not represent the size of an actual product. It will be understood by those skilled in the art that certain well-known structures and their descriptions may be omitted from the drawings.
The same or similar reference numerals in the drawings of the embodiments correspond to the same or similar components. In the description of the invention, terms such as "upper", "lower", "left", "right", "inner", and "outer" indicate orientations or positional relationships based on the drawings and are used only for convenience and simplicity of description; they do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation, and are therefore not to be construed as limiting this patent. The specific meanings of these terms can be understood by those skilled in the art according to the specific situation.
The object classification method of the invention comprises the following steps:
firstly, detecting all targets in a scene by using a detection model such as Mask-RCNN, and carrying out primary rough classification, namely the probability that the result of the convolutional neural network belongs to each class, if objects are classified into three classes, the result of each object rough classification [ 0.10.10.8 ] respectively represents that the probability of the first class is 0.1, the probability of the second class is 0.1, and the probability of the third class is 0.8, namely the objects are classified into the third class. The convolutional neural network result such as a ResNet result; ResNet (residual Neural network) was proposed by four people, Kaiming He, of Microsoft institute, and a 152-layer Neural network was successfully trained using the ResNet Unit. The structure of ResNet can accelerate the training of the neural network very fast, and the accuracy of the model is greatly improved. Meanwhile, the popularization of ResNet is very good, and even the ResNet can be directly used in an IncepotionNet network. Mask R-CNN is a two-stage framework, the first stage scanning the image and generating proposals (i.e., areas that may contain an object), the second stage classifying the proposals and generating bounding boxes and masks. The Mask R-CNN was extended from Faster R-CNN and was proposed by the same author in the last year. The Faster R-CNN is a popular target detection framework, and the Mask R-CNN expands the target detection framework into an example segmentation framework.
Secondly, a positional relation graph of the coarsely classified targets is constructed according to their mutual positions, and relation-influence weights are computed from the relative distances. The Euclidean pixel distance is used as the distance, the relation-influence weight is e^(-distance/k), and k is a parameter tuned according to the actual effect;
thirdly, graph convolution is carried out on the position relation graph by using the weights calculated in the second step, the convolution is weighting, and the smoothed result is x1 sum (wi x0 i);
the above steps are repeated until the variation value stops changing, namely the variation value x1-x0 stops changing, and the variation value x1-x0 is as small as possible. The technical meanings given for the parameters e ^ distance, k, (distance/k), sum, wi, x0i, x0, x1 by the aforementioned e ^ - (distance/k), x1 ═ sum (wi ^ x0i) are as follows: e ^ index; distance is the pixel distance of two objects (the pixel distance is the distance on the picture); sum; wi is the calculated weight; x is the calculated probability. x1 ═ sum (wi x0i) depends on the calculation result of e ^ - (distance/k); x1-x0 refers to the correction of probability.
For example: a detection model such as Mask R-CNN detects all targets in a scene, including a target object A and surrounding objects B. The target object A is occluded or blurred, so the machine learning model has difficulty distinguishing it, whereas the classes of the surrounding objects B are easy for the model to distinguish. The target object A and the surrounding objects B are given an initial coarse classification, the classification principle being to obtain, from the convolutional neural network result, a probability value of belonging to each class. Here a surrounding object B is a neighboring object sufficiently close to the target object A.
A positional relation graph of the target object A and the surrounding objects B is constructed according to their mutual positions, and relation-influence weights are computed from the relative distances; the Euclidean pixel distance is used as the distance, the relation-influence weight is e^(-distance/k), and k is a parameter tuned according to the actual effect;
the position map is subjected to graph convolution, that is, weighting, using the calculated weights, and a smoothed result x1 is sum (wi x0 i).
The above steps are repeated until the change value x1 - x0 stops changing; the target object is then considered highly likely to belong to the same class as the surrounding objects B, and the class of the surrounding objects B is inferred to be the class of the target object A.
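A usage sketch of this A-and-B example with the two functions above; the coordinates, probability vectors, and k are illustrative numbers, not values from the patent:

```python
# Target object A is occluded, so its coarse result is ambiguous;
# the two nearby surrounding objects B are confidently class 3.
centers = np.array([[100.0, 100.0],    # target object A
                    [110.0, 100.0],    # surrounding object B1
                    [100.0, 112.0]])   # surrounding object B2
x0 = np.array([[0.40, 0.35, 0.25],
               [0.05, 0.05, 0.90],
               [0.05, 0.05, 0.90]])
w = relation_weights(centers, k=100.0)
x = smooth_until_stable(x0, w)
print(x[0].argmax() + 1)  # prints 3: A inherits the class of its neighbours
```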
Because the object classification method considers the mutual positional relations of the objects, the detection and classification result takes the interrelations of different objects into account, making the classification more stable and closer to human-level results. The technical core of the invention is the construction of a positional relation graph over different objects, so that the final result is not merely the classification of a single target object: the classifications of the surrounding objects are also considered and an inference is made. In effect, the process weights each object by the classifications around it, on the premise that a target object and nearby similar objects are likely to belong to the same class. Objects that the machine learning model finds hard to distinguish because of occlusion, blurring, and the like are thus reclassified: the model weights them by the computed weights to obtain the smoothed result x1 = sum(wi * x0i); the steps are repeated, and once the change value x1 - x0 stops changing, the target object is considered highly likely to match its surrounding objects, so the class of the surrounding (similar) objects is inferred to be the class of the target object, i.e. of the object the model found hard to distinguish. The invention thus infers the classes of hard-to-distinguish target objects from the objects around them.
In summary, the object classification method of the invention connects sufficiently close neighboring objects with edges, so that the final result is not merely the classification of a single object but an inference that also considers the classification results of the surrounding objects. Objects that the machine learning model finds hard to distinguish because of occlusion, blurring, and the like are therefore reclassified, and the model can infer their classes from the surrounding objects by simple deduction.
Referring to FIG. 1, the principle of the object classification method of the invention is as follows: detect and coarsely classify with a detection model; construct an object relation graph for the picture and compute the weights; convolve the graph with the computed weights to obtain the smoothed result x1 = sum(wi * x0i); and repeat until the change value x1 - x0 stops changing. FIG. 2 shows the classification module, i.e. the detection-classification model, the object relation graph, and the graph convolution.
It should be noted that the above embodiments are only preferred embodiments of the invention and of the technical principles used; any changes or substitutions that can easily be conceived by those skilled in the art within the technical scope disclosed herein fall within the protective scope of the invention.

Claims (2)

1. An object classification method, characterized in that the method constructs a positional relation graph of different objects so that the final result is not merely the classification of a single target object but also considers the classifications of the surrounding objects; by weighting the classification results of the surrounding objects, it is inferred that the target object and the surrounding objects are likely to be of the same class, and the classification of a hard-to-distinguish object is deduced from the classification results of the surrounding objects, the hard-to-distinguish object being called the target object;
the method comprises the following steps: firstly, detecting all targets in a scene using a detection model, the targets including a target object and surrounding objects, wherein the target object is occluded or blurred so that the detection model has difficulty distinguishing it, the classes of the surrounding objects are easy to distinguish, and a surrounding object is a neighboring object sufficiently close to the target object; and performing an initial coarse classification of the target object and the surrounding objects, the classification principle being to obtain a probability value of belonging to each class from a convolutional neural network result;
secondly, constructing a positional relation graph of the coarsely classified targets according to their mutual positions and computing relation-influence weights from the relative distances, using the Euclidean pixel distance as the distance, wherein the relation-influence weight is e^(-distance/k) and k is a parameter tuned according to the actual effect;
thirdly, performing a graph convolution on the positional relation graph using the relation-influence weights computed in the second step to obtain a smoothed result, and repeating the second and third steps until the change value stops changing, whereupon the target object is considered highly likely to belong to the same class as the surrounding objects and the class of the surrounding objects is inferred to be the class of the target object; the convolution in the third step is a weighting according to the weights computed in the second step, the smoothed result is x1 = sum(wi * x0i), and the change value is x1 - x0, where e^ denotes the exponential, distance is the pixel distance between two objects (the distance on the picture), sum denotes summation, wi is the computed weight, and x is the computed probability; x1 = sum(wi * x0i) depends on the result of e^(-distance/k), and x1 - x0 is the correction applied to the probability.
2. The object classification method according to claim 1, characterized in that: the detection model is Mask R-CNN, a two-stage framework whose first stage scans the image and generates proposals and whose second stage classifies the proposals and generates bounding boxes and masks;
the coarse classification result is, for example, [0.1, 0.1, 0.8], meaning the probability of the first class is 0.1, of the second class 0.1, and of the third class 0.8, i.e. the object is assigned to the third class; the convolutional neural network result is a ResNet result.
CN201910885199.3A 2019-09-19 2019-09-19 Object classification method Active CN110705613B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910885199.3A CN110705613B (en) 2019-09-19 2019-09-19 Object classification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910885199.3A CN110705613B (en) 2019-09-19 2019-09-19 Object classification method

Publications (2)

Publication Number Publication Date
CN110705613A (en) 2020-01-17
CN110705613B (en) 2021-06-11

Family

ID=69196245

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910885199.3A Active CN110705613B (en) 2019-09-19 2019-09-19 Object classification method

Country Status (1)

Country Link
CN (1) CN110705613B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111414962B (en) * 2020-03-19 2023-06-23 创新奇智(重庆)科技有限公司 Image classification method introducing object relation
CN111612051B (en) * 2020-04-30 2023-06-20 杭州电子科技大学 Weak supervision target detection method based on graph convolution neural network

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105279517A (en) * 2015-09-30 2016-01-27 西安电子科技大学 Weakly-tagged social image recognition method based on a semi-supervised relational topic model
KR101771044B1 (en) * 2015-11-27 2017-08-24 연세대학교 산학협력단 Apparatus and Method of Object Recognition based on Space-Object Relation Graph
US11853903B2 (en) * 2017-09-28 2023-12-26 Siemens Aktiengesellschaft SGCNN: structural graph convolutional neural network
US20190122111A1 (en) * 2017-10-24 2019-04-25 Nec Laboratories America, Inc. Adaptive Convolutional Neural Knowledge Graph Learning System Leveraging Entity Descriptions
CN110009093B (en) * 2018-12-07 2020-08-07 阿里巴巴集团控股有限公司 Neural network system and method for analyzing relational network graph
CN110084296B (en) * 2019-04-22 2023-07-21 中山大学 Graph representation learning framework based on specific semantics and multi-label classification method thereof
CN110111325A (en) * 2019-05-14 2019-08-09 深圳大学 Neuroimaging classification method, terminal and computer readable storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109255791A (en) * 2018-07-19 2019-01-22 杭州电子科技大学 Shape co-segmentation method based on graph convolutional neural networks
CN109087245A (en) * 2018-08-13 2018-12-25 长治学院 Unmanned aerial vehicle remote sensing image mosaic system based on a neighboring-relations model
CN109446413A (en) * 2018-09-25 2019-03-08 上海交通大学 Sequential recommendation method based on item association relationships
CN109918542A (en) * 2019-01-28 2019-06-21 华南理工大学 Convolution classification method and system for relation graph data
CN110222611A (en) * 2019-05-27 2019-09-10 中国科学院自动化研究所 Human skeleton behavior recognition method, system, and device based on graph convolutional networks
CN110222771A (en) * 2019-06-10 2019-09-10 成都澳海川科技有限公司 Classification and recognition method for zero-sample pictures

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Improving Object Detection with Relation Graph Inference; Chen-Hang He et al.; ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2019-04-17; Section 3, Figures 1-3 *
Learning Semantics-Preserving Attention and Contextual Interaction for Group Activity Recognition; Yansong Tang et al.; IEEE Transactions on Image Processing; 2019-05-08; Vol. 28, No. 10, pp. 4997-5012 *
Local Spectral Graph Convolution for Point Set Feature Learning; Chu Wang et al.; ECCV 2018: Computer Vision - ECCV 2018; 2018-10-06; pp. 56-71 *
Spatial-aware Graph Relation Network for Large-scale Object Detection; Hang Xu et al.; 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2019-06-20; pp. 9290-9299 *

Also Published As

Publication number Publication date
CN110705613A (en) 2020-01-17

Similar Documents

Publication Publication Date Title
CN109886121B (en) Human face key point positioning method for shielding robustness
CN111797716B (en) Single target tracking method based on Siamese network
CN111723654B (en) High-altitude parabolic detection method and device based on background modeling, YOLOv3 and self-optimization
CN108416266B (en) Method for rapidly identifying video behaviors by extracting moving object through optical flow
Zhu et al. RGB-D local implicit function for depth completion of transparent objects
CN108961235A (en) A kind of disordered insulator recognition methods based on YOLOv3 network and particle filter algorithm
CN109241982A (en) Object detection method based on depth layer convolutional neural networks
CN116152267A (en) Point cloud instance segmentation method based on contrast language image pre-training technology
CN113706581B (en) Target tracking method based on residual channel attention and multi-level classification regression
CN112950780B (en) Intelligent network map generation method and system based on remote sensing image
CN113408584B (en) RGB-D multi-modal feature fusion 3D target detection method
CN111652317A (en) Hyper-parameter image segmentation method based on Bayesian deep learning
CN110705613B (en) Object classification method
CN113326735B (en) YOLOv 5-based multi-mode small target detection method
CN106780546A (en) The personal identification method of the motion blur encoded point based on convolutional neural networks
CN110084284A (en) Target detection and secondary classification algorithm and device based on region convolutional neural networks
Fang et al. Laser stripe image denoising using convolutional autoencoder
CN116385958A (en) Edge intelligent detection method for power grid inspection and monitoring
CN114332473A (en) Object detection method, object detection device, computer equipment, storage medium and program product
CN112668672A (en) TensorRT-based target detection model acceleration method and device
CN115171183A (en) Mask face detection method based on improved yolov5
Kim et al. Fruit tree disease classification system using generative adversarial networks
Xing et al. Multi-level adaptive perception guidance based infrared and visible image fusion
CN112668662B (en) Outdoor mountain forest environment target detection method based on improved YOLOv3 network
CN110348311B (en) Deep learning-based road intersection identification system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant