Vehicle appearance damage identification method based on deep learning
Technical Field
The invention belongs to the field of artificial intelligence, and particularly relates to a vehicle appearance damage identification method based on deep learning.
Background
In recent years, with the sustained and rapid development of China's economy and society, the number of motor vehicles in China has grown rapidly, and vehicle traffic accidents occur frequently. Generally, after a vehicle is involved in a traffic accident, a professional claims adjuster from an insurance company must identify the vehicle damage by manual judgment, which leads to problems such as low case-processing efficiency for the insurance company and long waiting times for the vehicle owner.
A search of the prior art shows that Chinese patent document No. CN105678622A, published June 15, 2016, entitled "Analysis method and system of vehicle insurance claim settlement photos", discloses a method that uses a conventional convolutional neural network to analyze accident photos uploaded from a mobile terminal, identify the damaged parts, and generate reminder information based on the analysis result. That method only determines the damaged part of the vehicle and cannot identify the specific damage type. In addition, its damage assessment result still needs to be verified manually, so the labor cost remains high.
Further, Chinese patent document No. CN107358596A, published November 17, 2017, discloses an image-based vehicle damage assessment method, apparatus, electronic device, and system. In that patent, damage to vehicle appearance parts is identified by training, on samples, a network model composed of a convolutional layer (CNN) and a region proposal layer (RPN). The method adopts reference frames of multiple scales and aspect ratios, which effectively improves the detection of damage at unconventional scales and proportions. The algorithm as a whole is divided into two stages: first, the RPN screens coarse candidate regions from the feature map, and then a convolutional neural network classifies the obtained candidate regions and regresses their positions. However, the process is complex and computationally expensive, so detection is slow and a real-time effect is difficult to achieve.
Disclosure of Invention
In view of this, the invention aims to provide a vehicle appearance damage identification method based on deep learning, which uses a deep convolutional neural network model to identify the damage type and degree of vehicle appearance parts in complex environments, and which improves the running speed of the algorithm while maintaining its accuracy.
In order to achieve the purpose, the technical scheme of the invention is realized as follows:
a vehicle appearance damage identification method based on deep learning comprises the following steps:
step one: acquiring an actual vehicle appearance damage image and marking the damage type and position;
step two: building a deep convolutional neural network;
step three: carrying out model training to obtain a trained model;
step four: and carrying out vehicle appearance damage identification and model evaluation by using the model obtained by training.
Further, in step one, a data set is built in-house, and the acquired vehicle appearance damage images, taken at various shooting angles, of various vehicle types, and in various environments, are stored in the data set.
The acquired images cover distant and close-up views, by day and at night, of damaged appearance for vehicles such as cars, SUVs, MPVs, and cross-type passenger vehicles, from multiple directions such as the front, rear, and sides of the vehicle.
An image annotation tool is used to label the damage to vehicle appearance parts, such as scratches and dents.
Further, in step one, the labeled data set is divided into a training set, a validation set, and a test set.
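As an illustration of step one, the sketch below shows one way the labeled records might be organized and split; the annotation fields, file names, and 8:1:1 split ratio are illustrative assumptions and are not fixed by the invention.

```python
import json
import random

# Hypothetical annotation record, e.g. as exported from an image labeling
# tool: one image with a bounding box and a damage type per damage instance.
sample = {
    "image": "images/car_0001.jpg",      # assumed path
    "boxes": [[120, 260, 340, 410]],     # [x1, y1, x2, y2] in pixels
    "labels": ["scratch"],               # damage type per box
}

def split_dataset(records, seed=42):
    """Shuffle labeled records and split them into training, validation,
    and test sets. The 8:1:1 ratio is an assumption for illustration."""
    random.Random(seed).shuffle(records)
    n_train = int(0.8 * len(records))
    n_val = int(0.1 * len(records))
    return (records[:n_train],
            records[n_train:n_train + n_val],
            records[n_train + n_val:])

# "annotations.jsonl" is an assumed file with one JSON record per line.
records = [json.loads(line) for line in open("annotations.jsonl")]
train_set, val_set, test_set = split_dataset(records)
```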
Further, in step two, a backbone network is first built to perform feature extraction based on a convolutional neural network (CNN), and parameters such as the network threshold and the maximum number of iterations are set;
a candidate box generation network is built, which takes the extracted feature map as input and generates candidate boxes based on a convolutional neural network (CNN); the candidate boxes include both foreground and background boxes, and the generated candidate boxes are fed directly into the next part of the network without preliminary screening, which greatly shortens the running time of the model;
a target classification network and a bounding box regression network are built, which take the candidate boxes as input and perform classification of the targets in the candidate boxes and regression of the target positions.
Further, in step two, a residual network (ResNet) is used as the backbone network, and the backbone is extended with a feature pyramid network (FPN).
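A minimal sketch of such a backbone follows, assuming torchvision (version 0.13 or later) and its resnet_fpn_backbone helper; the choice of ResNet-50 and the input size are illustrative.

```python
import torch
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

# Build a ResNet backbone extended with a feature pyramid network (FPN).
# ResNet-50 without pretrained weights is an assumption for illustration.
backbone = resnet_fpn_backbone(backbone_name="resnet50", weights=None)

image = torch.randn(1, 3, 512, 512)   # dummy input image
features = backbone(image)            # dict of multi-scale feature maps
for name, fmap in features.items():
    print(name, tuple(fmap.shape))    # pyramid levels '0'..'3' and 'pool'
```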
Further, in step two, the built deep neural network is a feedforward network; model training is completed by constructing a loss function and continuously feeding back and adjusting the network parameters. The traditional one-stage network loss function is the cross entropy (CE) loss:

CE(p, y) = -log(p) if y = 1, and CE(p, y) = -log(1 - p) if y = -1,

where y = 1 denotes a positive sample, y = -1 denotes a negative sample, and p ∈ [0, 1] is the confidence score. With this function, when a large number of simple samples exists, even though the error produced by each individual sample is small, the sum of these errors can strongly influence the detector.

The class imbalance problem can be solved by adding a weight function in front of the cross entropy loss. Let

p_t = p if y = 1, and p_t = 1 - p otherwise,

so that CE(p, y) = CE(p_t) = -log(p_t). Multiplying by the weight function (1 - p_t)^γ yields the new loss function:

NCE(p_t) = -(1 - p_t)^γ log(p_t),

where γ is a modulating factor and γ > 0.

The new loss function solves the class imbalance problem of the traditional one-stage network; that is, it prevents the situation in which, owing to foreground-background class imbalance, a large number of simple negative samples overwhelms the detector during training.
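This derivation matches the focal loss formulation; a minimal PyTorch sketch follows, in which γ = 2 and the numerical-stability clamp are implementation assumptions (the text only requires γ > 0).

```python
import torch

def nce_loss(p, y, gamma=2.0):
    """Weighted cross entropy NCE(p_t) = -(1 - p_t)^gamma * log(p_t).

    p: predicted confidence scores in [0, 1]
    y: ground-truth labels in {1, -1} (positive / negative samples)
    gamma: modulating factor; 2.0 is an assumed value, only gamma > 0 is required.
    """
    p_t = torch.where(y == 1, p, 1.0 - p)  # p_t = p if y = 1, else 1 - p
    p_t = p_t.clamp(min=1e-8)              # avoid log(0)
    return -((1.0 - p_t) ** gamma) * torch.log(p_t)

# Easy samples (p_t close to 1) are down-weighted, so abundant simple
# negatives no longer dominate the summed loss.
p = torch.tensor([0.95, 0.10])   # one easy sample, one hard sample
y = torch.tensor([1, 1])
print(nce_loss(p, y))            # the easy sample's loss is sharply reduced
```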
Further, in step three, the parameters to be trained in the network are first initialized, and the training set is input into the initialized network for forward propagation; using the loss function of step two and the feedforward nature of the convolutional neural network, the parameters in the network are adjusted through feedback, and training ends when the loss value falls below a set threshold or the maximum number of iterations is reached, finally yielding a trained network model for identifying vehicle appearance damage.
Further, in step three, the training sample data in the training set includes the original image, the damage location, and the damage type information.
Further, in step three, the model outputs the appearance damage found in the vehicle picture, including the damage type and the damage position.
Compared with the prior art, the vehicle appearance damage identification method based on deep learning has the following advantages:
according to the vehicle appearance damage identification method based on deep learning, the identification of the vehicle appearance damage is realized through the built deep convolutional neural network model, the vehicle appearance damage is accurate in positioning and comprises the damage type and the damage position;
the model algorithm solves the problem of low detection precision caused by the fact that the category is unbalanced in practice by improving the loss function of the algorithm, and realizes the function of quickly and accurately identifying the appearance damage of the vehicle;
the method and the system are beneficial to improving the efficiency and the accuracy of identifying the damage of the vehicle part by an insurance company, and really solve the problems of high labor calling cost, long waiting time of a vehicle owner and the like in actual claim settlement.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate an embodiment of the invention and, together with the description, serve to explain the invention and not to limit the invention.
In the drawings:
fig. 1 is a flowchart of a vehicle appearance damage identification method based on deep learning according to an embodiment of the present invention;
fig. 2 is a network model structure diagram of a vehicle appearance damage identification method based on deep learning according to an embodiment of the present invention.
Detailed Description
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "up", "down", "front", "back", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on those shown in the drawings, and are used only for convenience in describing the present invention and for simplicity in description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and thus, are not to be construed as limiting the present invention. Furthermore, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first," "second," etc. may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless otherwise specified.
In the description of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "connected," and "coupled" are to be construed broadly, e.g., as meaning a fixed connection, a removable connection, or an integral connection; a mechanical or an electrical connection; a direct connection or an indirect connection through an intervening medium, or internal communication between two elements. The specific meanings of the above terms in the present invention can be understood by those of ordinary skill in the art according to the specific situation.
The present invention will be described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
As shown in fig. 1-2, a method for identifying vehicle appearance damage based on deep learning includes:
step one: acquiring an actual vehicle appearance damage image and marking the damage type and position;
step two: building a deep convolutional neural network;
step three: carrying out model training to obtain a trained model;
step four: and carrying out vehicle appearance damage identification and model evaluation by using the model obtained by training.
As shown in fig. 1-2, in step one, a data set is created, and the acquired images of vehicle appearance damage, taken at various shooting angles, of various vehicle types, and in various environments, are stored in the data set.
The acquired images cover distant and close-up views, by day and at night, of damaged appearance for vehicles such as cars, SUVs, MPVs, and cross-type passenger vehicles, from multiple directions such as the front, rear, and sides of the vehicle.
An image annotation tool is used to label the damage to vehicle appearance parts, such as scratches and dents.
As shown in fig. 1-2, in step one, the labeled data set is divided into a training set, a validation set, and a test set.
As shown in fig. 1-2, in step two, a backbone network is first built to perform feature extraction based on a convolutional neural network (CNN), and parameters such as the network threshold and the maximum number of iterations are set;
a candidate box generation network is built, which takes the extracted feature map as input and generates candidate boxes based on a convolutional neural network (CNN); the candidate boxes include both foreground and background boxes, and the generated candidate boxes are fed directly into the next part of the network without preliminary screening, which greatly shortens the running time of the model;
a target classification network and a bounding box regression network are built, which take the candidate boxes as input and perform classification of the targets in the candidate boxes and regression of the target positions.
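A minimal sketch of such parallel classification and box-regression heads, in the style of a dense one-stage detector, is shown below; the number of convolutional blocks, channel width, anchor count, and class count are assumptions for illustration.

```python
import torch
from torch import nn

class DetectionHead(nn.Module):
    """Parallel classification and bounding-box regression subnets applied to
    each feature map. Candidate boxes are predicted densely at every location
    (foreground and background alike) with no preliminary screening stage."""

    def __init__(self, in_ch=256, num_anchors=9, num_classes=4):
        super().__init__()
        def subnet(out_ch):
            layers = []
            for _ in range(4):  # 4 conv blocks per subnet (assumed depth)
                layers += [nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.ReLU()]
            layers.append(nn.Conv2d(in_ch, out_ch, 3, padding=1))
            return nn.Sequential(*layers)
        self.cls_subnet = subnet(num_anchors * num_classes)  # class scores
        self.box_subnet = subnet(num_anchors * 4)            # box offsets

    def forward(self, fmap):
        return self.cls_subnet(fmap), self.box_subnet(fmap)

head = DetectionHead()
cls_out, box_out = head(torch.randn(1, 256, 64, 64))
print(cls_out.shape, box_out.shape)  # (1, 36, 64, 64) and (1, 36, 64, 64)
```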
As shown in fig. 1-2, in step two, a residual network (ResNet) with strong feature expression capability, such as ResNet-50 or ResNet-101, is used as the backbone network; extending the backbone with a feature pyramid network (FPN), for example ResNet-101 + FPN, enables the feature extraction network to better capture features at multiple scales.
As shown in fig. 1-2, in step two, the constructed deep neural network is a feedforward network; model training is completed by constructing a loss function and continuously feeding back and adjusting the network parameters. The traditional one-stage network loss function is the cross entropy (CE) loss:

CE(p, y) = -log(p) if y = 1, and CE(p, y) = -log(1 - p) if y = -1,

where y = 1 denotes a positive sample, y = -1 denotes a negative sample, and p ∈ [0, 1] is the confidence score. With this function, when a large number of simple samples exists, even though the error produced by each individual sample is small, the sum of these errors can strongly influence the detector.

The class imbalance problem can be solved by adding a weight function in front of the cross entropy loss. Let

p_t = p if y = 1, and p_t = 1 - p otherwise,

so that CE(p, y) = CE(p_t) = -log(p_t). Multiplying by the weight function (1 - p_t)^γ yields the new loss function:

NCE(p_t) = -(1 - p_t)^γ log(p_t),

where γ is a modulating factor and γ > 0.

The new loss function solves the class imbalance problem of the traditional one-stage network; that is, it prevents the situation in which, owing to foreground-background class imbalance, a large number of simple negative samples overwhelms the detector during training.
As shown in fig. 1-2, in step three, the parameters to be trained in the network are first initialized; in this embodiment, ResNet-101 parameters are used as the initial values of the convolutional part of the network. The training set is then input into the initialized network for forward propagation; using the loss function of step two and the feedforward nature of the convolutional neural network, the parameters in the network are adjusted through feedback, and training ends when the loss value falls below a set threshold or the maximum number of iterations is reached, finally yielding a trained network model for identifying vehicle appearance damage.
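The training procedure can be condensed into a sketch like the following, where the model, data loader, and loss function (e.g. built on the NCE loss above) are assumed to exist, and the optimizer choice and hyperparameter values are illustrative.

```python
import torch

def train(model, train_loader, compute_loss,
          lr=0.01, loss_threshold=0.05, max_iters=100_000):
    """Adjust network parameters by feedback until the loss falls below the
    set threshold or the maximum number of iterations is reached; the SGD
    settings and the two stopping values are assumptions."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    it = 0
    while it < max_iters:
        for images, targets in train_loader:
            optimizer.zero_grad()
            outputs = model(images)                # forward propagation
            loss = compute_loss(outputs, targets)  # e.g. the NCE loss
            loss.backward()                        # feed errors back
            optimizer.step()                       # adjust parameters
            it += 1
            if loss.item() < loss_threshold or it >= max_iters:
                torch.save(model.state_dict(), "damage_model.pt")
                return model
    torch.save(model.state_dict(), "damage_model.pt")
    return model
```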
As shown in fig. 1-2, in step three, the training sample data in the training set include the original image, the damage position, and the damage type information.
As shown in fig. 1-2, in step three, the model outputs the appearance damage found in the vehicle picture, including the damage type and the damage position.
In this embodiment, the trained network is applied: one or more images to be detected are input into the trained network, and the vehicle damage type (including degree) and the damage position corresponding to each image are output, where the output image region corresponds to the location of the vehicle appearance damage in the image. Specifically, if an input image contains one instance of vehicle appearance damage (contained in the damage labels of the training samples), that appearance damage and its corresponding position are output; if there are k instances of appearance damage (contained in the trained appearance label types), the k instances and their corresponding positions are output.
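A sketch of this inference step follows, assuming a torchvision-style detection output (a dict with 'boxes', 'labels', and 'scores' per image) and a hypothetical label set; neither detail is fixed by the embodiment.

```python
import torch
from PIL import Image
from torchvision.transforms.functional import to_tensor

CLASS_NAMES = {1: "scratch", 2: "dent"}  # hypothetical trained label set

@torch.no_grad()
def detect_damage(model, image_paths, score_thresh=0.5):
    """Apply the trained network to one or more images to be detected and
    return each appearance damage with its corresponding position."""
    model.eval()
    results = {}
    for path in image_paths:
        img = to_tensor(Image.open(path).convert("RGB"))
        # A torchvision-style detection interface is assumed here: the model
        # takes a list of image tensors and returns, per image, a dict with
        # "boxes", "labels", and "scores".
        out = model([img])[0]
        keep = out["scores"] > score_thresh
        results[path] = [
            {"type": CLASS_NAMES[int(l)], "box": b.tolist(), "score": float(s)}
            for b, l, s in zip(out["boxes"][keep],
                               out["labels"][keep],
                               out["scores"][keep])
        ]
    return results
```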
By comparison with a model using a region proposal network (RPN), the running speed is found to be effectively improved while the model accuracy is maintained.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.