CN111507989A - Training generation method of semantic segmentation model, and vehicle appearance detection method and device - Google Patents

Training generation method of semantic segmentation model, and vehicle appearance detection method and device

Publication number
CN111507989A
CN111507989A
Authority
CN
China
Prior art keywords
semantic segmentation
vehicle
image
model
segmentation model
Prior art date
Legal status
Pending
Application number
CN202010294786.8A
Other languages
Chinese (zh)
Inventor
周康明
申周
Current Assignee
Shanghai Eye Control Technology Co Ltd
Original Assignee
Shanghai Eye Control Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Eye Control Technology Co Ltd filed Critical Shanghai Eye Control Technology Co Ltd
Priority to CN202010294786.8A priority Critical patent/CN111507989A/en
Publication of CN111507989A publication Critical patent/CN111507989A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/2163Partitioning the feature space
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning

Abstract

The application relates to a training generation method of a semantic segmentation model, a vehicle appearance detection method, a device, computer equipment and a storage medium. The training generation method of the semantic segmentation model comprises the following steps: acquiring a plurality of image samples; labeling the category in each image sample to generate a training set; acquiring the weight corresponding to each category; training a semantic segmentation model to be trained by adopting a loss function, and weighting and obtaining a loss value according to the weight and the class probability respectively corresponding to each class; and adjusting the model parameters of the semantic segmentation model to be trained according to the loss value to generate the semantic segmentation model. According to the method, the loss function with the weight is adopted to train the model, the importance of each category in the model training task is adjusted by configuring the weight corresponding to the category, the accuracy of model identification can be improved, and further, in the vehicle appearance detection process, the semantic segmentation model obtained by the method can be adopted to improve the accuracy of vehicle appearance detection.

Description

Training generation method of semantic segmentation model, and vehicle appearance detection method and device
Technical Field
The present application relates to the field of vehicle detection technologies, and in particular, to a training generation method and apparatus for a semantic segmentation model, a computer device, and a storage medium, and a vehicle appearance detection method and apparatus, a computer device, and a storage medium.
Background
With the continuous development of social economy and the continuous improvement of the living standard of people, the quantity of motor vehicles kept is rapidly increased, so that the workload of annual inspection of motor vehicles is rapidly increased.
Vehicle appearance recognition is an important part of the annual inspection of vehicles. In the conventional technology, the vehicle appearance is mainly judged and identified manually. In the annual inspection of a large vehicle, there is a high requirement on determining whether the dimensions of each component of the large vehicle, such as length, width, thickness, and curvature, are acceptable. In the prior art, the size of each component of the large vehicle can also be detected in an intelligent detection mode, which mostly adopts a deep learning model to detect the rectangular frame corresponding to each component so as to locate and detect the component. However, this intelligent detection method can only obtain approximate position information of each component, and therefore has the problem that the vehicle appearance detection is not accurate enough.
Disclosure of Invention
In view of the above, it is necessary to provide a training generation method, an apparatus, a computer device, and a storage medium for a semantic segmentation model capable of improving the accuracy of vehicle appearance detection, as well as a vehicle appearance detection method, apparatus, computer device, and storage medium.
In order to achieve the above object, in a first aspect, an embodiment of the present application provides a method for training and generating a semantic segmentation model, where the method includes:
acquiring a plurality of image samples;
labeling the category in each image sample to generate a training sample set;
acquiring the weight corresponding to each category;
inputting a training sample set into a semantic segmentation model to be trained, performing iterative training on the semantic segmentation model to be trained by adopting a loss function, and weighting and obtaining a loss value according to the weight and the class probability respectively corresponding to each class;
and adjusting the model parameters of the semantic segmentation model to be trained according to the loss value to generate the semantic segmentation model.
In order to achieve the above object, in a second aspect, an embodiment of the present application provides a vehicle appearance detection method, including:
detecting the image to be detected by adopting a target detection model to obtain a vehicle area image of the vehicle to be detected;
performing semantic segmentation on the vehicle region image by using the semantic segmentation model of any one of the methods in the first aspect to obtain a semantic segmentation image of each vehicle component in the vehicle to be detected;
determining the size of each vehicle component according to the pixel point size information of the semantic segmentation image corresponding to each vehicle component;
and comparing the size of each vehicle component with the standard size to generate a detection result of the vehicle appearance.
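The comparison step above can be illustrated with the following minimal sketch; the function name, the dict-based interface, and the 5% tolerance are assumptions for illustration, not the patent's implementation:

```python
def check_appearance(measured, standard, tolerance=0.05):
    """Compare measured component sizes against standard sizes.

    measured/standard: dicts mapping component name -> size (e.g. in mm).
    A component passes when its relative error is within `tolerance`.
    Returns a dict of per-component pass/fail results.
    """
    results = {}
    for name, std_size in standard.items():
        size = measured.get(name)
        if size is None:
            results[name] = False  # component not detected at all
            continue
        results[name] = abs(size - std_size) / std_size <= tolerance
    return results
```

A vehicle passes the appearance detection only when every component's result is True; any out-of-tolerance component yields a failure result for that component.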
In one embodiment, determining the size of each vehicle component according to the pixel size information of the semantic segmentation image corresponding to each vehicle component includes:
acquiring the size of a reference object of an image to be detected;
calculating the size information corresponding to each pixel point according to the size of the reference object;
and calculating the size information of the pixel points in the semantic segmentation image corresponding to each vehicle part according to the size information corresponding to each pixel point to obtain the size of each vehicle part.
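The two calculation steps above can be sketched as follows, assuming the reference object's physical length and its length in pixels in the image are both known (the function names are illustrative):

```python
def pixel_size_from_reference(ref_length_mm, ref_length_px):
    """Physical length represented by one pixel, derived from a
    reference object of known size visible in the image."""
    return ref_length_mm / ref_length_px

def component_size(mask_px_extent, mm_per_px):
    """Convert a component's extent in pixels (e.g. the width of its
    semantic segmentation mask) into a physical size."""
    return mask_px_extent * mm_per_px
```

For example, if a 440 mm reference object spans 220 pixels, each pixel represents 2 mm, and a component mask spanning 500 pixels corresponds to a physical size of 1000 mm.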
In a third aspect, an embodiment of the present application provides a training generation apparatus for a semantic segmentation model, where the apparatus includes:
the acquisition module is used for acquiring a plurality of image samples;
the training sample set generating module is used for labeling the categories in each image sample to generate a training sample set;
the acquisition module is also used for acquiring the weight corresponding to each category;
the training module is used for inputting the training sample set into the semantic segmentation model to be trained, performing iterative training on the semantic segmentation model to be trained by adopting a loss function, and weighting and obtaining a loss value according to the weight and the class probability respectively corresponding to each class;
and the model generation module is used for adjusting the model parameters of the semantic segmentation model to be trained according to the loss value and generating the semantic segmentation model.
In a fourth aspect, an embodiment of the present application provides a vehicle appearance detection apparatus, including:
the target detection module is used for detecting the image to be detected by adopting a target detection model to obtain a vehicle area image of the vehicle to be detected;
the appearance segmentation module is used for performing semantic segmentation on the vehicle region image by adopting the semantic segmentation model of any one of the methods in the first aspect to obtain a semantic segmentation image of each vehicle component in the vehicle to be detected;
the component size determining module is used for determining the size of each vehicle component according to the pixel point size information of the semantic segmentation image corresponding to each vehicle component;
and the detection result generation module is used for comparing the size of each vehicle component with the standard size, and if the size is within the error range, generating a result that the vehicle appearance size detection is passed.
In a fifth aspect, an embodiment of the present application provides a computer device, including a memory and a processor, where the memory stores a computer program, and the processor implements the steps of any one of the methods in the first aspect when executing the computer program.
In a sixth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of any one of the methods in the first aspect.
According to the training generation method of the semantic segmentation model, the vehicle appearance detection method and device, the computer equipment and the storage medium, the semantic segmentation model is trained by adopting the loss function with the weight, the corresponding weight is configured for each category, the importance of each category in a model training task is adjusted, and the accuracy of the semantic segmentation model can be improved. In the vehicle appearance detection process, the semantic segmentation model obtained by the method is used for semantically segmenting the vehicle region image on the basis of not losing space detail characteristics and not reducing the model space capacity, and the size of the vehicle part is obtained according to the semantic segmentation image, so that the accuracy of vehicle appearance detection can be improved, the labor cost can be reduced, the response time is shortened, the working efficiency is improved, and the justness and the openness of vehicle annual inspection work are ensured.
Drawings
FIG. 1 is a diagram of an application environment of a method for training and generating a semantic segmentation model in one embodiment;
FIG. 2 is a schematic flow chart illustrating a method for training and generating a semantic segmentation model according to an embodiment;
FIG. 3 is a schematic flow chart diagram illustrating the training steps of the semantic segmentation model in one embodiment;
FIG. 4 is a schematic flow chart illustrating a method for training and generating a semantic segmentation model according to an embodiment;
FIG. 5 is a diagram of an exemplary embodiment of a vehicle appearance detection method;
FIG. 6 is a schematic flow chart diagram of a vehicle appearance detection method according to one embodiment;
FIG. 7 is a schematic flow chart illustrating the dimensional measurement steps for a vehicle component according to one embodiment;
FIG. 8 is a schematic flow chart diagram of a vehicle appearance detection method according to one embodiment;
FIG. 9 is a block diagram showing an example of a training apparatus for a semantic segmentation model;
fig. 10 is a block diagram showing the configuration of a vehicle appearance detecting apparatus according to an embodiment;
FIG. 11 is a diagram illustrating an internal structure of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The training generation method of the semantic segmentation model provided by the application can be applied to the application environment shown in fig. 1. The application environment includes a terminal 110 and a server 120, and the terminal 110 may refer to an electronic device having strong data storage and computing power. The terminal 110 communicates with the server 120 through a network. Wherein the semantic segmentation model to be trained is deployed in the terminal 110. A plurality of image samples used for training the semantic segmentation model to be trained may be pre-stored in the server 120 or the terminal 110. Taking the pre-stored image in the server 120 as an example, specifically, the terminal 110 obtains a plurality of image samples from the server 120. The terminal 110 labels the categories in each image sample to generate a training sample set. The terminal 110 obtains the weight corresponding to each category. The terminal 110 inputs the training sample set to the semantic segmentation model to be trained, iteratively trains the semantic segmentation model to be trained by using a loss function, and weights and obtains a loss value according to the weight and the class probability respectively corresponding to each class. The terminal 110 adjusts the model parameters of the semantic segmentation model to be trained according to the loss value, and generates the semantic segmentation model. The terminal 110 may be, but not limited to, various personal computers, notebook computers, smart phones, and tablet computers, and the server 120 may be implemented by an independent server or a server cluster formed by a plurality of servers.
In one embodiment, as shown in fig. 2, a training generation method of a semantic segmentation model is provided, which is described by taking the method as an example applied to the terminal 110 in fig. 1, and includes the following steps:
step S210, acquiring a plurality of image samples.
Wherein the plurality of image samples are image samples acquired under different conditions (e.g., different angles, different illumination) using an image acquisition device. Each image sample includes a target object that needs to be identified and semantically segmented by the semantic segmentation model; for example, the target object may be a person, an animal, a vehicle, and the like. Further, the target object may also be a part of an object, for example, a component on a vehicle. The semantic segmentation model includes, but is not limited to, FCN (Fully Convolutional Networks), SegNet (a semantic segmentation network), and BiSeNet (Bilateral Segmentation Network for Real-time Semantic Segmentation), a two-path real-time semantic segmentation model. Specifically, a plurality of image samples under different conditions may be acquired by using an image acquisition device and stored in the terminal for training the model; or the image samples are stored in the server in advance and obtained from the server when they are needed to train the model.
Step S220, labeling the categories in each image sample to generate a training sample set.
Specifically, after a plurality of image samples for model training are obtained, a data annotation stage is entered. The regions of the target object to be identified in each image sample can be labeled with a polygonal frame, such as a rectangular frame. The labeling can be performed manually or automatically by a machine, and is not limited herein.
In step S230, the weight corresponding to each category is obtained.
The weight is the relative importance of a certain factor or index with respect to a certain objective. Unlike a general proportion, a weight reflects not only the percentage of the factor or index but emphasizes its relative importance. Therefore, the weight corresponding to each category can be determined according to the importance degree of each category. For example, suppose the object classes to be identified include three: a background class, a target object A, and a target object B. If the background in the image needs to be segmented accurately, the weight configured for the background class can be higher than the weights corresponding to target object A and target object B, respectively. Specifically, the weight corresponding to each category may be a specific value configured in advance; or a weight obtaining rule may be pre-configured, and when the model is trained, the weight corresponding to each category is automatically calculated according to the weight obtaining rule, which is not limited herein.
Step S240, inputting the training sample set into the semantic segmentation model to be trained, performing iterative training on the semantic segmentation model to be trained by adopting a loss function, and performing weighting and obtaining a loss value according to the weight and the class probability respectively corresponding to each class.
Here, the loss function is generally used for parameter estimation of a model in machine learning; loss functions used for classification problems include, but are not limited to, the hinge loss function, the exponential loss function, and the cross-entropy loss function. In this embodiment, a weighted loss function is adopted, that is, an existing loss function is adjusted according to the weight corresponding to each category to form a weighted loss function, so that the loss value generated in the training process is obtained by weighted calculation according to the weight and the category probability corresponding to each category. Specifically, after the training sample set is input into the semantic segmentation model to be trained, each class in each training sample is predicted to obtain the class probability of each class. Then, the pre-configured weighted loss function performs a weighted sum according to the class probability and the weight corresponding to each class to obtain the loss value.
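As a minimal, self-contained sketch of a class-weighted loss of the kind described in this step (the function name and list-based interface are illustrative assumptions, not the patent's implementation), a per-pixel cross-entropy term can be scaled by the weight of each pixel's true class:

```python
import math

def weighted_cross_entropy(probs, labels, weights):
    """Class-weighted cross-entropy averaged over pixels.

    probs:   per-pixel predicted class-probability vectors
    labels:  per-pixel ground-truth class indices
    weights: per-class weights adjusting each class's importance

    Each pixel's cross-entropy term -log p(true class) is scaled by
    the weight of its true class before averaging.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp the probability to avoid log(0) on confident mistakes.
        total += -weights[y] * math.log(max(p[y], 1e-12))
    return total / len(labels)
```

Doubling a class's weight doubles that class's contribution to the loss, which is how the weights adjust each class's importance in the training task.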
And step S250, adjusting model parameters of the semantic segmentation model to be trained according to the loss value, and generating the semantic segmentation model.
Specifically, in the model training process, the model parameters of the semantic segmentation model to be trained can be adjusted according to the loss value obtained by calculation until a preset stop condition is reached. The preset stop condition may be that a preset number of iterations is reached, or that the obtained loss value satisfies a preset threshold, which is not limited herein. And finally, generating a finally used semantic segmentation model according to the model parameters of the semantic segmentation model with the minimum loss value or the best robustness in the training process.
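A generic skeleton of the iterative adjustment and stop conditions described above might look as follows; the `model_step` callback, iteration limit, and loss threshold are illustrative assumptions standing in for the actual training framework:

```python
def train(model_step, max_iters=100, loss_threshold=1e-3):
    """Run training steps until a preset iteration count or loss
    threshold is reached, keeping the parameters with the lowest
    loss seen (hypothetical interface: `model_step(i)` performs one
    update and returns (loss, params))."""
    best_loss, best_params = float("inf"), None
    for i in range(max_iters):
        loss, params = model_step(i)
        if loss < best_loss:
            best_loss, best_params = loss, params
        if loss <= loss_threshold:
            break  # preset stop condition: loss satisfies threshold
    return best_params, best_loss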
In the training and generating method of the semantic segmentation model, corresponding weights are configured for each category, and the importance degree of each category in the model training task is adjusted. The semantic segmentation model is trained by adopting the loss function with the weight, so that the accuracy of the semantic segmentation model can be improved.
In one embodiment, the categories include at least one subcategory within the target area, and a background category outside the target area; obtaining the weight corresponding to each category respectively, including: and acquiring weights corresponding to at least one subcategory in the target area and the background category respectively.
Specifically, in many cases, the semantic segmentation task is concerned about the region where the target object is located in the image, and all regions except the region where the target object is located can be regarded as background regions, and accurate segmentation of the background regions is helpful for improving the accuracy of semantic segmentation of the target object. Therefore, in this embodiment, in the data annotation stage, the target object and the background may be annotated respectively to obtain at least one sub-category in the target area and a background category in the background area.
For example, take the cart appearance inspection in the vehicle annual inspection. A cart, i.e., a large-sized automobile, refers to an automobile of a large vehicle type, such as a medium-sized or larger truck. In the annual inspection of a cart, there is a high requirement on determining whether the dimensions of each component of the cart, such as length, width, thickness, and camber, are acceptable, so each vehicle component in the cart needs to be identified. In general, in an image acquired by an image acquisition device, the middle region of the image is the target vehicle to be detected (the foreground region); besides, the background region contains some cluttered information and other vehicles that are far away and similar to the target vehicle. In the data labeling stage, each vehicle component to be identified in the target vehicle area, such as a license plate, a vehicle light, a vehicle logo, a wheel, a vehicle door, an air vent, a vehicle window, a decal, a fence, a protection device, and a red-white reflective mark, may be labeled, generating at least one sub-category of the target vehicle. Then, the part outside the target vehicle area is labeled, generating a background category. Corresponding weights are configured for the at least one sub-category and the background category respectively, and the semantic segmentation model to be trained is trained with the weighted loss function. In the model training process, the model parameters of the semantic segmentation model to be trained can be adjusted according to the obtained loss values. Finally, the semantic segmentation model to be used is obtained according to the model parameters of the semantic segmentation model with the minimum loss value or the best robustness in the training process.
In this embodiment, at least one sub-category and a background category of the target region are respectively labeled, and a corresponding weight is configured for each category, so that the capability of the semantic segmentation model for identifying the background and the target object can be improved, and the precision of the semantic segmentation model can be improved.
In one embodiment, the weights include primary weights and secondary weights. As shown in fig. 3, in step S240, iterative training is performed on the semantic segmentation model to be trained by using a loss function, and weighting and obtaining a loss value are performed according to the weight and the class probability respectively corresponding to each class, which specifically includes the following steps:
step S241, iterative training is carried out on the semantic segmentation model to be trained by adopting a loss function, and a primary loss value is obtained by carrying out weighted sum on primary weights and primary class probabilities respectively corresponding to at least one subcategory and a background class.
Step S242, adjusting a model parameter of the semantic segmentation model to be trained according to the primary loss value, and generating an initial semantic segmentation model.
Wherein the primary and secondary correspond to stages in which model training is performed. In this embodiment, the process of model training is divided into two stages, the weight used in the first stage is referred to as the primary weight, and the loss value calculated according to the primary weight is referred to as the primary loss value; similarly, the weight used in the second stage is referred to as a secondary weight, and the loss value calculated from the secondary weight is referred to as a secondary loss value. The primary and secondary weights for each category may be different. The initial semantic segmentation model is a model obtained by completing the training of the first-stage model. Specifically, in the first stage of model training, a loss function with weights is adopted, and a primary loss value is obtained by performing weighted sum according to primary weights respectively corresponding to at least one subcategory and a background category and a primary category probability obtained through prediction. And then, adjusting the model parameters of the semantic segmentation model to be trained according to the obtained primary loss value until a preset stop condition is reached. And finally, generating an initial semantic segmentation model according to the model parameters of the semantic segmentation model to be trained with the minimum initial loss value or the best robustness in the training process.
The cart appearance inspection in the vehicle annual inspection is continuously taken as an example. Accurate segmentation of the background region helps to improve the precision of semantic segmentation of the target vehicle, so the importance degree of the background category can be increased in the first stage of model training, so that the initial semantic segmentation model obtained by training can accurately segment the background region. That is, the primary weight configured for the background category is greater than the primary weight corresponding to each sub-category in the target area. The primary weight may be a configured fixed value, or may be automatically calculated according to a corresponding primary weight generation rule. Taking a fixed numerical value as an example, assuming that the semantic segmentation model supports 21 classes in total (including 20 foreground classes, i.e., sub-categories in the target region, and one background class), the weight of the background class can be set to 0.5, and the weight of each remaining sub-category can be set to 0.025 (0.5/20).
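Under the fixed-value scheme of this example, the primary weights could be generated as in the following sketch (the function name and its defaults are hypothetical; it simply mirrors the 21-class example of 0.5 for the background and 0.025 for each of the 20 foreground sub-categories):

```python
def primary_weights(num_classes, background_weight=0.5, background_index=0):
    """Primary-stage weights: the background class gets a fixed larger
    weight and the remaining foreground sub-categories share the rest
    of the weight mass equally."""
    fg_weight = (1.0 - background_weight) / (num_classes - 1)
    weights = [fg_weight] * num_classes
    weights[background_index] = background_weight
    return weights
```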
Step S243, performing iterative training on the initial semantic segmentation model by using a loss function, and performing weighted sum according to secondary weights and secondary category probabilities respectively corresponding to at least one subcategory and the background category to obtain a secondary loss value;
and 244, adjusting the model parameters of the initial semantic segmentation model according to the secondary loss value to generate a semantic segmentation model.
Specifically, after an initial semantic segmentation model capable of accurately segmenting the background region is obtained, the second stage of model training is entered, and the initial semantic segmentation model continues to be trained. In the second stage, the importance degree of the at least one sub-category in the target area can be increased, so that the trained semantic segmentation model can accurately segment at least one target object in the target area. That is, the secondary weight configured for the background category is smaller than the secondary weight corresponding to each sub-category in the target area. The secondary weight may be a fixed numerical value configured in a manner similar to the primary weight, or may be automatically calculated according to a configured secondary weight generation rule. In the second-stage training process, the weighted loss function is adopted, and a secondary loss value is obtained by performing a weighted sum according to the secondary weights respectively corresponding to the at least one sub-category and the background category and the predicted secondary category probabilities. Then, the model parameters of the initial semantic segmentation model are adjusted according to the obtained loss value until a preset stop condition is reached. Finally, the semantic segmentation model is generated according to the model parameters of the initial semantic segmentation model with the minimum loss value or the best robustness in the training process.
In the embodiment, the semantic segmentation model is trained in two stages, so that the importance degree of the background category is increased in the first training process, and the initial semantic segmentation model obtained by training can accurately segment the background area; and in the second stage, the initial semantic segmentation model is trained continuously, the importance degree of at least one subcategory in the target area is increased, so that the obtained semantic segmentation model can accurately segment the areas corresponding to the at least one subcategory in the target area, and the segmentation precision of the semantic segmentation model is improved.
In one embodiment, one manner of determining the secondary weight is described. The determining mode of the secondary weight comprises the following steps: performing semantic segmentation on each image sample by adopting an initial semantic segmentation model to obtain semantic segmentation images respectively corresponding to at least one subcategory and a background category in each image sample; and determining secondary weights corresponding to at least one sub-category and the background category respectively according to the pixel points contained in the semantic segmentation image.
The semantic segmentation model aims to accurately segment the images corresponding to the sub-categories in the target area. Therefore, in the second stage of model training, higher secondary weights can be configured for the sub-categories in the target area and a lower secondary weight for the background category, so as to improve the semantic segmentation model's ability to recognize the sub-categories in the target area. Specifically, after the initial semantic segmentation model is obtained, semantic segmentation may be performed on each image sample by using the initial semantic segmentation model to obtain a semantic segmentation image corresponding to the at least one sub-category and the background category in each image sample. For example, if the semantic segmentation model includes 21 classes, 21 corresponding semantic segmentation images can be obtained. Then, the respective secondary weights are determined according to the pixel points contained in the semantic segmentation images corresponding to the at least one sub-category and the background category. Continuing with the cart appearance inspection in the vehicle annual inspection as an example: since the number of pixel points occupied by a vehicle component is small compared with the other areas in the image, the proportion of pixel points occupied by the areas other than the vehicle component can be used as the secondary weight of that vehicle component.
In one embodiment, determining the secondary weights corresponding to the at least one sub-category and the background category according to the number of pixel points included in the semantic segmentation images specifically includes: acquiring, for the at least one sub-category and the background category, the difference between the total number of pixel points of the image sample and the number of pixel points corresponding to that category; and taking the ratio of each difference to the total number of pixel points of the image sample as the secondary weight corresponding to the at least one sub-category and the background category, respectively.
Specifically, for the identification task in which the target object is a small target, the secondary weight corresponding to each sub-category may be calculated by the following formula:
W_C = (N - N_C) / N

wherein W_C represents the secondary weight corresponding to category C, N represents the total number of pixel points of the image, and N_C represents the total number of pixel points of the semantic segmentation image corresponding to category C.
In this embodiment, after the initial semantic segmentation model that can accurately segment the background region is obtained, the secondary weight of each category is readjusted. According to the above formula, a category covering a large area receives a small proportion of the loss while a category covering a small area receives a large proportion, so that the semantic segmentation model pays more attention to the segmentation of small targets, and the precision of small-target segmentation tasks can be obviously improved.
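The weight readjustment described above can be sketched as follows, a minimal illustration of W_C = (N - N_C) / N on a toy label map (not the patent's actual implementation):

```python
import numpy as np

def secondary_weights(label_map: np.ndarray, num_classes: int) -> list:
    """Compute W_C = (N - N_C) / N for each category C, where N is the
    total number of pixels and N_C is the pixel count of category C."""
    n_total = label_map.size
    return [(n_total - int((label_map == c).sum())) / n_total
            for c in range(num_classes)]

# Toy map: category 1 (a "small target") covers 4 of 100 pixels,
# category 0 (background) covers the remaining 96.
label_map = np.zeros((10, 10), dtype=int)
label_map[0, :4] = 1
w = secondary_weights(label_map, num_classes=2)
print(w[0])  # 0.04 -> large background area gets a small weight
print(w[1])  # 0.96 -> small target gets a large weight
```

The inversion is the point: the smaller the region, the larger its share of the loss.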
In an embodiment, as shown in fig. 4, a training generation method of a semantic segmentation model is described with a specific embodiment, taking the detection of the appearance of a large vehicle in vehicle annual inspection as an example, and includes the following steps:
Step S401, a plurality of image samples are acquired.
Specifically, the image samples may be vehicle region images obtained by detecting site photographs with a target detection model. The target detection model is not limited to RefineDet (a single-stage detector), Faster R-CNN (a target detection network), SSD (Single Shot MultiBox Detector), YOLO (You Only Look Once), and the like, and is not limited herein.
And step S402, labeling the categories in each image sample to generate a training sample set. Wherein the category comprises at least one sub-category in the target area and a background category outside the target area.
Specifically, polygons can be used for labeling parts such as license plates, lamps, logos, wheels, doors, air vents, windows, decals, fenders, protective devices, red-and-white reflective signs and fences in each vehicle region image sample; for parts with strongly curved outlines, multi-point labeling is used to fit the contours of the parts.
Step S403, inputting the training sample set into the semantic segmentation model to be trained. Iterative training is performed on the semantic segmentation model to be trained by adopting a weighted loss function, and a primary loss value is obtained by weighted summation of the primary weight and primary class probability respectively corresponding to each category.
Step S404, performing semantic segmentation on each image sample by using the initial semantic segmentation model to obtain a semantic segmentation image corresponding to each category in each image sample.
Step S405, determining the secondary weight of each category according to the pixel points contained in the semantic segmentation image corresponding to each category.
Step S406, performing iterative training on the initial semantic segmentation model by using the weighted loss function, and obtaining a secondary loss value by weighted summation of the secondary weight and secondary category probability respectively corresponding to each category.
Specifically, the weighted loss function mentioned in step S403 and step S406 can be expressed as follows:
Loss = -Σ_C W_C · Y_C · log(P_C)

wherein W_C represents the weight corresponding to category C; Y_C is 1 if the ground-truth label is category C and 0 otherwise; and P_C represents the predicted class probability of category C.
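A minimal numerical sketch of this weighted cross-entropy loss for a single pixel (the array values are illustrative, not from the source):

```python
import numpy as np

def weighted_ce_loss(probs, one_hot, weights):
    """Weighted cross-entropy for one pixel:
    Loss = -sum_C W_C * Y_C * log(P_C)."""
    probs = np.clip(probs, 1e-7, 1.0)  # numerical safety for log
    return float(-np.sum(weights * one_hot * np.log(probs)))

weights = np.array([0.5, 0.025, 0.025])  # background + two sub-categories
one_hot = np.array([0.0, 1.0, 0.0])      # ground truth is sub-category 1
probs   = np.array([0.2, 0.7, 0.1])      # model's predicted probabilities
loss = weighted_ce_loss(probs, one_hot, weights)
print(round(loss, 6))  # -0.025 * log(0.7) ≈ 0.008917
```

Because Y_C zeroes out every category except the ground-truth one, only that category's weight and probability contribute to the per-pixel loss.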
Specifically, taking the target object as a small target as an example, for the initial weight, assuming that the semantic segmentation model supports 21 classes (20 foreground classes, i.e. sub-classes in the target region, and one background class), an initial weight fixed value may be set, for example, the weight of the background class is set to 0.5, and the remaining weights are set to 0.025(0.5/20), so as to perform the initial model training.
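The fixed primary-weight configuration described above can be written out as a short sketch (the values are the ones from the example in the text):

```python
# Primary-weight configuration: one background category plus 20
# sub-categories; background fixed at 0.5 and the remainder shared equally.
NUM_SUBCATEGORIES = 20
BACKGROUND_WEIGHT = 0.5

primary_weights = [BACKGROUND_WEIGHT] + \
    [BACKGROUND_WEIGHT / NUM_SUBCATEGORIES] * NUM_SUBCATEGORIES

print(len(primary_weights))                 # 21 categories in total
print(primary_weights[1])                   # 0.025 for each sub-category
print(round(sum(primary_weights), 10))      # the weights sum to 1.0
```

Splitting the remaining 0.5 evenly keeps the total weight normalized while biasing the first stage toward learning the background.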
For the secondary weight of the small target, the secondary weight corresponding to each sub-category can be calculated by the following formula:
W_C = (N - N_C) / N

wherein N represents the total number of pixel points of the image, and N_C represents the total number of pixel points of the semantic segmentation image corresponding to category C.
Step S407, adjusting model parameters of the initial semantic segmentation model according to the secondary loss value, and generating a semantic segmentation model.
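Putting steps S403 to S407 together, the two-stage flow might be orchestrated as below. This is a sketch: `train_fn` and `segment_fn` are hypothetical stand-ins for real weighted-loss training and model inference, and the toy run simply echoes the labels back.

```python
import numpy as np

def pixel_count_weights(label_maps, num_classes):
    """Secondary weights W_C = (N - N_C) / N, aggregated over all samples."""
    n_total = sum(m.size for m in label_maps)
    return np.array([
        (n_total - sum(int((m == c).sum()) for m in label_maps)) / n_total
        for c in range(num_classes)
    ])

def two_stage_training(samples, num_classes, train_fn, segment_fn):
    """Orchestration of steps S403-S407; train_fn and segment_fn stand in
    for gradient-descent training and model inference."""
    # Stage 1 (S403): fixed primary weights (background 0.5, rest equal).
    primary = np.array([0.5] + [0.5 / (num_classes - 1)] * (num_classes - 1))
    initial_model = train_fn(samples, primary)
    # Stage 2 (S404-S406): re-weight from pixel statistics of the initial
    # model's output, then train again with the secondary weights.
    predicted = [segment_fn(initial_model, s) for s in samples]
    secondary = pixel_count_weights(predicted, num_classes)
    return train_fn(samples, secondary), secondary

# Toy run: "training" just records its weights; "segmentation" echoes labels.
samples = [np.zeros((8, 8), dtype=int)]
samples[0][:2, :2] = 1  # a 4-pixel small target in a 64-pixel image
model, w2 = two_stage_training(
    samples, num_classes=2,
    train_fn=lambda s, w: {"weights": w},
    segment_fn=lambda m, s: s,
)
print(float(w2[0]))  # background: (64 - 60) / 64 = 0.0625
print(float(w2[1]))  # small target: (64 - 4) / 64 = 0.9375
```

The structure mirrors the figure: one training pass under fixed weights, a statistics pass over the initial model's segmentations, and a second training pass under the derived weights.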
The vehicle appearance detection method provided by the application can be applied to the application environment shown in fig. 5. The application environment comprises a terminal 510 and an image acquisition device 520. The terminal 510 may be an electronic device with strong data storage and computing power, in which a trained target detection model and a trained semantic segmentation model are deployed. The target detection model and the semantic segmentation model may be pre-trained on devices other than the terminal 510. The image acquisition device 520 may be integrated in the terminal 510 or may be a separate device. Specifically, the terminal 510 detects the to-be-detected image acquired by the image acquisition device 520 by using the target detection model to obtain a vehicle region image of the vehicle to be detected; the terminal 510 performs semantic segmentation on the vehicle region image by using the semantic segmentation model obtained by the method described in any of the above embodiments, so as to obtain a semantic segmentation image of each vehicle component of the vehicle to be detected. The terminal 510 determines the size of each vehicle component according to the pixel point size information of the semantic segmentation image corresponding to each vehicle component, and compares the size of each vehicle component with the standard size to generate a detection result of the vehicle appearance. Further, after obtaining the detection result, the terminal 510 may send the detection result to the server 530 for storage. The terminal 510 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers and portable mobile devices, and the image acquisition device 520 is not limited to various cameras and video cameras.
In one embodiment, as shown in fig. 6, a vehicle appearance detection method is provided, which is described by taking the method as an example applied to the terminal 510 in fig. 5, and includes the following steps:
Step S610, detecting the image to be detected by adopting the target detection model to obtain a vehicle region image of the vehicle to be detected.
The target detection model in this embodiment is a trained model, and is not limited to RefineDet (a single-stage detector), Faster R-CNN (a target detection network), SSD (Single Shot MultiBox Detector), YOLO (You Only Look Once), and the like, and is not limited herein.
The following describes the training process of the target detection model: first, a plurality of original image samples under different conditions, for example, a plurality of vehicle photographs of a vehicle inspection site taken under different angles and different lighting conditions, are acquired. Then, labeling the vehicles in each original image sample by adopting a rectangular frame to generate a training sample set. And finally, training the target detection model to be trained by using the training sample set to obtain the target detection model.
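Once the target detection model returns a bounding box, extracting the vehicle region image is a simple crop. A minimal sketch, assuming the box is given as (x1, y1, x2, y2) pixel coordinates (a common but not universal detector convention):

```python
import numpy as np

def crop_vehicle_region(image: np.ndarray, box) -> np.ndarray:
    """Crop the vehicle region image from a detected bounding box,
    assumed to be (x1, y1, x2, y2) in pixel coordinates."""
    x1, y1, x2, y2 = box
    return image[y1:y2, x1:x2]

image = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a site photo
vehicle_region = crop_vehicle_region(image, (100, 50, 500, 400))
print(vehicle_region.shape)  # (350, 400, 3)
```

The cropped region, rather than the full photo, is what gets fed to the semantic segmentation model in step S620.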
Step S620, performing semantic segmentation on the vehicle region image by adopting the semantic segmentation model described in any one of the methods to obtain a semantic segmentation image of each vehicle component in the vehicle to be detected.
Specifically, the vehicle region image is input to the trained semantic segmentation model. The semantic segmentation model is not limited to FCN, SegNet or BiSeNet. Preferably, in this embodiment, the semantic segmentation model may adopt the two-path real-time semantic segmentation model BiSeNet. Since a traditional semantic segmentation model has a large number of parameters, slow inference speed and high video-memory consumption, adopting the two-path real-time semantic segmentation model BiSeNet enables fast real-time response when performing semantic recognition on the vehicle region image, thereby improving the efficiency of vehicle appearance inspection.
The semantic segmentation model used in this embodiment may be a model trained by any one of the above training generation methods of the semantic segmentation model. And (3) segmenting each vehicle component needing to be inspected in the vehicle region image by adopting a semantic segmentation model to obtain a semantic segmentation image of each vehicle component. The vehicle parts include, but are not limited to, license plates, lamps, logos, wheels, doors, vents, windows, decals, fenders, guards, red and white reflective signs, and fences.
Step S630, determining the size of each vehicle component according to the pixel point size information of the semantic segmentation image corresponding to each vehicle component.
Specifically, after the semantic segmentation image of each vehicle component is acquired, each pixel point in the vehicle region image can be assigned to the component category of the semantic segmentation image to which it belongs. Then, the total extent of the pixel points in the same category is calculated as the size of the corresponding vehicle component.
Step S640 compares the size of each vehicle component with the standard size, and generates a detection result of the vehicle appearance.
The standard size refers to the reference size of a vehicle component of the vehicle to be detected, and may be the size recorded when the vehicle to be detected was registered with a regulatory authority. The standard size may be pre-stored in the server and correspond to a unique identifier associated with the vehicle; the unique identifier is not limited to a license plate number. Specifically, when the vehicle to be detected is inspected, the standard sizes of its vehicle components can be obtained by querying with the unique identifier of the vehicle to be detected. After the size of each vehicle component is acquired, the detected size can be compared with the standard size; if the detected size is within the error range, the vehicle component is considered qualified. If all vehicle components are detected to be qualified, it can be determined that the size detection of the vehicle to be detected passes. Conversely, if the size of any vehicle component is outside the error range, it can be determined that the size detection of the vehicle to be detected fails.
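The comparison logic described above (a per-component tolerance check, with an overall pass only if every component passes) might look like the following sketch; the component names, standard values and tolerance are hypothetical:

```python
def component_passes(measured_cm: float, standard_cm: float,
                     tolerance_cm: float) -> bool:
    """A component is qualified if its measured size is within
    +/- tolerance of the standard size."""
    return abs(measured_cm - standard_cm) <= tolerance_cm

def vehicle_size_detection(measured: dict, standards: dict,
                           tolerance_cm: float = 5.0) -> bool:
    """Overall size detection passes only if every component is qualified."""
    return all(component_passes(measured[name], std, tolerance_cm)
               for name, std in standards.items())

standards = {"guard_height": 45.0, "reflective_sign": 30.0}
print(vehicle_size_detection({"guard_height": 47.0, "reflective_sign": 31.0},
                             standards))  # True: both within tolerance
print(vehicle_size_detection({"guard_height": 55.0, "reflective_sign": 31.0},
                             standards))  # False: guard height off by 10 cm
```

In practice the `standards` dictionary would be populated from the server query keyed by the vehicle's unique identifier.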
According to the vehicle appearance detection method, the semantic segmentation model obtained by the above training generation method is adopted to perform real-time semantic segmentation on the vehicle region image without losing spatial detail features or reducing the model's receptive field, and the size of each vehicle component is obtained from the semantic segmentation image. Therefore, the accuracy of vehicle appearance detection can be improved, labor cost reduced, response time shortened, working efficiency improved, and the fairness and transparency of vehicle annual inspection work ensured.
In one embodiment, as shown in fig. 7, in step S630, determining the size of each vehicle component according to the pixel size information of the semantic segmentation image corresponding to each vehicle component specifically includes the following steps:
Step S631, a reference object size of the image to be detected is acquired.
The reference object is a standard-sized object fixedly placed beside the vehicle to be detected, for example, a vertical pole or a conical pole arranged beside the vehicle to be detected. Specifically, the reference object size may be manually input or selected by the user through the terminal; the terminal obtains the reference object size input or selected by the user.
Step S632, size information corresponding to each pixel point is calculated according to the size of the reference object.
Step S633, calculating pixel size information in the semantic segmentation image corresponding to each vehicle component according to the size information corresponding to each pixel, to obtain the size of each vehicle component.
Specifically, after the reference object size is obtained, the size information represented by each pixel point, for example, how many centimeters each pixel point represents, can be derived from the reference object size. Then, the size of each vehicle component is calculated from the size information represented by each pixel point and the pixel points in the semantic segmentation image corresponding to that vehicle component.
Illustratively, the reference object is a vertical pole having a length A. If the vehicle region image spans S pixel points in the direction parallel to the vertical pole, the size information corresponding to each pixel point is A/S. If a vehicle component is rectangular and its length and width span X1 and X2 pixel points respectively, the length of the vehicle component is calculated as A × X1 / S and its width as A × X2 / S. In this embodiment, by setting a reference object of standard size, determining the actual size represented by each pixel point according to the reference object size, and then calculating the size corresponding to each vehicle component, the operation is simple and the calculation is accurate.
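The pixel-to-size conversion in this example can be sketched directly (the pole length and pixel counts below are illustrative):

```python
def cm_per_pixel(pole_length_cm: float, pole_span_px: int) -> float:
    """Size represented by one pixel point, derived from a reference
    pole of known length spanning pole_span_px pixels in the image."""
    return pole_length_cm / pole_span_px

# Pole of length A = 200 cm spans S = 400 pixel points; a rectangular
# component spans X1 = 100 and X2 = 40 pixel points (length x width).
scale = cm_per_pixel(200.0, 400)
length_cm = 100 * scale   # A * X1 / S
width_cm = 40 * scale     # A * X2 / S
print(length_cm, width_cm)  # 50.0 20.0
```

Note the implicit assumption that the pole and the measured component lie at comparable depth from the camera, so a single scale factor applies.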
In one embodiment, as shown in FIG. 8, a vehicle appearance inspection method is described by one specific embodiment, comprising the steps of:
Step S801, an image to be detected is acquired. The image to be detected may be an image of a large vehicle taken from the left-front or right-rear direction at a vehicle inspection site.
Step S802, detecting the image to be detected by adopting the target detection model.
Step S803, it is determined whether a vehicle region exists in the image to be detected. If a vehicle region exists, the process proceeds to step S804 to acquire a vehicle region image of the vehicle to be detected; otherwise, the detection ends.
Step S804, acquiring a vehicle region image of the vehicle to be detected.
Step S805, performing semantic segmentation on the vehicle region image by adopting the semantic segmentation model in any embodiment of the training and generating method of the semantic segmentation model to obtain a semantic segmentation image of each vehicle component in the vehicle to be detected. The semantic segmentation model can adopt a real-time semantic segmentation model BiSeNet.
Step S806, a reference size of the image to be detected is acquired.
Step S807, size information corresponding to each pixel point is calculated based on the reference object size.
Step S808, calculating pixel point size information in the semantic segmentation image corresponding to each vehicle component according to the size information corresponding to each pixel point to obtain the size of each vehicle component.
Step S809 compares the size of each vehicle component with the standard size, and if the size of each vehicle component is within the error range, the process proceeds to step S810, and a result that the vehicle appearance size detection is passed is generated. Otherwise, the process proceeds to step S811, and a result of the vehicle external dimension detection failing is generated.
Illustratively, the judgment standards are: the continuous length of the reflective sign is more than 300 mm (millimeters); the sideboard height conforms to the value in the standard size with an error not exceeding ±50 mm; the ground clearance of the protective device is not higher than 50 cm; the width of the protective device is not lower than 10 cm; and so on. If the detected sizes of the reflective sign, the sideboard height and the protective device all meet the judgment standards, the flag is recorded as 1, a result that the size detection of the vehicle to be detected passes is generated, and the process ends; if any vehicle component does not meet the judgment standards, the flag is recorded as 0, a result that the size detection of the vehicle to be detected fails is generated, and the process ends.
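A sketch of these judgment rules as code, returning the 1/0 flag mentioned in the text (the field names and sample measurements are hypothetical; the thresholds are the ones quoted above):

```python
def large_vehicle_size_flag(m: dict) -> int:
    """Return flag 1 (pass) or 0 (fail) for the quoted judgment rules."""
    ok = (
        m["reflective_sign_length_mm"] >= 300    # continuous length >= 300 mm
        and abs(m["sideboard_error_mm"]) <= 50   # error vs. standard <= 50 mm
        and m["guard_ground_clearance_cm"] <= 50 # not higher than 50 cm
        and m["guard_width_cm"] >= 10            # not lower than 10 cm
    )
    return 1 if ok else 0

measured = {
    "reflective_sign_length_mm": 320,
    "sideboard_error_mm": -20,
    "guard_ground_clearance_cm": 45,
    "guard_width_cm": 12,
}
print(large_vehicle_size_flag(measured))  # 1: all rules satisfied

measured["guard_width_cm"] = 8
print(large_vehicle_size_flag(measured))  # 0: guard width below 10 cm
```

Any single failing rule drives the flag to 0, matching the all-or-nothing pass criterion in the text.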
Further, in addition to detecting the size of each vehicle component from the semantic division image of each vehicle component, it is also possible to detect whether the color, shape, or the like of each vehicle component meets the requirements from the semantic division image of each vehicle component. For example, the standard is that the reflective sign must be red and white, the tail sign board is alternately red and yellow, and the shape is rectangular. If the appearance of each vehicle part is judged to meet the requirements according to the semantic segmentation image, recording the mark as 1, generating a result that the appearance detection of the vehicle to be detected passes, and ending the process; and if any vehicle part does not meet the judgment standard, recording the mark as 0, generating a result that the appearance detection of the vehicle to be detected fails, and ending the process.
It should be understood that although the various steps in the flow charts of figs. 1-8 are shown in sequence as indicated by the arrows, these steps are not necessarily performed in that sequence. Unless explicitly stated otherwise herein, the execution of these steps is not strictly limited in order, and they may be performed in other orders. Moreover, at least some of the steps in figs. 1-8 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 9, there is provided a training generation apparatus 900 for a semantic segmentation model, including: an obtaining module 901, a training sample set generating module 902, a training module 903 and a model generating module 904, wherein:
an obtaining module 901, configured to obtain multiple image samples;
a training sample set generating module 902, configured to label categories in each image sample to generate a training sample set;
the obtaining module 901 is further configured to obtain weights corresponding to each category;
a training module 903, configured to input the training sample set to the semantic segmentation model to be trained, perform iterative training on the semantic segmentation model to be trained by using a loss function, and obtain a loss value by weighted summation of the weight and class probability respectively corresponding to each category;
and a model generating module 904, configured to adjust a model parameter of the semantic segmentation model to be trained according to the loss value, and generate the semantic segmentation model.
In one embodiment, the categories include at least one subcategory within the target area, and a background category outside the target area. The obtaining module 901 is further configured to obtain at least one subcategory in the target area and weights corresponding to the background categories, respectively.
In one embodiment, the weights include primary weights and secondary weights. The training module 903 includes: a primary training unit, configured to perform iterative training on the semantic segmentation model to be trained by using the loss function, and obtain a primary loss value by weighted summation of the primary weights and primary class probabilities respectively corresponding to the at least one sub-category and the background category; a primary model generating unit, configured to adjust model parameters of the semantic segmentation model to be trained according to the primary loss value to generate an initial semantic segmentation model; a secondary training unit, configured to perform iterative training on the initial semantic segmentation model by using the loss function, and obtain a secondary loss value by weighted summation of the secondary weights and secondary class probabilities respectively corresponding to the at least one sub-category and the background category; and a secondary model generating unit, configured to adjust model parameters of the initial semantic segmentation model according to the secondary loss value to generate the semantic segmentation model.
In one embodiment, the apparatus further includes a secondary weight obtaining module, configured to perform semantic segmentation on the image sample by using an initial semantic segmentation model, so as to obtain semantic segmentation images corresponding to at least one sub-category and a background category in the image sample; and determining secondary weights corresponding to at least one sub-category and the background category respectively according to the pixel points contained in the semantic segmentation image.
In an embodiment, the secondary weight obtaining module is specifically configured to obtain a difference between a total pixel point of the image sample and pixel points corresponding to at least one of the sub-categories and the background category; and acquiring the ratio of the difference value corresponding to at least one subcategory and the background category to the total pixel point number, and taking the ratio as the secondary weight corresponding to at least one subcategory and the background category respectively.
In one embodiment, as shown in fig. 10, there is provided a vehicle appearance detecting apparatus 1000 including: an object detection module 1001, an appearance segmentation module 1002, a component size determination module 1003, and a detection result generation module 1004, wherein:
the target detection module 1001 is used for detecting the image to be detected by adopting a target detection model to obtain a vehicle region image of the vehicle to be detected;
the appearance segmentation module 1002 is configured to perform semantic segmentation on the vehicle region image by using the semantic segmentation model described in any one of the above methods to obtain a semantic segmentation image of each vehicle component in the vehicle to be detected;
the component size determining module 1003 is configured to determine the size of each vehicle component according to pixel size information of the semantic segmentation image corresponding to each vehicle component;
and a detection result generation module 1004 for comparing the size of each vehicle component with the standard size to generate a detection result of the vehicle appearance.
In one embodiment, the component size determination module 1003 includes: the acquisition unit is used for acquiring the size of a reference object of the image to be detected; the pixel point size determining unit is used for calculating size information corresponding to each pixel point according to the size of the reference object; and the size determining unit of the vehicle part is used for calculating the size information of the pixel points in the semantic segmentation image corresponding to each vehicle part according to the size information corresponding to each pixel point to obtain the size of each vehicle part.
For specific limitations of the training generation device and the vehicle appearance detection device of the semantic segmentation model, reference may be made to the above limitations of the training method and the vehicle appearance detection method of the semantic segmentation model, which are not described herein again. The modules in the training generation device and the vehicle appearance detection device of the semantic segmentation model can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 11. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement a training generation method of a semantic segmentation model and/or a vehicle appearance detection method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 11 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
acquiring a plurality of image samples; labeling the category in each image sample to generate a training sample set; acquiring the weight corresponding to each category; inputting a training sample set into a semantic segmentation model to be trained, performing iterative training on the semantic segmentation model to be trained by adopting a loss function, and weighting and obtaining a loss value according to the weight and the class probability respectively corresponding to each class; and adjusting the model parameters of the semantic segmentation model to be trained according to the loss value to generate the semantic segmentation model.
In one embodiment, the categories include at least one subcategory within the target area, and a background category outside the target area.
In one embodiment, the weights include primary weights and secondary weights; the processor, when executing the computer program, further performs the steps of:
performing iterative training on the semantic segmentation model to be trained by adopting the loss function, and obtaining a primary loss value by weighted summation of the primary weight and class probability respectively corresponding to each category; adjusting model parameters of the semantic segmentation model to be trained according to the primary loss value to generate an initial semantic segmentation model; performing semantic segmentation on each image sample by adopting the initial semantic segmentation model to obtain a semantic segmentation image corresponding to each category in each image sample; and performing iterative training on the initial semantic segmentation model by adopting the loss function, and obtaining a secondary loss value by weighted summation of the secondary weight and class probability respectively corresponding to each category.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
and determining the secondary weight of each category according to the pixel points contained in the semantic segmentation image corresponding to each category.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
and calculating the difference between the total number of pixel points of each image sample and the number of pixel points corresponding to each category, and taking the ratio of the difference to the total number of pixel points as the secondary weight of each category.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
detecting the image to be detected by adopting a target detection model to obtain a vehicle area image of the vehicle to be detected; performing semantic segmentation on the vehicle region image by adopting the semantic segmentation model of any one of the methods to obtain a semantic segmentation image of each vehicle component in the vehicle to be detected; determining the size of each vehicle component according to the pixel point size information of the semantic segmentation image corresponding to each vehicle component; the size of each vehicle component is compared with the standard size, and if the size is within the error range, a result that the vehicle appearance size detection is passed is generated.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
acquiring the size of a reference object of an image to be detected; calculating the size information corresponding to each pixel point according to the size of the reference object; and calculating the size information of the pixel points in the semantic segmentation image corresponding to each vehicle part according to the size information corresponding to each pixel point to obtain the size of each vehicle part.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
acquiring a plurality of image samples; labeling the category in each image sample to generate a training sample set; acquiring the weight corresponding to each category; inputting a training sample set into a semantic segmentation model to be trained, performing iterative training on the semantic segmentation model to be trained by adopting a loss function, and weighting and obtaining a loss value according to the weight and the class probability respectively corresponding to each class; and adjusting the model parameters of the semantic segmentation model to be trained according to the loss value to generate the semantic segmentation model.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
detecting the image to be detected by adopting a target detection model to obtain a vehicle region image of the vehicle to be detected; performing semantic segmentation on the vehicle region image by adopting the semantic segmentation model generated by any one of the above methods to obtain a semantic segmentation image of each vehicle component of the vehicle to be detected; determining the size of each vehicle component according to the pixel point size information of the semantic segmentation image corresponding to each vehicle component; and comparing the size of each vehicle component with the standard size, and generating a result that the vehicle appearance size detection is passed if the size is within the error range.
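The final comparison step can be illustrated with a small Python sketch. The 5% tolerance, the component names, and the function interface are assumptions for illustration; the disclosure only specifies comparing each measured size with a standard size within an error range.

```python
def check_dimensions(measured, standard, tolerance=0.05):
    """Compare each measured component size against its registered standard
    size; pass only if every relative deviation is within the tolerance.

    Hypothetical interface: `measured` and `standard` map component name
    to a size in the same unit (e.g. meters).
    """
    for name, std_size in standard.items():
        deviation = abs(measured[name] - std_size) / std_size
        if deviation > tolerance:
            return False, name  # report the first out-of-tolerance component
    return True, None

# Illustrative sizes for two components of an inspected vehicle.
measured = {"cargo_bed": 4.02, "rear_bumper": 1.49}
standard = {"cargo_bed": 4.00, "rear_bumper": 1.50}
ok, failed = check_dimensions(measured, standard)
print(ok)  # True: both deviations are well under the 5% tolerance
```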
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing relevant hardware; the program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database, or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, or optical storage. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static random access memory (SRAM) and dynamic random access memory (DRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; nevertheless, any combination of them that involves no contradiction should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and while their description is relatively specific and detailed, they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (11)

1. A training generation method of a semantic segmentation model is characterized by comprising the following steps:
acquiring a plurality of image samples;
labeling the category in each image sample to generate a training sample set;
acquiring the weight corresponding to each category;
inputting the training sample set into a semantic segmentation model to be trained, performing iterative training on the semantic segmentation model to be trained by adopting a loss function, and obtaining a loss value as a weighted sum of the class probabilities and the weights respectively corresponding to the classes;
and adjusting the model parameters of the semantic segmentation model to be trained according to the loss value to generate the semantic segmentation model.
2. The method of claim 1, wherein the categories include at least one sub-category within the target area and a background category outside the target area; the obtaining of the weight corresponding to each category includes:
and acquiring at least one subcategory in the target area and the weight corresponding to the background category respectively.
3. The method of claim 2, wherein the weights comprise primary weights and secondary weights; the iterative training of the semantic segmentation model to be trained by adopting the loss function is carried out, and weighting is carried out according to the weight and the class probability respectively corresponding to each class to obtain a loss value, and the method comprises the following steps:
performing iterative training on the semantic segmentation model to be trained by adopting the loss function, and performing weighted sum according to the primary weight and the primary class probability respectively corresponding to the at least one subcategory and the background class to obtain a primary loss value;
adjusting model parameters of the semantic segmentation model to be trained according to the primary loss value to generate an initial semantic segmentation model;
performing iterative training on the initial semantic segmentation model by using the loss function, and performing weighted sum according to secondary weights and secondary category probabilities respectively corresponding to the at least one subcategory and the background category to obtain a secondary loss value;
the step of adjusting the model parameters of the semantic segmentation model to be trained according to the loss value to generate the semantic segmentation model comprises the following steps:
and adjusting the model parameters of the initial semantic segmentation model according to the secondary loss value to generate the semantic segmentation model.
4. The method of claim 3, wherein the determining of the secondary weight comprises:
performing semantic segmentation on each image sample by using the initial semantic segmentation model to obtain semantic segmentation images respectively corresponding to the at least one subcategory and the background category in each image sample;
and determining secondary weights corresponding to the at least one sub-category and the background category respectively according to the number of pixel points contained in the semantic segmentation image.
5. The method according to claim 4, wherein the determining the secondary weights respectively corresponding to the at least one sub-category and the background category according to the number of pixel points included in the semantic segmentation image comprises:
acquiring the difference between the total number of pixel points of the image sample and the number of pixel points respectively corresponding to the at least one sub-category and the background category;
and taking the ratio of the difference value respectively corresponding to the at least one sub-category and the background category to the total number of pixel points of the image sample as the secondary weight respectively corresponding to the at least one sub-category and the background category.
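The secondary-weight formula of claim 5, the ratio of (total pixels minus class pixels) to total pixels, can be illustrated with a short Python sketch. Aggregating counts over all image samples and the function name are illustrative assumptions.

```python
import numpy as np

def secondary_weights(seg_maps, num_classes):
    """Secondary weight per class: (total pixels - pixels of the class) /
    total pixels, so classes occupying few pixels (small components)
    receive larger weights.

    seg_maps: list of (H, W) integer label maps produced by the initial model.
    """
    counts = np.zeros(num_classes, dtype=np.int64)
    total = 0
    for seg in seg_maps:
        # Accumulate per-class pixel counts over all image samples.
        counts += np.bincount(seg.ravel(), minlength=num_classes)
        total += seg.size
    return (total - counts) / total

# Toy example: one 4x4 label map in which class 1 occupies 4 of 16 pixels.
seg = np.zeros((4, 4), dtype=np.int64)
seg[:2, :2] = 1
w = secondary_weights([seg], num_classes=2)
print(w)  # class 0: (16-12)/16 = 0.25, class 1: (16-4)/16 = 0.75
```

The resulting vector can then serve as the `class_weights` of the weighted loss in the second training stage.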
6. A vehicle appearance detecting method, characterized by comprising:
detecting the image to be detected by adopting a target detection model to obtain a vehicle region image of the vehicle to be detected;
performing semantic segmentation on the vehicle region image by adopting the semantic segmentation model generated by the method of any one of claims 1 to 5 to obtain a semantic segmentation image of each vehicle component of the vehicle to be detected;
determining the size of each vehicle part according to the pixel point size information of the semantic segmentation image corresponding to each vehicle part;
and comparing the size of each vehicle component with the standard size to generate a detection result of the vehicle appearance.
7. The method according to claim 6, wherein the determining the size of each vehicle component according to the pixel point size information of the semantic segmentation image corresponding to each vehicle component comprises:
acquiring the size of a reference object of the image to be detected;
calculating the size information corresponding to each pixel point according to the size of the reference object;
and calculating pixel point size information in the semantic segmentation image corresponding to each vehicle component according to the size information corresponding to each pixel point to obtain the size of each vehicle component.
8. An apparatus for training and generating a semantic segmentation model, the apparatus comprising:
the acquisition module is used for acquiring a plurality of image samples;
the training sample set generating module is used for labeling the categories in each image sample to generate a training sample set;
the acquiring module is further configured to acquire a weight corresponding to each category;
the training module is used for inputting the training sample set into a semantic segmentation model to be trained, performing iterative training on the semantic segmentation model to be trained by adopting a loss function, and obtaining a loss value as a weighted sum of the class probabilities and the weights respectively corresponding to the classes;
and the model generation module is used for adjusting the model parameters of the semantic segmentation model to be trained according to the loss value and generating the semantic segmentation model.
9. A vehicle appearance detecting apparatus, characterized by comprising:
the target detection module is used for detecting the image to be detected by adopting a target detection model to obtain a vehicle region image of the vehicle to be detected;
the appearance segmentation module is used for performing semantic segmentation on the vehicle region image by adopting the semantic segmentation model generated by the method of any one of claims 1 to 5 to obtain a semantic segmentation image of each vehicle component of the vehicle to be detected;
the component size determining module is used for determining the size of each vehicle component according to the pixel point size information of the semantic segmentation image corresponding to each vehicle component;
and the detection result generation module is used for comparing the size of each vehicle component with the standard size, and if the size of each vehicle component is within the error range, generating a result that the vehicle appearance size detection is passed.
10. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 7.
11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202010294786.8A 2020-04-15 2020-04-15 Training generation method of semantic segmentation model, and vehicle appearance detection method and device Pending CN111507989A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010294786.8A CN111507989A (en) 2020-04-15 2020-04-15 Training generation method of semantic segmentation model, and vehicle appearance detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010294786.8A CN111507989A (en) 2020-04-15 2020-04-15 Training generation method of semantic segmentation model, and vehicle appearance detection method and device

Publications (1)

Publication Number Publication Date
CN111507989A true CN111507989A (en) 2020-08-07

Family

ID=71877577

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010294786.8A Pending CN111507989A (en) 2020-04-15 2020-04-15 Training generation method of semantic segmentation model, and vehicle appearance detection method and device

Country Status (1)

Country Link
CN (1) CN111507989A (en)



Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017096758A1 (en) * 2015-12-11 2017-06-15 腾讯科技(深圳)有限公司 Image classification method, electronic device, and storage medium
CN108413891A (en) * 2018-01-25 2018-08-17 北京威远图易数字科技有限公司 Vehicle loss measurement method and apparatus
WO2019144604A1 (en) * 2018-01-25 2019-08-01 北京威远图易数字科技有限公司 Method and device for measuring vehicle damage
EP3579148A1 (en) * 2018-06-08 2019-12-11 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method and apparatus for training, classification model, mobile terminal, and readable storage medium
US20190385021A1 (en) * 2018-06-18 2019-12-19 Drvision Technologies Llc Optimal and efficient machine learning method for deep semantic segmentation
WO2020047420A1 (en) * 2018-08-31 2020-03-05 Alibaba Group Holding Limited Method and system for facilitating recognition of vehicle parts based on a neural network
CN110084271A (en) * 2019-03-22 2019-08-02 同盾控股有限公司 A kind of other recognition methods of picture category and device
CN109978893A (en) * 2019-03-26 2019-07-05 腾讯科技(深圳)有限公司 Training method, device, equipment and the storage medium of image, semantic segmentation network
CN110245710A (en) * 2019-06-18 2019-09-17 腾讯科技(深圳)有限公司 Training method, the semantic segmentation method and device of semantic segmentation model
CN110599492A (en) * 2019-09-19 2019-12-20 腾讯科技(深圳)有限公司 Training method and device for image segmentation model, electronic equipment and storage medium
CN110827253A (en) * 2019-10-30 2020-02-21 北京达佳互联信息技术有限公司 Training method and device of target detection model and electronic equipment
CN110853060A (en) * 2019-11-14 2020-02-28 上海眼控科技股份有限公司 Vehicle appearance detection method and device, computer equipment and storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
MD SAZZAD HOSSAIN, ET AL.: "Adaptive Class Weight based Dual Focal Loss for Improved Semantic Segmentation", pages 1-10 *
SHENG LU, ET AL.: "Dynamic Weighted Cross Entropy for Semantic Segmentation with Extremely Imbalanced Data", 2019 International Conference on Artificial Intelligence and Advanced Manufacturing (AIAM), pages 230-233 *
张宏钊; 吕启深; 党晓婧; 李炎裕; 代德宇: "Multi-scale adversarial network image semantic segmentation algorithm based on a weighted loss function", no. 01, pages 284-291 *
洪汉玉, ET AL.: "Rope removal method for aerial images based on the U-net model", vol. 40, no. 5, pages 786-794 *
王栩文: "Research on image-based semantic segmentation technology for power transmission lines", China Master's Theses Full-text Database, Engineering Science and Technology II, vol. 2019, no. 6, pages 042-444 *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112232349A (en) * 2020-09-23 2021-01-15 成都佳华物链云科技有限公司 Model training method, image segmentation method and device
CN112232349B (en) * 2020-09-23 2023-11-03 成都佳华物链云科技有限公司 Model training method, image segmentation method and device
CN112465840A (en) * 2020-12-10 2021-03-09 重庆紫光华山智安科技有限公司 Semantic segmentation model training method, semantic segmentation method and related device
CN112465840B (en) * 2020-12-10 2023-02-17 重庆紫光华山智安科技有限公司 Semantic segmentation model training method, semantic segmentation method and related device
CN112528881B (en) * 2020-12-16 2022-05-24 公安部交通管理科学研究所 Training method for inhibiting overfitting in highway scene recognition
CN112528881A (en) * 2020-12-16 2021-03-19 公安部交通管理科学研究所 Training method for inhibiting overfitting in highway scene recognition
CN112560722A (en) * 2020-12-22 2021-03-26 中国人民解放军国防科技大学 Airplane target identification method and device, computer equipment and storage medium
CN112560722B (en) * 2020-12-22 2022-09-09 中国人民解放军国防科技大学 Airplane target identification method and device, computer equipment and storage medium
CN112651440A (en) * 2020-12-25 2021-04-13 陕西地建土地工程技术研究院有限责任公司 Soil effective aggregate classification and identification method based on deep convolutional neural network
CN112651440B (en) * 2020-12-25 2023-02-14 陕西地建土地工程技术研究院有限责任公司 Soil effective aggregate classification and identification method based on deep convolutional neural network
CN112863247A (en) * 2020-12-30 2021-05-28 潍柴动力股份有限公司 Road identification method, device, equipment and storage medium
CN112884744A (en) * 2021-02-22 2021-06-01 深圳中科飞测科技股份有限公司 Detection method and device, detection equipment and storage medium
CN112967249A (en) * 2021-03-03 2021-06-15 南京工业大学 Intelligent identification method for manufacturing errors of prefabricated pier reinforcing steel bar holes based on deep learning
CN112926697B (en) * 2021-04-21 2021-10-12 北京科技大学 Abrasive particle image classification method and device based on semantic segmentation
CN112926697A (en) * 2021-04-21 2021-06-08 北京科技大学 Abrasive particle image classification method and device based on semantic segmentation
CN113269736A (en) * 2021-05-17 2021-08-17 唐旸 Method, system and medium for automated inspection of fastener dimensions
CN113673529A (en) * 2021-08-16 2021-11-19 连城凯克斯科技有限公司 Semantic segmentation model training method, silicon fusion state detection method and electronic equipment
CN113642576A (en) * 2021-08-24 2021-11-12 凌云光技术股份有限公司 Method and device for generating training image set in target detection and semantic segmentation task
CN114596440A (en) * 2022-03-22 2022-06-07 小米汽车科技有限公司 Semantic segmentation model generation method and device, electronic equipment and storage medium
CN114596440B (en) * 2022-03-22 2023-08-04 小米汽车科技有限公司 Semantic segmentation model generation method and device, electronic equipment and storage medium
CN115841431A (en) * 2023-02-06 2023-03-24 淄博市临淄区交通运输事业服务中心 Traffic image enhancement method based on video monitoring

Similar Documents

Publication Publication Date Title
CN111507989A (en) Training generation method of semantic segmentation model, and vehicle appearance detection method and device
EP3806064B1 (en) Method and apparatus for detecting parking space usage condition, electronic device, and storage medium
US9014432B2 (en) License plate character segmentation using likelihood maximization
WO2018028230A1 (en) Deep learning-based method and device for segmenting vehicle license plate characters, and storage medium
US11380104B2 (en) Method and device for detecting illegal parking, and electronic device
US11709282B2 (en) Asset tracking systems
US11941891B2 (en) Method for detecting lane line, vehicle and computing device
CN110517500B (en) Man-vehicle association processing method and device
CN111274926B (en) Image data screening method, device, computer equipment and storage medium
CN111178357B (en) License plate recognition method, system, device and storage medium
CN111860219B (en) High-speed channel occupation judging method and device and electronic equipment
CN111311485A (en) Image processing method and related device
CN111382735A (en) Night vehicle detection method, device, equipment and storage medium
CN113221894A (en) License plate number identification method and device of vehicle, electronic equipment and storage medium
Gupta et al. Computer vision based animal collision avoidance framework for autonomous vehicles
Greer et al. Patterns of vehicle lights: Addressing complexities of camera-based vehicle light datasets and metrics
CN113673527A (en) License plate recognition method and system
WO2021138893A1 (en) Vehicle license plate recognition method and apparatus, electronic device, and storage medium
US11520967B2 (en) Techniques for printed circuit board component detection
CN111709377B (en) Feature extraction method, target re-identification method and device and electronic equipment
CN111639640B (en) License plate recognition method, device and equipment based on artificial intelligence
Vikruthi et al. A Novel Framework for Vehicle Detection and Classification Using Enhanced YOLO-v7 and GBM to Prioritize Emergency Vehicle
CN115393379A (en) Data annotation method and related product
CN112434601A (en) Vehicle law violation detection method, device, equipment and medium based on driving video
CN114693722B (en) Vehicle driving behavior detection method, detection device and detection equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination