CN115063589A - Knowledge distillation-based vehicle component segmentation method and related equipment - Google Patents

Knowledge distillation-based vehicle component segmentation method and related equipment

Info

Publication number
CN115063589A
Authority
CN
China
Prior art keywords
vehicle
component
network
teacher
segmentation result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210791176.8A
Other languages
Chinese (zh)
Inventor
唐子豪
刘莉红
刘玉宇
肖京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202210791176.8A priority Critical patent/CN115063589A/en
Publication of CN115063589A publication Critical patent/CN115063589A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/70 - Labelling scene content, e.g. deriving syntactic or semantic representations
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 - Road transport of goods or passengers
    • Y02T 10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 - Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a knowledge distillation-based vehicle component segmentation method, a knowledge distillation-based vehicle component segmentation device, an electronic device, and a storage medium. The method comprises the following steps: acquiring a labeled data set and an unlabeled data set as a training set, wherein the label data comprises the vehicle component type of each pixel point in the vehicle image; building an initial teacher network, and training the initial teacher network twice based on the training set to obtain a second teacher network; building an initial student network; extracting dark knowledge of each vehicle image in the training set based on the second teacher network, wherein the dark knowledge reflects feature similarities and position correlations among different types of vehicle components; training the initial student network based on the dark knowledge to obtain a second student network; and acquiring a component segmentation result of a real-time vehicle image based on the second student network. With the method and device, a second student network with a small parameter count and high segmentation precision can be obtained, improving both the precision and the speed of the vehicle segmentation model in mobile terminal scenarios.

Description

Knowledge distillation-based vehicle component segmentation method and related equipment
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a knowledge distillation-based vehicle component segmentation method and device, electronic equipment and a storage medium.
Background
Knowledge distillation is a model compression method. A knowledge distillation system comprises a teacher network with good performance but a large parameter count and a student network with a small parameter count; knowledge is extracted from the teacher network to obtain supervision information for training the student network, so that the student network achieves better performance.
Vehicle component segmentation is widely needed in scenarios such as intelligent damage assessment and whole-vehicle acceptance. A deep learning model is generally used to segment a collected vehicle image to obtain the segmentation result of the vehicle components. However, owing to factors such as limited mobile terminal performance and the large parameter counts of deep learning models, traditional deep learning models often perform poorly in mobile terminal scenarios. The structure and training method of the deep learning model therefore need to be designed specifically for the vehicle component segmentation scenario, to improve the speed and precision of vehicle component segmentation on mobile terminals.
Disclosure of Invention
In view of the foregoing, it is necessary to provide a knowledge distillation-based vehicle component segmentation method and related equipment to solve the technical problem of how to improve the speed and accuracy of vehicle component segmentation in mobile terminal scenarios, where the related equipment includes a knowledge distillation-based vehicle component segmentation apparatus, an electronic device, and a storage medium.
The present application provides a knowledge distillation-based vehicle component segmentation method, the method comprising:
collecting vehicle images with label data as a labeled data set, collecting vehicle images without label data as an unlabeled data set, and using the labeled data set and the unlabeled data set together as a training set, wherein the label data comprises the vehicle component type of each pixel point in the vehicle image;
building an initial teacher network, and training the initial teacher network based on the training set to obtain a second teacher network;
building an initial student network;
extracting dark knowledge of each vehicle image in the training set based on the second teacher network, and training the initial student network based on the dark knowledge to obtain a second student network, wherein the dark knowledge reflects feature similarity and position correlation among different types of vehicle components;
obtaining a component segmentation result of a real-time vehicle image based on the second student network.
In some embodiments, said training said initial teacher network to obtain a second teacher network based on said training set comprises:
training the initial teacher network based on the labeled data set and the cross entropy loss function to obtain a first teacher network;
obtaining the component segmentation result of each vehicle image in the unlabeled data set based on the first teacher network, wherein the component segmentation result comprises a category vector of each pixel point in the vehicle image, and the category vector comprises the probability value of the pixel point belonging to each vehicle component;
calculating the confidence index of each component segmentation result according to the confidence calculation model;
screening the vehicle images in the unlabeled data set based on the confidence index and a preset confidence threshold to obtain abnormal images;
and acquiring label data for all the abnormal images, and training the first teacher network based on the abnormal images and their label data to obtain a second teacher network.
In some embodiments, the confidence calculation model satisfies the relation:

α_k = 1 − (1 / (W × H × log N)) × Σ_{i=1..W} Σ_{j=1..H} [ −Σ_{c=1..N} P_c(i,j) log P_c(i,j) ]

where N is the number of vehicle component types, W × H are the width and height of the component segmentation result k, P_c(i,j) is the probability value of vehicle component type c in the category vector of pixel point (i,j) in the component segmentation result k, and α_k is the confidence index of the component segmentation result k, with value range (0, 1]; the larger the value, the higher the confidence of the component segmentation result k and the higher the accuracy of the vehicle component segmentation.
In some embodiments, said screening the vehicle images in the unlabeled data set based on the confidence index and a preset confidence threshold to obtain abnormal images comprises:
comparing the confidence index with the preset confidence threshold;
if the confidence index is larger than the preset confidence threshold, the precision of the component segmentation result corresponding to the confidence index meets the requirement, and the vehicle image corresponding to the component segmentation result is marked as a normal image;
if the confidence index is not larger than the preset confidence threshold, the precision of the component segmentation result corresponding to the confidence index does not meet the requirement, and the vehicle image corresponding to the component segmentation result is marked as an abnormal image.
In some embodiments, said extracting dark knowledge of each vehicle image in said training set based on said second teacher network and training said initial student network based on said dark knowledge results in a second student network, said dark knowledge reflecting feature similarities and location correlations between different kinds of vehicle components, comprises:
simultaneously inputting a target vehicle image into a second teacher network and an initial student network, wherein the target vehicle image is any one of all vehicle images in the training set, an output result of the second teacher network is used as a teacher component segmentation result, an output result of the initial student network is used as a student component segmentation result, and the teacher component segmentation result and the student component segmentation result both comprise a category vector of each pixel point in the target vehicle image;
constructing a position correlation matrix of the target vehicle image based on the teacher component segmentation result;
taking the category vector of the teacher component segmentation result and the position correlation matrix as the dark knowledge of the target vehicle image, and constructing a first preset loss function based on the dark knowledge;
judging whether the target vehicle image has label data to obtain a judgment parameter: if the target vehicle image has label data, constructing a cross entropy loss function based on the label data and the student component segmentation result, and setting the value of the judgment parameter to 1; if the target vehicle image has no label data, setting the value of the judgment parameter to 0;
constructing a preset loss function based on the judgment parameter and the first preset loss function;
training the initial student network to obtain a second student network based on the preset loss function and all vehicle images in the training set.
In some embodiments, said constructing a position correlation matrix of said target vehicle image based on said teacher component segmentation result comprises:
selecting the vehicle component type corresponding to the maximum probability value in the category vector of the teacher component segmentation result as the vehicle component type of each pixel point;
setting the pixel value of the pixel points of the same vehicle component type to 1 and the pixel points of other areas to 0 to obtain the area image of that vehicle component, and traversing all vehicle component types in the teacher component segmentation result to obtain the area image of each vehicle component;
arranging the pixel values in the area image of each vehicle component along the row direction in a fixed order to obtain the position vector of each vehicle component in the target vehicle image;
calculating the position correlations of different vehicle components based on the position vectors to construct the position correlation matrix of the target vehicle image, the position correlation satisfying the relation:

x_mn = h_m (h_n)^T

where h_m is the position vector of vehicle component m in the target vehicle image, (h_n)^T is the transpose of the position vector of vehicle component n in the target vehicle image, and x_mn is the position correlation between vehicle component m and vehicle component n.
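The region-image and position-correlation construction above can be sketched in NumPy. The function name and array shapes are illustrative assumptions, not from the patent; with hard argmax masks, off-diagonal entries count overlapping pixels (zero for disjoint regions) and diagonal entries are region areas.

```python
import numpy as np

def position_correlation_matrix(probs: np.ndarray) -> np.ndarray:
    """Build the position correlation matrix from a segmentation result.

    probs: (W, H, N) array of category vectors (probability over N
    vehicle component types at each pixel).  For each component type,
    the area image is a binary mask (1 where that type has the maximum
    probability, 0 elsewhere), flattened row-wise into a position
    vector h; the correlation is x_mn = h_m (h_n)^T.
    """
    W, H, N = probs.shape
    labels = probs.argmax(axis=-1)  # (W, H) hard component assignment
    # Area image per component type, flattened along rows in a fixed order.
    h = np.stack([(labels == c).ravel() for c in range(N)]).astype(float)  # (N, W*H)
    return h @ h.T  # (N, N) matrix of all x_mn

# Tiny example: a 2x2 image with 2 component types.
probs = np.array([[[0.9, 0.1], [0.8, 0.2]],
                  [[0.3, 0.7], [0.2, 0.8]]])
X = position_correlation_matrix(probs)  # diagonal holds region areas
```

The matrix is symmetric by construction, so comparing a student's matrix against the teacher's (as the loss below does) penalizes differences in both region size and region placement.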
In some embodiments, in the constructing of the preset loss function based on the judgment parameter and the first preset loss function, the preset loss function satisfies the relation:

Loss = Loss1 + β × Loss2

where Loss1 is the first preset loss function, Loss2 is the cross entropy loss function constructed based on the label data and the student component segmentation result, β is the judgment parameter taking the value 0 or 1, and Loss is the preset loss function;

the first preset loss function satisfies the relation:

Loss1 = ‖X_s − X‖_2 + (1 / (W × H)) × Σ_{i=1..W} Σ_{j=1..H} KL(P_s(i,j), P(i,j))

where X_s is the position correlation matrix of the student component segmentation result, X is the position correlation matrix of the target vehicle image, and ‖X_s − X‖_2 is the L2 distance between X_s and X; P_s(i,j) is the category vector of pixel point (i,j) in the student component segmentation result, P(i,j) is the category vector of pixel point (i,j) in the teacher component segmentation result, and KL(P_s(i,j), P(i,j)) computes the KL divergence between P_s(i,j) and P(i,j); W and H are the width and height of the target vehicle image.
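A minimal NumPy sketch of the preset loss, assuming the correlation matrices and category-vector maps are already computed. The function name is illustrative, and the KL direction (student relative to teacher) is an assumption, since the relation above does not fix it.

```python
import numpy as np

def preset_loss(X_s, X, P_s, P, labels_onehot=None):
    """Loss = Loss1 + beta * Loss2, per the relations above.

    X_s, X : (N, N) position correlation matrices (student / teacher).
    P_s, P : (W, H, N) category-vector maps (student / teacher).
    labels_onehot : (W, H, N) one-hot label data, or None when the
    image has no label data (judgment parameter beta = 0).
    """
    eps = 1e-12  # numerical guard for log
    # Loss1: L2 distance between correlation matrices plus the mean
    # pixel-wise KL divergence KL(P_s(i,j), P(i,j)).
    l2 = np.linalg.norm(X_s - X)
    kl = np.sum(P_s * (np.log(P_s + eps) - np.log(P + eps)), axis=-1)
    loss1 = l2 + kl.mean()
    # Loss2: cross entropy against the label data, used only when
    # the judgment parameter beta is 1 (labeled image).
    beta = 0 if labels_onehot is None else 1
    loss2 = 0.0
    if beta:
        loss2 = -np.mean(np.sum(labels_onehot * np.log(P_s + eps), axis=-1))
    return loss1 + beta * loss2
```

With identical student and teacher outputs and matching correlation matrices, Loss1 vanishes, so an unlabeled image contributes zero loss, which matches the intent of the distillation terms.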
Embodiments of the present application also provide a knowledge-based distillation vehicle component segmentation apparatus, the apparatus including:
a collection unit, configured to collect vehicle images with label data as a labeled data set, collect vehicle images without label data as an unlabeled data set, and use the labeled data set and the unlabeled data set together as a training set, wherein the label data comprises the vehicle component type of each pixel point in the vehicle image;
the training unit is used for building an initial teacher network and training the initial teacher network based on the training set to obtain a second teacher network;
the building unit is used for building an initial student network;
an extracting unit, configured to extract dark knowledge of each vehicle image in the training set based on the second teacher network, and train the initial student network based on the dark knowledge to obtain a second student network, where the dark knowledge reflects feature similarities and position correlations between different types of vehicle components;
and the segmentation unit is used for acquiring a component segmentation result of the real-time vehicle image based on the second student network.
An embodiment of the present application further provides an electronic device, where the electronic device includes:
a memory storing at least one instruction;
a processor executing instructions stored in the memory to implement the knowledge-based distillation vehicle component segmentation method.
Embodiments of the present application also provide a computer-readable storage medium having at least one instruction stored therein, the at least one instruction being executable by a processor in an electronic device to implement the knowledge distillation based vehicle component segmentation method.
In summary, the initial teacher network is trained twice based on the labeled data and the unlabeled data to obtain the second teacher network, and this teacher network is used to extract the dark knowledge of each vehicle image in the labeled and unlabeled data to supervise the training of the lightweight initial student network. A second student network with a small parameter count and high segmentation precision is thereby obtained; embedding the second student network in mobile terminal equipment can improve the speed and precision of vehicle component segmentation in mobile terminal scenarios.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of a knowledge-based distillation vehicle component segmentation method to which the present application is directed.
Fig. 2 is a schematic diagram of the training process of the initial student network to which the present application relates.
Fig. 3 is a functional block diagram of a preferred embodiment of the knowledge-based distillation vehicle component separation apparatus to which the present application relates.
Fig. 4 is a schematic structural diagram of an electronic device according to a preferred embodiment of the knowledge-based distillation vehicle component segmentation method according to the present application.
Detailed Description
For a clearer understanding of the objects, features and advantages of the present application, reference is made to the following detailed description of the present application along with the accompanying drawings and specific examples. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, and the described embodiments are merely some, but not all embodiments of the present application.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, features defined as "first" and "second" may explicitly or implicitly include one or more of the described features. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The embodiment of the application provides a knowledge distillation-based vehicle component segmentation method, which can be applied to one or more electronic devices. An electronic device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The electronic device may be any electronic product capable of performing human-computer interaction with a client, for example, a Personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game machine, an Internet Protocol Television (IPTV), an intelligent wearable device, and the like.
The electronic device may also include a network device and/or a client device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a Cloud Computing (Cloud Computing) based Cloud consisting of a large number of hosts or network servers.
The Network where the electronic device is located includes, but is not limited to, the internet, a wide area Network, a metropolitan area Network, a local area Network, a Virtual Private Network (VPN), and the like.
Fig. 1 is a flow chart of a preferred embodiment of the knowledge-based distillation vehicle component segmentation method of the present application. The order of the steps in the flow chart may be changed and some steps may be omitted according to different needs.
S10, collecting vehicle images with label data as a labeled data set, collecting vehicle images without label data as an unlabeled data set, and using the labeled data set and the unlabeled data set together as a training set, wherein the label data comprises the vehicle component type of each pixel point in the vehicle image.
In an optional embodiment, a large number of vehicle images are collected and the label data of each vehicle image is obtained. The label data is an image of the same size as the vehicle image, and its pixel values are preset labels representing the vehicle component type at each pixel point. The preset labels are integers from 1 to N, where N is the number of all vehicle component types including the background type; the background type is treated as a special vehicle component covering the pixel points that do not belong to any vehicle component in the vehicle image. The vehicle component types correspond one-to-one with the preset labels. All the vehicle images together with their label data are stored as the labeled data set.
In this optional embodiment, a large number of new vehicle images are also collected; the label data of these new vehicle images does not need to be acquired, and all the new vehicle images are directly stored as the unlabeled data set. The labeled data set and the unlabeled data set together serve as the training set for the subsequent training of the teacher network and the student network.
Therefore, the labeled data set and the unlabeled data set are collected as the training set, providing a data basis for the subsequent training of the teacher network and the student network.
And S11, building an initial teacher network, and training the initial teacher network based on the training set to obtain a second teacher network.
In an optional embodiment, an initial teacher network is built. Its input is a vehicle image and its expected output is the component segmentation result of the vehicle image. The component segmentation result is an image of the same size as the vehicle image and comprises a category vector for each pixel point in the vehicle image; the category vector contains the probability value of the pixel point belonging to each vehicle component, and all probability values within the same category vector sum to 1. The vehicle component type corresponding to the maximum probability value in the category vector is selected as the vehicle component type of that pixel point.
In this optional embodiment, the initial teacher network is an encoder-decoder structure: the encoder uses convolution layers to down-sample the input vehicle image into a teacher feature map, and the decoder uses deconvolution layers to up-sample the teacher feature map into the component segmentation result of the vehicle image. It should be noted that the vehicle component segmentation accuracy of the initial teacher network directly affects the segmentation accuracy of the subsequent student network, whereas its parameter count need not be considered. To ensure state-of-the-art segmentation accuracy, the initial teacher network may therefore adopt an existing high-accuracy image segmentation network such as DeepLabV3 or UNet, which the present application does not limit.
In this alternative embodiment, in order to ensure that the output of the initial teacher network is the result of the component segmentation of the vehicle image, the initial teacher network needs to be trained based on the training set to obtain a second teacher network. The training the initial teacher network to obtain a second teacher network based on the training set comprises:
training the initial teacher network based on the labeled data set and the cross entropy loss function to obtain a first teacher network;
obtaining the component segmentation result of each vehicle image in the unlabeled data set based on the first teacher network, wherein the component segmentation result comprises a category vector of each pixel point in the vehicle image, and the category vector comprises the probability value of the pixel point belonging to each vehicle component;
calculating the confidence index of each component segmentation result according to the confidence calculation model;
screening the vehicle images in the unlabeled data set based on the confidence index and a preset confidence threshold to obtain abnormal images;
and acquiring label data of all the abnormal images, and training the first teacher network to obtain a second teacher network based on the abnormal images and the label data of the abnormal images.
In this optional embodiment, the initial teacher network is trained based on the labeled data set and the cross entropy loss function to obtain the first teacher network. During training, vehicle images in the labeled data set are continuously fed into the initial teacher network to obtain output results, the value of the cross entropy loss function is calculated from the output results and the label data of the vehicle images, and the parameters of the initial teacher network are updated by gradient descent; when the value of the cross entropy loss function no longer changes, training stops and the first teacher network is obtained.
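The training loop described above (cross entropy plus gradient descent, stopping when the loss no longer changes) can be illustrated with a toy per-pixel softmax classifier in NumPy. The linear model and random data are hypothetical stand-ins for the teacher network and the labeled data set, not the patent's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # stabilize the exponent
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical stand-in for the teacher: a linear per-pixel classifier
# mapping F features to N component types (real teachers are conv
# encoder-decoders; the loop itself has the same shape).
F, N, P = 4, 3, 200                      # features, component types, pixels
W_t = rng.normal(size=(F, N)) * 0.01     # trainable parameters
feats = rng.normal(size=(P, F))          # stand-in pixel features
labels = rng.integers(0, N, size=P)      # per-pixel component labels
onehot = np.eye(N)[labels]

lr, losses = 0.5, []
for step in range(500):
    probs = softmax(feats @ W_t)
    ce = -np.mean(np.log(probs[np.arange(P), labels] + 1e-12))
    losses.append(ce)
    # d(cross entropy)/d(logits) = probs - onehot; gradient descent update.
    W_t -= lr * (feats.T @ (probs - onehot)) / P
    # Stop once the loss value no longer changes, as in the text.
    if step > 0 and abs(losses[-2] - losses[-1]) < 1e-7:
        break
```

The same loop is reused for the second training stage, only with the abnormal images and their manually annotated label data as input.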
In this optional embodiment, all the vehicle images in the unlabeled data set are sequentially input into the first teacher network to obtain the component segmentation result of each vehicle image, where the component segmentation result comprises a category vector of each pixel point in the vehicle image, and the category vector comprises the probability value of the pixel point belonging to each vehicle component. The confidence index of each component segmentation result is then calculated according to the confidence calculation model; taking the component segmentation result k as an example, the confidence calculation model satisfies the relation:

α_k = 1 − (1 / (W × H × log N)) × Σ_{i=1..W} Σ_{j=1..H} [ −Σ_{c=1..N} P_c(i,j) log P_c(i,j) ]

where N is the number of vehicle component types, W × H are the width and height of the component segmentation result k, P_c(i,j) is the probability value of vehicle component type c in the category vector of pixel point (i,j) in the component segmentation result k, and α_k is the confidence index of the component segmentation result k, with value range (0, 1]; the larger the value, the higher the confidence of the component segmentation result k and the higher the accuracy of the vehicle component segmentation.
In the above formula, −Σ_{c=1..N} P_c(i,j) log P_c(i,j) is the information entropy of the category vector of pixel point (i,j). The larger the entropy, the smaller the differences between the probability values in the category vector and the less accurate the segmentation result at pixel point (i,j); when all probability values in the category vector are equal, the entropy reaches its maximum value, log N.
For example, if component segmentation result 1 contains only one pixel point, the number of vehicle component types is 5, and the category vector of that pixel point is (0.2, 0.1, 0, 0.1, 0.6), then the confidence index of the component segmentation result is:

α_1 = 1 − [−(0.2 log 0.2 + 0.1 log 0.1 + 0.1 log 0.1 + 0.6 log 0.6)] / log 5 ≈ 1 − 1.089 / 1.609 ≈ 0.32

(taking 0 log 0 = 0).
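The confidence computation can be sketched in NumPy as follows; the function name and array layout are illustrative assumptions, not the patent's code.

```python
import numpy as np

def confidence_index(probs: np.ndarray) -> float:
    """Confidence index alpha_k of a component segmentation result.

    probs: (W, H, N) array holding the category vector (probability
    over N vehicle component types) at each pixel point.  Returns
    1 minus the mean per-pixel information entropy normalized by its
    maximum value log N, so the result lies in (0, 1].
    """
    W, H, N = probs.shape
    eps = 1e-12  # guard against log(0); 0 * log 0 is treated as 0
    entropy = -np.sum(probs * np.log(probs + eps), axis=-1)  # (W, H)
    return float(1.0 - entropy.mean() / np.log(N))

# Worked example from the text: one pixel point, 5 component types.
probs = np.array([[[0.2, 0.1, 0.0, 0.1, 0.6]]])
alpha = confidence_index(probs)
is_abnormal = alpha <= 0.6  # compare against the preset threshold
```

A perfectly confident one-hot prediction yields an index of 1, while a uniform category vector yields 0, matching the stated value range.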
in this optional embodiment, the screening, based on the reliability index and a preset reliability threshold, the vehicle image in the unlabeled dataset to obtain an abnormal image includes:
comparing the reliability index with a preset reliability threshold;
if the reliability index is larger than the preset reliability threshold, indicating that the precision of a component segmentation result corresponding to the reliability index meets the requirement, and marking a vehicle picture corresponding to the component segmentation result as a normal image;
if the reliability index is not larger than the preset reliability threshold, the accuracy of the part segmentation result corresponding to the reliability index is not met, and the vehicle picture corresponding to the part segmentation result is marked as an abnormal image. Wherein the preset confidence threshold value is 0.6.
In this optional embodiment, the first teacher network may not have learned the distinguishing features of different vehicle components well, so the precision of the component segmentation results of the abnormal images may not meet the requirement. The label data of all abnormal images is therefore obtained by manual annotation, and the first teacher network is trained on all abnormal images and their label data to constrain it to learn the distinguishing features of different vehicle components. During training, abnormal images are continuously fed into the first teacher network to obtain output results, the value of the cross entropy loss function is calculated from the output results and the label data of the abnormal images, and the parameters of the first teacher network are updated by gradient descent; when the value of the cross entropy loss function no longer changes, training stops and the second teacher network is obtained. The second teacher network can extract the distinguishing features of different vehicle components in a vehicle image and produce accurate vehicle component segmentation results.
Therefore, the initial teacher network is trained twice based on the labeled data set and the unlabeled data set in the training set to obtain a second teacher network, the second teacher network can extract the distinguishing features of different vehicle components in the vehicle image, obtain an accurate vehicle component segmentation result, and guarantee the vehicle component segmentation precision of a subsequent student network.
And S12, building an initial student network.
In an optional embodiment, an initial student network is built whose input and output are the same as those of the teacher network. The initial student network consists of a lightweight encoder and a lightweight decoder: the lightweight encoder down-samples the input vehicle image with convolution layers to obtain a student feature map, and the lightweight decoder up-samples the student feature map with deconvolution layers to obtain a component segmentation result of the same size as the input vehicle image. The lightweight encoder may adopt an existing encoder structure with a small parameter count, such as MobileNet or ShuffleNet, and the lightweight decoder may be realized by reducing the number of deconvolution layers; the specific structures of the lightweight encoder and decoder are not limited in this application.
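The patent does not fix the student's internal structure, but a quick parameter count shows why MobileNet-style encoders (built on depthwise-separable convolutions) are considered lightweight; the channel sizes below are illustrative assumptions.

```python
def conv_params(k, c_in, c_out):
    """Parameter count of a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k convolution plus pointwise 1x1 convolution,
    the building block of MobileNet-style encoders."""
    return k * k * c_in + c_in * c_out


# one 3x3 layer with 256 input and 256 output channels, as an example
standard = conv_params(3, 256, 256)
separable = depthwise_separable_params(3, 256, 256)
ratio = standard / separable  # the separable variant needs far fewer parameters
```

For these (assumed) channel counts, the standard layer holds 589,824 weights against 67,840 for the separable version, roughly an 8.7x reduction per layer, which is what makes the student small enough to embed in a mobile terminal device.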
In this optional embodiment, owing to its lightweight structure, the initial student network may be embedded into a mobile terminal device to improve the speed of vehicle component segmentation in mobile terminal scenarios. The mobile terminal device may be a portable intelligent device with a photographing function, such as a smart phone, smart watch, or tablet computer, which is not limited in this application.
Thus the building of the initial student network is completed; its parameter count is small, so it can be readily embedded into mobile terminal devices, improving the speed of vehicle component segmentation in mobile terminal scenarios.
And S13, extracting the dark knowledge of each vehicle image in the training set based on the second teacher network, and training the initial student network based on the dark knowledge to obtain a second student network, wherein the dark knowledge reflects the feature similarity and the position correlation among different types of vehicle components.
In an optional embodiment, after the initial student network is built, a second teacher network with high vehicle component segmentation precision is used for extracting dark knowledge of each vehicle image in the training set, the dark knowledge can reflect feature similarity and position correlation among different types of vehicle components in the vehicle images, the built initial student network is trained based on the training set and the dark knowledge to restrict an output result of the initial student network to be an accurate component segmentation result, and a schematic diagram of a training process of the initial student network is shown in fig. 2.
In this optional embodiment, the extracting dark knowledge of each vehicle image in the training set based on the second teacher network, and training the initial student network based on the dark knowledge to obtain a second student network, where the dark knowledge reflects feature similarities and position correlations between different types of vehicle components, includes:
simultaneously inputting a target vehicle image into a second teacher network and an initial student network, wherein the target vehicle image is any one of all vehicle images in the training set, an output result of the second teacher network is used as a teacher component segmentation result, an output result of the initial student network is used as a student component segmentation result, and the teacher component segmentation result and the student component segmentation result both comprise a category vector of each pixel point in the target vehicle image;
constructing a position correlation matrix of the target vehicle image based on the teacher component segmentation result;
taking the class vector of the segmentation result of the teacher component and the position correlation matrix as dark knowledge of the target vehicle image, and constructing a first preset loss function based on the dark knowledge;
judging whether the target vehicle image is provided with label data or not to obtain a judgment parameter, if so, constructing a cross entropy loss function based on the label data and the student part segmentation result, wherein the value of the judgment parameter is 1; if the target vehicle image does not have label data, the numerical value of the judgment parameter is 0;
constructing a preset loss function based on the judgment parameter and the first preset loss function;
training the initial student network to obtain a second student network based on the preset loss function and all vehicle images in the training set.
In this alternative embodiment, the teacher component segmentation result is an accurate component segmentation result, so the differences among the probability values in its category vectors can reflect the feature similarity between different types of vehicle components. For example, assuming the number of vehicle component types is 5, if the category vector of a pixel point is (0.1, 0.3, 0, 0, 0.6), the probability that the pixel belongs to vehicle component type 1 is 0.1, to type 2 is 0.3, and to type 5 is 0.6; this indicates that the feature similarity between component type 5 and component type 2 is greater than that between component type 5 and component type 1.
In this alternative embodiment, constructing the position correlation matrix of the target vehicle image based on the teacher component segmentation result includes: selecting, for each pixel point, the vehicle component type with the maximum probability value in its category vector as the vehicle component type of that pixel point; setting the pixel values of all pixel points of the same vehicle component type to 1 and those of all other areas to 0 to obtain the area image of that vehicle component, and traversing all vehicle component types in the teacher component segmentation result to obtain the area image of each vehicle component (for example, setting the pixel values of all pixel points of vehicle component type c in the teacher component segmentation result to 1 and all others to 0 yields the area image of vehicle component c in the target vehicle image); arranging the pixel values in each area image along the row direction in a fixed order to obtain the position vector of each vehicle component in the target vehicle image; and calculating the position correlation of different vehicle components based on the position vectors to construct the position correlation matrix of the teacher component segmentation result, which is taken as the position correlation matrix X of the target vehicle image. The position correlation matrix X is an N × N square matrix whose value in row m and column n represents the position correlation of vehicle component m and vehicle component n in the target vehicle image, satisfying the relation:

x_mn = h_m (h_n)^T

where h_m is the position vector of vehicle component m in the target vehicle image, (h_n)^T is the transpose of the position vector of vehicle component n in the target vehicle image, and x_mn is the position correlation of vehicle component m and vehicle component n.
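The construction of the position correlation matrix can be sketched directly from the steps above; the probability map below is a toy assumption standing in for a real teacher output.

```python
import numpy as np

def position_correlation(prob_map):
    """Build the N x N position correlation matrix described above.

    prob_map: array of shape (W, H, N) holding the per-pixel category
    vectors of a component segmentation result. Each component's area
    image is the binary mask of pixels whose argmax class is that
    component; flattening it along the row direction gives the position
    vector h_m, and x_mn = h_m . h_n.
    """
    W, H, N = prob_map.shape
    labels = prob_map.argmax(axis=-1)                  # component type per pixel
    h = np.stack([(labels == m).astype(float).ravel()  # position vector h_m
                  for m in range(N)])                  # shape (N, W*H)
    return h @ h.T                                     # X[m, n] = h_m . h_n


# toy 2x2 "image" with 3 component types
prob = np.array([[[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
                 [[0.6, 0.3, 0.1], [0.2, 0.2, 0.6]]])
X = position_correlation(prob)
```

Note that with hard argmax masks the component regions are disjoint, so the off-diagonal entries of X come out zero and the diagonal holds each component's pixel count; the patent does not specify whether soft (probability-weighted) masks are used instead, so the hard-mask reading here is an assumption.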
In this optional embodiment, the category vector of the teacher component segmentation result and the position correlation matrix are used as the dark knowledge of the target vehicle image, and a first preset loss function is constructed based on the dark knowledge, where the first preset loss function is used to constrain feature similarity and position correlation between different types of vehicle components in the student component segmentation result to be consistent with the dark knowledge, and the first preset loss function satisfies the following relation:
Loss1 = ‖X_s − X‖_2 + (1/(W·H)) · Σ_{i=1..W} Σ_{j=1..H} KL(P_s(i,j), P(i,j))

where X_s is the position correlation matrix of the student component segmentation result, X is the position correlation matrix of the target vehicle image, ‖X_s − X‖_2 denotes the L2 distance between X_s and X, P_s(i,j) is the category vector of pixel point (i,j) in the student component segmentation result, P(i,j) is the category vector of pixel point (i,j) in the teacher component segmentation result, KL(P_s(i,j), P(i,j)) is the KL divergence between P_s(i,j) and P(i,j), and W, H are the width and height of the target vehicle image.
It should be noted that the position correlation matrix X_s of the student component segmentation result is obtained from the student component segmentation result in the same way as the position correlation matrix X of the target vehicle image is constructed from the teacher component segmentation result.
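The first preset loss can be sketched as follows; the patent gives no relative weighting between the two terms, and the direction of the KL divergence (teacher as reference distribution) is an assumption here.

```python
import numpy as np

def loss1(X_s, X_t, P_s, P_t):
    """First preset loss: L2 distance of position correlation matrices
    plus the per-pixel KL divergence averaged over all W*H pixels.

    X_s, X_t: position correlation matrices of student / teacher results.
    P_s, P_t: per-pixel category vectors, shape (W, H, N).
    A small epsilon guards against log(0).
    """
    l2 = np.linalg.norm(X_s - X_t)  # ||X_s - X||_2
    eps = 1e-12
    # KL(teacher || student) per pixel -- the direction is an assumption
    kl = np.sum(P_t * (np.log(P_t + eps) - np.log(P_s + eps)), axis=-1)
    return l2 + kl.mean()


# identical student and teacher outputs give (near-)zero loss
P = np.full((2, 2, 3), 1.0 / 3.0)
X = np.eye(3)
```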
In this optional embodiment, it is determined whether the target vehicle image has tag data to obtain a determination parameter, and if the target vehicle image has the tag data, a cross entropy loss function is constructed based on the tag data and the student component segmentation result, where a value of the determination parameter is 1; if the target vehicle image does not have label data, the numerical value of the judgment parameter is 0; constructing a preset loss function based on the judgment parameter and the first preset loss function, wherein the preset loss function satisfies the relation:
Loss = Loss1 + β · Loss2

where Loss1 is the first preset loss function, Loss2 is the cross entropy loss function constructed based on the label data and the student component segmentation result, β is the judgment parameter, and Loss is the preset loss function.
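The gating of the supervised term by the judgment parameter β can be sketched as follows; the helper names are illustrative.

```python
import numpy as np

def cross_entropy(P_s, labels):
    """Pixel-wise cross entropy between student category vectors and labels.

    P_s: (W, H, N) student probabilities; labels: (W, H) integer class ids.
    """
    eps = 1e-12
    picked = np.take_along_axis(P_s, labels[..., None], axis=-1)[..., 0]
    return -np.mean(np.log(picked + eps))

def preset_loss(l1, P_s=None, labels=None):
    """Loss = Loss1 + beta * Loss2, with beta = 1 iff label data exists."""
    beta = 1 if labels is not None else 0
    l2 = cross_entropy(P_s, labels) if beta else 0.0
    return l1 + beta * l2


# unlabeled image: beta = 0, so only the distillation term Loss1 remains
unlabeled = preset_loss(0.25)
```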
In this optional embodiment, target vehicle pictures are continuously selected from the training set and fed to the initial student network and the second teacher network to calculate the value of the preset loss function, and the parameters of the initial student network are continuously updated by the gradient descent method. When the value of the preset loss function no longer changes, the training of the initial student network is stopped to obtain a second student network, which can extract the distinguishing features of different vehicle components in the vehicle images to obtain accurate vehicle component segmentation results.
Therefore, the dark knowledge of each vehicle picture can be extracted based on the teacher network with high vehicle component segmentation precision to supervise the training process of the initial student network, and the second student network with small parameters and capable of accurately segmenting the vehicle components is obtained.
S14, acquiring a component segmentation result of the real-time vehicle image based on the second student network.
In an optional embodiment, after the second student network with a small parameter quantity is embedded into the mobile terminal device, the user holds the mobile terminal device to collect a real-time vehicle image, and the second student network segments the real-time vehicle image to obtain a component segmentation result.
According to the technical scheme of the application, the initial teacher network is trained twice based on the labeled data and the unlabeled data to obtain the teacher network, and the teacher network is used to extract the dark knowledge of each vehicle picture in the labeled and unlabeled data to supervise the training of the lightweight initial student network. The resulting second student network has a small parameter count and high segmentation precision; embedding it into mobile terminal equipment improves both the speed and the precision of vehicle component segmentation in mobile terminal scenarios.
Referring to fig. 4, fig. 4 is a functional block diagram of a preferred embodiment of the knowledge distillation-based vehicle component segmentation apparatus of the present application. The knowledge distillation-based vehicle component segmentation device 11 comprises an acquisition unit 110, a training unit 111, a building unit 112, an extraction unit 113 and a segmentation unit 114. A module/unit referred to herein is a series of computer readable instruction segments that can be executed by the processor 13 to perform a fixed function and that are stored in the memory 12. In the present embodiment, the functions of the modules/units will be described in detail in the following embodiments.
In an alternative embodiment, the collecting unit 110 is configured to collect a vehicle image with tag data as an annotated data set, collect a vehicle image without tag data as an unlabeled data set, and use the annotated data set and the unlabeled data set as a training set, where the tag data includes a vehicle component category of each pixel point in the vehicle image.
In an optional embodiment, a large number of vehicle images are collected and the label data of each vehicle image is obtained. The label data is an image of the same size as the vehicle image, in which the pixel value at each pixel point is the preset label of the vehicle component type at that point. The preset labels are integers from 1 to N, where N is the number of all vehicle component types including the background type; the background type is treated as a special vehicle component covering the pixel points that do not belong to any vehicle component, and the vehicle component types correspond one-to-one with the preset labels. All the vehicle images and their label data are stored as the labeled data set.
In the optional embodiment, a large number of new vehicle images are collected, label data of the new vehicle images do not need to be acquired, all the new vehicle images are directly stored as unmarked data sets, and the marked data sets and the unmarked data sets are used as training sets for subsequent training of a teacher network and a student network.
In an alternative embodiment, the training unit 111 is configured to build an initial teacher network, which is trained based on the training set to obtain a second teacher network.
In an optional embodiment, an initial teacher network is established, the input of the initial teacher network is a vehicle picture, the expected output is a component segmentation result of the vehicle picture, the component segmentation result is an image equal to the vehicle image, the component segmentation result includes a category vector of each pixel point in the vehicle picture, the category vector includes probability values of the pixel point belonging to each vehicle component, and the sum of all probability values in the same pixel point category vector is 1; and selecting the vehicle component type corresponding to the maximum probability value in the category vector as the vehicle component type corresponding to the pixel point.
In this optional embodiment, the initial teacher network has an encoder-decoder structure: the encoder down-samples the input vehicle image with convolution layers to obtain a teacher feature map, which is sent to the decoder and up-sampled with deconvolution layers to obtain the component segmentation result of the vehicle image. It should be noted that the vehicle component segmentation precision of the initial teacher network directly affects the segmentation precision of the subsequent student network, so its parameter count need not be constrained; what matters is ensuring state-of-the-art segmentation precision. The initial teacher network may therefore adopt an existing high-precision image segmentation network, such as DeepLabv3 or UNet, which is not limited in this application.
In this alternative embodiment, in order to ensure that the output of the initial teacher network is the result of the component segmentation of the vehicle image, the initial teacher network needs to be trained based on the training set to obtain a second teacher network. The training the initial teacher network to obtain a second teacher network based on the training set comprises:
training the initial teacher network based on the labeled data set and the cross entropy loss function to obtain a first teacher network;
obtaining a component segmentation result of each vehicle picture in the unlabelled data set based on the first teacher network, wherein the component segmentation result comprises a category vector of each pixel point in the vehicle picture, and the category vector comprises a probability value of each vehicle component to which the pixel point belongs;
calculating the credibility index of the segmentation result of each part according to the credibility calculation model;
screening the vehicle images in the unmarked data set based on the reliability index and a preset reliability threshold value to obtain abnormal images;
and acquiring label data of all the abnormal images, and training the first teacher network to obtain a second teacher network based on the abnormal images and the label data of the abnormal images.
In this optional embodiment, the initial teacher network is trained based on the labeled data set and the cross entropy loss function to obtain a first teacher network. In the training process, vehicle pictures in the labeled data set are continuously input into the initial teacher network to obtain output results, the value of the cross entropy loss function is calculated based on the output results and the label data of the vehicle pictures, and the parameters of the initial teacher network are updated by the gradient descent method. When the value of the cross entropy loss function no longer changes, the training of the initial teacher network is stopped to obtain the first teacher network.
In this optional embodiment, all the vehicle pictures in the unlabeled data set are sequentially input into the first teacher network to obtain a component segmentation result for each vehicle picture, where the component segmentation result includes a category vector for each pixel point in the vehicle picture and the category vector includes the probability value of that pixel point belonging to each vehicle component. The credibility index of each component segmentation result is then calculated according to the credibility calculation model. Taking component segmentation result k as an example, the credibility calculation model satisfies the relation:

α_k = 1 − (1/(W·H·log N)) · Σ_{i=1..W} Σ_{j=1..H} ( −Σ_{c=1..N} p_c(i,j) · log p_c(i,j) )

where N is the number of all vehicle component types, W × H is the width and height of component segmentation result k, p_c(i,j) is the probability value of vehicle component type c in the category vector of pixel point (i,j) in component segmentation result k, and α_k is the credibility index of component segmentation result k. Its value range is (0, 1], and a larger value indicates a higher credibility of component segmentation result k and a higher precision of the vehicle component segmentation.

In the above formula, −Σ_{c=1..N} p_c(i,j) · log p_c(i,j) is the information entropy of the category vector of pixel point (i,j). A larger information entropy means smaller differences among the probability values in the category vector and a less accurate segmentation result at pixel point (i,j); when all probability values in the category vector are equal, the information entropy reaches its maximum value, log N.

For example, if component segmentation result 1 contains only one pixel point, the number of all vehicle component types is 5, and the category vector of that pixel point is (0.2, 0.1, 0, 0.1, 0.6), then the credibility index of the component segmentation result is:

α_1 = 1 − (1/log 5) · ( −(0.2·log 0.2 + 0.1·log 0.1 + 0.1·log 0.1 + 0.6·log 0.6) ) ≈ 0.32
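The credibility model above can be sketched directly in code, reconstructing the formula from the surrounding description (mean per-pixel entropy normalized by its maximum log N, subtracted from 1); zero-probability entries contribute zero to the entropy.

```python
import numpy as np

def credibility_index(prob_map):
    """Entropy-based credibility index of one component segmentation result.

    prob_map: (W, H, N) per-pixel category vectors. The per-pixel entropy
    is normalized by log N (its maximum, reached for a uniform vector), so
    the index lies in (0, 1]; a higher value means a more confident result.
    """
    W, H, N = prob_map.shape
    p = prob_map
    # treat 0 * log(0) as 0, as usual for information entropy
    ent = -np.sum(np.where(p > 0, p * np.log(p + 1e-300), 0.0), axis=-1)
    return 1.0 - ent.mean() / np.log(N)


# the single-pixel example from the text: category vector (0.2, 0.1, 0, 0.1, 0.6)
alpha = credibility_index(np.array([[[0.2, 0.1, 0.0, 0.1, 0.6]]]))
# alpha ≈ 0.32, below the 0.6 threshold, so the image would be marked abnormal
```

Because the entropy is normalized by log N, the result is the same for any logarithm base.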
in this optional embodiment, the screening, based on the reliability index and a preset reliability threshold, the vehicle image in the unlabeled dataset to obtain an abnormal image includes:
comparing the reliability index with a preset reliability threshold;
if the reliability index is larger than the preset reliability threshold, the precision of the corresponding component segmentation result meets the requirement, and the vehicle picture corresponding to the component segmentation result is marked as a normal image;
if the reliability index is not larger than the preset reliability threshold, the precision of the corresponding component segmentation result does not meet the requirement, and the vehicle picture corresponding to the component segmentation result is marked as an abnormal image. The preset reliability threshold may be 0.6.
In this optional embodiment, the first teacher network may not have learned the distinguishing features of different vehicle components well, so the precision of the component segmentation results of the abnormal images may not meet the requirement. Therefore, label data of all the abnormal images is obtained by manual annotation, and the first teacher network is trained based on all the abnormal images and their label data, constraining it to learn the distinguishing features of different vehicle components. In the training process, abnormal images are continuously input into the first teacher network to obtain output results, the value of the cross entropy loss function is calculated based on the output results and the label data of the abnormal images, and the parameters of the first teacher network are updated by the gradient descent method. When the value of the cross entropy loss function no longer changes, the training of the first teacher network is stopped to obtain a second teacher network, which can extract the distinguishing features of different vehicle components in the vehicle images and thus produce accurate vehicle component segmentation results.
In an alternative embodiment, the construction unit 112 is used to construct an initial student network.
In an optional embodiment, an initial student network is built whose input and output are the same as those of the teacher network. The initial student network consists of a lightweight encoder and a lightweight decoder: the lightweight encoder down-samples the input vehicle image with convolution layers to obtain a student feature map, and the lightweight decoder up-samples the student feature map with deconvolution layers to obtain a component segmentation result of the same size as the input vehicle image. The lightweight encoder may adopt an existing encoder structure with a small parameter count, such as MobileNet or ShuffleNet, and the lightweight decoder may be realized by reducing the number of deconvolution layers; the specific structures of the lightweight encoder and decoder are not limited in this application.
In this optional embodiment, owing to its lightweight structure, the initial student network may be embedded into a mobile terminal device to improve the speed of vehicle component segmentation in mobile terminal scenarios. The mobile terminal device may be a portable intelligent device with a photographing function, such as a smart phone, smart watch, or tablet computer, which is not limited in this application.
In an alternative embodiment, the extracting unit 113 is configured to extract dark knowledge of each vehicle image in the training set based on the second teacher network, and train the initial student network to obtain a second student network based on the dark knowledge, where the dark knowledge reflects feature similarities and position correlations between different types of vehicle components.
In an optional embodiment, after the initial student network is built, a second teacher network with high vehicle component segmentation precision is used for extracting dark knowledge of each vehicle image in the training set, the dark knowledge can reflect feature similarity and position correlation among different types of vehicle components in the vehicle images, the built initial student network is trained based on the training set and the dark knowledge to restrict an output result of the initial student network to be an accurate component segmentation result, and a schematic diagram of a training process of the initial student network is shown in fig. 2.
In this optional embodiment, the extracting dark knowledge of each vehicle image in the training set based on the second teacher network, and training the initial student network based on the dark knowledge to obtain a second student network, where the dark knowledge reflects feature similarities and position correlations between different types of vehicle components, includes:
simultaneously inputting a target vehicle image into a second teacher network and an initial student network, wherein the target vehicle image is any one of all vehicle images in the training set, an output result of the second teacher network is used as a teacher component segmentation result, an output result of the initial student network is used as a student component segmentation result, and the teacher component segmentation result and the student component segmentation result both comprise a category vector of each pixel point in the target vehicle image;
constructing a position correlation matrix of the target vehicle image based on the teacher component segmentation result;
taking the class vector of the segmentation result of the teacher component and the position correlation matrix as dark knowledge of the target vehicle image, and constructing a first preset loss function based on the dark knowledge;
judging whether the target vehicle image is provided with label data or not to obtain a judgment parameter, if so, constructing a cross entropy loss function based on the label data and the student part segmentation result, wherein the value of the judgment parameter is 1; if the target vehicle image does not have label data, the numerical value of the judgment parameter is 0;
constructing a preset loss function based on the judgment parameter and the first preset loss function;
training the initial student network to obtain a second student network based on the preset loss function and all vehicle images in the training set.
In this alternative embodiment, the teacher component segmentation result is an accurate component segmentation result, so that the difference of different probability values in the class vector of the teacher component segmentation result may reflect the feature similarity between different types of vehicle components. For example, assuming that the number of types of vehicle components is 5, if the category vector of one pixel point is (0.1,0.3,0,0,0.6), it indicates that the probability that the pixel point belongs to the vehicle component type 1 is 0.1, the probability that the pixel point belongs to the vehicle component type 2 is 0.3, and the probability that the pixel point belongs to the vehicle component type 5 is 0.6, which indicates that the feature similarity between the vehicle component type 5 and the vehicle component type 2 is greater than the feature similarity between the vehicle component type 5 and the vehicle component type 1.
In this alternative embodiment, constructing the position correlation matrix of the target vehicle image based on the teacher component segmentation result includes: selecting, for each pixel point, the vehicle component type with the maximum probability value in its category vector as the vehicle component type of that pixel point; setting the pixel values of all pixel points of the same vehicle component type to 1 and those of all other areas to 0 to obtain the area image of that vehicle component, and traversing all vehicle component types in the teacher component segmentation result to obtain the area image of each vehicle component (for example, setting the pixel values of all pixel points of vehicle component type c in the teacher component segmentation result to 1 and all others to 0 yields the area image of vehicle component c in the target vehicle image); arranging the pixel values in each area image along the row direction in a fixed order to obtain the position vector of each vehicle component in the target vehicle image; and calculating the position correlation of different vehicle components based on the position vectors to construct the position correlation matrix of the teacher component segmentation result, which is taken as the position correlation matrix X of the target vehicle image. The position correlation matrix X is an N × N square matrix whose value in row m and column n represents the position correlation of vehicle component m and vehicle component n in the target vehicle image, satisfying the relation:

x_mn = h_m (h_n)^T

where h_m is the position vector of vehicle component m in the target vehicle image, (h_n)^T is the transpose of the position vector of vehicle component n in the target vehicle image, and x_mn is the position correlation of vehicle component m and vehicle component n.
In this optional embodiment, the category vector of the teacher component segmentation result and the position correlation matrix are used as the dark knowledge of the target vehicle image, and a first preset loss function is constructed based on the dark knowledge, where the first preset loss function is used to constrain feature similarity and position correlation between different types of vehicle components in the student component segmentation result to be consistent with the dark knowledge, and the first preset loss function satisfies the following relation:
Loss1 = ‖X_s − X‖_2 + (1/(W·H)) · Σ_{i=1..W} Σ_{j=1..H} KL(P_s(i,j), P(i,j))

where X_s is the position correlation matrix of the student component segmentation result, X is the position correlation matrix of the target vehicle image, ‖X_s − X‖_2 denotes the L2 distance between X_s and X, P_s(i,j) is the category vector of pixel point (i,j) in the student component segmentation result, P(i,j) is the category vector of pixel point (i,j) in the teacher component segmentation result, KL(P_s(i,j), P(i,j)) is the KL divergence between P_s(i,j) and P(i,j), and W, H are the width and height of the target vehicle image.
It should be noted that the position correlation matrix X_s of the student component segmentation result is obtained from the student component segmentation result in the same way as the position correlation matrix X of the target vehicle image is constructed from the teacher component segmentation result.
In this optional embodiment, whether the target vehicle image has label data is judged to obtain a judgment parameter. If the target vehicle image has label data, a cross entropy loss function is constructed based on the label data and the student component segmentation result, and the value of the judgment parameter is 1; if the target vehicle image does not have label data, the value of the judgment parameter is 0. A preset loss function is then constructed based on the judgment parameter and the first preset loss function, and the preset loss function satisfies the relation:
Loss = Loss1 + β · Loss2
the Loss1 is a first preset Loss function, the Loss2 is a cross entropy Loss function constructed based on the label data and the student component segmentation result, beta is a judgment parameter, and the Loss is a preset Loss function.
In this optional embodiment, target vehicle images are continuously selected from the training set and fed to the initial student network and the second teacher network to calculate the value of the preset loss function; the parameters of the initial student network are continuously updated by gradient descent, and when the value of the preset loss function no longer changes, training of the initial student network stops and the second student network is obtained. The second student network can extract distinct features for different vehicle components in a vehicle image and thus produce accurate vehicle component segmentation results.
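The stopping rule above (update by gradient descent until the loss value no longer changes) can be sketched generically as follows; `step_fn`, the tolerance, and the learning rate are illustrative assumptions, and in the patent's setting the loss would be the preset loss on an image drawn from the training set and `params` the student network weights:

```python
def train_until_converged(step_fn, params, lr=0.1, tol=1e-8, max_iter=10000):
    """Apply gradient descent updates and stop once the loss value no
    longer changes (within tol). step_fn(params) -> (loss, gradient)."""
    prev_loss = float("inf")
    for _ in range(max_iter):
        loss, grad = step_fn(params)
        if abs(prev_loss - loss) < tol:
            break  # loss no longer changes: training stops
        params = params - lr * grad
        prev_loss = loss
    return params
```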
In an alternative embodiment, the segmentation unit 114 is configured to obtain a component segmentation result of the real-time vehicle image based on the second student network.
In an optional embodiment, after the second student network, which has a small number of parameters, is embedded into the mobile terminal device, the user holds the mobile terminal device to collect real-time vehicle images, and the second student network segments each real-time vehicle image to obtain a component segmentation result.
According to the technical scheme, the initial teacher network is trained twice, on the labeled data and the unlabeled data, to obtain the teacher network; the teacher network extracts the dark knowledge of each vehicle picture in the labeled and unlabeled data to supervise the training of the lightweight initial student network, yielding a second student network with few parameters and high segmentation precision. Embedding the second student network into the mobile terminal device improves both the speed and the precision of vehicle component segmentation in the mobile terminal scenario.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device 1 comprises a memory 12 and a processor 13. The memory 12 is used for storing computer readable instructions, and the processor 13 is used for executing the computer readable instructions stored in the memory to implement the knowledge distillation-based vehicle component segmentation method according to any one of the above embodiments.
In an alternative embodiment, the electronic device 1 further comprises a bus and a computer program stored in said memory 12 and executable on said processor 13, such as a knowledge distillation-based vehicle component segmentation program.
Fig. 4 only shows the electronic device 1 with the memory 12 and the processor 13, and it will be understood by a person skilled in the art that the structure shown in fig. 4 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components.
In conjunction with fig. 1, the memory 12 in the electronic device 1 stores a plurality of computer readable instructions to implement a knowledge distillation-based vehicle component segmentation method, and the processor 13 is operable to execute the plurality of instructions to implement:
collecting a vehicle image with tag data as a tagged data set, collecting a vehicle image without tag data as an unlabeled data set, and using the tagged data set and the unlabeled data set as training sets, wherein the tag data comprises the vehicle part type of each pixel point in the vehicle image;
building an initial teacher network, and training the initial teacher network based on the training set to obtain a second teacher network;
building an initial student network;
extracting dark knowledge of each vehicle image in the training set based on the second teacher network, and training the initial student network based on the dark knowledge to obtain a second student network, wherein the dark knowledge reflects feature similarity and position correlation among different types of vehicle components;
obtaining a component segmentation result of a real-time vehicle image based on the second student network.
Specifically, the processor 13 may refer to the description of the relevant steps in the embodiment corresponding to fig. 1 for a specific implementation method of the instruction, which is not described herein again.
It will be understood by those skilled in the art that the schematic diagram is only an example of the electronic device 1, and does not constitute a limitation to the electronic device 1, the electronic device 1 may have a bus-type structure or a star-shaped structure, the electronic device 1 may further include more or less hardware or software than those shown in the figures, or different component arrangements, for example, the electronic device 1 may further include an input and output device, a network access device, etc.
It should be noted that the electronic device 1 is only an example; other existing or future electronic products that may be adapted to the present application should also be included in the scope of protection of the present application and are incorporated herein by reference.
Memory 12 includes at least one type of readable storage medium, which may be non-volatile or volatile. The readable storage medium includes flash memory, removable hard disks, multimedia cards, card type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disks, optical disks, etc. The memory 12 may in some embodiments be an internal storage unit of the electronic device 1, for example a removable hard disk of the electronic device 1. The memory 12 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the electronic device 1. The memory 12 may be used not only to store application software installed in the electronic device 1 and various types of data, such as the code of a knowledge distillation-based vehicle component segmentation program, but also to temporarily store data that has been output or is to be output.
The processor 13 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital processing chips, graphics processors, and combinations of various control chips. The processor 13 is the Control Unit of the electronic device 1; it connects the various components of the electronic device 1 by various interfaces and lines, and executes various functions and processes data of the electronic device 1 by running or executing programs or modules stored in the memory 12 (for example, executing a knowledge distillation-based vehicle component segmentation program) and calling data stored in the memory 12.
The processor 13 executes an operating system of the electronic device 1 and various installed application programs. The processor 13 executes the application program to implement the steps in the various knowledge distillation-based vehicle component segmentation method embodiments described above, such as the steps shown in fig. 1.
Illustratively, the computer program may be partitioned into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to accomplish the present application. The one or more modules/units may be a series of computer-readable instruction segments capable of performing certain functions, which are used to describe the execution of the computer program in the electronic device 1. For example, the computer program may be segmented into an acquisition unit 110, a training unit 111, a construction unit 112, an extraction unit 113, a segmentation unit 114.
The integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a computer device, or a network device) or a processor (Processor) to execute the parts of the knowledge distillation-based vehicle component segmentation method according to the embodiments of the present application.
The integrated modules/units of the electronic device 1 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow in the method of the embodiments described above may be implemented by a computer program, which may be stored in a computer readable storage medium, and when the computer program is executed by a processor, the steps of the embodiments of the methods described above may be implemented.
Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), random-access Memory and other Memory, etc.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
The blockchain referred to in this application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks associated by cryptographic methods, where each data block contains the information of a batch of network transactions, used for verifying the validity (anti-counterfeiting) of the information and generating the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one arrow is shown in FIG. 4, but this does not indicate only one bus or one type of bus. The bus is arranged to enable connection communication between the memory 12 and at least one processor 13 or the like.
The present application further provides a computer-readable storage medium (not shown), in which computer-readable instructions are stored, and the computer-readable instructions are executed by a processor in an electronic device to implement the knowledge distillation-based vehicle component segmentation method according to any one of the above embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, functional modules in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the specification may also be implemented by one unit or means through software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only used for illustrating the technical solutions of the present application and not for limiting, and although the present application is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions can be made on the technical solutions of the present application without departing from the spirit and scope of the technical solutions of the present application.

Claims (10)

1. A knowledge distillation-based vehicle component segmentation method, characterized in that the method comprises:
collecting a vehicle image with tag data as a tagged data set, collecting a vehicle image without tag data as an unlabeled data set, and using the tagged data set and the unlabeled data set as training sets, wherein the tag data comprises the vehicle part type of each pixel point in the vehicle image;
building an initial teacher network, and training the initial teacher network based on the training set to obtain a second teacher network;
building an initial student network;
extracting dark knowledge of each vehicle image in the training set based on the second teacher network, and training the initial student network based on the dark knowledge to obtain a second student network, wherein the dark knowledge reflects feature similarity and position correlation among different types of vehicle components;
obtaining a component segmentation result of a real-time vehicle image based on the second student network.
2. The knowledge distillation-based vehicle component segmentation method of claim 1, wherein the training the initial teacher network based on the training set to obtain a second teacher network comprises:
training the initial teacher network based on the labeled data set and the cross entropy loss function to obtain a first teacher network;
obtaining a component segmentation result of each vehicle picture in the unlabelled data set based on the first teacher network, wherein the component segmentation result comprises a category vector of each pixel point in the vehicle picture, and the category vector comprises a probability value of each vehicle component to which the pixel point belongs;
calculating a reliability index of each component segmentation result according to a reliability calculation model;
screening the vehicle images in the unmarked data set based on the credibility index and a preset credibility threshold value to obtain abnormal images;
and acquiring label data of all the abnormal images, and training the first teacher network to obtain a second teacher network based on the abnormal images and the label data of the abnormal images.
3. The knowledge distillation-based vehicle component segmentation method of claim 2, wherein the reliability calculation model satisfies the relation:
α_k = (1/(W·H)) Σ_{i=1}^{W} Σ_{j=1}^{H} max_{c∈{1,…,N}} p_c(i, j)
where N is the number of kinds of all vehicle components, W × H are the width and height dimensions of the component segmentation result k, p_c(i, j) is the probability value of vehicle component type c in the category vector of pixel point (i, j) in the component segmentation result k, and α_k is the reliability index of the component segmentation result k, with value range (0, 1]; the larger the value, the higher the reliability of the component segmentation result k and the higher the accuracy of the vehicle component segmentation.
4. The knowledge distillation-based vehicle component segmentation method of claim 2, wherein the screening the vehicle images in the unlabeled dataset based on the reliability index and a preset reliability threshold to obtain an abnormal image comprises:
comparing the reliability index with a preset reliability threshold;
if the reliability index is larger than the preset reliability threshold, indicating that the precision of a component segmentation result corresponding to the reliability index meets the requirement, and marking a vehicle picture corresponding to the component segmentation result as a normal image;
if the reliability index is not larger than the preset reliability threshold, indicating that the precision of the component segmentation result corresponding to the reliability index does not meet the requirement, and marking the vehicle picture corresponding to the component segmentation result as an abnormal image.
5. The knowledge distillation-based vehicle component segmentation method of claim 1, wherein the extracting dark knowledge of each vehicle image in the training set based on the second teacher network and training the initial student network based on the dark knowledge to obtain a second student network, the dark knowledge reflecting feature similarities and position correlations between different types of vehicle components, comprises:
simultaneously inputting a target vehicle image into a second teacher network and an initial student network, wherein the target vehicle image is any one of all vehicle images in the training set, an output result of the second teacher network is used as a teacher component segmentation result, an output result of the initial student network is used as a student component segmentation result, and the teacher component segmentation result and the student component segmentation result both comprise a category vector of each pixel point in the target vehicle image;
constructing a position correlation matrix of the target vehicle image based on the teacher component segmentation result;
taking the class vector of the segmentation result of the teacher component and the position correlation matrix as dark knowledge of the target vehicle image, and constructing a first preset loss function based on the dark knowledge;
judging whether the target vehicle image is provided with label data or not to obtain a judgment parameter, if so, constructing a cross entropy loss function based on the label data and the student part segmentation result, wherein the value of the judgment parameter is 1; if the target vehicle image does not have label data, the numerical value of the judgment parameter is 0;
constructing a preset loss function based on the judgment parameter and the first preset loss function;
training the initial student network to obtain a second student network based on the preset loss function and all vehicle images in the training set.
6. The knowledge distillation-based vehicle component segmentation method of claim 5, wherein the constructing a position correlation matrix of the target vehicle image based on the teacher component segmentation result comprises:
selecting the vehicle component type corresponding to the maximum probability value in the class vector of the teacher component segmentation result as the vehicle component type of each pixel point;
setting the pixel value of the pixel point of the same vehicle component type as 1, setting the pixel points of other areas as 0 to obtain an area image of the vehicle component, and traversing all the vehicle component types in the segmentation result of the teacher component so as to obtain the area image of each vehicle component;
arranging pixel values in the area image of the vehicle part along a row direction according to a fixed sequence to obtain a position vector of each vehicle part in the target vehicle image;
calculating position correlations of different vehicle components based on the position vectors to construct a position correlation matrix of the target vehicle image, the position correlations satisfying the relation:
x_mn = h_m (h_n)^T
where h_m is the position vector of vehicle component m in the target vehicle image, (h_n)^T is the transpose of the position vector of vehicle component n in the target vehicle image, and x_mn is the position correlation of vehicle component m and vehicle component n.
7. The knowledge distillation-based vehicle component segmentation method according to claim 5, wherein, in the constructing of the preset loss function based on the judgment parameter and the first preset loss function, the preset loss function satisfies the relation:
Loss = Loss1 + β · Loss2
the Loss1 is a first preset Loss function, the Loss2 is a cross entropy Loss function constructed based on label data and the student component segmentation result, beta is a judgment parameter and takes a value of 0 or 1, and the Loss is the preset Loss function;
the first predetermined loss function satisfies the relation:
Loss1 = ‖X_s − X‖_2 + (1/(W·H)) Σ_{i=1}^{W} Σ_{j=1}^{H} KL(P_s(i, j), P(i, j))
wherein X_s is the position correlation matrix of the student component segmentation result, X is the position correlation matrix of the target vehicle image, and ‖X_s − X‖_2 denotes the L2 distance between X_s and X; P_s(i, j) is the category vector of pixel point (i, j) in the student component segmentation result, P(i, j) is the category vector of pixel point (i, j) in the teacher component segmentation result, and KL(P_s(i, j), P(i, j)) denotes the KL divergence between P_s(i, j) and P(i, j); W and H are the width and height dimensions of the target vehicle image.
8. A knowledge distillation-based vehicle component segmentation device, comprising:
the system comprises a collecting unit, a judging unit and a judging unit, wherein the collecting unit is used for collecting a vehicle image with label data as an labeled data set, collecting a vehicle image without the label data as an unlabeled data set, and using the labeled data set and the unlabeled data set as training sets, wherein the label data comprises the vehicle part type of each pixel point in the vehicle image;
the training unit is used for establishing an initial teacher network and training the initial teacher network based on the training set to obtain a second teacher network;
the building unit is used for building an initial student network;
an extracting unit, configured to extract dark knowledge of each vehicle image in the training set based on the second teacher network, and train the initial student network based on the dark knowledge to obtain a second student network, where the dark knowledge reflects feature similarities and position correlations between different types of vehicle components;
and the segmentation unit is used for acquiring a part segmentation result of the real-time vehicle image based on the second student network.
9. An electronic device, characterized in that the electronic device comprises:
a memory storing computer readable instructions; and
a processor executing computer readable instructions stored in the memory to implement the knowledge distillation based vehicle component segmentation method of any one of claims 1 to 7.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon computer readable instructions which, when executed by a processor, implement the knowledge distillation based vehicle component segmentation method as claimed in any one of claims 1 to 7.
CN202210791176.8A 2022-06-20 2022-06-20 Knowledge distillation-based vehicle component segmentation method and related equipment Pending CN115063589A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210791176.8A CN115063589A (en) 2022-06-20 2022-06-20 Knowledge distillation-based vehicle component segmentation method and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210791176.8A CN115063589A (en) 2022-06-20 2022-06-20 Knowledge distillation-based vehicle component segmentation method and related equipment

Publications (1)

Publication Number Publication Date
CN115063589A true CN115063589A (en) 2022-09-16

Family

ID=83203611

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210791176.8A Pending CN115063589A (en) 2022-06-20 2022-06-20 Knowledge distillation-based vehicle component segmentation method and related equipment

Country Status (1)

Country Link
CN (1) CN115063589A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115641443A (en) * 2022-12-08 2023-01-24 北京鹰瞳科技发展股份有限公司 Method for training image segmentation network model, method for processing image and product
CN115908253A (en) * 2022-10-18 2023-04-04 中科(黑龙江)数字经济研究院有限公司 Knowledge distillation-based cross-domain medical image segmentation method and device
CN116863279A (en) * 2023-09-01 2023-10-10 南京理工大学 Model distillation method for mobile terminal model light weight based on interpretable guidance



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination