CN117132847A - Power expression damaged image self-labeling model training method based on domain generalization - Google Patents

Power expression damaged image self-labeling model training method based on domain generalization

Info

Publication number
CN117132847A
Authority
CN
China
Prior art keywords
image
trained
damaged
marking
training
Prior art date
Legal status
Pending
Application number
CN202311095122.9A
Other languages
Chinese (zh)
Inventor
何宇浩
宋云海
周震震
何森
王黎伟
王奇
李为明
肖耀辉
何珏
曾少豪
Current Assignee
China Southern Power Grid Corp Ultra High Voltage Transmission Co Electric Power Research Institute
Original Assignee
China Southern Power Grid Corp Ultra High Voltage Transmission Co Electric Power Research Institute
Priority date
Filing date
Publication date
Application filed by China Southern Power Grid Corp Ultra High Voltage Transmission Co Electric Power Research Institute
Priority to CN202311095122.9A
Publication of CN117132847A

Classifications

    • G06V 10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/7715: Feature extraction, e.g. by transforming the feature space; mappings, e.g. subspace methods
    • G06V 10/809: Fusion of classification results, e.g. where the classifiers operate on the same input data
    • G06V 10/82: Image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V 20/70: Labelling scene content, e.g. deriving syntactic or semantic representations
    • Y04S 10/50: Systems or methods supporting power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a domain-generalization-based self-labeling model training method for power marking-damage images. The method comprises the following steps: acquiring a training marking-damage image and a to-be-trained marking-damage image annotation model, where the to-be-trained marking-damage image annotation model includes a to-be-trained image feature extraction layer and a to-be-trained image information detection layer; training the to-be-trained image feature extraction layer with the training marking-damage image to obtain a trained image feature extraction layer, the trained image feature extraction layer being a feature extraction layer constructed on the basis of domain-invariant features; training the to-be-trained image information detection layer with the training marking-damage image to obtain a trained image information detection layer; and obtaining a trained marking-damage image annotation model from the trained image feature extraction layer and the trained image information detection layer. With this method, marking-damage images can be annotated automatically and the efficiency of marking-damage recognition is improved.

Description

Power expression damaged image self-labeling model training method based on domain generalization
Technical Field
The application relates to the technical field of artificial intelligence, and in particular to a domain-generalization-based self-labeling model training method for power marking-damage images.
Background
With the development of artificial intelligence, intelligent recognition technology has emerged. It uses computers to process, analyze and understand images in order to recognize targets and objects of various kinds, and is a practical application of deep learning algorithms. In the power industry, intelligent recognition has become the prevailing approach for identifying a common type of damage to insulation devices in substations, namely damage to identification markings.
In the conventional technology, recognizing such marking damage on substation insulation devices still depends on building an annotated sample library, which requires manually labeling samples and devising annotation schemes for marking images, so the efficiency of marking-damage recognition is low.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a domain-generalization-based power marking-damage image self-labeling model training method, apparatus, computer device, computer-readable storage medium and computer program product that can improve the efficiency of marking-damage recognition.
In a first aspect, the application provides a domain-generalization-based power marking-damage image self-labeling model training method. The method comprises the following steps: acquiring a training marking-damage image and a to-be-trained marking-damage image annotation model, where the to-be-trained marking-damage image annotation model includes a to-be-trained image feature extraction layer and a to-be-trained image information detection layer; training the to-be-trained image feature extraction layer with the training marking-damage image to obtain a trained image feature extraction layer, the trained image feature extraction layer being a feature extraction layer constructed on the basis of domain-invariant features; training the to-be-trained image information detection layer with the training marking-damage image to obtain a trained image information detection layer; and obtaining a trained marking-damage image annotation model from the trained image feature extraction layer and the trained image information detection layer.
In a second aspect, the application further provides a domain-generalization-based power marking-damage image self-labeling model training device. The device comprises: a data information acquisition module for acquiring a training marking-damage image and a to-be-trained marking-damage image annotation model, where the to-be-trained marking-damage image annotation model includes a to-be-trained image feature extraction layer and a to-be-trained image information detection layer; a feature extraction layer training module for training the to-be-trained image feature extraction layer with the training marking-damage image to obtain a trained image feature extraction layer, the trained image feature extraction layer being a feature extraction layer constructed on the basis of domain-invariant features; a feature detection layer training module for training the to-be-trained image information detection layer with the training marking-damage image to obtain a trained image information detection layer; and an annotation model obtaining module for obtaining a trained marking-damage image annotation model from the trained image feature extraction layer and the trained image information detection layer.
In a third aspect, the application also provides a computer device. The computer device comprises a memory storing a computer program and a processor that, when executing the computer program, performs the following steps: acquiring a training marking-damage image and a to-be-trained marking-damage image annotation model, where the to-be-trained marking-damage image annotation model includes a to-be-trained image feature extraction layer and a to-be-trained image information detection layer; training the to-be-trained image feature extraction layer with the training marking-damage image to obtain a trained image feature extraction layer, the trained image feature extraction layer being a feature extraction layer constructed on the basis of domain-invariant features; training the to-be-trained image information detection layer with the training marking-damage image to obtain a trained image information detection layer; and obtaining a trained marking-damage image annotation model from the trained image feature extraction layer and the trained image information detection layer.
In a fourth aspect, the application also provides a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, performs the following steps: acquiring a training marking-damage image and a to-be-trained marking-damage image annotation model, where the to-be-trained marking-damage image annotation model includes a to-be-trained image feature extraction layer and a to-be-trained image information detection layer; training the to-be-trained image feature extraction layer with the training marking-damage image to obtain a trained image feature extraction layer, the trained image feature extraction layer being a feature extraction layer constructed on the basis of domain-invariant features; training the to-be-trained image information detection layer with the training marking-damage image to obtain a trained image information detection layer; and obtaining a trained marking-damage image annotation model from the trained image feature extraction layer and the trained image information detection layer.
In a fifth aspect, the application also provides a computer program product. The computer program product comprises a computer program that, when executed by a processor, implements the following steps: acquiring a training marking-damage image and a to-be-trained marking-damage image annotation model, where the to-be-trained marking-damage image annotation model includes a to-be-trained image feature extraction layer and a to-be-trained image information detection layer; training the to-be-trained image feature extraction layer with the training marking-damage image to obtain a trained image feature extraction layer, the trained image feature extraction layer being a feature extraction layer constructed on the basis of domain-invariant features; training the to-be-trained image information detection layer with the training marking-damage image to obtain a trained image information detection layer; and obtaining a trained marking-damage image annotation model from the trained image feature extraction layer and the trained image information detection layer.
The above domain-generalization-based power marking-damage image self-labeling model training method, device, computer equipment, storage medium and computer program product acquire a training marking-damage image and a to-be-trained marking-damage image annotation model, where the to-be-trained marking-damage image annotation model includes a to-be-trained image feature extraction layer and a to-be-trained image information detection layer; train the to-be-trained image feature extraction layer with the training marking-damage image to obtain a trained image feature extraction layer, the trained image feature extraction layer being a feature extraction layer constructed on the basis of domain-invariant features; train the to-be-trained image information detection layer with the training marking-damage image to obtain a trained image information detection layer; and obtain a trained marking-damage image annotation model from the trained image feature extraction layer and the trained image information detection layer.
The to-be-trained image feature extraction layer and the to-be-trained image information detection layer of the to-be-trained marking-damage image annotation model are trained with the training marking-damage image, and the trained image feature extraction layer is embedded into the trained image information detection layer to obtain the trained marking-damage image annotation model. With a domain generalization algorithm, a domain-invariant marking-damage image annotation model can be trained from a small number of data sets drawn from different domains, so the annotation model has strong transfer capability. Using this annotation model to detect unannotated marking-damage images automatically realizes self-labeling of marking-damage images and improves the efficiency of marking-damage recognition.
Drawings
FIG. 1 is a diagram of an application environment of a domain-generalization-based power marking-damage image self-labeling model training method in one embodiment;
FIG. 2 is a flow chart of a domain-generalization-based power marking-damage image self-labeling model training method in one embodiment;
FIG. 3 is a flow chart of a method for obtaining a trained image feature extraction layer in one embodiment;
FIG. 4 is a flow chart of a method for obtaining image-domain-invariant feature values in one embodiment;
FIG. 5 is a flow chart of a method for obtaining image-domain-invariant feature values in another embodiment;
FIG. 6 is a flow chart of a method for obtaining image position feature vectors in one embodiment;
FIG. 7 is a flow chart of a method for obtaining a trained marking-damage image annotation model in one embodiment;
FIG. 8 is a flow chart of a method for obtaining a tested marking-damage image annotation model in one embodiment;
FIG. 9 is a schematic diagram of the cross-domain reconstruction principle in one embodiment;
FIG. 10 is a schematic diagram of the connection structure of a marking-damage image annotation model in one embodiment;
FIG. 11 is a block diagram of a domain-generalization-based power marking-damage image self-labeling model training device in one embodiment;
FIG. 12 is a diagram of the internal structure of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The domain-generalization-based power marking-damage image self-labeling model training method provided by the embodiments of the application can be applied in the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. A data storage system may store the data the server 104 needs to process; it may be integrated on the server 104 or located on a cloud or other network server. The server 104 acquires a training marking-damage image and a to-be-trained marking-damage image annotation model from the terminal 102, where the to-be-trained marking-damage image annotation model includes a to-be-trained image feature extraction layer and a to-be-trained image information detection layer; trains the to-be-trained image feature extraction layer with the training marking-damage image to obtain a trained image feature extraction layer, the trained image feature extraction layer being a feature extraction layer constructed on the basis of domain-invariant features; trains the to-be-trained image information detection layer with the training marking-damage image to obtain a trained image information detection layer; and obtains a trained marking-damage image annotation model from the trained image feature extraction layer and the trained image information detection layer. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, Internet-of-Things device or portable wearable device; the Internet-of-Things device may be a smart speaker, smart television, smart air conditioner, smart vehicle device or the like, and the portable wearable device may be a smart watch, smart bracelet, headset or the like. The server 104 may be implemented as a stand-alone server or as a server cluster composed of multiple servers.
In one embodiment, as shown in FIG. 2, a domain-generalization-based power marking-damage image self-labeling model training method is provided. Taking the method as applied to the server in FIG. 1 as an example, it includes the following steps:
Step 202: obtain a training marking-damage image and a to-be-trained marking-damage image annotation model.
The training marking-damage image may be an image used to train an artificial intelligence model, in which marking damage is present.
The to-be-trained marking-damage image annotation model may be an image annotation model, built from neural networks, that has not yet been trained; it comprises a to-be-trained image feature extraction layer and a to-be-trained image information detection layer.
The to-be-trained image feature extraction layer may be the feature extraction layer of the marking-damage image annotation model that is responsible for extracting domain-invariant features but has not yet been trained.
The to-be-trained image information detection layer may be the neural network in the marking-damage image annotation model that performs detection but has not yet been trained; for example, the image information detection layer may be built on Faster R-CNN, a real-time target detector that uses a Region Proposal Network.
Specifically, in response to an instruction from the terminal 102 to construct a cross-domain reconstruction task and a marking-damage image annotation model, the server 104 obtains the training marking-damage image and the to-be-trained marking-damage image annotation model from the terminal 102, where the to-be-trained marking-damage image annotation model includes the to-be-trained image feature extraction layer and the to-be-trained image information detection layer. The obtained training marking-damage image and to-be-trained marking-damage image annotation model are stored in a storage unit; whenever the server needs to process any of this data, volatile storage resources are called from the storage unit for computation by the central processing unit. The data may be input to the central processing unit one item at a time or several items at a time.
For example, in response to the instruction from the terminal 102, the server 104 acquires the training marking-damage image and the to-be-trained marking-damage image annotation model from the terminal 102 and stores them in its storage unit; the ten data items corresponding to the acquired training marking-damage image and to-be-trained marking-damage image annotation model may then be input into the central processing unit simultaneously.
Step 204: train the to-be-trained image feature extraction layer with the training marking-damage image to obtain a trained image feature extraction layer.
The trained image feature extraction layer may be the feature extraction layer of the marking-damage image annotation model that extracts domain-invariant features and has been trained.
Specifically, first: because the training marking-damage image comprises several different marking-damage sub-images, each from a different substation, the to-be-trained image feature extraction layer applies a style-mixing (stylemix) operation to each marking-damage sub-image, mixing it with the styles of the other substations to obtain the mixed marking-damage images. The style mixing of each marking-damage sub-image proceeds as follows. For mixing in Fourier space, given a training marking-damage image x from the j-th substation and an auxiliary training marking-damage image x_aux randomly selected from the i-th substation (i not equal to j), a view v_i of the training marking-damage image x can be expressed, using a mixing coefficient λ, as:
v_i = K^{-1}( (1 - λ) K_A(x) + λ K_A(x_aux), K_P(x) ),
that is, the inverse Fourier transform is applied to the amplitude spectrum mixed across the two substations together with the unchanged phase spectrum of x, where K^{-1} denotes the inverse Fourier transform, and K_A and K_P return the amplitude and phase of the Fourier transform, respectively.
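As an illustration of this Fourier-space style mixing, the minimal sketch below mixes the amplitude spectra of an image and a randomly chosen auxiliary image from another substation while keeping the phase of the original image; the function name, tensor shapes and mixing coefficient are assumptions, not fixed by the patent.

```python
import torch

def fourier_style_mix(x: torch.Tensor, x_aux: torch.Tensor, lam: float = 0.5) -> torch.Tensor:
    """x, x_aux: (C, H, W) images from two different substations (domains)."""
    fft_x = torch.fft.fft2(x)            # K: 2-D Fourier transform of the source image
    fft_aux = torch.fft.fft2(x_aux)      # Fourier transform of the auxiliary image
    amp_x, pha_x = torch.abs(fft_x), torch.angle(fft_x)   # K_A(x), K_P(x)
    amp_aux = torch.abs(fft_aux)                           # K_A(x_aux)
    # mix the amplitude (style) across domains, keep the source phase (content)
    amp_mix = (1.0 - lam) * amp_x + lam * amp_aux
    mixed = torch.polar(amp_mix, pha_x)                    # rebuild a complex spectrum
    return torch.fft.ifft2(mixed).real                     # K^{-1}: back to image space

view = fourier_style_mix(torch.rand(3, 224, 224), torch.rand(3, 224, 224), lam=0.3)
```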
Second: each mixed marking-damage image is split into a series of marking-damage image patches, where the splitting can be implemented with the image patching layer of a Vision Transformer model (ViT). Each marking-damage image patch is mapped, through a learnable linear mapping (patch embedding), to a marking-damage image vector of a single fixed length; these vectors are passed as input to the Transformer sub-model of the Vision Transformer model.
To retain the position information of the training marking-damage image, the Vision Transformer model uses a positional encoding function to add position information to each input marking-damage image vector, yielding the image position feature vectors. This helps the model learn features at different positions in the image.
Third: the Vision Transformer model processes the input image position feature vectors with the encoder structure of the Transformer sub-model. The Transformer encoder consists of several identical layers (usually self-attention layers and feed-forward neural network layers); each layer performs multi-head self-attention and feed-forward computations, and the encoder extracts the feature relations among the image position feature vectors to obtain the image-domain-invariant feature values.
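A minimal sketch of the patch splitting, learnable linear mapping, positional encoding and Transformer encoding described in the second and third steps is given below; the image size, patch size, embedding dimension and layer counts are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PatchEncoder(nn.Module):
    def __init__(self, img_size=224, patch_size=16, in_ch=3, dim=256, depth=4, heads=8):
        super().__init__()
        num_patches = (img_size // patch_size) ** 2
        # patch splitting + learnable linear mapping in one strided convolution
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch_size, stride=patch_size)
        # learnable positional encoding added to every patch vector
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)  # self-attention + FFN layers

    def forward(self, x):                                  # x: (B, C, H, W) mixed marking-damage image
        tokens = self.proj(x).flatten(2).transpose(1, 2)   # (B, num_patches, dim) patch vectors
        tokens = tokens + self.pos_embed                   # image position feature vectors
        return self.encoder(tokens)                        # image-domain-invariant feature values

features = PatchEncoder()(torch.rand(1, 3, 224, 224))      # (1, 196, 256)
```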
Fourth: the image-domain-invariant feature values of each substation are decoded with the feature decoder of the to-be-trained image feature extraction layer to obtain decoded image-domain-invariant feature values, and image reconstruction is performed on each decoded image-domain-invariant feature value to obtain the reconstructed image feature values. FIG. 9 is a schematic diagram of this cross-domain reconstruction principle.
Fifth: the reconstructed image feature values are input into a reconstruction loss function, and the to-be-trained image feature extraction layer is trained through the computation of this loss until the reconstruction loss value it outputs is smaller than a preset reconstruction loss value, yielding the trained image feature extraction layer. A typical form of the reconstruction loss consistent with this step is the pixel-wise squared error between the reconstructed image x̂ and the original training marking-damage image x:
L_rec = || x̂ - x ||²
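A minimal sketch of the reconstruction-loss computation, assuming the mean-squared-error form shown above; the decoder output, target image and preset threshold are placeholders.

```python
import torch
import torch.nn.functional as F

def reconstruction_loss(x_reconstructed: torch.Tensor, x_original: torch.Tensor) -> torch.Tensor:
    # pixel-wise squared error between the decoder's reconstruction and the original image
    return F.mse_loss(x_reconstructed, x_original)

preset_reconstruction_loss = 0.01                 # assumed preset reconstruction loss value
loss = reconstruction_loss(torch.rand(1, 3, 224, 224), torch.rand(1, 3, 224, 224))
converged = loss.item() < preset_reconstruction_loss
```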
Step 206: train the to-be-trained image information detection layer with the training marking-damage image to obtain a trained image information detection layer.
The trained image information detection layer may be the neural network in the marking-damage image annotation model that performs detection and has been trained.
Specifically, first: the training marking-damage image is input into the to-be-trained convolutional neural network (such as VGG16 or ResNet) to extract feature maps, which contain image feature information at different levels, from low level to high level.
Second: on the feature maps, the region selection network (Region Proposal Network, RPN) slides a small window to generate candidate regions. For each window the region selection network generates several anchor boxes (candidate boxes) of different scales and aspect ratios, and then classifies each anchor box (foreground/background) and regresses it (position adjustment) through convolutional and fully connected layers to determine which anchor boxes may contain an object. The region selection network may use non-maximum suppression (NMS) to discard highly overlapping candidate boxes, keeping the most likely candidate boxes as region proposals.
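The non-maximum suppression step can be illustrated with torchvision's nms operator; the boxes, scores and IoU threshold below are illustrative values.

```python
import torch
from torchvision.ops import nms

boxes = torch.tensor([[10., 10., 100., 100.],     # (x1, y1, x2, y2) anchor boxes
                      [12., 12., 102., 102.],
                      [200., 200., 260., 260.]])
scores = torch.tensor([0.9, 0.8, 0.7])            # foreground scores from the RPN head
keep = nms(boxes, scores, iou_threshold=0.7)      # indices of the retained region proposals
print(keep)                                       # the heavily overlapping second box is dropped
```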
Third: for each candidate box generated by the region selection network, a RoI Pooling operation maps the region inside the candidate box to a fixed-size feature map, which serves as the input of subsequent processing. RoI Pooling obtains a fixed-size feature representation by dividing the region in each candidate box into fixed-size sub-regions and performing a pooling operation within each sub-region.
Fourth: the features produced by RoI Pooling are fed into a classification head and a regression head. The classification head predicts whether a target is present in the region, and the regression head predicts the position adjustment of the target. These heads are usually fully connected layers or convolutional layers.
Fifth: after the classification and regression heads, Faster R-CNN outputs the target class of each region together with its position information (bounding-box coordinates). Finally, the detection result of the to-be-trained image information detection layer is obtained by keeping the regions with high classification scores and adjusting their bounding boxes according to the regression information.
Sixth: the to-be-trained image information detection layer is trained according to its detection results until the detection result it outputs satisfies the preset detection criterion, yielding the trained image information detection layer.
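A minimal training sketch for this detection layer, using torchvision's Faster R-CNN implementation; the class count, sample image, box coordinates and number of iterations are illustrative assumptions and are not fixed by the patent.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(num_classes=2)                 # background + "marking damage"
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
model.train()

images = [torch.rand(3, 512, 512)]                             # one training marking-damage image
targets = [{"boxes": torch.tensor([[50., 60., 180., 200.]]),   # ground-truth damage box
            "labels": torch.tensor([1])}]

for _ in range(10):                                            # iterate until the preset criterion is met
    loss_dict = model(images, targets)                         # RPN + RoI classification/regression losses
    loss = sum(loss_dict.values())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```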
Step 208: obtain a trained marking-damage image annotation model from the trained image feature extraction layer and the trained image information detection layer.
The trained marking-damage image annotation model may be an image annotation model that has already been trained.
Specifically, since the trained image feature extraction layer needs to be embedded into the trained image information detection layer, the image annotation model embedding information corresponding to the two layers is computed from the trained image feature extraction layer and the trained image information detection layer.
With the trained image feature extraction layer as the active embedding end and the trained image information detection layer as the passive embedding end, the trained image feature extraction layer is embedded into the trained image information detection layer under the guidance of the image annotation model embedding information, obtaining the embedded marking-damage image annotation model.
The training marking-damage image is then input into the embedded marking-damage image annotation model, and the embedded marking-damage image annotation model is trained through its own computation until the model loss value it outputs is smaller than a preset model loss value, yielding the trained marking-damage image annotation model. FIG. 10 shows the connection structure of the marking-damage image annotation model.
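One way to realise the embedding and joint-training step above is sketched below: the trained feature extraction layer is wired in front of the trained detection head, and the combined annotation model is optimised until its loss falls below the preset model loss value. The module names, loss interface and threshold are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class MarkingDamageAnnotationModel(nn.Module):
    """Embedded annotation model: trained extractor followed by the trained detection head."""
    def __init__(self, feature_extractor: nn.Module, detection_head: nn.Module):
        super().__init__()
        self.feature_extractor = feature_extractor     # trained, domain-invariant extractor
        self.detection_head = detection_head           # trained image information detection layer

    def forward(self, images):
        features = self.feature_extractor(images)
        return self.detection_head(features)

def train_until_threshold(model, loader, loss_fn, preset_loss=0.05, lr=1e-4, max_epochs=50):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(max_epochs):
        epoch_loss = 0.0
        for images, targets in loader:
            loss = loss_fn(model(images), targets)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        if epoch_loss / len(loader) < preset_loss:     # stop once below the preset model loss value
            return model
    return model
```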
In the above domain-generalization-based power marking-damage image self-labeling model training method, a training marking-damage image and a to-be-trained marking-damage image annotation model are obtained; the to-be-trained marking-damage image annotation model includes a to-be-trained image feature extraction layer and a to-be-trained image information detection layer; the to-be-trained image feature extraction layer is trained with the training marking-damage image to obtain a trained image feature extraction layer, which is a feature extraction layer constructed on the basis of domain-invariant features; the to-be-trained image information detection layer is trained with the training marking-damage image to obtain a trained image information detection layer; and a trained marking-damage image annotation model is obtained from the trained image feature extraction layer and the trained image information detection layer.
The to-be-trained image feature extraction layer and the to-be-trained image information detection layer of the to-be-trained marking-damage image annotation model are trained with the training marking-damage image, and the trained image feature extraction layer is embedded into the trained image information detection layer to obtain the trained marking-damage image annotation model. With a domain generalization algorithm, a domain-invariant marking-damage image annotation model can be trained from a small number of data sets drawn from different domains, so the annotation model has strong transfer capability; using it to detect unannotated marking-damage images automatically realizes self-labeling of marking-damage images and improves the efficiency of marking-damage recognition.
In one embodiment, as shown in FIG. 3, training the to-be-trained image feature extraction layer with the training marking-damage image to obtain a trained image feature extraction layer includes:
Step 302: perform visual-domain-invariant feature encoding on each marking-damage sub-image to obtain the image-domain-invariant feature values corresponding to the training marking-damage image.
The visual-domain-invariant feature encoding may be an encoding operation implemented by the encoder of the Transformer sub-model in a Vision Transformer model.
The image-domain-invariant feature value may be a feature information value describing image content that is assumed to remain unchanged across the spatial and domain differences between images.
Specifically, first: because the training marking-damage image comprises several different marking-damage sub-images, each from a different substation, the to-be-trained image feature extraction layer applies a style-mixing (stylemix) operation to each marking-damage sub-image, mixing it with the styles of the other substations to obtain the mixed marking-damage images.
Second: each mixed marking-damage image is split into a series of marking-damage image patches, where the splitting can be implemented with the image patching layer of a Vision Transformer model (ViT). Each marking-damage image patch is mapped, through a learnable linear mapping (patch embedding), to a marking-damage image vector of a single fixed length; these vectors are passed as input to the Transformer sub-model of the Vision Transformer model.
To retain the position information of the training marking-damage image, the Vision Transformer model uses a positional encoding function to add position information to each input marking-damage image vector, yielding the image position feature vectors. This helps the model learn features at different positions in the image.
Third: the Vision Transformer model processes the input image position feature vectors with the encoder structure of the Transformer sub-model. The Transformer encoder consists of several identical layers (usually self-attention layers and feed-forward neural network layers); each layer performs multi-head self-attention and feed-forward computations, and the encoder extracts the feature relations among the image position feature vectors to obtain the image-domain-invariant feature values.
Step 304: perform image reconstruction on each image-domain-invariant feature value with the feature decoder of the to-be-trained image feature extraction layer to obtain the reconstructed image feature values.
The feature decoder may be the key component of a Transformer model used to generate sequence data in natural language processing tasks such as machine translation, text generation and summarization; its design goal is to combine the context information produced by the encoder with the partial sequence generated so far in order to produce the next sequence element step by step.
The reconstructed image feature value may be a feature information value obtained by performing image reconstruction from the image-domain-invariant feature values.
Specifically, the feature decoder of the to-be-trained image feature extraction layer decodes the image-domain-invariant feature values of each substation to obtain decoded image-domain-invariant feature values, and image reconstruction is performed on each decoded image-domain-invariant feature value to obtain the reconstructed image feature values.
Step 306: train the to-be-trained image feature extraction layer with the reconstructed image feature values to obtain a trained image feature extraction layer.
Specifically, each reconstructed image feature value is input into the reconstruction loss function, and the to-be-trained image feature extraction layer is trained through the computation of this loss until the reconstruction loss value it outputs is smaller than the preset reconstruction loss value, yielding the trained image feature extraction layer.
In this embodiment, after the marking-damage sub-images are encoded with the Vision Transformer model, they are decoded by the feature decoder to obtain the reconstructed image feature values, and the to-be-trained image feature extraction layer is trained with these reconstructed image feature values. The image feature extraction layer thus acquires the ability of cross-domain reconstruction, which improves its generalization.
In one embodiment, as shown in FIG. 4, performing visual-domain-invariant feature encoding on each marking-damage sub-image to obtain the image-domain-invariant feature values corresponding to the training marking-damage image includes:
Step 402: perform style-mixing processing on each marking-damage sub-image to obtain the mixed marking-damage images.
The mixed marking-damage image may be a marking-damage image obtained by mixing the feature information of different marking-damage sub-images.
Specifically, because the training marking-damage image comprises several different marking-damage sub-images, each from a different substation, the to-be-trained image feature extraction layer applies a style-mixing (stylemix) operation to each marking-damage sub-image, mixing it with the styles of the other substations to obtain the mixed marking-damage images.
Step 404: perform visual-domain-invariant feature encoding on each mixed marking-damage image to obtain the image-domain-invariant feature values.
Specifically, each mixed marking-damage image is split into a series of marking-damage image patches, where the splitting can be implemented with the image patching layer of a Vision Transformer model (ViT). Each marking-damage image patch is mapped, through a learnable linear mapping (patch embedding), to a marking-damage image vector of a single fixed length; these vectors are passed as input to the Transformer sub-model of the Vision Transformer model.
To retain the position information of the training marking-damage image, the Vision Transformer model uses a positional encoding function to add position information to each input marking-damage image vector, yielding the image position feature vectors. This helps the model learn features at different positions in the image.
The Vision Transformer model processes the input image position feature vectors with the encoder structure of the Transformer sub-model. The Transformer encoder consists of several identical layers (usually self-attention layers and feed-forward neural network layers); each layer performs multi-head self-attention and feed-forward computations, and the encoder extracts the feature relations among the image position feature vectors to obtain the image-domain-invariant feature values.
In this embodiment, by computing the image-domain-invariant feature values from the result of style-mixing each marking-damage sub-image, the data fed to the feature decoder contains the feature information of all marking-damage sub-images, which broadens the range of data the image feature extraction layer can use.
In one embodiment, as shown in FIG. 5, performing visual-domain-invariant feature encoding on each mixed marking-damage image to obtain the image-domain-invariant feature values includes:
Step 502: split each mixed marking-damage image to obtain the marking-damage image patches.
The marking-damage image patch may be a block image obtained by splitting the mixed marking-damage image with an image segmentation model.
Specifically, each mixed marking-damage image is split into a series of marking-damage image patches, where the splitting can be implemented with the image patching layer of a Vision Transformer model (ViT).
Step 504: perform positional encoding on each marking-damage image patch to obtain the image position feature vectors.
Positional encoding may consist of adding position information to the marking-damage image patches.
The image position feature vector may be the feature vector of a marking-damage image patch to which the corresponding position information has been added.
Specifically, each marking-damage image patch is mapped, through a learnable linear mapping (patch embedding), to a marking-damage image vector of a single fixed length; these vectors are passed as input to the Transformer sub-model of the Vision Transformer model.
To retain the position information of the training marking-damage image, the Vision Transformer model uses a positional encoding function to add position information to each input marking-damage image vector, yielding the image position feature vectors. This helps the model learn features at different positions in the image.
Step 506: extract the feature relations among the image position feature vectors to obtain the image-domain-invariant feature values.
Specifically, the Vision Transformer model processes the input image position feature vectors with the encoder structure of the Transformer sub-model. The Transformer encoder consists of several identical layers (usually self-attention layers and feed-forward neural network layers); each layer performs multi-head self-attention and feed-forward computations, and the encoder extracts the feature relations among the image position feature vectors to obtain the image-domain-invariant feature values.
In this embodiment, positional encoding is performed on the marking-damage image patches corresponding to the mixed marking-damage images to obtain the image position feature vectors and compute the image-domain-invariant feature values, which helps the marking-damage image annotation model learn features at different positions in the image during training.
In one embodiment, as shown in FIG. 6, performing positional encoding on each marking-damage image patch to obtain the image position feature vectors includes:
Step 602: perform linear-mapping conversion on each marking-damage image patch to obtain the marking-damage image vectors.
The linear-mapping conversion may be an operation that maps a marking-damage image patch to its representation in a vector space.
The marking-damage image vector may be the information obtained by mapping a marking-damage image patch into the vector space.
Specifically, each mixed marking-damage image is split into a series of marking-damage image patches, where the splitting can be implemented with the image patching layer of a Vision Transformer model (ViT). Each marking-damage image patch is mapped, through a learnable linear mapping (patch embedding), to a marking-damage image vector of a single fixed length; these vectors are passed as input to the Transformer sub-model of the Vision Transformer model.
Step 604: perform positional encoding on each marking-damage image vector to obtain the image position feature vectors.
Specifically, to retain the position information of the training marking-damage image, the Vision Transformer model uses a positional encoding function to add position information to each input marking-damage image vector, yielding the image position feature vectors. This helps the model learn features at different positions in the image.
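A sketch of a sinusoidal position-coding function that adds position information to each marking-damage image vector is given below; ViT implementations often use a learnable positional table instead, so the sinusoidal form and the dimensions are assumptions.

```python
import torch

def positional_encoding(num_positions: int, dim: int) -> torch.Tensor:
    """Classic sinusoidal encoding: one dim-dimensional vector per patch position."""
    pos = torch.arange(num_positions, dtype=torch.float32).unsqueeze(1)     # (N, 1)
    i = torch.arange(0, dim, 2, dtype=torch.float32)                        # even feature indices
    angle = pos / torch.pow(torch.tensor(10000.0), i / dim)                 # (N, dim/2)
    pe = torch.zeros(num_positions, dim)
    pe[:, 0::2] = torch.sin(angle)
    pe[:, 1::2] = torch.cos(angle)
    return pe

patch_vectors = torch.rand(1, 196, 256)                              # marking-damage image vectors
position_features = patch_vectors + positional_encoding(196, 256)    # image position feature vectors
```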
In this embodiment, the linear-mapping conversion transforms the marking-damage image patches into the vector space and positional encoding is then applied, so that the marking-damage image patches undergo feature conversion and dimensionality reduction, reducing the difficulty of training the marking-damage image annotation model.
In one embodiment, as shown in FIG. 7, obtaining a trained marking-damage image annotation model from the trained image feature extraction layer and the trained image information detection layer includes:
Step 702: determine the image annotation model embedding information from the trained image feature extraction layer and the trained image information detection layer.
The image annotation model embedding information may be the parameters used to embed the trained image feature extraction layer into the trained image information detection layer.
Specifically, since the trained image feature extraction layer needs to be embedded into the trained image information detection layer, the image annotation model embedding information corresponding to the two layers is computed from the trained image feature extraction layer and the trained image information detection layer.
Step 704: embed the trained image feature extraction layer into the trained image information detection layer according to the image annotation model embedding information to obtain the embedded marking-damage image annotation model.
The embedded marking-damage image annotation model may be the annotation model obtained by embedding the trained image feature extraction layer into the trained image information detection layer.
Specifically, with the trained image feature extraction layer as the active embedding end and the trained image information detection layer as the passive embedding end, the trained image feature extraction layer is embedded into the trained image information detection layer under the guidance of the image annotation model embedding information, obtaining the embedded marking-damage image annotation model.
Step 706: train the embedded marking-damage image annotation model with the training marking-damage image to obtain a trained marking-damage image annotation model.
Specifically, the training marking-damage image is input into the embedded marking-damage image annotation model, and the embedded marking-damage image annotation model is trained through its own computation until the model loss value it outputs is smaller than the preset model loss value, yielding the trained marking-damage image annotation model.
In this embodiment, by embedding the trained image feature extraction layer into the trained image information detection layer, the marking-damage image annotation model possesses both domain generalization capability and detection capability, which improves the working efficiency of the marking-damage image annotation model and provides a better user experience.
In one embodiment, as shown in FIG. 8, after the step of obtaining a trained marking-damage image annotation model from the trained image feature extraction layer and the trained image information detection layer, the method further includes:
Step 802: obtain a test marking-damage image.
The test marking-damage image may be an image used to test the trained marking-damage image annotation model, in which marking damage is present.
Specifically, in response to an instruction from the terminal 102 to acquire the test marking-damage image, the server 104 acquires the test marking-damage image from the terminal 102 and stores it in the storage unit; whenever the server needs to process any of this data, volatile storage resources are called from the storage unit for computation by the central processing unit. The data may be input to the central processing unit one item at a time or several items at a time.
Step 804: input the test marking-damage image into the trained marking-damage image annotation model to obtain marking-damage image annotation information.
The marking-damage image annotation information may be the result obtained by the trained marking-damage image annotation model annotating the test marking-damage image.
Specifically, the test marking-damage image is input into the trained marking-damage image annotation model; feature extraction through the trained image feature extraction layer yields the feature values of the test marking-damage image, and information detection through the trained image information detection layer yields the detection values of the test marking-damage image. The feature values and detection values of the test marking-damage image are then annotated to obtain the marking-damage image annotation information.
Step 806: compute the difference between the marking-damage image annotation information and the preset image annotation information to obtain an annotation information difference value.
The annotation information difference value may be the degree of difference between the marking-damage image annotation information and the preset image annotation information.
Specifically, the marking-damage image annotation information and the preset image annotation information are input into a model test difference function, and the difference between them, computed by this function, is output as the annotation information difference value.
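The patent does not fix the form of the model test difference function; the sketch below illustrates one plausible choice, an IoU-based difference between the predicted annotation boxes and the preset reference boxes. Function names, box values and the threshold are assumptions.

```python
import torch
from torchvision.ops import box_iou

def annotation_difference(pred_boxes: torch.Tensor, ref_boxes: torch.Tensor) -> float:
    """pred_boxes: (N, 4), ref_boxes: (M, 4), both in (x1, y1, x2, y2) format."""
    iou = box_iou(pred_boxes, ref_boxes)          # (N, M) pairwise IoU
    best = iou.max(dim=1).values                  # best reference match per predicted box
    return float(1.0 - best.mean())               # smaller value = closer to the preset annotation

pred = torch.tensor([[50., 60., 180., 200.]])     # model annotation for the test image
ref = torch.tensor([[48., 58., 182., 205.]])      # preset image annotation information
difference_threshold = 0.2                        # assumed annotation information difference threshold
passed = annotation_difference(pred, ref) < difference_threshold
```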
Step 808: when the annotation information difference value satisfies the annotation information difference threshold, take the trained marking-damage image annotation model as the tested marking-damage image annotation model.
The annotation information difference threshold may be the evaluation criterion used to judge whether the trained marking-damage image annotation model meets the usage requirements.
The tested marking-damage image annotation model may be an image annotation model that has passed the test.
Specifically, if the annotation information difference value is smaller than the annotation information difference threshold, the trained marking-damage image annotation model meets the usage requirements and is taken as the tested marking-damage image annotation model; if the annotation information difference value is larger than the annotation information difference threshold, the trained marking-damage image annotation model cannot meet the usage requirements, and the method returns to the model training step until the annotation information difference value is smaller than the annotation information difference threshold, whereupon the trained marking-damage image annotation model is taken as the tested marking-damage image annotation model.
In this embodiment, by testing the marking-damage image annotation model with the test marking-damage image, defects in the marking-damage image annotation model can be found before it is put into use, which improves the stability of the marking-damage image annotation model.
It should be understood that, although the steps in the flowcharts of the above embodiments are shown in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, the order of execution of these steps is not strictly limited, and they may be executed in other orders. Moreover, at least some of the steps in those flowcharts may include several sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; nor is their order of execution necessarily sequential, and they may be executed in turn or alternately with at least part of the other steps, sub-steps or stages.
Based on the same inventive concept, an embodiment of the application also provides a domain-generalization-based power marking-damage image self-labeling model training device for implementing the above domain-generalization-based power marking-damage image self-labeling model training method. The solution to the problem provided by this device is similar to that described in the above method, so for the specific limitations in the device embodiments below, reference may be made to the limitations of the method above, which are not repeated here.
In one embodiment, as shown in fig. 11, there is provided a domain generalization-based power expression damage image self-labeling model training device, including: the data information acquisition module 1102, the feature extraction layer training module 1104, the feature detection layer training module 1106, and the labeling model obtaining module 1108, wherein:
the data information acquisition module 1102 is used for acquiring a training mark damaged image and a mark damaged image marking model to be trained; the to-be-trained mark damaged image annotation model comprises an to-be-trained image feature extraction layer and an to-be-trained image information detection layer;
the feature extraction layer training module 1104 is configured to train the feature extraction layer of the image to be trained using the training token damaged image to obtain a trained image feature extraction layer; the trained image feature extraction layer is a feature extraction layer constructed based on domain invariant features;
the feature detection layer training module 1106 is configured to train the image information detection layer to be trained by using the training token damaged image to obtain a trained image information detection layer;
the labeling model obtaining module 1108 is configured to obtain a trained token damaged image labeling model according to the trained image feature extraction layer and the trained image information detection layer.
In one embodiment, the feature extraction layer training module 1104 is further configured to perform a visual domain invariant feature coding on each of the damaged sub-images to obtain an image domain invariant feature value corresponding to the damaged image for training; according to the feature decoder of the image feature extraction layer to be trained, carrying out image reconstruction on the unchanged feature values of each image domain to obtain the feature values of each reconstructed image; and training the image feature extraction layer to be trained by using the feature values of each reconstructed image to obtain a trained image feature extraction layer.
In one embodiment, the feature extraction layer training module 1104 is further configured to perform style mixing processing on each expression damaged sub-image to obtain each mixed expression damaged image; and perform the visual-domain-invariant feature coding on each mixed expression damaged image to obtain each image-domain-invariant feature value.
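The patent does not state how the style mixing is computed; one common realization in domain generalization work is to interpolate per-image colour statistics (channel mean and standard deviation) between randomly paired samples, in the spirit of MixStyle. The sketch below follows that assumption and should be read as one possible instantiation, not the claimed method.

```python
import torch

def style_mix(images, alpha=0.3):
    """images: (B, C, H, W) batch of expression damaged sub-images; returns style-mixed images."""
    b = images.size(0)
    mu = images.mean(dim=(2, 3), keepdim=True)            # per-image channel means
    sig = images.std(dim=(2, 3), keepdim=True) + 1e-6      # per-image channel stds
    perm = torch.randperm(b)                               # random pairing of styles
    lam = torch.distributions.Beta(alpha, alpha).sample((b, 1, 1, 1))
    mu_mix = lam * mu + (1 - lam) * mu[perm]
    sig_mix = lam * sig + (1 - lam) * sig[perm]
    return (images - mu) / sig * sig_mix + mu_mix          # re-style with the mixed statistics
```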
In one embodiment, the feature extraction layer training module 1104 is further configured to segment each mixed expression damaged image to obtain each expression damaged image block; perform position coding on each expression damaged image block to obtain each image position feature vector; and perform feature relation extraction on each image position feature vector to obtain each image-domain-invariant feature value.
In one embodiment, the feature extraction layer training module 1104 is further configured to perform linear mapping conversion on each expression damaged image block to obtain each expression damaged image vector; and perform position coding on each expression damaged image vector to obtain each image position feature vector.
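The segmentation, linear mapping, and position coding described in the two embodiments above read much like the standard ViT-style patch embedding; the sketch below follows that reading. Patch size, embedding dimension, and the learnable position embedding are illustrative assumptions. The subsequent feature relation extraction could then be a transformer encoder applied to the resulting image position feature vectors, which is likewise an assumption.

```python
import torch
import torch.nn as nn

class PatchPositionEncoder(nn.Module):
    """Splits an image into blocks, linearly maps each block, and adds position coding."""
    def __init__(self, img_size=224, patch_size=16, in_ch=3, dim=256):
        super().__init__()
        self.patch_size = patch_size
        self.num_patches = (img_size // patch_size) ** 2
        self.proj = nn.Linear(in_ch * patch_size * patch_size, dim)            # linear mapping conversion
        self.pos_embed = nn.Parameter(torch.zeros(1, self.num_patches, dim))   # position coding

    def forward(self, x):                                   # x: batch of mixed expression damaged images
        b, c, h, w = x.shape
        p = self.patch_size
        patches = x.unfold(2, p, p).unfold(3, p, p)         # (B, C, H/p, W/p, p, p) image blocks
        patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * p * p)
        vectors = self.proj(patches)                        # expression damaged image vectors
        return vectors + self.pos_embed                     # image position feature vectors
```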
In one embodiment, the annotation model obtaining module 1108 is further configured to determine image annotation model embedding information according to the trained image feature extraction layer and the trained image information detection layer; embed the trained image feature extraction layer into the trained image information detection layer according to the image annotation model embedding information to obtain an embedded expression damaged image annotation model; and train the embedded expression damaged image annotation model using the training expression damaged image to obtain the trained expression damaged image annotation model.
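The embedding and retraining step could look like the following: the two trained layers are attached to one another and the combined annotation model is fine-tuned end-to-end on the training expression damaged images. The optimizer, loss, and per-image class labels are assumptions for illustration; the patent's embedding information is represented here simply by wiring the feature layer's output into the detection layer.

```python
import torch
import torch.nn as nn

def embed_and_finetune(trained_feature_layer, trained_detection_layer, loader,
                       epochs=5, lr=1e-4, device="cpu"):
    # Combine the two trained layers into one annotation model and fine-tune them jointly.
    model = nn.ModuleDict({"feat": trained_feature_layer,
                           "det": trained_detection_layer}).to(device)
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    ce = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in loader:                       # labels: assumed damage-class targets
            images, labels = images.to(device), labels.to(device)
            logits, _boxes = model["det"](model["feat"](images))
            loss = ce(logits, labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```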
In one embodiment, the annotation model obtaining module 1108 is further configured to acquire a test expression damaged image; input the test expression damaged image into the trained expression damaged image annotation model to obtain expression damaged image annotation information; calculate a difference between the expression damaged image annotation information and preset image annotation information to obtain an annotation information difference value; and, when the annotation information difference value satisfies an annotation information difference threshold, take the trained expression damaged image annotation model as a tested expression damaged image annotation model.
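A sketch of the test step under the same assumptions: the trained model's annotation output on a test expression damaged image is compared against preset annotation information, and the model is accepted only when the difference stays within a threshold. Using the mean absolute error over predicted box coordinates as the difference value, and 0.05 as the threshold, are illustrative choices, not values from the patent.

```python
import torch

def test_annotation_model(model, test_images, preset_boxes, threshold=0.05):
    # model: the ModuleDict produced by embed_and_finetune in the previous sketch.
    with torch.no_grad():
        _logits, pred_boxes = model["det"](model["feat"](test_images))
    diff = (pred_boxes - preset_boxes).abs().mean().item()  # annotation information difference value
    return diff <= threshold, diff
```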
Each module in the above domain-generalization-based power expression damaged image self-labeling model training device may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in or independent of a processor in a computer device in the form of hardware, or may be stored in a memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 12. The computer device includes a processor, a memory, and a network interface connected by a system bus, wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used for storing server data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements a domain-generalization-based power expression damaged image self-labeling model training method.
It will be appreciated by those skilled in the art that the structure shown in FIG. 12 is merely a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In an embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the steps in the above-described method embodiments.
The user information (including but not limited to user device information, user personal information, etc.) and the data (including but not limited to data used for analysis, stored data, displayed data, etc.) involved in the present application are information and data authorized by the user or fully authorized by all parties.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing relevant hardware; the computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. The volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processor referred to in the embodiments provided herein may be, but is not limited to, a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic device, or a data processing logic device based on quantum computing.
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered to be within the scope of this specification.
The foregoing examples merely illustrate several embodiments of the application and are described in relative detail, but they are not to be construed as limiting the scope of the application. It should be noted that several variations and improvements can be made by those skilled in the art without departing from the concept of the application, and these all fall within the protection scope of the application. Therefore, the protection scope of the application shall be subject to the appended claims.

Claims (10)

1. A domain-generalization-based power expression damaged image self-labeling model training method, characterized in that the method comprises:
acquiring a training expression damaged image and an expression damaged image annotation model to be trained; the expression damaged image annotation model to be trained comprises an image feature extraction layer to be trained and an image information detection layer to be trained;
training the image feature extraction layer to be trained using the training expression damaged image to obtain a trained image feature extraction layer; the trained image feature extraction layer is a feature extraction layer constructed based on domain-invariant features;
training the image information detection layer to be trained using the training expression damaged image to obtain a trained image information detection layer; and
obtaining a trained expression damaged image annotation model according to the trained image feature extraction layer and the trained image information detection layer.
2. The method of claim 1, wherein the training expression damaged image comprises a plurality of different expression damaged sub-images; and the training the image feature extraction layer to be trained using the training expression damaged image to obtain a trained image feature extraction layer comprises:
performing visual-domain-invariant feature coding on each expression damaged sub-image to obtain each image-domain-invariant feature value corresponding to the training expression damaged image;
performing image reconstruction on each image-domain-invariant feature value according to a feature decoder of the image feature extraction layer to be trained to obtain each reconstructed image feature value; and
training the image feature extraction layer to be trained using each reconstructed image feature value to obtain the trained image feature extraction layer.
3. The method according to claim 2, wherein the performing visual-domain-invariant feature coding on each expression damaged sub-image to obtain each image-domain-invariant feature value corresponding to the training expression damaged image comprises:
performing style mixing processing on each expression damaged sub-image to obtain each mixed expression damaged image; and
performing the visual-domain-invariant feature coding on each mixed expression damaged image to obtain each image-domain-invariant feature value.
4. The method according to claim 3, wherein the performing the visual-domain-invariant feature coding on each mixed expression damaged image to obtain each image-domain-invariant feature value comprises:
segmenting each mixed expression damaged image to obtain each expression damaged image block;
performing position coding on each expression damaged image block to obtain each image position feature vector; and
performing feature relation extraction on each image position feature vector to obtain each image-domain-invariant feature value.
5. The method of claim 4, wherein the performing position coding on each expression damaged image block to obtain each image position feature vector comprises:
performing linear mapping conversion on each expression damaged image block to obtain each expression damaged image vector; and
performing position coding on each expression damaged image vector to obtain each image position feature vector.
6. The method according to claim 1, wherein the obtaining a trained expression damaged image annotation model according to the trained image feature extraction layer and the trained image information detection layer comprises:
determining image annotation model embedding information according to the trained image feature extraction layer and the trained image information detection layer;
embedding the trained image feature extraction layer into the trained image information detection layer according to the image annotation model embedding information to obtain an embedded expression damaged image annotation model; and
training the embedded expression damaged image annotation model using the training expression damaged image to obtain the trained expression damaged image annotation model.
7. The method of claim 1, wherein after the step of obtaining a trained expression damaged image annotation model according to the trained image feature extraction layer and the trained image information detection layer, the method further comprises:
acquiring a test expression damaged image;
inputting the test expression damaged image into the trained expression damaged image annotation model to obtain expression damaged image annotation information;
calculating a difference between the expression damaged image annotation information and preset image annotation information to obtain an annotation information difference value; and
when the annotation information difference value satisfies an annotation information difference threshold, taking the trained expression damaged image annotation model as a tested expression damaged image annotation model.
8. A domain-generalization-based power expression damaged image self-labeling model training device, comprising:
a data information acquisition module, configured to acquire a training expression damaged image and an expression damaged image annotation model to be trained; the expression damaged image annotation model to be trained comprises an image feature extraction layer to be trained and an image information detection layer to be trained;
a feature extraction layer training module, configured to train the image feature extraction layer to be trained using the training expression damaged image to obtain a trained image feature extraction layer; the trained image feature extraction layer is a feature extraction layer constructed based on domain-invariant features;
a feature detection layer training module, configured to train the image information detection layer to be trained using the training expression damaged image to obtain a trained image information detection layer; and
an annotation model obtaining module, configured to obtain a trained expression damaged image annotation model according to the trained image feature extraction layer and the trained image information detection layer.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
CN202311095122.9A 2023-08-28 2023-08-28 Power expression damaged image self-labeling model training method based on domain generalization Pending CN117132847A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311095122.9A CN117132847A (en) 2023-08-28 2023-08-28 Power expression damaged image self-labeling model training method based on domain generalization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311095122.9A CN117132847A (en) 2023-08-28 2023-08-28 Power expression damaged image self-labeling model training method based on domain generalization

Publications (1)

Publication Number Publication Date
CN117132847A true CN117132847A (en) 2023-11-28

Family

ID=88850334

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311095122.9A Pending CN117132847A (en) 2023-08-28 2023-08-28 Power expression damaged image self-labeling model training method based on domain generalization

Country Status (1)

Country Link
CN (1) CN117132847A (en)

Similar Documents

Publication Publication Date Title
US11200424B2 (en) Space-time memory network for locating target object in video content
CN112580439B (en) Large-format remote sensing image ship target detection method and system under small sample condition
US20230022387A1 (en) Method and apparatus for image segmentation model training and for image segmentation
CN103400143B (en) A kind of data Subspace clustering method based on various visual angles
CN112115783A (en) Human face characteristic point detection method, device and equipment based on deep knowledge migration
CN109658455A (en) Image processing method and processing equipment
CN111523414A (en) Face recognition method and device, computer equipment and storage medium
CN112270686B (en) Image segmentation model training method, image segmentation device and electronic equipment
CN111310800B (en) Image classification model generation method, device, computer equipment and storage medium
CN109285105A (en) Method of detecting watermarks, device, computer equipment and storage medium
CN114092833B (en) Remote sensing image classification method and device, computer equipment and storage medium
CN112990175B (en) Method, device, computer equipment and storage medium for recognizing handwritten Chinese characters
CN114612414B (en) Image processing method, model training method, device, equipment and storage medium
GB2579262A (en) Space-time memory network for locating target object in video content
CN114298997B (en) Fake picture detection method, fake picture detection device and storage medium
CN112115860B (en) Face key point positioning method, device, computer equipment and storage medium
CN115457492A (en) Target detection method and device, computer equipment and storage medium
CN114419406A (en) Image change detection method, training method, device and computer equipment
CN111753729A (en) False face detection method and device, electronic equipment and storage medium
CN114898357A (en) Defect identification method and device, electronic equipment and computer readable storage medium
CN118298434A (en) Image recognition method, model training method of target recognition model and related equipment
CN117876793A (en) Hyperspectral image tree classification method and device
Liu et al. An anomaly detection method based on double encoder–decoder generative adversarial networks
CN114299101A (en) Method, apparatus, device, medium, and program product for acquiring target region of image
CN112183303A (en) Transformer equipment image classification method and device, computer equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination